4278 Commits

Author SHA1 Message Date
Patrick Ohly f89e4c08cf DRA CEL: add missing size estimator
Without a size estimator, strings retrieved from the attributes were treated
as "unknown size". This wildly overestimated the cost and caused validation
errors even for simple expressions like this:

    device.attributes["qat.intel.com"].services.matches("[^a]?sym")

The maximum number of elements in maps was also ignored, and the maximum length
of the driver name string was missing. Pre-defined types like
apiservercel.StringType must be avoided because they are defined as having
a zero maximum size.
2025-01-17 14:27:49 +01:00
Patrick Ohly a5de75458e DRA API: bump maximum size of ReservedFor to 256
The original limit of 32 seemed sufficient for a single GPU on a node. But for
shared non-local resources it is too low. For example, a ResourceClaim might be
used to allocate an interconnect channel that connects all pods of a workload
running on several different nodes, in which case the number of pods can be
considerably larger.

256 is high enough for currently planned systems. If we need something even
higher in the future, an alternative approach might be needed to avoid
scalability problems.

Normally, increasing such a limit would have to be done incrementally over two
releases. In this case we decided on
Slack (https://kubernetes.slack.com/archives/CJUQN3E4T/p1734593174791519) to
make an exception and apply this change to current master for 1.33 and backport
it to the next 1.32.x patch release for production usage.

This breaks downgrades to a 1.32 release without this change if there are
ResourceClaims with a number of consumers > 32 in ReservedFor. In practice,
this breakage is very unlikely because there are no workloads yet which need so
many consumers and such downgrades to a previous patch release are also
unlikely. Downgrades to 1.31 already weren't supported when using DRA v1beta1.
2025-01-09 14:27:03 +01:00
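The downgrade breakage described in a5de75458e can be sketched as a length check. The helper `validateReservedFor` and the use of plain strings for consumers are assumptions made for illustration; only the limits 32 and 256 come from the commit message.

```go
package main

import "fmt"

const (
	oldReservedForMax = 32  // limit before this commit
	newReservedForMax = 256 // limit after this commit
)

// validateReservedFor is a hypothetical helper that rejects a claim with
// more consumers than the given maximum.
func validateReservedFor(consumers []string, max int) error {
	if len(consumers) > max {
		return fmt.Errorf("reservedFor: %d entries exceed the maximum of %d", len(consumers), max)
	}
	return nil
}

func main() {
	// A claim shared by 100 pods, for example an interconnect channel.
	consumers := make([]string, 100)

	fmt.Println("new limit:", validateReservedFor(consumers, newReservedForMax))
	fmt.Println("old limit:", validateReservedFor(consumers, oldReservedForMax))
}
```

A claim like this is accepted under the new limit but fails validation on a release that still enforces 32, which is the downgrade risk the message mentions.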
Kubernetes Prow Robot e4c1f980b7
Merge pull request #128932 from pohly/dra-node-selector-validation
DRA API: validate node selector labels
2024-11-22 20:22:55 +00:00
AxeZhan 3075a9ae96 DRA API: validate node selector labels
Previously, ValidateNodeSelector did not check that labels are valid. Now it
does for resource.k8s.io, regardless of whether an object was already created with
invalid labels in an earlier Kubernetes release. Theoretically this is a
breaking change and could cause problems during an upgrade, but that is highly
unlikely in practice.

In contrast to node affinity, DRA does not ignore parse errors
(= uses NewNodeSelector, not NewLazyErrorNodeSelector), so invalid labels would
have been found instead of being silently ignored.

Even if some object has invalid labels, this only affects an alpha -> beta
upgrade which isn't guaranteed to work seamlessly.
2024-11-22 09:10:02 +01:00
Paco Xu 03a15fa65d
Revert "[FG:InPlacePodVerticalScaling] Graduate to Beta" 2024-11-20 14:55:29 +08:00
Kubernetes Prow Robot 252e9cbb23
Merge pull request #128754 from vivzbansal/sidecar-3
Add AllowSidecarResizePolicy to relax resize policy validation check of sidecar containers
2024-11-12 20:28:48 +00:00
vivzbansal 95591abd02 Add AllowSidecarResizePolicy to relax resize policy validation check of sidecar containers 2024-11-12 05:08:51 +00:00
Tim Allclair 2935b106dc Set default ResizePolicy in fuzzer for roundtrip tests 2024-11-11 12:44:33 -08:00
Kubernetes Prow Robot 2691a29eac
Merge pull request #128683 from AnishShah/validation
[FG:InPlacePodVerticalScaling] Disallow removing requests & limits for Burstable pods.
2024-11-08 09:08:43 +00:00
Kubernetes Prow Robot c25f5eefe4
Merge pull request #128407 from ndixita/pod-level-resources
[PodLevelResources] Pod Level Resources Feature Alpha
2024-11-08 07:10:50 +00:00
Kubernetes Prow Robot 45260fd76a
Merge pull request #127857 from Jefftree/cle-v1alpha2
Coordinated Leader Election add v1alpha2
2024-11-08 07:10:43 +00:00
Anish Shah 7680f0f293 api: reject removing requests & limits for Burstable pods. 2024-11-07 21:06:54 -08:00
Kubernetes Prow Robot 3232e2ffc0
Merge pull request #128687 from tallclair/allocated-status
[FG:InPlacePodVerticalScaling] Fix AllocatedResources feature gate annotation
2024-11-08 04:12:49 +00:00
ndixita b30e6c8b0e keeping the qos code as-is for the existing case when pod-level resources are not set
Signed-off-by: ndixita <ndixita@google.com>
2024-11-08 03:00:55 +00:00
ndixita 26f11c4586 QOS changes for Pod Level resources 2024-11-08 03:00:54 +00:00
ndixita 8a8dc27b4e Adding the logic to validate pod-level resources as follows:
1. The effective container requests cannot be greater than pod-level requests
2. Individual container limits cannot be greater than pod-level limits
3. Only CPU & Memory are supported at pod-level
4. In-place container resource updates are not supported if pod-level resources are set
Note: the constraint that effective container requests cannot be greater than pod-level limits follows by transitivity: effective container requests <= pod-level requests and pod-level requests <= pod-level limits, therefore effective container requests <= pod-level limits.

Signed-off-by: ndixita <ndixita@google.com>
2024-11-08 03:00:54 +00:00
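Rules 1 to 3 from 8a8dc27b4e can be sketched as follows. The types `resources`, `container`, and `pod` are hypothetical, quantities are plain integers, and "effective container requests" is simplified to the sum over regular containers; the real validation works on the Kubernetes API types and also accounts for init containers.

```go
package main

import (
	"errors"
	"fmt"
)

// resources maps a resource name to a quantity in arbitrary integer units.
type resources map[string]int64

type container struct {
	name     string
	requests resources
	limits   resources
}

type pod struct {
	requests   resources // pod-level requests
	limits     resources // pod-level limits
	containers []container
}

// validatePodLevel is a simplified sketch of rules 1-3.
func validatePodLevel(p pod) error {
	var errs []error

	// Rule 3: only CPU and memory are supported at pod level.
	for _, set := range []resources{p.requests, p.limits} {
		for name := range set {
			if name != "cpu" && name != "memory" {
				errs = append(errs, fmt.Errorf("unsupported pod-level resource %q", name))
			}
		}
	}

	// Rule 1: summed container requests must not exceed pod-level requests.
	for name, podReq := range p.requests {
		var sum int64
		for _, c := range p.containers {
			sum += c.requests[name]
		}
		if sum > podReq {
			errs = append(errs, fmt.Errorf("%s: container requests %d > pod request %d", name, sum, podReq))
		}
	}

	// Rule 2: each container limit must not exceed the pod-level limit.
	for _, c := range p.containers {
		for name, lim := range c.limits {
			if podLim, ok := p.limits[name]; ok && lim > podLim {
				errs = append(errs, fmt.Errorf("%s: limit of %s %d > pod limit %d", name, c.name, lim, podLim))
			}
		}
	}
	return errors.Join(errs...)
}

func main() {
	p := pod{
		requests: resources{"cpu": 1000},
		limits:   resources{"cpu": 2000},
		containers: []container{
			{name: "a", requests: resources{"cpu": 600}, limits: resources{"cpu": 1500}},
			{name: "b", requests: resources{"cpu": 600}},
		},
	}
	// The two containers together request 1200, more than the pod-level 1000.
	fmt.Println(validatePodLevel(p))
}
```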
ndixita a2ddde877c Adding the logic to set default pod-level request as follows:
1. If pod-level limit is set, pod-level request is unset and container-level request is set: derive pod-level request from container-level requests
2. If pod-level limit is set, pod-level request is unset and container-level request is unset: set pod-level request equal to pod-level limit
2024-11-08 03:00:54 +00:00
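The two defaulting cases from a2ddde877c can be sketched for a single resource. The function `defaultPodRequest` and the pointer-based representation are hypothetical. The sketch assumes that "derive" means summing the container-level requests and that "set" means at least one container sets a request; the commit message does not spell out either detail.

```go
package main

import "fmt"

// ptr returns a pointer to v; nil pointers model "unset" values.
func ptr(v int64) *int64 { return &v }

// defaultPodRequest returns the pod-level request for one resource after
// defaulting. It only acts when the pod-level limit is set and the
// pod-level request is unset.
func defaultPodRequest(podRequest, podLimit *int64, containerRequests []*int64) *int64 {
	if podRequest != nil || podLimit == nil {
		return podRequest // nothing to default
	}

	var sum int64
	anySet := false
	for _, r := range containerRequests {
		if r != nil {
			sum += *r
			anySet = true
		}
	}
	if anySet {
		// Case 1: derive the pod-level request from container-level requests.
		return &sum
	}
	// Case 2: no container-level request, so use the pod-level limit.
	v := *podLimit
	return &v
}

func main() {
	limit := ptr(2000)

	derived := defaultPodRequest(nil, limit, []*int64{ptr(300), nil, ptr(500)})
	fmt.Println("case 1:", *derived)

	fromLimit := defaultPodRequest(nil, limit, []*int64{nil, nil})
	fmt.Println("case 2:", *fromLimit)
}
```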
ndixita 85488b5f10 Generated files and compatibility data from API changes 2024-11-08 03:00:50 +00:00
ndixita d7f488b5e3 API changes for Pod Level Resources
1. Add Resources struct to PodSpec struct in both external and internal API packages
2. Adding feature gate and logic for dropping disabled fields for Pod Level Resources
KEP: enhancements/keps/sig-node/2837-pod-level-resource-spec
2024-11-08 02:45:04 +00:00
Jefftree e86c38b249 generated 2024-11-08 02:27:20 +00:00
Jefftree 0ce7b688a6 v1alpha2 LeaseCandidate API 2024-11-08 02:27:19 +00:00
Kubernetes Prow Robot 4cf2818f96
Merge pull request #128240 from LionelJouin/KEP-4817
DRA: Implementation of ResourceClaim.Status.Devices (KEP-4817)
2024-11-08 02:21:24 +00:00
Kubernetes Prow Robot 46b3d9b320
Merge pull request #128186 from sreeram-venkitesh/117767-in-place-pod-vertical-scaling-version-skew
Updated version skew strategy for InPlacePodVerticalScaling
2024-11-08 02:21:14 +00:00
Tim Allclair 8661f743a3 Fix AllocatedResources feature gate annotation 2024-11-07 16:31:25 -08:00
Kubernetes Prow Robot 3300aa1783
Merge pull request #128247 from mattcary/autodelete-ga
Promote StatefulSetAutoDeletePVC to stable in 1.32
2024-11-07 22:20:43 +00:00
Lionel Jouin 118356175d [KEP-4817] Add limits on conditions and IPs + fix documentation
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 22:18:53 +01:00
Sreeram Venkitesh 851dbf25e5 Added unit tests 2024-11-08 01:17:05 +05:30
Kubernetes Prow Robot 9660e5c4cd
Merge pull request #127360 from knight42/feat/split-stdout-stderr-server-side
API: add a new `Stream` field to `PodLogOptions`
2024-11-07 19:44:45 +00:00
Kubernetes Prow Robot 50362ac7d0 Promote StatefulSetAutoDeletePVC to stable for 1.32. 2024-11-07 09:43:49 -08:00
Kubernetes Prow Robot ef37cb503b
Merge pull request #128634 from thockin/remove_PodHostIPs_gate_for_1.32
Remove PodHostIPs feature gates
2024-11-07 13:47:54 +00:00
Lionel Jouin d28b50e0a0 [KEP-4817] make update
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 10:36:09 +01:00
Lionel Jouin 39f55e1cd0 [KEP-4817] Add data length limit (from #128601)
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 10:35:29 +01:00
Lionel Jouin 4b76ba1a87 [KEP-4817] Rename Addresses to IPs
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:59:56 +01:00
Lionel Jouin 43d23b8994 [KEP-4817] Use structured.MakeDeviceID
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:59:56 +01:00
Lionel Jouin 8ab33b8413 [KEP-4817] Improve NetworkData Validation
* Add max length for InterfaceName and HardwareAddress
* Prevent duplicated Addresses

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:59:56 +01:00
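The two checks from 8ab33b8413 can be sketched as below. The struct `networkData`, the limit constants, and the assumption that addresses are written in CIDR notation are all hypothetical; the commit message does not state the actual limits or formats.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Assumed limits for illustration; the commit does not state the values.
const (
	assumedInterfaceNameMaxLen   = 256
	assumedHardwareAddressMaxLen = 128
)

// networkData is a hypothetical stand-in for the API type.
type networkData struct {
	InterfaceName   string
	HardwareAddress string
	Addresses       []string // assumed CIDR notation, e.g. "10.0.0.5/24"
}

// validateNetworkData checks the length limits and rejects duplicates.
func validateNetworkData(d networkData) []error {
	var errs []error
	if len(d.InterfaceName) > assumedInterfaceNameMaxLen {
		errs = append(errs, fmt.Errorf("interfaceName: longer than %d bytes", assumedInterfaceNameMaxLen))
	}
	if len(d.HardwareAddress) > assumedHardwareAddressMaxLen {
		errs = append(errs, fmt.Errorf("hardwareAddress: longer than %d bytes", assumedHardwareAddressMaxLen))
	}

	// Compare parsed values so that duplicates are detected on the parsed
	// address and prefix length rather than on the raw string.
	seen := map[netip.Prefix]bool{}
	for i, s := range d.Addresses {
		p, err := netip.ParsePrefix(s)
		if err != nil {
			errs = append(errs, fmt.Errorf("addresses[%d]: invalid CIDR %q: %w", i, s, err))
			continue
		}
		if seen[p] {
			errs = append(errs, fmt.Errorf("addresses[%d]: duplicate %q", i, s))
		}
		seen[p] = true
	}
	return errs
}

func main() {
	d := networkData{
		InterfaceName: "eth0",
		Addresses:     []string{"10.0.0.5/24", "10.0.0.5/24"},
	}
	for _, err := range validateNetworkData(d) {
		fmt.Println(err)
	}
}
```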
Lionel Jouin a062f91106 [KEP-4817] Fixes based on review
* Rename HWAddress to HardwareAddress
* Fix condition validation
* Remove feature gate validation
* Fix drop field on disabled feature gate

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:59:56 +01:00
Lionel Jouin 5df47a64d3 [KEP-4817] Remove unnecessary DeepCopy in validation tests
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:59:56 +01:00
Lionel Jouin cb9ee1d4fe [KEP-4817] Remove pointer on Data, InterfaceName and HWAddress fields
Adapt validation and tests based on these API changes

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:59:51 +01:00
Lionel Jouin 5d7a16b0a5 [KEP-4817] improve testing
* Test feature-gate enabled/disabled for validation
* Test pkg/registry/resource/resourceclaim
* Add Data and NetworkData to integration test

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:54:19 +01:00
Lionel Jouin 4bd62e5234 [KEP-4817] Fix fuzz API tests and ./hack/update-featuregates.sh
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:54:19 +01:00
Lionel Jouin 3e595db0af [KEP-4817] API, validation and feature-gate
* Add status
* Add validation to check that fields are correct (Network field, device
  has been allocated)
* Add feature-gate
* Drop field if feature-gate not set

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:54:17 +01:00
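The "drop field if feature-gate not set" step from 3e595db0af can be sketched like this. The type `claimStatus` and the function `dropDisabledStatusFields` are hypothetical. The check that keeps a field already present on the old object follows a common Kubernetes pattern and is an assumption here, not something the commit message states.

```go
package main

import "fmt"

// claimStatus is a hypothetical stand-in for the ResourceClaim status.
type claimStatus struct {
	Devices []string // device status entries guarded by the feature gate
}

// dropDisabledStatusFields clears the gated field on the incoming object
// when the gate is off, unless the stored object already uses the field.
func dropDisabledStatusFields(newStatus, oldStatus *claimStatus, gateEnabled bool) {
	if gateEnabled {
		return
	}
	if oldStatus != nil && len(oldStatus.Devices) > 0 {
		return // field already in use; keep it so updates do not lose data
	}
	newStatus.Devices = nil
}

func main() {
	incoming := &claimStatus{Devices: []string{"driver/pool/device-0"}}
	dropDisabledStatusFields(incoming, nil, false)
	fmt.Println("entries after drop:", len(incoming.Devices))
}
```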
Sreeram Venkitesh 8f1e69bbb0 Fix verify-gofmt.sh 2024-11-07 13:28:40 +05:30
Lan Liang 6e5a3cde50
Remove PodHostIPs feature gates.
Signed-off-by: Lan Liang <gcslyp@gmail.com>
2024-11-06 23:10:36 -08:00
Kubernetes Prow Robot 6cc3570466
Merge pull request #128190 from HarshalNeelkamal/external-jwt
Add plugin and key-cache for ExternalJWTSigner integration
2024-11-07 06:29:45 +00:00
Sreeram Venkitesh 1739ee2ba9 Removed duplicated tests after rebase 2024-11-07 11:38:54 +05:30
Sreeram Venkitesh 385d2b198c Fixes from review, updated tests cases 2024-11-07 11:34:58 +05:30
Sreeram Venkitesh 7d1d7182f3 Update function name and remove feature gate check 2024-11-07 11:29:11 +05:30
Sreeram Venkitesh 4dae42a796 Updated version skew strategy for InPlacePodVerticalScaling 2024-11-07 11:29:07 +05:30
Jian Zeng 4193824215
chore: update generated code
Signed-off-by: Jian Zeng <anonymousknight96@gmail.com>
2024-11-07 13:52:16 +08:00
Jian Zeng 0793f6577f
feat: update conversion helpers
Signed-off-by: Jian Zeng <anonymousknight96@gmail.com>
2024-11-07 13:52:01 +08:00