21 Commits

Author SHA1 Message Date
Patrick Ohly f3fef01e79 DRA API: AdminAccess in DeviceRequestAllocationResult
Drivers need to know that because admin access may also grant additional
permissions. The allocator needs to ignore such results when determining which
devices are considered as allocated.

In both cases it is conceptually cleaner to not rely on the content of the
ClaimSpec.
2024-10-29 09:50:07 +01:00
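The allocator behavior described in f3fef01e79 can be sketched roughly as follows. This is a minimal illustration, not the actual allocator code: `DeviceResult` is a hypothetical, simplified stand-in for `DeviceRequestAllocationResult`, and `allocatedDevices` is an invented helper name.

```go
package main

// DeviceResult is a hypothetical, simplified stand-in for one
// DeviceRequestAllocationResult entry. The real API type has more fields.
type DeviceResult struct {
	Driver      string
	Pool        string
	Device      string
	AdminAccess bool // copied into the result so the ClaimSpec need not be consulted
}

// allocatedDevices returns the set of devices that count as allocated.
// Results granted with admin access are skipped, because they do not
// reserve the device for exclusive use.
func allocatedDevices(results []DeviceResult) map[string]bool {
	allocated := make(map[string]bool)
	for _, r := range results {
		if r.AdminAccess {
			continue
		}
		allocated[r.Driver+"/"+r.Pool+"/"+r.Device] = true
	}
	return allocated
}
```

Because the flag lives in the result itself, neither a driver nor the allocator has to look up the matching request in the ClaimSpec.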
Kubernetes Prow Robot 3690cb7f9a
Merge pull request #128101 from pohly/dra-api-cel-cost-limit
DRA API: implement CEL cost limit
2024-10-26 20:18:52 +01:00
Patrick Ohly d53cb79cec DRA cel: enforce runtime limit by default again
As pointed out during code review, the CEL cost estimates are not considered
perfectly reliable. Therefore it is better to also do runtime checks.

Some downstream users might decide to allow CEL expressions to run
longer. Therefore the cost limit is now part of an Options struct.
kube-scheduler uses the default cost limit defined in the resource.k8s.io API,
which is the same cost limit that the apiserver also uses during validation.
2024-10-23 21:24:45 +02:00
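One way to picture the Options struct mentioned in d53cb79cec is sketched below. All names here (`Options`, `CostLimit`, `defaultCostLimit`, `checkRuntimeCost`) and the numeric default are assumptions made for illustration; the commit message does not spell out the real definitions.

```go
package main

import "fmt"

// defaultCostLimit is a hypothetical stand-in for the default defined in the
// resource.k8s.io API. The value is illustrative only.
const defaultCostLimit uint64 = 1000000

// Options is a hypothetical options struct. A nil CostLimit selects the
// default, so most callers can pass the zero value.
type Options struct {
	CostLimit *uint64
}

// effectiveCostLimit resolves the limit that applies for these options.
func (o Options) effectiveCostLimit() uint64 {
	if o.CostLimit != nil {
		return *o.CostLimit
	}
	return defaultCostLimit
}

// checkRuntimeCost returns an error if the cost measured while evaluating an
// expression exceeds the effective limit.
func checkRuntimeCost(actualCost uint64, opts Options) error {
	if limit := opts.effectiveCostLimit(); actualCost > limit {
		return fmt.Errorf("actual cost %d exceeds limit %d", actualCost, limit)
	}
	return nil
}
```

With this shape, a downstream user who wants to allow longer-running expressions sets `CostLimit` explicitly, while callers that pass the zero value get the default.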
Patrick Ohly f548fc2264 DRA API: implement CEL cost limit
The main purpose is to protect against denial-of-service attacks.  Scheduling
time depends a lot on unpredictable factors and expected scheduling time also
varies, so no attempt is made to limit the overall time spent on evaluating CEL
expressions per claim.
2024-10-23 21:24:45 +02:00
Patrick Ohly f84eb5ecf8 DRA: remove "classic DRA"
This removes the DRAControlPlaneController feature gate, the fields controlled
by it (claim.spec.controller, claim.status.deallocationRequested,
claim.status.allocation.controller, class.spec.suitableNodes), the
PodSchedulingContext type, and all code related to the feature.

The feature gets removed because there is no path towards beta and GA and DRA
with "structured parameters" should be able to replace it.
2024-10-16 23:09:50 +02:00
Patrick Ohly 91d7882e86 DRA: new API for 1.31
This is a complete revamp of the original API. Some of the key
differences:
- refocused on structured parameters and allocating devices
- support for constraints across devices
- support for allocating "all" or a fixed amount
  of similar devices in a single request
- no class for ResourceClaims, instead individual
  device requests are associated with a mandatory
  DeviceClass

For the sake of simplicity, optional basic types (ints, strings) where the null
value is the default are represented as values in the API types. This makes Go
code simpler because it doesn't have to check for nil (consumers) and values
can be set directly (producers). The effect is that in protobuf, these fields
always get encoded because `opt` only has an effect for pointers.

The roundtrip test data for v1.29.0 and v1.30.0 changes because of the new
"request" field. This is considered acceptable because the entire `claims`
field in the pod spec is still alpha.

The implementation is complete enough to bring up the apiserver.
Adapting other components follows.
2024-07-22 18:09:34 +02:00
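The trade-off between value and pointer fields described in 91d7882e86 can be illustrated with a small sketch. The types and the `Count` field are hypothetical and only show the difference in calling code, not the real API types.

```go
package main

// requestByValue stores an optional count as a plain value.
// The zero value doubles as the default.
type requestByValue struct {
	Count int64
}

// requestByPointer stores the same optional count as a pointer.
// nil means "unset".
type requestByPointer struct {
	Count *int64
}

// countFromValue needs no nil check: consumers read the field directly.
func countFromValue(r requestByValue) int64 {
	return r.Count
}

// countFromPointer must handle nil before dereferencing.
func countFromPointer(r requestByPointer) int64 {
	if r.Count == nil {
		return 0
	}
	return *r.Count
}
```

Producers see the same difference: `requestByValue{Count: 3}` can be written directly, whereas the pointer variant needs an addressable variable or a helper function.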
Patrick Ohly 8a629b9f15 DRA: remove "sharable" from claim allocation result
Now all claims are shareable up to the limit imposed by the size of the
"reservedFor" array.

This is one of the agreed simplifications for 1.31.
2024-07-21 17:28:14 +02:00
Patrick Ohly de5742ae83 DRA: remove immediate allocation
As agreed in https://github.com/kubernetes/enhancements/pull/4709, immediate
allocation is one of those features which can be removed because it makes no
sense for structured parameters and the justification for classic DRA is weak.
2024-07-21 17:28:14 +02:00
carlory bce0335ea6 DRA: enhance validation for the ResourceClaimParametersReference and ResourceClassParametersReference with the following rules:
1. `apiGroup`: If set, it must be a valid DNS subdomain (e.g. 'example.com').
2. `kind` and `name`: Each must be a valid path segment name. It may not be '.' or '..' and it may not contain '/' or '%' characters.
2024-06-07 17:18:10 +08:00
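The two rules listed in bce0335ea6 can be approximated with the standard library alone. This is a sketch, not the in-tree validation code: the function names are invented, the regular expression is a common rendering of the DNS subdomain format, and whether empty `kind` or `name` values are rejected is left out because the commit message does not say.

```go
package main

import (
	"errors"
	"regexp"
	"strings"
)

// dnsSubdomain matches lowercase DNS labels separated by dots.
var dnsSubdomain = regexp.MustCompile(
	`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)

// validatePathSegment applies rule 2: not "." or "..", no '/' or '%'.
func validatePathSegment(s string) error {
	if s == "." || s == ".." {
		return errors.New("may not be '.' or '..'")
	}
	if strings.ContainsAny(s, "/%") {
		return errors.New("may not contain '/' or '%'")
	}
	return nil
}

// validateReference checks a parameters reference (hypothetical helper).
// It returns one error per violated field.
func validateReference(apiGroup, kind, name string) []error {
	var errs []error
	// Rule 1: apiGroup is optional, but if set it must be a DNS subdomain
	// of at most 253 characters.
	if apiGroup != "" && (len(apiGroup) > 253 || !dnsSubdomain.MatchString(apiGroup)) {
		errs = append(errs, errors.New("apiGroup: must be a valid DNS subdomain"))
	}
	if err := validatePathSegment(kind); err != nil {
		errs = append(errs, errors.New("kind: "+err.Error()))
	}
	if err := validatePathSegment(name); err != nil {
		errs = append(errs, errors.New("name: "+err.Error()))
	}
	return errs
}
```

For example, `validateReference("example.com", "GpuConfig", "large")` returns no errors, while a `kind` of `a/b` or a `name` of `..` is rejected.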
Patrick Ohly a0add8d2c7 dra api: NodeResourceModel -> ResourceModel
When renaming NodeResourceSlice to ResourceSlice, the embedded
[Node]ResourceModel also should have been renamed.
2024-03-14 18:07:36 +01:00
Patrick Ohly 0b6a0d686a dra api: rename NodeResourceSlice -> ResourceSlice
While currently those objects only get published by the kubelet for node-local
resources, this could change once we also support network-attached
resources. Dropping the "Node" prefix enables such a future extension.

The NodeName in ResourceSlice and StructuredResourceHandle then becomes
optional. The kubelet still needs to provide one and it must match its own node
name, otherwise it doesn't have permission to access ResourceSlice objects.
2024-03-07 22:22:55 +01:00
Patrick Ohly d4d5ade7f5 dra: add "named resources" structured parameter model
Like the current device plugin interface, a DRA driver using this model
announces a list of resource instances. In contrast to device plugins, this
list is made available to the scheduler together with attributes that can be
used to select suitable instances when they are not all alike.

Because this is the first structured parameter model, some checks that
previously were not possible, in particular "is one structured parameter field
set", now get enabled. Adding another structured parameter model will be
similar.

The applyconfigs code generator assumes that all types in an API are defined in
a single package. If it weren't for that, it would be possible to place the
"named resources" types in separate packages, which would make their names in
the Go code more natural and provide an indication of their stability level
because the package name could include a version.
2024-03-07 22:21:16 +01:00
Patrick Ohly 39bbcedbca dra api: add structured parameters
NodeResourceSlice will be used by kubelet to publish resource information on
behalf of DRA drivers on the node. NodeName and DriverName in
NodeResourceSlice must be immutable. This simplifies tracking the different
objects because what they are for cannot change after creation.

The new field in ResourceClass tells scheduler and autoscaler that they are
expected to handle allocation.

ResourceClaimParameters and ResourceClassParameters are new types for telling
in-tree components how to handle claims.
2024-03-07 16:15:31 +01:00
Kubernetes Prow Robot ae36991498
Merge pull request #116332 from klueska/extend-resourceclaimstatus
Update resource.AllocationResult with a slice of ResourceHandlers
2023-03-14 19:26:50 -07:00
Kubernetes Prow Robot f315a4669a
Merge pull request #116576 from pohly/dra-core-validation
api: extend validation of dynamic resource allocation fields in PodSpec
2023-03-14 16:34:48 -07:00
Kevin Klues da0b75f8f9 Update validation for recent changes to resource.k8s.io/v1alpha2
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-14 22:34:18 +00:00
Patrick Ohly e97531b349 api: extend validation of dynamic resource allocation fields in PodSpec
The generated ResourceClaim name and the names of the ResourceClaimTemplate and
ResourceClaim referenced by a pod must be valid according to the resource API,
otherwise the pod cannot start.

Checking this was removed from the original implementation out of concerns
about validating fields in core against limitations imposed by a separate,
alpha API.  But as this was pointed out again in
https://github.com/kubernetes/kubernetes/pull/116254#discussion_r1134010324
it gets added back.

The same strings that worked before still work now. In particular, the
constraints for a spec.resourceClaim.name are still the same (DNS label).
2023-03-14 11:58:41 +01:00
Patrick Ohly fec5233668 api: resource.k8s.io PodScheduling -> PodSchedulingContext
The name "PodScheduling" was unusual because in contrast to most other names,
it was impossible to put an article in front of it. Now PodSchedulingContext is
used instead.
2023-03-14 10:18:08 +01:00
Patrick Ohly 508cd60760 dynamic resource allocation: avoid apiserver complaint about list content
This fixes the following warning (error?) in the apiserver:

E0126 18:10:38.665239   16370 fieldmanager.go:210] "[SHOULD NOT HAPPEN] failed to update managedFields" err="failed to convert new object (test/claim-84; resource.k8s.io/v1alpha1, Kind=ResourceClaim) to smd typed: .status.reservedFor: element 0: associative list without keys has an element that's a map type" VersionKind="/, Kind=" namespace="test" name="claim-84"

The root cause is the same as in e50e8a0c919c0e02dc9a0ffaebb685d5348027b4:
nothing in Kubernetes outright complains about a list of items where the item
type is comparable in Go, but not a simple type. This nonetheless isn't
supposed to be done in the API and can cause problems elsewhere.

For the ReservedFor field, everything seems to work okay except for the
warning. However, it's better to follow conventions and use a map. This is
possible in this case because UID is guaranteed to be a unique key.

Validation is now stricter than before, which is a good thing: previously,
two entries with the same UID were allowed as long as some other field was
different, which wasn't a situation that should have been allowed.
2023-01-27 11:33:05 +01:00
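The stricter check described in 508cd60760, where the UID alone is the key, might look roughly like this. `Consumer` is a hypothetical, simplified stand-in for an entry in `reservedFor`, and `duplicateUIDs` is an invented helper, not the actual validation function.

```go
package main

// Consumer is a hypothetical, simplified stand-in for one reservedFor entry.
type Consumer struct {
	Resource string
	Name     string
	UID      string
}

// duplicateUIDs returns every UID that occurs more than once, each reported
// a single time, in the order in which the first repeat is seen. Other
// fields are ignored, so two entries that differ only in Name still collide.
func duplicateUIDs(reservedFor []Consumer) []string {
	count := make(map[string]int, len(reservedFor))
	var dups []string
	for _, c := range reservedFor {
		count[c.UID]++
		if count[c.UID] == 2 {
			dups = append(dups, c.UID)
		}
	}
	return dups
}
```

Under the earlier behavior, the two entries in `[]Consumer{{"pods", "a", "uid-1"}, {"pods", "b", "uid-1"}}` would have been accepted because their names differ; with the UID as the map key they are reported as a duplicate.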
Patrick Ohly 8018ab7cd9 api: fully validate PotentialNodes and SuitableNodes
This is in response to review feedback. Checking for valid node names and the
set property catches programming mistakes in the components that have write
permission.
2022-11-10 20:23:50 +01:00
Patrick Ohly 5cca60f0b8 api: dynamic resource allocation API
This adds a new resource.k8s.io API group with v1alpha1 as version. It contains
four new types: resource.ResourceClaim, resource.ResourceClass, resource.ResourceClaimTemplate, and
resource.PodScheduling.
2022-11-10 20:08:24 +01:00