kubernetes

Commit Graph

Author	SHA1	Message	Date
upodroid	dedd4df0a2	fetch cni plugins from GitHub releases	2024-12-18 19:48:06 +01:00
Paco Xu	59dfb0e779	skip if cri proxy is disabled/undefined	2024-11-19 11:17:07 +08:00
Laura Lorenz	9ab0d81d76	Now that sleep is shorter, only expect to reach 3 within 30s Focused too much on the container restart one in commit that fixed that Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-13 01:39:58 +00:00
Laura Lorenz	59f9858086	Move function specific to container restart test inline Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-12 23:59:30 +00:00
Laura Lorenz	529d5ba9d3	Don't overly indirect image name Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-12 23:34:57 +00:00
Laura Lorenz	8e7b2af712	Use a better util Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-12 23:30:03 +00:00
Laura Lorenz	285d433dea	Clearer image pull test and utils Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-12 23:30:00 +00:00
Laura Lorenz	e03d0f60ef	Orient tests to run faster, but tolerate infra slowdowns up to 5 minutes Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-12 21:48:28 +00:00
Laura Lorenz	d293c5088f	Fix spelling Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-12 21:12:20 +00:00
Laura Lorenz	1da8ca816e	Extract restart number properly Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-12 20:00:11 +00:00
Laura Lorenz	2732d57e33	Missed refactor of container name here Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-12 19:50:11 +00:00
Laura Lorenz	e6059d7386	Fix typecheck and verify Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-12 19:48:38 +00:00
Laura Lorenz	f032068ef7	Focus on restart numbers instead of timing Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-12 07:12:24 +00:00
Laura Lorenz	bad037b505	Formatting Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-12 04:48:10 +00:00
Laura Lorenz	15bae1eadf	Add container restart test too Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-12 04:30:46 +00:00
Laura Lorenz	fc4ac5efeb	Move image pull backoff test to be with other image pull tests Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-12 01:27:44 +00:00
Laura Lorenz	2479d91f2a	Fix test to count pull tries Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-12 01:27:34 +00:00
Laura Lorenz	6ef05dbd01	The idea of how this test should work Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-11 17:55:41 +00:00
Laura Lorenz	6337a28a68	Organize into its own context Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-11 17:55:41 +00:00
Laura Lorenz	f913b7afe8	Adding imagepull backoff test Signed-off-by: Laura Lorenz <lauralorenz@google.com>	2024-11-11 17:55:41 +00:00
Kubernetes Prow Robot	1dd81aa1c9	Merge pull request #126653 from zhifei92/fix-podstatus fix the issue of losing the pending phase after a node restart.	2024-11-07 21:06:54 +00:00
Kubernetes Prow Robot	ef37cb503b	Merge pull request #128634 from thockin/remove_PodHostIPs_gate_for_1.32 Remove PodHostIPs feature gates	2024-11-07 13:47:54 +00:00
zhifei92	bed96b4eb6	fix: fix the issue of losing the pending phase after a node restart.	2024-11-07 21:10:11 +08:00
Lan Liang	6e5a3cde50	Remove PodHostIPs feature gates. Signed-off-by: Lan Liang <gcslyp@gmail.com>	2024-11-06 23:10:36 -08:00
Kubernetes Prow Robot	6cc3570466	Merge pull request #128190 from HarshalNeelkamal/external-jwt Add plugin and key-cache for ExternalJWTSigner integration	2024-11-07 06:29:45 +00:00
Kubernetes Prow Robot	c462d4c8e5	Merge pull request #126096 from utam0k/support-disabling-oom-group-kill kubelet: new kubelet config option for disabling group oom kill	2024-11-07 06:29:36 +00:00
Harshal Neelkamal	6fdacf0411	Add plugin and key-cache for ExternalJWTSigner integration	2024-11-07 03:16:23 +00:00
utam0k	4f909c14a0	kubelet: new kubelet config option for disabling group oom kill Signed-off-by: utam0k <k0ma@utam0k.jp>	2024-11-07 12:03:04 +09:00
Kubernetes Prow Robot	48c65d1870	Merge pull request #128576 from bart0sh/PR166-refactor-kubelet-stop-and-restart e2e_node: refactor Kubelet stopping and restarting	2024-11-06 20:10:40 +00:00
Patrick Ohly	33ea278c51	DRA: use v1beta1 API No code is left which depends on the v1alpha3, except of course the code implementing that version.	2024-11-06 13:03:19 +01:00
Ed Bartosh	3aa95dafea	e2e_node: refactor stopping and restarting kubelet Moved Kubelet health checks from test cases to the stopKubelet API. This should make the API cleaner and easier to use.	2024-11-06 11:34:48 +02:00
Kubernetes Prow Robot	98b4ee6bfa	Merge pull request #126525 from dshebib/addSidecarE2EImgTest Restart sidecar container when the image has changed	2024-11-06 00:35:35 +00:00
Kubernetes Prow Robot	f64eeb523d	Merge pull request #128096 from bart0sh/PR161-e2e_node-consolidate-NFSServer-APIs e2e_node: consolidated NFSServer APIs.	2024-11-05 00:33:35 +00:00
Abhijit Hoskeri	d86debe500	e2e_node: Pass e2eCriProxy instead of updating global. e2eCriProxy is defined in a _test.go and referenced in a non-test file. This confuses gopls. It's also clearer to future readers.	2024-11-02 17:40:49 -07:00
Kubernetes Prow Robot	453efd7a4b	Merge pull request #121604 from pacoxu/image-pull-e2e [node-e2e] add test cases for serialize and parallel image pulling	2024-10-31 08:01:26 +00:00
Paco Xu	82df7a7d82	use cri proxy injector for parallel pulling image tests	2024-10-31 14:50:50 +08:00
Kubernetes Prow Robot	daef8c2419	Merge pull request #127266 from pohly/dra-admin-access-in-status DRA API: AdminAccess in DeviceRequestAllocationResult + DRAAdminAccess feature gate	2024-10-30 03:41:25 +00:00
Kubernetes Prow Robot	5fcef4f79d	Merge pull request #128422 from bart0sh/PR163-density-e2e_node-adjust-limits density test: adjust CPU and memory limits	2024-10-30 02:37:31 +00:00
Kubernetes Prow Robot	a339a36a36	Merge pull request #127506 from ffromani/cpu-pool-size-metrics node: metrics: add metrics about cpu pool sizes	2024-10-30 00:17:24 +00:00
Ed Bartosh	04f7a86001	density test: adjust CPU and memory limits Adjusted limits based on recent job log: I1028 20:05:42.079182 1002 resource_usage_test.go:199] Resource usage: container cpu(cores) memory_working_set(MB) memory_rss(MB) "kubelet" 0.024 22.17 14.20 "runtime" 0.041 409.70 84.21 I1028 20:05:42.079274 1002 resource_usage_test.go:206] CPU usage of containers: container 50th% 90th% 95th% 99th% 100th% "/" N/A N/A N/A N/A N/A "runtime" 0.014 0.834 0.834 0.834 1.083 "kubelet" 0.023 0.093 0.093 0.093 0.164 Increasing 95th percentile for runtime CPU usage should also make pull-kubernetes-node-kubelet-containerd-flaky less flaky.	2024-10-30 00:48:56 +02:00
Patrick Ohly	f3fef01e79	DRA API: AdminAccess in DeviceRequestAllocationResult Drivers need to know that because admin access may also grant additional permissions. The allocator needs to ignore such results when determining which devices are considered as allocated. In both cases it is conceptually cleaner to not rely on the content of the ClaimSpec.	2024-10-29 09:50:07 +01:00
Kubernetes Prow Robot	685b8b3ba1	Merge pull request #126981 from kannon92/stable-empty-dir-promotion KEP-1967: promote size backed memory volumes to stable	2024-10-29 01:00:54 +00:00
Kubernetes Prow Robot	1d8828ce70	Merge pull request #128091 from saschagrunert/cni-plugins Update cni-plugins to v1.6.0	2024-10-27 03:01:06 +00:00
Francesco Romani	14ec0edd10	node: metrics: add metrics about cpu pool sizes Add metrics about the sizing of the cpu pools. Currently the cpumanager maintains 2 cpu pools: - shared pool: this is where all pods with non-exclusive cpu allocation run - exclusive pool: this is the union of the set of exclusive cpus allocated to containers, if any (requires static policy in use). By reporting the size of the pools, the users (humans or machines) can get better insights and more feedback about how the resources actually allocated to the workload and how the node resources are used.	2024-10-24 15:35:51 +02:00
Kubernetes Prow Robot	8c7160205d	Merge pull request #127922 from PiotrProkop/topology-manager-policy-options-e2e add e2e tests for prefer-closest-numa-nodes TopologyManagerPolicyOption	2024-10-24 14:17:03 +01:00
PiotrProkop	a6eb3281cc	add e2e tests for prefer-closest-numa-nodes TopologyManagerPolicyOption suboptimal allocation Signed-off-by: PiotrProkop <pprokop@nvidia.com>	2024-10-24 11:45:39 +02:00
Ed Bartosh	2ac5dfe379	e2e_node: check container metrics conditionally When PodAndContainerStatsFromCRI FG is enabled, Kubelet tries to get list of metrics from the CRI runtime using CRI API 'ListMetricDescriptors'. As this API is not implemented in neither CRI-O nor Containerd versions used in the test-infra, ResourceMetrics test case fails to gather certain container metrics. Excluding container metrics from the expected list of metrics if PodAndContainerStatsFromCRI is enabled should solve the issue.	2024-10-23 21:08:36 +03:00
Kubernetes Prow Robot	c6669ea7d6	Merge pull request #127155 from ffromani/alignment-metrics node: metrics: add resource alignment metrics	2024-10-23 09:54:58 +01:00
Francesco Romani	c025861e0c	node: metrics: add resource alignment metrics In order to improve the observability of the resource management in kubelet, cpu allocation and NUMA alignment, we add more metrics to report if resource alignment is in effect. The more precise reporting would probably be using pod status, but this would require more invasive and riskier changes, and possibly extra interactions to the APIServer. We start adding metrics to report if containers got their compute resources aligned. If metrics are growing, the assingment is working as expected; If metrics stay consistent, perhaps at zero, no resource alignment is done. Extra fixes brought by this work - retroactively add labels for existing tests - running metrics test demands precision accounting to avoid flakes; ensure the node state is restored pristine between each test, to minimize the aforementioned risk of flakes. - The test pod command line was wrong, with this the pod could not reach Running state. That gone unnoticed so far because no test using this utility function actually needed a pod in running state. Signed-off-by: Francesco Romani <fromani@redhat.com>	2024-10-23 08:05:38 +02:00
Davanum Srinivas	abbc5ad346	Copy limited pieces of code we use from runc's apparmor and utils packages Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2024-10-22 09:56:22 -04:00

2990 Commits