Run Deployment
This page shows how to use Kueue’s scheduling and resource management capabilities when running Deployments. Although Kueue does not yet support managing a Deployment as a single Workload, it can still schedule and manage resources for the individual Pods of the Deployment.
We demonstrate how to support scheduling Deployments in Kueue based on the Plain Pod integration, where every Pod from a Deployment is represented as a single independent Plain Pod. This approach allows independent resource management for the Pods, and thus scale-out and scale-in of the Deployment.
This guide is for serving users that have a basic understanding of Kueue. For more information, see Kueue’s overview.
Before you begin
- Learn how to install Kueue with a custom manager configuration.
- Follow steps in Run Plain Pods to learn how to enable the v1/pod integration and how to configure it using the podOptions field.
- Check Administer cluster quotas for details on the initial Kueue setup.
Running a Deployment admitted by Kueue
When running Deployments on Kueue, consider the following aspects:
a. Queue selection
The target local queue should be specified in the spec.template.metadata.labels section of the Deployment configuration.
Since Kueue’s scheduling and resource management will be applied to the individual Pods of the Deployment,
the queue name should be specified at the Pod level.
spec:
  template:
    metadata:
      labels:
        kueue.x-k8s.io/queue-name: user-queue
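The label value must name a LocalQueue that exists in the same namespace as the Deployment. As a minimal sketch, assuming the setup from Administer cluster quotas, such a queue could look like the following. The namespace `default` and the ClusterQueue name `cluster-queue` are assumptions for illustration, not values taken from this page.

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: default          # assumed: must match the Deployment's namespace
  name: user-queue            # referenced by the kueue.x-k8s.io/queue-name label
spec:
  clusterQueue: cluster-queue # assumed name of an existing ClusterQueue
```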
b. Configure the resource needs
The resource needs of the workload can be configured in spec.template.spec.containers.
- resources:
    requests:
      cpu: 3
c. Scaling
You can scale a Deployment out (more replicas) or in (fewer replicas).
On scale-in, the excess Pods are deleted, and the quota is freed.
On scale-out, new Pods are created, and remain suspended until their corresponding workloads get admitted.
If there is not enough quota in your cluster, the Deployment might run only a subset of Pods.
So, if your workloads are business-critical, consider reserving quota for the serving workloads with the ClusterQueue lendingLimit field.
Because lendingLimit caps how much of a ClusterQueue’s quota other ClusterQueues can borrow, the reserved portion stays available and lets you rapidly scale out the critical serving workload.
For more details on lendingLimit, see the ClusterQueue page.
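To illustrate the reservation, here is a minimal sketch of a ClusterQueue that lends out only part of its CPU quota. The queue, cohort, and flavor names and the quota numbers are hypothetical; the sketch assumes the ClusterQueue belongs to a cohort, since lendingLimit only has an effect when quota can be shared.

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: serving-cluster-queue   # hypothetical name
spec:
  cohort: shared-cohort         # hypothetical cohort; lending happens within a cohort
  namespaceSelector: {}         # match all namespaces
  resourceGroups:
  - coveredResources: ["cpu"]
    flavors:
    - name: default-flavor      # assumes a ResourceFlavor with this name exists
      resources:
      - name: cpu
        nominalQuota: 12        # total CPU quota of this ClusterQueue
        lendingLimit: 4         # at most 4 CPUs can be lent to other ClusterQueues
```

With these numbers, 8 of the 12 CPUs are never lent out, so Pods created on scale-out can be admitted against that portion without waiting for borrowed quota to be returned.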
d. Limitations
- The scope for Deployments is implied by the pod integration’s namespace selector. There is no independent control for Deployments.
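To make this limitation concrete, the following is a minimal sketch of the manager configuration fragment that sets the Pod integration's namespace selector. It assumes the field layout described in the Run Plain Pods guide (`integrations.frameworks` and `integrations.podOptions.namespaceSelector`). The framework name and the excluded namespaces are illustrative and may differ in your Kueue version.

```yaml
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
integrations:
  frameworks:
  - "pod"   # assumed framework name; check the Run Plain Pods guide for your version
  podOptions:
    # Pods, including those created by Deployments, are only managed by Kueue
    # in namespaces matched by this selector.
    namespaceSelector:
      matchExpressions:
      - key: kubernetes.io/metadata.name
        operator: NotIn
        values: [kube-system, kueue-system]
```

Under this sketch, a Deployment in `kube-system` would not be managed by Kueue even if its Pods carry the queue-name label, because the selector applies to all Pods alike.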
Example
Here is a sample Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
        # Set on the Pod template so that each Pod is queued by Kueue.
        kueue.x-k8s.io/queue-name: user-queue
    spec:
      containers:
      - name: nginx
        image: registry.k8s.io/nginx-slim:0.27
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: "100m"
You can create the Deployment using the following command:
kubectl create -f sample-deployment.yaml