Resource Quotas and Limit Ranges are common ways to limit the number of pods (or the resources used by pods) in Kubernetes clusters. However, when using Jobs for big-data or machine-learning pipelines, it can also be worth considering the rate at which pods are created, especially if jobs are short-lived and there’s a concern that the control plane might be overwhelmed.
The first line of defence should be configuring the API server flags --max-requests-inflight and --max-mutating-requests-inflight, followed by configuring API Priority and Fairness, which allows fine-grained classes of requests to be deprioritised (and ultimately rate limited) relative to other requests. Finally, the alpha Event Rate Limit admission controller can put a ceiling on the number of Event requests per second sent to the API server, for example per namespace.
Thinking about a final line of defence, I decided to explore implementing an admission webhook that would be configured (through a ValidatingWebhookConfiguration) to intercept all pod creation requests and enforce a rate limit.
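import (
	"net/http"
	"time"

	"github.com/gin-gonic/gin"
	"golang.org/x/time/rate"
	admissionv1 "k8s.io/api/admission/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// One token every 10 seconds, with a burst of 1.
var limiter = rate.NewLimiter(rate.Every(10*time.Second), 1)

func validatingHandler(c *gin.Context) {
	var review admissionv1.AdmissionReview
	// Bind aborts the request with a 400 if the body cannot be parsed.
	if err := c.Bind(&review); err != nil {
		return
	}
	// Guard against a review without a request to avoid a nil dereference.
	if review.Request == nil {
		c.AbortWithStatus(http.StatusBadRequest)
		return
	}

	allowed := limiter.Allow()
	var status, msg string
	if allowed {
		status = metav1.StatusSuccess
	} else {
		status = metav1.StatusFailure
		msg = "rate limit exceeded"
	}

	review.Response = &admissionv1.AdmissionResponse{
		UID:     review.Request.UID,
		Allowed: allowed,
		Result: &metav1.Status{
			Status:  status,
			Message: msg,
		},
	}
	c.JSON(http.StatusOK, review)
}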
Using golang.org/x/time/rate, we keep a limiter that allows one request every 10 seconds, with a burst of one. If the limiter allows the request, the response is marked as allowed with StatusSuccess; otherwise it is marked as denied with StatusFailure, which prevents the pod from being created.
The webhook configuration defines a rule that narrows the scope to pod creation only, with a ‘fail open’ failure policy (failurePolicy: Ignore), so pods are still admitted if the webhook cannot be reached:
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: k8slimiter-pod-creation
  annotations:
    cert-manager.io/inject-ca-from: k8slimiter/k8slimiter-certificate
webhooks:
  - name: k8slimiter-pod-creation.k8slimiter.svc
    admissionReviewVersions:
      - v1
    clientConfig:
      service:
        name: k8slimiter-service
        namespace: k8slimiter
        path: "/validate"
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
    failurePolicy: Ignore
    sideEffects: None
With those in place, creating pods in quick succession leads to the expected rate limiting behaviour:
$ kubectl run "tmp-pod-$(date +%s)" --restart Never --image debian:12-slim -- sleep 1
pod/tmp-pod-1698005111 created
$ kubectl run "tmp-pod-$(date +%s)" --restart Never --image debian:12-slim -- sleep 1
Error from server: admission webhook "k8slimiter-pod-creation.k8slimiter.svc" denied the request: rate limit exceeded
A full working example can be found at arturhoo/k8slimiter, which leverages Gin and cert-manager to achieve a minimal and straightforward admission webhook setup.