K8s Dynamic Admission Controllers in Go: A Production Deep Dive
Beyond RBAC: Enforcing Custom Governance with Admission Controllers
In any mature Kubernetes environment, the limitations of declarative, role-based access control (RBAC) and Pod Security Standards (PSS) become apparent. While essential for authentication and baseline workload security, they cannot enforce arbitrary, business-specific logic. Consider these common platform engineering requirements:
- Ingress resources must contain a kubernetes.io/ingress.class annotation.
- Every Deployment must have a team label for cost allocation and ownership tracking.
- No Pod can be created using an image with the :latest tag.
- PersistentVolumeClaim resources must request storage from a specific StorageClass.
Attempting to enforce these rules through CI pipeline linting is a partial solution, but it fails to prevent direct kubectl actions or GitOps drift. The canonical Kubernetes-native solution for enforcing such policies is the Dynamic Admission Controller.
This article is not an introduction. It assumes you understand the basic concept of admission controllers. We will dive directly into building, deploying, and managing a production-ready ValidatingAdmissionWebhook in Go, focusing on the operational realities of high-availability, security, and maintainability.
Mutating vs. Validating: A Deliberate Choice
Kubernetes offers two types of admission webhooks:
- MutatingAdmissionWebhook: Intercepts an object before it's persisted and can modify it. Useful for injecting sidecars, setting default annotations, or modifying security contexts.
- ValidatingAdmissionWebhook: Intercepts an object after any mutations and decides whether to admit or reject it based on its final state. It cannot modify the object.
For policy enforcement, ValidatingAdmissionWebhook is almost always the correct choice. It provides a clear, binary pass/fail decision, making its behavior predictable and easy to reason about. Mutating objects to enforce policy can lead to surprising side effects and complex interactions between multiple webhooks. Our focus will be exclusively on building a robust validator.
The AdmissionReview API Contract
At its core, a dynamic admission webhook is a simple HTTPS server that understands a specific JSON payload: the AdmissionReview object. The API server sends a POST request with an AdmissionReview body, and the webhook must respond with an AdmissionReview body.
Let's dissect the crucial components.
The Request (admission.k8s.io/v1.AdmissionReview)
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "request": {
    "uid": "a834d28e-db1a-428a-b4f0-dec395e59e19",
    "kind": { "group": "apps", "version": "v1", "kind": "Deployment" },
    "resource": { "group": "apps", "version": "v1", "resource": "deployments" },
    "userInfo": {
      "username": "system:serviceaccount:default:my-app",
      "groups": ["system:serviceaccounts", "system:authenticated"]
    },
    "object": {
      "apiVersion": "apps/v1",
      "kind": "Deployment",
      "metadata": { ... },
      "spec": { ... }
    },
    "oldObject": null, // Populated on UPDATE and DELETE
    "operation": "CREATE"
  }
}
Key fields for a validator:
- request.uid: A unique identifier for this specific request. Your response must include this exact UID.
- request.kind: The Group-Version-Kind (GVK) of the object being reviewed.
- request.object: A runtime.RawExtension containing the full JSON representation of the submitted object. This is what you need to deserialize and validate.
- request.userInfo: Crucial for policies based on the identity of the actor performing the operation.
- request.operation: CREATE, UPDATE, DELETE, or CONNECT. Your logic may differ based on the operation.
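To make these fields concrete, here is a minimal sketch of the kind of early triage a validator might run on the decoded request, using the admission/v1 Go types we'll import later. The DELETE short-circuit and the system:masters exemption are illustrative assumptions, not part of the policy built below.

package main

import (
    admissionv1 "k8s.io/api/admission/v1"
)

// triage decides whether a request needs policy evaluation at all.
// Returning true means "allow without validating".
func triage(req *admissionv1.AdmissionRequest) bool {
    // request.operation: policies about desired state rarely apply to DELETE,
    // where request.object is null and only oldObject is populated.
    if req.Operation == admissionv1.Delete {
        return true
    }
    // request.userInfo: exempt trusted actors, e.g. members of system:masters.
    for _, group := range req.UserInfo.Groups {
        if group == "system:masters" {
            return true
        }
    }
    return false
}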
The Response (admission.k8s.io/v1.AdmissionReview)
Your webhook's response must mirror the request structure, populating the response field.
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "a834d28e-db1a-428a-b4f0-dec395e59e19", // Must match the request UID
    "allowed": false,
    "status": {
      "code": 403, // Or another appropriate HTTP status code
      "message": "Deployment must include a 'team' label."
    }
  }
}
- response.uid: Copied directly from request.uid.
- response.allowed: The boolean result of your validation.
- response.status: If allowed is false, this provides a human-readable message and a status code that will be relayed to the user via kubectl.
Building the Go Webhook Server
Let's implement a webhook that enforces our policy: All Deployment and StatefulSet resources must have a team label.
Project Setup
mkdir label-validator
cd label-validator
go mod init github.com/your-org/label-validator
go get k8s.io/api
go get k8s.io/apimachinery
The Core HTTP Handler
This is the heart of the webhook. It deserializes the request, applies logic, and serializes the response. We'll use the standard net/http library.
main.go:
package main

import (
    "encoding/json"
    "fmt"
    "io"
    "net/http"

    admissionv1 "k8s.io/api/admission/v1"
    appsv1 "k8s.io/api/apps/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/apimachinery/pkg/runtime/serializer"
    "k8s.io/apimachinery/pkg/types"
)

var (
    universalDeserializer = serializer.NewCodecFactory(runtime.NewScheme()).UniversalDeserializer()
)
// admissionResponse is a helper that wraps an AdmissionResponse in an AdmissionReview.
func admissionResponse(uid types.UID, allowed bool, message string) *admissionv1.AdmissionReview {
    statusCode := int32(200)
    if !allowed {
        statusCode = 403 // Forbidden
    }
    return &admissionv1.AdmissionReview{
        TypeMeta: metav1.TypeMeta{
            APIVersion: "admission.k8s.io/v1",
            Kind:       "AdmissionReview",
        },
        Response: &admissionv1.AdmissionResponse{
            UID:     uid,
            Allowed: allowed,
            Result: &metav1.Status{
                Code:    statusCode,
                Message: message,
            },
        },
    }
}
// validateTeamLabel is the core validation logic.
func validateTeamLabel(ar admissionv1.AdmissionReview) *admissionv1.AdmissionReview {
    // The request object is a raw JSON blob. We need to decode it.
    raw := ar.Request.Object.Raw
    var labels map[string]string
    var kind string
    switch ar.Request.Kind.Kind {
    case "Deployment":
        var deployment appsv1.Deployment
        if _, _, err := universalDeserializer.Decode(raw, nil, &deployment); err != nil {
            return admissionResponse(ar.Request.UID, false, fmt.Sprintf("could not deserialize deployment: %v", err))
        }
        labels = deployment.ObjectMeta.Labels
        kind = "Deployment"
    case "StatefulSet":
        var statefulSet appsv1.StatefulSet
        if _, _, err := universalDeserializer.Decode(raw, nil, &statefulSet); err != nil {
            return admissionResponse(ar.Request.UID, false, fmt.Sprintf("could not deserialize statefulset: %v", err))
        }
        labels = statefulSet.ObjectMeta.Labels
        kind = "StatefulSet"
    default:
        // This should not happen if the ValidatingWebhookConfiguration is configured correctly.
        return admissionResponse(ar.Request.UID, true, "") // Allow other resources
    }
    if labels == nil {
        return admissionResponse(ar.Request.UID, false, fmt.Sprintf("%s is missing labels entirely", kind))
    }
    if _, ok := labels["team"]; !ok {
        return admissionResponse(ar.Request.UID, false, fmt.Sprintf("%s must have a 'team' label", kind))
    }
    return admissionResponse(ar.Request.UID, true, "")
}
// handleValidation is the main HTTP handler.
func handleValidation(w http.ResponseWriter, r *http.Request) {
    body, err := io.ReadAll(r.Body)
    if err != nil {
        w.WriteHeader(http.StatusBadRequest)
        fmt.Fprintf(w, "could not read request body: %v", err)
        return
    }
    var admissionReviewReq admissionv1.AdmissionReview
    if _, _, err := universalDeserializer.Decode(body, nil, &admissionReviewReq); err != nil {
        w.WriteHeader(http.StatusBadRequest)
        fmt.Fprintf(w, "could not deserialize request: %v", err)
        return
    }
    if admissionReviewReq.Request == nil {
        w.WriteHeader(http.StatusBadRequest)
        fmt.Fprintf(w, "malformed admission review: request is nil")
        return
    }
    admissionReviewResp := validateTeamLabel(admissionReviewReq)
    respBytes, err := json.Marshal(admissionReviewResp)
    if err != nil {
        w.WriteHeader(http.StatusInternalServerError)
        fmt.Fprintf(w, "could not marshal response: %v", err)
        return
    }
    w.Header().Set("Content-Type", "application/json")
    w.Write(respBytes)
}
func main() {
    // Note: In a real deployment, you would get the TLS cert and key paths from flags or env vars.
    certPath := "/etc/webhook/certs/tls.crt"
    keyPath := "/etc/webhook/certs/tls.key"
    http.HandleFunc("/validate", handleValidation)
    fmt.Println("Server starting on port 8443...")
    if err := http.ListenAndServeTLS(":8443", certPath, keyPath, nil); err != nil {
        panic(fmt.Sprintf("failed to start server: %v", err))
    }
}
This code is complete and handles the full request/response lifecycle. Note the use of universalDeserializer to correctly parse the Kubernetes object from the raw JSON, a common point of failure in naive implementations.
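Before adding TLS and cluster manifests, the handler can be sanity-checked locally with net/http/httptest; because handleValidation is a plain http.HandlerFunc, no server needs to be started. A minimal sketch, assuming it lives in the same package as the code above:

package main

import (
    "bytes"
    "encoding/json"
    "net/http/httptest"
    "testing"

    admissionv1 "k8s.io/api/admission/v1"
)

func TestHandleValidationRejectsMissingLabel(t *testing.T) {
    // An AdmissionReview for a Deployment with no labels at all.
    body := []byte(`{
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "request": {
            "uid": "test-uid",
            "kind": {"group": "apps", "version": "v1", "kind": "Deployment"},
            "operation": "CREATE",
            "object": {"apiVersion": "apps/v1", "kind": "Deployment", "metadata": {"name": "web"}}
        }
    }`)

    req := httptest.NewRequest("POST", "/validate", bytes.NewReader(body))
    rec := httptest.NewRecorder()
    handleValidation(rec, req)

    var review admissionv1.AdmissionReview
    if err := json.Unmarshal(rec.Body.Bytes(), &review); err != nil {
        t.Fatalf("could not unmarshal response: %v", err)
    }
    if review.Response.Allowed {
        t.Fatal("expected request to be rejected")
    }
    if review.Response.UID != "test-uid" {
        t.Errorf("response UID %q does not match request UID", review.Response.UID)
    }
}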
Production TLS: The `cert-manager` Pattern
The Kubernetes API server must communicate with your webhook over HTTPS, and it must trust the webhook's certificate. Managing these TLS certificates is the single biggest operational hurdle.
While you can use openssl to generate a self-signed CA and server certs, this is a brittle, manual process. The production standard is to use cert-manager to automate certificate provisioning and rotation.
How it Works:
1. Install cert-manager in your cluster.
2. Create an Issuer or ClusterIssuer resource that defines how to obtain certificates (e.g., from a self-signed CA, Let's Encrypt, or Vault).
3. cert-manager runs a CA-injector component that watches for ValidatingWebhookConfiguration and MutatingWebhookConfiguration resources.
4. Annotate your webhook configuration with cert-manager.io/inject-ca-from, and cert-manager will automatically provision a certificate, store it in a Secret, and inject the CA bundle into the configuration.
Implementation Steps:
1. Install cert-manager:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.12.0/cert-manager.yaml
2. Create a Self-Signed ClusterIssuer:
For internal webhooks, a self-signed CA managed by cert-manager is a secure and robust pattern.
issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: self-signed-issuer
spec:
  selfSigned: {}
kubectl apply -f issuer.yaml
3. Create a Certificate Resource:
This resource tells cert-manager to issue a certificate for our webhook's Service and store it in a Secret.
certificate.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: label-validator-cert
  namespace: default # Or your webhook's namespace
spec:
  # The secret name where the certificate will be stored
  secretName: label-validator-tls
  # The DNS name must match the Kubernetes Service FQDN
  # <service-name>.<namespace>.svc
  dnsNames:
    - label-validator-svc.default.svc
  # Reference our self-signed issuer
  issuerRef:
    name: self-signed-issuer
    kind: ClusterIssuer
kubectl apply -f certificate.yaml
After a few moments, cert-manager will create a secret named label-validator-tls containing tls.crt, tls.key, and ca.crt.
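One operational detail: cert-manager renews the certificate before it expires and updates the Secret, and the kubelet syncs the new files into the mounted volume. The sketch below is a variant of the main function from earlier that re-reads the key pair on each TLS handshake, so rotations take effect without a pod restart; for high request rates you would cache the parsed certificate and reload only when the files change.

package main

import (
    "crypto/tls"
    "net/http"
)

func main() {
    http.HandleFunc("/validate", handleValidation) // handler from main.go above

    server := &http.Server{
        Addr: ":8443",
        TLSConfig: &tls.Config{
            // Re-read the mounted secret on every handshake so that
            // cert-manager rotations take effect without a restart.
            GetCertificate: func(_ *tls.ClientHelloInfo) (*tls.Certificate, error) {
                cert, err := tls.LoadX509KeyPair(
                    "/etc/webhook/certs/tls.crt",
                    "/etc/webhook/certs/tls.key",
                )
                if err != nil {
                    return nil, err
                }
                return &cert, nil
            },
        },
    }
    // With GetCertificate set, the certificate file arguments may be empty.
    if err := server.ListenAndServeTLS("", ""); err != nil {
        panic(err)
    }
}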
Deployment to Kubernetes
Now we tie everything together with Kubernetes manifests.
1. Dockerfile (Multi-stage build):
# Build stage
FROM golang:1.20-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o webhook .
# Final stage
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /app
COPY --from=builder /app/webhook .
# The webhook binary will be run by a non-root user for security
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
CMD ["./webhook"]
2. Deployment and Service:
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: label-validator
  labels:
    app: label-validator
spec:
  replicas: 2 # Always run at least 2 for HA
  selector:
    matchLabels:
      app: label-validator
  template:
    metadata:
      labels:
        app: label-validator
    spec:
      containers:
        - name: webhook
          image: your-registry/label-validator:v1.0.0
          ports:
            - containerPort: 8443
              name: webhook-tls
          volumeMounts:
            - name: tls-certs
              mountPath: /etc/webhook/certs
              readOnly: true
      volumes:
        - name: tls-certs
          secret:
            secretName: label-validator-tls # Mount the secret created by cert-manager
---
apiVersion: v1
kind: Service
metadata:
  name: label-validator-svc
spec:
  selector:
    app: label-validator
  ports:
    - port: 443
      targetPort: webhook-tls
3. The ValidatingWebhookConfiguration:
This is the final, crucial piece that registers our webhook with the API server.
webhook-config.yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: label-validator-webhook
  annotations:
    # This annotation tells cert-manager to inject the CA bundle from our Certificate resource
    cert-manager.io/inject-ca-from: "default/label-validator-cert"
webhooks:
  - name: label-validator.your-domain.com
    clientConfig:
      # The caBundle will be populated by cert-manager.
      # caBundle: LS0t....
      service:
        name: label-validator-svc
        namespace: default
        path: "/validate"
        port: 443
    rules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments", "statefulsets"]
        scope: "*"
    # CRITICAL: Failure policy
    failurePolicy: Fail
    # Set sideEffects to None because our webhook has no side effects on other components.
    sideEffects: None
    # The admission review versions the webhook supports.
    admissionReviewVersions: ["v1"]
Deploy these manifests, and your webhook is live.
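To verify the policy end to end, a short client-go sketch like the following (assuming a kubeconfig at the default location and the manifests above applied) should see its create call rejected with the webhook's message:

package main

import (
    "context"
    "fmt"

    appsv1 "k8s.io/api/apps/v1"
    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        panic(err)
    }
    client := kubernetes.NewForConfigOrDie(config)

    // A Deployment with no "team" label; the webhook should reject it.
    deploy := &appsv1.Deployment{
        ObjectMeta: metav1.ObjectMeta{Name: "no-team-label"},
        Spec: appsv1.DeploymentSpec{
            Selector: &metav1.LabelSelector{MatchLabels: map[string]string{"app": "demo"}},
            Template: corev1.PodTemplateSpec{
                ObjectMeta: metav1.ObjectMeta{Labels: map[string]string{"app": "demo"}},
                Spec: corev1.PodSpec{
                    Containers: []corev1.Container{{Name: "demo", Image: "nginx:1.25"}},
                },
            },
        },
    }
    _, err = client.AppsV1().Deployments("default").Create(context.TODO(), deploy, metav1.CreateOptions{})
    fmt.Println("expected rejection:", err)
}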
Advanced Considerations & Edge Cases
A naive implementation stops here. A production system requires deeper thinking.
The `failurePolicy` Dilemma: `Fail` vs. `Ignore`
- failurePolicy: Fail: If the API server cannot reach your webhook (due to network issues, a webhook crash, or a timeout), the admission request is rejected. This guarantees your policies are always enforced, but it also means webhook downtime can block cluster operations (e.g., kube-system pods can't be updated, deployments fail). This is a high-availability risk.
- failurePolicy: Ignore: If the webhook is unreachable, the API server will allow the request to proceed. This ensures cluster stability but means your security policies can be bypassed during an outage. This is a security risk.
Production Strategy:
- Use Fail for security-critical webhooks. The risk of bypassing a security control is usually greater than the risk of temporary API unavailability.
- Monitor the webhook aggressively. The API server's admission metrics (apiserver_admission_webhook_admission_duration_seconds) are invaluable here, and you can instrument the webhook itself; see the sketch after this list.
- Use a namespaceSelector to exclude critical namespaces like kube-system from your webhook if its logic is not relevant to them. This prevents your webhook from blocking core cluster components:
# In ValidatingWebhookConfiguration
...
namespaceSelector:
  matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values: ["kube-system", "cert-manager"]
Performance and Timeouts
Your webhook is in the synchronous, critical path of API requests. It must be fast.
- The API server enforces a per-webhook timeout (timeoutSeconds, 10 seconds by default, and often configured lower). Your webhook must respond well within this window; see the server sketch after this list.
- Avoid slow external lookups in the request path. If a policy depends on external data, cache it locally, for example in memory or in a ConfigMap that the webhook can read from quickly.
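One concrete lever is configuring explicit timeouts on the server so a stalled connection can never hold a request near the API server's deadline. The sketch below is a variant of the earlier main function; the specific values are illustrative:

package main

import (
    "fmt"
    "net/http"
    "time"
)

func main() {
    mux := http.NewServeMux()
    mux.HandleFunc("/validate", handleValidation) // handler from main.go above

    server := &http.Server{
        Addr:    ":8443",
        Handler: mux,
        // Keep every phase of the exchange well inside the API server's
        // admission deadline (10s by default, often configured lower).
        ReadTimeout:  3 * time.Second,
        WriteTimeout: 5 * time.Second,
        IdleTimeout:  30 * time.Second,
    }
    if err := server.ListenAndServeTLS("/etc/webhook/certs/tls.crt", "/etc/webhook/certs/tls.key"); err != nil {
        panic(fmt.Sprintf("failed to start server: %v", err))
    }
}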
Testing Strategies
Never test a new admission controller on a live cluster. The blast radius is the entire cluster.
- Unit tests: Test the core validateTeamLabel function directly. You can construct AdmissionReview structs in your test code and assert the AdmissionResponse is correct. This is fast and easy; a sketch follows this list.
- Integration tests (envtest): The controller-runtime project (used by Kubebuilder and Operator-SDK) provides a library called envtest. It spins up a real, temporary kube-apiserver and etcd on your local machine. You can run your webhook server, configure it against this temporary API server, and use a real Kubernetes client to create/update objects and verify they are correctly allowed or denied. This provides high-fidelity testing without needing a full cluster.
- Staged end-to-end tests: Deploy a ValidatingWebhookConfiguration that uses a namespaceSelector to target only a specific test namespace. This contains the webhook's impact. Run a suite of kubectl commands or client-go scripts against that namespace to verify end-to-end behavior.
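As an illustration of the first layer, a table-driven test for validateTeamLabel might look like this sketch (it assumes the main.go shown earlier is in the same package):

package main

import (
    "testing"

    admissionv1 "k8s.io/api/admission/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime"
)

func TestValidateTeamLabel(t *testing.T) {
    cases := []struct {
        name    string
        object  string
        allowed bool
    }{
        {
            name:    "deployment with team label is allowed",
            object:  `{"apiVersion":"apps/v1","kind":"Deployment","metadata":{"name":"web","labels":{"team":"payments"}}}`,
            allowed: true,
        },
        {
            name:    "deployment without team label is rejected",
            object:  `{"apiVersion":"apps/v1","kind":"Deployment","metadata":{"name":"web"}}`,
            allowed: false,
        },
    }
    for _, tc := range cases {
        t.Run(tc.name, func(t *testing.T) {
            ar := admissionv1.AdmissionReview{
                Request: &admissionv1.AdmissionRequest{
                    UID:    "test-uid",
                    Kind:   metav1.GroupVersionKind{Group: "apps", Version: "v1", Kind: "Deployment"},
                    Object: runtime.RawExtension{Raw: []byte(tc.object)},
                },
            }
            resp := validateTeamLabel(ar)
            if resp.Response.Allowed != tc.allowed {
                t.Errorf("got allowed=%v, want %v: %s", resp.Response.Allowed, tc.allowed, resp.Response.Result.Message)
            }
        })
    }
}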
Conclusion
Dynamic Admission Controllers are a powerful tool for platform engineers to enforce custom governance and security policies that go far beyond Kubernetes' built-in primitives. While the core concept is a simple webhook, a production-grade implementation requires careful consideration of the entire lifecycle: robust and automated TLS management, a deliberate high-availability strategy centered on the failurePolicy, rigorous multi-layered testing, and a constant focus on performance. By adopting patterns like cert-manager for certificate automation and envtest for integration testing, you can build resilient, secure, and maintainable admission controllers that become a cornerstone of your cluster governance strategy.