K8s Dynamic Admission Controllers in Go: A Production Deep Dive


Beyond RBAC: Enforcing Custom Governance with Admission Controllers

In any mature Kubernetes environment, the limitations of declarative, role-based access control (RBAC) and Pod Security Standards (PSS) become apparent. While essential for authentication and baseline workload security, they cannot enforce arbitrary, business-specific logic. Consider these common platform engineering requirements:

  • All Ingress resources must contain a kubernetes.io/ingress.class annotation.
  • Every Deployment must have a team label for cost allocation and ownership tracking.
  • No Pod can be created using an image with the :latest tag.
  • All PersistentVolumeClaim resources must request storage from a specific StorageClass.

Attempting to enforce these rules through CI pipeline linting is a partial solution, but it fails to prevent direct kubectl actions or GitOps drift. The canonical Kubernetes-native solution for enforcing such policies is the Dynamic Admission Controller.

    This article is not an introduction. It assumes you understand the basic concept of admission controllers. We will dive directly into building, deploying, and managing a production-ready ValidatingAdmissionWebhook in Go, focusing on the operational realities of high-availability, security, and maintainability.

    Mutating vs. Validating: A Deliberate Choice

    Kubernetes offers two types of admission webhooks:

  • MutatingAdmissionWebhook: Intercepts an object before it's persisted and can modify it. Useful for injecting sidecars, setting default annotations, or modifying security contexts.
  • ValidatingAdmissionWebhook: Intercepts an object after any mutations and decides whether to admit or reject it based on its final state. It cannot modify the object.

For policy enforcement, ValidatingAdmissionWebhook is almost always the correct choice. It provides a clear, binary pass/fail decision, making its behavior predictable and easy to reason about. Mutating objects to enforce policy can lead to surprising side effects and complex interactions between multiple webhooks. Our focus will be exclusively on building a robust validator.
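
For contrast, a mutating webhook signals its changes back to the API server as a JSONPatch in the AdmissionResponse. A minimal, illustrative sketch in Go (this article's webhook never mutates; admissionv1 is k8s.io/api/admission/v1, and requestUID is a stand-in for the request's UID):

go
// Illustrative sketch of a mutating response: add a default 'team' label
// via JSONPatch. The API server applies the patch before persisting.
// Real code must first handle the case where /metadata/labels is absent.
patch := []byte(`[{"op": "add", "path": "/metadata/labels/team", "value": "platform"}]`)
patchType := admissionv1.PatchTypeJSONPatch
response := &admissionv1.AdmissionResponse{
	UID:       requestUID, // must echo the request UID
	Allowed:   true,
	Patch:     patch,
	PatchType: &patchType,
}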

    The AdmissionReview API Contract

    At its core, a dynamic admission webhook is a simple HTTPS server that understands a specific JSON payload: the AdmissionReview object. The API server sends a POST request with an AdmissionReview body, and the webhook must respond with an AdmissionReview body.

    Let's dissect the crucial components.

    The Request (admission.k8s.io/v1.AdmissionReview)

    json
    {
      "apiVersion": "admission.k8s.io/v1",
      "kind": "AdmissionReview",
      "request": {
        "uid": "a834d28e-db1a-428a-b4f0-dec395e59e19",
        "kind": { "group": "apps", "version": "v1", "kind": "Deployment" },
        "resource": { "group": "apps", "version": "v1", "resource": "deployments" },
        "userInfo": {
          "username": "system:serviceaccount:default:my-app",
          "groups": ["system:serviceaccounts", "system:authenticated"]
        },
        "object": {
          "apiVersion": "apps/v1",
          "kind": "Deployment",
          "metadata": { ... },
          "spec": { ... }
        },
        "oldObject": null, // Populated on UPDATE and DELETE
        "operation": "CREATE"
      }
    }

    Key fields for a validator:

  • request.uid: A unique identifier for this specific request. Your response must include this exact UID.
  • request.kind: The Group-Version-Kind (GVK) of the object being reviewed.
  • request.object: A runtime.RawExtension containing the full JSON representation of the submitted object. This is what you need to deserialize and validate.
  • request.userInfo: Crucial for policies based on the identity of the actor performing the operation.
  • request.operation: CREATE, UPDATE, DELETE, or CONNECT. Your logic may differ based on the operation.
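
On UPDATE, oldObject lets you compare the prior state with the desired state. A minimal sketch, using the deserializer and response helper defined later in this article:

go
// Sketch: on UPDATE, decode oldObject to compare prior state with the
// incoming object (e.g., to forbid removing an existing 'team' label).
if ar.Request.Operation == admissionv1.Update {
	var oldDeployment appsv1.Deployment
	if _, _, err := universalDeserializer.Decode(ar.Request.OldObject.Raw, nil, &oldDeployment); err != nil {
		return admissionResponse(ar.Request.UID, false, fmt.Sprintf("could not deserialize old object: %v", err))
	}
	// Compare oldDeployment.Labels with the new object's labels here.
}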

The Response (admission.k8s.io/v1.AdmissionReview)

    Your webhook's response must mirror the request structure, populating the response field.

    json
    {
      "apiVersion": "admission.k8s.io/v1",
      "kind": "AdmissionReview",
      "response": {
        "uid": "a834d28e-db1a-428a-b4f0-dec395e59e19", // Must match the request UID
        "allowed": false,
        "status": {
          "code": 403, // Or another appropriate HTTP status code
          "message": "Deployment must include a 'team' label."
        }
      }
    }
  • response.uid: Copied directly from request.uid.
  • response.allowed: The boolean result of your validation.
  • response.status: If allowed is false, this provides a human-readable message and a status code that will be relayed to the user via kubectl.

Building the Go Webhook Server

    Let's implement a webhook that enforces our policy: All Deployment and StatefulSet resources must have a team label.

    Project Setup

    bash
    mkdir label-validator
    cd label-validator
    go mod init github.com/your-org/label-validator
go get k8s.io/api        # pin both modules to versions matching your cluster if needed
go get k8s.io/apimachinery

    The Core HTTP Handler

    This is the heart of the webhook. It deserializes the request, applies logic, and serializes the response. We'll use the standard net/http library.

    main.go:

    go
    package main
    
import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"

	admissionv1 "k8s.io/api/admission/v1"
	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/serializer"
	"k8s.io/apimachinery/pkg/types"
)

var (
	// An empty scheme is fine here: the deserializer falls back to plain
	// JSON decoding for types that are not registered, which is all we
	// need when decoding into the concrete structs below.
	universalDeserializer = serializer.NewCodecFactory(runtime.NewScheme()).UniversalDeserializer()
)
    
    // admissionResponse is a helper to create an AdmissionResponse.
    func admissionResponse(uid types.UID, allowed bool, message string) *admissionv1.AdmissionReview {
    	statusCode := int32(200)
    	if !allowed {
    		statusCode = 403 // Forbidden
    	}
    	return &admissionv1.AdmissionReview{
    		TypeMeta: metav1.TypeMeta{
    			APIVersion: "admission.k8s.io/v1",
    			Kind:       "AdmissionReview",
    		},
    		Response: &admissionv1.AdmissionResponse{
    			UID:     uid,
    			Allowed: allowed,
    			Result: &metav1.Status{
    				Code:    statusCode,
    				Message: message,
    			},
    		},
    	}
    }
    
    // validateTeamLabel is the core validation logic.
    func validateTeamLabel(ar admissionv1.AdmissionReview) *admissionv1.AdmissionReview {
    	// The request object is a raw JSON blob. We need to decode it.
    	raw := ar.Request.Object.Raw
    	var labels map[string]string
    	var kind string
    
    	switch ar.Request.Kind.Kind {
    	case "Deployment":
    		var deployment appsv1.Deployment
    		if _, _, err := universalDeserializer.Decode(raw, nil, &deployment); err != nil {
    			return admissionResponse(ar.Request.UID, false, fmt.Sprintf("could not deserialize deployment: %v", err))
    		}
    		labels = deployment.ObjectMeta.Labels
    		kind = "Deployment"
    	case "StatefulSet":
    		var statefulSet appsv1.StatefulSet
    		if _, _, err := universalDeserializer.Decode(raw, nil, &statefulSet); err != nil {
    			return admissionResponse(ar.Request.UID, false, fmt.Sprintf("could not deserialize statefulset: %v", err))
    		}
    		labels = statefulSet.ObjectMeta.Labels
    		kind = "StatefulSet"
    	default:
    		// This should not happen if the ValidatingWebhookConfiguration is configured correctly.
    		return admissionResponse(ar.Request.UID, true, "") // Allow other resources
    	}
    
    	if labels == nil {
    		return admissionResponse(ar.Request.UID, false, fmt.Sprintf("%s is missing labels entirely", kind))
    	}
    
    	if _, ok := labels["team"]; !ok {
    		return admissionResponse(ar.Request.UID, false, fmt.Sprintf("%s must have a 'team' label", kind))
    	}
    
    	return admissionResponse(ar.Request.UID, true, "")
    }
    
    // handleValidation is the main HTTP handler.
    func handleValidation(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
    	if err != nil {
    		w.WriteHeader(http.StatusBadRequest)
    		fmt.Fprintf(w, "could not read request body: %v", err)
    		return
    	}
    
    	var admissionReviewReq admissionv1.AdmissionReview
    	if _, _, err := universalDeserializer.Decode(body, nil, &admissionReviewReq); err != nil {
    		w.WriteHeader(http.StatusBadRequest)
    		fmt.Fprintf(w, "could not deserialize request: %v", err)
    		return
    	}
    
    	if admissionReviewReq.Request == nil {
    		w.WriteHeader(http.StatusBadRequest)
    		fmt.Fprintf(w, "malformed admission review: request is nil")
    		return
    	}
    
    	admissionReviewResp := validateTeamLabel(admissionReviewReq)
    
    	respBytes, err := json.Marshal(admissionReviewResp)
    	if err != nil {
    		w.WriteHeader(http.StatusInternalServerError)
    		fmt.Fprintf(w, "could not marshal response: %v", err)
    		return
    	}
    
    	w.Header().Set("Content-Type", "application/json")
    	w.Write(respBytes)
    }
    
    func main() {
    	// Note: In a real deployment, you would get the TLS cert and key paths from flags or env vars.
    	certPath := "/etc/webhook/certs/tls.crt"
    	keyPath := "/etc/webhook/certs/tls.key"
    
    	http.HandleFunc("/validate", handleValidation)
    	fmt.Println("Server starting on port 8443...")
    	if err := http.ListenAndServeTLS(":8443", certPath, keyPath, nil); err != nil {
    		panic(fmt.Sprintf("failed to start server: %v", err))
    	}
    }

    This code is complete and handles the full request/response lifecycle. Note the use of universalDeserializer to correctly parse the Kubernetes object from the raw JSON, a common point of failure in naive implementations.
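
One hardening step worth making to main (a sketch; note the extra time import): replace the package-level ListenAndServeTLS with an http.Server carrying explicit timeouts, so a slow or stalled client can never pin a connection indefinitely.

go
// Sketch: an http.Server with explicit timeouts. certPath and keyPath
// are the same values used in main above.
mux := http.NewServeMux()
mux.HandleFunc("/validate", handleValidation)

srv := &http.Server{
	Addr:         ":8443",
	Handler:      mux,
	ReadTimeout:  5 * time.Second,  // time to read the request, including body
	WriteTimeout: 5 * time.Second,  // time to write the response
	IdleTimeout:  30 * time.Second, // keep-alive connection lifetime
}
if err := srv.ListenAndServeTLS(certPath, keyPath); err != nil {
	panic(fmt.Sprintf("failed to start server: %v", err))
}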

    Production TLS: The `cert-manager` Pattern

    The Kubernetes API server must communicate with your webhook over HTTPS, and it must trust the webhook's certificate. Managing these TLS certificates is the single biggest operational hurdle.

    While you can use openssl to generate a self-signed CA and server certs, this is a brittle, manual process. The production standard is to use cert-manager to automate certificate provisioning and rotation.

    How it Works:

  1. You install cert-manager in your cluster.
  2. You create an Issuer or ClusterIssuer resource that defines how to obtain certificates (e.g., from a self-signed CA, Let's Encrypt, or Vault).
  3. You deploy your webhook along with a Certificate resource; cert-manager provisions the certificate and stores it in a Secret.
  4. You annotate your webhook configuration with cert-manager.io/inject-ca-from; cert-manager's CA injector, which watches ValidatingWebhookConfiguration and MutatingWebhookConfiguration resources, then injects the CA bundle into the configuration automatically.

Implementation Steps:

    1. Install cert-manager:

    bash
    kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.12.0/cert-manager.yaml

    2. Create a Self-Signed ClusterIssuer:

    For internal webhooks, a self-signed CA managed by cert-manager is a secure and robust pattern.

    issuer.yaml

    yaml
    apiVersion: cert-manager.io/v1
    kind: ClusterIssuer
    metadata:
      name: self-signed-issuer
    spec:
      selfSigned: {}

    kubectl apply -f issuer.yaml

    3. Create a Certificate Resource:

    This resource tells cert-manager to issue a certificate for our webhook's Service and store it in a Secret.

    certificate.yaml

    yaml
    apiVersion: cert-manager.io/v1
    kind: Certificate
    metadata:
      name: label-validator-cert
      namespace: default # Or your webhook's namespace
    spec:
      # The secret name where the certificate will be stored
      secretName: label-validator-tls
      
      # The DNS name must match the Kubernetes Service FQDN
      # <service-name>.<namespace>.svc
      dnsNames:
      - label-validator-svc.default.svc
    
      # Reference our self-signed issuer
      issuerRef:
        name: self-signed-issuer
        kind: ClusterIssuer

    kubectl apply -f certificate.yaml

    After a few moments, cert-manager will create a secret named label-validator-tls containing tls.crt, tls.key, and ca.crt.
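
Before wiring anything else up, confirm that issuance succeeded; the Certificate's Ready condition should be True:

kubectl describe certificate label-validator-cert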

    Deployment to Kubernetes

    Now we tie everything together with Kubernetes manifests.

    1. Dockerfile (Multi-stage build):

    dockerfile
    # Build stage
    FROM golang:1.20-alpine AS builder
    WORKDIR /app
    COPY go.mod go.sum ./
    RUN go mod download
    COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o webhook .
    
    # Final stage
    FROM alpine:latest
    RUN apk --no-cache add ca-certificates
    WORKDIR /app
    COPY --from=builder /app/webhook .
    
    # The webhook binary will be run by a non-root user for security
    RUN addgroup -S appgroup && adduser -S appuser -G appgroup
    USER appuser
    
    CMD ["./webhook"]

    2. Deployment and Service:

    deployment.yaml

    yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: label-validator
      labels:
        app: label-validator
    spec:
      replicas: 2 # Always run at least 2 for HA
      selector:
        matchLabels:
          app: label-validator
      template:
        metadata:
          labels:
            app: label-validator
        spec:
          containers:
          - name: webhook
            image: your-registry/label-validator:v1.0.0
            ports:
            - containerPort: 8443
              name: webhook-tls
            volumeMounts:
            - name: tls-certs
              mountPath: /etc/webhook/certs
              readOnly: true
          volumes:
          - name: tls-certs
            secret:
              secretName: label-validator-tls # Mount the secret created by cert-manager
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: label-validator-svc
    spec:
      selector:
        app: label-validator
      ports:
      - port: 443
        targetPort: webhook-tls

    3. The ValidatingWebhookConfiguration:

    This is the final, crucial piece that registers our webhook with the API server.

    webhook-config.yaml

    yaml
    apiVersion: admissionregistration.k8s.io/v1
    kind: ValidatingWebhookConfiguration
    metadata:
      name: label-validator-webhook
      annotations:
        # This annotation tells cert-manager to inject the CA bundle from our secret
        cert-manager.io/inject-ca-from: "default/label-validator-cert"
    webhooks:
    - name: label-validator.your-domain.com
      clientConfig:
        # The caBundle will be populated by cert-manager.
        # caBundle: LS0t....
        service:
          name: label-validator-svc
          namespace: default
          path: "/validate"
          port: 443
      rules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments", "statefulsets"]
        scope: "*"
      # CRITICAL: Failure policy
      failurePolicy: Fail
      # Set sideEffects to None because our webhook has no side effects on other components.
      sideEffects: None
      # The admission review versions the webhook supports.
      admissionReviewVersions: ["v1"]

    Deploy these manifests, and your webhook is live.
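
Verify the reject path by hand before trusting it. The command below creates a Deployment with no team label; kubectl should relay a denial indicating that admission webhook "label-validator.your-domain.com" denied the request, along with the message from our handler (the image tag here is illustrative):

bash
# Expected to fail: kubectl create deployment does not set a 'team' label.
kubectl create deployment nginx --image=nginx:1.25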

    Advanced Considerations & Edge Cases

    A naive implementation stops here. A production system requires deeper thinking.

    The `failurePolicy` Dilemma: `Fail` vs. `Ignore`

  • failurePolicy: Fail: If the API server cannot reach your webhook (due to network issues, webhook crash, timeout), the admission request will be rejected. This guarantees your policies are always enforced, but it also means webhook downtime can block cluster operations (e.g., kube-system pods can't be updated, deployments fail). This is a high-availability risk.
  • failurePolicy: Ignore: If the webhook is unreachable, the API server will allow the request to proceed. This ensures cluster stability but means your security policies can be bypassed during an outage. This is a security risk.

Production Strategy:

  • Always use Fail for security-critical webhooks. The risk of bypassing a security control is usually greater than the risk of temporary API unavailability.
  • Ensure High Availability. Run at least 2-3 replicas of your webhook pod, spread across different nodes using pod anti-affinity (a sketch follows the namespaceSelector example below).
  • Monitor Aggressively. Set up alerts for webhook latency and error rates. The API server metrics (apiserver_admission_webhook_admission_duration_seconds) are invaluable here.
  • Use a namespaceSelector to exclude critical namespaces like kube-system from your webhook if its logic is not relevant to them. This prevents your webhook from blocking core cluster components.

yaml
    # In ValidatingWebhookConfiguration
    ... 
      namespaceSelector:
        matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values: ["kube-system", "cert-manager"]

    Performance and Timeouts

    Your webhook is in the synchronous, critical path of API requests. It must be fast.

  • The API server has a configurable per-webhook timeout (timeoutSeconds, default 10 seconds, capped at 30, and often set lower). Your webhook must respond well within this window.
  • Avoid external calls. Do not call other APIs, databases, or services from within your webhook handler. The latency and failure modes are too unpredictable. If you need external data, use a caching controller that populates a local data store or ConfigMap that the webhook can read from quickly, as sketched after this list.
  • Benchmark your logic. A simple label check is trivial. A policy that involves complex regex or data structure traversal could add milliseconds. At scale, this matters.
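
A minimal sketch of that caching pattern, assuming the external data (say, the set of valid team names) is projected into the pod as a file from a ConfigMap volume; imports of os, strings, sync, and time are assumed, and the path and interval are illustrative:

go
// teamCache holds policy data locally so the admission handler never
// performs I/O on the request path.
type teamCache struct {
	mu    sync.RWMutex
	teams map[string]struct{}
}

// refresh re-reads the projected file and atomically swaps the cached set.
func (c *teamCache) refresh(path string) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	next := make(map[string]struct{})
	for _, t := range strings.Fields(string(data)) {
		next[t] = struct{}{}
	}
	c.mu.Lock()
	c.teams = next
	c.mu.Unlock()
	return nil
}

// valid is the only method the HTTP handler calls; it takes a read lock.
func (c *teamCache) valid(team string) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	_, ok := c.teams[team]
	return ok
}

// run keeps the cache warm from a background goroutine.
func (c *teamCache) run(path string, interval time.Duration) {
	for {
		if err := c.refresh(path); err != nil {
			// Log and keep serving the last-known-good data.
		}
		time.Sleep(interval)
	}
}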

Testing Strategies

    Never test a new admission controller on a live cluster. The blast radius is the entire cluster.

  • Unit Tests (Go): Write standard Go tests for your validateTeamLabel function. You can construct AdmissionReview structs in your test code and assert the AdmissionResponse is correct. This is fast and easy; a sketch follows this list.
  • Integration Tests (envtest): The controller-runtime project (used by Kubebuilder and Operator-SDK) provides a library called envtest. It spins up a real, temporary kube-apiserver and etcd on your local machine. You can run your webhook server, configure it against this temporary API server, and use a real Kubernetes client to create/update objects and verify they are correctly allowed or denied. This provides high-fidelity testing without needing a full cluster.
  • E2E Testing (Staging Cluster): In a staging cluster, deploy your webhook with a ValidatingWebhookConfiguration that uses a namespaceSelector to target only a specific test namespace. This contains the webhook's impact. Run a suite of kubectl commands or client-go scripts against that namespace to verify end-to-end behavior.
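
As promised, a sketch of the unit-test approach, exercising validateTeamLabel's reject path directly (a hypothetical main_test.go alongside the code shown earlier):

go
package main

import (
	"encoding/json"
	"testing"

	admissionv1 "k8s.io/api/admission/v1"
	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
)

func TestValidateTeamLabelRejectsMissingLabel(t *testing.T) {
	// A Deployment with no labels at all should be rejected.
	deployment := appsv1.Deployment{
		TypeMeta:   metav1.TypeMeta{APIVersion: "apps/v1", Kind: "Deployment"},
		ObjectMeta: metav1.ObjectMeta{Name: "no-team"},
	}
	raw, err := json.Marshal(deployment)
	if err != nil {
		t.Fatal(err)
	}

	ar := admissionv1.AdmissionReview{
		Request: &admissionv1.AdmissionRequest{
			UID:    "test-uid",
			Kind:   metav1.GroupVersionKind{Group: "apps", Version: "v1", Kind: "Deployment"},
			Object: runtime.RawExtension{Raw: raw},
		},
	}

	resp := validateTeamLabel(ar)
	if resp.Response.Allowed {
		t.Fatal("expected Deployment without 'team' label to be rejected")
	}
}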

Conclusion

    Dynamic Admission Controllers are a powerful tool for platform engineers to enforce custom governance and security policies that go far beyond Kubernetes' built-in primitives. While the core concept is a simple webhook, a production-grade implementation requires careful consideration of the entire lifecycle: robust and automated TLS management, a deliberate high-availability strategy centered on the failurePolicy, rigorous multi-layered testing, and a constant focus on performance. By adopting patterns like cert-manager for certificate automation and envtest for integration testing, you can build resilient, secure, and maintainable admission controllers that become a cornerstone of your cluster governance strategy.
