Kubernetes Finalizers: Advanced Patterns for Stateful Teardown

Goh Ling Yong

The Deletion Fallacy: Why Standard Garbage Collection Fails External Resources

As a seasoned Kubernetes engineer, you understand the power of the declarative model. You define a desired state in a Custom Resource (CR), and your operator's reconciliation loop makes it a reality. But what happens when that reality extends beyond the cluster's API server? Consider an operator managing S3Bucket custom resources. When a developer executes kubectl delete s3bucket my-production-bucket, Kubernetes dutifully removes the S3Bucket object from etcd. The problem? The actual S3 bucket in AWS remains, now an orphaned, potentially costly resource.

This is the core challenge that standard Kubernetes garbage collection, primarily designed around OwnerReferences for in-cluster objects, cannot solve. The controller manager has no inherent knowledge of the external world. Deleting the CR object is a fire-and-forget operation from its perspective.
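For contrast, here is roughly what in-cluster garbage collection relies on: an ownerReference on a dependent object (the names and uid below are illustrative). Deleting the owner cascades to the dependent, but only because both objects live in the API server — nothing comparable exists for an S3 bucket:

```yaml
# Illustrative dependent object: when the owning ReplicaSet is deleted,
# the garbage collector deletes this Pod too. This mechanism cannot
# reach resources that live outside the cluster.
apiVersion: v1
kind: Pod
metadata:
  name: web-7d4b9
  ownerReferences:
  - apiVersion: apps/v1
    kind: ReplicaSet
    name: web-7d4b9c
    uid: d9607e19-f88f-11e6-a518-42010a800195
    controller: true
    blockOwnerDeletion: true
```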

This is where Finalizers become a non-negotiable component of any production-grade operator managing external state. A finalizer is simply a string key added to an object's metadata.finalizers list. Its presence acts as a lock, signaling to the Kubernetes API server: "Do not fully delete this object yet. A controller is performing pre-delete cleanup tasks."

When a user requests deletion of an object with a finalizer, the API server doesn't immediately remove it. Instead, it populates the object's metadata.deletionTimestamp field. This is the signal for our operator. The object now exists in a read-only, terminating state. Our reconciliation loop detects this timestamp, executes the necessary external cleanup logic (e.g., deleting the S3 bucket via the AWS API), and only upon successful completion, removes its finalizer string from the list. Once the finalizers list is empty, Kubernetes completes the object's deletion.
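For illustration, this is roughly what a terminating object's manifest looks like while the finalizer is still in place (names and timestamp are made up):

```yaml
# Sketch of a terminating object: deletionTimestamp has been set by the
# API server, but the finalizer keeps the object around until the
# operator finishes cleanup and removes it.
apiVersion: cloud.my.domain/v1alpha1
kind: S3Bucket
metadata:
  name: my-production-bucket
  namespace: default
  deletionTimestamp: "2024-01-15T10:30:00Z"  # set by the API server
  finalizers:
  - cloud.my.domain/finalizer                # removed by the operator after cleanup
spec:
  bucketName: my-production-bucket
  region: us-east-1
```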

This article moves beyond this high-level concept and dives into the intricate implementation details, edge cases, and performance considerations you'll face when building robust, stateful operators in Go with Kubebuilder.


Part 1: A Canonical Finalizer Implementation in Go

Let's build the core logic for an S3Bucket controller. We'll assume a standard Kubebuilder project setup. The heart of our logic resides within the Reconcile function.

Our CRD spec might look like this:

yaml
# config/crd/bases/cloud.my.domain_s3buckets.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: s3buckets.cloud.my.domain
spec:
  group: cloud.my.domain
  names:
    kind: S3Bucket
    listKind: S3BucketList
    plural: s3buckets
    singular: s3bucket
  scope: Namespaced
  versions:
  - name: v1alpha1
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              bucketName:
                type: string
              region:
                type: string
              isPublic:
                type: boolean
            required:
            - bucketName
            - region

The core of the controller logic is a two-pronged approach within the Reconcile method, keyed off the presence of the DeletionTimestamp.

go
// controllers/s3bucket_controller.go

import (
    // ... other imports
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
    "sigs.k8s.io/controller-runtime/pkg/log"

    cloudv1alpha1 "github.com/my-org/s3-operator/api/v1alpha1"
)

// A constant for our finalizer name. This is convention and makes the code cleaner.
const s3BucketFinalizer = "cloud.my.domain/finalizer"

func (r *S3BucketReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    logger := log.FromContext(ctx)

    // 1. Fetch the S3Bucket instance
    s3Bucket := &cloudv1alpha1.S3Bucket{}
    if err := r.Get(ctx, req.NamespacedName, s3Bucket); err != nil {
        if client.IgnoreNotFound(err) != nil {
            logger.Error(err, "unable to fetch S3Bucket")
            return ctrl.Result{}, err
        }
        logger.Info("S3Bucket resource not found. Ignoring since object must be deleted")
        return ctrl.Result{}, nil
    }

    // 2. The core finalizer logic branch
    if s3Bucket.ObjectMeta.DeletionTimestamp.IsZero() {
        // The object is NOT being deleted, so we add our finalizer if it doesn't exist.
        if !controllerutil.ContainsFinalizer(s3Bucket, s3BucketFinalizer) {
            logger.Info("Adding finalizer to S3Bucket")
            controllerutil.AddFinalizer(s3Bucket, s3BucketFinalizer)
            if err := r.Update(ctx, s3Bucket); err != nil {
                return ctrl.Result{}, err
            }
        }

        // This is where your normal reconciliation logic goes.
        // e.g., check if the S3 bucket exists, if not, create it.
        // We'll stub this out for now.
        if err := r.ensureS3BucketExists(ctx, s3Bucket); err != nil {
            // Update status, etc.
            return ctrl.Result{}, err
        }

    } else {
        // The object IS being deleted
        if controllerutil.ContainsFinalizer(s3Bucket, s3BucketFinalizer) {
            logger.Info("Performing cleanup for S3Bucket")
            // Our actual finalizer logic
            if err := r.handleFinalizer(ctx, s3Bucket); err != nil {
                // If the cleanup fails, we don't remove the finalizer.
                // This will cause the reconciliation to be re-queued, and we'll try again.
                logger.Error(err, "failed to handle finalizer")
                return ctrl.Result{}, err
            }

            // Cleanup was successful, remove the finalizer
            logger.Info("Removing finalizer from S3Bucket after successful cleanup")
            controllerutil.RemoveFinalizer(s3Bucket, s3BucketFinalizer)
            if err := r.Update(ctx, s3Bucket); err != nil {
                return ctrl.Result{}, err
            }
        }
    }

    return ctrl.Result{}, nil
}

// handleFinalizer contains the actual logic to clean up the external resource.
func (r *S3BucketReconciler) handleFinalizer(ctx context.Context, s3Bucket *cloudv1alpha1.S3Bucket) error {
    logger := log.FromContext(ctx)

    // This is a placeholder for your AWS SDK call
    logger.Info("Attempting to delete external S3 bucket", "bucketName", s3Bucket.Spec.BucketName)

    // Use the AWS SDK to check if the bucket exists and delete it.
    // This logic MUST be idempotent.
    exists, err := r.S3Client.BucketExists(ctx, s3Bucket.Spec.BucketName)
    if err != nil {
        return fmt.Errorf("failed to check if S3 bucket exists: %w", err)
    }

    if exists {
        if err := r.S3Client.DeleteBucket(ctx, s3Bucket.Spec.BucketName); err != nil {
            // Important: Return an error here to trigger a requeue.
            return fmt.Errorf("failed to delete S3 bucket: %w", err)
        }
        logger.Info("Successfully deleted external S3 bucket")
    } else {
        logger.Info("External S3 bucket already deleted, nothing to do.")
    }
    
    return nil
}

// ensureS3BucketExists is the placeholder for the normal reconciliation logic.
func (r *S3BucketReconciler) ensureS3BucketExists(ctx context.Context, s3Bucket *cloudv1alpha1.S3Bucket) error {
    // Placeholder: check if bucket exists, if not create it.
    // ...
    return nil
}

Analysis of the Pattern

  • Finalizer Name: Using a domain-qualified constant like cloud.my.domain/finalizer prevents collisions with other controllers that might be operating on the same object.
  • DeletionTimestamp.IsZero(): This is the canonical way to check if an object is undergoing deletion. If it's zero, the object is alive. If it's non-zero, deletion has been initiated.
  • controllerutil Helpers: The controller-runtime library provides ContainsFinalizer, AddFinalizer, and RemoveFinalizer. These helpers simplify the otherwise tedious slice manipulation of the ObjectMeta.Finalizers field.
  • Atomicity of Updates: Notice the flow: add finalizer, then Update(). Reconcile. Do work. Remove finalizer, then Update(). Each change to the finalizer list must be persisted to the API server. If the Update() call fails after adding the finalizer, the next reconciliation will simply re-verify that it's present and continue.
  • Error Handling is Key: In the else block, if handleFinalizer() returns an error, we do not remove the finalizer. We return the error to controller-runtime, which will requeue the request. The object will remain in a Terminating state until our cleanup logic succeeds.

Part 2: Advanced Edge Cases and Production Hardening

    The simple implementation above works for the happy path. Production environments are anything but. Let's explore the critical edge cases.

    Edge Case 1: Idempotency in Cleanup Logic

    Your reconciliation loop can be triggered multiple times for the same deletion event due to cluster state changes or requeues. Your cleanup logic must be idempotent.

    Problem: If your handleFinalizer function blindly tries to delete the S3 bucket, the second time it runs (after a requeue), the AWS API will return a NoSuchBucket error. If you treat this as a fatal error, your controller will get stuck in a perpetual retry loop, never removing the finalizer.

    Solution: The cleanup logic must first check for the existence of the external resource. If it doesn't exist, the cleanup should be considered a success.

    Our handleFinalizer already demonstrates this:

    go
    // ... inside handleFinalizer
    exists, err := r.S3Client.BucketExists(ctx, s3Bucket.Spec.BucketName)
    // ... error handling ...
    
    if exists {
        // ... attempt deletion ...
    } else {
        // Already gone, this is a success condition for the finalizer.
        logger.Info("External S3 bucket already deleted, nothing to do.")
    }
    return nil

    This ensures that even if the reconciliation loop runs ten times during the deletion process, it will only attempt the API call once and will correctly report success on subsequent runs, allowing the finalizer to be removed.

    Edge Case 2: Handling Cleanup Failures and Requeue Strategy

    What if the AWS API is down or returns a transient error (e.g., ThrottlingException)?

    Problem: Simply returning err from Reconcile triggers controller-runtime's default exponential backoff retry mechanism. While this is good, sometimes you need more control, especially for known transient issues.

    Solution: Implement a more nuanced requeue strategy. Instead of just returning the error, you can inspect it and decide whether to requeue immediately or after a specific delay.

    go
    // ... inside the deletion branch
    if err := r.handleFinalizer(ctx, s3Bucket); err != nil {
        logger.Error(err, "failed to handle finalizer")
    
        // Example: check for a transient throttling error. The concrete
        // error type depends on the AWS service's generated types package.
        var throttlingErr *types.ThrottlingException
        if errors.As(err, &throttlingErr) {
            logger.Info("AWS API is throttling. Requeuing after 30 seconds.")
            // Return a result object to requeue after a specific delay
            return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
        }
    
        // For other errors, use the default backoff
        return ctrl.Result{}, err
    }

    This gives you fine-grained control over the retry loop, preventing your operator from hammering a struggling downstream API while still ensuring eventual consistency.

    Edge Case 3: The "Stuck" Finalizer and Manual Intervention

    Problem: A bug in your controller or a permanent external issue (e.g., credentials revoked) could cause the handleFinalizer to fail indefinitely. The CR will be stuck in the Terminating state forever, and kubectl delete --force won't work.

    This is a feature, not a bug. It prevents data loss or orphaned resources. However, an administrator needs a "break glass" procedure.

    Solution: The administrator must manually patch the object to remove the finalizer. This is a dangerous operation that should be performed only when the operator is confirmed to be non-functional and the external resource has been cleaned up manually.

    The Command:

    bash
    # Get the current object YAML to inspect the finalizers
    kubectl get s3bucket my-stuck-bucket -n my-namespace -o yaml
    
    # Manually patch the object to remove the finalizer.
    # Replace 'cloud.my.domain/finalizer' with your actual finalizer name.
    kubectl patch s3bucket my-stuck-bucket -n my-namespace --type=json \
      -p='[{"op": "remove", "path": "/metadata/finalizers/0"}]'
    
    # Note: the index '0' assumes your finalizer is first in the list;
    # inspect the object and adjust the index if other finalizers are present.
    
    # Alternatively, clear the list entirely with a merge patch.
    # WARNING: this removes ALL finalizers, including other controllers':
    # kubectl patch s3bucket my-stuck-bucket -n my-namespace --type=merge -p '{"metadata":{"finalizers":[]}}'

    Your operator's documentation must include this procedure and clearly state the risks involved. It's also wise to add metrics and alerts to detect CRs that have been in a Terminating state for an extended period (e.g., > 1 hour).

    Edge Case 4: Race Conditions with Spec Updates

    Problem: What happens if a user updates the spec of a CR (e.g., s3Bucket.Spec.BucketName) at the exact moment another user requests its deletion?

    Solution: Once deletion is requested, the API server sets deletionTimestamp, and that field can never be unset; the server also rejects any update that tries to add a new finalizer to a terminating object. Spec updates on a terminating custom resource, however, are not automatically rejected, so you should not assume the spec is frozen mid-cleanup. Two defenses work well: record the identity of the external resource (e.g., the bucket name) in the status subresource during normal reconciliation and read it from there during finalization, or add a validating webhook that rejects spec changes once deletionTimestamp is set. Combined with idempotent cleanup, either approach makes a concurrent spec edit harmless.


    Part 3: Performance and Scalability

    In a large-scale environment with thousands of CRs, the performance of your finalizer logic can become a bottleneck.

    Controller Concurrency

    The controller manager runs with a default MaxConcurrentReconciles. Let's say it's set to 10. If your handleFinalizer function takes 5 seconds to complete an AWS API call, and 10 CRs are deleted simultaneously, all 10 reconciliation workers will be busy for 5 seconds. No other S3Bucket CRs (being created, updated, or deleted) can be reconciled during this time.

    Considerations:

    * Long-Running Cleanup: If your cleanup involves a complex, multi-step process, consider an asynchronous pattern. The handleFinalizer could create a Kubernetes Job to perform the cleanup and then immediately return. The operator would then watch for the Job's completion status before removing the finalizer. This frees up the reconciler worker immediately.

    * External API Rate Limiting: If 1000 CRs are deleted at once, your operator might make 1000 simultaneous API calls to your cloud provider, triggering rate limiting. Implement client-side rate limiting in your external API client (e.g., using a token bucket algorithm) or serialize cleanup operations through a dedicated queue within the operator.
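The serialization idea can be sketched with a counting semaphore built from a buffered channel, capping how many external delete calls run at once. The limit of 5, the deleteBucket stub, and the deleteAll helper are all illustrative, not part of the operator above:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// maxInFlight caps concurrent external API calls; the value is illustrative.
const maxInFlight = 5

var (
	sem     = make(chan struct{}, maxInFlight) // counting semaphore
	deleted atomic.Int64                       // how many deletes have completed
)

// deleteBucket stands in for the real cloud API call.
func deleteBucket(name string) {
	sem <- struct{}{}        // acquire a slot; blocks while maxInFlight calls are running
	defer func() { <-sem }() // release the slot
	deleted.Add(1)
}

// deleteAll fans deletions out to goroutines but never exceeds
// maxInFlight simultaneous calls to the external API.
func deleteAll(names []string) {
	var wg sync.WaitGroup
	for _, n := range names {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			deleteBucket(name)
		}(n)
	}
	wg.Wait()
}

func main() {
	names := make([]string, 20)
	for i := range names {
		names[i] = fmt.Sprintf("bucket-%d", i)
	}
	deleteAll(names)
	fmt.Println("deleted:", deleted.Load()) // deleted: 20
}
```

A token bucket (e.g., golang.org/x/time/rate) limits the request *rate* instead of concurrency; which you need depends on how your cloud provider enforces its quotas.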

    Example: Asynchronous Cleanup with Status Updates

    To make the process more observable, we can update the CR's status subresource during finalization.

    First, add a status to your CRD:

    go
    // api/v1alpha1/s3bucket_types.go
    
    type S3BucketStatus struct {
        // +optional
        State string `json:"state,omitempty"`
        // +optional
        Message string `json:"message,omitempty"`
    }
    
    // Note: r.Status().Update only works if the status subresource is
    // enabled on the CRD, e.g. via the kubebuilder markers on the type:
    //
    // +kubebuilder:object:root=true
    // +kubebuilder:subresource:status
    
    // ... in S3Bucket struct
    Status S3BucketStatus `json:"status,omitempty"`

    Now, update the finalizer handler to reflect the state:

    go
    // ... inside the deletion branch
    if controllerutil.ContainsFinalizer(s3Bucket, s3BucketFinalizer) {
        // Update status to indicate cleanup has started
        s3Bucket.Status.State = "Terminating"
        s3Bucket.Status.Message = "Removing external resources"
        if err := r.Status().Update(ctx, s3Bucket); err != nil {
            // Even if status update fails, we should proceed with cleanup
            logger.Error(err, "failed to update S3Bucket status during finalization")
        }
    
        if err := r.handleFinalizer(ctx, s3Bucket); err != nil {
            // On failure, update status again
            s3Bucket.Status.Message = fmt.Sprintf("Cleanup failed: %v", err)
            _ = r.Status().Update(ctx, s3Bucket) // Best effort update
    
            return ctrl.Result{}, err
        }
    
        // ... remove finalizer ...
    }

    This provides invaluable observability for platform administrators. When they see a resource stuck in Terminating, they can kubectl describe it and immediately see the error message from the last failed cleanup attempt in its status.


    Conclusion: Finalizers as a Contract

    Implementing finalizers correctly transforms your operator from a simple resource creator into a true lifecycle manager. It establishes a contract between your controller and the Kubernetes API server, ensuring that your business logic is an integral part of the object's deletion flow.

    For senior engineers, mastering this pattern is not optional; it is the cornerstone of building reliable, production-ready operators that can be trusted with critical infrastructure. The key takeaways are:

  • Always Gate on DeletionTimestamp: This is the universal signal to switch from normal reconciliation to cleanup mode.
  • Idempotency is Non-Negotiable: Your cleanup logic will run multiple times. Design it to succeed gracefully if the resource is already gone.
  • Errors Must Prevent Finalizer Removal: A failed cleanup must result in a requeue. The finalizer is the lock that guarantees eventual consistency.
  • Plan for Failure: Document the manual removal process for stuck finalizers and implement monitoring to detect them.
  • Consider Asynchronous Patterns: For long-running cleanup tasks, offload the work to avoid blocking your controller's reconciliation workers.
    By internalizing these advanced patterns, you can build controllers that safely and robustly manage the entire lifecycle of any resource, whether it lives inside or outside your Kubernetes cluster.
