Idempotent Kubernetes Operators: The Finalizer Pattern Deep Dive

Goh Ling Yong

The Flaw in a Simple Reconciliation Loop

As a senior engineer working with Kubernetes, you understand the power of the operator pattern. The core of an operator is its reconciliation loop, a control process that continuously drives the current state of the system toward a desired state defined in a Custom Resource (CR). For stateless applications managed entirely within the cluster, this model is remarkably effective.

However, the moment your operator needs to manage a resource outside the Kubernetes cluster—a managed database on AWS RDS, a bucket in GCS, or a topic in a Confluent Cloud Kafka cluster—the complexity skyrockets. A simple reconciliation loop that only handles creation and updates contains a critical, production-dooming flaw: it cannot gracefully handle deletion.

Consider this scenario: a user creates a CloudDatabase CR. Your operator sees it, calls the cloud provider's API, and provisions a new PostgreSQL instance. The user later runs kubectl delete clouddatabase my-prod-db. What happens?

  • The Kubernetes API server receives the delete request.
  • The CloudDatabase object is immediately removed from etcd.
  • Your operator receives a 'delete' event, but the object it needs to inspect (containing the database ID, cloud region, etc.) is already gone.

Your operator is now powerless. It cannot call the cloud provider's API to deprovision the PostgreSQL instance because it no longer has the necessary information. The result is an orphaned resource—a costly, running database that you're still paying for, completely disconnected from any Kubernetes-managed state.

This is where the Finalizer Pattern becomes not just a best practice, but an absolute requirement for building reliable, stateful operators.

This article is not an introduction to operators. It assumes you are familiar with Go, Kubebuilder or Operator SDK, and the basic reconciliation concept. We will focus exclusively on architecting a production-grade, idempotent reconciliation loop that correctly implements finalizers to guarantee resource cleanup.


Architecting for Idempotency: The Foundation

Before we introduce finalizers, we must ensure our core reconciliation logic is idempotent. An operation is idempotent if applying it multiple times produces the same result as applying it once. In a Kubernetes operator, the Reconcile function may be called many times for the same CR due to cluster events, controller restarts, or failed updates. If your logic isn't idempotent, you risk creating duplicate external resources or performing unnecessary, expensive API calls.

Let's define the state machine for our CloudDatabase operator. The desired state is in CloudDatabase.spec, and the observed state is in CloudDatabase.status and the external cloud provider.
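
For concreteness, here is a minimal sketch of the API types this article assumes. The field names (Engine, Size, DatabaseID, Endpoint, and so on) are inferred from the reconciler code below, not taken from a published schema:

    go
    import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

    // CloudDatabaseSpec is the desired state declared by the user.
    type CloudDatabaseSpec struct {
        Engine string `json:"engine"` // e.g. "postgres"
        Size   string `json:"size"`   // provider-specific instance size class
    }

    // CloudDatabaseStatus records the last observed state of the external resource.
    type CloudDatabaseStatus struct {
        DatabaseID string `json:"databaseID,omitempty"`
        State      string `json:"state,omitempty"`
        Endpoint   string `json:"endpoint,omitempty"`
        Message    string `json:"message,omitempty"`
    }

    // CloudDatabase is the Custom Resource our operator reconciles.
    type CloudDatabase struct {
        metav1.TypeMeta   `json:",inline"`
        metav1.ObjectMeta `json:"metadata,omitempty"`

        Spec   CloudDatabaseSpec   `json:"spec,omitempty"`
        Status CloudDatabaseStatus `json:"status,omitempty"`
    }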

The Non-Idempotent Trap

A naive implementation might look like this:

    go
    // DO NOT USE THIS IN PRODUCTION
    import (
        "context"

        ctrl "sigs.k8s.io/controller-runtime"
        "sigs.k8s.io/controller-runtime/pkg/client"
        "sigs.k8s.io/controller-runtime/pkg/log"
        // ... customv1 API types import elided
    )

    func (r *CloudDatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
        log := log.FromContext(ctx)
        var cloudDB customv1.CloudDatabase
        if err := r.Get(ctx, req.NamespacedName, &cloudDB); err != nil {
            return ctrl.Result{}, client.IgnoreNotFound(err)
        }
    
        // Naive check: if we don't have a DB ID in status, create one.
        if cloudDB.Status.DatabaseID == "" {
            log.Info("Creating a new CloudDatabase instance")
            instanceID, err := r.CloudProvider.CreateDatabase(ctx, cloudDB.Spec.Engine, cloudDB.Spec.Size)
            if err != nil {
                log.Error(err, "Failed to create external database")
                return ctrl.Result{}, err
            }
    
            // The critical race condition is here!
            cloudDB.Status.DatabaseID = instanceID
            cloudDB.Status.State = "Creating"
            if err := r.Status().Update(ctx, &cloudDB); err != nil {
                log.Error(err, "Failed to update CloudDatabase status after creation")
                // If this update fails, the next reconcile will re-run the creation logic!
                return ctrl.Result{}, err
            }
        }
    
        return ctrl.Result{}, nil
    }

The flaw is subtle but deadly. If the r.Status().Update call fails for any reason (e.g., a temporary API server outage, etcd contention), the Reconcile function will return an error and be re-queued. On the next run, cloudDB.Status.DatabaseID will still be empty, and the operator will call r.CloudProvider.CreateDatabase again, creating a duplicate database.

The Correct Idempotent Pattern

The correct approach is to always check the actual state of the external world before taking any action.

  • Fetch the CR.
  • Check if the external resource actually exists by querying the cloud provider, perhaps using a unique tag or name derived from the CR's UID.
  • Compare the desired state (spec) with the actual state (from the cloud provider).
  • Take action only if there is a delta.
  • Update the CR status with the observed state.

Here is the refactored, idempotent creation logic:

    go
    import (
        "context"

        "sigs.k8s.io/controller-runtime/pkg/client"
        "sigs.k8s.io/controller-runtime/pkg/log"
        // ... other imports
    )
    
    // Assume r.CloudProvider is an interface for our external service
    type CloudProviderAPI interface {
        GetDatabase(ctx context.Context, instanceID string) (*DatabaseInstance, error)
        FindDatabaseByCR(ctx context.Context, cr *customv1.CloudDatabase) (*DatabaseInstance, error)
        CreateDatabase(ctx context.Context, cr *customv1.CloudDatabase) (*DatabaseInstance, error)
        UpdateDatabase(ctx context.Context, instanceID string, cr *customv1.CloudDatabase) error
        DeleteDatabase(ctx context.Context, instanceID string) error
    }
    
    func (r *CloudDatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
        log := log.FromContext(ctx)
        var cloudDB customv1.CloudDatabase
        if err := r.Get(ctx, req.NamespacedName, &cloudDB); err != nil {
            // Ignore not-found errors, since they can't be fixed by an immediate requeue.
            return ctrl.Result{}, client.IgnoreNotFound(err)
        }
    
        // First, check if the external resource exists.
        // We use a deterministic way to find it, e.g., via tags based on CR UID.
        externalDB, err := r.CloudProvider.FindDatabaseByCR(ctx, &cloudDB)
        if err != nil && !IsExternalResourceNotFound(err) { // IsExternalResourceNotFound is a custom error check
            log.Error(err, "Failed to query external database state")
            // Returning the error requeues with exponential backoff if the external API is failing.
            return ctrl.Result{}, err
        }
    
        // Case 1: External resource does not exist. We need to create it.
        if externalDB == nil {
            log.Info("External database not found. Creating...")
            newDB, err := r.CloudProvider.CreateDatabase(ctx, &cloudDB)
            if err != nil {
                log.Error(err, "Failed to create external database")
                cloudDB.Status.State = "ErrorCreating"
                cloudDB.Status.Message = err.Error()
                _ = r.Status().Update(ctx, &cloudDB) // Best-effort status update
                return ctrl.Result{}, err
            }
    
            log.Info("Successfully created external database", "DatabaseID", newDB.ID)
            cloudDB.Status.DatabaseID = newDB.ID
            cloudDB.Status.State = "Provisioned"
            cloudDB.Status.Endpoint = newDB.Endpoint
            cloudDB.Status.Message = ""
            if err := r.Status().Update(ctx, &cloudDB); err != nil {
                // If status update fails, the next reconcile will find the DB and correct the status.
                // This is now safe and idempotent.
                return ctrl.Result{}, err
            }
            return ctrl.Result{}, nil
        }
    
        // Case 2: External resource exists. We need to check for drift.
        log.Info("External database found", "DatabaseID", externalDB.ID)
    
        // Sync status if it's missing (e.g., operator restarted)
        if cloudDB.Status.DatabaseID == "" {
            cloudDB.Status.DatabaseID = externalDB.ID
            cloudDB.Status.State = externalDB.State
            cloudDB.Status.Endpoint = externalDB.Endpoint
        }
    
        // Drift detection: Compare spec with actual state
        if cloudDB.Spec.Size != externalDB.Size {
            log.Info("Drift detected. Updating database size.", "Expected", cloudDB.Spec.Size, "Actual", externalDB.Size)
            if err := r.CloudProvider.UpdateDatabase(ctx, externalDB.ID, &cloudDB); err != nil {
                log.Error(err, "Failed to update external database")
                cloudDB.Status.State = "ErrorUpdating"
                cloudDB.Status.Message = err.Error()
                _ = r.Status().Update(ctx, &cloudDB)
                return ctrl.Result{}, err
            }
            cloudDB.Status.State = "Updating"
            cloudDB.Status.Message = "Database size is being updated."
        } else {
            cloudDB.Status.State = "Provisioned"
            cloudDB.Status.Message = ""
        }
    
        // Always update status at the end of a successful reconcile
        if err := r.Status().Update(ctx, &cloudDB); err != nil {
            return ctrl.Result{}, err
        }
    
        return ctrl.Result{}, nil
    }

This logic is robust. If any step fails, the next reconciliation will re-evaluate the state of the world from scratch and converge correctly without causing side effects.
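
One loose end from the code above: IsExternalResourceNotFound is a custom helper, not a library function. A minimal sketch, assuming the CloudProviderAPI implementation wraps the provider's "not found" responses in a sentinel error:

    go
    import "errors"

    // ErrExternalResourceNotFound is a sentinel error that the CloudProviderAPI
    // implementation is assumed to wrap around provider "not found" responses.
    var ErrExternalResourceNotFound = errors.New("external resource not found")

    // IsExternalResourceNotFound lets callers treat "already gone" as success,
    // both during normal reconciliation and during finalizer cleanup.
    func IsExternalResourceNotFound(err error) bool {
        return errors.Is(err, ErrExternalResourceNotFound)
    }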


Implementing the Finalizer Pattern for Graceful Deletion

Now we can address the deletion problem. A finalizer is a key in the metadata.finalizers list of a Kubernetes object. When you add a finalizer to an object, you are telling the Kubernetes garbage collector, "Do not delete this object from etcd until this specific finalizer key is removed."

When a user tries to delete an object with a finalizer, the API server doesn't delete it. Instead, it sets the metadata.deletionTimestamp field to the current time. This is the signal for our operator to perform its cleanup logic.

Our workflow will be:

  • Add Finalizer: When a CloudDatabase CR is first created, our operator adds its own unique finalizer (e.g., clouddatabases.custom.example.com/finalizer) to the object.
  • Detect Deletion: In the reconciliation loop, we check if deletionTimestamp is set.
  • Perform Cleanup: If the timestamp is set, we call the cloud provider's API to delete the external database.
  • Remove Finalizer: Once the external database is confirmed to be deleted, we remove our finalizer from the CR. This signals to Kubernetes that our cleanup is complete.
  • Garbage Collection: With the finalizer removed, the Kubernetes garbage collector is now free to delete the CloudDatabase CR from etcd.

Full Reconciler with Finalizer Logic

Let's integrate this into our Reconcile function.

    go
    import (
        // ... previous imports
        "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
    )
    
    const cloudDatabaseFinalizer = "clouddatabases.custom.example.com/finalizer"
    
    func (r *CloudDatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
        log := log.FromContext(ctx)
        var cloudDB customv1.CloudDatabase
        if err := r.Get(ctx, req.NamespacedName, &cloudDB); err != nil {
            return ctrl.Result{}, client.IgnoreNotFound(err)
        }
    
        // ------------------------------------------------------------------
        // 1. DELETION LOGIC (FINALIZER)
        // ------------------------------------------------------------------
        // Check if the object is being deleted
        if !cloudDB.ObjectMeta.DeletionTimestamp.IsZero() {
            // The object is being deleted
            if controllerutil.ContainsFinalizer(&cloudDB, cloudDatabaseFinalizer) {
                log.Info("Performing finalizer cleanup for CloudDatabase")
    
                // Our cleanup logic: delete the external resource
                if err := r.deleteExternalResources(ctx, &cloudDB); err != nil {
                    // If cleanup fails, we don't remove the finalizer. 
                    // The reconciliation will be retried.
                    log.Error(err, "Failed to delete external resources")
                    return ctrl.Result{}, err
                }
    
                log.Info("External resources deleted successfully. Removing finalizer.")
                // Once cleanup is successful, remove the finalizer
                controllerutil.RemoveFinalizer(&cloudDB, cloudDatabaseFinalizer)
                if err := r.Update(ctx, &cloudDB); err != nil {
                    return ctrl.Result{}, err
                }
            }
    
            // Stop reconciliation as the item is being deleted
            return ctrl.Result{}, nil
        }
    
        // ------------------------------------------------------------------
        // 2. ADD FINALIZER (if it doesn't exist)
        // ------------------------------------------------------------------
        if !controllerutil.ContainsFinalizer(&cloudDB, cloudDatabaseFinalizer) {
            log.Info("Adding finalizer for CloudDatabase")
            controllerutil.AddFinalizer(&cloudDB, cloudDatabaseFinalizer)
            if err := r.Update(ctx, &cloudDB); err != nil {
                return ctrl.Result{}, err
            }
        }
    
        // ------------------------------------------------------------------
        // 3. REGULAR RECONCILIATION LOGIC (IDEMPOTENT)
        // ------------------------------------------------------------------
        externalDB, err := r.CloudProvider.FindDatabaseByCR(ctx, &cloudDB)
        // ... (the rest of the idempotent logic from the previous section) ...
        // ... (create if not exists, update if drift detected) ...
    
        return ctrl.Result{}, nil
    }
    
    // deleteExternalResources encapsulates the cleanup logic
    func (r *CloudDatabaseReconciler) deleteExternalResources(ctx context.Context, cloudDB *customv1.CloudDatabase) error {
        log := log.FromContext(ctx)
    
        // We need the external DB ID to delete it. It should be in the status.
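        // Note: a more defensive implementation could also fall back to
        // FindDatabaseByCR here, in case the database was created but the
        // status update was lost before deletion was requested.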
        if cloudDB.Status.DatabaseID == "" {
            log.Info("DatabaseID not found in status. Assuming external resource was never created or already deleted.")
            return nil
        }
    
        log.Info("Deleting external database", "DatabaseID", cloudDB.Status.DatabaseID)
        err := r.CloudProvider.DeleteDatabase(ctx, cloudDB.Status.DatabaseID)
    
        // Edge Case: If the resource is already gone in the cloud provider, 
        // we should treat it as a success and proceed with finalizer removal.
        if err != nil && IsExternalResourceNotFound(err) {
            log.Info("External database already deleted.")
            return nil
        }
    
        return err
    }

This structure is now robust. The reconciliation logic is cleanly separated:

  • Handle deletion first. If the object is marked for deletion, we only care about cleanup.
  • If not being deleted, ensure our finalizer is present. This is our guarantee that we'll get a chance to clean up later.
  • Proceed with the normal, idempotent create/update logic.

Advanced Edge Cases and Production Hardening

Building a truly production-grade operator requires thinking about what can go wrong.

Partial Failures During Cleanup

What if r.CloudProvider.DeleteDatabase fails due to a transient network error? Our deleteExternalResources function returns an error, the Reconcile function returns an error, and the request is re-queued. Controller-runtime provides exponential backoff by default, so we won't hammer the cloud provider's API. On the next attempt, the logic will run again. The finalizer remains until the deletion call succeeds, preventing the CR from being removed prematurely.
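
To make the retry semantics concrete, here is a minimal sketch; the helper name and the one-minute delay are illustrative assumptions, not part of the original reconciler:

    go
    import (
        "time"

        ctrl "sigs.k8s.io/controller-runtime"
    )

    // retryResult illustrates the two retry styles available to Reconcile.
    // Returning a non-nil error re-queues the request through the workqueue's
    // default rate limiter, which applies exponential backoff. Returning a
    // RequeueAfter schedules the retry at an explicit, fixed delay instead.
    func retryResult(err error, providerRateLimited bool) (ctrl.Result, error) {
        if providerRateLimited {
            // Fixed delay: give the provider breathing room before retrying.
            return ctrl.Result{RequeueAfter: time.Minute}, nil
        }
        // Let controller-runtime's default exponential backoff handle it.
        return ctrl.Result{}, err
    }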

External Resource Deleted Manually

A common operational issue is when an engineer manually deletes the external resource via the cloud console. Our operator's finalizer is still on the CR, so kubectl delete will hang.

Our deleteExternalResources function handles this gracefully. When it calls r.CloudProvider.DeleteDatabase, the provider will return a "Not Found" error. We have a custom error check (IsExternalResourceNotFound) to detect this specific case. If the resource is already gone, we consider our job done, return nil, and allow the finalizer to be removed. This unblocks the CR deletion.

Controller Concurrency and Rate Limiting

By default, controller-runtime managers can run multiple reconciliations in parallel (maxConcurrentReconciles). Because our logic is idempotent, this is safe from a correctness standpoint. However, you could still hit API rate limits on your cloud provider. If you do, you might (see the sketch after this list):

  • Lower maxConcurrentReconciles in your main.go file.
  • Implement client-side rate-limiting in your CloudProviderAPI implementation.
  • Ensure your Reconcile function returns ctrl.Result{RequeueAfter: ...} with a sensible delay when it detects a rate-limiting error from the provider.
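
Here is a sketch of the first two mitigations. WithOptions and MaxConcurrentReconciles are standard controller-runtime API; the realCloudProvider type, its doDelete call, and the limiter values are illustrative assumptions:

    go
    import (
        "context"

        "golang.org/x/time/rate"
        ctrl "sigs.k8s.io/controller-runtime"
        "sigs.k8s.io/controller-runtime/pkg/controller"
        // ... customv1 API types import elided
    )

    func (r *CloudDatabaseReconciler) SetupWithManager(mgr ctrl.Manager) error {
        return ctrl.NewControllerManagedBy(mgr).
            For(&customv1.CloudDatabase{}).
            // Cap parallel reconciles to reduce pressure on the provider API.
            WithOptions(controller.Options{MaxConcurrentReconciles: 2}).
            Complete(r)
    }

    // Client-side rate limiting inside a CloudProviderAPI implementation:
    // roughly 5 requests per second with a burst of 10 (illustrative values).
    var providerLimiter = rate.NewLimiter(rate.Limit(5), 10)

    func (c *realCloudProvider) DeleteDatabase(ctx context.Context, id string) error {
        // Wait blocks until a token is available or the context is cancelled.
        if err := providerLimiter.Wait(ctx); err != nil {
            return err
        }
        return c.doDelete(ctx, id) // hypothetical underlying provider call
    }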

Optimizing Reconciliations with Predicate Functions

Your operator's controller watches for changes to CloudDatabase objects. By default, it will trigger a reconciliation for almost any change, including changes to status or metadata that your controller might have made itself. This can lead to unnecessary reconciliation loops.

We can use predicate functions to filter which events trigger a reconciliation. A common optimization is to ignore status-only updates, as the operator itself is usually the only writer of the status subresource.

In your controller setup (main.go or a dedicated setup file):

    go
    import (
        "sigs.k8s.io/controller-runtime/pkg/predicate"
        "sigs.k8s.io/controller-runtime/pkg/event"
    )
    
    // IgnoreStatusUpdates predicate filters out updates to the status subresource
    func IgnoreStatusUpdates() predicate.Predicate {
        return predicate.Funcs{
            UpdateFunc: func(e event.UpdateEvent) bool {
                // Status-only updates do not bump metadata.generation; reconcile only when the generation changes.
                return e.ObjectOld.GetGeneration() != e.ObjectNew.GetGeneration()
            },
        }
    }
    
    // In your controller setup
    err = ctrl.NewControllerManagedBy(mgr).
        For(&customv1.CloudDatabase{}).
        WithEventFilter(IgnoreStatusUpdates()).
        Complete(r)
    

Kubernetes increments the metadata.generation field only when the spec of an object changes. By filtering events to only reconcile when the generation has changed, we effectively ignore status updates and other metadata-only changes, significantly reducing the load on the controller and external APIs.
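
As a side note, controller-runtime ships a built-in predicate with the same semantics, so you can use it in place of the hand-rolled filter above:

    go
    import "sigs.k8s.io/controller-runtime/pkg/predicate"

    // GenerationChangedPredicate skips update events whose metadata.generation
    // is unchanged, equivalent to the custom IgnoreStatusUpdates above.
    err = ctrl.NewControllerManagedBy(mgr).
        For(&customv1.CloudDatabase{}).
        WithEventFilter(predicate.GenerationChangedPredicate{}).
        Complete(r)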

Conclusion

Moving from a simple operator to a production-ready one requires a deep focus on idempotency and lifecycle management. The Finalizer Pattern is the canonical, battle-tested solution within the Kubernetes ecosystem for managing external resources with guaranteed cleanup.

By architecting your reconciliation loop with these principles:

  • Assume Nothing: Always query the actual state of the external world before taking action.
  • Gate Deletion: Use a finalizer to prevent the CR from being deleted until you have performed your cleanup.
  • Handle Deletion Explicitly: Check for the deletionTimestamp as the first step in your reconciliation logic.
  • Plan for Failure: Handle errors gracefully, account for manual out-of-band changes, and use requeueing with backoff for transient issues.
  • Optimize: Use predicate functions to prevent wasteful reconciliation cycles.

you can build robust, reliable operators that safely manage critical, stateful infrastructure, bridging the gap between the Kubernetes API and the outside world.
