Kubernetes Operators: Finalizers for Stateful Resource Deletion

Goh Ling Yong

The Orphaned Resource Problem: Why Standard Deletion Fails

As a senior engineer building on Kubernetes, you understand the power of the Operator Pattern. It allows us to extend the Kubernetes API, teaching the cluster how to manage complex, often stateful, applications. We define a desired state in a Custom Resource (CR), and the operator's controller works tirelessly to make that state a reality. This is the core of the reconciliation loop.

However, a critical gap emerges when our operator manages resources that live outside the Kubernetes cluster—a managed PostgreSQL instance in RDS, a BigQuery dataset, or a DNS record in Cloudflare. The standard Kubernetes garbage collection mechanism is designed for in-cluster objects. When you run kubectl delete my-custom-resource, Kubernetes removes the object from etcd. If that object was the owner of a Deployment and a Service, those child objects are automatically garbage collected.

But what about the RDS instance created on its behalf? Kubernetes has no knowledge of it. The reconcile request for the now-deleted CR simply stops, and the controller moves on. The result is an orphaned resource: a running, and often costly, database instance with no corresponding Kubernetes object to manage it. This is not just a resource leak; it's a critical reliability and cost-management failure in a production system.

This is the problem that finalizers solve. A finalizer is a mechanism that tells the Kubernetes API server: "Do not fully delete this object yet. There is external cleanup work that must be completed first." It allows our controller to intercept the deletion process, perform the necessary off-cluster actions, and then, and only then, give Kubernetes the green light to remove the object from etcd.

This article will walk through a production-grade implementation of an operator that manages an ExternalDatabase CRD, focusing specifically on the robust implementation of finalizers for graceful and guaranteed cleanup.

The Anatomy of Our `ExternalDatabase` Operator

To ground our discussion, let's define the components. We're building an operator to manage a fictional database-as-a-service.

1. The `ExternalDatabase` Custom Resource Definition (CRD)

Our CRD defines the schema for our custom resource. The spec declares the user's desired state, and the status is where our controller will report the observed state of the world.

api/v1/externaldatabase_types.go

go
package v1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// ExternalDatabaseSpec defines the desired state of ExternalDatabase
type ExternalDatabaseSpec struct {
	// Name of the database to be created
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=3
	Name string `json:"name"`

	// Engine specifies the database engine (e.g., "postgres", "mysql")
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:Enum=postgres;mysql
	Engine string `json:"engine"`

	// DeletionPolicy determines what happens to the external resource when the CR is deleted.
	// "Delete" will delete the external resource. "Retain" will leave it.
	// +kubebuilder:validation:Enum=Delete;Retain
	// +kubebuilder:default:=Delete
	DeletionPolicy string `json:"deletionPolicy,omitempty"`
}

// ExternalDatabaseStatus defines the observed state of ExternalDatabase
type ExternalDatabaseStatus struct {
	// DBID is the unique identifier for the database in the external system.
	DBID string `json:"dbid,omitempty"`

	// Conditions represent the latest available observations of the resource's state.
	// +optional
	// +patchMergeKey=type
	// +patchStrategy=merge
	Conditions []metav1.Condition `json:"conditions,omitempty" patchStrategy:"merge" patchMergeKey:"type"`
}

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status
//+kubebuilder:printcolumn:name="DBID",type="string",JSONPath=".status.dbid"
//+kubebuilder:printcolumn:name="Ready",type="string",JSONPath=".status.conditions[?(@.type==\"Ready\")].status"

// ExternalDatabase is the Schema for the externaldatabases API
type ExternalDatabase struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   ExternalDatabaseSpec   `json:"spec,omitempty"`
	Status ExternalDatabaseStatus `json:"status,omitempty"`
}

//+kubebuilder:object:root=true

// ExternalDatabaseList contains a list of ExternalDatabase
type ExternalDatabaseList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []ExternalDatabase `json:"items"`
}

func init() {
	SchemeBuilder.Register(&ExternalDatabase{}, &ExternalDatabaseList{})
}

Key elements here for senior engineers:

  • status.Conditions: We are using the standard metav1.Condition type. This is a best practice that makes our operator's status immediately understandable to standard Kubernetes tooling (kubectl wait, etc.) and other controllers.
  • Subresource Status: The //+kubebuilder:subresource:status marker ensures that changes to the .status field can only be made through the /status subresource, preventing controllers from accidentally overwriting user-defined .spec changes.
  • DeletionPolicy: This gives users control, a common pattern in production systems (e.g., PersistentVolume reclaim policies).

2. The Controller and its Reconciliation Loop

Our controller's core logic lives in the Reconcile method. This method is invoked by the controller-runtime framework whenever there's a change to an ExternalDatabase resource (or a secondary resource it's watching).

Here is the skeleton of our controller. We'll flesh this out with the finalizer logic.

internal/controller/externaldatabase_controller.go

go
package controller

import (
	"context"

	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/log"

	databasev1 "finalizer-demo/api/v1"
)

// ExternalDatabaseReconciler reconciles an ExternalDatabase object
type ExternalDatabaseReconciler struct {
	client.Client
	Scheme *runtime.Scheme
	// A mock client for our external DB service.
	// In a real implementation, this would be a proper client.
	ExternalDBClient ExternalDatabaseAPI
}

//+kubebuilder:rbac:groups=database.example.com,resources=externaldatabases,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=database.example.com,resources=externaldatabases/status,verbs=get;update;patch
//+kubebuilder:rbac:groups=database.example.com,resources=externaldatabases/finalizers,verbs=update

func (r *ExternalDatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	_ = log.FromContext(ctx)

	// Business logic will go here

	return ctrl.Result{}, nil
}

// SetupWithManager sets up the controller with the Manager.
func (r *ExternalDatabaseReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&databasev1.ExternalDatabase{}).
		Complete(r)
}

The snippets that follow also use meta (k8s.io/apimachinery/pkg/api/meta), metav1 (k8s.io/apimachinery/pkg/apis/meta/v1), and controllerutil (sigs.k8s.io/controller-runtime/pkg/controller/controllerutil); add those imports as you flesh out the logic.

The Core Pattern: A Finalizer-Aware Reconciliation Loop

The entire strategy hinges on structuring the Reconcile function to handle two distinct states: the object is being deleted, or the object is not being deleted. The presence of a deletionTimestamp on the object's metadata is the definitive signal.

Let's define our finalizer's name.

go
const externalDatabaseFinalizer = "database.example.com/finalizer"

Here is the high-level structure of our Reconcile function, which we will now implement piece by piece.

go
func (r *ExternalDatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	logger := log.FromContext(ctx)

	// 1. Fetch the ExternalDatabase instance
	db := &databasev1.ExternalDatabase{}
	if err := r.Get(ctx, req.NamespacedName, db); err != nil {
		// Handle not-found errors, which can occur after deletion
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// 2. Check if the object is being deleted
	if !db.ObjectMeta.DeletionTimestamp.IsZero() {
		// The object is being deleted
		return r.reconcileDelete(ctx, db)
	}

	// 3. Ensure our finalizer is present if the object is not being deleted
	if !controllerutil.ContainsFinalizer(db, externalDatabaseFinalizer) {
		logger.Info("Adding finalizer for ExternalDatabase")
		controllerutil.AddFinalizer(db, externalDatabaseFinalizer)
		if err := r.Update(ctx, db); err != nil {
			return ctrl.Result{}, err
		}
	}

	// 4. The object is not being deleted, so run the normal reconciliation logic
	return r.reconcileNormal(ctx, db)
}

This structure is critical. It immediately branches the logic based on the deletion state.

Step 1: Adding the Finalizer

When a new CR is created, its deletionTimestamp is zero. Our first task is to add our finalizer to its metadata. This acts as a registration, telling Kubernetes we need to be involved in its deletion.

go
// Part of the main Reconcile function

// 3. Ensure our finalizer is present if the object is not being deleted
if !controllerutil.ContainsFinalizer(db, externalDatabaseFinalizer) {
	logger.Info("Adding finalizer for ExternalDatabase")
	controllerutil.AddFinalizer(db, externalDatabaseFinalizer)
	if err := r.Update(ctx, db); err != nil {
		logger.Error(err, "Failed to add finalizer")
		return ctrl.Result{}, err
	}
	// After adding the finalizer, we return to trigger another reconcile.
	// This is a good practice to ensure the state is consistent before proceeding.
	return ctrl.Result{Requeue: true}, nil
}

We use the controllerutil helpers, which are part of controller-runtime, to safely add the finalizer. Notice the return ctrl.Result{Requeue: true}, nil. While not strictly necessary, it's a defensive pattern to ensure the next reconciliation cycle operates on an object that is guaranteed to have the finalizer.
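
On busy objects, a full Update can hit optimistic-concurrency conflicts. As an alternative sketch (our variation, not part of the article's controller), recent controller-runtime versions let you apply the finalizer with a merge patch; AddFinalizer reports whether it actually modified the object:

go
// Capture the pre-mutation state so MergeFrom produces a minimal diff.
base := db.DeepCopy()
if controllerutil.AddFinalizer(db, externalDatabaseFinalizer) {
	// Patch sends only the finalizer change, reducing conflict surface.
	if err := r.Patch(ctx, db, client.MergeFrom(base)); err != nil {
		return ctrl.Result{}, err
	}
}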

Step 2: The Normal Reconciliation Logic (`reconcileNormal`)

This is the "happy path" logic that runs when the CR is being created or updated. Its job is to converge the state of the external world with the spec.

go
func (r *ExternalDatabaseReconciler) reconcileNormal(ctx context.Context, db *databasev1.ExternalDatabase) (ctrl.Result, error) {
	logger := log.FromContext(ctx)

	// If the external DB ID is not in our status, it means we need to create it.
	if db.Status.DBID == "" {
		logger.Info("Creating external database", "name", db.Spec.Name)

		// This is our mock API call
		dbID, err := r.ExternalDBClient.CreateDatabase(ctx, db.Spec.Name, db.Spec.Engine)
		if err != nil {
			logger.Error(err, "Failed to create external database")
			// Update status to reflect the failure
			meta.SetStatusCondition(&db.Status.Conditions, metav1.Condition{
				Type:    "Ready",
				Status:  metav1.ConditionFalse,
				Reason:  "ProvisionFailed",
				Message: err.Error(),
			})
			if updateErr := r.Status().Update(ctx, db); updateErr != nil {
				return ctrl.Result{}, updateErr
			}
			// Return error to trigger exponential backoff retry
			return ctrl.Result{}, err
		}

		// Creation was successful. Update the status.
		db.Status.DBID = dbID
		meta.SetStatusCondition(&db.Status.Conditions, metav1.Condition{
			Type:    "Ready",
			Status:  metav1.ConditionTrue,
			Reason:  "Provisioned",
			Message: "External database provisioned successfully",
		})

		logger.Info("Successfully created external database", "DBID", dbID)
		if err := r.Status().Update(ctx, db); err != nil {
			logger.Error(err, "Failed to update ExternalDatabase status")
			return ctrl.Result{}, err
		}
	}

	// Here you could add logic for handling updates to the spec or state drift
	// (see the drift-check sketch after the list below).
	// For this example, we'll assume the spec is immutable.

	logger.Info("Reconciliation successful")
	return ctrl.Result{}, nil
}

Key Production Patterns:

  • Idempotency: The logic is wrapped in if db.Status.DBID == "". If the reconciler runs again after a successful creation, it will see the DBID in the status and skip the creation step. This is essential for stability.
  • Status Conditions: We use meta.SetStatusCondition to provide rich, machine-readable status updates. This is far superior to a simple phase: "Ready" string.
  • Error Handling: On failure, we update the status with the error and return the original error. controller-runtime will see the non-nil error and requeue the request with exponential backoff, preventing us from hammering a failing external API.
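
As a sketch of the drift handling referenced in the code comment above: on each pass we could verify the external resource still matches expectations. GetDatabaseStatus comes from the mock API shown later; treating anything other than "available" as drift mirrors the mock's behavior and is our assumption for a real provider:

go
// Hypothetical drift check for reconcileNormal, run once DBID is set.
// Requires the "time" package in addition to the imports noted earlier.
status, err := r.ExternalDBClient.GetDatabaseStatus(ctx, db.Status.DBID)
if err != nil {
	// Could not verify the external resource; return the error to retry with backoff.
	return ctrl.Result{}, err
}
if status != "available" {
	meta.SetStatusCondition(&db.Status.Conditions, metav1.Condition{
		Type:    "Ready",
		Status:  metav1.ConditionFalse,
		Reason:  "Degraded",
		Message: "external database reported status: " + status,
	})
	if err := r.Status().Update(ctx, db); err != nil {
		return ctrl.Result{}, err
	}
	// Poll rather than error: the condition may resolve on its own.
	return ctrl.Result{RequeueAfter: time.Minute}, nil
}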

Step 3: The Deletion Logic (`reconcileDelete`)

This is the heart of the finalizer pattern. This function is only called when !db.ObjectMeta.DeletionTimestamp.IsZero() is true.

When a user runs kubectl delete, the API server does two things:

1. It sets the deletionTimestamp to the current time.
2. It updates the object, which generates a watch event and triggers a reconcile.

It does not remove the object from etcd, because our finalizer is present.

go
func (r *ExternalDatabaseReconciler) reconcileDelete(ctx context.Context, db *databasev1.ExternalDatabase) (ctrl.Result, error) {
	logger := log.FromContext(ctx)

	// Check if our finalizer is the one we should be handling
	if controllerutil.ContainsFinalizer(db, externalDatabaseFinalizer) {
		logger.Info("Handling deletion for ExternalDatabase")

		// Respect the DeletionPolicy
		if db.Spec.DeletionPolicy == "Retain" {
			logger.Info("DeletionPolicy is Retain, skipping external resource deletion")
		} else {
			// Our core cleanup logic
			if err := r.ExternalDBClient.DeleteDatabase(ctx, db.Status.DBID); err != nil {
				// If the external deletion fails, we must return an error.
				// This ensures the reconcile loop will be retried, and the finalizer won't be removed.
				logger.Error(err, "Failed to delete external database; will retry")

				// You could update status here to indicate DeletionFailed
				meta.SetStatusCondition(&db.Status.Conditions, metav1.Condition{
					Type:    "Ready",
					Status:  metav1.ConditionFalse,
					Reason:  "DeletionFailed",
					Message: err.Error(),
				})
				if updateErr := r.Status().Update(ctx, db); updateErr != nil {
					return ctrl.Result{}, updateErr
				}

				return ctrl.Result{}, err
			}
		}

		// If cleanup was successful (or skipped), we can remove the finalizer.
		logger.Info("External resource cleanup successful, removing finalizer")
		controllerutil.RemoveFinalizer(db, externalDatabaseFinalizer)
		if err := r.Update(ctx, db); err != nil {
			return ctrl.Result{}, err
		}
	}

	// Stop reconciliation as the item is being deleted
	return ctrl.Result{}, nil
}

This logic is the crux of the pattern:

  • Check for Finalizer: We double-check that our finalizer is present before acting.
  • Execute Cleanup: We call our external API to delete the database. This step must be idempotent. If the external resource is already gone, DeleteDatabase should return success, not an error.
  • Handle Cleanup Failure: If DeleteDatabase returns a transient error, we return the error to the framework. The CR's deletion is now blocked. The finalizer remains, and the controller will retry the deletion after a backoff period. This guarantees cleanup. (For providers that tear down asynchronously, see the polling sketch after this list.)
  • Remove Finalizer on Success: Only after the external resource is successfully deleted do we remove the finalizer from the CR's metadata and update it. Once the API server sees an object with a deletionTimestamp and an empty finalizer list, it completes the deletion, and the object is removed from etcd.
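
Many real providers tear the database down asynchronously: the delete call returns immediately while teardown continues in the background. In that case, polling with RequeueAfter is gentler than error-driven backoff. A sketch; the "deleting" status string is an assumption about such a provider (our mock only ever reports "available"):

go
// Inside reconcileDelete, after calling DeleteDatabase: poll until gone.
status, err := r.ExternalDBClient.GetDatabaseStatus(ctx, db.Status.DBID)
if err == nil && status == "deleting" {
	// Teardown is still in progress; this is not a failure.
	// Re-check in 30 seconds instead of triggering exponential backoff.
	return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
}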

Advanced Edge Cases and Production Considerations

Simple examples stop here. Production systems require deeper thought.

Edge Case: Controller Pod Crashes During Deletion

Imagine this sequence:

1. User runs kubectl delete.
2. Controller starts reconcileDelete.
3. The call to r.ExternalDBClient.DeleteDatabase succeeds.
4. The controller pod crashes before it can run controllerutil.RemoveFinalizer.

Is this a problem? No. This is why the pattern is so robust. When the controller restarts, it will receive a reconcile request for the ExternalDatabase CR (which still exists because the finalizer was never removed). The reconcileDelete logic will run again. The call to DeleteDatabase must be idempotent; it should see the database is already gone and return success. The controller will then proceed to remove the finalizer, and the deletion completes. Your external API client must handle a delete call for a non-existent resource gracefully.

Edge Case: Finalizer Gets "Stuck"

What if the external API is permanently unavailable, or a bug in your cleanup logic prevents it from ever succeeding? The CR will be stuck in a Terminating state forever. This is a common operational issue.

Mitigation Strategies:

  • Excellent Monitoring: Your controller should have metrics for reconciliation errors and CRs stuck in a terminating state for an extended period.
  • Manual Intervention: An administrator may need to manually intervene. This usually involves either fixing the external issue or manually editing the CR (kubectl edit externaldatabase my-db) to remove the finalizer. This is a last resort, as it will orphan the external resource.
  • Timeouts: You could build a timeout mechanism into your reconcileDelete logic. If deletion fails for over 24 hours, you could update a status condition to DeletionFailedPermanently and stop retrying, alerting an operator. A sketch of this follows the list.
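
A minimal sketch of that timeout idea, measured from the deletionTimestamp the API server already set; the 24-hour window and the DeletionFailedPermanently reason are our choices, not prescribed by any API:

go
// Inside reconcileDelete: give up on automatic cleanup after 24 hours.
if time.Since(db.ObjectMeta.DeletionTimestamp.Time) > 24*time.Hour {
	meta.SetStatusCondition(&db.Status.Conditions, metav1.Condition{
		Type:    "Ready",
		Status:  metav1.ConditionFalse,
		Reason:  "DeletionFailedPermanently",
		Message: "external cleanup did not succeed within 24h; manual intervention required",
	})
	if err := r.Status().Update(ctx, db); err != nil {
		return ctrl.Result{}, err
	}
	// Returning nil stops the retry loop; alert on this condition instead.
	return ctrl.Result{}, nil
}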

Performance: `MaxConcurrentReconciles`

In your main.go, the controller is set up like this:

go
// main.go
err = (&controller.ExternalDatabaseReconciler{
	Client:           mgr.GetClient(),
	Scheme:           mgr.GetScheme(),
	ExternalDBClient: controller.NewMockDBClient(),
}).SetupWithManager(mgr)

This can be configured:

go
// main.go
// Note: Options lives in sigs.k8s.io/controller-runtime/pkg/controller;
// alias that import (e.g. ctrlcontroller) so it doesn't clash with our
// own controller package.
err = ctrl.NewControllerManagedBy(mgr).
	For(&databasev1.ExternalDatabase{}).
	WithOptions(ctrlcontroller.Options{MaxConcurrentReconciles: 5}).
	Complete(&controller.ExternalDatabaseReconciler{ /* ... */ })

By default, MaxConcurrentReconciles is 1. If your operator manages thousands of resources and the external API is fast, you can increase this to process reconciles in parallel. However, if your external API has strict rate limits, you might need to keep this at 1 or implement a client-side rate limiter to avoid being throttled, as sketched below.
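
One way to implement that client-side limiter is to wrap the external client with golang.org/x/time/rate. The wrapper below is our sketch, not part of the article's code; embedding the interface lets unthrottled methods pass through unchanged:

go
package controller

import (
	"context"

	"golang.org/x/time/rate"
)

// rateLimitedClient throttles calls to the external API. Methods we
// don't override fall through to the embedded ExternalDatabaseAPI.
type rateLimitedClient struct {
	ExternalDatabaseAPI
	limiter *rate.Limiter
}

func (c *rateLimitedClient) CreateDatabase(ctx context.Context, name, engine string) (string, error) {
	// Wait blocks until a token is available or the context is cancelled.
	if err := c.limiter.Wait(ctx); err != nil {
		return "", err
	}
	return c.ExternalDatabaseAPI.CreateDatabase(ctx, name, engine)
}

Constructing it with rate.NewLimiter(rate.Limit(5), 1) would cap the operator at five external calls per second regardless of MaxConcurrentReconciles.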

Complete Implementation Example

Here is a mock external API client to make the example runnable.

internal/controller/mock_external_api.go

go
package controller

import (
	"context"
	"fmt"
	"sync"

	"github.com/google/uuid"
)

// ExternalDatabaseAPI is the client interface for an external
// Database-as-a-Service API.
type ExternalDatabaseAPI interface {
	CreateDatabase(ctx context.Context, name, engine string) (string, error)
	DeleteDatabase(ctx context.Context, dbID string) error
	GetDatabaseStatus(ctx context.Context, dbID string) (string, error)
}

// mockDBClient simulates the external service with an in-memory store.
type mockDBClient struct {
	mu  sync.Mutex
	dbs map[string]string // map[dbID]status
}

func NewMockDBClient() ExternalDatabaseAPI {
	return &mockDBClient{
		dbs: make(map[string]string),
	}
}

func (c *mockDBClient) CreateDatabase(ctx context.Context, name, engine string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	// Simulate potential transient errors
	if name == "fail-creation" {
		return "", fmt.Errorf("API error: failed to provision database cluster")
	}

	dbID := uuid.New().String()
	c.dbs[dbID] = "available"
	fmt.Printf("[Mock API] Created database %s with ID %s\n", name, dbID)
	return dbID, nil
}

func (c *mockDBClient) DeleteDatabase(ctx context.Context, dbID string) error {
	c.mu.Lock()
	defer c.mu.Unlock()

	// Idempotency: if the DB doesn't exist, it's a success from a cleanup perspective.
	if _, ok := c.dbs[dbID]; !ok {
		fmt.Printf("[Mock API] Delete called for non-existent DB ID %s. Treating as success.\n", dbID)
		return nil
	}

	delete(c.dbs, dbID)
	fmt.Printf("[Mock API] Deleted database with ID %s\n", dbID)
	return nil
}

func (c *mockDBClient) GetDatabaseStatus(ctx context.Context, dbID string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if status, ok := c.dbs[dbID]; ok {
		return status, nil
	}
	return "", fmt.Errorf("database with ID %s not found", dbID)
}

This simple mock demonstrates the critical idempotent nature of the DeleteDatabase call.
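
A quick unit test (ours, not from the article) pins down the idempotency property that reconcileDelete depends on:

go
package controller

import (
	"context"
	"testing"
)

func TestDeleteDatabaseIsIdempotent(t *testing.T) {
	c := NewMockDBClient()
	id, err := c.CreateDatabase(context.Background(), "orders-db", "postgres")
	if err != nil {
		t.Fatalf("create: %v", err)
	}
	if err := c.DeleteDatabase(context.Background(), id); err != nil {
		t.Fatalf("first delete: %v", err)
	}
	// The second delete must also succeed; reconcileDelete relies on this
	// when it retries after a crash.
	if err := c.DeleteDatabase(context.Background(), id); err != nil {
		t.Fatalf("second delete should be a no-op success, got: %v", err)
	}
}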

Conclusion

The finalizer pattern is not merely a feature of Kubernetes; it is the cornerstone of writing a reliable operator that manages external, stateful resources. By intercepting the deletion process, your controller gains the ability to perform crucial cleanup tasks, preventing orphaned resources and ensuring the integrity of your system.

A production-ready finalizer implementation can be summarized by these principles:

  • Branch Early: Structure your Reconcile loop to immediately check for the deletionTimestamp.
  • Register Intent: Add your finalizer to a resource as soon as you begin managing it.
  • Idempotent Cleanup: Your external resource deletion logic must succeed even if the resource is already gone.
  • Block on Failure: Never remove the finalizer if the cleanup logic fails. Return an error to leverage the controller's backoff and retry mechanism.
  • Remove on Success: The final act of a successful cleanup must be the removal of the finalizer, which cedes control back to Kubernetes garbage collection.

Mastering this pattern moves you from writing basic controllers that simply create resources to building robust, self-healing, production-grade operators that can be trusted to manage the complete lifecycle of critical infrastructure.
