K8s Operators: Finalizers for Stateful External Resource Cleanup
The Declarative-Imperative Impedance Mismatch
As senior engineers building on Kubernetes, we embrace its declarative nature. We define the desired state in a manifest, and controllers work to make reality match that state. This works beautifully for resources native to the cluster. However, the moment an Operator needs to manage a resource outside the Kubernetes API—a managed database on AWS RDS, a bucket in GCS, or a project in GitLab—we hit a fundamental impedance mismatch.
Creating and updating these external resources is a solved problem within the Operator's reconciliation loop. The controller observes the Custom Resource (CR), checks the state of the external resource via its imperative API, and issues commands (CreateDatabase, UpdateInstance) to converge the state. The real challenge, and a common source of production failures, lies in deletion.
When a user runs kubectl delete mycr my-instance, Kubernetes's default behavior is to simply remove the object from etcd. If your Operator is managing an RDS instance tied to that CR, what tells the Operator to call the AWS API to terminate that database? Without a specific mechanism, the CR vanishes, the reconciliation loop for it ceases, and you are left with an orphaned, and expensive, cloud resource.
This article dissects the canonical Kubernetes pattern for solving this problem: finalizers. We will not cover the basics of what an Operator is. We assume you understand CRDs, controllers, and the core reconciliation concept. Instead, we will focus exclusively on architecting and implementing a production-grade, finalizer-driven cleanup process for stateful external resources.
The Race Condition of Deletion Without Finalizers
Before implementing the solution, it's critical to understand why the naive approach fails. A common first attempt is to check for the DeletionTimestamp on the CR object within the Reconcile function.
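When the API server receives a delete request for an object that it cannot remove right away (because the object carries finalizers or, like a Pod, has a graceful deletion period), it sets the metadata.deletionTimestamp field instead of removing the object. This signals that the object is marked for deletion. A controller can watch for this field being non-nil and trigger its cleanup logic. A custom resource with no finalizers gets no such holding period, and that is where the naive approach breaks down.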
Here's what that flawed logic looks like:
// DO NOT USE THIS IN PRODUCTION - FLAWED EXAMPLE
func (r *ManagedDatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := log.FromContext(ctx)

	var managedDatabase v1alpha1.ManagedDatabase
	if err := r.Get(ctx, req.NamespacedName, &managedDatabase); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Check if the object is being deleted
	if managedDatabase.ObjectMeta.DeletionTimestamp != nil {
		log.Info("Object is being deleted, cleaning up external resource")
		// Flawed: This is a race condition!
		if err := r.ExternalDBService.DeleteDatabase(managedDatabase.Spec.DBName); err != nil {
			log.Error(err, "Failed to delete external database")
			return ctrl.Result{}, err // Requeue on failure
		}
		log.Info("Successfully deleted external resource")
		return ctrl.Result{}, nil // Cleanup done
	}

	// ... normal create/update logic ...
	return ctrl.Result{}, nil
}
Why does this fail?
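Without a finalizer, nothing holds the object in the API. The API server removes the CR from etcd as soon as it processes the delete request, so the controller usually never observes a non-nil deletionTimestamp at all: it receives a delete event, and by the time Reconcile runs, the Get call returns NotFound and the spec and status needed to locate the external resource are gone. Even when something else keeps the object around briefly (another controller's finalizer, for instance), there is no guarantee that our controller completes its cleanup before the object disappears. If the controller is slow, under heavy load, or restarts at the wrong moment, the CR will be gone before the DeleteDatabase call can be made or successfully completed. The result is an orphaned resource.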
Finalizers: A Deletion Gatekeeper
A finalizer is a simple concept with powerful implications. It is a string key added to the metadata.finalizers list of an object. When the Kubernetes API server sees a delete request for an object that has one or more finalizers in its list, it does not delete the object. Instead, it performs two actions:
1. It sets metadata.deletionTimestamp to the current time.
2. It leaves the object in the API, making it available to controllers.
The object is now in a "terminating" state. It will remain in the API, and will not be garbage collected, until its metadata.finalizers list is empty.
This behavior transforms the deletion process from a race condition into a predictable, stateful workflow:
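1. A user runs kubectl delete manageddatabase my-db.
2. The API server sees that the object has metadata.finalizers: ["db.example.com/finalizer"]. It sets metadata.deletionTimestamp and stops.
3. The Operator's controller receives an update event for my-db. It checks the object and sees deletionTimestamp is non-nil.
4. The controller runs its cleanup logic against the external system (here, deleting the cloud database).
5. Once cleanup succeeds, the controller removes its finalizer from the finalizers list and updates the object in the Kubernetes API.
6. The API server processes the update to my-db. It notices that deletionTimestamp is set and the finalizers list is now empty. The conditions for deletion are met, and the object is finally removed from etcd.

This mechanism guarantees that your controller has the opportunity to complete its work before the CR disappears.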
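The workflow above can be modelled in a few lines of plain Go. This is only a toy, single-threaded sketch of the API server's delete semantics, not real Kubernetes code: the Object and Store types and their methods are hypothetical names invented for the illustration.

```go
package main

import (
	"fmt"
	"time"
)

// Object is a stripped-down, hypothetical stand-in for Kubernetes object metadata.
type Object struct {
	Name              string
	Finalizers        []string
	DeletionTimestamp *time.Time
}

// Store is a toy model of the API server's storage (not thread-safe).
type Store struct {
	objects map[string]*Object
}

func NewStore() *Store { return &Store{objects: map[string]*Object{}} }

func (s *Store) Create(o *Object) { s.objects[o.Name] = o }

func (s *Store) Get(name string) (*Object, bool) {
	o, ok := s.objects[name]
	return o, ok
}

// Delete mimics a delete request: with no finalizers the object is removed
// immediately; otherwise it is only marked as terminating.
func (s *Store) Delete(name string) {
	o, ok := s.objects[name]
	if !ok {
		return
	}
	if len(o.Finalizers) == 0 {
		delete(s.objects, name)
		return
	}
	if o.DeletionTimestamp == nil {
		now := time.Now()
		o.DeletionTimestamp = &now
	}
}

// RemoveFinalizer mimics the controller's update that drops one finalizer.
// A terminating object whose list becomes empty is removed from the store.
func (s *Store) RemoveFinalizer(name, finalizer string) {
	o, ok := s.objects[name]
	if !ok {
		return
	}
	kept := o.Finalizers[:0]
	for _, f := range o.Finalizers {
		if f != finalizer {
			kept = append(kept, f)
		}
	}
	o.Finalizers = kept
	if o.DeletionTimestamp != nil && len(o.Finalizers) == 0 {
		delete(s.objects, name)
	}
}

func main() {
	s := NewStore()
	s.Create(&Object{Name: "my-db", Finalizers: []string{"db.example.com/finalizer"}})

	s.Delete("my-db")
	o, ok := s.Get("my-db")
	fmt.Println("after delete: exists =", ok, "terminating =", ok && o.DeletionTimestamp != nil)

	// ... the controller would clean up the external database here ...
	s.RemoveFinalizer("my-db", "db.example.com/finalizer")
	_, ok = s.Get("my-db")
	fmt.Println("after finalizer removal: exists =", ok)
}
```

Running it shows the object surviving the delete request in a terminating state and disappearing only once the finalizer is removed. An object created without finalizers would vanish on the first Delete call, which is the unprotected case described earlier.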
Production-Grade Implementation with `controller-runtime`
Let's build a robust implementation using Go and the popular controller-runtime library, which underpins both the Kubebuilder and Operator SDK frameworks.
First, we define our Custom Resource Definition for a ManagedDatabase.
Code Block 1: ManagedDatabase CRD (abbreviated)
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: manageddatabases.db.example.com
spec:
  group: db.example.com
  names:
    kind: ManagedDatabase
    listKind: ManagedDatabaseList
    plural: manageddatabases
    singular: manageddatabase
  scope: Namespaced
  versions:
    - name: v1alpha1
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                engine:
                  type: string
                  enum: ["postgres", "mysql"]
                version:
                  type: string
                size:
                  type: string
                  enum: ["small", "medium", "large"]
            status:
              type: object
              properties:
                dbInstanceId:
                  type: string
                phase:
                  type: string
                endpoint:
                  type: string
      served: true
      storage: true
      subresources:
        status: {}
Next, we define our reconciler struct and the main Reconcile function. Note the managedDatabaseFinalizer constant, which keeps the finalizer name in one place and prevents typos.
Code Block 2: Reconciler Struct and Main Reconcile Function
package controllers

import (
	"context"
	"time"

	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
	"sigs.k8s.io/controller-runtime/pkg/log"

	dbv1alpha1 "my-operator/api/v1alpha1"
	"my-operator/internal/externaldb"
)

const managedDatabaseFinalizer = "db.example.com/finalizer"

// ManagedDatabaseReconciler reconciles a ManagedDatabase object
type ManagedDatabaseReconciler struct {
	client.Client
	Scheme            *runtime.Scheme
	ExternalDBService externaldb.Service // Interface to the external cloud DB provider
}

func (r *ManagedDatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	logger := log.FromContext(ctx)

	// 1. Fetch the ManagedDatabase instance
	instance := &dbv1alpha1.ManagedDatabase{}
	if err := r.Get(ctx, req.NamespacedName, instance); err != nil {
		if client.IgnoreNotFound(err) != nil {
			logger.Error(err, "Unable to fetch ManagedDatabase")
			return ctrl.Result{}, err
		}
		logger.Info("ManagedDatabase resource not found. Ignoring since object must be deleted.")
		return ctrl.Result{}, nil
	}

	// 2. Examine if the object is under deletion
	if !instance.ObjectMeta.DeletionTimestamp.IsZero() {
		// The object is being deleted
		return r.reconcileDelete(ctx, instance)
	}

	// 3. The object is not being deleted, so register the finalizer and reconcile
	return r.reconcileNormal(ctx, instance)
}
Our Reconcile function now acts as a router, delegating to reconcileNormal or reconcileDelete based on the presence of the DeletionTimestamp.
The Normal Path: Adding the Finalizer
When a new CR is created or an existing one is updated, our first order of business is to ensure our finalizer is present. If it's not, we add it. This operation must complete before we attempt to create the external resource. This prevents a scenario where we successfully create the database but the controller crashes before adding the finalizer, leaving the CR unprotected.
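To see why the ordering matters, here is a small crash-injection sketch in plain Go. It is a deliberately simplified model rather than controller code: the world struct, the step functions, and simulate are hypothetical names, and the "crash" is just stopping after the first step.

```go
package main

import "fmt"

// world is a toy model: one CR and one external database.
type world struct {
	crExists     bool
	hasFinalizer bool
	dbExists     bool
}

type step func(w *world)

func addFinalizer(w *world)   { w.hasFinalizer = true }
func createDatabase(w *world) { w.dbExists = true }

// runUntilCrash applies only the first n steps, simulating a controller
// crash before the remaining ones run.
func runUntilCrash(w *world, steps []step, n int) {
	for i := 0; i < n && i < len(steps); i++ {
		steps[i](w)
	}
}

// userDeletes models kubectl delete while the controller is down: the CR
// disappears immediately unless a finalizer is holding it.
func userDeletes(w *world) {
	if !w.hasFinalizer {
		w.crExists = false
	}
}

// orphaned reports whether the database outlived its CR.
func orphaned(w *world) bool { return w.dbExists && !w.crExists }

// simulate crashes after the first step, lets the user delete the CR,
// and reports whether an orphaned database is left behind.
func simulate(steps []step) bool {
	w := &world{crExists: true}
	runUntilCrash(w, steps, 1)
	userDeletes(w)
	return orphaned(w)
}

func main() {
	fmt.Println("finalizer first, orphaned:", simulate([]step{addFinalizer, createDatabase}))
	fmt.Println("database first, orphaned: ", simulate([]step{createDatabase, addFinalizer}))
}
```

With the finalizer added first, a crash leaves a protected CR and no database, so nothing leaks. With the database created first, the same crash leaves a database whose CR can be deleted out from under it.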
Code Block 3: reconcileNormal Implementation
func (r *ManagedDatabaseReconciler) reconcileNormal(ctx context.Context, instance *dbv1alpha1.ManagedDatabase) (ctrl.Result, error) {
	logger := log.FromContext(ctx)

	// Add finalizer if it doesn't exist
	if !controllerutil.ContainsFinalizer(instance, managedDatabaseFinalizer) {
		logger.Info("Adding Finalizer for the ManagedDatabase")
		controllerutil.AddFinalizer(instance, managedDatabaseFinalizer)
		if err := r.Update(ctx, instance); err != nil {
			logger.Error(err, "Failed to update ManagedDatabase with finalizer")
			return ctrl.Result{}, err
		}
	}

	// --- Your regular reconciliation logic goes here ---

	// Check if the external DB exists. If not, create it.
	db, err := r.ExternalDBService.GetDatabase(instance.Status.DBInstanceId)
	if err != nil {
		if externaldb.IsNotFound(err) {
			logger.Info("Creating external database")
			newInstanceId, err := r.ExternalDBService.CreateDatabase(instance.Spec.Engine, instance.Spec.Size)
			if err != nil {
				logger.Error(err, "Failed to create external database")
				// Update status with failure condition
				return ctrl.Result{}, err
			}

			// Update status with new instance ID and phase
			instance.Status.DBInstanceId = newInstanceId
			instance.Status.Phase = "Creating"
			if err := r.Status().Update(ctx, instance); err != nil {
				return ctrl.Result{}, err
			}

			// Requeue to check status later
			return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
		}
		return ctrl.Result{}, err
	}

	// ... logic to check for updates, update status, etc. ...
	logger.Info("External database already exists and is in sync.")
	instance.Status.Phase = "Ready"
	instance.Status.Endpoint = db.Endpoint
	if err := r.Status().Update(ctx, instance); err != nil {
		return ctrl.Result{}, err
	}

	return ctrl.Result{}, nil
}
The Deletion Path: Idempotent Cleanup
This is the core of the finalizer pattern. The reconcileDelete function is called only when the object is terminating. Its sole responsibilities are to perform the cleanup and, upon success, remove the finalizer.
Code Block 4: reconcileDelete Implementation
func (r *ManagedDatabaseReconciler) reconcileDelete(ctx context.Context, instance *dbv1alpha1.ManagedDatabase) (ctrl.Result, error) {
	logger := log.FromContext(ctx)

	if controllerutil.ContainsFinalizer(instance, managedDatabaseFinalizer) {
		// Our finalizer is present, so let's handle external dependency deletion
		logger.Info("Performing finalizer cleanup for ManagedDatabase")
		if err := r.deleteExternalResources(ctx, instance); err != nil {
			// If the cleanup fails, we don't remove the finalizer.
			// This ensures we retry on the next reconciliation.
			logger.Error(err, "Failed to delete external resources")
			return ctrl.Result{}, err
		}

		// Once external resources are cleaned up, remove the finalizer.
		logger.Info("External resources deleted, removing finalizer")
		controllerutil.RemoveFinalizer(instance, managedDatabaseFinalizer)
		if err := r.Update(ctx, instance); err != nil {
			logger.Error(err, "Failed to remove finalizer")
			return ctrl.Result{}, err
		}
	}

	// Stop reconciliation as the item is being deleted
	return ctrl.Result{}, nil
}

// deleteExternalResources handles the actual deletion of the cloud database.
// CRITICAL: This function must be idempotent.
func (r *ManagedDatabaseReconciler) deleteExternalResources(ctx context.Context, instance *dbv1alpha1.ManagedDatabase) error {
	logger := log.FromContext(ctx)

	// If there's no instance ID in status, the external resource was likely never created.
	if instance.Status.DBInstanceId == "" {
		logger.Info("DBInstanceId is empty, assuming external resource was never created.")
		return nil
	}

	logger.Info("Deleting external database", "DBInstanceId", instance.Status.DBInstanceId)
	err := r.ExternalDBService.DeleteDatabase(instance.Status.DBInstanceId)
	if err != nil {
		// If the resource is already gone, we can consider this a success.
		if externaldb.IsNotFound(err) {
			logger.Info("External database already deleted.")
			return nil
		}
		// For any other error, we must return it to trigger a retry.
		return err
	}

	return nil
}
Advanced Scenarios and Edge Case Handling
Writing the happy path is easy. A production-ready controller is defined by how it handles failures.
Edge Case 1: External API is Down
Imagine the cloud provider's API is returning 503 Service Unavailable when deleteExternalResources is called. Our current code will return an error, and controller-runtime will requeue the reconciliation request with exponential backoff. This is the correct and desired behavior. The CR will remain in the Terminating state, visible via kubectl get, until the external API is healthy again and the cleanup can succeed. The finalizer acts as a lock, preventing the CR from disappearing and the resource from being orphaned.
Edge Case 2: Idempotency is Non-Negotiable
Consider this sequence:
1. The controller starts reconcileDelete for a terminating CR.
2. deleteExternalResources successfully calls the cloud API to delete the database.
3. The Operator pod crashes or is restarted before it can call controllerutil.RemoveFinalizer and persist the update.

When the Operator restarts, it will reconcile the CR again and call deleteExternalResources. If this function is not idempotent, it might fail when trying to delete a resource that's already gone. That's why the if externaldb.IsNotFound(err) check is so critical. It treats a "not found" error as a success, allowing the finalizer to be removed and the process to complete.
Your external service client must be able to distinguish between a "not found" error and other transient or permanent API errors.
Edge Case 3: The Stuck Finalizer
What if the external resource cannot be deleted due to a permanent issue? For example, a user applied a deletion lock on the RDS instance directly in the AWS console, or the Operator's IAM permissions were revoked.
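In this scenario, the deleteExternalResources function will consistently fail, and the CR will be stuck in the Terminating state forever. This is a situation that requires manual intervention. An SRE or platform engineer would need to:
1. Inspect the Operator's logs and the CR's status to find out why cleanup keeps failing.
2. Fix the underlying problem (for example, remove the deletion lock or restore the IAM permissions) so the Operator can finish, or delete the external resource by hand.
3. If the CR is still stuck once the external resource is confirmed gone, remove the finalizer manually:
kubectl patch manageddatabase my-db --type='json' -p='[{"op": "remove", "path": "/metadata/finalizers"}]'
This is a powerful but dangerous command: it removes every finalizer on the object, not only this Operator's. It should only be used after confirming that the cleanup work the finalizer was guarding has been completed. Providing clear logging in your reconcileDelete function is essential for making this diagnosis possible.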
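If other controllers also hold finalizers on the object, a narrower patch is safer. The sketch below builds a JSON Patch that removes a single finalizer by index and guards it with a "test" operation, so the patch is rejected if the list changed in the meantime. removeFinalizerPatch and patchOp are hypothetical helpers for illustration; the resulting JSON could be passed to kubectl patch --type='json' or to an API client.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// patchOp is one RFC 6902 JSON Patch operation.
type patchOp struct {
	Op    string `json:"op"`
	Path  string `json:"path"`
	Value string `json:"value,omitempty"`
}

// removeFinalizerPatch builds a JSON Patch that removes one finalizer by
// index, guarded by a "test" op so the patch fails if the list has changed.
func removeFinalizerPatch(finalizers []string, target string) ([]byte, error) {
	for i, f := range finalizers {
		if f == target {
			path := fmt.Sprintf("/metadata/finalizers/%d", i)
			return json.Marshal([]patchOp{
				{Op: "test", Path: path, Value: target},
				{Op: "remove", Path: path},
			})
		}
	}
	return nil, fmt.Errorf("finalizer %q not present", target)
}

func main() {
	current := []string{"other.io/keep", "db.example.com/finalizer"}
	patch, err := removeFinalizerPatch(current, "db.example.com/finalizer")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(patch))
}
```

For the two-finalizer list in main, the patch targets index 1 and leaves other.io/keep in place.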
Performance and Scalability
In a large cluster, your Operator might be managing thousands of CRs, and many could be deleted at once.
* Requeue Strategy: For transient external API errors (like rate limiting), returning an error hands the request back to the workqueue's rate limiter, which retries with exponential backoff. The first few retries follow in quick succession, which can be aggressive. A gentler approach is to return ctrl.Result{RequeueAfter: time.Minute} with a nil error. This tells the controller to wait a fixed duration before trying again, reducing pressure on both the Kubernetes API server and the external service.
* Controller Concurrency: The MaxConcurrentReconciles option, set per controller through controller.Options, determines how many Reconcile calls can run in parallel (the default is 1). If you have a cleanup process that is slow (e.g., waiting for a database to terminate can take minutes), a low concurrency setting means that a few deletions can starve all other reconciliation work (creates, updates). You must tune this value based on the expected latency of your external API calls and the number of CRs you expect to manage.
* Client-Side Rate Limiting: If the external service has a strict API request limit, your Operator can easily overwhelm it during a storm of events (like a namespace deletion that triggers cascading deletes of your CRs). It is a best practice to build client-side rate limiting into your ExternalDBService client, using a token bucket algorithm (e.g., golang.org/x/time/rate).
Conclusion
Finalizers are not an optional feature for any Kubernetes Operator that manages resources outside the cluster; they are a fundamental requirement for robust, leak-free automation. By acting as a gatekeeper for CR deletion, they transform a non-deterministic race condition into a reliable, stateful workflow.
A production-grade finalizer implementation goes beyond the basic mechanics. It demands an idempotent cleanup function that correctly interprets external API errors, a clear strategy for handling stuck resources, and careful consideration of performance under load. By mastering this pattern, you can build Operators that safely extend the power of the Kubernetes declarative model to any system, ensuring that what kubectl apply creates, kubectl delete can reliably destroy.