Idempotent Reconciliation Loops in K8s Operators with Finalizers
The Flaw in Naive Kubernetes Reconciliation
For any engineer who has built a Kubernetes operator beyond a trivial hello-world example, the core reconciliation loop seems straightforward. An event triggers the Reconcile function, which compares the desired state (defined in the Custom Resource) to the actual state of the world and takes action to converge them. This works beautifully for creation and updates. The true complexity, however, surfaces during deletion.
A naive reconciliation loop, when faced with a kubectl delete of a CR, has a fundamental race condition. If the object carries no finalizers, the Kubernetes API server receives the delete request and immediately removes the Custom Resource (CR) object from etcd. By the time your operator's Reconcile function is triggered (if it's triggered at all for the delete event), the object it's supposed to be reconciling is already gone. There's no state, no spec, and no context left to perform a graceful cleanup of the external resources your CR was managing, be it a cloud database, a DNS record, or a user in an external system.
This leads to orphaned resources, security vulnerabilities, and mounting cloud bills. The core problem is that Kubernetes's default deletion is a fire-and-forget operation. To build a robust, production-grade operator, we must intercept this process and make it stateful. This is precisely the problem that Finalizers solve.
A Simple Controller's Deletion Failure
Let's model this failure with a simple ExternalDatabase operator. Its job is to create a database instance in a hypothetical cloud provider's API when a CR is created.
CRD Definition (api/v1/externaldatabase_types.go):
package v1
import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
// ExternalDatabaseSpec defines the desired state of ExternalDatabase
type ExternalDatabaseSpec struct {
DBName string `json:"dbName"`
User string `json:"user"`
Password string `json:"password"`
}
// ExternalDatabaseStatus defines the observed state of ExternalDatabase
type ExternalDatabaseStatus struct {
Provisioned bool `json:"provisioned"`
DBID string `json:"dbId,omitempty"`
}
//+kubebuilder:object:root=true
//+kubebuilder:subresource:status
// ExternalDatabase is the Schema for the externaldatabases API
type ExternalDatabase struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec ExternalDatabaseSpec `json:"spec,omitempty"`
Status ExternalDatabaseStatus `json:"status,omitempty"`
}
//+kubebuilder:object:root=true
// ExternalDatabaseList contains a list of ExternalDatabase
type ExternalDatabaseList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []ExternalDatabase `json:"items"`
}
func init() {
SchemeBuilder.Register(&ExternalDatabase{}, &ExternalDatabaseList{})
}
Naive Reconciler (controllers/externaldatabase_controller.go):
package controllers
import (
"context"
// ... other imports
"k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/runtime"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/log"
dbv1 "my.domain/db-operator/api/v1"
"my.domain/db-operator/internal/mock"
)
// ExternalDatabaseReconciler reconciles an ExternalDatabase object
type ExternalDatabaseReconciler struct {
client.Client
Scheme *runtime.Scheme
// A mock client for our cloud provider
CloudDBAPI *mock.CloudClient
}
func (r *ExternalDatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
logger := log.FromContext(ctx)
dbInstance := &dbv1.ExternalDatabase{}
err := r.Get(ctx, req.NamespacedName, dbInstance)
if err != nil {
if errors.IsNotFound(err) {
// CR was deleted. How do we get the DB ID to clean up?
// We can't. The object is gone.
logger.Info("ExternalDatabase resource not found. Ignoring since object must be deleted.")
return ctrl.Result{}, nil
}
logger.Error(err, "Failed to get ExternalDatabase")
return ctrl.Result{}, err
}
// If status doesn't show provisioned, create the external DB
if !dbInstance.Status.Provisioned {
dbID, err := r.CloudDBAPI.CreateDatabase(dbInstance.Spec.DBName, dbInstance.Spec.User, dbInstance.Spec.Password)
if err != nil {
logger.Error(err, "Failed to create external database")
return ctrl.Result{}, err
}
dbInstance.Status.Provisioned = true
dbInstance.Status.DBID = dbID
if err := r.Status().Update(ctx, dbInstance); err != nil {
logger.Error(err, "Failed to update ExternalDatabase status")
return ctrl.Result{}, err
}
logger.Info("Successfully created external database", "DBID", dbID)
}
return ctrl.Result{}, nil
}
When a user runs kubectl delete externaldatabase my-db, the r.Get call will return a NotFound error. The log message Ignoring since object must be deleted is a surrender. The external database with the ID stored in dbInstance.Status.DBID is now an orphan, forever running in your cloud account.
The Finalizer Pattern: A Deletion Gatekeeper
A finalizer is simply a string added to an object's metadata.finalizers list. It acts as a locking mechanism. As long as this list is not empty, the Kubernetes API server will not fully delete the object. Instead, it performs a "soft deletion":
- metadata.deletionTimestamp is set to the current time.
- The object is now considered to be in a "terminating" state.
- The object remains visible to API requests (and thus to your operator).
- The operator's reconciliation loop is triggered for this state change.
Your operator is now responsible for performing cleanup logic and then, as the very last step, removing its finalizer from the list. Once the finalizers list is empty and the deletionTimestamp is set, the API server finally removes the object from etcd.
This pattern transforms deletion from a single, atomic action into a multi-step, stateful process that your controller can manage.
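The soft-deletion rule can be illustrated with a small, stdlib-only model. This is a sketch under stated assumptions, not the API server's actual implementation: Store, Object, and their methods are hypothetical names invented here to mimic how a delete request behaves with and without finalizers.

```go
package main

import "time"

// Object is a hypothetical stand-in for the metadata of a Kubernetes object.
type Object struct {
	Name              string
	Finalizers        []string
	DeletionTimestamp *time.Time
}

// Store is a hypothetical in-memory substitute for the API server and etcd.
type Store struct {
	objects map[string]*Object
}

// NewStore returns an empty store.
func NewStore() *Store { return &Store{objects: map[string]*Object{}} }

// Create adds an object to the store.
func (s *Store) Create(o *Object) { s.objects[o.Name] = o }

// Get returns the object and whether it is still visible.
func (s *Store) Get(name string) (*Object, bool) {
	o, ok := s.objects[name]
	return o, ok
}

// Delete mimics a delete request. With finalizers present it only marks the
// object as terminating; without them the object is removed immediately.
func (s *Store) Delete(name string) {
	o, ok := s.objects[name]
	if !ok {
		return
	}
	if len(o.Finalizers) > 0 {
		if o.DeletionTimestamp == nil {
			now := time.Now()
			o.DeletionTimestamp = &now
		}
		return
	}
	delete(s.objects, name)
}

// RemoveFinalizer drops one finalizer. If the object is terminating and the
// list becomes empty, the pending deletion is completed.
func (s *Store) RemoveFinalizer(name, finalizer string) {
	o, ok := s.objects[name]
	if !ok {
		return
	}
	kept := o.Finalizers[:0]
	for _, f := range o.Finalizers {
		if f != finalizer {
			kept = append(kept, f)
		}
	}
	o.Finalizers = kept
	if o.DeletionTimestamp != nil && len(o.Finalizers) == 0 {
		delete(s.objects, name)
	}
}
```

In this model, deleting an object without finalizers makes it vanish at once, while an object guarded by db.my.domain/finalizer stays readable with a deletion timestamp until RemoveFinalizer empties the list.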
The Finalizer Workflow in Detail
1. A user creates an ExternalDatabase CR.
2. The reconciler sees that deletionTimestamp is nil (it's not being deleted) and that its specific finalizer (e.g., db.my.domain/finalizer) is not present. Its first action is to add this finalizer and update the object. It then returns a Requeue result to process the object again in its new state.
3. On the next reconciliation, the reconciler provisions the external database and records its DBID in the CR's status.
4. A user runs kubectl delete externaldatabase my-db.
5. The API server sets the deletionTimestamp but leaves the object in place because the finalizer is still there.
6. The reconciler is triggered by this update and checks if object.GetDeletionTimestamp() != nil. This condition is now true. The reconciler executes its cleanup logic, using the DBID from the status to call the cloud provider's API to delete the database.
7. Once cleanup succeeds, the reconciler removes its finalizer from the metadata.finalizers list and updates the object.
8. The API server sees an object with a deletionTimestamp and an empty finalizers list. It now completes the deletion, removing the object from etcd.
Production-Grade Implementation with Kubebuilder
Let's refactor our ExternalDatabaseReconciler to correctly implement this pattern. We'll use the controller-runtime/pkg/controller/controllerutil package, which provides helpers for managing finalizers.
controllers/externaldatabase_controller.go (with Finalizers):
package controllers
import (
"context"
dbv1 "my.domain/db-operator/api/v1"
"my.domain/db-operator/internal/mock"
"github.com/go-logr/logr"
"k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/runtime"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
"sigs.k8s.io/controller-runtime/pkg/log"
)
// A unique name for our finalizer. It's a best practice to use a
// domain-qualified name to avoid conflicts with other controllers.
const databaseFinalizer = "db.my.domain/finalizer"
// ExternalDatabaseReconciler reconciles an ExternalDatabase object
type ExternalDatabaseReconciler struct {
client.Client
Scheme *runtime.Scheme
CloudDBAPI *mock.CloudClient
}
func (r *ExternalDatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
logger := log.FromContext(ctx).WithValues("externaldatabase", req.NamespacedName)
dbInstance := &dbv1.ExternalDatabase{}
if err := r.Get(ctx, req.NamespacedName, dbInstance); err != nil {
if errors.IsNotFound(err) {
logger.Info("ExternalDatabase resource not found. Object may have been deleted after finalizer removal.")
return ctrl.Result{}, nil
}
logger.Error(err, "Failed to get ExternalDatabase")
return ctrl.Result{}, err
}
// Check if the instance is being deleted
isMarkedForDeletion := dbInstance.GetDeletionTimestamp() != nil
if isMarkedForDeletion {
if controllerutil.ContainsFinalizer(dbInstance, databaseFinalizer) {
// Run our finalization logic. If it fails, we'll retry on the next reconcile.
if err := r.finalizeExternalDatabase(ctx, logger, dbInstance); err != nil {
// Don't remove the finalizer if cleanup fails.
// The controller will retry later.
return ctrl.Result{}, err
}
// Cleanup was successful. Remove the finalizer.
logger.Info("External database cleaned up successfully. Removing finalizer.")
controllerutil.RemoveFinalizer(dbInstance, databaseFinalizer)
if err := r.Update(ctx, dbInstance); err != nil {
return ctrl.Result{}, err
}
}
// Stop reconciliation as the item is being deleted
return ctrl.Result{}, nil
}
// The object is not being deleted, so we add our finalizer if it doesn't exist.
if !controllerutil.ContainsFinalizer(dbInstance, databaseFinalizer) {
logger.Info("Adding finalizer for ExternalDatabase")
controllerutil.AddFinalizer(dbInstance, databaseFinalizer)
if err := r.Update(ctx, dbInstance); err != nil {
return ctrl.Result{}, err
}
// Requeue immediately after adding the finalizer to ensure the next reconcile loop sees it.
return ctrl.Result{Requeue: true}, nil
}
// This is our main reconciliation logic for creation and updates.
if !dbInstance.Status.Provisioned {
logger.Info("Provisioning external database")
dbID, err := r.CloudDBAPI.CreateDatabase(dbInstance.Spec.DBName, dbInstance.Spec.User, dbInstance.Spec.Password)
if err != nil {
logger.Error(err, "Failed to create external database")
return ctrl.Result{}, err
}
dbInstance.Status.Provisioned = true
dbInstance.Status.DBID = dbID
if err := r.Status().Update(ctx, dbInstance); err != nil {
logger.Error(err, "Failed to update ExternalDatabase status")
return ctrl.Result{}, err
}
logger.Info("Successfully provisioned external database", "DBID", dbID)
}
return ctrl.Result{}, nil
}
// finalizeExternalDatabase contains the logic to clean up external resources.
func (r *ExternalDatabaseReconciler) finalizeExternalDatabase(ctx context.Context, logger logr.Logger, db *dbv1.ExternalDatabase) error {
logger.Info("Starting finalization for ExternalDatabase", "DBID", db.Status.DBID)
if db.Status.DBID == "" {
logger.Info("External database ID is not set, nothing to clean up.")
return nil
}
err := r.CloudDBAPI.DeleteDatabase(db.Status.DBID)
if err != nil {
// This is a critical edge case we will discuss below.
// If the resource is already gone, we should not fail.
if mock.IsNotFoundError(err) {
logger.Info("External database already deleted, cleanup is considered successful.")
return nil
}
logger.Error(err, "Failed to delete external database during finalization")
return err
}
logger.Info("Successfully deleted external database", "DBID", db.Status.DBID)
return nil
}
// SetupWithManager sets up the controller with the Manager.
func (r *ExternalDatabaseReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&dbv1.ExternalDatabase{}).
Complete(r)
}
This structure is fundamentally more robust. The clear separation of logic—handling deletion, then adding the finalizer, then normal reconciliation—makes the controller's behavior predictable and resilient.
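To see why the order of the three branches matters, the lifecycle can be replayed against a stripped-down model. The sketch below does not use controller-runtime; DB, Cloud, and reconcile are hypothetical stand-ins for the CR, the cloud API, and the Reconcile method, and the returned labels exist only to make each step observable.

```go
package main

// finalizerName matches the constant used in the reconciler above.
const finalizerName = "db.my.domain/finalizer"

// DB is a simplified, hypothetical stand-in for the ExternalDatabase CR.
type DB struct {
	Name        string
	Finalizers  []string
	Terminating bool   // stands in for a non-nil metadata.deletionTimestamp
	DBID        string // stands in for Status.DBID
}

// Cloud is a hypothetical in-memory cloud API keyed by database ID.
type Cloud map[string]bool

func hasFinalizer(db *DB) bool {
	for _, f := range db.Finalizers {
		if f == finalizerName {
			return true
		}
	}
	return false
}

func removeFinalizer(db *DB) {
	kept := db.Finalizers[:0]
	for _, f := range db.Finalizers {
		if f != finalizerName {
			kept = append(kept, f)
		}
	}
	db.Finalizers = kept
}

// reconcile applies the three branches in order and returns a label
// describing the action it took.
func reconcile(db *DB, cloud Cloud) string {
	// 1. Deletion is handled first so a terminating object is never
	//    re-provisioned or given a fresh finalizer.
	if db.Terminating {
		if !hasFinalizer(db) {
			return "nothing to do"
		}
		delete(cloud, db.DBID) // deleting a missing key is a no-op
		removeFinalizer(db)
		return "cleaned up"
	}
	// 2. The finalizer is added before any external resource exists.
	if !hasFinalizer(db) {
		db.Finalizers = append(db.Finalizers, finalizerName)
		return "finalizer added"
	}
	// 3. Normal reconciliation: provision once, then converge to a no-op.
	if db.DBID == "" {
		db.DBID = "id-" + db.Name
		cloud[db.DBID] = true
		return "provisioned"
	}
	return "nothing to do"
}
```

Calling reconcile repeatedly on a fresh DB yields "finalizer added", then "provisioned", then "nothing to do". After Terminating is set, the next call returns "cleaned up" and leaves both the cloud map and the finalizer list empty.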
Advanced Edge Cases and Idempotency
A working finalizer is just the start. Production environments are chaotic. Operators crash, networks fail, and APIs return unexpected errors. Our logic must be idempotent and resilient to these failures.
Edge Case 1: Partial Cleanup Failure
Problem: What happens if the call to r.CloudDBAPI.DeleteDatabase fails due to a transient network error or a temporary 503 from the cloud provider's API?
Solution: Our current code already handles this correctly. The finalizeExternalDatabase function returns an error. This causes the reconciliation to fail, and controller-runtime will automatically requeue the request with an exponential backoff. The finalizer is not removed. On the next attempt, finalizeExternalDatabase is called again. Because the external deletion operation is idempotent (deleting an already-deleted resource should either succeed or return a specific error), the system will eventually self-heal.
Key Principle: The finalizer must only be removed after you have absolute certainty that the external resource is gone. Never remove the finalizer optimistically.
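The principle can be demonstrated with a deliberately flaky fake client. In this sketch, flakyCloud, terminatingDB, and retryUntilDone are hypothetical names; the retry loop is a plain stand-in for the controller's requeue behavior and ignores backoff entirely.

```go
package main

import "errors"

// errUnavailable simulates a transient failure such as a 503.
var errUnavailable = errors.New("cloud API temporarily unavailable")

// flakyCloud is a hypothetical client whose DeleteDatabase fails a fixed
// number of times before succeeding.
type flakyCloud struct {
	failuresLeft int
	deleted      bool
}

func (c *flakyCloud) DeleteDatabase(id string) error {
	if c.failuresLeft > 0 {
		c.failuresLeft--
		return errUnavailable
	}
	c.deleted = true
	return nil
}

// terminatingDB is a hypothetical stand-in for a CR that is being deleted.
type terminatingDB struct {
	DBID         string
	HasFinalizer bool
}

// reconcileDeletion removes the finalizer only after cleanup has succeeded.
// On error the finalizer stays in place and the caller is expected to retry.
func reconcileDeletion(db *terminatingDB, cloud *flakyCloud) error {
	if !db.HasFinalizer {
		return nil
	}
	if err := cloud.DeleteDatabase(db.DBID); err != nil {
		return err
	}
	db.HasFinalizer = false
	return nil
}

// retryUntilDone stands in for the controller's requeue loop and reports how
// many attempts were made, up to maxAttempts.
func retryUntilDone(db *terminatingDB, cloud *flakyCloud, maxAttempts int) int {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err := reconcileDeletion(db, cloud); err == nil {
			return attempt
		}
	}
	return maxAttempts
}
```

While DeleteDatabase keeps failing, HasFinalizer stays true, so the CR would remain visible and the cleanup would be retried rather than silently abandoned.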
Edge Case 2: Operator Crash During Cleanup
Problem: This is a more subtle but critical failure mode. Consider this sequence:
1. finalizeExternalDatabase successfully calls r.CloudDBAPI.DeleteDatabase.
2. The operator crashes or is restarted before it executes controllerutil.RemoveFinalizer and r.Update(ctx, dbInstance).
Solution: When the operator restarts, it will receive a reconcile event for the ExternalDatabase CR, which still has its deletionTimestamp and finalizer. The finalizeExternalDatabase function will be called again.
This is where the idempotency of our cleanup logic becomes paramount. The r.CloudDBAPI.DeleteDatabase function will now be called for a resource that no longer exists. A well-behaved API will return a 404 Not Found error. Our code must correctly interpret this specific error not as a failure, but as a success condition for cleanup.
As shown in the code snippet:
err := r.CloudDBAPI.DeleteDatabase(db.Status.DBID)
if err != nil {
// THIS IS THE CRITICAL PART
if mock.IsNotFoundError(err) { // Assume mock.IsNotFoundError checks for a 404-like error
logger.Info("External database already deleted, cleanup is considered successful.")
return nil // Return success!
}
logger.Error(err, "Failed to delete external database during finalization")
return err // For any other error, retry.
}
By treating NotFound as success, we make our finalizer logic resilient to crashes. The operator can safely re-run the cleanup, confirm the resource is gone, and then proceed to remove the finalizer, allowing the CR to be garbage collected.
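The article leaves mock.IsNotFoundError unspecified. One plausible shape for it, assuming the mock client reports failures through a typed error carrying an HTTP-like status code, is sketched below. APIError and deleteIdempotently are hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// APIError is a hypothetical error type returned by the mock cloud client.
type APIError struct {
	StatusCode int
	Message    string
}

func (e *APIError) Error() string {
	return fmt.Sprintf("cloud API error %d: %s", e.StatusCode, e.Message)
}

// IsNotFoundError reports whether err, or any error it wraps, is a 404 from
// the cloud API. A nil error is not a not-found error.
func IsNotFoundError(err error) bool {
	var apiErr *APIError
	return errors.As(err, &apiErr) && apiErr.StatusCode == 404
}

// deleteIdempotently treats "already gone" as success, which makes it safe
// to call any number of times for the same ID.
func deleteIdempotently(deleteFn func(id string) error, id string) error {
	err := deleteFn(id)
	if err == nil || IsNotFoundError(err) {
		return nil
	}
	return fmt.Errorf("deleting database %q: %w", id, err)
}
```

Using errors.As rather than a direct type assertion keeps the check working when the client wraps the underlying error with extra context.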
Edge Case 3: Finalizer Stuck on an Unrecoverable Error
Problem: What if the cloud API is permanently broken, or the credentials used by the operator have expired? The cleanup logic will fail indefinitely, and the finalizer will never be removed. The CR will be stuck in a Terminating state forever.
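Solution: This is less of a coding problem and more of an operational one. The solution involves monitoring and alerting on CRs that stay in the Terminating state longer than expected, then fixing the underlying cause so the operator can finish its cleanup. If that is not possible, an administrator can delete the external resource by hand and then remove the finalizer manually: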
kubectl patch externaldatabase my-db --type json --patch='[ { "op": "remove", "path": "/metadata/finalizers" } ]'
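This manual step should be a last resort, as it breaks the automated guarantee provided by the operator, and the external resource must then be cleaned up by hand. Note that the patch above removes the entire metadata.finalizers list, including finalizers owned by other controllers. It's a necessary escape hatch for unrecoverable situations.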
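A stuck finalizer is easier to catch if something flags objects that have been terminating for too long. The check itself is small. The sketch below is one possible formulation: terminatingAge, isStuck, and the idea of a fixed threshold are assumptions of this example, and how the result feeds into metrics or alerts is left open.

```go
package main

import "time"

// terminatingAge returns how long an object has been terminating, or zero if
// it has no deletionTimestamp.
func terminatingAge(deletionTimestamp *time.Time, now time.Time) time.Duration {
	if deletionTimestamp == nil {
		return 0
	}
	return now.Sub(*deletionTimestamp)
}

// isStuck reports whether an object that has been terminating for age is
// still held by at least one finalizer after threshold has elapsed.
// The threshold is an operational choice: it should comfortably exceed the
// time a normal cleanup takes.
func isStuck(age time.Duration, finalizers []string, threshold time.Duration) bool {
	return len(finalizers) > 0 && age > threshold
}
```

A periodic job could compute terminatingAge for each ExternalDatabase and raise an alert whenever isStuck returns true, prompting a human to investigate before resorting to the manual patch.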
Performance and Optimization Considerations
While correctness is paramount, performance matters for operators running at scale.
Requeue Strategies
Understanding the return value of Reconcile is key:
* ctrl.Result{}, nil: Reconciliation was successful. Don't requeue unless an external event (a watch on the CR or owned resources) occurs.
* ctrl.Result{Requeue: true}, nil: Reconciliation was successful, but you want to requeue immediately. We used this after adding the finalizer to ensure the next loop operates on the updated object state.
* ctrl.Result{RequeueAfter: duration}, nil: Requeue after a specific delay. This is useful for polling an external system that doesn't provide events. For example, if creating a database takes time, you might requeue every 30 seconds to check its status.
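* ctrl.Result{}, err: Reconciliation failed. The controller-runtime will requeue with a per-item exponential backoff. With the default rate limiter, the delay starts in the millisecond range and doubles on each consecutive failure (roughly 5ms, 10ms, 20ms, ...) up to a cap of about 16 minutes. This is the correct response for transient failures in finalization logic.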
Avoid using Requeue: true in a loop without a state change, as this can lead to a CPU-intensive busy-loop.
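The shape of a per-item exponential backoff is easy to compute by hand. The function below is a stdlib-only sketch, not the workqueue's own code. The 5ms base and 1000s cap mentioned in the note after it are what we understand the default limiter to use, and should be treated as an assumption to verify against your controller-runtime version.

```go
package main

import "time"

// backoffDelay returns the wait before the next retry after `failures`
// consecutive failures: base * 2^failures, capped at maxDelay.
func backoffDelay(base, maxDelay time.Duration, failures int) time.Duration {
	delay := base
	for i := 0; i < failures; i++ {
		// Stop before doubling would exceed the cap (this also avoids overflow).
		if delay > maxDelay/2 {
			return maxDelay
		}
		delay *= 2
	}
	if delay > maxDelay {
		return maxDelay
	}
	return delay
}
```

With base 5ms and a cap of 1000s, the first retries come almost immediately (5ms, 10ms, 20ms, 40ms), and the cap is reached after 18 consecutive failures. A persistently failing finalizer therefore settles into retrying about once every 1000 seconds rather than hammering the cloud API.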
Predicate Functions for Filtering Events
By default, your Reconcile function is triggered for every change to a watched resource. Many of these changes are irrelevant, such as status-only updates made by your own controller. This can cause unnecessary reconciliation cycles.
controller-runtime allows you to specify Predicates to filter which events trigger a reconciliation. A common and highly effective predicate is predicate.GenerationChangedPredicate, which drops update events in which the metadata.generation field of the object did not change. For custom resources with the status subresource enabled, the API server increments this field when the object's spec is modified, but not for status-only updates or for metadata-only updates such as labels, annotations, or finalizers.
Setup with a Predicate (controllers/externaldatabase_controller.go):
// In your SetupWithManager function
import "sigs.k8s.io/controller-runtime/pkg/predicate"
// ...
func (r *ExternalDatabaseReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&dbv1.ExternalDatabase{}).
// Only reconcile on generation changes, not on status updates.
// Deletion is still observed: to the best of our knowledge the API server
// bumps metadata.generation when it first sets deletionTimestamp.
// Verify this against your cluster version before relying on it.
WithEventFilter(predicate.GenerationChangedPredicate{}).
Complete(r)
}
This simple addition can dramatically reduce the load on your operator and the Kubernetes API server in a busy cluster, as it prevents self-induced reconciliation loops caused by status updates.
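The effect of a generation-based filter on update events can be modeled in a few lines. This sketch does not call controller-runtime. Meta, generationChanged, and generationOrDeletion are hypothetical names, and the second function is a more defensive variant suggested here, not something the article's code uses.

```go
package main

// Meta is a hypothetical stand-in for the metadata fields that matter when
// filtering update events.
type Meta struct {
	Generation  int64
	Terminating bool // stands in for a non-nil metadata.deletionTimestamp
}

// generationChanged models the update filter: the event passes only if
// metadata.generation differs between the old and new object.
func generationChanged(oldObj, newObj Meta) bool {
	return oldObj.Generation != newObj.Generation
}

// generationOrDeletion is a defensive variant that also passes the update in
// which the object first becomes terminating, so deletion handling does not
// depend on the generation being bumped at that moment.
func generationOrDeletion(oldObj, newObj Meta) bool {
	becameTerminating := !oldObj.Terminating && newObj.Terminating
	return generationChanged(oldObj, newObj) || becameTerminating
}
```

A status-only update and a finalizer-only update both leave the generation untouched, so neither passes generationChanged. That is why, with such a filter in place, the explicit requeue after adding the finalizer is what carries the reconciler on to provisioning. In controller-runtime, a comparable combination could likely be built with predicate.Or and a custom predicate, though that wiring is not shown here.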
Conclusion: Beyond Simple Automation
The Finalizer pattern is a cornerstone of reliable Kubernetes operator development. It elevates an operator from a simple automation script to a true state machine capable of managing the full lifecycle of external resources. By intercepting the deletion process, enforcing idempotency in cleanup logic, and anticipating failure modes like operator crashes and API errors, we can build controllers that are robust, predictable, and production-ready.
For senior engineers working in the Kubernetes ecosystem, mastering this pattern is not optional. It is the fundamental technique for ensuring that your automation doesn't leave a trail of orphaned resources and technical debt. The difference between a toy operator and one you can trust with critical infrastructure often lies in the careful and correct implementation of a finalizer-driven reconciliation loop.