K8s Operator Finalizers for Stateful External Resource Management
The Deletion Blind Spot in Declarative APIs
As senior engineers building on Kubernetes, we've embraced the power of the declarative model. We define our desired state in a Custom Resource (CR), and the operator's reconciliation loop works tirelessly to make reality match our specification. This works beautifully for creation and updates. However, a critical lifecycle phase is often mishandled in naive operator implementations: deletion.
When a user executes kubectl delete my-cr my-instance, the Kubernetes API server dutifully removes the object from etcd. The default garbage collector might clean up in-cluster child objects via OwnerReferences, but it has zero awareness of the AWS RDS instance, the Google Cloud Storage bucket, or the Kafka topic your operator provisioned on behalf of that CR. The result is an orphaned, and often costly, external resource. This breaks the declarative promise; the resource's lifecycle is no longer fully managed by the CR.
This is where Finalizers become an indispensable tool in the advanced operator developer's toolkit. They are not mere metadata decoration; they are a fundamental mechanism that lets your controller intercept the deletion process, execute critical cleanup logic, and keep the lifecycle of external resources synchronized with the CR that manages them.
This article assumes you are already comfortable with Go, the Operator SDK or Kubebuilder, and the basic concepts of controllers and Custom Resource Definitions (CRDs). We will not cover how to build a basic operator. Instead, we will focus exclusively on the production-grade implementation of finalizers to solve the external resource cleanup problem.
The Mechanics of a Finalizer-Aware Deletion
A finalizer is simply a string entry in the metadata.finalizers list of any Kubernetes object. When the API server receives a delete request for an object that has one or more finalizers, it does the following:
1. It does not delete the object immediately. Instead, it sets the metadata.deletionTimestamp field to the current time. This is the crucial signal that the object is in a "terminating" state.
2. That update (the newly set deletionTimestamp) triggers a reconciliation event, delivering the terminating object to your controller's Reconcile function.

Your controller's responsibility is now to:

1. Detect the pending deletion by checking that cr.GetDeletionTimestamp() is non-nil.
2. Execute its cleanup logic against the external system.
3. Remove its own finalizer from the metadata.finalizers array.

Once the finalizers array is empty and the deletionTimestamp is set, the API server is finally permitted to delete the object from etcd.
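Concretely, a terminating object carries both fields at once; this is roughly what you would see in kubectl get -o yaml (values illustrative, using the finalizer name this article defines below):

```yaml
metadata:
  name: my-instance
  # Set by the API server when the delete request arrived; immutable once set.
  deletionTimestamp: "2026-01-15T10:30:00Z"
  finalizers:
    # Removal from etcd is blocked until this list is empty.
    - database.example.com/finalizer
```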
A Production Scenario: The `CloudDatabase` Operator
Let's implement this pattern for a CloudDatabase operator. This operator manages a database instance on a hypothetical cloud provider. The goal is to ensure that when a CloudDatabase CR is deleted, the actual database instance in the cloud is also safely deprovisioned.
First, our CRD (api/v1/clouddatabase_types.go):
```go
package v1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// CloudDatabaseSpec defines the desired state of CloudDatabase
type CloudDatabaseSpec struct {
	// Engine specifies the database engine (e.g., "postgres", "mysql").
	Engine string `json:"engine"`

	// Version specifies the engine version.
	Version string `json:"version"`

	// SizeGB specifies the storage size in GB.
	SizeGB int `json:"sizeGb"`
}

// CloudDatabaseStatus defines the observed state of CloudDatabase
type CloudDatabaseStatus struct {
	// InstanceID is the unique identifier of the database in the cloud provider.
	InstanceID string `json:"instanceId,omitempty"`

	// Endpoint is the connection address for the database.
	Endpoint string `json:"endpoint,omitempty"`

	// Status reflects the current state (e.g., "CREATING", "AVAILABLE", "DELETING").
	Status string `json:"status,omitempty"`
}

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status

// CloudDatabase is the Schema for the clouddatabases API
type CloudDatabase struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   CloudDatabaseSpec   `json:"spec,omitempty"`
	Status CloudDatabaseStatus `json:"status,omitempty"`
}

//+kubebuilder:object:root=true

// CloudDatabaseList contains a list of CloudDatabase
type CloudDatabaseList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []CloudDatabase `json:"items"`
}

func init() {
	SchemeBuilder.Register(&CloudDatabase{}, &CloudDatabaseList{})
}
```
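For orientation, a user-facing instance of this CRD might look like the following manifest (the database.example.com API group is hypothetical, chosen to match the finalizer name we define in the next section):

```yaml
apiVersion: database.example.com/v1
kind: CloudDatabase
metadata:
  name: my-db
spec:
  engine: postgres
  version: "15"
  sizeGb: 100
```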
Implementing the Reconcile Loop with Finalizers
Now, let's structure our controller's Reconcile method (controllers/clouddatabase_controller.go). We'll use a placeholder cloudAPI client to simulate interactions with a cloud provider.
We first define our finalizer name. It's best practice to use a domain-qualified name to avoid collisions with other controllers.
```go
const cloudDatabaseFinalizer = "database.example.com/finalizer"

// Mock cloud API client for demonstration purposes.
type mockCloudAPI struct{}

func (m *mockCloudAPI) GetDatabaseStatus(instanceID string) (string, error) { /* ... */ return "", nil }
func (m *mockCloudAPI) CreateDatabase(spec databasev1.CloudDatabaseSpec) (string, error) { /* ... */ return "", nil }
func (m *mockCloudAPI) DeleteDatabase(instanceID string) error { /* ... */ return nil }

// CloudDatabaseReconciler reconciles a CloudDatabase object
type CloudDatabaseReconciler struct {
	client.Client
	Scheme   *runtime.Scheme
	Log      logr.Logger
	CloudAPI *mockCloudAPI // In a real app, this would be a real client.
}
```
The core logic resides in the Reconcile function. We'll break it down into a clear, state-driven flow.
```go
import (
	"context"
	"fmt"
	"time"

	"github.com/go-logr/logr"
	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

	databasev1 "my-operator/api/v1"
)

func (r *CloudDatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := r.Log.WithValues("clouddatabase", req.NamespacedName)

	// 1. Fetch the CloudDatabase instance.
	db := &databasev1.CloudDatabase{}
	if err := r.Get(ctx, req.NamespacedName, db); err != nil {
		// Ignore not-found errors, as they can't be fixed by an immediate
		// requeue; they usually mean the object was already deleted.
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// 2. Examine whether the object is under deletion.
	isMarkedForDeletion := db.GetDeletionTimestamp() != nil
	if isMarkedForDeletion {
		if controllerutil.ContainsFinalizer(db, cloudDatabaseFinalizer) {
			// Run our finalization logic. If it fails, we requeue the
			// reconciliation to retry.
			if err := r.finalizeCloudDatabase(ctx, log, db); err != nil {
				// Don't remove the finalizer if cleanup fails.
				return ctrl.Result{}, err
			}
			// Cleanup was successful. Remove our finalizer so Kubernetes
			// can delete the object.
			log.Info("External database deprovisioned, removing finalizer")
			controllerutil.RemoveFinalizer(db, cloudDatabaseFinalizer)
			if err := r.Update(ctx, db); err != nil {
				return ctrl.Result{}, err
			}
		}
		// Stop reconciliation, as the item is being deleted.
		return ctrl.Result{}, nil
	}

	// 3. Add the finalizer for new objects.
	if !controllerutil.ContainsFinalizer(db, cloudDatabaseFinalizer) {
		log.Info("Adding finalizer for CloudDatabase")
		controllerutil.AddFinalizer(db, cloudDatabaseFinalizer)
		if err := r.Update(ctx, db); err != nil {
			return ctrl.Result{}, err
		}
	}

	// 4. Main reconciliation logic for create/update:
	// check if the external resource exists; if not, create it.
	if db.Status.InstanceID == "" {
		instanceID, err := r.CloudAPI.CreateDatabase(db.Spec)
		if err != nil {
			log.Error(err, "Failed to create external database")
			// Update status with a failure condition here if desired.
			return ctrl.Result{}, err
		}
		db.Status.InstanceID = instanceID
		db.Status.Status = "CREATING"
		if err := r.Status().Update(ctx, db); err != nil {
			return ctrl.Result{}, err
		}
		// Requeue to check status later.
		return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
	}

	// ... existing reconciliation logic to check status, update spec, etc. ...
	return ctrl.Result{}, nil
}
```
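One piece of scaffolding the snippets above omit is the registration of the reconciler with the manager. Nothing finalizer-specific is required here, because setting deletionTimestamp is just an update event and already triggers reconciliation; a minimal sketch of the standard kubebuilder wiring:

```go
// SetupWithManager registers the controller so it reconciles CloudDatabase
// objects, including the update generated when a deletion is requested.
func (r *CloudDatabaseReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&databasev1.CloudDatabase{}).
		Complete(r)
}
```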
The `finalizeCloudDatabase` Function: Handling Asynchronicity and Failures
This is where the most critical and complex logic resides. Cloud provider APIs are rarely synchronous. When you request to delete a database, the API call often returns immediately with a 202 Accepted, and the actual deletion happens in the background over several minutes.
Our finalizer logic must account for this. We cannot remove the finalizer until we have confirmation that the resource is truly gone. This requires a stateful cleanup process.
```go
func (r *CloudDatabaseReconciler) finalizeCloudDatabase(ctx context.Context, log logr.Logger, db *databasev1.CloudDatabase) error {
	// If there's no instance ID in the status, the external resource was
	// likely never created. In this case, there's nothing to clean up.
	if db.Status.InstanceID == "" {
		log.Info("No external database to finalize, InstanceID is empty")
		return nil
	}

	log.Info("Starting finalization for CloudDatabase", "instanceID", db.Status.InstanceID)

	// Check the current status of the external database.
	// This call must be idempotent.
	status, err := r.CloudAPI.GetDatabaseStatus(db.Status.InstanceID)
	if err != nil {
		// Edge Case 1: The resource is already gone. This could happen if it
		// was manually deleted or a previous reconcile failed after deletion.
		if isCloudResourceNotFound(err) { // isCloudResourceNotFound is a hypothetical helper
			log.Info("External database already deleted from cloud provider.")
			return nil // Success, nothing to do.
		}
		// Edge Case 2: API error (permissions, throttling, etc.)
		log.Error(err, "Failed to get database status from cloud API during finalization")
		// We can't proceed, so we return the error to trigger a requeue.
		return err
	}

	switch status {
	case "DELETING":
		// The database is already in the process of being deleted. We just need to wait.
		log.Info("External database is already being deleted. Requeuing to check again later.")
		// Returning an error is the simplest way to make controller-runtime
		// retry, at the cost of exponential backoff and noisy error logs. A
		// custom sentinel error, translated into a timed requeue in Reconcile,
		// is cleaner in production; see the sketch after this function.
		return fmt.Errorf("deletion in progress")
	case "AVAILABLE", "STOPPED":
		// The database exists and we need to initiate deletion.
		log.Info("Initiating deletion of external database", "instanceID", db.Status.InstanceID)
		if err := r.CloudAPI.DeleteDatabase(db.Status.InstanceID); err != nil {
			log.Error(err, "Failed to initiate database deletion")
			// Update status to reflect the failure.
			db.Status.Status = "DELETION_FAILED"
			if updateErr := r.Status().Update(ctx, db); updateErr != nil {
				log.Error(updateErr, "Failed to update status after deletion failure")
			}
			return err // Requeue to retry the deletion call.
		}
		// After successfully initiating deletion, update our status and requeue.
		db.Status.Status = "DELETING"
		if err := r.Status().Update(ctx, db); err != nil {
			return err
		}
		log.Info("Deletion initiated. Requeuing to monitor progress.")
		return fmt.Errorf("deletion initiated, monitoring progress")
	default:
		// Any other status (e.g., "CREATING") is unexpected during deletion.
		log.Info("External database in unexpected state during finalization", "status", status)
		return fmt.Errorf("unexpected cloud resource status: %s", status)
	}
}
```
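As the comment in the DELETING branch concedes, returning a bare error conflates "cleanup is progressing normally" with genuine failures, and every poll is logged as an error. A cleaner variant is sketched here, under the assumption that finalizeCloudDatabase is changed to return the sentinel from its in-progress paths; errDeletionInProgress and handleFinalization are illustrative names, and the standard library errors package is needed:

```go
// errDeletionInProgress signals expected, still-running external cleanup,
// as opposed to a genuine failure.
var errDeletionInProgress = errors.New("external deletion in progress")

// handleFinalization wraps finalizeCloudDatabase; the deletion branch of
// Reconcile calls it and returns its result directly.
func (r *CloudDatabaseReconciler) handleFinalization(ctx context.Context, log logr.Logger, db *databasev1.CloudDatabase) (ctrl.Result, error) {
	if err := r.finalizeCloudDatabase(ctx, log, db); err != nil {
		if errors.Is(err, errDeletionInProgress) {
			// Expected state: poll on a fixed interval instead of letting
			// exponential backoff stretch the gap between checks.
			return ctrl.Result{RequeueAfter: time.Minute}, nil
		}
		// Genuine failure: requeue with backoff and keep the finalizer in place.
		return ctrl.Result{}, err
	}
	// Cleanup is confirmed complete; the caller may now remove the finalizer.
	return ctrl.Result{}, nil
}
```

With this in place, finalizeCloudDatabase would return errDeletionInProgress from its "DELETING" and "deletion initiated" paths instead of fmt.Errorf.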
Key takeaways from this implementation:
- **Idempotency:** The finalizeCloudDatabase function might be called multiple times. It must always check the current state of the world (the cloud resource) before taking action, and it correctly handles the case where the resource is already gone.
- **State tracking:** The controller uses the status subresource to track its own progress (DELETING, DELETION_FAILED). This prevents re-issuing a DeleteDatabase call on every reconciliation.
- **Requeuing:** While deletion is in flight, we never return an empty ctrl.Result{}. We return an error (fmt.Errorf("deletion in progress")) to force controller-runtime to requeue the request. The default exponential backoff is often sufficient, but for long-running deletions, ctrl.Result{RequeueAfter: time.Minute} (as in the sentinel-error sketch above) can prevent tight, useless polling loops.

Advanced Edge Case: The 'Stuck' Finalizer
What happens if your controller has a persistent bug, loses its credentials permanently, or an immutable policy on the cloud provider prevents deletion? The finalizer logic will continuously fail, and the CloudDatabase CR will be stuck in a Terminating state forever. It cannot be deleted via kubectl delete.
This is a real-world operational problem. The only solution is manual intervention.
An administrator with sufficient privileges must manually remove the finalizer from the object. This is a dangerous operation, as it will lead to the exact problem we were trying to solve: an orphaned cloud resource. But sometimes it's the only way to unblock the system.
The command looks like this:
```bash
# First, get the current object YAML
kubectl get clouddatabase my-db -o yaml > my-db.yaml

# Manually edit my-db.yaml and remove the finalizer line:
#   metadata:
#     finalizers:
#     - database.example.com/finalizer   <-- DELETE THIS LINE
# ...then write the edited object back:
kubectl replace -f my-db.yaml

# Or, more surgically with `kubectl patch`:
kubectl patch clouddatabase my-db --type json --patch='[{"op": "remove", "path": "/metadata/finalizers"}]'
```
After this patch, the API server will immediately delete the CR; the operator has been bypassed. This underscores the need for robust error reporting and monitoring in your finalizer logic. When a finalizer fails repeatedly, it should raise alarms so that a human can investigate the root cause (e.g., an invalid IAM role) before resorting to a manual patch.
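Kubernetes Events are a lightweight way to make those failures visible. A sketch, assuming two additions the reconciler above does not have: a Recorder record.EventRecorder field (wired with mgr.GetEventRecorderFor("clouddatabase-controller")) and a corev1 import of k8s.io/api/core/v1:

```go
// recordFinalizeFailure emits a Warning event on the CR itself, so repeated
// cleanup failures surface in `kubectl describe clouddatabase my-db` and can
// drive event-based alerting before anyone reaches for a manual patch.
func (r *CloudDatabaseReconciler) recordFinalizeFailure(db *databasev1.CloudDatabase, err error) {
	r.Recorder.Event(db, corev1.EventTypeWarning, "FinalizeFailed",
		fmt.Sprintf("external cleanup failed: %v", err))
}
```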
Finalizers vs. Owner References
It's crucial to understand why OwnerReferences are not the right tool for this job.
OwnerReferences declare an in-cluster parent/child relationship, for example a Pod owned by a ReplicaSet. When the ReplicaSet is deleted, the garbage collector sees this relationship and automatically deletes the Pod. This entire process is orchestrated by the kube-controller-manager and only works for objects known to the Kubernetes API server. A finalizer, by contrast, intercepts the deletion of a single object so that arbitrary cleanup code can run first, including calls to systems Kubernetes knows nothing about.

Use OwnerReferences to manage the lifecycle of resources your operator creates within the same cluster (like a Service or ConfigMap for your CloudDatabase). Use Finalizers to manage the lifecycle of anything outside the Kubernetes API. A sketch of the in-cluster half follows below.
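To make the division of labor concrete: the ExternalName Service and the ensureEndpointService helper here are hypothetical (and assume the usual corev1 and metav1 imports), but they show how an OwnerReference, not a finalizer, handles an in-cluster object's cleanup:

```go
// ensureEndpointService maintains an ExternalName Service that points at the
// cloud database endpoint. Because the CloudDatabase owns it, the garbage
// collector deletes it automatically when the CR is deleted.
func (r *CloudDatabaseReconciler) ensureEndpointService(ctx context.Context, db *databasev1.CloudDatabase) error {
	svc := &corev1.Service{ObjectMeta: metav1.ObjectMeta{
		Name:      db.Name + "-endpoint",
		Namespace: db.Namespace,
	}}
	_, err := controllerutil.CreateOrUpdate(ctx, r.Client, svc, func() error {
		svc.Spec.Type = corev1.ServiceTypeExternalName
		svc.Spec.ExternalName = db.Status.Endpoint
		// Ownership, not interception: no finalizer logic needed for this object.
		return controllerutil.SetControllerReference(db, svc, r.Scheme)
	})
	return err
}
```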
Conclusion: Mastering the Full Lifecycle
Implementing a finalizer correctly elevates an operator from a simple provisioning tool to a true lifecycle management system. It's the key to fulfilling the declarative promise of Kubernetes by ensuring that kubectl delete is a safe, complete, and predictable operation.
By building idempotent, state-aware cleanup functions, you can prevent costly orphaned resources and provide a seamless, reliable experience for users of your CRDs. While the logic is more complex than a simple create/update loop, it is non-negotiable for any production-grade operator that manages stateful, external systems. The patterns discussed here—detecting the deletionTimestamp, adding and removing the finalizer, and handling asynchronous cleanup with requeues—are the foundation for building robust and resilient controllers that can safely automate infrastructure at scale.