Kubernetes Operators: Finalizers for Stateful App Teardown
The Illusion of Declarative Deletion
As senior engineers working with Kubernetes, we've internalized the power of its declarative API. We define the desired state in a manifest, and a controller works tirelessly to make reality match that state. This works beautifully for stateless resources managed entirely within the cluster. However, the moment an operator needs to manage resources outside of Kubernetes—a Cloud SQL instance, an S3 bucket, a SendGrid API key—a dangerous gap emerges in the declarative model, specifically around deletion.
When a user runs kubectl delete databaseclaim my-db-claim and the object carries no finalizers, the API server removes the DatabaseClaim custom resource (CR) from etcd right away. For the operator watching DatabaseClaim resources, the object simply vanishes. The reconciler is triggered by the delete event, but its request carries only the namespace and name; by the time it tries to read the object, the specification and status, which might contain the ID of the external resource, are gone. The operator has no context to perform a cleanup. The result is an orphaned cloud database, silently accruing costs and becoming a potential security liability.
Standard Kubernetes owner references are insufficient here. An ownerReference can cascade deletion for objects within the same Kubernetes cluster, but it's powerless to command an external API like AWS or GCP.
This is the core problem that finalizers solve. They are a crucial mechanism that allows a controller to intercept the deletion process, execute imperative cleanup logic, and only then permit the Kubernetes API to complete the object's removal. They bridge the gap between Kubernetes's declarative world and the imperative reality of external systems.
Finalizers: A Mechanical Deep Dive
A finalizer is not a complex API object or a special type of controller. Mechanically, it's just a string added to the metadata.finalizers list of any Kubernetes object. For our purposes, this will be our Custom Resource instance.
apiVersion: database.example.com/v1alpha1
kind: CloudDatabase
metadata:
  name: production-postgres
  finalizers:
    - database.example.com/finalizer
spec:
  # ... spec for the database
When a user attempts to delete an object that has one or more entries in its finalizers list, the Kubernetes API server does something unique: it does not delete the object immediately.
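Instead, it performs two actions:
- It adds a deletionTimestamp to the object's metadata. This timestamp records when the deletion was requested.
- It leaves the object in the API server in a 'terminating' state. The object can still be read, and controllers can still update it (for example, to remove their finalizers or write status), but new finalizers can no longer be added.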
The object will remain in this terminating state indefinitely until its metadata.finalizers list is empty.
This is the hook our operator needs. The controller's reconciliation loop, which is constantly watching for changes, will receive an update event for the CloudDatabase object. Inside the Reconcile function, our logic will be:
- Check whether metadata.deletionTimestamp is set. If it is, we know the object is being deleted.
- Check whether our finalizer (database.example.com/finalizer) is present in the finalizers list.
- If both are true, we execute our external cleanup logic (e.g., call the cloud provider's API to delete the database instance).
- Once cleanup succeeds, we remove our finalizer from the metadata.finalizers list and update the object in the Kubernetes API.
Once our controller (and any other controller that might have added its own finalizer) removes its entry, the finalizers list becomes empty. The API server, seeing an object with a deletionTimestamp and an empty finalizers list, finally proceeds with garbage collection and removes the object from etcd.
This process ensures that the object—and its critical spec and status data—remains available to the controller throughout the entire external resource teardown process.
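The API server behavior described above can be modeled in a few lines. The following is only a sketch: the object and store types are hypothetical in-memory stand-ins, not Kubernetes types, and they capture just the two rules that matter here (a delete request on an object with finalizers only stamps deletionTimestamp, and the object disappears once a terminating object's finalizers list is empty).

```go
package main

import (
	"fmt"
	"time"
)

// object is a hypothetical, stripped-down stand-in for an object's metadata.
type object struct {
	Name              string
	Finalizers        []string
	DeletionTimestamp *time.Time
}

// store mimics only the deletion rules discussed in this section.
type store struct {
	objects map[string]*object
}

// Delete removes the object immediately when it has no finalizers.
// Otherwise it only stamps DeletionTimestamp and leaves the object in place.
func (s *store) Delete(name string) {
	obj, ok := s.objects[name]
	if !ok {
		return
	}
	if len(obj.Finalizers) == 0 {
		delete(s.objects, name)
		return
	}
	if obj.DeletionTimestamp == nil {
		now := time.Now()
		obj.DeletionTimestamp = &now
	}
}

// RemoveFinalizer drops one finalizer. If the object is terminating and the
// list is now empty, the pending deletion completes.
func (s *store) RemoveFinalizer(name, finalizer string) {
	obj, ok := s.objects[name]
	if !ok {
		return
	}
	kept := obj.Finalizers[:0]
	for _, f := range obj.Finalizers {
		if f != finalizer {
			kept = append(kept, f)
		}
	}
	obj.Finalizers = kept
	if obj.DeletionTimestamp != nil && len(obj.Finalizers) == 0 {
		delete(s.objects, name)
	}
}

func main() {
	s := &store{objects: map[string]*object{
		"production-postgres": {
			Name:       "production-postgres",
			Finalizers: []string{"database.example.com/finalizer"},
		},
	}}

	s.Delete("production-postgres")
	_, present := s.objects["production-postgres"]
	fmt.Println("present after delete request:", present)

	s.RemoveFinalizer("production-postgres", "database.example.com/finalizer")
	_, present = s.objects["production-postgres"]
	fmt.Println("present after finalizer removal:", present)
}
```

The window between the two calls in main is where a controller gets to run its cleanup with the full object still readable.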
Production Implementation: A Go Operator with Finalizers
Let's build a practical example using Go and the controller-runtime library, the foundation for Kubebuilder and the Operator SDK. We'll create an operator that manages a CloudDatabase CRD. The operator's primary responsibilities will be to provision a (mock) database when a CR is created and, crucially, to deprovision it using a finalizer when the CR is deleted.
1. Defining the CRD and Controller
First, our CloudDatabase API type definition (api/v1alpha1/clouddatabase_types.go):
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// CloudDatabaseSpec defines the desired state of CloudDatabase
type CloudDatabaseSpec struct {
	// Engine specifies the database engine (e.g., "postgres", "mysql")
	Engine string `json:"engine"`
	// Size specifies the database size in GB
	Size int `json:"size"`
}

// CloudDatabaseStatus defines the observed state of CloudDatabase
type CloudDatabaseStatus struct {
	// DBInstanceID is the unique identifier for the external database instance
	DBInstanceID string `json:"dbInstanceId,omitempty"`
	// Status indicates the current state (e.g., "Provisioning", "Available", "Deleting")
	Status string `json:"status,omitempty"`
}

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status

// CloudDatabase is the Schema for the clouddatabases API
type CloudDatabase struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   CloudDatabaseSpec   `json:"spec,omitempty"`
	Status CloudDatabaseStatus `json:"status,omitempty"`
}

//+kubebuilder:object:root=true

// CloudDatabaseList contains a list of CloudDatabase
type CloudDatabaseList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []CloudDatabase `json:"items"`
}

func init() {
	SchemeBuilder.Register(&CloudDatabase{}, &CloudDatabaseList{})
}
2. The Core Reconciler Logic
Now, let's implement the Reconcile method in controllers/clouddatabase_controller.go. This is where the entire state machine, including the finalizer logic, resides.
We will define our finalizer name as a constant.
const cloudDatabaseFinalizer = "database.example.com/finalizer"

// A mock external database service client
type MockDBServiceClient struct{}

func (c *MockDBServiceClient) CreateDatabase(engine string, size int) (string, error) {
	// In a real implementation, this would call a cloud provider API.
	instanceID := "db-" + uuid.New().String()
	log.Log.Info("mock db service: creating database", "instanceID", instanceID)
	time.Sleep(2 * time.Second) // Simulate latency
	return instanceID, nil
}

func (c *MockDBServiceClient) DeleteDatabase(instanceID string) error {
	// In a real implementation, this would call a cloud provider API.
	// This call MUST be idempotent.
	log.Log.Info("mock db service: deleting database", "instanceID", instanceID)
	time.Sleep(2 * time.Second) // Simulate latency
	// Simulate an error 10% of the time to test retry logic
	if rand.Intn(10) == 0 {
		return fmt.Errorf("mock API error: failed to delete instance %s", instanceID)
	}
	return nil
}

func (c *MockDBServiceClient) GetDatabaseStatus(instanceID string) (string, error) {
	// In a real implementation, this would query the cloud provider API.
	return "Available", nil
}
The Reconcile function is the heart of the operator. Note the clear separation between the deletion path and the creation/update path.
import (
	// ... other imports
	"context"
	"fmt"
	"math/rand"
	"time"

	"github.com/go-logr/logr"
	"github.com/google/uuid"
	"k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
	"sigs.k8s.io/controller-runtime/pkg/log"

	databasev1alpha1 ".../api/v1alpha1"
)

// CloudDatabaseReconciler reconciles a CloudDatabase object
type CloudDatabaseReconciler struct {
	client.Client
	Scheme          *runtime.Scheme
	Log             logr.Logger
	DBServiceClient *MockDBServiceClient
}
func (r *CloudDatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := r.Log.WithValues("clouddatabase", req.NamespacedName)

	// 1. Fetch the CloudDatabase instance
	instance := &databasev1alpha1.CloudDatabase{}
	err := r.Get(ctx, req.NamespacedName, instance)
	if err != nil {
		if errors.IsNotFound(err) {
			// Object not found, probably deleted. Return and don't requeue.
			log.Info("CloudDatabase resource not found. Ignoring since object must be deleted")
			return ctrl.Result{}, nil
		}
		// Error reading the object - requeue the request.
		log.Error(err, "Failed to get CloudDatabase")
		return ctrl.Result{}, err
	}

	// 2. The Deletion Path: Check if the object is being deleted
	isMarkedForDeletion := instance.GetDeletionTimestamp() != nil
	if isMarkedForDeletion {
		if controllerutil.ContainsFinalizer(instance, cloudDatabaseFinalizer) {
			// Our finalizer is present, so let's handle external dependency cleanup.
			log.Info("Performing finalizer cleanup for CloudDatabase")
			if err := r.finalizeCloudDatabase(instance); err != nil {
				// If cleanup fails, we must return an error so the reconciliation is retried.
				// The finalizer will NOT be removed, blocking deletion.
				log.Error(err, "Finalizer cleanup failed. Requeuing.")
				return ctrl.Result{}, err
			}

			// Cleanup was successful. Remove our finalizer.
			log.Info("Finalizer cleanup successful. Removing finalizer.")
			controllerutil.RemoveFinalizer(instance, cloudDatabaseFinalizer)
			err := r.Update(ctx, instance)
			if err != nil {
				return ctrl.Result{}, err
			}
		}
		// Stop reconciliation as the item is being deleted
		return ctrl.Result{}, nil
	}

	// 3. The Creation/Update Path: Ensure our finalizer is present
	if !controllerutil.ContainsFinalizer(instance, cloudDatabaseFinalizer) {
		log.Info("Adding finalizer for CloudDatabase")
		controllerutil.AddFinalizer(instance, cloudDatabaseFinalizer)
		err := r.Update(ctx, instance)
		if err != nil {
			return ctrl.Result{}, err
		}
	}

	// 4. Main reconciliation logic: create or update the external database
	if instance.Status.DBInstanceID == "" {
		// Resource doesn't exist yet, create it.
		log.Info("Provisioning new external database")
		instanceID, err := r.DBServiceClient.CreateDatabase(instance.Spec.Engine, instance.Spec.Size)
		if err != nil {
			log.Error(err, "Failed to create external database")
			// Update status to reflect failure (best effort; the error is ignored)
			instance.Status.Status = "FailedProvisioning"
			_ = r.Status().Update(ctx, instance)
			return ctrl.Result{}, err
		}

		// Update the CR status with the new instance ID
		instance.Status.DBInstanceID = instanceID
		instance.Status.Status = "Available"
		log.Info("External database provisioned", "instanceID", instanceID)
		// NOTE: if this status update fails, the instance ID is lost and the
		// retry will provision a second database. A real operator should be
		// able to look the external resource up again before creating it.
		err = r.Status().Update(ctx, instance)
		if err != nil {
			return ctrl.Result{}, err
		}
	}

	// ... additional reconciliation logic for updates could go here ...

	return ctrl.Result{}, nil
}
// finalizeCloudDatabase performs the actual external resource cleanup.
func (r *CloudDatabaseReconciler) finalizeCloudDatabase(instance *databasev1alpha1.CloudDatabase) error {
	log := r.Log.WithValues("clouddatabase", instance.Name)

	if instance.Status.DBInstanceID == "" {
		log.Info("External database instance ID not found in status, nothing to clean up.")
		return nil
	}

	log.Info("Deleting external database", "instanceID", instance.Status.DBInstanceID)
	err := r.DBServiceClient.DeleteDatabase(instance.Status.DBInstanceID)
	if err != nil {
		// This is a critical error. We must not proceed with finalizer removal.
		// The controller will retry this operation on the next reconciliation.
		return fmt.Errorf("failed to delete external database %s: %w", instance.Status.DBInstanceID, err)
	}

	log.Info("Successfully deleted external database", "instanceID", instance.Status.DBInstanceID)
	return nil
}
// SetupWithManager sets up the controller with the Manager.
func (r *CloudDatabaseReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&databasev1alpha1.CloudDatabase{}).
		Complete(r)
}
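The branching in Reconcile can be easier to reason about when reduced to a pure function of three facts about the CR. The sketch below is an illustration only: crState and plan are hypothetical names, not part of controller-runtime, and the step strings are descriptive labels rather than real calls.

```go
package main

import "fmt"

// crState is a hypothetical summary of the three facts Reconcile branches on.
type crState struct {
	Deleting     bool // metadata.deletionTimestamp is set
	HasFinalizer bool // our finalizer is in metadata.finalizers
	Provisioned  bool // status.dbInstanceId is non-empty
}

// plan returns the steps a single reconcile pass would take for a state,
// mirroring the deletion path and the creation/update path.
func plan(s crState) []string {
	if s.Deleting {
		if !s.HasFinalizer {
			// Nothing left for this controller to do.
			return nil
		}
		var steps []string
		if s.Provisioned {
			steps = append(steps, "delete external database")
		}
		return append(steps, "remove finalizer")
	}

	var steps []string
	if !s.HasFinalizer {
		steps = append(steps, "add finalizer")
	}
	if !s.Provisioned {
		steps = append(steps, "provision external database")
	}
	return steps
}

func main() {
	states := []crState{
		{}, // freshly created CR
		{HasFinalizer: true, Provisioned: true},                 // steady state
		{Deleting: true, HasFinalizer: true, Provisioned: true}, // delete requested
		{Deleting: true}, // our finalizer already removed
	}
	for _, s := range states {
		fmt.Printf("%+v -> %q\n", s, plan(s))
	}
}
```

Writing the branches out this way makes it easy to check that no state performs external cleanup without the finalizer still in place.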
Analysis of the Implementation
- The if isMarkedForDeletion block creates a clean, understandable state machine. All deletion logic is contained within this block. All creation/update logic happens outside of it.
- When an object does not yet carry our finalizer, we add it and call Update. This is the registration step. The same reconcile pass then continues to the provisioning logic, and the Update also triggers a further reconciliation that finds the finalizer already present.
- The finalizeCloudDatabase function is designed to be idempotent. It first checks if DBInstanceID exists. If not, it assumes cleanup is already done or was never needed. The mock DeleteDatabase function itself should be idempotent (i.e., calling delete on an already deleted resource should not return an error).
- If finalizeCloudDatabase returns an error, the Reconcile function propagates this error. The controller-runtime manager will catch this and requeue the reconciliation request, typically with exponential backoff. This ensures that transient network errors or API failures from the cloud provider don't prevent cleanup; the operator will keep retrying until it succeeds.
Advanced Scenarios and Edge Cases
Building a production-grade operator requires thinking beyond the happy path. Finalizers introduce their own set of complex failure modes that must be handled.
Edge Case 1: The Stuck `Terminating` State
Problem: A CR is stuck in the Terminating state indefinitely. kubectl get clouddatabase my-db shows the object, but kubectl delete has already been issued.
Cause: This happens when the finalizer logic consistently fails and returns an error, or if there's a bug in the operator that prevents it from ever removing the finalizer. For example, if our DeleteDatabase call to the cloud provider returns a 403 Forbidden error because the credentials have expired, the operator will retry forever, and the finalizer will never be removed.
Solution & Mitigation:
- The log message "Finalizer cleanup failed. Requeuing." should be a high-severity alert. An alert should fire if an object remains in the Terminating state for an excessive period (e.g., > 1 hour).
- Distinguish transient from permanent errors. A transient error (e.g., 503 Service Unavailable) should be retried. A permanent error (e.g., 404 Not Found, 403 Forbidden) might require manual intervention or a different logic path. For a 404, the operator could assume the resource is already gone and proceed to remove the finalizer. For a 403, it might update the CR's status with a CleanupFailed condition and stop retrying, signaling to a human that credentials need to be fixed.
- As a last resort, an administrator can remove the finalizer manually:
kubectl patch clouddatabase my-db -p '{"metadata":{"finalizers":[]}}' --type=merge
This is a dangerous operation. It forces the Kubernetes API to delete the CR, but it will almost certainly orphan the external resource. This should only be done when the external resource has been manually cleaned up or is known to be non-existent.
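The split between transient and permanent errors described in this section could be sketched as a small classification function. Everything named here is an assumption for illustration: apiError is a hypothetical error type carrying an HTTP status, and cleanupAction and decideCleanupAction are made-up names; a real cloud SDK will expose its own error types.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// apiError is a hypothetical error carrying the HTTP status from a provider.
type apiError struct {
	StatusCode int
}

func (e *apiError) Error() string {
	return fmt.Sprintf("cloud API returned %d %s", e.StatusCode, http.StatusText(e.StatusCode))
}

type cleanupAction int

const (
	removeFinalizer  cleanupAction = iota // cleanup done, or resource already gone
	retryWithBackoff                      // transient failure: return the error and requeue
	stopAndReport                         // permanent failure: record a condition, wait for a human
)

// decideCleanupAction maps the result of an external delete call to the
// next step the finalizer logic should take.
func decideCleanupAction(err error) cleanupAction {
	if err == nil {
		return removeFinalizer
	}
	var apiErr *apiError
	if !errors.As(err, &apiErr) {
		// Errors of unknown shape (e.g. network failures) are treated as transient.
		return retryWithBackoff
	}
	switch apiErr.StatusCode {
	case http.StatusNotFound:
		return removeFinalizer
	case http.StatusForbidden:
		return stopAndReport
	default:
		return retryWithBackoff
	}
}

func main() {
	names := map[cleanupAction]string{
		removeFinalizer:  "remove finalizer",
		retryWithBackoff: "retry with backoff",
		stopAndReport:    "stop and report",
	}
	cases := []error{
		nil,
		&apiError{StatusCode: http.StatusNotFound},
		&apiError{StatusCode: http.StatusServiceUnavailable},
		fmt.Errorf("delete failed: %w", &apiError{StatusCode: http.StatusForbidden}),
	}
	for _, err := range cases {
		fmt.Println(err, "->", names[decideCleanupAction(err)])
	}
}
```

Using errors.As rather than a type assertion keeps the classification working when the cleanup function wraps the provider error with %w, as finalizeCloudDatabase does.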
Edge Case 2: Controller Crash During Finalization
Problem: The operator pod crashes or is evicted right after it successfully calls the cloud provider's delete API but before it removes the finalizer from the CR.
Solution: The design is inherently resilient to this. The state is stored in Kubernetes, not in the controller's memory.
- The controller pod restarts.
- The controller-runtime manager starts the reconciliation loop.
- It lists all CloudDatabase objects and finds my-db, which is still in the Terminating state with the finalizer present.
- The Reconcile function is called for my-db.
- The finalizeCloudDatabase function is executed again.
- The DeleteDatabase call is made again for the same instance ID. A well-designed cloud API will see that the resource is already being deleted or is gone and will return a success (or a specific NotFound error that can be interpreted as success).
- The cleanup logic succeeds (for the second time), and this time, the operator successfully removes the finalizer and updates the CR.
- Kubernetes garbage collects the object. The system self-heals.
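The recovery sequence above depends on the second delete call being harmless. A minimal sketch of that property follows, assuming a provider that reports a missing instance with a sentinel error; fakeCloud, errNotFound, and deleteIdempotent are hypothetical names used only for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel for "instance does not exist".
var errNotFound = errors.New("instance not found")

// fakeCloud is a hypothetical in-memory provider used only for illustration.
type fakeCloud struct {
	instances map[string]bool
}

// Delete removes the instance, or reports errNotFound if it is already gone.
func (c *fakeCloud) Delete(id string) error {
	if !c.instances[id] {
		return errNotFound
	}
	delete(c.instances, id)
	return nil
}

// deleteIdempotent treats "already gone" as success, so a cleanup that is
// repeated after a controller restart still converges.
func deleteIdempotent(c *fakeCloud, id string) error {
	err := c.Delete(id)
	if errors.Is(err, errNotFound) {
		return nil
	}
	return err
}

func main() {
	cloud := &fakeCloud{instances: map[string]bool{"db-123": true}}

	// First attempt: the delete succeeds, then the controller "crashes"
	// before it can remove the finalizer.
	fmt.Println("first attempt error:", deleteIdempotent(cloud, "db-123"))

	// After the restart, the same cleanup runs again for the same ID.
	fmt.Println("second attempt error:", deleteIdempotent(cloud, "db-123"))
}
```

Both attempts report no error, so the restarted controller can go on to remove the finalizer.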
Edge Case 3: Forceful Deletion
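Problem: An administrator forces the CR out of the API without letting the finalizer run.
kubectl delete clouddatabase my-db --grace-period=0 --force
Impact: On its own, this command does not bypass finalizers: the API server still sets deletionTimestamp and keeps the object until its finalizers list is empty. In practice, though, it is often the first step of a forced cleanup, followed by the manual finalizer patch shown in Edge Case 1 when the object refuses to disappear. Once the finalizer has been stripped by hand, the object is removed from etcd, the operator's cleanup logic is never triggered, and the external database is orphaned.
Mitigation: This is an operational problem, not a code problem. The primary mitigation is education and RBAC. Teams should be trained that forced deletion and manual finalizer removal are destructive, break-glass-in-case-of-emergency tools, and permission to patch these resources should be restricted accordingly.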
However, a robust operator can have a secondary defense mechanism: a garbage collection controller. This is a separate controller, or a periodic task within the main operator, that doesn't act on Kubernetes events. Instead, it runs on a schedule (e.g., once every 24 hours) and performs the following actions:
- Lists all known external database instances from the cloud provider API.
- Lists all CloudDatabase CRs from the Kubernetes API.
- Compares the two lists. If it finds an external database with no corresponding CloudDatabase CR in Kubernetes, it flags it as a potential orphan and can either automatically delete it or report it for manual review.
Performance and Scalability
For an operator managing thousands of CRs, the finalizer pattern has performance implications.
- Each object's creation involves a CREATE, followed immediately by an UPDATE to add the finalizer. Similarly, deletion involves an UPDATE to remove the finalizer before the object is garbage collected. This roughly doubles the write load for object lifecycle events.
- Deleting a large number of CRs at once makes the controller perform a burst of cleanup work and UPDATE calls in a short period. Throughput can be improved by increasing the number of concurrent reconciles (MaxConcurrentReconciles in the controller options), but this must be balanced against API server rate limits.
- The controller-runtime cache is a key performance feature. When we UPDATE an object to remove its finalizer, that update event is sent back to the controller's watch. The local cache is updated, and the Reconcile function may be triggered again. If the object is still visible, it will have no finalizer but still have a deletionTimestamp; if it is already gone, the Get returns NotFound. Both are expected, and our logic correctly returns ctrl.Result{} to stop reconciliation. It's important to understand this flow to avoid infinite reconciliation loops.
Conclusion
The finalizer pattern is not an optional enhancement for operators managing external resources; it is a fundamental requirement for production-readiness. It is the definitive solution for preventing orphaned resources and ensuring that the lifecycle of a Custom Resource is authoritatively and safely tied to the lifecycle of the stateful workload it represents.
By implementing an idempotent, error-handling cleanup function and registering it with the Kubernetes deletion flow via a finalizer, you elevate your operator from a simple provisioner to a true lifecycle manager. Mastering this pattern—including its edge cases and failure modes—is a hallmark of an advanced Kubernetes engineer and is essential for building the robust, self-healing systems that Kubernetes promises.