Managing External Resources with Kubernetes Finalizers in Custom Operators
The Lifecycle Mismatch: Why `kubectl delete` Isn't Enough
In a mature Kubernetes environment, operators are the cornerstone of automation, extending the Kubernetes API to manage complex, stateful applications and external resources. A common task for an operator is to provision an external resource—like an AWS RDS instance, a Google Cloud Storage bucket, or a SaaS subscription—in response to a Custom Resource (CR) being created. The problem arises during deletion. When a user executes kubectl delete my-cr, the Kubernetes garbage collector is swift and efficient. It marks the object for deletion and removes it. However, the Kubernetes control plane has no intrinsic knowledge of the external RDS instance this CR represented. The result is an orphaned resource, silently accruing costs and becoming a maintenance liability.
This lifecycle desynchronization is a fundamental challenge in operator development. The declarative nature of Kubernetes applies to in-cluster resources via ownerReferences, which enable cascading deletion. But this mechanism stops at the cluster boundary. To bridge this gap, Kubernetes provides a powerful, albeit often misunderstood, mechanism: Finalizers.
A finalizer is not a piece of code; it's a metadata key. Specifically, it's a string added to the metadata.finalizers array of an object. When the Kubernetes API server sees a delete request for an object that has a non-empty finalizers list, it does not immediately delete it. Instead, it updates the object's metadata.deletionTimestamp to the current time and leaves the object in a Terminating state. The object remains visible via the API until its finalizers list is empty.
This behavior transforms a simple deletion into a two-phase commit process, providing a hook for controllers to execute pre-delete cleanup logic. It is the controller's sole responsibility to detect the deletionTimestamp, perform the necessary external cleanup, and, only upon successful completion, remove its finalizer from the list. This article provides a comprehensive, production-focused guide to implementing this pattern correctly using Go and the controller-runtime library.
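To make the two-phase flow concrete, here is a deliberately simplified, stdlib-only Go model of the API server's finalizer semantics. The types and function names are invented for illustration; this is a toy, not real apiserver code:

```go
package main

import (
	"fmt"
	"time"
)

// object is a toy stand-in for a Kubernetes object's metadata.
type object struct {
	Name              string
	Finalizers        []string
	DeletionTimestamp *time.Time
}

// store is a toy stand-in for the API server's object storage.
type store map[string]*object

// requestDelete models what the API server does on a delete request:
// if finalizers remain, it only sets deletionTimestamp; otherwise it
// removes the object immediately.
func (s store) requestDelete(name string) {
	obj, ok := s[name]
	if !ok {
		return
	}
	if len(obj.Finalizers) > 0 {
		now := time.Now()
		obj.DeletionTimestamp = &now // object enters "Terminating"
		return
	}
	delete(s, name) // no finalizers: gone immediately
}

// removeFinalizer models a controller removing its finalizer after
// external cleanup; once the list is empty and deletionTimestamp is
// set, the object is garbage collected.
func (s store) removeFinalizer(name, fin string) {
	obj, ok := s[name]
	if !ok {
		return
	}
	kept := obj.Finalizers[:0]
	for _, f := range obj.Finalizers {
		if f != fin {
			kept = append(kept, f)
		}
	}
	obj.Finalizers = kept
	if obj.DeletionTimestamp != nil && len(obj.Finalizers) == 0 {
		delete(s, name)
	}
}

func main() {
	s := store{"my-db": {Name: "my-db", Finalizers: []string{"database.example.com/finalizer"}}}
	s.requestDelete("my-db")
	fmt.Println("still present after delete:", s["my-db"] != nil) // true
	s.removeFinalizer("my-db", "database.example.com/finalizer")
	fmt.Println("present after finalizer removed:", s["my-db"] != nil) // false
}
```

The key observation is that deletion becomes a state transition, not an event: the object lingers in Terminating until every registered party has signed off.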
Scenario: The `ManagedDatabase` Operator
To ground our discussion in a practical example, we will build an operator that manages a hypothetical ManagedDatabase resource. This CR will represent a database instance in an external, third-party service. Our operator's primary responsibilities will be:
1. When a ManagedDatabase CR is created, call an external API to create a database instance.
2. Update the status subresource with the external database ID and connection endpoint.
3. When the ManagedDatabase CR is deleted, use a finalizer to ensure the external database instance is properly destroyed before the CR is removed from the cluster.

First, let's define our CRD. This structure provides the spec for desired state and status for observed state, which is crucial for a robust reconciliation loop.
manageddatabase_types.go
package v1alpha1
import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
// ManagedDatabaseSpec defines the desired state of ManagedDatabase
type ManagedDatabaseSpec struct {
// Engine specifies the database engine (e.g., "postgres", "mysql").
// +kubebuilder:validation:Enum=postgres;mysql
// +kubebuilder:validation:Required
Engine string `json:"engine"`
// Version specifies the engine version.
// +kubebuilder:validation:Required
Version string `json:"version"`
// StorageGB specifies the allocated storage in gigabytes.
// +kubebuilder:validation:Minimum=10
StorageGB int `json:"storageGB"`
}
// ManagedDatabaseStatus defines the observed state of ManagedDatabase
type ManagedDatabaseStatus struct {
// DBInstanceID is the unique identifier for the external database instance.
DBInstanceID string `json:"dbInstanceId,omitempty"`
// Endpoint is the connection endpoint for the database.
Endpoint string `json:"endpoint,omitempty"`
// Phase indicates the current state of the resource (e.g., "Creating", "Ready", "Deleting").
Phase string `json:"phase,omitempty"`
}
//+kubebuilder:object:root=true
//+kubebuilder:subresource:status
// ManagedDatabase is the Schema for the manageddatabases API
type ManagedDatabase struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec ManagedDatabaseSpec `json:"spec,omitempty"`
Status ManagedDatabaseStatus `json:"status,omitempty"`
}
//+kubebuilder:object:root=true
// ManagedDatabaseList contains a list of ManagedDatabase
type ManagedDatabaseList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []ManagedDatabase `json:"items"`
}
func init() {
SchemeBuilder.Register(&ManagedDatabase{}, &ManagedDatabaseList{})
}
Architecting the Reconciliation Loop with Finalizer Logic
The core of the operator is the Reconcile function. A naive implementation only handles creation and updates. A production-grade implementation must explicitly partition its logic to handle two distinct states: the normal reconciliation of a live object and the cleanup reconciliation of an object marked for deletion.
Our finalizer will be a unique string, typically following a domain-style convention to avoid collisions with other controllers that might also be managing the same object. We'll define it as a constant.
manageddatabase_controller.go
const managedDatabaseFinalizer = "database.example.com/finalizer"
// A mock external client for demonstration purposes.
// In a real implementation, this would make HTTP calls to a cloud provider.
type ExternalDBClient struct {}
func (c *ExternalDBClient) CreateDatabase(spec v1alpha1.ManagedDatabaseSpec) (string, string, error) {
// Simulate API call
log.Log.Info("Creating external database", "engine", spec.Engine, "version", spec.Version)
time.Sleep(2 * time.Second)
instanceID := "db-" + uuid.New().String()
endpoint := instanceID + ".db.example.com"
return instanceID, endpoint, nil
}
func (c *ExternalDBClient) DeleteDatabase(instanceID string) error {
// Simulate API call
log.Log.Info("Deleting external database", "instanceID", instanceID)
time.Sleep(2 * time.Second)
// This should be idempotent. If the DB is already gone, it should not return an error.
return nil
}
The Reconcile function acts as a state machine dispatcher. The primary condition it evaluates is the presence of the deletionTimestamp.
import (
"context"
"time"
"github.com/go-logr/logr"
"github.com/google/uuid"
"k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/runtime"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
"sigs.k8s.io/controller-runtime/pkg/log"
v1alpha1 "path/to/your/api/v1alpha1"
)
// ... Reconciler struct definition
type ManagedDatabaseReconciler struct {
client.Client
Scheme *runtime.Scheme
ExternalClient *ExternalDBClient
}
func (r *ManagedDatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
logger := log.FromContext(ctx)
// 1. Fetch the ManagedDatabase instance
dbInstance := &v1alpha1.ManagedDatabase{}
if err := r.Get(ctx, req.NamespacedName, dbInstance); err != nil {
if errors.IsNotFound(err) {
logger.Info("ManagedDatabase resource not found. Ignoring since object must be deleted.")
return ctrl.Result{}, nil
}
logger.Error(err, "Failed to get ManagedDatabase")
return ctrl.Result{}, err
}
// 2. The core finalizer logic
if dbInstance.ObjectMeta.DeletionTimestamp.IsZero() {
// The object is NOT being deleted, so we proceed with normal reconciliation.
// We must ensure our finalizer is present on the object.
if !controllerutil.ContainsFinalizer(dbInstance, managedDatabaseFinalizer) {
logger.Info("Adding finalizer to ManagedDatabase")
controllerutil.AddFinalizer(dbInstance, managedDatabaseFinalizer)
if err := r.Update(ctx, dbInstance); err != nil {
logger.Error(err, "Failed to add finalizer")
return ctrl.Result{}, err
}
}
// Handle the normal reconciliation logic (creation/updates)
return r.reconcileNormal(ctx, dbInstance, logger)
} else {
// The object IS being deleted.
if controllerutil.ContainsFinalizer(dbInstance, managedDatabaseFinalizer) {
// Our finalizer is present, so we must handle external resource cleanup.
return r.reconcileDelete(ctx, dbInstance, logger)
}
// Our finalizer has been removed, so we have nothing left to do.
// The object will be deleted by Kubernetes.
return ctrl.Result{}, nil
}
}
This top-level function is clean and clear. It fetches the resource and immediately branches based on the deletion status. Let's examine the two sub-reconciliation functions.
Normal Reconciliation: Creation and Updates
This function handles the "happy path." It ensures the external resource exists and its state matches the spec. The critical first step is to check if the external resource has been created yet, often by inspecting the status subresource.
func (r *ManagedDatabaseReconciler) reconcileNormal(ctx context.Context, dbInstance *v1alpha1.ManagedDatabase, logger logr.Logger) (ctrl.Result, error) {
// If the DBInstanceID is not set in the status, it means we haven't created the external DB yet.
if dbInstance.Status.DBInstanceID == "" {
logger.Info("Creating external database for ManagedDatabase resource")
// Update status to 'Creating'
dbInstance.Status.Phase = "Creating"
if err := r.Status().Update(ctx, dbInstance); err != nil {
logger.Error(err, "Failed to update ManagedDatabase status to Creating")
return ctrl.Result{}, err
}
instanceID, endpoint, err := r.ExternalClient.CreateDatabase(dbInstance.Spec)
if err != nil {
logger.Error(err, "Failed to create external database")
// We could add more sophisticated error handling and status conditions here.
return ctrl.Result{}, err
}
// External resource is created. Now, update the CR status with the details.
dbInstance.Status.DBInstanceID = instanceID
dbInstance.Status.Endpoint = endpoint
dbInstance.Status.Phase = "Ready"
if err := r.Status().Update(ctx, dbInstance); err != nil {
logger.Error(err, "Failed to update ManagedDatabase status after creation")
return ctrl.Result{}, err
}
logger.Info("Successfully created external database and updated status")
return ctrl.Result{}, nil
}
// TODO: Implement update logic. For example, check if dbInstance.Spec.StorageGB has changed
// and call an external API to update the storage.
logger.Info("Reconciliation complete, external database already exists")
return ctrl.Result{}, nil
}
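One way to fill in the update TODO above is to separate the decision from the API call. The pure helper below compares the desired spec against observed external state and reports which action, if any, the reconciler should take; the type names and the no-shrink rule are our own illustrative choices, not part of the original example:

```go
package main

import "fmt"

// dbSpec mirrors the relevant fields of ManagedDatabaseSpec.
type dbSpec struct {
	Engine    string
	Version   string
	StorageGB int
}

// dbObserved mirrors what the external API reports for a live instance.
type dbObserved struct {
	StorageGB int
}

// updateAction names the external call, if any, the reconciler should make.
type updateAction string

const (
	actionNone   updateAction = "none"
	actionResize updateAction = "resize"
)

// planUpdate decides what to do to converge observed state toward the spec.
// Shrinking storage is rejected because most managed database services
// only allow growing an instance's allocated storage.
func planUpdate(spec dbSpec, observed dbObserved) (updateAction, error) {
	switch {
	case spec.StorageGB < observed.StorageGB:
		return actionNone, fmt.Errorf("cannot shrink storage from %dGB to %dGB",
			observed.StorageGB, spec.StorageGB)
	case spec.StorageGB > observed.StorageGB:
		return actionResize, nil
	default:
		return actionNone, nil
	}
}

func main() {
	action, err := planUpdate(
		dbSpec{Engine: "postgres", Version: "15", StorageGB: 50},
		dbObserved{StorageGB: 20},
	)
	fmt.Println(action, err) // resize <nil>
}
```

Keeping the decision pure makes the update path trivially unit-testable, while the actual resize call stays behind the external client interface alongside CreateDatabase and DeleteDatabase.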
Deletion Reconciliation: The Finalizer's Work
This is where the finalizer pattern truly shines. This function is only called when deletionTimestamp is set and our finalizer is present. Its single responsibility is to clean up the external resource.
func (r *ManagedDatabaseReconciler) reconcileDelete(ctx context.Context, dbInstance *v1alpha1.ManagedDatabase, logger logr.Logger) (ctrl.Result, error) {
logger.Info("Reconciling deletion for ManagedDatabase")
if dbInstance.Status.DBInstanceID != "" {
// Update status to 'Deleting'
dbInstance.Status.Phase = "Deleting"
if err := r.Status().Update(ctx, dbInstance); err != nil {
logger.Error(err, "Failed to update ManagedDatabase status to Deleting")
return ctrl.Result{}, err
}
if err := r.ExternalClient.DeleteDatabase(dbInstance.Status.DBInstanceID); err != nil {
// If the deletion fails, we must not remove the finalizer.
// Return a nil error with RequeueAfter for a fixed retry delay;
// returning a non-nil error would cause controller-runtime to ignore
// RequeueAfter and apply its default exponential backoff instead.
logger.Error(err, "Failed to delete external database. Requeuing.")
return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
}
}
// External resource is now deleted (or never existed).
// We can safely remove our finalizer.
logger.Info("External database deleted, removing finalizer")
controllerutil.RemoveFinalizer(dbInstance, managedDatabaseFinalizer)
if err := r.Update(ctx, dbInstance); err != nil {
logger.Error(err, "Failed to remove finalizer")
return ctrl.Result{}, err
}
// Finalizer removed. The object will now be garbage collected by Kubernetes.
return ctrl.Result{}, nil
}
Key takeaways from this implementation:
* Separation of concerns: the reconcileNormal function never has to worry about deletion, and reconcileDelete never worries about creation.
* Failure safety: if the DeleteDatabase call fails, the request is requeued. The finalizer remains, preventing the CR's deletion until the cleanup succeeds.
* Observability: the status.phase field provides crucial visibility into the operator's actions for users and other automated systems.

Advanced Patterns and Production Hardening
The implementation above is robust, but production environments introduce complexity. Senior engineers must anticipate and handle edge cases that can disrupt the cleanup process.
Idempotency in Cleanup Logic
What happens if the operator crashes right after successfully calling DeleteDatabase but before removing the finalizer? On the next reconciliation, reconcileDelete will be called again for the same object. If our DeleteDatabase function is not idempotent, it might fail when trying to delete a resource that's already gone.
Your external client logic must be idempotent. A common pattern is to treat a "Not Found" error from the external API as a success during deletion.
Improved DeleteDatabase function:
func (c *ExternalDBClient) DeleteDatabase(instanceID string) error {
log.Log.Info("Attempting to delete external database", "instanceID", instanceID)
// In a real client, this would be a specific error type or status code from the cloud provider's SDK.
// e.g., if err != nil && !IsNotFoundError(err) { return err }
isAlreadyDeleted := c.simulateAPICall(instanceID)
if isAlreadyDeleted {
log.Log.Info("External database already deleted, treating as success")
return nil
}
// Perform the actual deletion logic here...
time.Sleep(2 * time.Second)
log.Log.Info("Successfully deleted external database", "instanceID", instanceID)
return nil
}
func (c *ExternalDBClient) simulateAPICall(instanceID string) bool {
// This is a placeholder for checking if the resource exists.
// For example, by using a map to track created DBs in this mock client.
return false // or true if it was already deleted
}
By handling "not found" as a success, you ensure that network blips or controller restarts don't leave the object in a permanently terminating state simply because the cleanup function is not re-entrant.
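The "not found means success" rule can be isolated into a small, reusable wrapper so every cleanup path gets it for free. The errNotFound sentinel and the minimal client interface here are illustrative stand-ins for whatever typed error or 404 status your provider's SDK actually returns:

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for the provider SDK's "resource does not exist"
// error; real SDKs expose a typed error or an HTTP 404 status instead.
var errNotFound = errors.New("db instance not found")

// deleter is the minimal surface we need from an external client.
type deleter interface {
	DeleteDatabase(instanceID string) error
}

// deleteIgnoringNotFound performs the deletion but treats "already gone"
// as success, making the cleanup path safe to re-run after a crash.
func deleteIgnoringNotFound(c deleter, instanceID string) error {
	err := c.DeleteDatabase(instanceID)
	if errors.Is(err, errNotFound) {
		return nil // already deleted: idempotent success
	}
	return err
}

// goneClient simulates a provider where the instance was already deleted.
type goneClient struct{}

func (goneClient) DeleteDatabase(string) error { return errNotFound }

func main() {
	fmt.Println(deleteIgnoringNotFound(goneClient{}, "db-123")) // <nil>
}
```

With this wrapper in place, reconcileDelete can call deleteIgnoringNotFound unconditionally and only propagate errors that genuinely block cleanup.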
The "Stuck Finalizer" Problem
A common production issue is an object that is permanently stuck in the Terminating state. This happens when a controller has a bug and fails to remove its finalizer, or if the controller is uninstalled without a proper cleanup of the CRs it managed. The deletionTimestamp is set, the finalizer is present, but no controller is running to act on it.
Diagnosis:
$ kubectl get manageddatabase my-db -o yaml
apiVersion: database.example.com/v1alpha1
kind: ManagedDatabase
metadata:
# ...
deletionTimestamp: "2023-10-27T18:30:00Z"
finalizers:
- database.example.com/finalizer
# ...
status:
phase: Deleting
Manual Intervention (The Escape Hatch):
As a cluster administrator, you can manually intervene by patching the object to remove the finalizer. Note that the patch below removes the entire finalizers list, including any entries added by other controllers. This is a powerful but dangerous operation; you should first manually confirm that the external resource has been cleaned up.
# DANGER: First, ensure the external resource (e.g., the RDS instance) is actually deleted!
kubectl patch manageddatabase my-db --type json --patch='[ { "op": "remove", "path": "/metadata/finalizers" } ]'
Prevention:
* Ensure your reconcileDelete function can handle all foreseeable API errors from the external service.
* Alert on any object that remains in the Terminating state for an extended period (e.g., > 1 hour). This is a strong signal that your finalizer logic is stuck.

Performance and Scalability Considerations
In a large cluster with thousands of CRs, the performance of your finalizer logic matters.
* API Server Load: Adding a finalizer requires an UPDATE API call upon object creation. Removing it requires another UPDATE call upon deletion. This doubles the write load on the API server for your CRD's lifecycle compared to a stateless resource. For high-churn resources, this can be significant. There is no way around this; it's the cost of correctness.
* Requeue Strategy: When reconcileDelete fails, a simple return ctrl.Result{}, err results in an exponential backoff managed by controller-runtime. This is generally desirable. However, if you know an external API has a specific rate limit or recovery time, a fixed delay can be more predictable: return ctrl.Result{RequeueAfter: 1 * time.Minute}, nil. Returning nil for the error prevents the exponential backoff and enforces your custom delay.
* Controller Concurrency: The MaxConcurrentReconciles setting on your controller manager determines how many reconciliation loops can run in parallel. If your DeleteDatabase call is slow (e.g., waiting for an RDS instance to terminate can take several minutes), a low concurrency setting can lead to a bottleneck. All worker goroutines could become stuck waiting for long-running deletions, starving normal reconciliation of other resources. Consider tuning this parameter based on the expected latency of your external API calls.
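Concurrency is tuned when the controller is wired to the manager. This is a sketch of the SetupWithManager hook using controller-runtime's controller.Options (it assumes the additional import "sigs.k8s.io/controller-runtime/pkg/controller"; the value 8 is an arbitrary starting point, not a recommendation):

```go
// SetupWithManager registers the reconciler with the manager and raises
// the worker count so slow external deletions don't starve reconciliation
// of other objects.
func (r *ManagedDatabaseReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&v1alpha1.ManagedDatabase{}).
		WithOptions(controller.Options{
			// Allow several reconciles in flight; size this against the
			// expected latency of the external API calls.
			MaxConcurrentReconciles: 8,
		}).
		Complete(r)
}
```

Because each in-flight reconcile holds a worker goroutine for its full duration, this number effectively caps how many slow external deletions can proceed in parallel.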
Conclusion: Finalizers as a Contract for Reliability
Finalizers are not just a feature; they are a fundamental pattern for building reliable operators that interact with the world outside Kubernetes. They establish a contract: the Kubernetes object will not fully disappear until its designated controller has certified that all associated external cleanup is complete. While the implementation requires careful attention to state management, idempotency, and error handling, it is the only Kubernetes-native way to prevent resource orphaning.
By mastering the pattern of checking the deletionTimestamp, executing idempotent cleanup logic, and safely removing the finalizer, you elevate your operator from a simple automation tool to a production-grade, lifecycle-aware system manager. This level of robustness is non-negotiable for any operator responsible for managing costly or critical stateful infrastructure.