Argo CD Matrix Generators with External Secrets for Multi-Cluster Apps

Goh Ling Yong
Technology enthusiast and software architect specializing in AI-driven development tools and modern software engineering practices. Passionate about the intersection of artificial intelligence and human creativity in building tomorrow's digital solutions.

The Platform Engineering Challenge: Scaling GitOps Beyond Simple Generators

As a senior platform or DevOps engineer, you're likely familiar with the power of Argo CD's ApplicationSet controller for managing applications at scale. Standard generators (list, cluster, git) are excellent for homogeneous environments. However, production reality is rarely that simple. You're often faced with a complex matrix of deployment targets:

* Multi-Cluster: Dozens or hundreds of Kubernetes clusters across different regions, cloud providers, and environments (dev, staging, prod).

* Multi-Tenant: A single application infrastructure must serve multiple tenants, each with unique configurations, feature flags, and branding.

* Externalized Secrets: For security and compliance, secrets like database credentials, API keys, and certificates are managed in a dedicated secrets manager like HashiCorp Vault, not in Git.

The core problem arises when you need to deploy an application for every valid combination of cluster and tenant, with each unique combination requiring a specific set of secrets. A simple cluster generator can't know about tenants, and a git generator struggles to securely and dynamically fetch credentials from an external source. This is where a more advanced pattern is required.

This article will walk through a production-proven solution: combining the matrix generator with a custom, secrets-aware Config Management Plugin (CMP). We'll build a system that can dynamically generate Argo CD Applications based on configuration from multiple Git sources while injecting secrets fetched directly from Vault at manifest generation time.

We will assume you have a working knowledge of Argo CD, the ApplicationSet controller, and Helm. We will not be covering the basics.


The Limitations of Standard Generators in a Complex Environment

Let's formalize our scenario:

  • Cluster Definitions: A Git repository (gitops-clusters) contains metadata for each Kubernetes cluster.
    bash
        # gitops-clusters repo structure
        └── clusters
            ├── prod-us-east-1.json
            ├── prod-eu-west-1.json
            └── staging-us-east-1.json

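    Each JSON file nests the cluster's metadata under a cluster key, so the git file generator exposes it as flattened parameters such as {{cluster.name}} and {{cluster.server}}:

    json
        // prod-us-east-1.json
        { "cluster": { "name": "prod-us-east-1", "server": "https://1.2.3.4", "region": "us-east-1", "environment": "prod" } }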
  • Tenant Configurations: A separate Git repository (gitops-tenants) holds Helm value overrides for each tenant.
    bash
        # gitops-tenants repo structure
        └── my-app
            ├── tenant-a
            │   └── values.yaml
            └── tenant-b
                └── values.yaml
  • Secrets in Vault: Secrets are stored in Vault under a structured path, for example: secret/data/my-app///db.
    Using standard generators, you might try a git file generator for clusters and a git directory generator for tenants. But how do you combine them? And more importantly, how do you securely fetch the credentials from secret/data/my-app/prod/tenant-a/db for the application deploying tenant-a to the prod-us-east-1 cluster?

    You can't. This is the precise limitation the matrix generator and CMPs are designed to solve.

    Step 1: Combining Configuration Sources with the Matrix Generator

    The matrix generator allows us to create a Cartesian product of the outputs from two or more standard generators. Let's start by combining our cluster and tenant definitions, ignoring the secret problem for a moment.

    We'll define two git generators within our matrix generator:

    * Cluster Generator: Uses the files discovery method to find all .json files in the gitops-clusters repository.

    * Tenant Generator: Uses the directories discovery method to find all tenant configuration directories in the gitops-tenants repository.

    Here is the initial ApplicationSet manifest:

    yaml
    apiVersion: argoproj.io/v1alpha1
    kind: ApplicationSet
    metadata:
      name: my-app-multi-tenant
      namespace: argocd
    spec:
      generators:
        - matrix:
            generators:
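              # Generator 1: Discover clusters from the clusters repo
              - git:
                  repoURL: https://github.com/your-org/gitops-clusters.git
                  revision: HEAD
                  # Both child generators are git generators and both emit path.* parameters;
                  # prefixing this generator's path parameters avoids a key collision in the matrix.
                  pathParamPrefix: clusterFile
                  files:
                    - path: "clusters/*.json"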
    
              # Generator 2: Discover tenants from the tenants repo
              - git:
                  repoURL: https://github.com/your-org/gitops-tenants.git
                  revision: HEAD
                  directories:
                    - path: "my-app/*"
    
      template:
        metadata:
          # Generate a unique name for each application
          # e.g., prod-us-east-1-tenant-a
          name: '{{cluster.name}}-{{path.basename}}'
        spec:
          project: default
          # $values references require the multi-source form (sources + ref)
          sources:
            - repoURL: https://github.com/your-org/my-app-helm-chart.git
              targetRevision: '1.2.3'
              # Assumes the chart sits at the root of this Git repo; adjust as needed
              path: .
              helm:
                valueFiles:
                  # Use the values file from the tenant generator
                  - $values/my-app/{{path.basename}}/values.yaml
                # Pass cluster metadata as Helm values
                parameters:
                  - name: "cluster.name"
                    value: "{{cluster.name}}"
                  - name: "cluster.region"
                    value: "{{cluster.region}}"
                  - name: "tenant.id"
                    value: "{{path.basename}}"
            # The tenants repo, exposed to the chart source as $values
            - repoURL: https://github.com/your-org/gitops-tenants.git
              targetRevision: HEAD
              ref: values
          destination:
            # Target the correct cluster discovered by the cluster generator
            server: '{{cluster.server}}'
            namespace: 'my-app-{{path.basename}}' # e.g., my-app-tenant-a
          syncPolicy:
            automated:
              prune: true
              selfHeal: true
            syncOptions:
              - CreateNamespace=true

    This manifest successfully generates an Argo CD Application for every cluster-tenant combination. For example, it will create an application named prod-us-east-1-tenant-a targeting the prod-us-east-1 cluster, using the Helm values from my-app/tenant-a/values.yaml.
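
To make the Cartesian product concrete, here is a minimal Python sketch of what the matrix generator conceptually does with the two parameter lists. It is an illustration only, not Argo CD's implementation; the second cluster's server address and the flattened parameter dictionaries are made-up stand-ins for what each child generator would emit.

```python
from itertools import product

# Hypothetical stand-ins for the parameter sets each child generator emits.
clusters = [
    {"cluster.name": "prod-us-east-1", "cluster.server": "https://1.2.3.4"},
    {"cluster.name": "staging-us-east-1", "cluster.server": "https://5.6.7.8"},
]
tenants = [
    {"path.basename": "tenant-a"},
    {"path.basename": "tenant-b"},
]


def matrix(first, second):
    """Merge every parameter set from `first` with every set from `second`."""
    return [{**a, **b} for a, b in product(first, second)]


def app_name(params):
    """Mirror the template's name field: '{{cluster.name}}-{{path.basename}}'."""
    return f"{params['cluster.name']}-{params['path.basename']}"


names = [app_name(p) for p in matrix(clusters, tenants)]
print(names)
```

Two clusters and two tenants yield four applications; the count always equals the product of the two list lengths, which is worth keeping in mind for the scaling discussion later in the article.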

    However, we still haven't addressed the secrets problem. Our Helm chart requires database credentials, which are not in Git.

    Step 2: Building a Secrets-Aware Config Management Plugin (CMP)

    To bridge the gap between Argo CD and Vault, we'll create a custom Config Management Plugin. A CMP is essentially a script that Argo CD can invoke to render Kubernetes manifests. Instead of using built-in tools like Helm or Kustomize directly, Argo CD will delegate the rendering process to our script.

    Our plugin will:

  • Receive parameters from the ApplicationSet template (like Vault path, tenant ID, cluster environment).
  • Authenticate with HashiCorp Vault (we'll use the Kubernetes Auth Method).
  • Fetch the required secrets from a dynamically constructed path.
  • Generate a temporary secrets.yaml file containing the fetched secrets.
  • Invoke the real helm template command, passing both the original tenant values.yaml and our dynamically generated secrets.yaml.
  • Pipe the final rendered YAML manifests back to Argo CD.

    The Plugin Script

    Here is a robust Python script for our CMP. It uses the hvac library for Vault communication and is designed to be run inside a container.

    vault-helm-cmp.py:

    python
    #!/usr/bin/env python3
    """Argo CD CMP: fetch secrets from Vault, then render a Helm chart with them."""

    import os
    import subprocess
    import sys
    import tempfile

    import hvac
    import yaml
    from hvac.exceptions import InvalidPath

    SA_TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


    def env(name, default=None):
        # Sidecar CMPs receive the Application's plugin.env entries with an
        # ARGOCD_ENV_ prefix; fall back to the bare name for local testing.
        return os.environ.get(f"ARGOCD_ENV_{name}", os.environ.get(name, default))


    def fail(message):
        print(f"Error: {message}", file=sys.stderr)
        sys.exit(1)


    def main():
        print("Initializing Vault-Helm CMP", file=sys.stderr)

        # --- 1. Get parameters from environment variables ---
        app_name = env("APP_NAME")
        chart_path = env("CHART_PATH", ".")
        helm_values_path = env("HELM_VALUES_PATH")
        vault_addr = env("VAULT_ADDR")
        vault_role = env("VAULT_ROLE")
        vault_secret_path = env("VAULT_SECRET_PATH")

        if not all([app_name, helm_values_path, vault_addr, vault_role, vault_secret_path]):
            fail("Missing required environment variables.")

        # hvac's KV v2 helper takes the mount point and the secret path separately
        # and adds the "data/" segment itself, so split "secret/data/my-app/...".
        mount_point, sep, secret_path = vault_secret_path.partition("/data/")
        if not sep:
            fail(f"VAULT_SECRET_PATH must look like <mount>/data/<path>, got '{vault_secret_path}'")

        # --- 2. Authenticate with Vault using K8s Auth Method ---
        print(f"Authenticating to Vault at {vault_addr} with role {vault_role}", file=sys.stderr)
        try:
            with open(SA_TOKEN_PATH) as f:
                jwt = f.read().strip()

            client = hvac.Client(url=vault_addr)
            auth_response = client.auth.kubernetes.login(role=vault_role, jwt=jwt)
            client.token = auth_response['auth']['client_token']
            if not client.is_authenticated():
                raise RuntimeError("Vault did not accept the login")
            print("Vault authentication successful.", file=sys.stderr)
        except Exception as e:
            fail(f"authenticating with Vault: {e}")

        # --- 3. Fetch secrets from Vault ---
        print(f"Fetching secrets from path: {vault_secret_path}", file=sys.stderr)
        try:
            response = client.secrets.kv.v2.read_secret_version(
                path=secret_path, mount_point=mount_point
            )
            secrets = response['data']['data'] or {}
        except InvalidPath:
            # Gracefully continue if secrets don't exist for a given combination.
            # This prevents a single missing secret from poisoning the entire ApplicationSet.
            print("Assuming no secrets required for this application, proceeding with empty secrets file.", file=sys.stderr)
            secrets = {}
        except Exception as e:
            # Anything else (network failure, permission denied) is fatal: rendering
            # without secrets during a Vault outage would produce broken manifests.
            fail(f"fetching secrets from Vault path '{vault_secret_path}': {e}")

        # --- 4. Generate temporary secrets values file ---
        # A unique, owner-only temp file avoids collisions between concurrent
        # renders; it is removed once Helm has finished.
        with tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False) as f:
            # We nest the secrets under a 'secrets' key to avoid conflicts
            yaml.safe_dump({"secrets": secrets}, f)
            secrets_yaml_path = f.name
        print(f"Generated temporary secrets file at {secrets_yaml_path}", file=sys.stderr)

        # --- 5. Construct and execute helm template command ---
        helm_command = [
            "helm", "template", app_name,
            chart_path,
            "-f", helm_values_path,
            "-f", secrets_yaml_path
        ]

        # Render into the Application's target namespace when Argo CD provides it
        if 'ARGOCD_APP_NAMESPACE' in os.environ:
            helm_command.extend(["--namespace", os.environ['ARGOCD_APP_NAMESPACE']])

        print(f"Executing Helm command: {' '.join(helm_command)}", file=sys.stderr)

        # --- 6. Pipe output to stdout for Argo CD ---
        try:
            result = subprocess.run(helm_command, check=True, capture_output=True, text=True)
            sys.stdout.write(result.stdout)  # This is the final rendered YAML
            print(f"Successfully rendered manifests for {app_name}", file=sys.stderr)
        except subprocess.CalledProcessError as e:
            fail(f"during 'helm template':\n{e.stderr}")
        finally:
            # Never leave fetched secrets behind on disk
            os.remove(secrets_yaml_path)


    if __name__ == "__main__":
        main()
    

    Containerize the Plugin

    Now, we need to package this script into a container image that also includes Helm.

    Dockerfile:

    dockerfile
    FROM python:3.9-slim
    
    # Install dependencies: curl, gnupg for Helm install
    RUN apt-get update && apt-get install -y curl gnupg && rm -rf /var/lib/apt/lists/*
    
    # Install Helm
    RUN curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
    
    # Install Python libraries
    RUN pip install hvac==1.1.0 pyyaml==6.0
    
    # Copy the CMP script
    COPY vault-helm-cmp.py /usr/local/bin/vault-helm-cmp.py
    RUN chmod +x /usr/local/bin/vault-helm-cmp.py
    
    # Optional discovery script (manifest generation is handled by vault-helm-cmp.py)
    COPY discover.sh /usr/local/bin/discover.sh
    RUN chmod +x /usr/local/bin/discover.sh
    
    WORKDIR /src
    
    # No ENTRYPOINT here: the sidecar is started with Argo CD's argocd-cmp-server binary

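    With a sidecar CMP, discovery rules are optional: a plugin that defines none can still be invoked by naming it explicitly in the Application spec, which is how we use it here. If you do configure a discovery command, Argo CD treats the plugin as a match only when the command exits successfully and prints non-empty output, so the script below echoes a marker instead of silently exiting 0.

    discover.sh:

    bash
    #!/bin/sh
    # A simple discovery script. A successful exit with non-empty
    # output tells Argo CD this plugin can handle the directory.
    echo "vault-helm-cmp"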

    Build and push this image to your container registry (e.g., your-registry/vault-helm-cmp:1.0.0).

    Register the Plugin with Argo CD

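    To make Argo CD aware of our new plugin, we must patch the argocd-repo-server Deployment to run our image as a sidecar container and give that sidecar a ConfigManagementPlugin definition (plugin.yaml). The legacy configManagementPlugins key in the argocd-cm ConfigMap does not apply to sidecar plugins; it was deprecated and has been removed in newer Argo CD releases.

    cmp-patch.yaml:

    yaml
    # Plugin definition, mounted into the sidecar as plugin.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: vault-helm-cmp-config
      namespace: argocd
    data:
      plugin.yaml: |
        apiVersion: argoproj.io/v1alpha1
        kind: ConfigManagementPlugin
        metadata:
          name: vault-helm-cmp
        spec:
          # No discover section: Applications select this plugin by name
          generate:
            command: ["/usr/local/bin/vault-helm-cmp.py"]
    
    ---
    # Patch for argocd-repo-server Deployment
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: argocd-repo-server
      namespace: argocd
    spec:
      template:
        spec:
          # The plugin reads the ServiceAccount token for Vault's Kubernetes auth;
          # some Argo CD manifests disable token automounting on this pod.
          automountServiceAccountToken: true
          containers:
            - name: vault-helm-cmp
              image: your-registry/vault-helm-cmp:1.0.0
              # The sidecar must run Argo CD's CMP server, which in turn invokes our script
              command: ["/var/run/argocd/argocd-cmp-server"]
              securityContext:
                runAsNonRoot: true
                runAsUser: 999
              volumeMounts:
                # var-files and plugins are volumes defined by the stock repo-server Deployment
                - name: var-files
                  mountPath: /var/run/argocd
                - name: plugins
                  mountPath: /home/argocd/cmp-server/plugins
                - name: vault-helm-cmp-config
                  mountPath: /home/argocd/cmp-server/config/plugin.yaml
                  subPath: plugin.yaml
                # Use a tmp volume separate from the repo-server's own
                - name: vault-helm-cmp-tmp
                  mountPath: /tmp
          volumes:
            - name: vault-helm-cmp-config
              configMap:
                name: vault-helm-cmp-config
            - name: vault-helm-cmp-tmp
              emptyDir: {}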

    Apply this patch using Kustomize or by directly modifying your Argo CD installation manifests.

    Security Note: The argocd-repo-server will need a Kubernetes ServiceAccount that is authorized to authenticate with Vault. You must configure a role in Vault's Kubernetes auth method that binds this ServiceAccount (name and namespace) to a Vault policy granting read access to the relevant secret paths.

    Step 3: Integrating the CMP into the ApplicationSet

    Now we can update our ApplicationSet to use the vault-helm-cmp plugin. The key change is in the template.spec.source section.

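    We will replace the standard helm block with a plugin block. We'll use the env array to pass the required parameters to our Python script, dynamically constructing the VAULT_SECRET_PATH using values from our matrix generators. Note that Argo CD exposes each of these entries to a sidecar plugin with an ARGOCD_ENV_ prefix (for example, VAULT_ADDR arrives as ARGOCD_ENV_VAULT_ADDR), so the plugin must read the prefixed names.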

    Here is the final, complete ApplicationSet manifest:

    yaml
    apiVersion: argoproj.io/v1alpha1
    kind: ApplicationSet
    metadata:
      name: my-app-multi-tenant
      namespace: argocd
    spec:
      generators:
        - matrix:
            generators:
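              # Generator 1: Discover clusters
              - git:
                  repoURL: https://github.com/your-org/gitops-clusters.git
                  revision: HEAD
                  # Prefix this generator's path.* parameters so they don't collide
                  # with the tenant generator's path.* parameters in the matrix.
                  pathParamPrefix: clusterFile
                  files:
                    - path: "clusters/*.json"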
    
              # Generator 2: Discover tenants
              - git:
                  repoURL: https://github.com/your-org/gitops-tenants.git
                  revision: HEAD
                  directories:
                    - path: "my-app/*"
    
      template:
        metadata:
          name: '{{cluster.name}}-{{path.basename}}'
        spec:
          project: default
          # --- THIS IS THE UPDATED SECTION --- 
          source:
            repoURL: https://github.com/your-org/gitops-tenants.git
            targetRevision: HEAD
            # Render from the repository root so HELM_VALUES_PATH resolves relative to it.
            path: '.' 
            plugin:
              name: vault-helm-cmp
              env:
                # Static parameters for the plugin
                - name: VAULT_ADDR
                  value: "https://vault.your-org.com"
                - name: VAULT_ROLE
                  value: "argocd-repo-server"
                - name: CHART_PATH
                  # The chart is in a different repo, so we need to fetch it. 
                  # A better approach is using Helm's OCI registry support.
                  # For this example, we assume the chart is packaged with the plugin or fetched separately.
                  # A more robust CMP would handle `helm dependency update`.
                  value: "/path/to/unpacked/chart/my-app"
                
                # Dynamic parameters for the plugin
                - name: APP_NAME
                  value: "{{cluster.name}}-{{path.basename}}"
                - name: HELM_VALUES_PATH
                  value: "my-app/{{path.basename}}/values.yaml"
                - name: VAULT_SECRET_PATH
                  # Dynamically construct the Vault path
                  value: "secret/data/my-app/{{cluster.environment}}/{{path.basename}}/db"
    
          destination:
            server: '{{cluster.server}}'
            namespace: 'my-app-{{path.basename}}'
          syncPolicy:
            automated:
              prune: true
              selfHeal: true
            syncOptions:
              - CreateNamespace=true

    With this manifest, the end-to-end flow is:

  • The ApplicationSet controller generates a parameter set for each (cluster, tenant) pair and creates one Application per set.
  • When Argo CD reconciles an Application, the argocd-repo-server delegates manifest generation to our vault-helm-cmp plugin.
  • The plugin sidecar runs and receives the dynamic environment variables (e.g., VAULT_SECRET_PATH=secret/data/my-app/prod/tenant-a/db).
  • The script authenticates, fetches secrets, runs helm template with the combined values, and returns the final YAML to Argo CD.
  • Argo CD syncs the resulting manifests to the target cluster.
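
As an illustration of the third step, the sketch below substitutes one parameter set into the plugin's env values. It is a deliberately simplified stand-in for Argo CD's template rendering (the `render` helper and the regular expression are assumptions made for this example), but it shows how a single (cluster, tenant) pair turns into concrete values.

```python
import re

# Matches {{key}} placeholders, allowing dotted keys such as cluster.name.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template, params):
    """Replace each {{key}} with its value from params (KeyError if missing)."""
    return PLACEHOLDER.sub(lambda m: params[m.group(1)], template)


# One parameter set, as produced for the prod-us-east-1 / tenant-a pair.
params = {
    "cluster.name": "prod-us-east-1",
    "cluster.environment": "prod",
    "path.basename": "tenant-a",
}

# The dynamic env entries from the ApplicationSet template above.
plugin_env = {
    "APP_NAME": "{{cluster.name}}-{{path.basename}}",
    "HELM_VALUES_PATH": "my-app/{{path.basename}}/values.yaml",
    "VAULT_SECRET_PATH": "secret/data/my-app/{{cluster.environment}}/{{path.basename}}/db",
}

rendered = {name: render(value, params) for name, value in plugin_env.items()}
for name, value in rendered.items():
    print(f"{name}={value}")
```

Raising on an unknown key, rather than leaving the placeholder in place, is a design choice for the sketch: a typo in a parameter name surfaces immediately instead of producing a Vault path that silently matches nothing.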

    Advanced Edge Cases and Performance Considerations

    This pattern is powerful, but in a large-scale production environment, you must consider the following:

    1. Performance and Rate Limiting

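    If you have 100 clusters and 50 tenants, this ApplicationSet will generate 5,000 applications. Argo CD caches Git checkouts and generated manifests, so not every reconciliation reaches the plugin. But whenever that cache is cold (for example, after a new commit to the tenants repo or a hard refresh), the argocd-repo-server may have to run up to 5,000 manifest generations, each of which performs a Vault login and a secret read.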
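
A quick back-of-the-envelope calculation shows why this is worth optimizing. The Vault path used in this article depends on the environment and the tenant, not on the individual cluster, so far fewer distinct secrets exist than applications. The per-environment cluster counts below are a hypothetical split of the 100 clusters.

```python
# Hypothetical distribution of the 100 clusters across environments.
clusters_per_env = {"prod": 60, "staging": 30, "dev": 10}
tenants = 50

# One application per (cluster, tenant) pair.
applications = sum(clusters_per_env.values()) * tenants

# One secret path per (environment, tenant) pair:
# secret/data/my-app/<environment>/<tenant>/db
distinct_secret_paths = len(clusters_per_env) * tenants

print(applications, distinct_secret_paths)
```

Under these assumptions, 5,000 renders read only 150 distinct paths, which is the headroom any caching layer would be exploiting.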

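    * Solution: Webhooks and Caching: Configure Git webhooks to trigger Argo CD refreshes instead of relying on polling. This reduces unnecessary checks. You can also implement caching within your CMP, since the secrets at a given VAULT_SECRET_PATH might not change often. Because the script starts as a fresh process for every manifest generation, an in-memory cache would not survive between invocations; a cache has to live on the sidecar's filesystem, with a short time-to-live and restrictive file permissions. Weigh the reduced load on Vault against the risk of keeping secret material on disk, and be mindful of resource consumption in the argocd-repo-server pod.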

    * Solution: ApplicationSet Sharding: For extremely large-scale deployments, consider sharding the ApplicationSet controller itself, although this is a very advanced topic beyond the scope of this article.
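
The caching idea can be sketched as a small file-backed time-to-live cache. This is a minimal sketch under stated assumptions: `cached_fetch`, its parameters, and the cache directory name are hypothetical, and `fetch` stands in for whatever function actually reads from Vault. Cached entries contain secret material, so the directory and files are created with owner-only permissions.

```python
import hashlib
import json
import os
import tempfile
import time


def cached_fetch(key, fetch, ttl_seconds=60, cache_dir=None, now=time.time):
    """Return fetch(key), reusing an on-disk result that is younger than the TTL."""
    cache_dir = cache_dir or os.path.join(tempfile.gettempdir(), "cmp-secret-cache")
    os.makedirs(cache_dir, mode=0o700, exist_ok=True)
    # Hash the key so arbitrary Vault paths map to safe file names.
    cache_file = os.path.join(cache_dir, hashlib.sha256(key.encode()).hexdigest())

    try:
        with open(cache_file) as f:
            entry = json.load(f)
        if now() - entry["stored_at"] < ttl_seconds:
            return entry["value"]
    except (OSError, ValueError, KeyError):
        pass  # missing or corrupt entry: fall through and refetch

    value = fetch(key)
    # Write to a private temp file, then rename, so concurrent readers
    # never observe a partially written entry.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "w") as f:
        json.dump({"stored_at": now(), "value": value}, f)
    os.replace(tmp_path, cache_file)
    return value
```

Passing the clock in as `now` keeps the expiry logic testable without sleeping. Several plugin processes may still miss the cache at the same moment and each call Vault once; for a short TTL that is usually an acceptable trade-off compared with adding file locking.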

    2. Error Handling: The Poison Pill Problem

    What happens if a secret path is missing for one tenant? In our script, we handled this gracefully: print("Assuming no secrets required...", file=sys.stderr) and secrets = {}. This is a critical design choice. If the script were to exit with a non-zero code, the manifest generation for that application would fail, and an application that fails on every reconciliation keeps consuming repo-server capacity on retries. On the other hand, proceeding with empty secrets means that, with automated sync enabled, an application can be deployed without the credentials it needs. Deciding whether a missing secret is a fatal error or a recoverable condition is context-dependent. Your CMP should be explicit about this logic.
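
One way to make that decision explicit is to route it through a single function driven by a flag. The sketch below assumes a hypothetical `required` setting (for instance, fed from a per-environment plugin env variable) and a hypothetical `MissingSecretError`; neither is part of the script shown earlier.

```python
import sys


class MissingSecretError(Exception):
    """Raised when a required secret path has no data in Vault."""


def resolve_missing_secret(path, required):
    """Decide what to do when Vault has nothing at `path`.

    required=True  -> fail manifest generation for this application.
    required=False -> log the decision and render with no secrets.
    """
    if required:
        raise MissingSecretError(f"required secret missing at {path}")
    print(f"No secret at {path}; rendering without secrets.", file=sys.stderr)
    return {}
```

A plausible policy is to treat secrets as required in production and optional elsewhere, so a misconfigured prod tenant fails loudly while a new dev tenant can come up before its credentials exist.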

    3. Secret Rotation and Manifest Refresh

    When a secret is updated in Vault, how does Argo CD pick up the change? Not automatically. The CMP runs only when Argo CD regenerates an application's manifests, and because generated manifests are cached per Git revision, a routine refresh with no new commit may simply reuse the cached output.

    * Solution: The simplest approach is to trigger a hard refresh (argocd app get <app> --hard-refresh), which bypasses the manifest cache, either on a schedule or as part of your secret rotation procedure. If you need near-instant updates, you would need additional automation that detects changes in Vault and triggers a hard refresh through the Argo CD CLI or API when a relevant secret changes.
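
The core of such automation is small. The sketch below assumes you can obtain a version number per secret path (KV v2 tracks one in its metadata) and keep the previously seen versions somewhere; the function names and the path-to-applications mapping are hypothetical. It only decides which applications are stale and builds the CLI command, leaving polling and execution to your tooling.

```python
def apps_needing_refresh(last_seen, current, apps_by_path):
    """Return the apps whose secret version differs between two snapshots.

    last_seen / current: dict of secret path -> version number.
    apps_by_path: dict of secret path -> list of Argo CD application names.
    """
    stale = []
    for path, version in current.items():
        if last_seen.get(path) != version:
            stale.extend(apps_by_path.get(path, []))
    return sorted(stale)


def hard_refresh_command(app):
    """Build the CLI call that asks Argo CD to regenerate manifests for `app`."""
    return ["argocd", "app", "get", app, "--hard-refresh"]


last_seen = {"my-app/prod/tenant-a/db": 3, "my-app/prod/tenant-b/db": 1}
current = {"my-app/prod/tenant-a/db": 4, "my-app/prod/tenant-b/db": 1}
apps_by_path = {
    "my-app/prod/tenant-a/db": ["prod-us-east-1-tenant-a", "prod-eu-west-1-tenant-a"],
    "my-app/prod/tenant-b/db": ["prod-us-east-1-tenant-b"],
}

for app in apps_needing_refresh(last_seen, current, apps_by_path):
    print(" ".join(hard_refresh_command(app)))
```

Because one secret path can feed several applications (every prod cluster for a tenant, in this article's layout), the mapping from path to applications matters as much as the change detection itself.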

    4. CMP Security and Resource Management

    The CMP sidecar runs with the same ServiceAccount as the argocd-repo-server. This ServiceAccount needs to be carefully locked down. The Vault role it uses should have read-only access to a very specific set of secret paths (e.g., secret/data/my-app/*). Never grant it write or admin privileges.
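
To keep the policy narrow and reviewable, it can help to generate it from an explicit list of paths. The helper below is a hypothetical sketch that renders a Vault policy document granting only the read capability; how you load it into Vault (CLI, Terraform, API) is left open.

```python
def read_only_policy(paths):
    """Render a Vault policy (HCL) granting only `read` on each path."""
    stanzas = [
        f'path "{p}" {{\n  capabilities = ["read"]\n}}' for p in paths
    ]
    return "\n\n".join(stanzas) + "\n"


policy = read_only_policy(["secret/data/my-app/*"])
print(policy)
```

Generating the policy from a list makes it easy to assert in CI that no stanza ever grants more than read.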

    Furthermore, monitor the resource consumption (CPU, memory) of the argocd-repo-server pod. A poorly written or inefficient CMP can starve the main repo server process, impacting the performance of your entire Argo CD instance.

    5. Testing Your CMP

    Testing a CMP can be cumbersome. You can test the script locally by mocking the environment variables Argo CD provides.

    bash
    # The script reads the ServiceAccount token from
    # /var/run/secrets/kubernetes.io/serviceaccount/token, so run this somewhere that
    # path holds a valid token and Vault is configured for Kubernetes auth.
    
    # Set env vars to simulate an ApplicationSet run
    export VAULT_ADDR="..."
    export VAULT_ROLE="..."
    export VAULT_SECRET_PATH="secret/data/my-app/staging/tenant-a/db"
    export HELM_VALUES_PATH="./path/to/values.yaml"
    export APP_NAME="test-app"
    export CHART_PATH="./path/to/chart"
    
    # Run the script and inspect its output
    python3 vault-helm-cmp.py > rendered.yaml

    This allows you to iterate on the plugin logic without needing to rebuild the container and patch Argo CD for every change.
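
Going one step further, any logic that does not need Vault or Helm can be factored into pure functions and checked with plain assertions. As a sketch, the hypothetical helper below separates the mount point from the secret path in the `<mount>/data/<path>` form that VAULT_SECRET_PATH uses in this article.

```python
def split_kv2_path(full_path):
    """Split 'mount/data/rest' into ('mount', 'rest') for a KV v2 API path."""
    mount_point, sep, secret_path = full_path.partition("/data/")
    if not sep or not mount_point or not secret_path:
        raise ValueError(f"not a KV v2 API path: {full_path!r}")
    return mount_point, secret_path


# Quick checks that run without Vault, Helm, or a cluster.
assert split_kv2_path("secret/data/my-app/staging/tenant-a/db") == (
    "secret",
    "my-app/staging/tenant-a/db",
)
```

Tests like this run in milliseconds and catch path-handling mistakes long before a full render against a real Vault would.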

    Conclusion: A Blueprint for Scalable, Secure GitOps

    By combining the matrix generator with a custom, secrets-aware Config Management Plugin, we have designed a GitOps workflow that meets the demands of a complex, multi-vector enterprise environment. This pattern maintains a clean separation of concerns: cluster topology, tenant configuration, and application secrets all live in their authoritative sources, and Argo CD dynamically composes them at deployment time.

    While this approach introduces the overhead of creating and maintaining a custom plugin, the payoff is a highly scalable, secure, and flexible platform that can manage a vast and diverse application landscape. For organizations committed to GitOps at scale, mastering this advanced ApplicationSet pattern is not just a useful trick—it's a foundational capability for building a robust internal developer platform.
