Argo CD ApplicationSet Generators for Dynamic Multi-Cluster Deployments
The Scaling Problem: Beyond Manual `Application` Manifests
In any mature Kubernetes ecosystem, the number of clusters rarely remains static. Development, staging, preview, and production clusters across multiple regions and clouds create a complex, dynamic fleet. The traditional GitOps approach of maintaining one Argo CD Application manifest per application, per cluster, rapidly becomes untenable. This manual process is error-prone, creates significant toil for platform teams, and fails to scale. A commit to update a shared component like a logging agent could require dozens of pull requests.
This is the precise problem the Argo CD ApplicationSet controller is designed to solve. It is not merely a convenience wrapper; it's a fundamental shift from imperative application management to a declarative, automated factory model. An ApplicationSet resource uses generators to produce Application manifests based on external sources of truth—be it a list of clusters, a Git repository structure, or a combination thereof.
This article assumes you are already familiar with Argo CD's core concepts. We will not cover what an Application or a Project is. Instead, we will focus exclusively on the advanced, production-oriented patterns for using ApplicationSet generators to manage complex, multi-cluster topologies.
The `Cluster` Generator: Fleet-Wide Policy Enforcement
The Cluster generator is the most direct way to target clusters already known to Argo CD. It iterates through the cluster secrets in the argocd namespace and generates parameters for each one. Its primary use case is enforcing the deployment of ubiquitous, platform-level components to every cluster, or a specific subset of them.
Scenario: Deploying a Security Agent to All Production Clusters
Imagine a mandate to deploy a runtime security agent, like Falco, to all production clusters. Manually creating an Application for each new production cluster is a compliance risk. The Cluster generator automates this.
The key to unlocking its power for targeted deployments lies in Kubernetes labels applied to the cluster Secret objects that Argo CD uses for authentication.
First, let's label our cluster secrets:
```shell
# Label the production cluster in us-east-1
kubectl label secret cluster-prod-us-east-1 -n argocd \
  argocd.argoproj.io/secret-type=cluster env=production region=us-east-1

# Label the staging cluster in eu-west-1
kubectl label secret cluster-staging-eu-west-1 -n argocd \
  argocd.argoproj.io/secret-type=cluster env=staging region=eu-west-1
```
Now, we can create an ApplicationSet that uses a selector to target only the secrets with the env: production label.
```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: falco-security-agent
  namespace: argocd
spec:
  generators:
    - clusters:
        # The selector is the key to targeted, policy-based deployment.
        selector:
          matchLabels:
            env: production
  template:
    metadata:
      # Dynamically name the application based on the target cluster's name.
      # This prevents naming collisions.
      name: '{{name}}-falco'
      namespace: argocd
      finalizers:
        - resources-finalizer.argocd.argoproj.io
    spec:
      project: platform-services
      source:
        repoURL: 'https://falcosecurity.github.io/charts'
        chart: falco
        targetRevision: 2.0.15
        helm:
          releaseName: falco
          # Parameterize values based on cluster metadata
          values: |
            driver:
              kind: module
            tty: true
            customRules:
              cluster_name: "{{name}}"
              cluster_region: "{{metadata.labels.region}}"
      destination:
        # 'server' and 'name' are the parameters generated by the cluster generator.
        server: '{{server}}'
        namespace: falco
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - CreateNamespace=true
```
Analysis of the Advanced Pattern:
* The `selector` is the critical component. We are no longer hardcoding a list of clusters: any cluster added to Argo CD with the `env: production` label will automatically receive the Falco agent, and decommissioning a cluster (and its secret) will cause the corresponding Application to be garbage collected.
* The generated parameters `{{name}}`, `{{server}}`, and `{{metadata.labels.region}}` are used directly in the Application template. This allows us to customize the deployment for each target, for instance by injecting the cluster name and region into Falco's custom rules.
* The `finalizers` on the template's metadata ensure that when an Application is deleted by the ApplicationSet controller (e.g., because a cluster was removed), Argo CD first deletes all the resources it created in the managed cluster. This is crucial for clean decommissioning.

Edge Case: Cluster Secret Updates
The ApplicationSet controller watches for changes to the cluster secrets. If you add or remove a label from a secret, the controller will reconcile within a few minutes (by default, requeueAfterSeconds is 3 minutes), either generating a new Application or deleting an existing one. This makes your GitOps system responsive to infrastructure metadata changes, not just Git commits.
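To make the selector semantics concrete, here is a minimal Python sketch (a simplification, not the controller's actual Go code) of how `matchLabels` filtering turns labeled cluster secrets into one parameter set per matching cluster. The secret names and labels follow this article's examples; the server URLs are placeholder values.

```python
# Sketch: how a matchLabels selector narrows cluster secrets down to
# parameter sets, mirroring the 'clusters' generator's behavior.

def match_labels(selector: dict, labels: dict) -> bool:
    """True when every key/value pair in the selector appears in the labels."""
    return all(labels.get(k) == v for k, v in selector.items())

cluster_secrets = [
    {"name": "prod-us-east-1", "server": "https://1.2.3.4",
     "labels": {"env": "production", "region": "us-east-1"}},
    {"name": "staging-eu-west-1", "server": "https://5.6.7.8",
     "labels": {"env": "staging", "region": "eu-west-1"}},
]

selector = {"env": "production"}

# One parameter set per matching cluster: name, server, metadata.labels.*
params = [
    {"name": c["name"], "server": c["server"], "region": c["labels"]["region"]}
    for c in cluster_secrets if match_labels(selector, c["labels"])
]

print(params)  # only the production cluster survives the selector
```

Adding a third secret labeled `env=production` would yield a third parameter set, and therefore a third Application, with no change to the ApplicationSet itself.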
The `Git` Generator: Your Repository as the Source of Truth
While the Cluster generator is excellent for policy-based deployments, the Git generator provides more granular control by deriving the deployment matrix from the structure or content of a Git repository. This is the preferred pattern for managing application configurations and tenant onboarding.
We'll explore two primary strategies: directory discovery and file discovery.
Scenario 1: Directory Discovery for Per-Cluster Configuration
This pattern is ideal when each cluster has a distinct and potentially complex configuration. You model your Git repository so that each directory corresponds to a target for deployment.
Repository Structure:
```
/apps-config
├── guestbook
│   ├── clusters
│   │   ├── prod-us-east-1
│   │   │   └── config.json
│   │   └── staging-eu-west-1
│   │       └── config.json
│   └── base
│       └── values.yaml
```
Each config.json file contains the metadata for its respective cluster deployment:
prod-us-east-1/config.json:

```json
{
  "clusterName": "prod-us-east-1",
  "clusterUrl": "https://1.2.3.4",
  "revision": "1.2.0"
}
```

staging-eu-west-1/config.json:

```json
{
  "clusterName": "staging-eu-west-1",
  "clusterUrl": "https://5.6.7.8",
  "revision": "main"
}
```
Now, the ApplicationSet uses the Git generator's file discovery, with a per-directory glob, to find these files and generate Applications.
```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: guestbook-app
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/my-org/apps-config.git
        revision: HEAD
        # Discover all config.json files under the guestbook/clusters path.
        files:
          - path: "guestbook/clusters/**/config.json"
  template:
    metadata:
      name: '{{path.basename}}-guestbook'
      namespace: argocd
    spec:
      project: default
      source:
        repoURL: https://github.com/my-org/guestbook-helm-chart.git
        # The target revision is read from the discovered config.json file.
        targetRevision: '{{revision}}'
        chart: guestbook
        helm:
          valueFiles:
            # Reference a base values file from a known location.
            - $argocd-appset-source/guestbook/base/values.yaml
      destination:
        # The destination server is also read from the config.json file.
        server: '{{clusterUrl}}'
        namespace: guestbook
```
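The discovery-and-merge behavior above can be sketched in Python (a simplification of what the controller does, using `fnmatch` in place of Argo CD's glob engine): match files against the pattern, derive `path` and `path.basename` from each file's directory, and merge in the keys parsed from the file itself.

```python
import fnmatch
import json
import posixpath

# Sketch of the Git file generator's parameter derivation. Repo contents
# mirror the example repository structure from this article.
repo_files = {
    "guestbook/clusters/prod-us-east-1/config.json":
        '{"clusterName": "prod-us-east-1", "clusterUrl": "https://1.2.3.4", "revision": "1.2.0"}',
    "guestbook/clusters/staging-eu-west-1/config.json":
        '{"clusterName": "staging-eu-west-1", "clusterUrl": "https://5.6.7.8", "revision": "main"}',
    "guestbook/base/values.yaml": "replicas: 2\n",  # not matched by the glob
}

pattern = "guestbook/clusters/**/config.json"

param_sets = []
for path, content in repo_files.items():
    if not fnmatch.fnmatch(path, pattern):
        continue
    directory = posixpath.dirname(path)
    params = json.loads(content)             # keys from the file itself
    params["path"] = directory               # plus path-derived parameters
    params["path.basename"] = posixpath.basename(directory)
    param_sets.append(params)

# Each parameter set renders one Application named '{{path.basename}}-guestbook'.
names = sorted(p["path.basename"] + "-guestbook" for p in param_sets)
print(names)
```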
Analysis of the Advanced Pattern:
* Onboarding a new cluster deployment of the guestbook app is a simple PR that adds a new directory and config.json file.
* The `path` field with a glob is powerful. It allows for flexible directory structures without needing to update the ApplicationSet.
* `targetRevision` and `clusterUrl` are not hardcoded. They are read dynamically from the JSON files, allowing different environments to track different branches or tags of the application code.
* The `$argocd-appset-source` variable is a subtle but critical feature. It allows the Application template to reference other files from the same repository the generator is using. Here, we use it to include a base/values.yaml, enabling a layered configuration approach.

Scenario 2: File Discovery for Centralized Tenant Management
For some use cases, like managing hundreds of similar tenants, a single configuration file is easier to manage and automate than a sprawling directory structure.
Repository Structure:
```
/tenant-config
└── tenants.yaml
```

tenants.yaml:

```yaml
- tenant: acme-corp
  cluster: prod-us-east-1
  clusterUrl: https://1.2.3.4
  plan: premium
  namespace: acme-corp-ns
- tenant: stark-industries
  cluster: prod-us-east-1
  clusterUrl: https://1.2.3.4
  plan: standard
  namespace: stark-industries-ns
- tenant: cyberdyne-systems
  cluster: prod-eu-central-1
  clusterUrl: https://9.8.7.6
  plan: standard
  namespace: cyberdyne-ns
```
The ApplicationSet now parses this YAML file.
```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: tenant-apps
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/my-org/tenant-config.git
        revision: HEAD
        files:
          - path: "tenants.yaml"
  template:
    metadata:
      name: '{{tenant}}-app'
    spec:
      project: tenants
      source:
        repoURL: https://github.com/my-org/tenant-helm-chart.git
        targetRevision: 'v3.4.1'
        chart: tenant-app
        helm:
          parameters:
            # Pass parameters directly to Helm's --set flag.
            - name: "tenant.name"
              value: "{{tenant}}"
            - name: "tenant.plan"
              value: "{{plan}}"
      destination:
        server: '{{clusterUrl}}'
        namespace: '{{namespace}}'
```
Why this is a powerful alternative:
* Centralized View: You have a single file that defines all tenants, making it easy to audit and perform bulk updates.
* Automation-Friendly: A CI/CD pipeline or an onboarding script can easily append a new YAML entry to this file to provision a new tenant, a much simpler operation than creating a directory structure and files.
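The append operation such a script performs can be sketched as follows. This is a hypothetical onboarding helper, not part of Argo CD: a real script would use a YAML library and open a PR against the repo, but the shape of the change and the duplicate-tenant guard are the interesting parts.

```python
# Hypothetical onboarding script: append a tenant entry to tenants.yaml.

def render_tenant(t: dict) -> str:
    """Emit one list entry in the same flat shape as tenants.yaml."""
    lines = [f"- tenant: {t['tenant']}"]
    lines += [f"  {k}: {t[k]}" for k in ("cluster", "clusterUrl", "plan", "namespace")]
    return "\n".join(lines) + "\n"

def add_tenant(existing_yaml: str, tenant: dict) -> str:
    # Guard against provisioning the same tenant twice.
    if f"- tenant: {tenant['tenant']}\n" in existing_yaml:
        raise ValueError(f"tenant {tenant['tenant']} already exists")
    return existing_yaml + render_tenant(tenant)

doc = (
    "- tenant: acme-corp\n"
    "  cluster: prod-us-east-1\n"
    "  clusterUrl: https://1.2.3.4\n"
    "  plan: premium\n"
    "  namespace: acme-corp-ns\n"
)
doc = add_tenant(doc, {
    "tenant": "wayne-enterprises",          # example tenant, not from the article
    "cluster": "prod-eu-central-1",
    "clusterUrl": "https://9.8.7.6",
    "plan": "standard",
    "namespace": "wayne-enterprises-ns",
})
print(doc)
```

Once the commit lands, the Git generator picks up the new entry on its next poll (or webhook event) and creates the tenant's Application.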
The `Matrix` Generator: Combining Dimensions for Maximum Automation
The Matrix generator is arguably the most advanced and powerful feature of ApplicationSets. It takes two or more generators and creates a Cartesian product of their generated parameter sets. This allows you to decouple the definition of what gets deployed from where it gets deployed.
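The combination the matrix generator performs is, at its core, just a Cartesian product with parameter-map merging, which a short Python sketch makes concrete (cluster and app lists follow this article's examples; the controller itself is written in Go):

```python
import itertools

# Parameter sets from a hypothetical cluster generator (the "where")...
clusters = [
    {"name": "prod-us-east-1", "server": "https://1.2.3.4"},
    {"name": "prod-eu-central-1", "server": "https://9.8.7.6"},
]
# ...and from a Git generator reading standard-stack.json (the "what").
apps = [
    {"appName": "prometheus", "namespace": "monitoring"},
    {"appName": "loki", "namespace": "monitoring"},
    {"appName": "traefik", "namespace": "ingress"},
]

# Each pairing merges both parameter maps into one set for the template.
merged = [{**c, **a} for c, a in itertools.product(clusters, apps)]
names = [f"{p['name']}-{p['appName']}" for p in merged]

print(len(merged))  # 2 clusters x 3 apps = 6 Applications
print(names[0])     # prod-us-east-1-prometheus
```

Note that the dict merge is also where parameter-name collisions between the child generators would silently resolve in favor of one side, which is why the example below avoids a shared `name` key.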
Production Scenario: Onboarding a New Cluster with a Standard Application Stack
Let's define a standard stack of applications (prometheus, loki, traefik) that must be deployed to every new cluster in a specific region. We want to manage the list of applications and the list of clusters independently.
Step 1: Define the Application List in a Git Repo
/platform-config/apps/standard-stack.json:
```json
[
  {
    "appName": "prometheus",
    "repoURL": "https://prometheus-community.github.io/helm-charts",
    "chart": "prometheus",
    "version": "15.5.3",
    "namespace": "monitoring"
  },
  {
    "appName": "loki",
    "repoURL": "https://grafana.github.io/helm-charts",
    "chart": "loki-stack",
    "version": "2.6.4",
    "namespace": "monitoring"
  },
  {
    "appName": "traefik",
    "repoURL": "https://helm.traefik.io/traefik",
    "chart": "traefik",
    "version": "10.19.2",
    "namespace": "ingress"
  }
]
```
Step 2: Create a Matrix ApplicationSet
This ApplicationSet will combine a Git generator (to get the list of apps) with a Cluster generator (to get the list of target clusters).
```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: standard-platform-stack
  namespace: argocd
spec:
  generators:
    - matrix:
        generators:
          # Generator A: discovers clusters with the 'platform: standard' label
          - clusters:
              selector:
                matchLabels:
                  platform: standard
          # Generator B: reads the list of applications from a Git file
          - git:
              repoURL: https://github.com/my-org/platform-config.git
              revision: main
              files:
                - path: "apps/standard-stack.json"
  template:
    metadata:
      # A unique name is critical: 'clusterName-appName'
      name: '{{name}}-{{appName}}'
      labels:
        # Use labels for better filtering in the Argo CD UI
        cluster: '{{name}}'
        app: '{{appName}}'
    spec:
      project: platform-services
      source:
        # Parameters from Generator B (Git)
        repoURL: '{{repoURL}}'
        chart: '{{chart}}'
        targetRevision: '{{version}}'
      destination:
        # Parameters from Generator A (Clusters)
        server: '{{server}}'
        namespace: '{{namespace}}'
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - CreateNamespace=true
```
Note: Both child generators contribute parameters to the same template, so parameter names can collide. The cluster generator emits `name` (the cluster name) and `nameNormalized` (a built-in, DNS-friendly version of it); if the JSON file also used a field called `name`, the template's `{{name}}` would be ambiguous. The robust pattern is to name the JSON field `appName` and compose the Application name as `name: '{{name}}-{{appName}}'`.
Analysis of the Power of Matrix:
* Decoupling: The platform team can now add a new application to the standard-stack.json file in one PR, and it will be automatically rolled out to all targeted clusters. Conversely, the infrastructure team can provision a new cluster, label its secret with `platform: standard`, and the *entire* standard stack will be deployed to it automatically, with zero changes to any ApplicationSet manifests.
* Scalability: This pattern scales effortlessly. Adding 10 new clusters and 5 new standard apps results in 50 Application manifests being generated, all from just two independent sources of truth.
* Combinatorial Complexity Managed: It handles the combinatorial explosion of app-cluster pairings declaratively, which would be impossible to manage manually.
Production Edge Cases and Performance Considerations
1. Cluster Lifecycle and Resource Pruning
When a cluster targeted by an ApplicationSet is decommissioned, you must ensure its Application and all associated resources are cleaned up. The process should be:
1. Remove the source entry that generates the Application. For the Cluster generator, this means deleting the cluster Secret; for the Git generator, it means removing the directory or file entry.
2. The ApplicationSet controller detects the change and deletes the Application resource.
3. Because the Application manifest carries the resources-finalizer.argocd.argoproj.io finalizer, Argo CD first connects to the target cluster and deletes all the Kubernetes resources it manages before deleting the Application resource itself.

CRITICAL: Always include the finalizer in your ApplicationSet template. Without it, you will be left with orphaned resources in decommissioned clusters.
2. Rate Limiting and Performance at Scale
The ApplicationSet controller polls Git repositories and queries the Kubernetes API for cluster secrets. With hundreds of clusters and dozens of ApplicationSet resources, this can create significant load.
* `requeueAfterSeconds`: This field on a generator controls how often it re-checks its source. The default is fairly aggressive (180 seconds for the Git generator). For stable configurations, set it to 10-15 minutes or more (e.g., `requeueAfterSeconds: 900` on the git generator) to reduce load on your Git provider and the Argo CD controller.
* Webhooks: For Git generators, configure webhooks from your Git provider to Argo CD. This allows for event-driven updates on commits, allowing you to set a much higher polling interval as a fallback.
* Controller Resources: Monitor the CPU and memory usage of the argocd-applicationset-controller pod. At scale, you will need to provide it with adequate resource requests and limits (e.g., 1-2 CPUs, 2-4 GiB RAM).
3. Debugging Generated Applications
When an ApplicationSet doesn't behave as expected, debugging can be tricky. Use these tools:
* `kubectl describe applicationset <name>`: The `status` field of the ApplicationSet resource is invaluable. It shows the last time a generator was reconciled, any errors encountered (e.g., a Git authentication failure), and the parameters it successfully generated.
* Dry Run: Before committing a complex ApplicationSet, you can use the argocd-applicationset CLI locally (if you have it installed) or apply it to a non-critical environment. Check the generated Application resources to ensure they have the correct parameters before pointing it at production.
* Application Health: The health status of the ApplicationSet itself will reflect errors in generation. If an Application generated by the set becomes unhealthy, this status does not bubble up to the ApplicationSet. You must monitor the health of the individual Application resources.
Conclusion: From Application Management to Platform Automation
Mastering ApplicationSet generators is a pivotal step in evolving a GitOps practice from simple application deployment to true platform automation. By leveraging generators, you codify the policies and logic of your deployment strategy, creating a system that is scalable, resilient, and self-service.
The progression from a List generator for static environments, to a Cluster generator for fleet-wide policy, to a Git generator for configuration-driven deployments, and finally to a Matrix generator for fully decoupled, combinatorial automation, represents a significant increase in the maturity and capability of your Kubernetes platform. These are not just features; they are architectural patterns that, when implemented correctly, drastically reduce operational overhead and enable engineering teams to move faster and more safely at scale.