GitOps with Flux CD in Production: 6 Deployment Patterns, from Bootstrap to Multi-Cluster

Manual kubectl apply Is Wrecking Your Production Environment

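It is 3 a.m. and production alerts are firing. You ssh into the bastion host, run kubectl apply -f deployment.yaml, and the problem goes away for now. The next day you discover that last night's change was never recorded, the configuration has drifted, and nobody knows which version is actually running in the cluster.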

This is not an isolated case. It is the everyday failure mode of traditional operations:

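  • Configuration drift: someone edits a ConfigMap directly, and the cluster state no longer matches what Git declares
  • No audit trail: kubectl operations leave no record, so problems cannot be traced back
  • Painful emergency rollbacks: nobody knows which version to roll back to, so the fix is pieced together by hand
  • Multi-cluster nightmare: with 3 clusters and 5 environments, syncing configuration by hand does not scale
  • Security risk: the CI system holds cluster-admin credentials, and a single leak compromises everything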

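The core idea of GitOps: Git is the single source of truth. Flux CD, a CNCF graduated project, is a Kubernetes-native GitOps engine that continuously reconciles cluster state in pull mode. Controllers running inside the cluster fetch the desired state from Git, so no external system has to hold cluster credentials.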


Core Concepts at a Glance

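Concept | Description | Analogy
GitOps | An infrastructure management methodology that treats a Git repository as the single source of truth | The building blueprint
Flux CD | A CNCF graduated GitOps engine for Kubernetes | The automated construction crew
Kustomize | Kubernetes-native configuration customization tool, no templates required | Layered renovation plans
HelmRelease | Flux custom resource that manages Helm chart deployments declaratively | A package manager manifest
Source Controller | Flux component that manages sources such as Git, Helm, OCI, and Bucket | The warehouse keeper
Reconciliation | Continuously comparing desired state with actual state and correcting differences automatically | Inspection rounds that correct deviations
Progressive Delivery | Gradual rollout through canary, blue/green, or A/B testing | Opening the doors to customers a few at a time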

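5 Major Challenges in Production

Challenge 1: Chaotic multi-environment configuration

Development, testing, staging, and production each keep their own copy of the YAML. Changing one parameter means editing 4 files, and missing one causes an incident.

Challenge 2: Helm chart versions out of control

Chart versions, values files, and dependencies are scattered everywhere. Nobody can tell which services a chart upgrade will affect.

Challenge 3: Hard to coordinate multiple clusters

With several Kubernetes clusters (public cloud, private cloud, edge nodes), configuration cannot be managed in one place and synchronization is entirely manual.

Challenge 4: Secrets stored in plaintext

Database passwords and API keys are written straight into YAML and committed to Git, which is a serious security hazard.

Challenge 5: Releases lack gradual rollout

Every release goes to everyone at once. A bad version hits all users immediately, with no way to validate it step by step.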


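6 Production Deployment Patterns

Pattern 1: Flux Bootstrap

Flux Bootstrap is the foundation for everything else: it puts Flux itself under GitOps management, so Flux is self-managed from the first commit.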

# Install the Flux CLI
curl -s https://fluxcd.io/install.sh | sudo bash

# Verify the cluster meets the prerequisites
flux check --pre

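# Bootstrap: install Flux into the cluster and link it to the Git repository
# (--token-auth reads the GitHub token from the GITHUB_TOKEN environment variable)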
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/production \
  --personal=false \
  --token-auth

# Verify the installation
flux get kustomizations
kubectl get pods -n flux-system

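After bootstrap completes, Flux creates the clusters/production/flux-system/ directory in the Git repository. gotk-components.yaml holds the manifests for all Flux components (namespace, CRDs, controllers), and gotk-sync.yaml holds the GitRepository and Kustomization that point Flux back at the repository:

# clusters/production/flux-system/gotk-components.yaml (excerpt)
apiVersion: v1
kind: Namespace
metadata:
  name: flux-system

# clusters/production/flux-system/gotk-sync.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 1m0s
  ref:
    branch: main
  secretRef:
    name: flux-system
  # HTTPS URL, because the bootstrap command above used --token-auth
  url: https://github.com/myorg/fleet-infra.git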
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 10m0s
  path: ./clusters/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system

# Watch reconciliation status
flux get kustomizations --watch

# Force an immediate reconciliation
flux reconcile kustomization flux-system --with-source

# Inspect source status
flux get sources git
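
For orientation, bootstrap also writes a kustomization.yaml into the flux-system directory that ties the generated files together. A freshly bootstrapped repository typically contains something like the sketch below. Later customizations of Flux itself (patches, extra resources) are usually added to this same file.

```yaml
# clusters/production/flux-system/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml   # namespace, CRDs, and controllers
  - gotk-sync.yaml         # GitRepository + Kustomization pointing at this repository
```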

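Pattern 2: Multi-Environment Management with Kustomize Overlays

With Kustomize's base/overlay layout, one base configuration plus per-environment overlays removes the duplicated YAML.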

fleet-infra/
├── clusters/
│   ├── production/
│   │   └── flux-system/
│   ├── staging/
│   │   └── flux-system/
│   └── development/
│       └── flux-system/
├── apps/
│   ├── base/
│   │   ├── kustomization.yaml
│   │   ├── deployment.yaml
│   │   ├── service.yaml
│   │   └── hpa.yaml
│   ├── overlays/
│   │   ├── production/
│   │   │   ├── kustomization.yaml
│   │   │   ├── deployment-patch.yaml
│   │   │   └── hpa-patch.yaml
│   │   ├── staging/
│   │   │   ├── kustomization.yaml
│   │   │   └── deployment-patch.yaml
│   │   └── development/
│   │       ├── kustomization.yaml
│   │       └── deployment-patch.yaml

# apps/base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
  - hpa.yaml
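# labels (unlike the deprecated commonLabels) can leave the immutable selectors untouched
labels:
  - pairs:
      app.kubernetes.io/managed-by: flux
    includeSelectors: false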

# apps/base/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web-app
          image: myorg/web-app:1.0.0   # pin a version; :latest makes rollouts unpredictable
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          env:
            - name: LOG_LEVEL
              value: "info"

# apps/overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: production
resources:
  - ../../base
patches:
  - path: deployment-patch.yaml
  - path: hpa-patch.yaml

# apps/overlays/production/deployment-patch.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 5
  template:
    spec:
      containers:
        - name: web-app
          env:
            - name: LOG_LEVEL
              value: "warn"
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: "1"
              memory: 1Gi

# apps/overlays/production/hpa-patch.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app
spec:
  minReplicas: 5
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

# clusters/production/apps.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: web-app
  namespace: flux-system
spec:
  interval: 1m0s
  ref:
    branch: main
  url: https://github.com/myorg/web-app-manifests.git
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: web-app-production
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: web-app
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: web-app
      namespace: production
  timeout: 3m0s

# Build the Kustomization locally to validate it
flux build kustomization web-app-production \
  --path ./apps/overlays/production \
  --kustomization-file ./clusters/production/apps.yaml

# Check reconciliation status
flux get kustomizations
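
The directory tree lists a staging overlay that is not shown. As a minimal sketch, a small environment difference can also be written as an inline patch instead of a separate patch file. The replica count and log level here are assumptions chosen for illustration.

```yaml
# apps/overlays/staging/kustomization.yaml (illustrative)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: staging
resources:
  - ../../base
patches:
  # Inline strategic-merge patch: staging runs one replica with debug logging
  - target:
      kind: Deployment
      name: web-app
    patch: |-
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: web-app
      spec:
        replicas: 1
        template:
          spec:
            containers:
              - name: web-app
                env:
                  - name: LOG_LEVEL
                    value: "debug"
```

Inline patches keep a small overlay in one file. Once a patch grows beyond a few fields, a separate deployment-patch.yaml as in the production overlay is easier to review.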

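Pattern 3: Declarative Helm Management with HelmRelease

Flux's HelmRelease makes Helm deployments fully declarative: values live in Git, and a change to them triggers an upgrade automatically.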

# clusters/production/nginx-ingress.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 5m0s
  url: https://kubernetes.github.io/ingress-nginx
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 10m0s
  chart:
    spec:
      chart: ingress-nginx
      version: "4.11.x"
      sourceRef:
        kind: HelmRepository
        name: ingress-nginx
      interval: 1m0s
  valuesFrom:
    - kind: ConfigMap
      name: ingress-nginx-default-values
    - kind: Secret
      name: ingress-nginx-sealed-values
      valuesKey: values.yaml
  values:
    controller:
      replicaCount: 3
      resources:
        requests:
          cpu: 200m
          memory: 256Mi
        limits:
          cpu: "1"
          memory: 512Mi
      service:
        type: LoadBalancer
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-type: nlb
      config:
        proxy-body-size: "50m"
        proxy-read-timeout: "300"
        enable-real-ip: "true"
      metrics:
        enabled: true
        serviceMonitor:
          enabled: true
          additionalLabels:
            release: prometheus
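
The ConfigMap referenced under valuesFrom is not shown in the article. A minimal sketch of what it could look like follows. The values inside are assumptions. Flux merges valuesFrom entries in the order listed and applies the inline values last, so inline settings win on conflict.

```yaml
# Hypothetical defaults referenced by valuesFrom
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-default-values
  namespace: flux-system   # must live in the same namespace as the HelmRelease
data:
  # valuesKey defaults to values.yaml when the reference does not set it
  values.yaml: |
    controller:
      replicaCount: 2          # overridden to 3 by the inline values
      admissionWebhooks:
        enabled: true
```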

# clusters/production/redis-ha.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: bitnami
  namespace: flux-system
spec:
  interval: 5m0s
  url: https://charts.bitnami.com/bitnami
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: redis-ha
  namespace: database
spec:
  interval: 15m0s
  chart:
    spec:
      chart: redis
      version: "19.x"
      sourceRef:
        kind: HelmRepository
        name: bitnami
        namespace: flux-system   # the HelmRepository lives outside the database namespace
  install:
    remediation:
      retries: 3
  upgrade:
    remediation:
      retries: 3
      remediateLastFailure: true
  rollback:
    timeout: 5m0s
    cleanupOnFail: true
  values:
    architecture: replication
    auth:
      existingSecret: redis-secret
      existingSecretPasswordKey: password
    master:
      persistence:
        enabled: true
        size: 8Gi
        storageClass: gp3-encrypted
      resources:
        requests:
          cpu: 250m
          memory: 512Mi
    replica:
      replicaCount: 2
      persistence:
        enabled: true
        size: 8Gi
        storageClass: gp3-encrypted
    metrics:
      enabled: true
      serviceMonitor:
        enabled: true
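
# View Helm release status
flux get helmreleases --all-namespaces

# Force reconciliation of a HelmRelease
flux reconcile helmrelease redis-ha -n database --with-source

# Show HelmRelease details (the flux CLI has no describe subcommand)
kubectl describe helmrelease redis-ha -n database

# List HelmChart sources and the chart versions they resolved to
flux get sources chart --all-namespaces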

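Pattern 4: Multi-Cluster Management

Flux supports multiple clusters naturally: each cluster gets its own directory, applications share a common base, and each environment supplies its own variables.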

fleet-infra/
├── clusters/
│   ├── production/
│   │   ├── flux-system/
│   │   ├── apps.yaml
│   │   ├── infrastructure.yaml
│   │   └── monitoring.yaml
│   ├── staging/
│   │   ├── flux-system/
│   │   ├── apps.yaml
│   │   └── infrastructure.yaml
│   └── us-east-2/
│       ├── flux-system/
│       ├── apps.yaml
│       └── infrastructure.yaml
├── infrastructure/
│   ├── base/
│   └── overlays/
│       ├── production/
│       ├── staging/
│       └── us-east-2/
└── apps/
    ├── base/
    └── overlays/

# Bootstrap the production cluster
kubectl config use-context production-admin
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/production \
  --token-auth

# Bootstrap the staging cluster
kubectl config use-context staging-admin
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/staging \
  --token-auth

# Bootstrap the regional cluster
kubectl config use-context us-east-2-admin
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/us-east-2 \
  --token-auth

# clusters/production/infrastructure.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m0s
  path: ./infrastructure/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: flux-system
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: infrastructure
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: ingress-nginx-controller
      namespace: ingress-nginx

# clusters/us-east-2/apps.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./apps/overlays/us-east-2
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: infrastructure
  postBuild:
    substitute:
      CLUSTER_REGION: "us-east-2"
      CLUSTER_NAME: "prod-us-east-2"
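
Post-build substitution (spec.postBuild.substitute) replaces ${VAR} references in the manifests after Kustomize has built them. As a sketch, a manifest in the overlay could consume the two variables like this. The file name and the CLUSTER_TIER variable are hypothetical.

```yaml
# apps/overlays/us-east-2/cluster-info.yaml (hypothetical file)
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-info
  namespace: production
data:
  region: ${CLUSTER_REGION}
  cluster: ${CLUSTER_NAME}
  # ${VAR:=default} falls back to the default when VAR is not defined
  tier: ${CLUSTER_TIER:=standard}
```

This keeps one shared overlay usable by several clusters, with only the per-cluster Kustomization carrying the differing values.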

# Check reconciliation status per cluster (switch contexts)
kubectl config use-context production-admin
flux get kustomizations

kubectl config use-context staging-admin
flux get kustomizations

# Suspend reconciliation of apps in the current cluster (maintenance window)
flux suspend kustomization apps

# Resume reconciliation
flux resume kustomization apps

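Pattern 5: Secrets Management with SOPS or Sealed Secrets

Secrets must never be committed to Git in plaintext. Flux has built-in SOPS decryption, and it also works well with Sealed Secrets, which relies on its own in-cluster controller.

Option A: SOPS + age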

# Install the age encryption tool (pinned release, so the file name matches the URL)
curl -sLO https://github.com/FiloSottile/age/releases/download/v1.2.0/age-v1.2.0-linux-amd64.tar.gz
tar xzf age-v1.2.0-linux-amd64.tar.gz
sudo mv age/age* /usr/local/bin/

# Generate a key pair
age-keygen -o age.agekey

# Print the public key and note it down
age-keygen -y age.agekey
# Output looks like: age1abc123...

# Store the private key in a cluster Secret
kubectl create namespace flux-system || true
cat age.agekey | kubectl create secret generic sops-age \
  --namespace=flux-system \
  --from-file=age.agekey=/dev/stdin \
  --dry-run=client -o yaml | kubectl apply -f -

# clusters/production/sops-decryption.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 1m0s
  ref:
    branch: main
  secretRef:
    name: flux-system
  url: https://github.com/myorg/fleet-infra.git
  ignore: |
    /**/*.md
    /**/*.txt
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 10m0s
  path: ./clusters/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  decryption:
    provider: sops
    secretRef:
      name: sops-age

# Encrypt the Secret file
sops --encrypt --age=age1abc123... \
  --encrypted-regex '^(data|stringData)$' \
  --in-place apps/overlays/production/db-secret.yaml

# The encrypted Secret file (safe to commit to Git)
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
  namespace: production
type: Opaque
data:
  username: ENC[AES256_GCM,data:xxxxxxx,tag:yyyy==,type:str]
  password: ENC[AES256_GCM,data:zzzzzzz,tag:wwww==,type:str]
sops:
  kms: []
  gcp_kms: []
  azure_kv: []
  hc_vault: []
  age:
    - recipient: age1abc123...
      enc: |
        -----BEGIN AGE ENCRYPTED FILE-----
        xxxxxxxxxxxxxxxxxxxxxxx
        -----END AGE ENCRYPTED FILE-----
  lastmodified: "2026-06-15T10:00:00Z"
  mac: ENC[AES256_GCM,data:mmmmm,tag:nnnn==,type:str]
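
To avoid repeating the encryption flags on every command, the same settings can live in a .sops.yaml file at the repository root. This is a sketch. The path pattern is an assumption about where Secrets are kept, and the recipient is the placeholder public key from above.

```yaml
# .sops.yaml at the repository root (illustrative)
creation_rules:
  - path_regex: apps/overlays/production/.*\.yaml$
    encrypted_regex: ^(data|stringData)$
    age: age1abc123...   # the public key printed by age-keygen -y
```

With this file in place, running sops --encrypt --in-place apps/overlays/production/db-secret.yaml picks up the recipient and the regex automatically.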

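Option B: Sealed Secrets

# Install the kubeseal CLI. Release assets are versioned tarballs;
# 0.27.1 is an example version, check the project's releases page.
KUBESEAL_VERSION=0.27.1
curl -sLO "https://github.com/bitnami-labs/sealed-secrets/releases/download/v${KUBESEAL_VERSION}/kubeseal-${KUBESEAL_VERSION}-linux-amd64.tar.gz"
tar -xzf "kubeseal-${KUBESEAL_VERSION}-linux-amd64.tar.gz" kubeseal
sudo install -m 755 kubeseal /usr/local/bin/kubeseal

# Fetch the public certificate from the cluster
kubeseal --fetch-cert > sealed-secrets-cert.pem

# Create a SealedSecret
kubectl create secret generic db-credentials \
  --namespace=production \
  --from-literal=username=admin \
  --from-literal=password='S3cur3P@ss!' \
  --dry-run=client -o yaml | \
  kubeseal --cert sealed-secrets-cert.pem \
  --format yaml > apps/overlays/production/db-sealedsecret.yaml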

# apps/overlays/production/db-sealedsecret.yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: db-credentials
  namespace: production
spec:
  encryptedData:
    username: AgBfj3k2...sealed data...
    password: AgCg7m9x...sealed data...
  template:
    metadata:
      name: db-credentials
      namespace: production
    type: Opaque
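
A SealedSecret is only decrypted if the Sealed Secrets controller is running in the cluster. One way to install it the GitOps way is a HelmRelease, sketched below. The file path and the version range are assumptions. The release name and target namespace are chosen to match what kubeseal looks for by default.

```yaml
# infrastructure/base/sealed-secrets.yaml (hypothetical path)
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: sealed-secrets
  namespace: flux-system
spec:
  interval: 1h0m0s
  url: https://bitnami-labs.github.io/sealed-secrets
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: sealed-secrets
  namespace: flux-system
spec:
  interval: 1h0m0s
  # kubeseal defaults to a controller named sealed-secrets-controller in kube-system
  releaseName: sealed-secrets-controller
  targetNamespace: kube-system
  chart:
    spec:
      chart: sealed-secrets
      version: "2.x"   # assumed range; pin to a release you have tested
      sourceRef:
        kind: HelmRepository
        name: sealed-secrets
  install:
    crds: CreateReplace
  upgrade:
    crds: CreateReplace
```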

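Pattern 6: Canary Progressive Delivery with Flagger

Flagger is the progressive delivery tool of the Flux ecosystem. Working with Istio, NGINX, Skipper, and other providers, it automates canary releases.

# Install Flagger with Helm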
helm repo add flagger https://flagger.app
helm upgrade --install flagger flagger/flagger \
  --namespace=flagger-system \
  --create-namespace \
  --set meshProvider=istio \
  --set metricsServer=http://prometheus.istio-system:9090
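
Since the rest of the stack is managed from Git, the same installation can also be declared as a HelmRelease instead of being run by hand. A minimal sketch, assuming the same chart repository and values as the helm command above. The file path is hypothetical.

```yaml
# infrastructure/base/flagger.yaml (hypothetical path)
apiVersion: v1
kind: Namespace
metadata:
  name: flagger-system
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: flagger
  namespace: flagger-system
spec:
  interval: 1h0m0s
  url: https://flagger.app
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: flagger
  namespace: flagger-system
spec:
  interval: 1h0m0s
  chart:
    spec:
      chart: flagger
      sourceRef:
        kind: HelmRepository
        name: flagger
  values:
    meshProvider: istio
    metricsServer: http://prometheus.istio-system:9090
```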

# apps/base/canary.yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: web-app
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  service:
    port: 8080
    targetPort: 8080
    gateways:
      - istio-system/public-gateway
    hosts:
      - web-app.example.com
    trafficPolicy:
      tls:
        mode: DISABLE
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
      - name: request-duration
        thresholdRange:
          max: 500
        interval: 1m
    webhooks:
      - name: load-test
        type: rollout
        url: http://flagger-loadtester.test/
        timeout: 5s
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://web-app-canary.production:8080/"
      - name: acceptance-test
        type: pre-rollout
        url: http://flagger-loadtester.test/
        timeout: 30s
        metadata:
          type: bash
          cmd: "curl -sf http://web-app-canary.production:8080/healthz"
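
# Canary release flow
# 1. New revision detected → canary Deployment scaled up
# 2. 0% → 10% traffic → analyze metrics
# 3. 10% → 20% traffic → analyze metrics
# 4. 20% → 30% traffic → analyze metrics
# 5. 30% → 40% traffic → analyze metrics
# 6. 40% → 50% traffic → analyze metrics
# 7. Promotion: the primary is updated and receives 100% of traffic again
# If metric checks fail 5 times (threshold: 5), Flagger rolls back automatically

# Watch the Kustomization that delivers the new revision
flux get kustomizations --watch

# Inspect the Canary resource
kubectl get canary web-app -n production -o yaml

# Pull the latest commit now; a changed Deployment spec starts a canary run
flux reconcile kustomization apps --with-source

# View canary events
kubectl describe canary web-app -n production

# Roll back manually: revert the offending commit in Git and let Flux apply it
# (patching .status by hand is not a supported way to roll back a Canary)
git revert --no-edit HEAD
git push origin main
flux reconcile kustomization apps --with-source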

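5 Common Pitfalls and How to Avoid Them

Pitfall 1: Neglecting reconciliation interval settings

Wrong

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 1h   # drift can go uncorrected for up to an hour
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system

Right

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  retryInterval: 1m0s
  timeout: 3m0s
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system

Key point: set retryInterval so a failed reconciliation is retried quickly, and timeout so a reconciliation cannot hang indefinitely.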

Pitfall 2: HelmRelease without remediation and rollback settings

Wrong

apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: redis
spec:
  interval: 15m0s
  chart:
    spec:
      chart: redis
      sourceRef:
        kind: HelmRepository
        name: bitnami
  values:
    architecture: replication

Right

apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: redis
spec:
  interval: 15m0s
  install:
    remediation:
      retries: 3
  upgrade:
    remediation:
      retries: 3
      remediateLastFailure: true
  rollback:
    timeout: 5m0s
    cleanupOnFail: true
    disableWait: false
  uninstall:
    keepHistory: true
  chart:
    spec:
      chart: redis
      sourceRef:
        kind: HelmRepository
        name: bitnami
  values:
    architecture: replication

Pitfall 3: Committing Secrets to Git in plaintext

Wrong

apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  username: admin
  password: S3cur3P@ss!

Right

# Encrypt with SOPS before committing
sops --encrypt --age=age1abc123... \
  --encrypted-regex '^(data|stringData)$' \
  --in-place secret.yaml

# Safe to commit once encrypted
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  username: ENC[AES256_GCM,data:xxx,tag:yyy==,type:str]
  password: ENC[AES256_GCM,data:zzz,tag:www==,type:str]
sops:
  age:
    - recipient: age1abc123...
      enc: |
        -----BEGIN AGE ENCRYPTED FILE-----
        -----END AGE ENCRYPTED FILE-----

Pitfall 4: Missing health checks

Wrong

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system

Right

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: web-app
      namespace: production
    - apiVersion: apps/v1
      kind: Deployment
      name: api-server
      namespace: production
  timeout: 5m0s
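
Listing every Deployment by hand gets tedious as the number of workloads grows. A hedged alternative is spec.wait, which tells the controller to health-check every resource the Kustomization applies. When wait is enabled, the healthChecks list is ignored.

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  wait: true        # health-check every resource this Kustomization applies
  timeout: 5m0s     # upper bound for apply plus health checks
```

Explicit healthChecks remain useful when readiness should depend on resources the Kustomization does not apply itself.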

Pitfall 5: Leaving the Git authentication method implicit

Wrong

# No authentication flag: flux bootstrap github falls back to generating
# an SSH deploy key, which may not be what your security policy expects
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/production

Right

# Option 1: token authentication over HTTPS (recommended for GitHub)
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/production \
  --token-auth

# Option 2: SSH deploy key, generated by Flux with an explicit algorithm
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/production \
  --ssh-key-algorithm=ed25519

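Troubleshooting Cheat Sheet

Error message | Cause | Fix
unable to clone repository | Git credentials are invalid or the network is unreachable | Check the SSH key or token in the Secret and confirm repository access permissions
artifact fetch failed | Source Controller cannot fetch the artifact | Check network policies and proxy settings, and confirm the source's status
dry-run failed, error: resource exists | Resource conflict: an object with the same name already exists | Clean up the conflicting resource manually (prune: true only removes objects that Flux itself applied)
health check failed | The health check timed out because Pods are not ready | Check Pod events and logs, and confirm image pull and startup succeed
chart pull failed | The Helm chart could not be pulled | Check the HelmRepository URL and credentials
Helm install failed: timed out | The Helm install timed out | Increase timeout and check the readiness probe configuration
decryption failed | SOPS decryption failed | Confirm the sops-age Secret exists and holds the correct private key
Kustomization dependency not ready | A Kustomization it depends on is not ready | Check the dependsOn configuration and the status of the dependency
drift detected | Cluster state differs from what Git declares | Check whether someone changed cluster resources by hand
no matches for kind "HelmRelease" | The CRD is not installed | Confirm Helm Controller is installed and its CRDs are registered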

# General troubleshooting commands
flux check                                    # check Flux component status
flux get sources all                          # list all sources
flux get kustomizations                       # list Kustomization status
flux get helmreleases --all-namespaces        # list HelmRelease status
flux logs --level=error                       # show Flux error logs
flux logs --kind=kustomization --name=apps    # show logs for one resource

# Deeper inspection
kubectl describe gitrepository flux-system -n flux-system
kubectl describe kustomization apps -n flux-system
kubectl logs -n flux-system deploy/kustomize-controller --tail=100
kubectl logs -n flux-system deploy/source-controller --tail=100
kubectl logs -n flux-system deploy/helm-controller --tail=100
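
For Helm-managed workloads, correcting manual changes is opt-in per HelmRelease. The following sketch enables drift detection on the redis-ha release from Pattern 3. The ignore rule is an assumed example of a field an operator might want left alone.

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: redis-ha
  namespace: database
spec:
  interval: 15m0s
  chart:
    spec:
      chart: redis
      version: "19.x"
      sourceRef:
        kind: HelmRepository
        name: bitnami
        namespace: flux-system
  driftDetection:
    mode: enabled          # use "warn" to report drift without correcting it
    ignore:
      # Assumed example: leave replica counts alone on StatefulSets
      - paths: ["/spec/replicas"]
        target:
          kind: StatefulSet
```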

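Advanced Optimization

Dependency Orchestration and Deployment Order

Flux's dependsOn gives declarative control over deployment order, so applications are deployed only after the infrastructure is ready.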

# clusters/production/dependencies.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: crds
  namespace: flux-system
spec:
  interval: 10m0s
  path: ./infrastructure/crds
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m0s
  path: ./infrastructure/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: crds
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: cert-manager
      namespace: cert-manager
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: infrastructure

Notifications and Alerting

Flux's Notification Controller can push reconciliation events to Slack, Microsoft Teams, Discord, and other channels.

# clusters/production/notifications.yaml
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack
  namespace: flux-system
spec:
  type: slack
  channel: flux-deployments
  secretRef:
    name: slack-webhook-url
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: slack-alert
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: error
  eventSources:
    - kind: Kustomization
      name: "*"
    - kind: HelmRelease
      name: "*"
    - kind: GitRepository
      name: "*"
  exclusionList:
    - "waiting.*"
    - "reconciliation.*in_progress"
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: slack-info
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: info
  eventSources:
    - kind: Kustomization
      name: "apps"
  summary: "Application deployment notification"

# Create the Slack webhook Secret
kubectl create secret generic slack-webhook-url \
  --namespace=flux-system \
  --from-literal=address=https://hooks.slack.com/services/T00/B00/xxx
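
Other channels follow the same Provider plus Alert shape. As a sketch, a Discord provider could look like this. The Secret name is hypothetical and is assumed to carry the webhook URL under an address key, like the Slack Secret above.

```yaml
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: discord
  namespace: flux-system
spec:
  type: discord
  secretRef:
    name: discord-webhook-url   # hypothetical Secret with an "address" key
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: discord-alert
  namespace: flux-system
spec:
  providerRef:
    name: discord
  eventSeverity: error
  eventSources:
    - kind: Kustomization
      name: "*"
    - kind: HelmRelease
      name: "*"
```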

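Automated Image Updates

Flux Image Automation enables a fully automated pipeline: commit code → build image → deploy automatically. The image controllers are optional components, so install them at bootstrap time with --components-extra=image-reflector-controller,image-automation-controller.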

# clusters/production/image-automation.yaml
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: web-app
  namespace: flux-system
spec:
  image: myorg/web-app
  interval: 1m0s
  secretRef:
    name: registry-credentials
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: web-app
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: web-app
  policy:
    semver:
      range: ">=1.0.0 <2.0.0"
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
  name: web-app
  namespace: flux-system
spec:
  interval: 1m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  git:
    commit:
      author:
        email: flux@myorg.com
        name: Flux Bot
      messageTemplate: |
        auto: update {{ .AutomationObject }} image
        {{ range .Updated.Images -}}
        - {{ . }}
        {{ end -}}
  update:
    path: ./apps/overlays/production
    strategy: Setters

# apps/overlays/production/deployment-patch.yaml
# The inline setter marker tells ImageUpdateAutomation which image reference to rewrite
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  template:
    spec:
      containers:
        - name: web-app
          image: myorg/web-app:1.0.0 # {"$imagepolicy": "flux-system:web-app"}

# List image policies
flux get images policy

# List image repositories
flux get images repository

# List image update automations
flux get images update

# Trigger an image scan manually
flux reconcile image repository web-app
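GitOps Tool Comparison

Feature | Flux CD | ArgoCD | Jenkins X | Spinnaker
Core model | Pull | Pull | Push + Pull | Push
CNCF status | Graduated | Graduated | Not a CNCF project (CD Foundation) | Not a CNCF project (CD Foundation)
Multi-cluster | Native | Native | Limited | Native
UI dashboard | None built in (optional Weave GitOps) | Rich built-in UI | Built in | Rich built-in UI
Helm support | HelmRelease CRD | Helm + Helmfile | Pipeline | Helm + Bake
Kustomize | Native | Native | Limited |
Progressive delivery | Flagger (canary) | Argo Rollouts | Built-in strategies |
Secrets management | Native SOPS integration | Vault / Sealed Secrets | Vault | Vault
Image auto-update | Image Automation | Image Updater | Pipeline |
Notifications | Notification Controller | Built in | Pipeline | Built in
Learning curve | | | |
Resource usage | Low (~200MB) | Medium (~500MB) | |
Best suited for | Declarative, pure GitOps | Visual GitOps | Integrated CI/CD | Complex release strategies
Community activity | Very high | | |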

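Summary: Flux CD's design philosophy is "whatever Git declares, the cluster runs": no UI to distract, and little room for manual changes. Declarative APIs and continuous reconciliation keep the cluster converging on what the Git repository declares. From Bootstrap to multi-cluster, from Kustomize to HelmRelease, from SOPS to Flagger, these 6 patterns cover the most common production GitOps scenarios. Choosing Flux means choosing a GitOps path that is pure, auditable, and automated.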

