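Flux CD GitOps in Production: 6 Deployment Patterns from Bootstrap to Multi-Cluster
Manual kubectl apply Is Wrecking Your Production Environment
It is 3 a.m. and production alerts are firing. You ssh into the bastion host, run kubectl apply -f deployment.yaml, and the problem goes away for now. The next day you find that last night's change was never recorded, the configuration has drifted, and nobody knows which version is actually running in the cluster.
This is not an isolated case. It is the everyday failure mode of traditional operations:
- Configuration drift: someone edits a ConfigMap directly, and the cluster no longer matches what Git declares
- No audit trail: ad hoc kubectl changes are not tied to a reviewed commit, so incidents are hard to trace back
- Painful emergency rollbacks: nobody knows which version to roll back to, so the fix is pieced together by hand
- Multi-cluster sprawl: with 3 clusters and 5 environments, syncing configuration by hand does not scale
- Security risk: the CI system holds cluster-admin credentials, so a single leak compromises everything
The core idea of GitOps is that Git is the single source of truth. Flux CD, a CNCF graduated project, is a Kubernetes-native GitOps engine that continuously reconciles cluster state in pull mode: controllers inside the cluster fetch the desired state from Git, so no external system needs cluster credentials.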
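Core Concepts at a Glance
| Concept | Description | Analogy |
|---|---|---|
| GitOps | An infrastructure management methodology that treats a Git repository as the single source of truth | Building blueprint |
| Flux CD | A CNCF graduated GitOps engine for Kubernetes | Automated construction crew |
| Kustomize | A Kubernetes-native, template-free configuration customization tool | Layered renovation plans |
| HelmRelease | A Flux custom resource that declaratively manages a Helm chart deployment | Package manager manifest |
| Source Controller | The Flux component that manages Git, Helm, OCI, and Bucket sources | Warehouse keeper |
| Reconciliation | Continuously comparing desired state with actual state and repairing differences | Patrol and correct |
| Progressive Delivery | Gradual rollout through canary, blue/green, or A/B testing | Opening the doors a little at a time |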
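The Reconciliation row can be made concrete with a toy sketch. It is not how Flux is implemented (Flux controllers talk to the Kubernetes API); it only mimics the compare-and-repair idea with two directories, where `desired/` stands in for Git and `actual/` for the cluster. The function names and the directory layout are hypothetical.

```shell
#!/usr/bin/env bash
# Toy model of GitOps reconciliation: "desired" stands in for the Git
# repository and "actual" for the cluster. Not Flux's implementation.
set -euo pipefail

# Print the names of files that are missing from, or differ in, the actual state.
detect_drift() {
  local desired="$1" actual="$2" f name
  for f in "$desired"/*; do
    [ -e "$f" ] || continue
    name="$(basename "$f")"
    if ! cmp -s "$f" "$actual/$name" 2>/dev/null; then
      echo "$name"
    fi
  done
}

# Repair drift by copying the desired version over the actual one.
reconcile() {
  local desired="$1" actual="$2" name
  for name in $(detect_drift "$desired" "$actual"); do
    cp "$desired/$name" "$actual/$name"
    echo "reconciled $name"
  done
}

workdir="$(mktemp -d)"
mkdir -p "$workdir/desired" "$workdir/actual"
echo "replicas: 5" > "$workdir/desired/web-app.yaml"
echo "replicas: 2" > "$workdir/actual/web-app.yaml"   # a manual edit in the "cluster"

reconcile "$workdir/desired" "$workdir/actual"
```

Running the loop on a timer is what turns a one-off repair into continuous reconciliation: a manual edit survives only until the next pass.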
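5 Challenges in Production
Challenge 1: Chaotic multi-environment configuration
Development, testing, staging, and production each keep their own copy of the YAML. Changing one parameter means editing 4 files, and missing one causes an incident.
Challenge 2: Helm chart versions out of control
Chart versions, values files, and dependencies are scattered everywhere. When you upgrade a chart, you cannot tell which services it will affect.
Challenge 3: Hard multi-cluster coordination
With several Kubernetes clusters (public cloud, private cloud, edge nodes), configuration cannot be managed in one place and syncing is entirely manual.
Challenge 4: Secrets stored in plaintext
Database passwords and API keys are written straight into YAML and committed to Git, which is a serious security risk.
Challenge 5: No gradual rollout
Releases go out to everyone at once. A faulty version hits all users immediately, with no way to validate it step by step.
6 Production Deployment Patterns
Pattern 1: Flux Bootstrap
Flux Bootstrap is the foundation for everything else. It places Flux itself under GitOps management, so Flux upgrades and configuration changes also flow through Git.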
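# Install the Flux CLI
curl -s https://fluxcd.io/install.sh | sudo bash
# Verify that the cluster meets the prerequisites
flux check --pre
# flux bootstrap github reads a GitHub personal access token from GITHUB_TOKEN
export GITHUB_TOKEN=<your-token>
# Bootstrap: install Flux into the cluster and link it to the Git repository
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/production \
  --personal=false \
  --token-auth
# Verify the installation
flux get kustomizations
kubectl get pods -n flux-system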
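After bootstrap completes, Flux commits a clusters/production/flux-system/ directory to the Git repository. It holds gotk-components.yaml (the namespace, CRDs, and controllers), gotk-sync.yaml (the GitRepository and Kustomization that point Flux at this repository), and a kustomization.yaml that ties them together:
# clusters/production/flux-system/gotk-components.yaml (excerpt)
# Generated by Flux; contains the namespace, CRDs, and all controllers
apiVersion: v1
kind: Namespace
metadata:
  name: flux-system
---
# clusters/production/flux-system/gotk-sync.yaml
# With --token-auth the repository is cloned over HTTPS
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 1m0s
  ref:
    branch: main
  secretRef:
    name: flux-system
  url: https://github.com/myorg/fleet-infra.git
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 10m0s
  path: ./clusters/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system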
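# Watch reconciliation status
flux get kustomizations --watch
# Force an immediate reconciliation
flux reconcile kustomization flux-system --with-source
# Check source status
flux get sources git
Pattern 2: Kustomize Overlays for Multi-Environment Management
With Kustomize's base/overlay layout, one base configuration plus per-environment patches removes the duplicated YAML copies.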
fleet-infra/
├── clusters/
│   ├── production/
│   │   └── flux-system/
│   ├── staging/
│   │   └── flux-system/
│   └── development/
│       └── flux-system/
└── apps/
    ├── base/
    │   ├── kustomization.yaml
    │   ├── deployment.yaml
    │   ├── service.yaml
    │   └── hpa.yaml
    └── overlays/
        ├── production/
        │   ├── kustomization.yaml
        │   ├── deployment-patch.yaml
        │   └── hpa-patch.yaml
        ├── staging/
        │   ├── kustomization.yaml
        │   └── deployment-patch.yaml
        └── development/
            ├── kustomization.yaml
            └── deployment-patch.yaml
# apps/base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
  - hpa.yaml
commonLabels:
  app.kubernetes.io/managed-by: flux
# apps/base/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web-app
          image: myorg/web-app:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          env:
            - name: LOG_LEVEL
              value: "info"
# apps/overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: production
resources:
  - ../../base
patches:
  - path: deployment-patch.yaml
  - path: hpa-patch.yaml
# apps/overlays/production/deployment-patch.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 5
  template:
    spec:
      containers:
        - name: web-app
          env:
            - name: LOG_LEVEL
              value: "warn"
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: "1"
              memory: 1Gi
# apps/overlays/production/hpa-patch.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app
spec:
  minReplicas: 5
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
# clusters/production/apps.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: web-app
  namespace: flux-system
spec:
  interval: 1m0s
  ref:
    branch: main
  url: https://github.com/myorg/web-app-manifests.git
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: web-app-production
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: web-app
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: web-app
      namespace: production
  timeout: 3m0s
# Validate the Kustomize build locally
flux build kustomization web-app-production \
  --path ./apps/overlays/production \
  --kustomization-file ./clusters/production/apps.yaml
# Check reconciliation status
flux get kustomizations
Pattern 3: Declarative HelmRelease Management from Git
Flux's HelmRelease makes Helm deployments fully declarative: chart versions and values live in Git, and a committed change triggers the upgrade automatically.
# clusters/production/nginx-ingress.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 5m0s
  url: https://kubernetes.github.io/ingress-nginx
---
# HelmRelease belongs to the helm.toolkit.fluxcd.io API group, not source.toolkit.fluxcd.io
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 10m0s
  # Install the chart into its own namespace instead of flux-system
  targetNamespace: ingress-nginx
  install:
    createNamespace: true
  chart:
    spec:
      chart: ingress-nginx
      version: "4.11.x"
      sourceRef:
        kind: HelmRepository
        name: ingress-nginx
      interval: 1m0s
  valuesFrom:
    - kind: ConfigMap
      name: ingress-nginx-default-values
    - kind: Secret
      name: ingress-nginx-sealed-values
      valuesKey: values.yaml
  values:
    controller:
      replicaCount: 3
      resources:
        requests:
          cpu: 200m
          memory: 256Mi
        limits:
          cpu: "1"
          memory: 512Mi
      service:
        type: LoadBalancer
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-type: nlb
      config:
        proxy-body-size: "50m"
        proxy-read-timeout: "300"
        enable-real-ip: "true"
      metrics:
        enabled: true
        serviceMonitor:
          enabled: true
          additionalLabels:
            release: prometheus
# clusters/production/redis-ha.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: bitnami
  namespace: flux-system
spec:
  interval: 5m0s
  url: https://charts.bitnami.com/bitnami
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: redis-ha
  namespace: database
spec:
  interval: 15m0s
  chart:
    spec:
      chart: redis
      version: "19.x"
      sourceRef:
        kind: HelmRepository
        name: bitnami
        # The HelmRepository lives in another namespace, so name it explicitly
        namespace: flux-system
  install:
    remediation:
      retries: 3
  upgrade:
    remediation:
      retries: 3
      remediateLastFailure: true
  rollback:
    timeout: 5m0s
    cleanupOnFail: true
  values:
    architecture: replication
    auth:
      existingSecret: redis-secret
      existingSecretPasswordKey: password
    master:
      persistence:
        enabled: true
        size: 8Gi
        storageClass: gp3-encrypted
      resources:
        requests:
          cpu: 250m
          memory: 512Mi
    replica:
      replicaCount: 2
      persistence:
        enabled: true
        size: 8Gi
        storageClass: gp3-encrypted
    metrics:
      enabled: true
      serviceMonitor:
        enabled: true
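# Check Helm release status
flux get helmreleases --all-namespaces
# Force a HelmRelease reconciliation
flux reconcile helmrelease redis-ha -n database --with-source
# Inspect a HelmRelease in detail (conditions and events)
kubectl describe helmrelease redis-ha -n database
# List HelmChart sources and the chart versions they resolved
flux get sources chart --all-namespaces
Pattern 4: Multi-Cluster Management
Flux handles multiple clusters with a simple convention: one directory per cluster, shared application configuration, and per-cluster variables. Each cluster runs its own Flux instance and reconciles only its own path.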
fleet-infra/
├── clusters/
│   ├── production/
│   │   ├── flux-system/          # Flux's own configuration
│   │   ├── apps.yaml             # production applications
│   │   ├── infrastructure.yaml   # infrastructure components
│   │   └── monitoring.yaml       # monitoring stack
│   ├── staging/
│   │   ├── flux-system/
│   │   ├── apps.yaml
│   │   └── infrastructure.yaml
│   └── us-east-2/                # regional cluster
│       ├── flux-system/
│       ├── apps.yaml
│       └── infrastructure.yaml
├── infrastructure/
│   ├── base/                     # shared infrastructure
│   └── overlays/
│       ├── production/
│       ├── staging/
│       └── us-east-2/
└── apps/
    ├── base/
    └── overlays/
# Bootstrap the production cluster (switch to its kubeconfig context first)
kubectl config use-context production-admin
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/production \
  --token-auth
# Bootstrap the staging cluster
kubectl config use-context staging-admin
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/staging \
  --token-auth
# Bootstrap the regional cluster
kubectl config use-context us-east-2-admin
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/us-east-2 \
  --token-auth
# clusters/production/infrastructure.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m0s
  path: ./infrastructure/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: flux-system
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: infrastructure
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: ingress-nginx-controller
      namespace: ingress-nginx
# clusters/us-east-2/apps.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./apps/overlays/us-east-2
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: infrastructure
  # Variable substitution is configured under postBuild.substitute
  postBuild:
    substitute:
      CLUSTER_REGION: "us-east-2"
      CLUSTER_NAME: "prod-us-east-2"
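Post-build substitution only has an effect when the rendered manifests contain `${VAR}` placeholders. The sketch below imitates that replacement step in plain bash so the behavior is easy to see. It is an approximation written for illustration, not Flux's code: `substitute_vars` is a hypothetical helper and only the plain `${VAR}` form is handled.

```shell
#!/usr/bin/env bash
# Sketch of ${VAR} substitution, loosely mimicking what a Flux Kustomization's
# spec.postBuild.substitute does to rendered manifests. Only the plain ${VAR}
# form is handled; values containing '&' may be mishandled on bash 5.2+.
set -euo pipefail

# Usage: substitute_vars KEY=value [KEY=value ...] < manifest
substitute_vars() {
  local text pair key value placeholder
  text="$(cat)"
  for pair in "$@"; do
    key="${pair%%=*}"
    value="${pair#*=}"
    placeholder="\${${key}}"                   # literal text such as ${CLUSTER_REGION}
    text="${text//"$placeholder"/$value}"      # quoted pattern is matched literally
  done
  printf '%s\n' "$text"
}

printf 'region: ${CLUSTER_REGION}\nname: ${CLUSTER_NAME}\n' |
  substitute_vars CLUSTER_REGION=us-east-2 CLUSTER_NAME=prod-us-east-2
```

Keeping region and cluster names as variables lets every cluster share the same overlay files while the per-cluster Kustomization supplies the values.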
# Check reconciliation status on each cluster (switch contexts)
kubectl config use-context production-admin
flux get kustomizations
kubectl config use-context staging-admin
flux get kustomizations
# Suspend reconciliation on the current cluster (maintenance window)
flux suspend kustomization apps
# Resume reconciliation
flux resume kustomization apps
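Switching contexts by hand gets tedious as the fleet grows. A small wrapper can run the same Flux command against every cluster through the CLI's `--context` flag. This is a sketch: the context names are the ones assumed in this article, `flux_all_clusters` is a hypothetical helper, and `DRY_RUN=1` only prints the commands so the script can be tried without cluster access.

```shell
#!/usr/bin/env bash
# Run one Flux command against several clusters.
# Context names are assumptions; adjust them to your kubeconfig.
set -euo pipefail

CONTEXTS=(production-admin staging-admin us-east-2-admin)

flux_all_clusters() {
  local ctx
  for ctx in "${CONTEXTS[@]}"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Dry run: show what would be executed
      echo "flux --context=$ctx $*"
    else
      echo "== $ctx =="
      flux --context="$ctx" "$@"
    fi
  done
}

# Print the commands without touching any cluster
DRY_RUN=1 flux_all_clusters get kustomizations
```

Drop `DRY_RUN=1` to execute for real, for example `flux_all_clusters get helmreleases --all-namespaces`.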
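Pattern 5: Secrets Management with SOPS or Sealed Secrets
Secrets must never be committed to Git in plaintext. Flux decrypts SOPS-encrypted files natively in the kustomize-controller; Sealed Secrets is a separate controller that works well alongside Flux.
Option A: SOPS + age
# Install the age encryption tool (pin the release so the file name matches)
curl -sLO https://github.com/FiloSottile/age/releases/download/v1.2.0/age-v1.2.0-linux-amd64.tar.gz
tar xzf age-v1.2.0-linux-amd64.tar.gz
sudo mv age/age* /usr/local/bin/
# Generate a key pair
age-keygen -o age.agekey
# Print the public key and note it down
age-keygen -y age.agekey
# Output looks like: age1abc123...
# Store the private key in a cluster Secret
kubectl create namespace flux-system || true
cat age.agekey | kubectl create secret generic sops-age \
  --namespace=flux-system \
  --from-file=age.agekey=/dev/stdin \
  --dry-run=client -o yaml | kubectl apply -f -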
# clusters/production/sops-decryption.yaml
# On a bootstrapped cluster these two objects already exist in
# flux-system/gotk-sync.yaml; adding the same settings as patches in
# flux-system/kustomization.yaml avoids defining them twice.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 1m0s
  ref:
    branch: main
  secretRef:
    name: flux-system
  url: https://github.com/myorg/fleet-infra.git
  ignore: |
    # Exclude files Flux does not need
    /**/*.md
    /**/*.txt
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 10m0s
  path: ./clusters/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  decryption:
    provider: sops
    secretRef:
      name: sops-age
# Encrypt the Secret file in place
sops --encrypt --age=age1abc123... \
  --encrypted-regex '^(data|stringData)$' \
  --in-place apps/overlays/production/db-secret.yaml
# The encrypted Secret file (safe to commit to Git)
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
  namespace: production
type: Opaque
data:
  username: ENC[AES256_GCM,data:xxxxxxx,tag:yyyy==,type:str]
  password: ENC[AES256_GCM,data:zzzzzzz,tag:wwww==,type:str]
sops:
  kms: []
  gcp_kms: []
  azure_kv: []
  hc_vault: []
  age:
    - recipient: age1abc123...
      enc: |
        -----BEGIN AGE ENCRYPTED FILE-----
        xxxxxxxxxxxxxxxxxxxxxxx
        -----END AGE ENCRYPTED FILE-----
  lastmodified: "2026-06-15T10:00:00Z"
  mac: ENC[AES256_GCM,data:mmmmm,tag:nnnn==,type:str]
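The `--encrypted-regex` option explains why `metadata` and `kind` stay readable in the encrypted file while the values under `data` do not. The snippet below is only a sketch of that key-matching rule, applied to the top-level keys of a Secret; `keys_to_encrypt` is a hypothetical helper and it does not call sops.

```shell
#!/usr/bin/env bash
# Show which keys match the --encrypted-regex used above.
# Values under matching keys are encrypted; everything else stays readable.
set -euo pipefail

ENCRYPTED_REGEX='^(data|stringData)$'

keys_to_encrypt() {
  local key
  for key in "$@"; do
    if [[ "$key" =~ $ENCRYPTED_REGEX ]]; then
      echo "$key"
    fi
  done
}

# Top-level keys of a typical Secret manifest
keys_to_encrypt apiVersion kind metadata type data stringData
```

Leaving the metadata in cleartext keeps Git diffs and code review useful, since reviewers can still see which Secret changed.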
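Option B: Sealed Secrets
# Install the sealed-secrets controller (a separate project, not a Flux component)
helm repo add sealed-secrets https://bitnami-labs.github.io/sealed-secrets
helm upgrade --install sealed-secrets sealed-secrets/sealed-secrets \
  --namespace=kube-system \
  --set-string fullnameOverride=sealed-secrets-controller
# Install the kubeseal CLI (release assets are versioned tarballs)
KUBESEAL_VERSION=''   # set to a released version, without the leading "v"
curl -sLO "https://github.com/bitnami-labs/sealed-secrets/releases/download/v${KUBESEAL_VERSION:?}/kubeseal-${KUBESEAL_VERSION:?}-linux-amd64.tar.gz"
tar -xzf "kubeseal-${KUBESEAL_VERSION:?}-linux-amd64.tar.gz" kubeseal
sudo install -m 755 kubeseal /usr/local/bin/kubeseal
# Fetch the public certificate from the cluster
kubeseal --fetch-cert > sealed-secrets-cert.pem
# Create a SealedSecret
kubectl create secret generic db-credentials \
  --namespace=production \
  --from-literal=username=admin \
  --from-literal=password='S3cur3P@ss!' \
  --dry-run=client -o yaml | \
  kubeseal --cert sealed-secrets-cert.pem \
  --format yaml > apps/overlays/production/db-sealedsecret.yaml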
# apps/overlays/production/db-sealedsecret.yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: db-credentials
  namespace: production
spec:
  encryptedData:
    username: AgBfj3k2...sealed data...
    password: AgCg7m9x...sealed data...
  template:
    metadata:
      name: db-credentials
      namespace: production
    type: Opaque
Pattern 6: Progressive Delivery with Flagger Canaries
Flagger is the progressive delivery tool in the Flux ecosystem. Combined with Istio, NGINX, Skipper, and similar providers, it automates canary releases.
# Install Flagger with Helm (assumes Istio and Prometheus are already running)
helm repo add flagger https://flagger.app
helm upgrade --install flagger flagger/flagger \
  --namespace=flagger-system \
  --create-namespace \
  --set meshProvider=istio \
  --set metricsServer=http://prometheus.istio-system:9090
# apps/base/canary.yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: web-app
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  service:
    port: 8080
    targetPort: 8080
    gateways:
      - istio-system/public-gateway
    hosts:
      - web-app.example.com
    trafficPolicy:
      tls:
        mode: DISABLE
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
      - name: request-duration
        thresholdRange:
          max: 500
        interval: 1m
    webhooks:
      - name: load-test
        type: rollout
        url: http://flagger-loadtester.test/
        timeout: 5s
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://web-app-canary.production:8080/"
      - name: acceptance-test
        type: pre-rollout
        url: http://flagger-loadtester.test/
        timeout: 30s
        metadata:
          type: bash
          # Flagger exposes the canary pods through the web-app-canary Service
          cmd: "curl -sf http://web-app-canary.production:8080/healthz"
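# Canary release flow
# 1. New revision detected on the web-app Deployment → canary pods are scaled up
# 2. 0% → 10% traffic → analyze metrics
# 3. 10% → 20% traffic → analyze metrics
# 4. 20% → 30% traffic → analyze metrics
# 5. 30% → 40% traffic → analyze metrics
# 6. 40% → 50% traffic → analyze metrics
# 7. 100% traffic → promoted to the primary version
# If metrics miss their thresholds at any stage → automatic rollback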
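The schedule above follows directly from `stepWeight` and `maxWeight`. As a minimal sketch, the functions below compute the weight steps and two rough durations for the values used in this Canary. The duration formulas follow the rule of thumb given in Flagger's documentation; treat the results as estimates, because webhooks and the promotion phase add time. All function names are hypothetical.

```shell
#!/usr/bin/env bash
# Estimate a Flagger canary schedule from the analysis settings.
set -euo pipefail

# Traffic weights stepped through for a given stepWeight and maxWeight.
canary_weights() {
  local step="$1" max="$2" weight
  local -a weights=()
  for (( weight = step; weight <= max; weight += step )); do
    weights+=("$weight")
  done
  echo "${weights[*]}"
}

# Minimum minutes before promotion: interval * (maxWeight / stepWeight).
min_promotion_minutes() {
  local interval_min="$1" step="$2" max="$3"
  echo $(( interval_min * (max / step) ))
}

# Minutes of failing checks before rollback: interval * threshold.
rollback_minutes() {
  local interval_min="$1" threshold="$2"
  echo $(( interval_min * threshold ))
}

# Values from the Canary above: interval 1m, stepWeight 10, maxWeight 50, threshold 5
echo "weights:  $(canary_weights 10 50)"
echo "promote:  at least $(min_promotion_minutes 1 10 50) min"
echo "rollback: after $(rollback_minutes 1 5) min of failed checks"
```

Raising `stepWeight` shortens the rollout but exposes more users at each step, so the two settings are worth tuning together.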
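# Watch canary status
kubectl get canaries -n production --watch
# Inspect the Canary object in detail
kubectl get canary web-app -n production -o yaml
# Trigger a rollout by reconciling the Kustomization that carries the new revision
flux reconcile kustomization apps --with-source
# View canary events
kubectl describe canary web-app -n production
# Manual rollback: Flagger does not act on a hand-patched status.phase.
# Revert the change in Git, or configure a webhook of type "rollback"
# that Flagger polls during the analysis.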
5 Common Traps and How to Avoid Them
Trap 1: Neglecting reconciliation interval settings
❌ Wrong approach:
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 1h
  sourceRef:
    kind: GitRepository
    name: flux-system
✅ Correct approach:
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  retryInterval: 1m0s
  timeout: 3m0s
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
Key point: set retryInterval so that a failed reconciliation is retried quickly, and timeout so that a reconciliation cannot hang indefinitely.
Trap 2: HelmRelease without rollback configuration
❌ Wrong approach:
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: redis
spec:
  chart:
    spec:
      chart: redis
      sourceRef:
        kind: HelmRepository
        name: bitnami
  values:
    architecture: replication
✅ Correct approach:
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: redis
spec:
  interval: 15m0s
  install:
    remediation:
      retries: 3
  upgrade:
    remediation:
      retries: 3
      remediateLastFailure: true
  rollback:
    timeout: 5m0s
    cleanupOnFail: true
    disableWait: false
  uninstall:
    keepHistory: true
  chart:
    spec:
      chart: redis
      sourceRef:
        kind: HelmRepository
        name: bitnami
  values:
    architecture: replication
Trap 3: Committing plaintext Secrets to Git
❌ Wrong approach:
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  username: admin
  password: S3cur3P@ss!
✅ Correct approach:
# Encrypt with SOPS before committing
sops --encrypt --age=age1abc123... \
  --encrypted-regex '^(data|stringData)$' \
  --in-place secret.yaml
# After encryption the file is safe to commit
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
data:
  username: ENC[AES256_GCM,data:xxx,tag:yyy==,type:str]
  password: ENC[AES256_GCM,data:zzz,tag:www==,type:str]
sops:
  age:
    - recipient: age1abc123...
      enc: |
        -----BEGIN AGE ENCRYPTED FILE-----
        -----END AGE ENCRYPTED FILE-----
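A cheap safeguard against this trap is a check that runs before every commit or in CI. The sketch below lists YAML files that declare `kind: Secret` but carry no top-level `sops:` block. It is a heuristic under stated assumptions (one document per file, SOPS as the encryption method), and `find_plaintext_secrets` is a hypothetical helper, not part of Flux or SOPS.

```shell
#!/usr/bin/env bash
# Pre-commit style check: list YAML files that declare a Secret but have
# no top-level "sops:" block, i.e. files that look unencrypted.
set -euo pipefail

find_plaintext_secrets() {
  local dir="$1" file
  while IFS= read -r file; do
    if grep -Eq '^kind:[[:space:]]*Secret[[:space:]]*$' "$file" &&
       ! grep -q '^sops:' "$file"; then
      echo "$file"
    fi
  done < <(find "$dir" -type f \( -name '*.yaml' -o -name '*.yml' \) | sort)
}

# Demo on a throwaway directory with one plaintext and one encrypted Secret
demo="$(mktemp -d)"
printf 'apiVersion: v1\nkind: Secret\nstringData:\n  password: hunter2\n' > "$demo/plain.yaml"
printf 'apiVersion: v1\nkind: Secret\ndata:\n  password: ENC[...]\nsops:\n  age: []\n' > "$demo/encrypted.yaml"

find_plaintext_secrets "$demo"
```

In a real hook, a non-empty result would fail the commit. SealedSecret manifests are not flagged, because their kind is `SealedSecret` rather than `Secret`.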
Trap 4: Missing health checks
❌ Wrong approach:
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
✅ Correct approach:
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: web-app
      namespace: production
    - apiVersion: apps/v1
      kind: Deployment
      name: api-server
      namespace: production
  timeout: 5m0s
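Trap 5: Leaving Git authentication implicit
❌ Wrong approach (no explicit choice; bootstrap falls back to generating an SSH deploy key with the CLI defaults):
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/production
✅ Correct approach (pick one method deliberately):
# Option 1: token authentication over HTTPS (reads GITHUB_TOKEN)
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/production \
  --token-auth
# Option 2: SSH deploy key with an explicit algorithm; bootstrap generates
# the key pair itself, so a separate ssh-keygen step is not needed
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/production \
  --ssh-key-algorithm=ed25519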
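Troubleshooting Cheat Sheet
| Error message | Cause | Fix |
|---|---|---|
| unable to clone repository | Git credentials are invalid or the network is unreachable | Check the SSH key or token in the Secret and confirm repository access |
| artifact fetch failed | A controller cannot fetch the artifact from the Source Controller | Check network policies and proxy settings, and confirm the source is Ready |
| dry-run failed, error: resource exists | Resource conflict: an object with the same name already exists | Find out what owns the existing object, then remove the duplicate definition or clean up the conflicting resource |
| health check failed | The health check timed out because Pods are not ready | Check Pod events and logs; confirm the image pulls and the container starts |
| chart pull failed | The Helm chart could not be pulled | Check the HelmRepository URL and credentials |
| Helm install failed: timed out | The Helm install timed out | Increase timeout and review the readiness probe configuration |
| decryption failed | SOPS decryption failed | Confirm the sops-age Secret exists and holds the correct private key |
| Kustomization dependency not ready | A Kustomization it depends on is not ready | Check the dependsOn configuration and the status of the dependency |
| drift detected | Cluster state does not match what Git declares | Check whether someone modified cluster resources by hand |
| no matches for kind "HelmRelease" | The CRD is not installed | Confirm the Helm Controller is installed and its CRDs are registered |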
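# General troubleshooting commands
flux check                                   # check Flux component status
flux get sources all                         # list all sources
flux get kustomizations                      # show Kustomization status
flux get helmreleases --all-namespaces       # show HelmRelease status
flux logs --level=error                      # show Flux error logs
flux logs --kind=kustomization --name=apps   # show logs for one resource
# Deeper investigation
kubectl describe gitrepository flux-system -n flux-system
kubectl describe kustomization apps -n flux-system
kubectl logs -n flux-system deploy/kustomize-controller --tail=100
kubectl logs -n flux-system deploy/source-controller --tail=100
kubectl logs -n flux-system deploy/helm-controller --tail=100
Advanced Optimization
Dependency Orchestration and Deployment Order
Flux's dependsOn gives you declarative control over deployment order: a Kustomization is applied only after the ones it depends on are ready, so infrastructure is in place before applications are deployed.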
# clusters/production/dependencies.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: crds
  namespace: flux-system
spec:
  interval: 10m0s
  path: ./infrastructure/crds
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m0s
  path: ./infrastructure/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: crds
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: cert-manager
      namespace: cert-manager
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: infrastructure
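The dependsOn edges above form a small dependency graph, and the order in which the Kustomizations become ready is a topological sort of it. As an illustration only (Flux resolves this inside its controllers), the standard `tsort` utility derives the same order from the edge list. The `apply_order` function is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Derive a valid apply order from dependsOn edges using tsort.
# Each input line reads "<dependency> <dependent>".
set -euo pipefail

apply_order() {
  tsort <<'EOF'
crds infrastructure
infrastructure apps
EOF
}

apply_order
```

`tsort` also reports a loop when the input contains a cycle, which makes it a quick way to sanity-check a larger dependsOn graph before committing it.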
Notification and Alerting Integration
The Flux Notification Controller can push reconciliation events to Slack, Teams, Discord, and other providers.
# clusters/production/notifications.yaml
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack
  namespace: flux-system
spec:
  type: slack
  channel: flux-deployments
  secretRef:
    name: slack-webhook-url
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: slack-alert
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: error
  eventSources:
    - kind: Kustomization
      name: "*"
    - kind: HelmRelease
      name: "*"
    - kind: GitRepository
      name: "*"
  exclusionList:
    - "waiting.*"
    - "reconciliation.*in_progress"
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: slack-info
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: info
  eventSources:
    - kind: Kustomization
      name: "apps"
  summary: "Application deployment notification"
# Create the Slack webhook Secret
kubectl create secret generic slack-webhook-url \
  --namespace=flux-system \
  --from-literal=address=https://hooks.slack.com/services/T00/B00/xxx
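Automated Image Updates
Flux Image Automation closes the loop from "commit code" to "build image" to "deploy automatically": it scans the registry, picks the newest tag allowed by a policy, and commits the new tag back to Git. The two image controllers are not part of the default installation; bootstrap with --components-extra=image-reflector-controller,image-automation-controller to enable them.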
# clusters/production/image-automation.yaml
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: web-app
  namespace: flux-system
spec:
  image: myorg/web-app
  interval: 1m0s
  secretRef:
    name: registry-credentials
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: web-app
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: web-app
  policy:
    semver:
      range: ">=1.0.0 <2.0.0"
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
  name: web-app
  namespace: flux-system
spec:
  interval: 1m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  git:
    commit:
      author:
        email: flux@myorg.com
        name: Flux Bot
      messageTemplate: |
        auto: update {{ .AutomationObject }} image
        {{ range .Updated.Images -}}
        - {{ . }}
        {{ end -}}
  update:
    path: ./apps/overlays/production
    strategy: Setters
# apps/overlays/production/deployment-patch.yaml
# The trailing setter comment marks the field that ImageUpdateAutomation rewrites
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  template:
    spec:
      containers:
        - name: web-app
          image: myorg/web-app:1.0.0 # {"$imagepolicy": "flux-system:web-app"}
# Show image policies
flux get images policy
# Show image repositories
flux get images repository
# Show image update automations
flux get images update
# Trigger an image scan manually
flux reconcile image repository web-app
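To make the semver policy concrete, the sketch below picks the newest tag inside `>=1.0.0 <2.0.0` from a list of tags, which is roughly the decision the ImagePolicy makes after each registry scan. It is a simplification: only plain X.Y.Z tags are considered, pre-release tags are ignored, and it relies on GNU `sort -V`. The `latest_in_1x` function and the tag list are made up for illustration.

```shell
#!/usr/bin/env bash
# Pick the newest tag inside ">=1.0.0 <2.0.0" from a list of tags.
# Only plain X.Y.Z tags are considered; requires GNU sort (-V).
set -euo pipefail

latest_in_1x() {
  printf '%s\n' "$@" |
    grep -E '^1\.[0-9]+\.[0-9]+$' |   # keep only 1.x.y tags
    sort -V |                         # version-aware sort, so 1.10.0 > 1.4.2
    tail -n 1
}

latest_in_1x 0.9.0 1.0.0 1.4.2 1.10.0 2.0.0 latest
```

The version-aware sort matters: a plain lexical sort would rank 1.4.2 above 1.10.0, which is one reason mutable tags such as `latest` and ad hoc tag schemes do not work with semver policies.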
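GitOps Tool Comparison
| Feature | Flux CD | ArgoCD | Jenkins X | Spinnaker |
|---|---|---|---|---|
| Core model | Pull | Pull | Push + Pull | Push |
| CNCF status | Graduated | Graduated | Not a CNCF project (CD Foundation) | Not a CNCF project (CD Foundation) |
| Multi-cluster | Native | Native | Limited | Native |
| UI dashboard | None built in (optional third-party UI, e.g. Weave GitOps) | Rich built-in UI | Built in | Rich built-in UI |
| Helm support | HelmRelease CRD | Helm (Helmfile via plugin) | Pipeline | Helm + Bake |
| Kustomize | Native | Native | Limited | Via Bake (Manifest) stage |
| Progressive delivery | Flagger (canary) | Argo Rollouts | None | Built-in strategies |
| Secrets management | Native SOPS integration | Vault/Sealed Secrets | Vault | Vault |
| Automated image updates | Image Automation | Image Updater | Pipeline | None |
| Notifications | Notification Controller | Built in | Pipeline | Built in |
| Learning curve | Medium | Low | High | High |
| Resource footprint (rough) | Low (~200 MB) | Medium (~500 MB) | High | High |
| Best fit | Declarative, pure GitOps | Visual GitOps | Integrated CI/CD | Complex release strategies |
| Community activity | High | Very high | Low | Low |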
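Summary: Flux CD's design philosophy is "whatever Git declares, the cluster does". There is no built-in UI to click through and little room for manual changes to persist. Declarative APIs and continuous reconciliation keep the cluster state in line with the Git repository. From Bootstrap to multi-cluster, from Kustomize to HelmRelease, from SOPS to Flagger, these 6 patterns cover the most common GitOps scenarios in production. Choosing Flux means choosing a GitOps workflow that is declarative, auditable, and automated.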