DevOps CI/CD Pipelines in Practice: Docker + Kubernetes End to End

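CI/CD Fundamentals and the 2026 Landscape

CI/CD (continuous integration / continuous delivery and deployment) is a core DevOps practice. Its goal is to make the whole path from commit to production automated, traceable, and reversible.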

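Core concepts

| Concept | Full name | Primary goal |
| --- | --- | --- |
| CI | Continuous Integration | Merge code frequently, build and test automatically, surface problems early |
| CD (delivery) | Continuous Delivery | Code is always deployable to production; the release itself needs manual approval |
| CD (deployment) | Continuous Deployment | Code that passes the tests is deployed to production automatically, with no manual step |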

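Mainstream CI/CD platforms in 2026

| Platform | Best suited for | Key strengths | Pipeline definition |
| --- | --- | --- | --- |
| GitHub Actions | Open-source projects, small and mid-sized teams | Native integration, Marketplace ecosystem, generous free tier | .github/workflows/*.yml |
| GitLab CI | Private enterprise and self-hosted setups | Built-in container registry, security scanning, K8s integration | .gitlab-ci.yml |
| Jenkins | Complex pipelines, traditional enterprises | The largest plugin ecosystem, highly customizable | Jenkinsfile (Groovy) |

💡 Edit and validate CI/CD config files with a YAML formatter or linter to catch indentation errors early.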


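Docker Best Practices

Docker is the foundation of the CI/CD pipeline: every build should produce a deterministic, reproducible container image.

Multi-stage builds

A multi-stage build is usually the most effective way to shrink an image, because it separates the build environment from the runtime environment:

# Stage 1: build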
FROM golang:1.23-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o /app/server ./cmd/server

# Stage 2: runtime
FROM gcr.io/distroless/static:nonroot
COPY --from=builder /app/server /server
USER nonroot:nonroot
ENTRYPOINT ["/server"]

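Effect: the Go image shrinks from roughly 300 MB to roughly 5 MB.

Node.js multi-stage example:

FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install all dependencies: the build step needs devDependencies
RUN npm ci
COPY . .
# Build, then drop devDependencies so only runtime packages reach the runner stage
RUN npm run build && npm prune --omit=dev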

FROM node:22-alpine AS runner
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
EXPOSE 3000
CMD ["node", "dist/main.js"]

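Layer cache optimization

The key rule for Docker layer caching: put instructions whose inputs rarely change first, and instructions whose inputs change often last. Once one layer is invalidated, every layer after it is rebuilt.

# Good: copy the dependency manifests first so the install layer stays cached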
COPY package*.json ./
RUN npm ci
COPY . .

# Bad: copy all source first, so dependencies are reinstalled on every source change
COPY . .
RUN npm ci
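
The effect of instruction order can be illustrated with a toy model of the cache. This is only a sketch, not how Docker computes its keys internally: it assumes each layer's key is a hash of the parent key, the instruction text, and the content of the files the instruction reads. The names `layer_keys` and `rebuilt_layers` and the file contents are made up for the illustration.

```python
import hashlib


def layer_keys(instructions, file_hashes):
    """Return one cache key per instruction, chained to the previous layer."""
    keys = []
    parent = ""
    for text, inputs in instructions:
        h = hashlib.sha256()
        h.update(parent.encode())          # a changed parent changes every child
        h.update(text.encode())            # the instruction text itself
        for path in inputs:                # content of files the instruction reads
            h.update(file_hashes[path].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def rebuilt_layers(instructions, before, after):
    """List the instructions whose cache key differs between two source states."""
    old = layer_keys(instructions, before)
    new = layer_keys(instructions, after)
    return [text for (text, _), a, b in zip(instructions, old, new) if a != b]


# (instruction, files it reads) pairs for the two orderings shown above
GOOD = [
    ("COPY package*.json ./", ["package.json"]),
    ("RUN npm ci", []),
    ("COPY . .", ["package.json", "src/app.js"]),
]
BAD = [
    ("COPY . .", ["package.json", "src/app.js"]),
    ("RUN npm ci", []),
]

# Only a source file changes; the dependency manifest stays the same.
before = {"package.json": "v1", "src/app.js": "v1"}
after = {"package.json": "v1", "src/app.js": "v2"}

print(rebuilt_layers(GOOD, before, after))  # ['COPY . .']
print(rebuilt_layers(BAD, before, after))   # ['COPY . .', 'RUN npm ci']
```

With the good ordering a source-only change rebuilds just the final copy layer, while the bad ordering also forces the dependency install to run again.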

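Image size checklist

| Technique | Typical effect | Applies to |
| --- | --- | --- |
| Multi-stage build | 60-90% smaller | Any compiled language |
| Alpine base image | 50-80% smaller | Apps that do not depend on glibc |
| distroless image | Ships little beyond the application itself | Statically linked Go binaries; language-specific variants exist for Java and others |
| .dockerignore | Smaller build context | Every project |
| Merged RUN instructions | Fewer image layers | apt/apk install steps |
| Stripped binary with -ldflags="-s -w" | 20-30% smaller | Go projects |

# Merge RUN instructions: install everything in a single layer.
# --no-cache already skips the local apk index cache, so no extra cleanup is needed.
# To pin versions, use the full Alpine package version including its -rN suffix.
RUN apk add --no-cache curl git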

.dockerignore best practices

# .dockerignore
node_modules
npm-debug.log
.git
.github
.gitlab
.vscode
.idea
*.md
*.test.js
coverage/
dist/
.env
.env.local

💡 Run package.json through a JSON formatter or validator to catch syntax errors and make dependency version changes easy to review.


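Kubernetes Deployment Strategies

Kubernetes supports several deployment strategies. Which one to use depends on how much risk the business can tolerate and how quickly a rollback must complete.

Rolling update

The default Kubernetes strategy, which replaces old Pods gradually: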

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 6
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2        # at most 2 Pods above the desired replica count
      maxUnavailable: 1   # at most 1 Pod may be unavailable during the update
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:v2.0.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
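
To make the two rollout limits concrete, the sketch below works out the Pod-count bounds implied by `replicas`, `maxSurge`, and `maxUnavailable`. It assumes the documented Kubernetes rounding for percentage values (surge rounds up, unavailable rounds down); the function names are hypothetical.

```python
def resolve(value, replicas, round_up):
    """Resolve an absolute count or a percentage string such as "25%"."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Integer arithmetic avoids floating-point surprises:
        # ceil for maxSurge, floor for maxUnavailable.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_surge, max_unavailable):
    """Return the Pod-count envelope a rolling update stays within."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


# The manifest above: 6 replicas, maxSurge 2, maxUnavailable 1
print(rollout_bounds(6, 2, 1))
# {'max_total_pods': 8, 'min_available_pods': 5}

# The same Deployment with percentage values
print(rollout_bounds(6, "25%", "25%"))
# {'max_total_pods': 8, 'min_available_pods': 5}
```

For the manifest above this means the cluster needs headroom for up to 8 Pods during a rollout, and at least 5 Pods keep serving traffic.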

Blue-green deployment

Run two complete environments side by side and switch the Service selector between them for a zero-downtime cutover:

# blue-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
      version: blue
  template:
    metadata:
      labels:
        app: my-app
        version: blue
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:v1.0.0
---
# green-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
      version: green
  template:
    metadata:
      labels:
        app: my-app
        version: green
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:v2.0.0
---
# service.yaml (change the selector to switch between blue and green)
apiVersion: v1
kind: Service
metadata:
  name: my-app-svc
spec:
  selector:
    app: my-app
    version: blue    # change to green to switch to the new version
  ports:
    - port: 80
      targetPort: 8080

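Canary release

Shift traffic to the new version gradually: validate with a small share first, then roll out fully:

# canary-with-istio.yaml (the stable and canary subsets must be defined in a DestinationRule)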
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app-vs
spec:
  hosts:
    - my-app.example.com
  http:
    - match:
        - headers:
            x-canary:
              exact: "true"
      route:
        - destination:
            host: my-app
            subset: canary
          weight: 100
    - route:
        - destination:
            host: my-app
            subset: stable
          weight: 90
        - destination:
            host: my-app
            subset: canary
          weight: 10
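
The routing decision encoded in the two `http` rules can be sketched in a few lines. This is a simplified model, not Istio's implementation: `pick_subset` and its `roll` parameter (a number in [0, 100) chosen by the caller) are hypothetical names used only for illustration.

```python
import random


def pick_subset(headers, roll):
    """Mimic the rule order: header match first, then the 90/10 weighted split."""
    # Rule 1: requests carrying x-canary: "true" always go to the canary subset.
    if headers.get("x-canary") == "true":
        return "canary"
    # Rule 2: everything else is split 90% stable / 10% canary.
    return "stable" if roll < 90 else "canary"


# Simulate 10,000 ordinary requests with a fixed seed for repeatability.
rng = random.Random(42)
counts = {"stable": 0, "canary": 0}
for _ in range(10_000):
    counts[pick_subset({}, rng.uniform(0, 100))] += 1

print(counts)  # roughly a 90/10 split
print(pick_subset({"x-canary": "true"}, roll=0))  # canary
```

The header rule gives testers a deterministic way into the canary, while ordinary users only reach it through the weighted split.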

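Strategy comparison

| Strategy | Downtime | Rollback speed | Resource cost | Complexity | Typical use |
| --- | --- | --- | --- | --- | --- |
| Rolling update | | | | | Routine releases |
| Blue-green | | Fast (switch the Service) | High (double) | | Business-critical services |
| Canary | | | | | High-risk changes |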

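GitOps and ArgoCD

GitOps has become a widely adopted model for Kubernetes deployments: the Git repository is the single source of truth, and every change is triggered by a Git commit.

GitOps core principles

  1. Declarative: all infrastructure and application configuration is declarative
  2. Versioned: all configuration lives in a Git repository with a complete change history
  3. Pulled automatically: the deployment tool pulls changes from Git and applies them
  4. Continuously reconciled: cluster state is continuously compared with what Git declares, and drift is repaired automatically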
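
The reconciliation idea in the fourth principle can be sketched as a diff between desired and live state. This is a minimal illustration, not ArgoCD's actual algorithm; the resource names, the dictionary shapes, and the functions `plan_sync` and `apply_plan` are assumptions made for the example.

```python
def plan_sync(desired, live, prune=True):
    """Compare Git-declared state with cluster state and list the needed actions."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(("create", name))
        elif live[name] != spec:
            actions.append(("update", name))   # drift: live differs from Git
    if prune:
        for name in sorted(live):
            if name not in desired:
                actions.append(("prune", name))  # no longer declared in Git
    return actions


def apply_plan(actions, desired, live):
    """Return the cluster state after carrying out the planned actions."""
    result = dict(live)
    for verb, name in actions:
        if verb == "prune":
            del result[name]
        else:
            result[name] = desired[name]
    return result


desired = {"deployment/my-app": {"replicas": 3}, "service/my-app-svc": {"port": 80}}
live = {"deployment/my-app": {"replicas": 5}, "configmap/old": {"k": "v"}}

actions = plan_sync(desired, live)
print(actions)
# [('update', 'deployment/my-app'), ('create', 'service/my-app-svc'), ('prune', 'configmap/old')]

live = apply_plan(actions, desired, live)
print(plan_sync(desired, live))  # [] once the cluster matches Git
```

Running the comparison in a loop is what turns a one-off deployment into continuous reconciliation: a manual change to the cluster shows up as an `update` on the next pass.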

ArgoCD configuration example

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/k8s-manifests.git
    targetRevision: main
    path: overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: my-app
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
      allowEmpty: false
    syncOptions:
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
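
The `retry.backoff` block describes an exponential backoff. The sketch below computes the wait times those three values imply, assuming each delay is `duration * factor^attempt` capped at `maxDuration`; ArgoCD's exact timing may differ, and the helper names are hypothetical.

```python
def parse_seconds(text):
    """Convert a simple duration such as "5s" or "3m" to seconds."""
    units = {"s": 1, "m": 60, "h": 3600}
    return int(text[:-1]) * units[text[-1]]


def backoff_schedule(limit, duration, factor, max_duration):
    """Return the delay before each retry, in seconds."""
    base = parse_seconds(duration)
    cap = parse_seconds(max_duration)
    return [min(base * factor ** attempt, cap) for attempt in range(limit)]


# Values from the Application above: limit 5, 5s base, factor 2, 3m cap
print(backoff_schedule(5, "5s", 2, "3m"))  # [5, 10, 20, 40, 80]

# With more retries the cap starts to apply
print(backoff_schedule(8, "5s", 2, "3m"))  # [5, 10, 20, 40, 80, 160, 180, 180]
```

With `limit: 5` the cap is never reached, so the five retries span a little over two and a half minutes in total.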

Multi-environment management with Kustomize

k8s-manifests/
├── base/
│   ├── deployment.yaml
│   ├── service.yaml
│   └── kustomization.yaml
└── overlays/
    ├── development/
    │   ├── kustomization.yaml
    │   └── patch-replicas.yaml
    ├── staging/
    │   ├── kustomization.yaml
    │   └── patch-replicas.yaml
    └── production/
        ├── kustomization.yaml
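        └── patch-replicas.yaml

# overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:            # replaces the deprecated bases field
  - ../../base
patches:              # replaces the deprecated patchesStrategicMerge field
  - path: patch-replicas.yaml
  - path: patch-resources.yaml   # not shown in the directory tree above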
configMapGenerator:
  - name: app-config
    literals:
      - ENV=production
      - LOG_LEVEL=warn
      - DB_HOST=prod-db.internal

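Pipeline as Code: a complete GitHub Actions workflow

The workflow below sketches a production-style CI/CD pipeline that covers build, test, security scanning, and deployment end to end: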

name: CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
  K8S_NAMESPACE: my-app

jobs:
  # Job 1: lint and unit tests
  lint-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: '1.23'
      - name: Lint
        # golangci-lint is not preinstalled on hosted runners; this action installs and runs it
        uses: golangci/golangci-lint-action@v6
      - name: Unit Test
        run: go test -race -coverprofile=coverage.out ./...
      - name: Upload Coverage
        uses: codecov/codecov-action@v4
        with:
          files: coverage.out

  # Job 2: security scanning
  security-scan:
    runs-on: ubuntu-latest
    needs: lint-and-test
    steps:
      - uses: actions/checkout@v4
      - name: Trivy FS Scan
        uses: aquasecurity/trivy-action@master   # pin to a release tag or commit SHA in real pipelines
        with:
          scan-type: fs
          severity: CRITICAL,HIGH
          exit-code: '1'
      - name: Snyk Open Source (dependency scan)
        uses: snyk/actions/golang@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}

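  # Job 3: build and push the Docker image
  build-and-push:
    runs-on: ubuntu-latest
    needs: security-scan
    if: github.event_name == 'push'   # do not publish images from pull requests
    permissions:
      contents: read
      packages: write
    outputs:
      # metadata-action's "tags" output is a multi-line list of full image references,
      # so expose one immutable tag (the commit SHA) and the digest instead
      image_tag: ${{ github.sha }}
      image_digest: ${{ steps.build.outputs.digest }}
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Docker Metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=sha,prefix=,format=long
            type=ref,event=branch
            type=semver,pattern={{version}}
      - name: Build and Push
        id: build
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          build-args: |
            BUILD_DATE=${{ github.event.head_commit.timestamp }}
            VCS_REF=${{ github.sha }}

  # Job 4: scan the pushed image
  image-scan:
    runs-on: ubuntu-latest
    needs: build-and-push
    permissions:
      contents: read
      packages: read
      security-events: write   # required to upload SARIF results
    steps:
      - name: Trivy Image Scan
        uses: aquasecurity/trivy-action@master
        with:
          # assumes the repository path is lowercase, as image references require
          image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ needs.build-and-push.outputs.image_tag }}
          severity: CRITICAL,HIGH
          exit-code: '1'
          format: sarif
          output: trivy-results.sarif
        env:
          # credentials for pulling from a private registry
          TRIVY_USERNAME: ${{ github.actor }}
          TRIVY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
      - name: Upload SARIF
        if: always()   # upload findings even when the scan step fails the job
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: trivy-results.sarif

  # Job 5: deploy to Kubernetes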
  deploy:
    runs-on: ubuntu-latest
    needs: [build-and-push, image-scan]
    if: github.ref == 'refs/heads/main'
    environment: production
    steps:
      - uses: actions/checkout@v4
      - uses: azure/setup-kubectl@v3
      - uses: azure/setup-helm@v3
      - name: Configure kubectl
        run: |
          mkdir -p $HOME/.kube
          echo "${{ secrets.KUBE_CONFIG }}" | base64 -d > $HOME/.kube/config
      - name: Deploy with Helm
        run: |
          helm upgrade --install my-app ./helm/my-app \
            --namespace ${{ env.K8S_NAMESPACE }} \
            --set image.tag=${{ needs.build-and-push.outputs.image_tag }} \
            --set image.digest=${{ needs.build-and-push.outputs.image_digest }} \
            --values ./helm/my-app/values-production.yaml \
            --timeout 5m \
            --wait
      - name: Verify Deployment
        run: |
          kubectl rollout status deployment/my-app \
            --namespace ${{ env.K8S_NAMESPACE }} \
            --timeout=3m
      - name: Smoke Test
        run: |
          STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
            https://my-app.example.com/healthz)
          if [ "$STATUS" != "200" ]; then
            echo "Smoke test failed: HTTP $STATUS"
            exit 1
          fi

Complete GitLab CI configuration

# .gitlab-ci.yml
stages:
  - test
  - security
  - build
  - deploy

variables:
  DOCKER_TLS_CERTDIR: "/certs"
  REGISTRY: $CI_REGISTRY
  IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA

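test:
  stage: test
  image: golang:1.23   # -race needs cgo and a C toolchain, which the alpine image does not ship
  script:
    - go test -race -coverprofile=coverage.out ./...
    - go tool cover -func=coverage.out
    # convert the Go profile to Cobertura XML so the report declared below actually exists
    - go run github.com/boumenot/gocover-cobertura@latest < coverage.out > coverage.xml
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage.xml

security-scan:
  stage: security
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]   # the image's default entrypoint is trivy itself, which breaks the job script
  script:
    - trivy fs --severity CRITICAL,HIGH --exit-code 1 .
  allow_failure: false

build:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  before_script:
    # read the password from stdin so it never appears in the process list
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
  script:
    # pull the previous image so --cache-from has layers to reuse
    - docker pull $CI_REGISTRY_IMAGE:latest || true
    - docker build
        --cache-from $CI_REGISTRY_IMAGE:latest
        --build-arg BUILDKIT_INLINE_CACHE=1
        --tag $IMAGE_TAG
        --tag $CI_REGISTRY_IMAGE:latest
        --build-arg VCS_REF=$CI_COMMIT_SHA
        .
    - docker push $IMAGE_TAG
    - docker push $CI_REGISTRY_IMAGE:latest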

deploy:staging:
  stage: deploy
  # bitnami/kubectl does not ship helm; this assumes an image that bundles both kubectl and helm
  image: dtzar/helm-kubectl:latest
  script:
    - kubectl config use-context staging
    - helm upgrade --install my-app ./helm/my-app
        --namespace staging
        --set image.tag=$CI_COMMIT_SHORT_SHA
        --values ./helm/my-app/values-staging.yaml
        --wait
  environment:
    name: staging
    url: https://staging.my-app.example.com
  only:
    - develop

deploy:production:
  stage: deploy
  # bitnami/kubectl does not ship helm; this assumes an image that bundles both kubectl and helm
  image: dtzar/helm-kubectl:latest
  script:
    - kubectl config use-context production
    - helm upgrade --install my-app ./helm/my-app
        --namespace production
        --set image.tag=$CI_COMMIT_SHORT_SHA
        --values ./helm/my-app/values-production.yaml
        --wait
  environment:
    name: production
    url: https://my-app.example.com
  when: manual
  only:
    - main

Jenkins Pipeline (Declarative)

// Jenkinsfile
pipeline {
    agent any

    environment {
        REGISTRY = 'registry.example.com'
        IMAGE_NAME = 'my-app'
        IMAGE_TAG = "${env.BUILD_NUMBER}-${env.GIT_COMMIT.take(8)}"
    }

    stages {
        stage('Test') {
            agent { label 'golang' }
            steps {
                sh 'go test -race -coverprofile=coverage.out ./...'
                sh 'golangci-lint run ./...'
            }
        }

        stage('Security Scan') {
            steps {
                sh "trivy fs --severity CRITICAL,HIGH --exit-code 1 ."
            }
        }

        stage('Build & Push') {
            agent { label 'docker' }
            steps {
                script {
                    docker.withRegistry("https://${REGISTRY}", 'registry-credentials') {
                        def image = docker.build(
                            "${IMAGE_NAME}:${IMAGE_TAG}",
                            '--build-arg VCS_REF=${GIT_COMMIT} .'
                        )
                        image.push()
                        image.push('latest')
                    }
                }
            }
        }

        stage('Deploy to Staging') {
            when { branch 'develop' }
            steps {
                sh """
                    helm upgrade --install ${IMAGE_NAME} ./helm/${IMAGE_NAME} \
                        --namespace staging \
                        --set image.tag=${IMAGE_TAG} \
                        --values ./helm/${IMAGE_NAME}/values-staging.yaml \
                        --wait
                """
            }
        }

        stage('Deploy to Production') {
            when { branch 'main' }
            input {
                message 'Deploy to production?'
                ok 'Deploy'
            }
            steps {
                sh """
                    helm upgrade --install ${IMAGE_NAME} ./helm/${IMAGE_NAME} \
                        --namespace production \
                        --set image.tag=${IMAGE_TAG} \
                        --values ./helm/${IMAGE_NAME}/values-production.yaml \
                        --wait
                """
            }
        }
    }

    post {
        failure {
            slackSend(
                channel: '#cicd-alerts',
                color: 'danger',
                message: "Pipeline failed: ${env.JOB_NAME} #${env.BUILD_NUMBER}"
            )
        }
        success {
            slackSend(
                channel: '#cicd-alerts',
                color: 'good',
                message: "Pipeline succeeded: ${env.JOB_NAME} #${env.BUILD_NUMBER} → ${IMAGE_TAG}"
            )
        }
    }
}

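Container registry management

Image tagging strategy

| Tag type | Example | Lifetime | Use |
| --- | --- | --- | --- |
| Immutable tag | sha-abc1234 | Permanent | Referenced by production deployments |
| Semantic version | v2.1.0 | Permanent | Releases |
| Branch tag | main, develop | Overwritable | Development / preview |
| latest | latest | Overwritable | Local development only |

Core rule: production never uses a mutable tag such as latest; it must use an immutable tag such as the Git SHA.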
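
A pre-deploy check for this rule could look like the following sketch. It treats digest-pinned references, SHA-style tags, and semantic-version tags as immutable by convention (a registry only guarantees this if tag overwrites are disabled); the patterns and the `classify` function are assumptions for illustration.

```python
import re

DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")
SHA_TAG = re.compile(r"^(sha-)?[0-9a-f]{7,40}$")
SEMVER_TAG = re.compile(r"^v?\d+\.\d+\.\d+$")


def classify(image_ref):
    """Return "immutable" or "mutable" for an image reference."""
    if DIGEST.search(image_ref):
        return "immutable"
    # Look only at the last path segment so a registry port is not mistaken for a tag.
    last = image_ref.rsplit("/", 1)[-1]
    if ":" not in last:
        return "mutable"  # no tag means the implicit latest
    tag = last.rsplit(":", 1)[1]
    if SHA_TAG.match(tag) or SEMVER_TAG.match(tag):
        return "immutable"
    return "mutable"


for ref in [
    "registry.example.com/my-app:sha-abc1234",
    "registry.example.com/my-app:v2.1.0",
    "registry.example.com/my-app:main",
    "registry.example.com/my-app:latest",
    "registry.example.com:5000/my-app",
]:
    print(ref, "->", classify(ref))
```

A pipeline could run such a check against the rendered manifests and fail the production deploy when any reference is classified as mutable.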

Image cleanup policy

# GitHub Actions: clean up old images on a schedule
name: Registry Cleanup
on:
  schedule:
    - cron: '0 2 * * 0'  # every Sunday at 02:00 UTC

jobs:
  cleanup:
    runs-on: ubuntu-latest
    permissions:
      packages: write
    steps:
      - name: Delete untagged images
        uses: actions/delete-package-versions@v5
        with:
          package-name: my-app
          package-type: container
          min-versions-to-keep: 10
          delete-only-untagged-versions: true

Security scanning integration

Trivy: full-stack security scanning

# Filesystem scan (dependency vulnerabilities)
trivy fs --severity CRITICAL,HIGH --exit-code 1 .

# Image scan
trivy image --severity CRITICAL,HIGH registry.example.com/my-app:v2.0.0

# IaC scan (K8s manifests / Dockerfile)
trivy config --severity CRITICAL,HIGH ./k8s/

# SBOM generation
trivy image --format spdx-json --output sbom.json registry.example.com/my-app:v2.0.0

Snyk: a developer-friendly security platform

# GitHub Actions: Snyk integration
- name: Snyk Open Source
  uses: snyk/actions/golang@master
  env:
    SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
  with:
    args: --severity-threshold=high

- name: Snyk Container
  uses: snyk/actions/docker@master
  env:
    SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
  with:
    image: registry.example.com/my-app:v2.0.0
    args: --severity-threshold=high --file=Dockerfile

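Security scanning layers

| Layer | Tools | What is scanned | When it runs |
| --- | --- | --- | --- |
| SAST | Snyk Code / SonarQube | Source code vulnerabilities | Every commit |
| SCA | Snyk Open Source / Trivy fs | Dependency vulnerabilities | Every commit |
| Container scan | Trivy image / Snyk Container | Image vulnerabilities | After the image is built |
| IaC scan | Trivy config / Checkov | K8s/Dockerfile misconfiguration | PR stage |
| DAST | OWASP ZAP | Runtime vulnerabilities | After deploying to staging |

💡 Use a hashing tool (hashing, not encryption) to record a checksum of sensitive CI/CD configuration so that tampering can be detected.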


Environment management: Dev / Staging / Prod

Environment isolation

# Per-environment Helm values
# values-development.yaml
replicaCount: 1
resources:
  requests:
    cpu: 100m
    memory: 128Mi
autoscaling:
  enabled: false
config:
  logLevel: debug
  dbHost: dev-db.internal

# values-staging.yaml
replicaCount: 2
resources:
  requests:
    cpu: 250m
    memory: 256Mi
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 5
config:
  logLevel: info
  dbHost: staging-db.internal

# values-production.yaml
replicaCount: 3
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 1000m
    memory: 1Gi
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 20
  targetCPUUtilizationPercentage: 70
config:
  logLevel: warn
  dbHost: prod-db.internal

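GitHub Actions environment protection rules

# Production requires manual approval
deploy-production:
  runs-on: ubuntu-latest
  environment: production    # reviewers are configured in the repository settings
  steps:
    - name: Deploy
      run: helm upgrade --install my-app ./helm/my-app

Configure these under Settings → Environments in the GitHub repository:

  • production: Required reviewers = 2 people listed (GitHub needs approval from only one of them), Wait timer = 5 minutes
  • staging: no approval required, deploys automatically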

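Monitoring and alerting integration

Prometheus + Grafana metrics collection

# Pod annotations for annotation-based Prometheus scraping
# (honored only if the Prometheus scrape config relabels on them)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:   # must sit on the Pod template; Pod discovery does not see Deployment-level annotations
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:v2.0.0
          ports:
            - containerPort: 8080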

Deployment alert rules

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: deployment-alerts
  namespace: monitoring
spec:
  groups:
    - name: deployment
      rules:
        - alert: DeploymentRolloutStuck
          expr: |
            kube_deployment_status_replicas_unavailable / kube_deployment_status_replicas > 0.5
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Deployment {{ $labels.deployment }} rollout is stuck"
        - alert: HighErrorRateAfterDeploy
          # sum() on both sides: without it the division only pairs 5xx series with themselves
          expr: |
            sum(rate(http_requests_total{status=~"5.."}[5m]))
            /
            sum(rate(http_requests_total[5m])) > 0.05
          for: 3m
          labels:
            severity: critical
          annotations:
            summary: "5xx error rate above 5% after deployment"
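
The ratio that the HighErrorRateAfterDeploy rule is meant to express can be checked by hand. The sketch below uses made-up counter samples over a 5-minute window and ignores counter resets, which Prometheus' `rate()` handles; `per_second_rate` and `error_ratio` are hypothetical helper names.

```python
import re


def per_second_rate(first, last, window_seconds):
    """Average per-second increase of a counter over the window."""
    return (last - first) / window_seconds


def error_ratio(samples, window_seconds=300):
    """samples maps status code -> (counter at window start, counter at window end)."""
    total = sum(per_second_rate(a, b, window_seconds) for a, b in samples.values())
    errors = sum(
        per_second_rate(a, b, window_seconds)
        for status, (a, b) in samples.items()
        if re.fullmatch(r"5..", status)  # same matcher as status=~"5.."
    )
    return errors / total if total else 0.0


# Hypothetical counters: 3000 new requests in the window, 120 of them 5xx
samples = {
    "200": (10_000, 12_850),
    "404": (300, 330),
    "500": (40, 130),
    "503": (5, 35),
}

ratio = error_ratio(samples)
print(round(ratio, 4))   # 0.04
print(ratio > 0.05)      # False: below the 5% threshold, so the alert would not fire
```

Summing across all status codes before dividing is the important part: the denominator has to include successful requests, otherwise the ratio is meaningless.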

Slack / DingTalk alert notifications

# GitHub Actions: deployment notification
- name: Notify Deployment
  if: always()
  uses: 8398a7/action-slack@v3
  with:
    status: ${{ job.status }}
    fields: repo,message,commit,author,action,eventName,ref,workflow
  env:
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}

Rollback strategies

Automatic rollback when health checks fail

# Helm deploy with automatic rollback (Helm 3 also offers --atomic for the same effect)
- name: Deploy with Auto Rollback
  run: |
    helm upgrade --install my-app ./helm/my-app \
      --namespace production \
      --set image.tag=${{ needs.build-and-push.outputs.image_tag }} \
      --values ./helm/my-app/values-production.yaml \
      --timeout 5m \
      --wait || \
    (echo "Deployment failed, rolling back..." && \
     helm rollback my-app --namespace production && \
     exit 1)
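
The control flow of the step above (upgrade, roll back on failure, still report the failure) can be written as a small wrapper. This is a sketch under assumptions: the function names are hypothetical, and the command runner is injectable so the logic can be exercised without a cluster; by default it shells out to the real `helm` binary.

```python
import subprocess


def run_cli(args):
    """Default runner: execute a command and return its exit code."""
    return subprocess.run(args).returncode


def deploy_with_rollback(release, chart, namespace, image_tag, run=run_cli):
    """Upgrade a Helm release and roll back to the previous revision on failure."""
    upgrade = [
        "helm", "upgrade", "--install", release, chart,
        "--namespace", namespace,
        "--set", f"image.tag={image_tag}",
        "--timeout", "5m", "--wait",
    ]
    if run(upgrade) == 0:
        return "deployed"
    # Without a revision number, helm rollback targets the previous release.
    rollback = ["helm", "rollback", release, "--namespace", namespace]
    if run(rollback) == 0:
        return "rolled_back"
    return "rollback_failed"


# Dry run with a fake runner that makes the upgrade fail and the rollback succeed.
calls = []


def fake_failing_upgrade(args):
    calls.append(args[1])
    return 1 if args[1] == "upgrade" else 0


outcome = deploy_with_rollback(
    "my-app", "./helm/my-app", "production", "abc1234", run=fake_failing_upgrade
)
print(outcome)  # rolled_back
print(calls)    # ['upgrade', 'rollback']
```

A caller would treat anything other than `deployed` as a failed pipeline run, which mirrors the `exit 1` in the shell version.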

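Manual rollback: by revision or Git commit

# Roll back a Deployment to a specific revision
kubectl rollout undo deployment/my-app --to-revision=3

# Roll back a Helm release
helm rollback my-app 2 --namespace production

# GitOps rollback: revert the Git commit
git revert <commit-hash>
git push origin main
# ArgoCD detects the change and syncs the cluster back to the previous state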

Automatic canary rollback

# Argo Rollouts: automated analysis and rollback
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app
spec:
  # replicas, selector, and template omitted for brevity
  analysis:
    successfulRunHistoryLimit: 3
    unsuccessfulRunHistoryLimit: 3
  strategy:
    canary:
      analysis:                # background analysis; a failed run aborts the rollout
        templates:
          - templateName: success-rate
            clusterScope: true
        startingStep: 2        # begin analysis at index 2 of the steps list
      steps:
        - setWeight: 10
        - pause: { duration: 5m }
        - setWeight: 30
        - pause: { duration: 5m }
        - setWeight: 60
        - pause: { duration: 5m }
        - setWeight: 100

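Common pipeline failures and fixes

| Symptom | Root cause | Fix |
| --- | --- | --- |
| Docker build cache keeps missing | Missing .dockerignore or wrong COPY order | Reorder the Dockerfile instructions and add a .dockerignore |
| Image push returns 403 | Registry credentials expired or permissions insufficient | Check the Service Account / token permissions |
| K8s ImagePullBackOff | Image tag does not exist or the registry is unreachable | Verify the tag; check registry networking and the pull Secret |
| Helm deployment times out | Misconfigured readinessProbe or insufficient resources | Tune the probe parameters; raise resource requests and limits |
| Test environment differs from production | Configuration drift between environments | Manage configuration centrally with Kustomize/Helm and reduce hard-coding |
| Security scan false positives | Vulnerabilities pulled in through transitive dependencies | Use .trivyignore or a Snyk policy to ignore known false positives |
| Concurrent deployment conflicts | Several people trigger the pipeline at once | Use GitHub concurrency groups or GitLab resource_group |
| Secret leakage | Secrets written in plain text to YAML or logs | Use Sealed Secrets / External Secrets Operator |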

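Concurrency control

# GitHub Actions: prevent concurrent deployments from colliding
concurrency:
  group: deploy-${{ github.ref }}
  cancel-in-progress: false   # let a running deployment finish rather than killing it midway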

Debugging tips

# Inspect Pod events
kubectl describe pod <pod-name> -n <namespace>

# Show rollout history
kubectl rollout history deployment/my-app -n production

# Show Helm release history
helm history my-app -n production

# Port-forward for local debugging
kubectl port-forward svc/my-app 8080:80 -n staging

# Stream container logs
kubectl logs -f deployment/my-app -n production --all-containers

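FAQ

Q: Is the GitHub Actions free tier enough? A: Public repositories get unlimited minutes on standard hosted runners. Private repositories on the Free plan get 2,000 minutes per month (Linux). Self-hosted runners are not metered.

Q: Should Docker images use the latest tag? A: Never in production. Use the Git SHA or a semantic version as an immutable tag so that deployments stay traceable and can be rolled back.

Q: How do I choose between blue-green and canary? A: Blue-green suits business-critical services that need fast rollback (just switch the Service). Canary suits high-risk changes that need gradual validation. Rolling updates are enough for routine releases.

Q: How does GitOps differ from the traditional CI/CD push model? A: In the push model the CI pipeline runs kubectl apply against the cluster. In GitOps an in-cluster agent (ArgoCD) pulls changes from Git. The advantages of GitOps: Git is the single source of truth, and cluster drift can be repaired automatically.

Q: How should secrets be handled in CI/CD? A: Use the platform's native secret management (GitHub Secrets / GitLab Variables / Jenkins Credentials), use Sealed Secrets or External Secrets Operator inside K8s, and never commit a secret to Git.

Q: How do I manage multi-cluster deployments? A: Use ArgoCD ApplicationSet with a Git directory structure, or Helm with multiple kubeconfig contexts. ArgoCD is the recommended option because it supports multiple clusters natively.

Q: The pipeline is too slow. How can I speed it up? A: 1) Use the Docker layer cache and the GitHub Actions cache; 2) run independent jobs in parallel; 3) use self-hosted runners to reduce cold starts; 4) test incrementally (only the modules that changed).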
