K8s Cloud Costs Spiking? FinOps in Practice for 2026: Five Strategies to Cut Kubernetes Costs by 60%

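You open the cloud bill at the end of the month and K8s spending is 30% over budget again. CPU utilization sits at 15%, yet you are billed for 100% of what you requested. A Spot instance is reclaimed and a service goes down. For cloud-native teams in 2026 these are everyday events. FinOps is not just about saving money; it is the practice of making every yen of cloud spend produce business value. This article walks through five practical strategies that together aim to cut K8s costs by about 60%.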

Background: The FinOps Framework

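FinOps (a portmanteau of Finance and DevOps, sometimes expanded as Financial Operations) is an operating framework that brings financial accountability to cloud consumption. It is commonly described in three phases: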

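Phase	Goal	Key actions	Owner
Inform (visibility)	Make costs visible	Tag governance, cost allocation, bill analysis	FinOps team
Optimize	Reduce waste	Right-sizing, Spot instances, reserved instances	Engineering team
Operate	Sustain control	Budget alerts, autoscaling, policy enforcement	Platform team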

Problem Analysis: Where Does K8s Cost Waste Come From?

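Cause of waste	Share	Typical scenario
Over-provisioned resources (request >> actual usage)	40%	CPU request of 2 cores, only 0.3 cores actually used
Idle resources (running with no traffic)	25%	Dev environment runs 24 hours a day with no traffic after business hours
Spot/Preemptible not used	20%	Every Pod runs on On-Demand instances
Autoscaling not configured	10%	No HPA, so resources are not released off-peak
Storage waste	5%	Oversized PVCs, logs and images never cleaned up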
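To make the first row concrete, here is a minimal sketch of how the idle share of a CPU request can be computed. The function name is hypothetical and the numbers are the table's own example (2 cores requested, 0.3 cores used); it is an illustration, not a billing model.

```python
def overprovision_ratio(request_millicores: int, used_millicores: int) -> float:
    """Fraction of the requested CPU that is reserved but not used."""
    if request_millicores <= 0:
        raise ValueError("request must be positive")
    # Usage above the request counts as zero idle capacity, not negative.
    idle = max(request_millicores - used_millicores, 0)
    return idle / request_millicores


# The table's example: 2 cores requested, 0.3 cores actually used.
ratio = overprovision_ratio(2000, 300)
print(f"{ratio:.0%} of the request is idle")  # prints "85% of the request is idle"
```

Because most providers bill for the nodes needed to satisfy requests rather than for actual usage, this idle share is a reasonable first estimate of what right-sizing can recover.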

Strategy 1: Right-Sizing Resources

Step 1: Install metrics-server and kube-resource-report

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

helm install kube-resource-report helm/kube-resource-report \
  --set prometheus.url=http://prometheus:9090

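Step 2: Analyze resource utilization

# Check resource usage for all Pods
kubectl top pods -A --sort-by=cpu

# Compare requests with actual usage using the kubectl resource-capacity plugin
# (--util adds live utilization next to the requests)
kubectl resource-capacity --sort cpu.request --pods --util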

Step 3: Automated right-sizing recommendations

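# Schematic recommendation record for illustration; confirm that this resource
# kind exists in your Kubecost version before applying it
apiVersion: apps.kubecost.com/v1beta1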
kind: RightSizingRecommendation
metadata:
  name: all-deployments
spec:
  targetRef:
    kind: Deployment
    name: order-service
  current:
    requests:
      cpu: "2"
      memory: "4Gi"
    limits:
      cpu: "4"
      memory: "8Gi"
  recommended:
    requests:
      cpu: "250m"
      memory: "512Mi"
    limits:
      cpu: "500m"
      memory: "1Gi"
  savingsPercent: 85

Step 4: Right-sizing script (usage snapshot; see note on P99)

# right_sizing.py
# Note: metrics-server only returns a point-in-time snapshot, so this script sizes
# from current usage. For P99-based sizing, take the usage figure from a Prometheus
# range query instead, for example:
#   quantile_over_time(0.99, rate(container_cpu_usage_seconds_total{namespace="..."}[5m])[7d:5m])
import subprocess
import json

def get_pod_metrics(namespace: str) -> dict:
    """Return current per-pod usage (CPU in millicores, memory in MiB) from the metrics API."""
    result = subprocess.run([
        'kubectl', 'get', '--raw',
        f'/apis/metrics.k8s.io/v1beta1/namespaces/{namespace}/pods'
    ], capture_output=True, text=True, check=True)  # check=True surfaces kubectl failures

    pods = json.loads(result.stdout)
    metrics = {}
    for item in pods.get('items', []):
        pod_name = item['metadata']['name']
        containers = item['containers']
        total_cpu = 0
        total_mem = 0
        # Sum usage across all containers in the pod
        for c in containers:
            usage = c.get('usage', {})
            cpu_str = usage.get('cpu', '0m')
            mem_str = usage.get('memory', '0Ki')
            total_cpu += parse_cpu(cpu_str)
            total_mem += parse_memory(mem_str)
        metrics[pod_name] = {'cpu_millicores': total_cpu, 'memory_mib': total_mem}

    return metrics

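def parse_cpu(s: str) -> int:
    """Convert a Kubernetes CPU quantity to millicores."""
    if s.endswith('n'):  # nanocores, the unit metrics-server usually reports
        return int(s[:-1]) // 1_000_000
    if s.endswith('u'):  # microcores
        return int(s[:-1]) // 1_000
    if s.endswith('m'):  # millicores
        return int(s[:-1])
    return int(float(s) * 1000)  # whole or fractional cores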

def parse_memory(s: str) -> int:
    if s.endswith('Ki'):
        return int(s[:-2]) // 1024
    if s.endswith('Mi'):
        return int(s[:-2])
    if s.endswith('Gi'):
        return int(s[:-2]) * 1024
    return int(s) // (1024 * 1024)

def generate_recommendations(metrics: dict, buffer_percent: int = 20) -> list:
    """Generate right-sizing recommendations, adding buffer_percent of headroom on top of usage."""
    recommendations = []
    for pod, usage in metrics.items():
        cpu_rec = int(usage['cpu_millicores'] * (1 + buffer_percent / 100))
        mem_rec = int(usage['memory_mib'] * (1 + buffer_percent / 100))
        recommendations.append({
            'pod': pod,
            'recommended_cpu': f'{cpu_rec}m',
            'recommended_memory': f'{mem_rec}Mi',
            'current_cpu': usage['cpu_millicores'],
            'current_memory': usage['memory_mib'],
        })
    return recommendations

if __name__ == '__main__':
    metrics = get_pod_metrics('production')
    recs = generate_recommendations(metrics, buffer_percent=20)
    for r in recs:
        print(f"{r['pod']}: CPU {r['recommended_cpu']}, Memory {r['recommended_memory']}")
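Sizing from a percentile of historical usage, rather than from a single reading, is the idea behind P99-based right-sizing. The sketch below shows that calculation on a list of CPU samples using the nearest-rank method. The function names and the sample values are hypothetical; in practice the samples would come from a metrics store such as Prometheus.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(math.ceil(pct * len(ordered) / 100), 1)
    return ordered[rank - 1]


def recommend_request(samples_millicores: list[float], buffer_percent: int = 20) -> int:
    """P99 usage plus a headroom buffer, rounded up to whole millicores."""
    p99 = percentile(samples_millicores, 99)
    return math.ceil(p99 * (100 + buffer_percent) / 100)


# Hypothetical samples: mostly around 300m with a handful of spikes.
samples = [300.0] * 95 + [450.0, 500.0, 600.0, 800.0, 1200.0]
print(recommend_request(samples))  # P99 is 800m, so this prints 960
```

Using P99 instead of the maximum ignores the single 1200m outlier while still covering the regular spikes, which is the usual trade-off between cost and throttling risk.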

Strategy 2: Spot/Preemptible Instances

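# spot-node-pool.yaml
# The Node object below shows the labels a Spot node carries. Nodes are registered
# by the cloud provider's node pool; they are not created by applying this manifest.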
apiVersion: v1
kind: Node
metadata:
  name: spot-pool
  labels:
    cloud.google.com/gke-provisioning: spot
    node-type: spot
  annotations:
    cloud.google.com/spot: "true"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
spec:
  replicas: 5
  selector:
    matchLabels:
      app: batch-processor
  template:
    metadata:
      labels:
        app: batch-processor
    spec:
      nodeSelector:
        node-type: spot
      tolerations:
      - key: "cloud.google.com/gke-provisioning"
        operator: "Equal"
        value: "spot"
        effect: "NoSchedule"
      containers:
      - name: processor
        image: myapp/batch-processor:latest
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
      terminationGracePeriodSeconds: 60

Handling Spot instance interruptions

apiVersion: apps/v1
kind: Deployment
metadata:
  name: spot-aware-service
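spec:
  replicas: 3
  # apps/v1 Deployments require a selector that matches the template labels
  selector:
    matchLabels:
      app: spot-aware-service
  template:
    metadata:
      labels:
        app: spot-aware-service
    spec: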
      containers:
      - name: app
        image: myapp:latest
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 15"]
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: spot-aware-service
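The preStop hook only buys time; the application still has to stop cleanly when the kubelet sends SIGTERM. As a minimal sketch of that side, assuming a worker that processes items in a loop (the `GracefulWorker` class is hypothetical and the work itself is a stand-in):

```python
import signal


class GracefulWorker:
    """Processes items one at a time and stops cleanly once a stop is requested."""

    def __init__(self):
        self.stopping = False
        self.processed = []

    def request_stop(self, signum=None, frame=None):
        # The (signum, frame) signature is what signal.signal() passes to a handler.
        self.stopping = True

    def run(self, items):
        for item in items:
            if self.stopping:
                break  # leave the remaining items for another replica
            self.processed.append(item)  # stand-in for real work
        return self.processed


if __name__ == "__main__":
    worker = GracefulWorker()
    # On a Spot reclaim the Pod receives SIGTERM before it is killed.
    signal.signal(signal.SIGTERM, worker.request_stop)
    print(worker.run(range(5)))
```

The item in flight always finishes before the loop exits, so the grace period only needs to cover one unit of work rather than the whole queue.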

Strategy 3: Cluster Autoscaling

# cluster-autoscaler.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      containers:
      - image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.30.0
        name: cluster-autoscaler
        command:
        - ./cluster-autoscaler
        - --scale-down-delay-after-add=5m
        - --scale-down-unneeded-time=5m
        - --scale-down-utilization-threshold=0.5
        - --max-nodes-total=50
        # Per-group bounds use min:max:node-group-name; replace the placeholder name
        - --nodes=3:50:<node-group-name>
        - --balance-similar-node-groups
        - --expander=priority
        - --skip-nodes-with-local-storage=false
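To show what `--scale-down-utilization-threshold=0.5` means in practice, here is a simplified sketch of the utilization check, assuming utilization is the larger of the CPU and memory request ratios against allocatable capacity. The function names are hypothetical, and the real autoscaler applies further conditions (whether Pods can be rescheduled, PodDisruptionBudgets, the unneeded-time window) that are left out here.

```python
def node_utilization(cpu_requested_m: int, cpu_allocatable_m: int,
                     mem_requested_mib: int, mem_allocatable_mib: int) -> float:
    """Request-based utilization: the larger of the CPU and memory ratios."""
    return max(cpu_requested_m / cpu_allocatable_m,
               mem_requested_mib / mem_allocatable_mib)


def is_scale_down_candidate(utilization: float, threshold: float = 0.5) -> bool:
    """A node is only considered for removal while it stays below the threshold."""
    return utilization < threshold


# 1 of 4 cores and 2 of 16 GiB requested: 25% utilized, so a candidate.
util = node_utilization(1000, 4000, 2048, 16384)
print(util, is_scale_down_candidate(util))  # prints "0.25 True"
```

Note that the check uses requests, not live usage, which is why right-sizing requests first makes the autoscaler far more effective.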

Combining HPA + VPA

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-gateway-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-gateway
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 30
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-gateway-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-gateway
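  updatePolicy:
    # Recommendation-only: the HPA above already scales on CPU and memory, so letting
    # VPA apply changes to the same resources would make the two controllers fight
    updateMode: "Off"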
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: "2"
        memory: 4Gi
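The replica count the HPA aims for follows the documented rule desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), with no change while the ratio is within a tolerance of 1 (10% by default). A minimal sketch of that rule, using the min/max from the manifest above as defaults; it deliberately ignores the `behavior` policies and stabilization windows, which further limit how fast the count moves:

```python
import math


def hpa_desired_replicas(current_replicas: int, current_utilization: float,
                         target_utilization: float, min_replicas: int = 3,
                         max_replicas: int = 20, tolerance: float = 0.1) -> int:
    """Core HPA scaling rule, clamped to the configured replica bounds."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: do nothing
    desired = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, desired))


# 5 replicas at 90% CPU against a 60% target.
print(hpa_desired_replicas(5, 90, 60))  # prints 8
```

With several metrics configured, as in the manifest above, the HPA evaluates each one and takes the largest resulting replica count.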

Strategy 4: Cost Monitoring and Alerts

# kubecost-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubecost-cost-model
  namespace: kubecost
data:
  cost-model.json: |
    {
      "clusterName": "production",
      "defaultCPUPrice": "0.031611",
      "defaultRAMPrice": "0.004237",
      "spotLabel": "cloud.google.com/gke-provisioning",
      "spotLabelValue": "spot",
      "spotDiscount": 0.6
    }
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: cost-alerts
data:
  alerts.json: |
    [
      {
        "name": "daily-budget-alert",
        "type": "budget",
        "threshold": 500,
        "window": "1d",
        "aggregation": "cluster",
        "notification": {
          "type": "slack",
          "channel": "#finops-alerts"
        }
      },
      {
        "name": "namespace-spike-alert",
        "type": "spendChange",
        "threshold": 0.3,
        "window": "7d",
        "baselineWindow": "7d",
        "aggregation": "namespace",
        "notification": {
          "type": "email",
          "email": "finops-team@company.com"
        }
      }
    ]
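The two ideas in this configuration, a price-per-resource cost model and a relative spend-change alert, can be sketched in a few lines. This is an illustration rather than Kubecost's implementation: the function names are hypothetical, and it assumes the default prices are per core-hour and per GiB-hour.

```python
def hourly_cost(cpu_cores: float, ram_gib: float,
                cpu_price: float = 0.031611, ram_price: float = 0.004237) -> float:
    """Hourly cost of a workload from its CPU and memory allocation."""
    return cpu_cores * cpu_price + ram_gib * ram_price


def spend_change(current_cost: float, baseline_cost: float) -> float:
    """Relative change of the current window against the baseline window."""
    if baseline_cost <= 0:
        raise ValueError("baseline cost must be positive")
    return (current_cost - baseline_cost) / baseline_cost


def should_alert(current_cost: float, baseline_cost: float, threshold: float = 0.3) -> bool:
    """Fire when spend grew by more than the threshold (0.3 = 30%)."""
    return spend_change(current_cost, baseline_cost) > threshold


# A namespace that cost 100 last week and 140 this week grew by 40%.
print(should_alert(140.0, 100.0))  # prints True
```

Comparing against a baseline window rather than a fixed amount keeps the alert meaningful for namespaces of very different sizes.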

Strategy 5: Scheduled Scaling for Dev Environments

# dev-namespace-schedule.yaml
# Schematic schedule resource: confirm that this CRD is installed in your cluster.
# Tools such as kube-downscaler express the same schedule through annotations.
apiVersion: zalando.org/v1
kind: ScheduleSwitch
metadata:
  name: dev-environment-schedule
  namespace: dev
spec:
  switches:
  - startTime: "0 8 * * 1-5"
    endTime: "0 20 * * 1-5"
    replicas: 1
    description: "Running on weekdays 08:00-20:00"
  - startTime: "0 20 * * 1-5"
    endTime: "0 8 * * 1-5"
    replicas: 0
    description: "Scale to 0 outside business hours"
  - startTime: "0 0 * * 0,6"
    endTime: "0 0 * * 1"
    replicas: 0
    description: "Scale to 0 on weekends"
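The schedule above reduces to a simple rule: one replica on weekdays between 08:00 and 20:00, zero otherwise. The sketch below encodes that rule and works out how many hours per week it saves; the function names are hypothetical and time zones are ignored for brevity.

```python
from datetime import datetime

HOURS_PER_WEEK = 7 * 24


def scheduled_replicas(now: datetime) -> int:
    """1 replica on weekdays 08:00-20:00, 0 at all other times."""
    is_weekday = now.weekday() < 5  # Monday is 0, Friday is 4
    in_business_hours = 8 <= now.hour < 20
    return 1 if is_weekday and in_business_hours else 0


def weekly_running_hours() -> int:
    """Hours per week during which the environment is up."""
    return 5 * (20 - 8)


saved = 1 - weekly_running_hours() / HOURS_PER_WEEK
print(f"running {weekly_running_hours()}h of {HOURS_PER_WEEK}h, {saved:.0%} fewer hours")
```

The environment runs 60 of 168 hours, so roughly two thirds of its compute hours disappear, provided the nodes it frees are actually scaled down.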

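Pitfall Guide

No.	Pitfall	Symptom	Fix
1	VPA in Auto mode restarts Pods frequently	Reduced service availability	Start in Off mode to observe the recommendations, then switch to Auto once they look right
2	Spot reclamation interrupts many Pods at once	Widespread 503s	Spread Pods across zones with topologySpreadConstraints and shut down gracefully with preStop
3	HPA and VPA act at the same time and conflict	Replica count and resource sizes oscillate	Let HPA manage replica count and VPA manage resource sizes; never drive both from the same metric
4	Cluster Autoscaler removes important nodes	Stateful Pods get evicted	Annotate the Pods that must not move with `cluster-autoscaler.kubernetes.io/safe-to-evict: "false"`, or the node itself with `cluster-autoscaler.kubernetes.io/scale-down-disabled: "true"`
5	Inconsistent cost-allocation tags	Costs cannot be rolled up by team or project	Define a tagging policy and enforce it with Kyverno/OPA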
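For pitfall 5, the core of any tagging policy is a check that every workload carries the labels cost allocation depends on. A minimal sketch of that check follows; the label keys `team` and `cost-center` are a hypothetical policy, and in a cluster the same rule would normally be enforced by an admission policy rather than a script.

```python
REQUIRED_LABELS = ("team", "cost-center")  # hypothetical policy


def missing_cost_labels(labels: dict[str, str],
                        required: tuple[str, ...] = REQUIRED_LABELS) -> list[str]:
    """Return the required label keys that are absent or empty."""
    return [key for key in required if not labels.get(key)]


# A Deployment labelled with a team but no cost center fails the policy.
print(missing_cost_labels({"app": "order-service", "team": "payments"}))
# prints ['cost-center']
```

Treating an empty value the same as a missing key avoids workloads that technically pass the policy but still cannot be attributed to anyone.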

Error Troubleshooting

Error message	Cause	Fix
`metrics-server: no metrics available`	metrics-server is not installed or not Ready	Install metrics-server and verify with `kubectl top nodes`
`VPA recommender: OOMKilled`	The VPA Recommender ran out of memory	Increase the recommender's memory request (and limit)
`HPA: unable to get metric`	Custom metric is not registered	Check Prometheus Adapter and the custom metrics API
`ClusterAutoscaler: node group not found`	Node group is misconfigured	Check the `--nodes` parameter format: `min:max:node-group-name`
`Spot node: preempted`	The Spot instance was reclaimed by the cloud provider	Expected behavior; keep enough replicas and a suitable PodDisruptionBudget
`PodDisruptionBudget: not enough replicas`	The PDB is too strict and blocks scale-down	Adjust the PDB's `minAvailable` or `maxUnavailable`
`Kubecost: pricing data not available`	The cloud pricing API is unreachable	Configure custom pricing or fall back to `defaultCPUPrice`/`defaultRAMPrice`
`Scale to 0: jobs still running`	Unfinished Jobs block scale-down	Wait for the Jobs to finish or set `activeDeadlineSeconds`
`ResourceQuota: exceeded quota`	Tenant quota exceeded after right-sizing	Adjust the ResourceQuota or coordinate with the platform team
`Node: NotReady after scale-up`	The new node failed to initialize	Check the node startup script and init containers

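Advanced Optimization

1. Reserved Instances / Savings Plans

Plan	Discount	Flexibility	Best for
On-Demand	0%	Highest	Temporary or bursty load
Spot/Preemptible	60-90%	Low (can be reclaimed)	Stateless or interruptible workloads
1-year Reserved	30-40%	Medium	Stable baseline load
3-year Reserved	50-60%	Low	Long-lived core services
Savings Plans	20-40%	High (applies across instance types)	Mixed workloads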
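A reservation is paid for whether or not the capacity is used, so it only wins once the workload runs for a large enough share of the time. With on-demand price P, discount d and utilization u (the fraction of hours the capacity is needed), on-demand costs u × P and the reservation costs (1 - d) × P, so the reservation is cheaper when u > 1 - d. A small sketch of that break-even, with hypothetical function names:

```python
def break_even_utilization(discount: float) -> float:
    """Utilization above which a reservation beats paying on-demand."""
    return 1.0 - discount


def cheaper_option(utilization: float, discount: float) -> str:
    """Compare u * P (on-demand) with (1 - d) * P (reserved) for the same capacity."""
    if utilization > break_even_utilization(discount):
        return "reserved"
    return "on-demand"


# A 1-year reservation at 35% off pays for itself above 65% utilization.
print(cheaper_option(0.90, 0.35))  # prints "reserved"
print(cheaper_option(0.40, 0.35))  # prints "on-demand"
```

This is why the table pairs reservations with stable baseline load: bursty capacity rarely stays above the break-even point.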

2. Multi-cluster cost optimization

apiVersion: kubecost.com/v1
kind: MultiClusterCost
spec:
  clusters:
  - name: us-east-prod
    apiEndpoint: https://k8s-us-east.example.com
    costWeight: 1.0
  - name: eu-west-prod
    apiEndpoint: https://k8s-eu-west.example.com
    costWeight: 0.8
  aggregation:
    byTeam: true
    byService: true
    byNamespace: true

3. Smart scheduling optimization

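# Bin-packing profile: MostAllocated scoring fills nodes before spreading, which lets
# the Cluster Autoscaler remove empty ones. In the v1 config API this is set through
# NodeResourcesFit; the NodeResourcesLeastAllocated/MostAllocated plugins no longer exist.
# A custom score plugin (for example a CostAwarePriority plugin) must be compiled into
# the scheduler before it can be enabled in a profile.
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: cost-aware-scheduler
  pluginConfig:
  - name: NodeResourcesFit
    args:
      scoringStrategy:
        type: MostAllocated
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1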

Comparison

FinOps tool	Cost visibility	Right-sizing recommendations	Spot support	Open source	Pricing
Kubecost	★★★★★	★★★★★	★★★★	Yes	Free / paid enterprise tier
CloudHealth	★★★★★	★★★★	★★★	No	% of cloud spend
AWS Cost Explorer	★★★★	★★★	★★★	No	Free
Prometheus + Grafana	★★★	★★	★	Yes	Free
Vantage	★★★★	★★★★	★★★	No	Per-seat

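Summary: K8s cost optimization is not an across-the-board cut but a systematic effort across the three FinOps phases. Inform makes costs transparent, Optimize removes waste through right-sizing, Spot, and autoscaling, and Operate keeps costs under control with alerts and policy. Combining the five strategies: right-sizing saves 40%, Spot instances 20%, autoscaling 15%, dev-environment scheduling 10%, and cost monitoring 5% by preventing rebound. These figures sum to 90%, so they cannot simply be added. If each one is taken to apply to the spend left after the previous one, they compound to 1 - 0.60 × 0.80 × 0.85 × 0.90 × 0.95 ≈ 65%, which supports a combined reduction of roughly 60%. In 2026, running K8s without FinOps is like being a shopaholic who never looks at the bill.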
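The per-strategy percentages in the summary can be combined with a short calculation. The sketch below assumes each saving applies to the spend that remains after the previous strategy, which is an interpretation rather than something the figures themselves state; the function name is hypothetical.

```python
import math


def combined_reduction(savings: list[float]) -> float:
    """Total reduction when each saving applies to the spend left by the previous one."""
    remaining = math.prod(1.0 - s for s in savings)
    return 1.0 - remaining


# Right-sizing, Spot, autoscaling, dev scheduling, cost monitoring.
strategy_savings = [0.40, 0.20, 0.15, 0.10, 0.05]
print(f"naive sum: {sum(strategy_savings):.0%}")
print(f"compounded: {combined_reduction(strategy_savings):.1%}")
```

The naive sum comes to 90%, while the compounded figure is about 65%, which is the one consistent with a headline target of around 60%.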
