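Kubernetes Network Security Policies in Practice: 6 Defense Patterns from Default Deny to Zero Trust
Is your K8s cluster's network running unprotected?
In a Kubernetes cluster with default settings, every Pod can talk to every other Pod: a frontend Pod can reach a database Pod directly, the test namespace can reach the production namespace, and a compromised Pod can move laterally to any service in the cluster. In 2025, a financial company was breached through a Pod that had no network policy; within 30 minutes the attacker moved laterally to the payment system and stole 2 million user records. This is not a movie plot; it is a real security incident.
Kubernetes NetworkPolicy is the foundation of cluster network security. From default deny to micro-segmentation, from Cilium eBPF to a zero-trust architecture, this article covers 6 defense patterns so that your cluster network is no longer left unprotected.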
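Core concepts at a glance
| Concept | Description | Keywords |
|---|---|---|
| NetworkPolicy | Native K8s network policy resource that controls traffic between Pods | ingress/egress, selector |
| Default Deny | Deny all traffic by default and explicitly allow legitimate traffic | allowlist, zero-trust foundation |
| Micro-segmentation | Fine-grained, label-based network isolation | label selectors, namespace isolation |
| Cilium | eBPF-based CNI plugin with L3-L7 policy support | eBPF, L7 policy, observability |
| eBPF | In-kernel programmable technology for high-performance packet filtering | kernel space, zero copy, XDP |
| mTLS | Mutual TLS authentication with encrypted service-to-service traffic | certificate rotation, identity |
| Zero Trust | Network architecture built on "never trust, always verify" | continuous verification, least privilege |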
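Problem analysis: 5 major challenges in K8s network security
| Challenge | Current state | Risk level | Root cause |
|---|---|---|---|
| Everything open by default | No network restrictions between Pods in the cluster | 🔴 Critical | K8s ships with no NetworkPolicy |
| Lateral movement | After compromising one Pod, an attacker can reach every service | 🔴 Critical | No micro-segmentation |
| Policy explosion | The number of NetworkPolicy objects grows out of control in large clusters | 🟡 Medium | Poor label design |
| DNS dependency | Service discovery depends on CoreDNS, yet DNS policies are missing | 🟡 Medium | DNS-layer security is overlooked |
| Poor observability | The effect of network policies is hard to verify and audit | 🟠 High | No policy auditing tools |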
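Pattern 1: Default deny for all traffic
Default deny is the first step toward a zero-trust network. In a namespace without any NetworkPolicy, all Pods can communicate freely, which is the most dangerous state.
Namespace-level default deny
# Deny all inbound traffic to every Pod in the namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}        # an empty selector matches every Pod
  policyTypes:
    - Ingress
---
# Deny all outbound traffic from every Pod in the namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Egress
---
# Deny both directions with a single policy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress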
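To make the isolation model behind these manifests concrete, here is a minimal Python sketch of how a Pod becomes "isolated" in a direction once any policy selects it. It is a simplification under stated assumptions: only `matchLabels` is modeled (no `matchExpressions`), a missing `policyTypes` is treated as `["Ingress"]`, and the helper names `selects` and `isolated_directions` are hypothetical, not part of any Kubernetes API.

```python
def selects(pod_selector, pod_labels):
    """True if every matchLabels pair is present on the Pod; {} selects all Pods."""
    wanted = pod_selector.get("matchLabels", {})
    return all(pod_labels.get(k) == v for k, v in wanted.items())


def isolated_directions(pod_labels, policies):
    """Return the directions in which at least one policy isolates this Pod."""
    directions = set()
    for policy in policies:
        spec = policy["spec"]
        if selects(spec.get("podSelector", {}), pod_labels):
            # Simplification: a missing policyTypes is treated as Ingress only.
            directions.update(spec.get("policyTypes", ["Ingress"]))
    return directions


# Same content as the default-deny-all manifest above, as a Python dict.
default_deny_all = {
    "metadata": {"name": "default-deny-all", "namespace": "production"},
    "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
}

print(sorted(isolated_directions({"app": "web"}, [])))                  # []
print(sorted(isolated_directions({"app": "web"}, [default_deny_all])))  # ['Egress', 'Ingress']
```

With no policy the Pod is not isolated in any direction, so all traffic flows; once `default-deny-all` selects it, both directions are closed until other policies add allow rules.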
Allow DNS resolution (required once egress is denied)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        # kubernetes.io/metadata.name is set automatically on every namespace
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
Apply default deny to all namespaces in bulk
#!/bin/bash
# Apply a default-deny policy to every namespace except the system ones.
NAMESPACES=$(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}')
for ns in $NAMESPACES; do
  if [ "$ns" = "kube-system" ] || [ "$ns" = "kube-public" ]; then
    echo "Skipping system namespace: $ns"
    continue
  fi
  kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: $ns
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
EOF
  echo "Applied default-deny-all to namespace: $ns"
done
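The same loop can be sketched in Python when the manifests should be generated and reviewed before anything is applied. This is only an illustration: the namespace list is hard-coded and hypothetical, the function names are made up for the example, and the output relies on `kubectl` accepting JSON manifests wrapped in a `v1` `List`.

```python
import json

# Namespaces skipped by the shell script above.
SYSTEM_NAMESPACES = {"kube-system", "kube-public"}


def default_deny_manifest(namespace):
    """Build the default-deny-all NetworkPolicy for one namespace as a dict."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def manifests_for(namespaces):
    """Return one manifest per namespace, leaving system namespaces untouched."""
    return [default_deny_manifest(ns) for ns in namespaces if ns not in SYSTEM_NAMESPACES]


if __name__ == "__main__":
    # Hypothetical input; in practice the names would come from `kubectl get namespaces`.
    docs = manifests_for(["default", "kube-system", "production"])
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": docs}, indent=2))
```

Assuming the script is saved under a name such as `gen_default_deny.py` (hypothetical), its output could be piped to `kubectl apply --dry-run=server -f -` to check it against the API server without changing the cluster.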
Verify the default-deny policy
kubectl get networkpolicy -n production
kubectl describe networkpolicy default-deny-all -n production
# With default deny in place, this request should time out
kubectl run test-client --image=busybox:1.36 -n production --restart=Never --rm -it -- \
  wget -qO- --timeout=2 http://api-service.production.svc.cluster.local:8080
Pattern 2: Label-based micro-segmentation
Micro-segmentation uses label selectors to control Pod-to-Pod access at a fine granularity; it is the core capability of NetworkPolicy.
Micro-segmentation for a three-tier application
# Frontend: accepts traffic from the ingress controller, talks only to the backend and DNS
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: web
      tier: frontend
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        # namespaceSelector and podSelector in the same list item are ANDed
        - namespaceSelector:
            matchLabels:
              env: production
          podSelector:
            matchLabels:
              app: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
        - protocol: TCP
          port: 8443
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: api
              tier: backend
      ports:
        - protocol: TCP
          port: 8080
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
---
# Backend: accepts traffic from the frontend, talks only to PostgreSQL, Redis and DNS
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
      tier: backend
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: web
              tier: frontend
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgres
              tier: database
      ports:
        - protocol: TCP
          port: 5432
    - to:
        - podSelector:
            matchLabels:
              app: redis
              tier: cache
      ports:
        - protocol: TCP
          port: 6379
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
---
# Database: accepts traffic only from the backend, egress limited to DNS
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: database-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      tier: database
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: api
              tier: backend
      ports:
        - protocol: TCP
          port: 5432
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
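The effect of the three policies above can be checked on paper with a small Python sketch that evaluates ingress rules against source labels. It is deliberately narrow: it models only same-namespace `podSelector` peers with `matchLabels`, ignores `namespaceSelector` and `ipBlock` peers, and the function names are hypothetical helpers rather than anything Kubernetes provides.

```python
def labels_match(selector, labels):
    """True if all matchLabels pairs appear in `labels`; {} matches everything."""
    wanted = selector.get("matchLabels", {})
    return all(labels.get(k) == v for k, v in wanted.items())


def ingress_allowed(policy, src_labels, port, protocol="TCP"):
    """True if any ingress rule admits a same-namespace source on the given port."""
    for rule in policy["spec"].get("ingress", []):
        peers = rule.get("from", [])
        ports = rule.get("ports", [])
        # Only plain podSelector peers are modeled in this sketch.
        peer_ok = not peers or any(
            "podSelector" in p and "namespaceSelector" not in p
            and labels_match(p["podSelector"], src_labels)
            for p in peers
        )
        port_ok = not ports or any(
            p.get("port") == port and p.get("protocol", "TCP") == protocol
            for p in ports
        )
        if peer_ok and port_ok:
            return True
    return False


# Ingress sections of backend-policy and database-policy, copied from above.
backend_policy = {"spec": {"ingress": [{
    "from": [{"podSelector": {"matchLabels": {"app": "web", "tier": "frontend"}}}],
    "ports": [{"protocol": "TCP", "port": 8080}],
}]}}
database_policy = {"spec": {"ingress": [{
    "from": [{"podSelector": {"matchLabels": {"app": "api", "tier": "backend"}}}],
    "ports": [{"protocol": "TCP", "port": 5432}],
}]}}

frontend = {"app": "web", "tier": "frontend"}
backend = {"app": "api", "tier": "backend"}

print(ingress_allowed(backend_policy, frontend, 8080))   # True
print(ingress_allowed(database_policy, backend, 5432))   # True
print(ingress_allowed(database_policy, frontend, 5432))  # False
```

The last line is the point of the three-tier layout: the frontend has no path to the database, so a compromised web Pod cannot reach PostgreSQL directly.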
Cross-namespace policy
# Allow Prometheus in the monitoring namespace to reach Pods in production
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-monitoring
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              purpose: monitoring
          podSelector:
            matchLabels:
              app: prometheus
      ports:
        - protocol: TCP
          port: 9090
Namespace label management
kubectl label namespace monitoring purpose=monitoring
kubectl label namespace staging env=staging
kubectl label namespace production env=production
# kubernetes.io/metadata.name is set automatically on every namespace
# (Kubernetes 1.21 and later), so kube-system needs no manual label
kubectl get namespaces --show-labels
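Pattern 3: Advanced network policies with Cilium eBPF
Cilium is built on eBPF and goes beyond the L3/L4 limits of native K8s NetworkPolicy, adding L7 policies for HTTP, gRPC and Kafka.
Install Cilium
helm repo add cilium https://helm.cilium.io/
# Recent Cilium releases take true/false for kubeProxyReplacement;
# the older value "strict" is deprecated
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set hubble.enabled=true \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set operator.prometheus.enabled=true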
L7 HTTP policy
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: l7-http-policy
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api
      tier: backend
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: web
            tier: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            # Only these method/path combinations are allowed; path is a regex
            http:
              - method: GET
                path: "/api/v1/.*"
              - method: POST
                path: "/api/v1/orders"
              - method: PUT
                path: "/api/v1/orders/.*"
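To see which requests an allowlist like this admits, the rules can be replayed in a few lines of Python. This is a sketch under one assumption: method and path are treated as regular expressions that must match the whole string (hence `re.fullmatch`); the function name `http_allowed` is made up for the example and is not a Cilium API.

```python
import re

# (method, path regex) pairs copied from the l7-http-policy rules above.
HTTP_RULES = [
    ("GET", "/api/v1/.*"),
    ("POST", "/api/v1/orders"),
    ("PUT", "/api/v1/orders/.*"),
]


def http_allowed(method, path, rules=HTTP_RULES):
    """True if at least one rule matches both the method and the full path."""
    return any(
        re.fullmatch(rule_method, method) and re.fullmatch(rule_path, path)
        for rule_method, rule_path in rules
    )


print(http_allowed("GET", "/api/v1/users"))         # True
print(http_allowed("PUT", "/api/v1/orders/42"))     # True
print(http_allowed("POST", "/api/v1/orders/42"))    # False
print(http_allowed("DELETE", "/api/v1/orders/42"))  # False
```

Anything not listed is rejected, which is why a frontend that only needs to read and create orders never gets a usable `DELETE`.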
Kafka protocol policy
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: kafka-policy
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: kafka
  ingress:
    # order-service may produce to and consume from the orders topic only
    - fromEndpoints:
        - matchLabels:
            app: order-service
      toPorts:
        - ports:
            - port: "9092"
              protocol: TCP
          rules:
            kafka:
              - role: produce
                topic: orders
              - role: consume
                topic: orders
    # payment-service may produce to and consume from the payments topic only
    - fromEndpoints:
        - matchLabels:
            app: payment-service
      toPorts:
        - ports:
            - port: "9092"
              protocol: TCP
          rules:
            kafka:
              - role: produce
                topic: payments
              - role: consume
                topic: payments
DNS-based egress policy
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: external-api-egress
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api
      tier: backend
  egress:
    - toFQDNs:
        - matchName: "api.stripe.com"
        - matchName: "api.sendgrid.com"
        - matchPattern: "*.amazonaws.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
    # DNS traffic must pass through Cilium's DNS proxy for toFQDNs to work
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: UDP
          rules:
            dns:
              - matchPattern: "*"
Hubble observability
cilium hubble port-forward &
hubble observe --namespace production --since 1m
hubble observe --namespace production --label app=api --verdict DROPPED
hubble observe --namespace production --http-path "/api/v1/.*" --http-method GET
Pattern 4: DNS-based network policies
Native NetworkPolicy cannot express domain-based rules, but Cilium and Calico extend it with this capability, which makes egress control more flexible.
Cilium FQDN policy
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-external-services
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: payment-service
  egress:
    - toFQDNs:
        - matchName: "api.stripe.com"
        - matchName: "api.paypal.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
    - toFQDNs:
        - matchName: "s3.amazonaws.com"
        - matchPattern: "*.s3.amazonaws.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: UDP
          rules:
            dns:
              - matchPattern: "*"
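The difference between `matchName` and `matchPattern` is easier to see with a short Python sketch. It assumes that `*` in a pattern stands for zero or more DNS-label characters and does not cross a dot, and that a lone `*` matches every name; verify this against the documentation of the Cilium version in use. The helper names are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate a matchPattern into a regex in which `*` stays inside one DNS label."""
    parts = [re.escape(chunk) for chunk in pattern.lower().split("*")]
    return re.compile("[-a-z0-9_]*".join(parts))


def fqdn_allowed(name, match_names=(), match_patterns=()):
    """True if `name` equals a matchName or fully matches one matchPattern."""
    name = name.lower().rstrip(".")
    if name in (n.lower() for n in match_names):
        return True
    for pattern in match_patterns:
        # Assumption: a lone "*" is a special case that matches any name.
        if pattern == "*" or pattern_to_regex(pattern).fullmatch(name):
            return True
    return False


names = ["s3.amazonaws.com"]
patterns = ["*.s3.amazonaws.com"]
print(fqdn_allowed("s3.amazonaws.com", names, patterns))            # True
print(fqdn_allowed("my-bucket.s3.amazonaws.com", names, patterns))  # True
print(fqdn_allowed("s3.amazonaws.com", match_patterns=patterns))    # False
print(fqdn_allowed("evil.example.com", names, patterns))            # False
```

Under this reading the pattern alone does not cover the bare domain, which is why the policy above lists `s3.amazonaws.com` as a `matchName` next to `*.s3.amazonaws.com`.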
Calico GlobalNetworkPolicy DNS policy
# Note: destination.domains is available in Calico Enterprise / Calico Cloud;
# open-source Calico does not support domain-based rules
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: allow-external-dns
spec:
  selector: app == "payment-service"
  order: 100
  types:
    - Egress
  egress:
    - action: Allow
      protocol: TCP
      destination:
        domains:
          - "api.stripe.com"
          - "api.paypal.com"
        ports:
          - 443
    - action: Allow
      protocol: UDP
      destination:
        selector: k8s-app == "kube-dns"
        ports:
          - 53
DNS policy monitoring
# observe is a subcommand of the hubble CLI (run `cilium hubble port-forward &` first)
hubble observe --protocol dns --namespace production
hubble observe --fqdn "api.stripe.com" --namespace production
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=100
kubectl get endpoints kube-dns -n kube-system
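Pattern 5: Service mesh mTLS
A service mesh uses sidecar proxies to provide automatic mTLS, giving service-to-service traffic both encryption and identity authentication.
Istio strict mTLS mode
# Mesh-wide: a PeerAuthentication in the root namespace applies to all workloads
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
---
# Workload-level: strict mTLS for backend Pods, including port 8080
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: backend-mtls
  namespace: production
spec:
  selector:
    matchLabels:
      tier: backend
  mtls:
    mode: STRICT
  portLevelMtls:
    8080:
      mode: STRICT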
Istio AuthorizationPolicy
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: backend-authz
  namespace: production
spec:
  selector:
    matchLabels:
      app: api
      tier: backend
  rules:
    # The frontend service account may call GET/POST under /api/v1/
    - from:
        - source:
            principals:
              - "cluster.local/ns/production/sa/frontend"
      to:
        - operation:
            methods: ["GET", "POST"]
            paths: ["/api/v1/*"]
    # Prometheus may scrape /metrics
    - from:
        - source:
            namespaces: ["monitoring"]
            principals:
              - "cluster.local/ns/monitoring/sa/prometheus"
      to:
        - operation:
            methods: ["GET"]
            paths: ["/metrics"]
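The identity check behind these rules can be sketched in Python: the principal is derived from the workload's namespace and service account, and a request passes only if principal, method and path all match one rule. This is an illustration, not Istio's implementation; it models only exact paths and a trailing `*` as a prefix match, and all function names are hypothetical.

```python
def spiffe_principal(namespace, service_account, trust_domain="cluster.local"):
    """Build a principal string in the trust-domain/ns/<ns>/sa/<sa> form used above."""
    return f"{trust_domain}/ns/{namespace}/sa/{service_account}"


def path_matches(pattern, path):
    """Exact match, or prefix match when the pattern ends with `*`."""
    if pattern.endswith("*"):
        return path.startswith(pattern[:-1])
    return path == pattern


# The two rules of backend-authz, reduced to the fields this sketch evaluates.
RULES = [
    {"principal": spiffe_principal("production", "frontend"),
     "methods": ["GET", "POST"], "paths": ["/api/v1/*"]},
    {"principal": spiffe_principal("monitoring", "prometheus"),
     "methods": ["GET"], "paths": ["/metrics"]},
]


def authorized(principal, method, path, rules=RULES):
    """True if one rule matches the caller identity, the method and the path."""
    return any(
        principal == rule["principal"]
        and method in rule["methods"]
        and any(path_matches(p, path) for p in rule["paths"])
        for rule in rules
    )


frontend = spiffe_principal("production", "frontend")
prometheus = spiffe_principal("monitoring", "prometheus")
print(authorized(frontend, "GET", "/api/v1/orders"))       # True
print(authorized(frontend, "DELETE", "/api/v1/orders/1"))  # False
print(authorized(prometheus, "GET", "/metrics"))           # True
print(authorized(prometheus, "GET", "/api/v1/orders"))     # False
```

Because the decision is tied to the certificate identity rather than an IP address, a Pod that merely sits in the right network segment still fails the check.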
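Cilium Cluster Mesh mTLS
# This policy matches peers by service-account identity. Both rules belong under
# a single ingress key (a duplicated key would silently drop the first rule),
# and fromEndpoints is used because fromRequires only adds a constraint and
# does not allow traffic on its own.
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: allow-mtls-traffic
spec:
  endpointSelector: {}
  ingress:
    - fromEndpoints:
        - matchLabels:
            io.cilium.k8s.policy.serviceaccount: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
    - fromEndpoints:
        - matchLabels:
            io.cilium.k8s.policy.serviceaccount: monitoring
      toPorts:
        - ports:
            - port: "9090"
              protocol: TCP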
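Certificate management
istioctl analyze -n production
istioctl proxy-config secret deploy/frontend.production
# Certificate resources exist only if cert-manager is installed
kubectl get certificates -n production
kubectl describe certificate backend-cert -n production
# Citadel was merged into istiod, so read the CA logs from istiod
kubectl logs -n istio-system -l app=istiod --tail=50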
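Pattern 6: Zero-trust network architecture blueprint
Zero trust is not a single technology but a set of security architecture principles: never trust, always verify, least privilege.
Zero-trust network layer
# Baseline: Pods accept traffic only from their own namespace and may only
# send DNS queries. Per-application policies then tighten this further.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: zero-trust-foundation
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector: {}
      ports: []          # an empty list means all ports
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53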
Zero-trust identity layer
# fromEndpoints is used here because fromRequires only adds a constraint
# and does not allow traffic on its own
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: identity-based-policy
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api
      tier: backend
      env: production
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: web
            tier: frontend
            env: production
            io.cilium.k8s.policy.serviceaccount: frontend-sa
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: "/api/v1/.*"
              - method: POST
                path: "/api/v1/orders"
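Zero-trust audit layer
# API server audit policy: record every change to network policies.
# "patch" is included because kubectl apply updates objects with PATCH requests.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: RequestResponse
    resources:
      - group: networking.k8s.io
        resources: ["networkpolicies"]
    verbs: ["create", "update", "patch", "delete"]
  - level: Metadata
    resources:
      - group: cilium.io
        resources: ["ciliumnetworkpolicies", "ciliumclusterwidenetworkpolicies"]
    verbs: ["create", "update", "patch", "delete"]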
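Zero-trust observability
# observe is a subcommand of the hubble CLI (run `cilium hubble port-forward &` first)
hubble observe --namespace production --type trace --type drop
hubble observe --verdict DROPPED --since 5m --namespace production
kubectl get ciliumnetworkpolicies -A
kubectl get ciliumclusterwidenetworkpolicies
kubectl get networkpolicies -A
# --test-namespace selects where the test workloads run;
# --namespace would instead name the namespace Cilium is installed in
cilium connectivity test --test-namespace production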
Zero-trust architecture verification script
#!/bin/bash
echo "=== Zero Trust Network Audit ==="
echo "[1] Checking default deny policies..."
for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do
  count=$(kubectl get networkpolicy -n "$ns" 2>/dev/null | grep -c "default-deny" || true)
  if [ "$count" -eq 0 ] && [ "$ns" != "kube-system" ]; then
    echo "  WARNING: No default-deny policy in namespace: $ns"
  fi
done
echo "[2] Checking mTLS status..."
# List PeerAuthentication objects; proxy-config secret needs a specific Pod name
kubectl get peerauthentication -A 2>/dev/null || echo "  Istio not installed or no PeerAuthentication found"
echo "[3] Checking Cilium policy status..."
# policy get is an agent command, so run it inside a Cilium Pod
# (the in-Pod binary is cilium-dbg in recent releases, cilium in older ones)
kubectl -n kube-system exec ds/cilium -- cilium-dbg policy get 2>/dev/null || echo "  Cilium not available"
echo "[4] Checking for overly permissive policies..."
kubectl get networkpolicies -A -o json | \
  python3 -c "
import json, sys
policies = json.load(sys.stdin)
for p in policies.get('items', []):
    ns = p['metadata']['namespace']
    name = p['metadata']['name']
    # A rule with neither peers nor ports allows all traffic in that direction
    ingress = p.get('spec', {}).get('ingress', [])
    for i in ingress:
        if not i.get('from') and not i.get('ports'):
            print(f'  WARNING: {ns}/{name} has empty ingress from selector')
    egress = p.get('spec', {}).get('egress', [])
    for e in egress:
        if not e.get('to') and not e.get('ports'):
            print(f'  WARNING: {ns}/{name} has empty egress to selector')
"
echo "=== Audit Complete ==="
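Step [4] of the script is the part most worth testing in isolation. A possible standalone version is sketched below as a function that takes the parsed output of `kubectl get networkpolicies -A -o json` and returns its findings; the function name and the sample data are hypothetical.

```python
def find_permissive_rules(policy_list):
    """Return warnings for rules that restrict neither peers nor ports."""
    warnings = []
    for policy in policy_list.get("items", []):
        ns = policy["metadata"].get("namespace", "default")
        name = policy["metadata"]["name"]
        spec = policy.get("spec", {})
        for direction, peer_key in (("ingress", "from"), ("egress", "to")):
            for rule in spec.get(direction) or []:
                # An empty rule such as `- {}` allows all traffic in that direction.
                if not rule.get(peer_key) and not rule.get("ports"):
                    warnings.append(f"{ns}/{name}: {direction} rule allows all traffic")
    return warnings


# Hypothetical sample: one wide-open policy and one properly scoped policy.
sample = {"items": [
    {"metadata": {"namespace": "staging", "name": "allow-all"},
     "spec": {"podSelector": {}, "ingress": [{}]}},
    {"metadata": {"namespace": "production", "name": "backend-policy"},
     "spec": {"podSelector": {}, "ingress": [
         {"from": [{"podSelector": {}}], "ports": [{"port": 8080}]}]}},
]}

for warning in find_permissive_rules(sample):
    print(warning)  # staging/allow-all: ingress rule allows all traffic
```

Keeping the check in a function makes it easy to run against exported JSON in CI instead of only against a live cluster.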
5 common pitfalls
Pitfall 1: Forgetting to allow DNS traffic
# ❌ Wrong: once all egress is denied, DNS lookups fail as well
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# ✅ Right: DNS egress must be allowed explicitly
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
Pitfall 2: Namespace is missing labels
# ❌ Wrong: the namespaceSelector matches no namespace because the label was never set
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-monitoring
spec:
  podSelector: {}
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              purpose: monitoring
---
# ✅ Right: label the namespace first
# kubectl label namespace monitoring purpose=monitoring
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-monitoring
spec:
  podSelector: {}
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              purpose: monitoring
          podSelector:
            matchLabels:
              app: prometheus
Pitfall 3: The CNI does not support NetworkPolicy
# ❌ Wrong: flannel does not implement NetworkPolicy, so policies have no effect
# Using flannel as the CNI
# ✅ Right: use a CNI that supports NetworkPolicy, and check which one is running
# kubectl get pods -n kube-system -l k8s-app=calico-node
# kubectl get pods -n kube-system -l k8s-app=cilium
# kubectl get pods -n kube-system -l app=antrea
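Pitfall 4: Expecting policy order to override rules
# ❌ Wrong: writing an allow policy first and a deny policy afterwards;
# the deny does not override the allow
# NetworkPolicy is additive and has no notion of priority
# ✅ Right: NetworkPolicy is an allowlist model in which all policies are combined
# If you need priorities, use Calico GlobalNetworkPolicy or Cilium policies
# Calico's order field controls precedence (lower values are evaluated first)
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: deny-suspicious
spec:
  order: 50
  selector: all()
  types:
    - Ingress
  ingress:
    - action: Deny
      source:
        selector: app == "compromised-service"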
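The contrast with the additive model can be sketched in Python: with ordered policies, evaluation walks the policies from the lowest `order` upward and the first matching rule decides. This is a simplified model, not Calico's engine: selectors are reduced to plain label dictionaries, unmatched traffic is treated as denied, and the `allow-all-apps` policy is a hypothetical companion to `deny-suspicious`.

```python
def evaluate(policies, src_labels):
    """Return 'Allow' or 'Deny' using ordered, first-match-wins evaluation."""
    for policy in sorted(policies, key=lambda p: p["order"]):
        for rule in policy["rules"]:
            # An empty source dict matches every source.
            if all(src_labels.get(k) == v for k, v in rule["source"].items()):
                return rule["action"]
    # Nothing matched: this sketch treats the traffic as denied.
    return "Deny"


POLICIES = [
    # Hypothetical broad allow with a higher order value (evaluated later).
    {"name": "allow-all-apps", "order": 100,
     "rules": [{"action": "Allow", "source": {}}]},
    # The deny from the manifest above, evaluated first because 50 < 100.
    {"name": "deny-suspicious", "order": 50,
     "rules": [{"action": "Deny", "source": {"app": "compromised-service"}}]},
]

print(evaluate(POLICIES, {"app": "compromised-service"}))  # Deny
print(evaluate(POLICIES, {"app": "web"}))                  # Allow
```

In the purely additive NetworkPolicy model there is no way to express the first result, since any matching allow would win regardless of what other policies say.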
Pitfall 5: Ignoring the kube-system namespace
# ❌ Wrong: applying default deny to kube-system as well breaks cluster functionality
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: kube-system
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# ✅ Right: kube-system needs special handling that allows the necessary traffic
# Either skip the default-deny policy for kube-system,
# or define precise allow policies for its critical components
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: kube-system-allow
  namespace: kube-system
spec:
  podSelector:
    matchLabels:
      k8s-app: kube-dns
  policyTypes:
    - Ingress
  ingress:
    - from: []           # an empty list allows any source
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
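Troubleshooting table
| Symptom | Likely cause | Diagnostic command | Fix |
|---|---|---|---|
| Pods cannot communicate | Default-deny policy is too strict | kubectl get networkpolicy -A | Add precise ingress/egress rules |
| Service discovery fails | DNS egress is blocked | kubectl exec -it <pod> -- nslookup api-service | Add a rule that allows DNS egress |
| NetworkPolicy has no effect | CNI does not support it | kubectl get pods -n kube-system -l k8s-app | Switch to Calico/Cilium/Antrea |
| Cross-namespace access denied | Namespace is missing labels | kubectl get ns --show-labels | Add the required labels to the namespace |
| Hubble shows no flows | Hubble is not enabled in Cilium | cilium status | Enable Hubble in the Helm install |
| mTLS connection fails | Certificate expired or not issued | istioctl proxy-config secret <pod> | Check the status of the Certificate resource |
| L7 policy has no effect | Cilium version is too old | cilium version | Upgrade to Cilium 1.14+ |
| DNS policy has no effect | CoreDNS version is too old | kubectl get deploy coredns -n kube-system -o yaml | Upgrade CoreDNS |
| Policy count explosion | Poor label design | kubectl get networkpolicy -A \| wc -l | Redesign the labeling scheme |
| Calico policy conflict | GlobalNetworkPolicy precedence issue | calicoctl get globalnetworkpolicy -o yaml | Adjust the order field |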
Advanced optimization
Policy as Code (PaC)
Manage NetworkPolicy through GitOps so that every policy change goes through code review:
git checkout -b feature/add-network-policy
mkdir -p k8s/network-policies/production
cat > k8s/network-policies/production/default-deny.yaml << 'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
EOF
git add . && git commit -m "feat: add default deny policy for production"
git push origin feature/add-network-policy
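Once policies live in Git, a reviewer-side check can catch the DNS pitfall before a merge. The sketch below shows one possible lint in Python, operating on policies already parsed into dictionaries; the function names and finding texts are hypothetical, and it only looks for an explicit port 53 rule, so a rule that allows all ports would need extra handling.

```python
def allows_dns_egress(policy):
    """True if any egress rule of the policy lists port 53."""
    for rule in policy["spec"].get("egress", []):
        for port in rule.get("ports", []):
            if port.get("port") == 53:
                return True
    return False


def lint_namespace(policies):
    """Return review findings for the policies of a single namespace."""
    findings = []
    denies_egress = any(
        p["spec"].get("podSelector") == {}
        and "Egress" in p["spec"].get("policyTypes", [])
        for p in policies
    )
    if not denies_egress:
        findings.append("no namespace-wide egress default deny")
    elif not any(allows_dns_egress(p) for p in policies):
        findings.append("egress is denied by default but no rule allows DNS on port 53")
    return findings


# The policy committed above, and the variant from pitfall 1 without a DNS rule.
with_dns = {"spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"],
                     "egress": [{"ports": [{"protocol": "UDP", "port": 53},
                                           {"protocol": "TCP", "port": 53}]}]}}
without_dns = {"spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]}}

print(lint_namespace([with_dns]))     # []
print(lint_namespace([without_dns]))  # one finding about missing DNS egress
print(lint_namespace([]))             # one finding about missing default deny
```

A check like this could run in the pull-request pipeline next to the code review the section describes.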
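Automated policy testing
# --test-namespace selects where the test workloads are deployed
cilium connectivity test \
  --test "echo-ingress-l7" \
  --test-namespace production \
  --force-deploy
# HTTP reachability check from inside the namespace
kubectl run policy-test \
  --image=busybox:1.36 \
  -n production \
  --rm -it -- \
  wget -qO- --timeout=2 http://api-service:8080/healthz
# DNS resolution check from inside the namespace
kubectl run dns-test \
  --image=busybox:1.36 \
  -n production \
  --rm -it -- \
  nslookup api-service.production.svc.cluster.local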
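Policy performance tuning
cilium config view | grep -i policy
# BPF policy maps are inspected from inside a Cilium agent Pod
# (the in-Pod binary is cilium-dbg in recent releases, cilium in older ones)
kubectl -n kube-system exec ds/cilium -- cilium-dbg bpf policy get --all
# Count the rules per CiliumNetworkPolicy
kubectl get ciliumnetworkpolicies -A -o json | \
  python3 -c "
import json, sys
policies = json.load(sys.stdin)
print(f'Total CiliumNetworkPolicies: {len(policies.get(\"items\", []))}')
for p in policies.get('items', []):
    ns = p['metadata']['namespace']
    name = p['metadata']['name']
    ingress = len(p.get('spec', {}).get('ingress', []))
    egress = len(p.get('spec', {}).get('egress', []))
    print(f'  {ns}/{name}: ingress={ingress}, egress={egress}')
"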
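CNI plugin comparison
| Feature | Calico | Cilium | Antrea | Weave Net |
|---|---|---|---|---|
| NetworkPolicy support | ✅ Full | ✅ Full + L7 | ✅ Full | ⚠️ Basic |
| L7 policy | ❌ | ✅ HTTP/gRPC/Kafka | ❌ | ❌ |
| FQDN policy | ⚠️ Enterprise/Cloud only | ✅ | ✅ Antrea-native policy | ❌ |
| eBPF data plane | ✅ Optional | ✅ Default | ✅ Optional | ❌ |
| Observability | ❌ | ✅ Hubble | ⚠️ Flow Exporter | ❌ |
| Encryption | ✅ WireGuard | ✅ WireGuard/IPsec | ✅ IPsec | ✅ IPsec |
| Performance | High | Very high | High | Medium |
| Multi-cluster | ✅ | ✅ Cluster Mesh | ✅ | ❌ |
| Service Mesh | ❌ | ✅ Built in | ❌ | ❌ |
| Community activity | High | Very high | High | Low |
| Best fit | General production | High performance + L7 | vSphere environments | Development and testing |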
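Summary
Kubernetes network security is not achieved in one step; it is a process of gradual hardening. Start with default deny, move on to micro-segmentation, bring in Cilium eBPF for L7 capabilities, control external access with DNS policies, add mTLS through a service mesh, and finally arrive at a zero-trust network architecture. Each step shrinks the attack surface and each layer adds depth to the defense. Remember: a K8s cluster without NetworkPolicy is a playground for attackers.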