TiDB HTAP即時分析實戰:從TiFlash到即時看板的6種生產模式
T+1分析已死,即時洞察才是未來
凌晨3點,線上訂單暴漲10倍,CEO問:「現在各品類即時GMV多少?」——你只能回答:「資料還在ETL管道裡,明天才能看到。」
這就是傳統T+1分析的困境:
- ETL管道脆弱:Kafka → Flink → ClickHouse,任何一環故障,資料斷流
- 資料延遲不可控:從OLTP到OLAP,分鐘級到小時級延遲是常態
- 雙系統運維噩夢:MySQL負責交易,ClickHouse負責分析,資料一致性靠人肉
- 成本失控:兩套儲存、兩套計算、兩套運維,資源利用率低下
- 即時性不足:業務決策需要秒級回應,T+1報表已成歷史遺留
TiDB HTAP的承諾:一套系統,同時承載OLTP和OLAP,資料零延遲從行存同步到列存。這不是PPT架構,而是已經在數千家企業生產驗證的方案。
HTAP核心概念速查
| 概念 | 全稱 | 核心作用 | 類比 |
|---|---|---|---|
| HTAP | Hybrid Transactional/Analytical Processing | 一套系統同時處理交易和分析 | 一人分飾交易員和分析師 |
| TiFlash | TiDB Flash | 列式儲存引擎,承載OLAP查詢 | OLAP的專屬加速器 |
| TiKV | TiDB Key-Value | 行式儲存引擎,承載OLTP交易 | OLTP的基石 |
| MPP | Massively Parallel Processing | 多節點平行計算框架 | 多人同時算不同分片 |
| 列式儲存 | Columnar Storage | 按列儲存資料,分析查詢只讀需要的列 | 只翻需要的書頁 |
| Raft Learner | Raft一致性協議學習者角色 | 非同步複製資料到TiFlash,不影響主流程 | 影子抄寫員 |
| 資料市集 | Data Mart | 面向特定業務域的預聚合資料集合 | 為每個部門定製的報表櫥窗 |
💡 使用 JSON 格式化 工具格式化 TiDB 設定 JSON 檔案,便於排查問題。
傳統即時分析架構的5大痛點
痛點1:ETL管道是最大的單點故障
MySQL → Canal → Kafka → Flink → ClickHouse
│ │ │ │ │
│ │ │ │ └─ 寫入失敗,資料遺失
│ │ │ └─ 作業崩潰,資料積壓
│ │ └─ 分區不均,消費延遲
│ └─ Binlog解析異常
└─ 主從延遲
任何一個環節出問題,整條鏈路斷裂。凌晨告警不斷,運維苦不堪言。
痛點2:資料一致性無法保證
OLTP和OLAP是兩套獨立系統,資料同步存在延遲視窗:
-- MySQL: 訂單已退款
UPDATE orders SET status = 'refunded' WHERE id = 10086;
-- ClickHouse: 5分鐘後才看到退款狀態
-- 這5分鐘內,報表資料是錯的
痛點3:雙系統運維成本翻倍
| 運維項 | MySQL | ClickHouse | 合計 |
|---|---|---|---|
| 監控告警 | Grafana + Prometheus | Grafana + Prometheus | 2套 |
| 備份恢復 | mysqldump / XtraBackup | clickhouse-backup | 2套 |
| 容量規劃 | 主從 + 分庫分表 | 分佈式表 + 副本 | 2套 |
| 版本升級 | 滾動升級 | 滾動升級 | 2套 |
| 人員要求 | DBA | ClickHouse專家 | 2個團隊 |
痛點4:查詢效能與資料新鮮度的矛盾
- ClickHouse追求查詢效能,需要MergeTree合併,資料有延遲
- 要即時性就要用ReplacingMergeTree,但查詢需要FINAL,效能暴跌
- 即時性和效能,魚和熊掌不可兼得
痛點5:開發效率低下
// 開發者需要維護兩套資料模型
public class OrderService {
// OLTP: MySQL
@Transactional
public void createOrder(Order order) {
orderMapper.insert(order);
}
// OLAP: ClickHouse - 還要寫一遍
public void syncToClickHouse(Order order) {
clickHouseMapper.insert(order);
// 欄位對映、型別轉換、異常處理...
}
}
6種生產級HTAP實戰模式
模式1:TiFlash列存副本搭建
TiFlash透過Raft Learner角色非同步接收TiKV的資料變更,實現行存到列存的即時同步。
1.1 部署TiFlash節點
# tiup topology.yaml 片段
tiflash_servers:
- host: 10.0.1.21
tcp_port: 9000
http_port: 8123
flash_service_port: 3930
flash_proxy_port: 20170
flash_proxy_status_port: 20292
metrics_interval: 15
data_dir: /data/tiflash/data
deploy_dir: /data/tiflash/deploy
config:
logger:
level: information
storage:
main:
capacity: "2TB"
latest:
capacity: "200GB"
- host: 10.0.1.22
tcp_port: 9000
http_port: 8123
flash_service_port: 3930
flash_proxy_port: 20170
flash_proxy_status_port: 20292
data_dir: /data/tiflash/data
deploy_dir: /data/tiflash/deploy
# 擴容TiFlash節點
tiup cluster scale-out tidb-prod topology-tiflash.yaml
# 驗證TiFlash節點狀態
tiup cluster display tidb-prod | grep tiflash
1.2 為表新增TiFlash副本
-- 為整張表新增TiFlash副本(預設1副本)
ALTER TABLE orders SET TIFLASH REPLICA 1;
-- 為表新增2個TiFlash副本(高可用)
ALTER TABLE orders SET TIFLASH REPLICA 2;
-- 為分區表新增TiFlash副本
ALTER TABLE orders_partition SET TIFLASH REPLICA 1;
-- 檢視副本同步進度
SELECT * FROM information_schema.tiflash_replica
WHERE table_name = 'orders';
同步進度輸出範例:
+----------+------------+----------+---------------+------------------+-----------+
| TABLE_ID | TABLE_NAME | REPLICA | LOCATION_LABELS| AVAILABLE | PROGRESS |
+----------+------------+----------+---------------+------------------+-----------+
| 85 | orders | 1 | | True | 1.0 |
+----------+------------+----------+---------------+------------------+-----------+
1.3 按列新增TiFlash副本(節省儲存)
-- 只為分析需要的列建立TiFlash副本
ALTER TABLE orders SET TIFLASH REPLICA 1
COLUMNS (order_id, user_id, product_id, amount, status, created_at);
-- 驗證列級副本
SHOW CREATE TABLE orders;
模式2:MPP分佈式查詢加速
TiDB的MPP引擎將分析查詢下推到多個TiFlash節點平行執行,實現接近ClickHouse的查詢效能。
2.1 啟用MPP模式
-- 檢視當前MPP模式
SHOW VARIABLES LIKE 'tidb_allow_mpp';
-- 啟用MPP(預設已啟用)
SET SESSION tidb_allow_mpp = 1;
-- 強制使用MPP(用於測試驗證)
SET SESSION tidb_enforce_mpp = 1;
-- 設定MPP查詢的Exchange壓縮
SET SESSION tidb_exchange_compression = 'ON';
2.2 MPP查詢實戰:即時銷售分析
-- 場景:即時統計各品類今日銷售額Top10
-- MPP會自動將計算下推到TiFlash節點平行執行
SELECT
c.category_name,
SUM(oi.quantity * oi.unit_price) AS total_sales,
COUNT(DISTINCT o.user_id) AS unique_buyers,
AVG(oi.quantity * oi.unit_price) AS avg_order_value
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.created_at >= CURDATE()
AND o.status IN ('paid', 'shipped', 'completed')
GROUP BY c.category_name
ORDER BY total_sales DESC
LIMIT 10;
2.3 驗證MPP執行計畫
-- 檢視執行計畫,確認MPP算子
EXPLAIN ANALYZE
SELECT
c.category_name,
SUM(oi.quantity * oi.unit_price) AS total_sales
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.created_at >= CURDATE()
GROUP BY c.category_name;
關鍵執行計畫特徵:
+------------------------------------------+----------+---------+--------------+
| id | task | rows | operator info|
+------------------------------------------+----------+---------+--------------+
| TableReader_45 | root | 10 | ... |
| └─ExchangeSender_44 | mpp[tif] | 10 | ExchangeType |
| └─Projection_42 | mpp[tif] | 10 | ... |
| └─HashAgg_43 | mpp[tif] | 10 | group by:... |
| └─ExchangeReceiver_38 | mpp[tif] | 100 | ... |
| └─ExchangeSender_37 | mpp[tif] | 100 | ExchangeType |
| └─HashJoin_36 | mpp[tif] | 100 | inner join |
| ├─Selection_25(Build) | mpp[tif] | 500 | ... |
| │ └─TableFullScan_24 | mpp[tif] | 10000 | tikv:tiflash |
| └─Selection_30(Probe) | mpp[tif] | 800 | ... |
| └─TableFullScan_29 | mpp[tif] | 20000 | tikv:tiflash |
+------------------------------------------+----------+---------+--------------+
看到
mpp[tif]和ExchangeSender/Receiver說明MPP已生效。
2.4 MPP效能對比
-- 關閉MPP,使用傳統TiKV掃描
SET SESSION tidb_allow_mpp = 0;
EXPLAIN ANALYZE
SELECT c.category_name, SUM(oi.quantity * oi.unit_price) AS total_sales
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.created_at >= CURDATE()
GROUP BY c.category_name;
-- 重新啟用MPP
SET SESSION tidb_allow_mpp = 1;
| 查詢模式 | 資料量 | 執行時間 | 掃描行數 |
|---|---|---|---|
| TiKV掃描 | 1億行 | 45.2s | 100,000,000 |
| TiFlash非MPP | 1億行 | 12.8s | 100,000,000 |
| TiFlash MPP | 1億行 | 3.2s | 100,000,000 |
模式3:TiCDC即時ETL替代
TiCDC(Change Data Capture)可以替代傳統ETL管道,將TiDB的資料變更即時同步到下游系統。
3.1 部署TiCDC
# tiup topology.yaml 片段
cdc_servers:
- host: 10.0.1.31
port: 8300
deploy_dir: /data/cdc/deploy
data_dir: /data/cdc/data
config:
per-table-memory-quota: 10485760
sink-memory-quota: 536870912
- host: 10.0.1.32
port: 8300
deploy_dir: /data/cdc/deploy
data_dir: /data/cdc/data
# 擴容TiCDC
tiup cluster scale-out tidb-prod topology-cdc.yaml
# 驗證TiCDC狀態
tiup cluster display tidb-prod | grep cdc
3.2 建立同步任務:替代MySQL → Kafka → Flink鏈路
# 建立TiCDC同步任務:TiDB → Kafka(替代Canal + Kafka)
cdc cli changefeed create \
--server=http://10.0.1.31:8300 \
--sink-uri="kafka://kafka-cluster:9092/tidb-orders?partition-num=6&max-message-bytes=10485760&replication-factor=3" \
--changefeed-id="orders-to-kafka" \
--config=cdc-orders.yaml
# cdc-orders.yaml
filter:
rules:
- 'ecommerce.orders'
- 'ecommerce.order_items'
- 'ecommerce.payments'
ignore_txn_start_ts: []
mounter:
worker-num: 16
sink:
protocol: "canal-json"
dispatchers:
- matcher: ['ecommerce.*']
partition: "index-value"
selector:
rules:
- 'ecommerce.orders'
- 'ecommerce.order_items'
- 'ecommerce.payments'
3.3 直接同步到ClickHouse(如果仍需保留)
# TiCDC → ClickHouse 直連
cdc cli changefeed create \
--server=http://10.0.1.31:8300 \
--sink-uri="clickhouse://clickhouse-server:8123?database=analytics&username=default&password=***" \
--changefeed-id="orders-to-clickhouse" \
--config=cdc-ch.yaml
3.4 TiCDC同步到S3(資料湖方案)
# TiCDC → S3 資料湖
cdc cli changefeed create \
--server=http://10.0.1.31:8300 \
--sink-uri="s3://data-lake/tidb-cdc/?endpoint=https://s3.amazonaws.com&access-key=***&secret-access-key=***" \
--changefeed-id="orders-to-s3" \
--config=cdc-s3.yaml
# cdc-s3.yaml
sink:
protocol: "csv"
column-selectors:
- matcher: ['ecommerce.orders']
columns: ['order_id', 'user_id', 'amount', 'status', 'created_at']
file:
size: 67108864
interval: 5m
filter:
rules:
- 'ecommerce.orders'
- 'ecommerce.order_items'
3.5 監控TiCDC同步延遲
-- 檢視同步任務狀態
cdc cli changefeed list \
--server=http://10.0.1.31:8300;
-- 檢視同步延遲
cdc cli changefeed query \
--changefeed-id=orders-to-kafka \
--server=http://10.0.1.31:8300;
關鍵指標:
{
"state": "normal",
"tso": 446685327530688513,
"checkpoint": "2026-06-15 10:30:00.123",
"lag": "1.2s"
}
模式4:看板查詢最佳化
即時看板是HTAP最直接的應用場景,但查詢效能直接影響使用者體驗。
4.1 看板查詢模式識別
-- 典型看板查詢1:即時GMV
SELECT
DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
SUM(amount) AS gmv,
COUNT(*) AS order_count
FROM orders
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
AND status != 'cancelled'
GROUP BY hour_slot
ORDER BY hour_slot;
-- 典型看板查詢2:即時使用者活躍
SELECT
DATE_FORMAT(login_time, '%Y-%m-%d %H:00') AS hour_slot,
COUNT(DISTINCT user_id) AS dau,
COUNT(*) AS total_actions
FROM user_actions
WHERE login_time >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY hour_slot
ORDER BY hour_slot;
-- 典型看板查詢3:即時轉化漏斗
SELECT
COUNT(DISTINCT CASE WHEN event = 'view' THEN user_id END) AS views,
COUNT(DISTINCT CASE WHEN event = 'cart' THEN user_id END) AS carts,
COUNT(DISTINCT CASE WHEN event = 'pay' THEN user_id END) AS pays,
ROUND(COUNT(DISTINCT CASE WHEN event = 'pay' THEN user_id END) * 100.0
/ NULLIF(COUNT(DISTINCT CASE WHEN event = 'view' THEN user_id END), 0), 2) AS conversion_rate
FROM user_events
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 1 HOUR);
4.2 強制查詢走TiFlash
-- 方式1:透過Hint強制走TiFlash
SELECT /*+ read_from_storage(tiflash[orders]) */
DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
SUM(amount) AS gmv
FROM orders
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY hour_slot;
-- 方式2:設定Session變數
SET SESSION tidb_isolation_read_engines = 'tiflash';
SELECT
DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
SUM(amount) AS gmv
FROM orders
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY hour_slot;
SET SESSION tidb_isolation_read_engines = 'tikv,tiflash';
-- 方式3:透過Resource Group隔離OLAP流量
CREATE RESOURCE GROUP IF NOT EXISTS olap_group
RU_PER_SEC = 5000
PRIORITY = LOW;
ALTER USER dashboard_user RESOURCE GROUP olap_group;
4.3 看板查詢快取最佳化
-- 使用Binding固定執行計畫
CREATE SESSION BINDING FOR
SELECT SUM(amount) FROM orders WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
USING
SELECT /*+ read_from_storage(tiflash[orders]) */ SUM(amount)
FROM orders WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR);
-- 檢視Binding
SHOW SESSION BINDINGS;
4.4 看板重新整理策略
-- 看板查詢建議使用READ ONLY模式,避免和OLTP爭搶資源
SET SESSION tidb_isolation_read_engines = 'tiflash';
-- 限制看板查詢的資源消耗
SET SESSION max_execution_time = 30000;
-- 使用prepared statement減少解析開銷
PREPARE dashboard_gmv FROM
SELECT
DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
SUM(amount) AS gmv
FROM orders
WHERE created_at >= ?
AND status != 'cancelled'
GROUP BY hour_slot
ORDER BY hour_slot;
EXECUTE dashboard_gmv USING DATE_SUB(NOW(), INTERVAL 24 HOUR);
模式5:物化檢視建構資料市集
TiDB 8.x支援物化檢視(Materialized View),可以預計算聚合資料,建構面向業務的資料市集。
5.1 建立物化檢視
-- 建立日銷售彙總物化檢視
CREATE MATERIALIZED VIEW mv_daily_sales
REFRESH EVERY 1 HOUR
AS
SELECT
DATE(created_at) AS sale_date,
c.category_name,
p.product_name,
SUM(oi.quantity) AS total_quantity,
SUM(oi.quantity * oi.unit_price) AS total_revenue,
COUNT(DISTINCT o.user_id) AS unique_buyers,
AVG(oi.quantity * oi.unit_price) AS avg_price
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.status IN ('paid', 'shipped', 'completed')
GROUP BY DATE(created_at), c.category_name, p.product_name;
5.2 建立使用者行為物化檢視
-- 使用者行為漏斗物化檢視
CREATE MATERIALIZED VIEW mv_user_funnel
REFRESH EVERY 5 MINUTE
AS
SELECT
DATE(created_at) AS event_date,
COUNT(DISTINCT CASE WHEN event_type = 'page_view' THEN user_id END) AS page_views,
COUNT(DISTINCT CASE WHEN event_type = 'add_to_cart' THEN user_id END) AS add_to_carts,
COUNT(DISTINCT CASE WHEN event_type = 'checkout' THEN user_id END) AS checkouts,
COUNT(DISTINCT CASE WHEN event_type = 'payment' THEN user_id END) AS payments,
ROUND(
COUNT(DISTINCT CASE WHEN event_type = 'payment' THEN user_id END) * 100.0
/ NULLIF(COUNT(DISTINCT CASE WHEN event_type = 'page_view' THEN user_id END), 0),
2
) AS view_to_pay_rate
FROM user_events
GROUP BY DATE(created_at);
5.3 管理物化檢視
-- 檢視物化檢視狀態
SHOW MATERIALIZED VIEWS;
-- 手動重新整理物化檢視
ALTER MATERIALIZED VIEW mv_daily_sales REFRESH;
-- 暫停自動重新整理
ALTER MATERIALIZED VIEW mv_daily_sales DISABLE;
-- 恢復自動重新整理
ALTER MATERIALIZED VIEW mv_daily_sales ENABLE;
-- 刪除物化檢視
DROP MATERIALIZED VIEW IF EXISTS mv_daily_sales;
5.4 基於物化檢視的看板查詢
-- 查詢物化檢視(毫秒級回應)
SELECT
sale_date,
category_name,
total_revenue,
unique_buyers
FROM mv_daily_sales
WHERE sale_date >= DATE_SUB(CURDATE(), INTERVAL 7 DAY)
ORDER BY total_revenue DESC
LIMIT 20;
模式6:生產HTAP部署與監控
生產環境的HTAP部署需要考慮資源隔離、監控告警和故障恢復。
6.1 資源管控設定
-- 建立OLTP資源組(高優先級)
CREATE RESOURCE GROUP IF NOT EXISTS oltp_group
RU_PER_SEC = 20000
PRIORITY = HIGH;
-- 建立OLAP資源組(低優先級,限制資源)
CREATE RESOURCE GROUP IF NOT EXISTS olap_group
RU_PER_SEC = 8000
PRIORITY = LOW;
-- 將使用者繫結到資源組
ALTER USER app_user RESOURCE GROUP oltp_group;
ALTER USER dashboard_user RESOURCE GROUP olap_group;
ALTER USER etl_user RESOURCE GROUP olap_group;
-- 檢視資源組設定
SELECT * FROM information_schema.resource_groups;
6.2 TiFlash關鍵監控指標
# Grafana Dashboard 關鍵指標
# TiFlash Summary Dashboard
tiflash_system_profile:
- metric: tiflash_storage_write_duration_seconds
alert_threshold: "> 5s"
description: "TiFlash寫入延遲"
- metric: tiflash_mpp_task_duration_seconds
alert_threshold: "> 60s"
description: "MPP任務執行時長"
- metric: tiflash_storage_segment_count
alert_threshold: "> 10000"
description: "Segment數量(過多需compaction)"
- metric: tiflash_cpu_usage
alert_threshold: "> 80%"
description: "TiFlash CPU使用率"
- metric: tiflash_storage_disk_usage_ratio
alert_threshold: "> 0.8"
description: "TiFlash磁碟使用率"
6.3 Prometheus告警規則
# tidb-htap-alerts.yaml
groups:
- name: tidb_htap
rules:
- alert: TiFlashReplicaSyncLag
expr: |
tiflash_replica_available == 0
or (tiflash_replica_progress < 0.95 and tiflash_replica_progress > 0)
for: 10m
labels:
severity: warning
annotations:
summary: "TiFlash副本同步延遲"
description: "表 {{ $labels.table_name }} 的TiFlash副本同步進度僅 {{ $value }}"
- alert: TiFlashHighWriteDuration
expr: |
histogram_quantile(0.95,
rate(tiflash_storage_write_duration_seconds_bucket[5m])
) > 5
for: 5m
labels:
severity: warning
annotations:
summary: "TiFlash寫入延遲過高"
description: "P95寫入延遲 {{ $value }}s"
- alert: MPPQueryTimeout
expr: |
rate(tiflash_mpp_task_error_count{reason="timeout"}[5m]) > 0.1
for: 3m
labels:
severity: critical
annotations:
summary: "MPP查詢超時率過高"
description: "MPP查詢超時率 {{ $value }}/s"
- alert: TiFlashDiskSpaceLow
expr: |
tiflash_storage_disk_usage_ratio > 0.85
for: 5m
labels:
severity: critical
annotations:
summary: "TiFlash磁碟空間不足"
description: "磁碟使用率 {{ $value }}"
6.4 日常運維腳本
#!/bin/bash
# tidb-htap-healthcheck.sh
# TiDB HTAP健康檢查腳本
CLUSTER_NAME="tidb-prod"
LOG_FILE="/var/log/tidb-htap-healthcheck.log"
echo "=== TiDB HTAP Health Check $(date) ===" >> $LOG_FILE
# 檢查TiFlash節點狀態
echo "--- TiFlash Node Status ---" >> $LOG_FILE
tiup cluster display $CLUSTER_NAME 2>/dev/null | grep tiflash >> $LOG_FILE
# 檢查TiFlash副本同步狀態
echo "--- TiFlash Replica Sync Status ---" >> $LOG_FILE
mysql -h10.0.1.10 -P4000 -uroot -e "
SELECT table_schema, table_name, replica_count, available, progress
FROM information_schema.tiflash_replica
WHERE available = 0 OR progress < 1.0;
" >> $LOG_FILE 2>&1
# 檢查TiCDC同步延遲
echo "--- TiCDC Sync Lag ---" >> $LOG_FILE
cdc cli changefeed list --server=http://10.0.1.31:8300 >> $LOG_FILE 2>&1
# 檢查MPP查詢效能
echo "--- MPP Query Performance ---" >> $LOG_FILE
mysql -h10.0.1.10 -P4000 -uroot -e "
SELECT DIGEST_TEXT, AVG_LATENCY, EXEC_COUNT
FROM information_schema.statements_summary
WHERE DIGEST_TEXT LIKE '%GROUP BY%'
AND AVG_LATENCY > 1000000000
ORDER BY AVG_LATENCY DESC
LIMIT 10;
" >> $LOG_FILE 2>&1
echo "=== Health Check Complete ===" >> $LOG_FILE
5大踩坑指南
❌ 坑1:TiFlash副本未同步就查詢
-- ❌ 錯誤:剛新增TiFlash副本就查詢,資料不完整
ALTER TABLE orders SET TIFLASH REPLICA 1;
-- 立即查詢,可能讀到不完整資料
SELECT COUNT(*) FROM orders;
-- ✅ 正確:等待副本同步完成
ALTER TABLE orders SET TIFLASH REPLICA 1;
-- 檢查同步進度
SELECT table_name, progress, available
FROM information_schema.tiflash_replica
WHERE table_name = 'orders';
-- 等待 available = 1 後再查詢
SELECT COUNT(*) FROM orders;
❌ 坑2:小表也加TiFlash副本
-- ❌ 錯誤:小表加TiFlash副本反而更慢(網路開銷 > 掃描開銷)
ALTER TABLE config_dict SET TIFLASH REPLICA 1;
-- ✅ 正確:小表保持在TiKV,只為大表新增TiFlash副本
-- 建議規則:資料量 > 100萬行才考慮TiFlash副本
-- 小表透過Hint強制走TiKV
SELECT /*+ read_from_storage(tikv[config_dict]) */
* FROM config_dict WHERE type = 'payment';
❌ 坑3:OLAP查詢影響OLTP效能
-- ❌ 錯誤:OLAP和OLTP使用同一使用者,資源無隔離
SELECT user, resource_group FROM mysql.user WHERE user = 'app_user';
-- app_user | default
-- ✅ 正確:使用Resource Group隔離
CREATE RESOURCE GROUP IF NOT EXISTS oltp_group
RU_PER_SEC = 20000 PRIORITY = HIGH;
CREATE RESOURCE GROUP IF NOT EXISTS olap_group
RU_PER_SEC = 8000 PRIORITY = LOW;
ALTER USER app_user RESOURCE GROUP oltp_group;
ALTER USER dashboard_user RESOURCE GROUP olap_group;
-- OLAP查詢設定超時
SET SESSION max_execution_time = 60000;
❌ 坑4:TiFlash節點磁碟空間不足
# ❌ 錯誤:TiFlash資料目錄和系統碟共用
tiflash_servers:
- host: 10.0.1.21
data_dir: /var/lib/tiflash/data # 系統碟空間有限
# ✅ 正確:TiFlash使用獨立資料碟,預留30%空間
tiflash_servers:
- host: 10.0.1.21
data_dir: /data/tiflash/data # 獨立資料碟
config:
storage:
main:
capacity: "2TB" # 明確容量限制
latest:
capacity: "200GB"
❌ 坑5:忽略TiCDC同步延遲告警
# ❌ 錯誤:不監控TiCDC延遲,資料同步斷了都不知道
# 沒有任何告警設定
# ✅ 正確:設定TiCDC延遲告警
# Prometheus告警規則
- alert: TiCDCReplicationLag
expr: |
tidb_cdc_checkpoint_ts_lag_seconds > 300
for: 5m
labels:
severity: critical
annotations:
summary: "TiCDC同步延遲超過5分鐘"
description: "changefeed {{ $labels.changefeed }} 延遲 {{ $value }}s"
錯誤排查速查表
| 錯誤現象 | 可能原因 | 排查指令/方法 | 解決方案 |
|---|---|---|---|
| TiFlash副本同步卡住 | TiFlash節點磁碟滿或當機 | SELECT * FROM information_schema.tiflash_replica WHERE progress < 1.0; |
擴容磁碟或重啟TiFlash節點 |
MPP查詢報 Too many MPP tasks |
併發MPP查詢過多 | SHOW CONFIG WHERE name LIKE '%mpp%'; |
調整 tidb_mpp_task_limit 或降低併發 |
| 查詢未走TiFlash | 最佳化器判斷TiKV更優 | EXPLAIN SELECT ...; 檢視算子 |
使用 /*+ read_from_storage(tiflash[t]) */ Hint |
| TiCDC同步延遲增大 | 下游寫入慢或網路抖動 | cdc cli changefeed query --changefeed-id=xxx |
擴容下游或增加TiCDC worker數 |
| TiFlash OOM | 查詢資料量過大 | grep OOM /data/tiflash/log/tiflash.log |
增加 max_memory_usage 或縮小查詢範圍 |
| 看板查詢超時 | 未限制查詢資源 | SHOW PROCESSLIST; |
設定 max_execution_time 和 Resource Group |
| 物化檢視重新整理失敗 | 來源表資料量過大或鎖衝突 | SHOW MATERIALIZED VIEWS; |
調整重新整理頻率或最佳化來源表索引 |
| TiFlash寫入延遲高 | Raft Learner同步壓力大 | Grafana TiFlash Dashboard | 擴容TiFlash節點或調整 raft-store.sync-log |
| MPP Join資料傾斜 | Join Key分佈不均 | EXPLAIN ANALYZE 檢視各節點資料量 |
使用 /*+ shuffle_join() */ 或增加分桶數 |
| TiFlash磁碟IO高 | Compaction或大量掃描 | iostat -x 1 檢查IO |
調整 bg_task_io_limit 或擴容磁碟 |
進階最佳化技巧
3.1 TiFlash儲存引擎調優
-- 調整TiFlash Segment分裂閾值(減少小Segment)
-- 在TiFlash設定中修改
# tiflash-config.yaml
storage:
main:
segment_delta_rows: 8192
segment_delta_size: 536870912
segment_stable_rows: 10485760
latest:
segment_delta_rows: 4096
segment_delta_size: 134217728
raft:
apply_batch: 1024
sync_log: false
engine:
dt:
enable_read_thread: true
read_thread_count: 8
enable_leaf_dt_segment_optimize: true
3.2 MPP查詢進階最佳化
-- 1. 使用Runtime Filter減少Join資料傳輸
-- TiDB 8.x預設開啟,確認狀態
SET SESSION tidb_runtime_filter_mode = "LOCAL";
-- 2. 調整MPP Exchange資料壓縮
SET SESSION tidb_exchange_compression = 'ON';
SET SESSION tidb_exchange_compression_level = 3;
-- 3. 大表Join最佳化:使用Hash Join Hint
SELECT /*+ hash_join_build(o) */
c.category_name,
SUM(oi.quantity * oi.unit_price) AS total_sales
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN categories c ON oi.product_id = c.category_id
GROUP BY c.category_name;
-- 4. 使用Index Lookup Join替代Hash Join(適合小表驅動大表)
SELECT /*+ inl_join(p) */
o.order_id, p.product_name
FROM orders o
JOIN products p ON o.product_id = p.product_id
WHERE o.created_at >= CURDATE();
3.3 多租戶HTAP資源隔離
-- 為不同業務線建立獨立資源組
CREATE RESOURCE GROUP IF NOT EXISTS rg_ecommerce
RU_PER_SEC = 15000 PRIORITY = HIGH
QUERY_LIMIT = (EXECUTION_TIME = 30s);
CREATE RESOURCE GROUP IF NOT EXISTS rg_analytics
RU_PER_SEC = 10000 PRIORITY = LOW
QUERY_LIMIT = (EXECUTION_TIME = 120s);
CREATE RESOURCE GROUP IF NOT EXISTS rg_data_science
RU_PER_SEC = 5000 PRIORITY = LOW
QUERY_LIMIT = (EXECUTION_TIME = 600s);
-- 繫結使用者到資源組
ALTER USER ecommerce_app RESOURCE GROUP rg_ecommerce;
ALTER USER analytics_user RESOURCE GROUP rg_analytics;
ALTER USER ds_user RESOURCE GROUP rg_data_science;
-- 監控資源組使用情況
SELECT * FROM information_schema.resource_group_usage;
HTAP方案對比
| 維度 | TiDB HTAP | ClickHouse + MySQL | Snowflake | Apache Doris | SingleStore |
|---|---|---|---|---|---|
| 架構 | 一體化HTAP | 雙系統拼接 | 雲原生OLAP | 一體化HTAP | 一體化HTAP |
| OLTP能力 | 強(MySQL相容) | MySQL原生 | 無 | 弱 | 中等 |
| OLAP效能 | 強(MPP) | 極強 | 極強 | 強 | 強 |
| 資料同步 | Raft Learner(自動) | 需自建ETL | 需匯入 | 自動 | 自動 |
| 即時性 | 秒級 | 分鐘級 | 分鐘級 | 秒級 | 秒級 |
| MySQL相容 | 95%+ | 原生 | 不相容 | 中等 | 中等 |
| 水平擴展 | 透明擴展 | 各自擴展 | 自動擴展 | 透明擴展 | 透明擴展 |
| 運維複雜度 | 中等 | 高 | 低 | 中等 | 中等 |
| 成本 | 中等 | 高(雙系統) | 高(按量付費) | 中等 | 高 |
| 開源 | 完全開源 | 部分開源 | 閉源 | 完全開源 | 部分開源 |
| 適用場景 | 通用HTAP | 已有MySQL+需強OLAP | 純分析 | 即時數倉 | 即時分析 |
總結
TiDB HTAP不是「既能做OLTP又能做OLAP」的行銷話術,而是透過Raft Learner實現行存到列存的即時同步,透過MPP引擎實現分佈式平行計算,透過Resource Control實現OLTP/OLAP資源隔離的工程化解決方案。2026年的今天,如果你還在維護MySQL + ClickHouse的雙系統架構,是時候認真評估TiDB HTAP了——一套系統、一份運維、零資料延遲,這才是即時分析該有的樣子。
推薦工具
- JSON 格式化 - 格式化TiDB設定JSON檔案
- Base64 編解碼 - 處理TiCDC憑證編碼
- Hash 計算 - 校驗資料同步一致性
本站提供瀏覽器本地工具,免註冊即可試用 →