TiDB HTAP即時分析實戰:從TiFlash到即時看板的6種生產模式

数据库

T+1分析已死,即時洞察才是未來

凌晨3點,線上訂單暴漲10倍,CEO問:「現在各品類即時GMV多少?」——你只能回答:「資料還在ETL管道裡,明天才能看到。」

這就是傳統T+1分析的困境:

  • ETL管道脆弱:Kafka → Flink → ClickHouse,任何一環故障,資料斷流
  • 資料延遲不可控:從OLTP到OLAP,分鐘級到小時級延遲是常態
  • 雙系統運維噩夢:MySQL負責交易,ClickHouse負責分析,資料一致性靠人肉
  • 成本失控:兩套儲存、兩套計算、兩套運維,資源利用率低下
  • 即時性不足:業務決策需要秒級回應,T+1報表已成歷史遺留

TiDB HTAP的承諾:一套系統,同時承載OLTP和OLAP,資料零延遲從行存同步到列存。這不是PPT架構,而是已經在數千家企業生產驗證的方案。


HTAP核心概念速查

概念 全稱 核心作用 類比
HTAP Hybrid Transactional/Analytical Processing 一套系統同時處理交易和分析 一人分飾交易員和分析師
TiFlash TiDB Flash 列式儲存引擎,承載OLAP查詢 OLAP的專屬加速器
TiKV TiDB Key-Value 行式儲存引擎,承載OLTP交易 OLTP的基石
MPP Massively Parallel Processing 多節點平行計算框架 多人同時算不同分片
列式儲存 Columnar Storage 按列儲存資料,分析查詢只讀需要的列 只翻需要的書頁
Raft Learner Raft一致性協議學習者角色 非同步複製資料到TiFlash,不影響主流程 影子抄寫員
資料市集 Data Mart 面向特定業務域的預聚合資料集合 為每個部門定製的報表櫥窗

💡 使用 JSON 格式化 工具格式化 TiDB 設定 JSON 檔案,便於排查問題。


傳統即時分析架構的5大痛點

痛點1:ETL管道是最大的單點故障

MySQL → Canal → Kafka → Flink → ClickHouse
  │        │       │       │        │
  │        │       │       │        └─ 寫入失敗,資料遺失
  │        │       │       └─ 作業崩潰,資料積壓
  │        │       └─ 分區不均,消費延遲
  │        └─ Binlog解析異常
  └─ 主從延遲

任何一個環節出問題,整條鏈路斷裂。凌晨告警不斷,運維苦不堪言。

痛點2:資料一致性無法保證

OLTP和OLAP是兩套獨立系統,資料同步存在延遲視窗:

-- MySQL: 訂單已退款
UPDATE orders SET status = 'refunded' WHERE id = 10086;

-- ClickHouse: 5分鐘後才看到退款狀態
-- 這5分鐘內,報表資料是錯的

痛點3:雙系統運維成本翻倍

運維項 MySQL ClickHouse 合計
監控告警 Grafana + Prometheus Grafana + Prometheus 2套
備份恢復 mysqldump / XtraBackup clickhouse-backup 2套
容量規劃 主從 + 分庫分表 分佈式表 + 副本 2套
版本升級 滾動升級 滾動升級 2套
人員要求 DBA ClickHouse專家 2個團隊

痛點4:查詢效能與資料新鮮度的矛盾

  • ClickHouse追求查詢效能,需要MergeTree合併,資料有延遲
  • 要即時性就要用ReplacingMergeTree,但查詢需要FINAL,效能暴跌
  • 即時性和效能,魚和熊掌不可兼得

痛點5:開發效率低下

// 開發者需要維護兩套資料模型
public class OrderService {
    // OLTP: MySQL
    @Transactional
    public void createOrder(Order order) {
        orderMapper.insert(order);
    }
    
    // OLAP: ClickHouse - 還要寫一遍
    public void syncToClickHouse(Order order) {
        clickHouseMapper.insert(order);
        // 欄位對映、型別轉換、異常處理...
    }
}

6種生產級HTAP實戰模式

模式1:TiFlash列存副本搭建

TiFlash透過Raft Learner角色非同步接收TiKV的資料變更,實現行存到列存的即時同步。

1.1 部署TiFlash節點

# tiup topology.yaml 片段
tiflash_servers:
  - host: 10.0.1.21
    tcp_port: 9000
    http_port: 8123
    flash_service_port: 3930
    flash_proxy_port: 20170
    flash_proxy_status_port: 20292
    metrics_interval: 15
    data_dir: /data/tiflash/data
    deploy_dir: /data/tiflash/deploy
    config:
      logger:
        level: information
      storage:
        main:
          capacity: "2TB"
        latest:
          capacity: "200GB"
  - host: 10.0.1.22
    tcp_port: 9000
    http_port: 8123
    flash_service_port: 3930
    flash_proxy_port: 20170
    flash_proxy_status_port: 20292
    data_dir: /data/tiflash/data
    deploy_dir: /data/tiflash/deploy
# 擴容TiFlash節點
tiup cluster scale-out tidb-prod topology-tiflash.yaml

# 驗證TiFlash節點狀態
tiup cluster display tidb-prod | grep tiflash

1.2 為表新增TiFlash副本

-- 為整張表新增TiFlash副本(預設1副本)
ALTER TABLE orders SET TIFLASH REPLICA 1;

-- 為表新增2個TiFlash副本(高可用)
ALTER TABLE orders SET TIFLASH REPLICA 2;

-- 為分區表新增TiFlash副本
ALTER TABLE orders_partition SET TIFLASH REPLICA 1;

-- 檢視副本同步進度
SELECT * FROM information_schema.tiflash_replica
WHERE table_name = 'orders';

同步進度輸出範例:

+----------+------------+----------+---------------+------------------+-----------+
| TABLE_ID | TABLE_NAME | REPLICA  | LOCATION_LABELS| AVAILABLE        | PROGRESS  |
+----------+------------+----------+---------------+------------------+-----------+
|       85 | orders     |        1 |               | True             | 1.0       |
+----------+------------+----------+---------------+------------------+-----------+

1.3 按列新增TiFlash副本(節省儲存)

-- 只為分析需要的列建立TiFlash副本
ALTER TABLE orders SET TIFLASH REPLICA 1
  COLUMNS (order_id, user_id, product_id, amount, status, created_at);

-- 驗證列級副本
SHOW CREATE TABLE orders;

模式2:MPP分佈式查詢加速

TiDB的MPP引擎將分析查詢下推到多個TiFlash節點平行執行,實現接近ClickHouse的查詢效能。

2.1 啟用MPP模式

-- 檢視當前MPP模式
SHOW VARIABLES LIKE 'tidb_allow_mpp';

-- 啟用MPP(預設已啟用)
SET SESSION tidb_allow_mpp = 1;

-- 強制使用MPP(用於測試驗證)
SET SESSION tidb_enforce_mpp = 1;

-- 設定MPP查詢的Exchange壓縮
SET SESSION tidb_exchange_compression = 'ON';

2.2 MPP查詢實戰:即時銷售分析

-- 場景:即時統計各品類今日銷售額Top10
-- MPP會自動將計算下推到TiFlash節點平行執行

SELECT
    c.category_name,
    SUM(oi.quantity * oi.unit_price) AS total_sales,
    COUNT(DISTINCT o.user_id) AS unique_buyers,
    AVG(oi.quantity * oi.unit_price) AS avg_order_value
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.created_at >= CURDATE()
  AND o.status IN ('paid', 'shipped', 'completed')
GROUP BY c.category_name
ORDER BY total_sales DESC
LIMIT 10;

2.3 驗證MPP執行計畫

-- 檢視執行計畫,確認MPP算子
EXPLAIN ANALYZE
SELECT
    c.category_name,
    SUM(oi.quantity * oi.unit_price) AS total_sales
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.created_at >= CURDATE()
GROUP BY c.category_name;

關鍵執行計畫特徵:

+------------------------------------------+----------+---------+--------------+
| id                                       | task     | rows    | operator info|
+------------------------------------------+----------+---------+--------------+
| TableReader_45                           | root     | 10      | ...          |
| └─ExchangeSender_44                      | mpp[tif] | 10      | ExchangeType |
|   └─Projection_42                        | mpp[tif] | 10      | ...          |
|     └─HashAgg_43                         | mpp[tif] | 10      | group by:... |
|       └─ExchangeReceiver_38              | mpp[tif] | 100     | ...          |
|         └─ExchangeSender_37              | mpp[tif] | 100     | ExchangeType |
|           └─HashJoin_36                  | mpp[tif] | 100     | inner join   |
|             ├─Selection_25(Build)        | mpp[tif] | 500     | ...          |
|             │ └─TableFullScan_24         | mpp[tif] | 10000   | tikv:tiflash |
|             └─Selection_30(Probe)        | mpp[tif] | 800     | ...          |
|               └─TableFullScan_29         | mpp[tif] | 20000   | tikv:tiflash |
+------------------------------------------+----------+---------+--------------+

看到 mpp[tif]ExchangeSender/Receiver 說明MPP已生效。

2.4 MPP效能對比

-- 關閉MPP,使用傳統TiKV掃描
SET SESSION tidb_allow_mpp = 0;

EXPLAIN ANALYZE
SELECT c.category_name, SUM(oi.quantity * oi.unit_price) AS total_sales
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.created_at >= CURDATE()
GROUP BY c.category_name;

-- 重新啟用MPP
SET SESSION tidb_allow_mpp = 1;
查詢模式 資料量 執行時間 掃描行數
TiKV掃描 1億行 45.2s 100,000,000
TiFlash非MPP 1億行 12.8s 100,000,000
TiFlash MPP 1億行 3.2s 100,000,000

模式3:TiCDC即時ETL替代

TiCDC(Change Data Capture)可以替代傳統ETL管道,將TiDB的資料變更即時同步到下游系統。

3.1 部署TiCDC

# tiup topology.yaml 片段
cdc_servers:
  - host: 10.0.1.31
    port: 8300
    deploy_dir: /data/cdc/deploy
    data_dir: /data/cdc/data
    config:
      per-table-memory-quota: 10485760
      sink-memory-quota: 536870912
  - host: 10.0.1.32
    port: 8300
    deploy_dir: /data/cdc/deploy
    data_dir: /data/cdc/data
# 擴容TiCDC
tiup cluster scale-out tidb-prod topology-cdc.yaml

# 驗證TiCDC狀態
tiup cluster display tidb-prod | grep cdc

3.2 建立同步任務:替代MySQL → Kafka → Flink鏈路

# 建立TiCDC同步任務:TiDB → Kafka(替代Canal + Kafka)
cdc cli changefeed create \
  --server=http://10.0.1.31:8300 \
  --sink-uri="kafka://kafka-cluster:9092/tidb-orders?partition-num=6&max-message-bytes=10485760&replication-factor=3" \
  --changefeed-id="orders-to-kafka" \
  --config=cdc-orders.yaml
# cdc-orders.yaml
filter:
  rules:
    - 'ecommerce.orders'
    - 'ecommerce.order_items'
    - 'ecommerce.payments'
  ignore_txn_start_ts: []

mounter:
  worker-num: 16

sink:
  protocol: "canal-json"
  dispatchers:
    - matcher: ['ecommerce.*']
      partition: "index-value"

selector:
  rules:
    - 'ecommerce.orders'
    - 'ecommerce.order_items'
    - 'ecommerce.payments'

3.3 直接同步到ClickHouse(如果仍需保留)

# TiCDC → ClickHouse 直連
cdc cli changefeed create \
  --server=http://10.0.1.31:8300 \
  --sink-uri="clickhouse://clickhouse-server:8123?database=analytics&username=default&password=***" \
  --changefeed-id="orders-to-clickhouse" \
  --config=cdc-ch.yaml

3.4 TiCDC同步到S3(資料湖方案)

# TiCDC → S3 資料湖
cdc cli changefeed create \
  --server=http://10.0.1.31:8300 \
  --sink-uri="s3://data-lake/tidb-cdc/?endpoint=https://s3.amazonaws.com&access-key=***&secret-access-key=***" \
  --changefeed-id="orders-to-s3" \
  --config=cdc-s3.yaml
# cdc-s3.yaml
sink:
  protocol: "csv"
  column-selectors:
    - matcher: ['ecommerce.orders']
      columns: ['order_id', 'user_id', 'amount', 'status', 'created_at']

file:
  size: 67108864
  interval: 5m

filter:
  rules:
    - 'ecommerce.orders'
    - 'ecommerce.order_items'

3.5 監控TiCDC同步延遲

-- 檢視同步任務狀態
cdc cli changefeed list \
  --server=http://10.0.1.31:8300;

-- 檢視同步延遲
cdc cli changefeed query \
  --changefeed-id=orders-to-kafka \
  --server=http://10.0.1.31:8300;

關鍵指標:

{
  "state": "normal",
  "tso": 446685327530688513,
  "checkpoint": "2026-06-15 10:30:00.123",
  "lag": "1.2s"
}

模式4:看板查詢最佳化

即時看板是HTAP最直接的應用場景,但查詢效能直接影響使用者體驗。

4.1 看板查詢模式識別

-- 典型看板查詢1:即時GMV
SELECT
    DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
    SUM(amount) AS gmv,
    COUNT(*) AS order_count
FROM orders
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
  AND status != 'cancelled'
GROUP BY hour_slot
ORDER BY hour_slot;

-- 典型看板查詢2:即時使用者活躍
SELECT
    DATE_FORMAT(login_time, '%Y-%m-%d %H:00') AS hour_slot,
    COUNT(DISTINCT user_id) AS dau,
    COUNT(*) AS total_actions
FROM user_actions
WHERE login_time >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY hour_slot
ORDER BY hour_slot;

-- 典型看板查詢3:即時轉化漏斗
SELECT
    COUNT(DISTINCT CASE WHEN event = 'view' THEN user_id END) AS views,
    COUNT(DISTINCT CASE WHEN event = 'cart' THEN user_id END) AS carts,
    COUNT(DISTINCT CASE WHEN event = 'pay' THEN user_id END) AS pays,
    ROUND(COUNT(DISTINCT CASE WHEN event = 'pay' THEN user_id END) * 100.0
      / NULLIF(COUNT(DISTINCT CASE WHEN event = 'view' THEN user_id END), 0), 2) AS conversion_rate
FROM user_events
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 1 HOUR);

4.2 強制查詢走TiFlash

-- 方式1:透過Hint強制走TiFlash
SELECT /*+ read_from_storage(tiflash[orders]) */
    DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
    SUM(amount) AS gmv
FROM orders
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY hour_slot;

-- 方式2:設定Session變數
SET SESSION tidb_isolation_read_engines = 'tiflash';
SELECT
    DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
    SUM(amount) AS gmv
FROM orders
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY hour_slot;
SET SESSION tidb_isolation_read_engines = 'tikv,tiflash';

-- 方式3:透過Resource Group隔離OLAP流量
CREATE RESOURCE GROUP IF NOT EXISTS olap_group
  RU_PER_SEC = 5000
  PRIORITY = LOW;

ALTER USER dashboard_user RESOURCE GROUP olap_group;

4.3 看板查詢快取最佳化

-- 使用Binding固定執行計畫
CREATE SESSION BINDING FOR
SELECT SUM(amount) FROM orders WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
USING
SELECT /*+ read_from_storage(tiflash[orders]) */ SUM(amount)
FROM orders WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR);

-- 檢視Binding
SHOW SESSION BINDINGS;

4.4 看板重新整理策略

-- 看板查詢建議使用READ ONLY模式,避免和OLTP爭搶資源
SET SESSION tidb_isolation_read_engines = 'tiflash';

-- 限制看板查詢的資源消耗
SET SESSION max_execution_time = 30000;

-- 使用prepared statement減少解析開銷
PREPARE dashboard_gmv FROM
SELECT
    DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
    SUM(amount) AS gmv
FROM orders
WHERE created_at >= ?
  AND status != 'cancelled'
GROUP BY hour_slot
ORDER BY hour_slot;

EXECUTE dashboard_gmv USING DATE_SUB(NOW(), INTERVAL 24 HOUR);

模式5:物化檢視建構資料市集

TiDB 8.x支援物化檢視(Materialized View),可以預計算聚合資料,建構面向業務的資料市集。

5.1 建立物化檢視

-- 建立日銷售彙總物化檢視
CREATE MATERIALIZED VIEW mv_daily_sales
REFRESH EVERY 1 HOUR
AS
SELECT
    DATE(created_at) AS sale_date,
    c.category_name,
    p.product_name,
    SUM(oi.quantity) AS total_quantity,
    SUM(oi.quantity * oi.unit_price) AS total_revenue,
    COUNT(DISTINCT o.user_id) AS unique_buyers,
    AVG(oi.quantity * oi.unit_price) AS avg_price
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.status IN ('paid', 'shipped', 'completed')
GROUP BY DATE(created_at), c.category_name, p.product_name;

5.2 建立使用者行為物化檢視

-- 使用者行為漏斗物化檢視
CREATE MATERIALIZED VIEW mv_user_funnel
REFRESH EVERY 5 MINUTE
AS
SELECT
    DATE(created_at) AS event_date,
    COUNT(DISTINCT CASE WHEN event_type = 'page_view' THEN user_id END) AS page_views,
    COUNT(DISTINCT CASE WHEN event_type = 'add_to_cart' THEN user_id END) AS add_to_carts,
    COUNT(DISTINCT CASE WHEN event_type = 'checkout' THEN user_id END) AS checkouts,
    COUNT(DISTINCT CASE WHEN event_type = 'payment' THEN user_id END) AS payments,
    ROUND(
        COUNT(DISTINCT CASE WHEN event_type = 'payment' THEN user_id END) * 100.0
        / NULLIF(COUNT(DISTINCT CASE WHEN event_type = 'page_view' THEN user_id END), 0),
        2
    ) AS view_to_pay_rate
FROM user_events
GROUP BY DATE(created_at);

5.3 管理物化檢視

-- 檢視物化檢視狀態
SHOW MATERIALIZED VIEWS;

-- 手動重新整理物化檢視
ALTER MATERIALIZED VIEW mv_daily_sales REFRESH;

-- 暫停自動重新整理
ALTER MATERIALIZED VIEW mv_daily_sales DISABLE;

-- 恢復自動重新整理
ALTER MATERIALIZED VIEW mv_daily_sales ENABLE;

-- 刪除物化檢視
DROP MATERIALIZED VIEW IF EXISTS mv_daily_sales;

5.4 基於物化檢視的看板查詢

-- 查詢物化檢視(毫秒級回應)
SELECT
    sale_date,
    category_name,
    total_revenue,
    unique_buyers
FROM mv_daily_sales
WHERE sale_date >= DATE_SUB(CURDATE(), INTERVAL 7 DAY)
ORDER BY total_revenue DESC
LIMIT 20;

模式6:生產HTAP部署與監控

生產環境的HTAP部署需要考慮資源隔離、監控告警和故障恢復。

6.1 資源管控設定

-- 建立OLTP資源組(高優先級)
CREATE RESOURCE GROUP IF NOT EXISTS oltp_group
  RU_PER_SEC = 20000
  PRIORITY = HIGH;

-- 建立OLAP資源組(低優先級,限制資源)
CREATE RESOURCE GROUP IF NOT EXISTS olap_group
  RU_PER_SEC = 8000
  PRIORITY = LOW;

-- 將使用者繫結到資源組
ALTER USER app_user RESOURCE GROUP oltp_group;
ALTER USER dashboard_user RESOURCE GROUP olap_group;
ALTER USER etl_user RESOURCE GROUP olap_group;

-- 檢視資源組設定
SELECT * FROM information_schema.resource_groups;

6.2 TiFlash關鍵監控指標

# Grafana Dashboard 關鍵指標
# TiFlash Summary Dashboard

tiflash_system_profile:
  - metric: tiflash_storage_write_duration_seconds
    alert_threshold: "> 5s"
    description: "TiFlash寫入延遲"
  
  - metric: tiflash_mpp_task_duration_seconds
    alert_threshold: "> 60s"
    description: "MPP任務執行時長"
  
  - metric: tiflash_storage_segment_count
    alert_threshold: "> 10000"
    description: "Segment數量(過多需compaction)"
  
  - metric: tiflash_cpu_usage
    alert_threshold: "> 80%"
    description: "TiFlash CPU使用率"
  
  - metric: tiflash_storage_disk_usage_ratio
    alert_threshold: "> 0.8"
    description: "TiFlash磁碟使用率"

6.3 Prometheus告警規則

# tidb-htap-alerts.yaml
groups:
  - name: tidb_htap
    rules:
      - alert: TiFlashReplicaSyncLag
        expr: |
          tiflash_replica_available == 0
          or (tiflash_replica_progress < 0.95 and tiflash_replica_progress > 0)
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "TiFlash副本同步延遲"
          description: "表 {{ $labels.table_name }} 的TiFlash副本同步進度僅 {{ $value }}"

      - alert: TiFlashHighWriteDuration
        expr: |
          histogram_quantile(0.95,
            rate(tiflash_storage_write_duration_seconds_bucket[5m])
          ) > 5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "TiFlash寫入延遲過高"
          description: "P95寫入延遲 {{ $value }}s"

      - alert: MPPQueryTimeout
        expr: |
          rate(tiflash_mpp_task_error_count{reason="timeout"}[5m]) > 0.1
        for: 3m
        labels:
          severity: critical
        annotations:
          summary: "MPP查詢超時率過高"
          description: "MPP查詢超時率 {{ $value }}/s"

      - alert: TiFlashDiskSpaceLow
        expr: |
          tiflash_storage_disk_usage_ratio > 0.85
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "TiFlash磁碟空間不足"
          description: "磁碟使用率 {{ $value }}"

6.4 日常運維腳本

#!/bin/bash
# tidb-htap-healthcheck.sh
# TiDB HTAP健康檢查腳本

CLUSTER_NAME="tidb-prod"
LOG_FILE="/var/log/tidb-htap-healthcheck.log"

echo "=== TiDB HTAP Health Check $(date) ===" >> $LOG_FILE

# 檢查TiFlash節點狀態
echo "--- TiFlash Node Status ---" >> $LOG_FILE
tiup cluster display $CLUSTER_NAME 2>/dev/null | grep tiflash >> $LOG_FILE

# 檢查TiFlash副本同步狀態
echo "--- TiFlash Replica Sync Status ---" >> $LOG_FILE
mysql -h10.0.1.10 -P4000 -uroot -e "
SELECT table_schema, table_name, replica_count, available, progress
FROM information_schema.tiflash_replica
WHERE available = 0 OR progress < 1.0;
" >> $LOG_FILE 2>&1

# 檢查TiCDC同步延遲
echo "--- TiCDC Sync Lag ---" >> $LOG_FILE
cdc cli changefeed list --server=http://10.0.1.31:8300 >> $LOG_FILE 2>&1

# 檢查MPP查詢效能
echo "--- MPP Query Performance ---" >> $LOG_FILE
mysql -h10.0.1.10 -P4000 -uroot -e "
SELECT DIGEST_TEXT, AVG_LATENCY, EXEC_COUNT
FROM information_schema.statements_summary
WHERE DIGEST_TEXT LIKE '%GROUP BY%'
  AND AVG_LATENCY > 1000000000
ORDER BY AVG_LATENCY DESC
LIMIT 10;
" >> $LOG_FILE 2>&1

echo "=== Health Check Complete ===" >> $LOG_FILE

5大踩坑指南

❌ 坑1:TiFlash副本未同步就查詢

-- ❌ 錯誤:剛新增TiFlash副本就查詢,資料不完整
ALTER TABLE orders SET TIFLASH REPLICA 1;
-- 立即查詢,可能讀到不完整資料
SELECT COUNT(*) FROM orders;

-- ✅ 正確:等待副本同步完成
ALTER TABLE orders SET TIFLASH REPLICA 1;

-- 檢查同步進度
SELECT table_name, progress, available
FROM information_schema.tiflash_replica
WHERE table_name = 'orders';

-- 等待 available = 1 後再查詢
SELECT COUNT(*) FROM orders;

❌ 坑2:小表也加TiFlash副本

-- ❌ 錯誤:小表加TiFlash副本反而更慢(網路開銷 > 掃描開銷)
ALTER TABLE config_dict SET TIFLASH REPLICA 1;

-- ✅ 正確:小表保持在TiKV,只為大表新增TiFlash副本
-- 建議規則:資料量 > 100萬行才考慮TiFlash副本
-- 小表透過Hint強制走TiKV
SELECT /*+ read_from_storage(tikv[config_dict]) */
    * FROM config_dict WHERE type = 'payment';

❌ 坑3:OLAP查詢影響OLTP效能

-- ❌ 錯誤:OLAP和OLTP使用同一使用者,資源無隔離
SELECT user, resource_group FROM mysql.user WHERE user = 'app_user';
-- app_user | default

-- ✅ 正確:使用Resource Group隔離
CREATE RESOURCE GROUP IF NOT EXISTS oltp_group
  RU_PER_SEC = 20000 PRIORITY = HIGH;
CREATE RESOURCE GROUP IF NOT EXISTS olap_group
  RU_PER_SEC = 8000 PRIORITY = LOW;

ALTER USER app_user RESOURCE GROUP oltp_group;
ALTER USER dashboard_user RESOURCE GROUP olap_group;

-- OLAP查詢設定超時
SET SESSION max_execution_time = 60000;

❌ 坑4:TiFlash節點磁碟空間不足

# ❌ 錯誤:TiFlash資料目錄和系統碟共用
tiflash_servers:
  - host: 10.0.1.21
    data_dir: /var/lib/tiflash/data  # 系統碟空間有限

# ✅ 正確:TiFlash使用獨立資料碟,預留30%空間
tiflash_servers:
  - host: 10.0.1.21
    data_dir: /data/tiflash/data    # 獨立資料碟
    config:
      storage:
        main:
          capacity: "2TB"           # 明確容量限制
        latest:
          capacity: "200GB"

❌ 坑5:忽略TiCDC同步延遲告警

# ❌ 錯誤:不監控TiCDC延遲,資料同步斷了都不知道
# 沒有任何告警設定

# ✅ 正確:設定TiCDC延遲告警
# Prometheus告警規則
- alert: TiCDCReplicationLag
  expr: |
    tidb_cdc_checkpoint_ts_lag_seconds > 300
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "TiCDC同步延遲超過5分鐘"
    description: "changefeed {{ $labels.changefeed }} 延遲 {{ $value }}s"

錯誤排查速查表

錯誤現象 可能原因 排查指令/方法 解決方案
TiFlash副本同步卡住 TiFlash節點磁碟滿或當機 SELECT * FROM information_schema.tiflash_replica WHERE progress < 1.0; 擴容磁碟或重啟TiFlash節點
MPP查詢報 Too many MPP tasks 併發MPP查詢過多 SHOW CONFIG WHERE name LIKE '%mpp%'; 調整 tidb_mpp_task_limit 或降低併發
查詢未走TiFlash 最佳化器判斷TiKV更優 EXPLAIN SELECT ...; 檢視算子 使用 /*+ read_from_storage(tiflash[t]) */ Hint
TiCDC同步延遲增大 下游寫入慢或網路抖動 cdc cli changefeed query --changefeed-id=xxx 擴容下游或增加TiCDC worker數
TiFlash OOM 查詢資料量過大 grep OOM /data/tiflash/log/tiflash.log 增加 max_memory_usage 或縮小查詢範圍
看板查詢超時 未限制查詢資源 SHOW PROCESSLIST; 設定 max_execution_time 和 Resource Group
物化檢視重新整理失敗 來源表資料量過大或鎖衝突 SHOW MATERIALIZED VIEWS; 調整重新整理頻率或最佳化來源表索引
TiFlash寫入延遲高 Raft Learner同步壓力大 Grafana TiFlash Dashboard 擴容TiFlash節點或調整 raft-store.sync-log
MPP Join資料傾斜 Join Key分佈不均 EXPLAIN ANALYZE 檢視各節點資料量 使用 /*+ shuffle_join() */ 或增加分桶數
TiFlash磁碟IO高 Compaction或大量掃描 iostat -x 1 檢查IO 調整 bg_task_io_limit 或擴容磁碟

進階最佳化技巧

3.1 TiFlash儲存引擎調優

-- 調整TiFlash Segment分裂閾值(減少小Segment)
-- 在TiFlash設定中修改
# tiflash-config.yaml
storage:
  main:
    segment_delta_rows: 8192
    segment_delta_size: 536870912
    segment_stable_rows: 10485760
  latest:
    segment_delta_rows: 4096
    segment_delta_size: 134217728

raft:
  apply_batch: 1024
  sync_log: false

engine:
  dt:
    enable_read_thread: true
    read_thread_count: 8
    enable_leaf_dt_segment_optimize: true

3.2 MPP查詢進階最佳化

-- 1. 使用Runtime Filter減少Join資料傳輸
-- TiDB 8.x預設開啟,確認狀態
SET SESSION tidb_runtime_filter_mode = "LOCAL";

-- 2. 調整MPP Exchange資料壓縮
SET SESSION tidb_exchange_compression = 'ON';
SET SESSION tidb_exchange_compression_level = 3;

-- 3. 大表Join最佳化:使用Hash Join Hint
SELECT /*+ hash_join_build(o) */
    c.category_name,
    SUM(oi.quantity * oi.unit_price) AS total_sales
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN categories c ON oi.product_id = c.category_id
GROUP BY c.category_name;

-- 4. 使用Index Lookup Join替代Hash Join(適合小表驅動大表)
SELECT /*+ inl_join(p) */
    o.order_id, p.product_name
FROM orders o
JOIN products p ON o.product_id = p.product_id
WHERE o.created_at >= CURDATE();

3.3 多租戶HTAP資源隔離

-- 為不同業務線建立獨立資源組
CREATE RESOURCE GROUP IF NOT EXISTS rg_ecommerce
  RU_PER_SEC = 15000 PRIORITY = HIGH
  QUERY_LIMIT = (EXECUTION_TIME = 30s);

CREATE RESOURCE GROUP IF NOT EXISTS rg_analytics
  RU_PER_SEC = 10000 PRIORITY = LOW
  QUERY_LIMIT = (EXECUTION_TIME = 120s);

CREATE RESOURCE GROUP IF NOT EXISTS rg_data_science
  RU_PER_SEC = 5000 PRIORITY = LOW
  QUERY_LIMIT = (EXECUTION_TIME = 600s);

-- 繫結使用者到資源組
ALTER USER ecommerce_app RESOURCE GROUP rg_ecommerce;
ALTER USER analytics_user RESOURCE GROUP rg_analytics;
ALTER USER ds_user RESOURCE GROUP rg_data_science;

-- 監控資源組使用情況
SELECT * FROM information_schema.resource_group_usage;

HTAP方案對比

維度 TiDB HTAP ClickHouse + MySQL Snowflake Apache Doris SingleStore
架構 一體化HTAP 雙系統拼接 雲原生OLAP 一體化HTAP 一體化HTAP
OLTP能力 強(MySQL相容) MySQL原生 中等
OLAP效能 強(MPP) 極強 極強
資料同步 Raft Learner(自動) 需自建ETL 需匯入 自動 自動
即時性 秒級 分鐘級 分鐘級 秒級 秒級
MySQL相容 95%+ 原生 不相容 中等 中等
水平擴展 透明擴展 各自擴展 自動擴展 透明擴展 透明擴展
運維複雜度 中等 中等 中等
成本 中等 高(雙系統) 高(按量付費) 中等
開源 完全開源 部分開源 閉源 完全開源 部分開源
適用場景 通用HTAP 已有MySQL+需強OLAP 純分析 即時數倉 即時分析

總結

TiDB HTAP不是「既能做OLTP又能做OLAP」的行銷話術,而是透過Raft Learner實現行存到列存的即時同步,透過MPP引擎實現分佈式平行計算,透過Resource Control實現OLTP/OLAP資源隔離的工程化解決方案。2026年的今天,如果你還在維護MySQL + ClickHouse的雙系統架構,是時候認真評估TiDB HTAP了——一套系統、一份運維、零資料延遲,這才是即時分析該有的樣子。


推薦工具

本站提供瀏覽器本地工具,免註冊即可試用 →

#TiDB#HTAP#实时分析#TiFlash#分布式数据库#2026#列存