TiDB HTAP实时分析实战:从TiFlash到实时看板的6种生产模式

数据库

T+1分析已死,实时洞察才是未来

凌晨3点,线上订单暴涨10倍,CEO问:"现在各品类实时GMV多少?"——你只能回答:"数据还在ETL管道里,明天才能看到。"

这就是传统T+1分析的困境:

  • ETL管道脆弱:Kafka → Flink → ClickHouse,任何一环故障,数据断流
  • 数据延迟不可控:从OLTP到OLAP,分钟级到小时级延迟是常态
  • 双系统运维噩梦:MySQL负责交易,ClickHouse负责分析,数据一致性靠人肉
  • 成本失控:两套存储、两套计算、两套运维,资源利用率低下
  • 实时性不足:业务决策需要秒级响应,T+1报表已成历史遗留

TiDB HTAP的承诺:一套系统,同时承载OLTP和OLAP,数据零延迟从行存同步到列存。这不是PPT架构,而是已经在数千家企业生产验证的方案。


HTAP核心概念速查

概念 全称 核心作用 类比
HTAP Hybrid Transactional/Analytical Processing 一套系统同时处理事务和分析 一人分饰交易员和分析师
TiFlash TiDB Flash 列式存储引擎,承载OLAP查询 OLAP的专属加速器
TiKV TiDB Key-Value 行式存储引擎,承载OLTP事务 OLTP的基石
MPP Massively Parallel Processing 多节点并行计算框架 多人同时算不同分片
列式存储 Columnar Storage 按列存储数据,分析查询只读需要的列 只翻需要的书页
Raft Learner Raft一致性协议学习者角色 异步复制数据到TiFlash,不影响主流程 影子抄写员
数据集市 Data Mart 面向特定业务域的预聚合数据集合 为每个部门定制的报表橱窗

💡 使用 JSON 格式化 工具格式化 TiDB 配置文件,便于排查问题。


传统实时分析架构的5大痛点

痛点1:ETL管道是最大的单点故障

MySQL → Canal → Kafka → Flink → ClickHouse
  │        │       │       │        │
  │        │       │       │        └─ 写入失败,数据丢失
  │        │       │       └─ 作业崩溃,数据积压
  │        │       └─ 分区不均,消费延迟
  │        └─ Binlog解析异常
  └─ 主从延迟

任何一个环节出问题,整条链路断裂。凌晨告警不断,运维苦不堪言。

痛点2:数据一致性无法保证

OLTP和OLAP是两套独立系统,数据同步存在延迟窗口:

-- MySQL: 订单已退款
UPDATE orders SET status = 'refunded' WHERE id = 10086;

-- ClickHouse: 5分钟后才看到退款状态
-- 这5分钟内,报表数据是错的

痛点3:双系统运维成本翻倍

运维项 MySQL ClickHouse 合计
监控告警 Grafana + Prometheus Grafana + Prometheus 2套
备份恢复 mysqldump / XtraBackup clickhouse-backup 2套
容量规划 主从 + 分库分表 分布式表 + 副本 2套
版本升级 滚动升级 滚动升级 2套
人员要求 DBA ClickHouse专家 2个团队

痛点4:查询性能与数据新鲜度的矛盾

  • ClickHouse追求查询性能,需要MergeTree合并,数据有延迟
  • 要实时性就要用ReplacingMergeTree,但查询需要FINAL,性能暴跌
  • 实时性和性能,鱼和熊掌不可兼得

痛点5:开发效率低下

// 开发者需要维护两套数据模型
public class OrderService {
    // OLTP: MySQL
    @Transactional
    public void createOrder(Order order) {
        orderMapper.insert(order);
    }
    
    // OLAP: ClickHouse - 还要写一遍
    public void syncToClickHouse(Order order) {
        clickHouseMapper.insert(order);
        // 字段映射、类型转换、异常处理...
    }
}

6种生产级HTAP实战模式

模式1:TiFlash列存副本搭建

TiFlash通过Raft Learner角色异步接收TiKV的数据变更,实现行存到列存的实时同步。

1.1 部署TiFlash节点

# tiup topology.yaml 片段
tiflash_servers:
  - host: 10.0.1.21
    tcp_port: 9000
    http_port: 8123
    flash_service_port: 3930
    flash_proxy_port: 20170
    flash_proxy_status_port: 20292
    metrics_interval: 15
    data_dir: /data/tiflash/data
    deploy_dir: /data/tiflash/deploy
    config:
      logger:
        level: information
      storage:
        main:
          capacity: "2TB"
        latest:
          capacity: "200GB"
  - host: 10.0.1.22
    tcp_port: 9000
    http_port: 8123
    flash_service_port: 3930
    flash_proxy_port: 20170
    flash_proxy_status_port: 20292
    data_dir: /data/tiflash/data
    deploy_dir: /data/tiflash/deploy
# 扩容TiFlash节点
tiup cluster scale-out tidb-prod topology-tiflash.yaml

# 验证TiFlash节点状态
tiup cluster display tidb-prod | grep tiflash

1.2 为表添加TiFlash副本

-- 为整张表添加TiFlash副本(默认1副本)
ALTER TABLE orders SET TIFLASH REPLICA 1;

-- 为表添加2个TiFlash副本(高可用)
ALTER TABLE orders SET TIFLASH REPLICA 2;

-- 为分区表添加TiFlash副本
ALTER TABLE orders_partition SET TIFLASH REPLICA 1;

-- 查看副本同步进度
SELECT * FROM information_schema.tiflash_replica
WHERE table_name = 'orders';

同步进度输出示例:

+----------+------------+----------+---------------+------------------+-----------+
| TABLE_ID | TABLE_NAME | REPLICA  | LOCATION_LABELS| AVAILABLE        | PROGRESS  |
+----------+------------+----------+---------------+------------------+-----------+
|       85 | orders     |        1 |               | True             | 1.0       |
+----------+------------+----------+---------------+------------------+-----------+

1.3 按列添加TiFlash副本(节省存储)

-- 只为分析需要的列创建TiFlash副本
ALTER TABLE orders SET TIFLASH REPLICA 1
  COLUMNS (order_id, user_id, product_id, amount, status, created_at);

-- 验证列级副本
SHOW CREATE TABLE orders;

模式2:MPP分布式查询加速

TiDB的MPP引擎将分析查询下推到多个TiFlash节点并行执行,实现接近ClickHouse的查询性能。

2.1 启用MPP模式

-- 查看当前MPP模式
SHOW VARIABLES LIKE 'tidb_allow_mpp';

-- 启用MPP(默认已启用)
SET SESSION tidb_allow_mpp = 1;

-- 强制使用MPP(用于测试验证)
SET SESSION tidb_enforce_mpp = 1;

-- 设置MPP查询的Exchange压缩
SET SESSION tidb_exchange_compression = 'ON';

2.2 MPP查询实战:实时销售分析

-- 场景:实时统计各品类今日销售额Top10
-- MPP会自动将计算下推到TiFlash节点并行执行

SELECT
    c.category_name,
    SUM(oi.quantity * oi.unit_price) AS total_sales,
    COUNT(DISTINCT o.user_id) AS unique_buyers,
    AVG(oi.quantity * oi.unit_price) AS avg_order_value
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.created_at >= CURDATE()
  AND o.status IN ('paid', 'shipped', 'completed')
GROUP BY c.category_name
ORDER BY total_sales DESC
LIMIT 10;

2.3 验证MPP执行计划

-- 查看执行计划,确认MPP算子
EXPLAIN ANALYZE
SELECT
    c.category_name,
    SUM(oi.quantity * oi.unit_price) AS total_sales
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.created_at >= CURDATE()
GROUP BY c.category_name;

关键执行计划特征:

+------------------------------------------+----------+---------+--------------+
| id                                       | task     | rows    | operator info|
+------------------------------------------+----------+---------+--------------+
| TableReader_45                           | root     | 10      | ...          |
| └─ExchangeSender_44                      | mpp[tif] | 10      | ExchangeType |
|   └─Projection_42                        | mpp[tif] | 10      | ...          |
|     └─HashAgg_43                         | mpp[tif] | 10      | group by:... |
|       └─ExchangeReceiver_38              | mpp[tif] | 100     | ...          |
|         └─ExchangeSender_37              | mpp[tif] | 100     | ExchangeType |
|           └─HashJoin_36                  | mpp[tif] | 100     | inner join   |
|             ├─Selection_25(Build)        | mpp[tif] | 500     | ...          |
|             │ └─TableFullScan_24         | mpp[tif] | 10000   | tikv:tiflash |
|             └─Selection_30(Probe)        | mpp[tif] | 800     | ...          |
|               └─TableFullScan_29         | mpp[tif] | 20000   | tikv:tiflash |
+------------------------------------------+----------+---------+--------------+

看到 mpp[tif]ExchangeSender/Receiver 说明MPP已生效。

2.4 MPP性能对比

-- 关闭MPP,使用传统TiKV扫描
SET SESSION tidb_allow_mpp = 0;

EXPLAIN ANALYZE
SELECT c.category_name, SUM(oi.quantity * oi.unit_price) AS total_sales
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.created_at >= CURDATE()
GROUP BY c.category_name;

-- 重新启用MPP
SET SESSION tidb_allow_mpp = 1;
查询模式 数据量 执行时间 扫描行数
TiKV扫描 1亿行 45.2s 100,000,000
TiFlash非MPP 1亿行 12.8s 100,000,000
TiFlash MPP 1亿行 3.2s 100,000,000

模式3:TiCDC实时ETL替代

TiCDC(Change Data Capture)可以替代传统ETL管道,将TiDB的数据变更实时同步到下游系统。

3.1 部署TiCDC

# tiup topology.yaml 片段
cdc_servers:
  - host: 10.0.1.31
    port: 8300
    deploy_dir: /data/cdc/deploy
    data_dir: /data/cdc/data
    config:
      per-table-memory-quota: 10485760
      sink-memory-quota: 536870912
  - host: 10.0.1.32
    port: 8300
    deploy_dir: /data/cdc/deploy
    data_dir: /data/cdc/data
# 扩容TiCDC
tiup cluster scale-out tidb-prod topology-cdc.yaml

# 验证TiCDC状态
tiup cluster display tidb-prod | grep cdc

3.2 创建同步任务:替代MySQL → Kafka → Flink链路

# 创建TiCDC同步任务:TiDB → Kafka(替代Canal + Kafka)
cdc cli changefeed create \
  --server=http://10.0.1.31:8300 \
  --sink-uri="kafka://kafka-cluster:9092/tidb-orders?partition-num=6&max-message-bytes=10485760&replication-factor=3" \
  --changefeed-id="orders-to-kafka" \
  --config=cdc-orders.yaml
# cdc-orders.yaml
filter:
  rules:
    - 'ecommerce.orders'
    - 'ecommerce.order_items'
    - 'ecommerce.payments'
  ignore_txn_start_ts: []

mounter:
  worker-num: 16

sink:
  protocol: "canal-json"
  dispatchers:
    - matcher: ['ecommerce.*']
      partition: "index-value"

selector:
  rules:
    - 'ecommerce.orders'
    - 'ecommerce.order_items'
    - 'ecommerce.payments'

3.3 直接同步到ClickHouse(如果仍需保留)

# TiCDC → ClickHouse 直连
cdc cli changefeed create \
  --server=http://10.0.1.31:8300 \
  --sink-uri="clickhouse://clickhouse-server:8123?database=analytics&username=default&password=***" \
  --changefeed-id="orders-to-clickhouse" \
  --config=cdc-ch.yaml

3.4 TiCDC同步到S3(数据湖方案)

# TiCDC → S3 数据湖
cdc cli changefeed create \
  --server=http://10.0.1.31:8300 \
  --sink-uri="s3://data-lake/tidb-cdc/?endpoint=https://s3.amazonaws.com&access-key=***&secret-access-key=***" \
  --changefeed-id="orders-to-s3" \
  --config=cdc-s3.yaml
# cdc-s3.yaml
sink:
  protocol: "csv"
  column-selectors:
    - matcher: ['ecommerce.orders']
      columns: ['order_id', 'user_id', 'amount', 'status', 'created_at']

file:
  size: 67108864
  interval: 5m

filter:
  rules:
    - 'ecommerce.orders'
    - 'ecommerce.order_items'

3.5 监控TiCDC同步延迟

-- 查看同步任务状态
cdc cli changefeed list \
  --server=http://10.0.1.31:8300;

-- 查看同步延迟
cdc cli changefeed query \
  --changefeed-id=orders-to-kafka \
  --server=http://10.0.1.31:8300;

关键指标:

{
  "state": "normal",
  "tso": 446685327530688513,
  "checkpoint": "2026-06-15 10:30:00.123",
  "lag": "1.2s"
}

模式4:看板查询优化

实时看板是HTAP最直接的应用场景,但查询性能直接影响用户体验。

4.1 看板查询模式识别

-- 典型看板查询1:实时GMV
SELECT
    DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
    SUM(amount) AS gmv,
    COUNT(*) AS order_count
FROM orders
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
  AND status != 'cancelled'
GROUP BY hour_slot
ORDER BY hour_slot;

-- 典型看板查询2:实时用户活跃
SELECT
    DATE_FORMAT(login_time, '%Y-%m-%d %H:00') AS hour_slot,
    COUNT(DISTINCT user_id) AS dau,
    COUNT(*) AS total_actions
FROM user_actions
WHERE login_time >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY hour_slot
ORDER BY hour_slot;

-- 典型看板查询3:实时转化漏斗
SELECT
    COUNT(DISTINCT CASE WHEN event = 'view' THEN user_id END) AS views,
    COUNT(DISTINCT CASE WHEN event = 'cart' THEN user_id END) AS carts,
    COUNT(DISTINCT CASE WHEN event = 'pay' THEN user_id END) AS pays,
    ROUND(COUNT(DISTINCT CASE WHEN event = 'pay' THEN user_id END) * 100.0
      / NULLIF(COUNT(DISTINCT CASE WHEN event = 'view' THEN user_id END), 0), 2) AS conversion_rate
FROM user_events
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 1 HOUR);

4.2 强制查询走TiFlash

-- 方式1:通过Hint强制走TiFlash
SELECT /*+ read_from_storage(tiflash[orders]) */
    DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
    SUM(amount) AS gmv
FROM orders
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY hour_slot;

-- 方式2:设置Session变量
SET SESSION tidb_isolation_read_engines = 'tiflash';
SELECT
    DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
    SUM(amount) AS gmv
FROM orders
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY hour_slot;
SET SESSION tidb_isolation_read_engines = 'tikv,tiflash';

-- 方式3:通过Resource Group隔离OLAP流量
CREATE RESOURCE GROUP IF NOT EXISTS olap_group
  RU_PER_SEC = 5000
  PRIORITY = LOW;

ALTER USER dashboard_user RESOURCE GROUP olap_group;

4.3 看板查询缓存优化

-- 使用Binding固定执行计划
CREATE SESSION BINDING FOR
SELECT SUM(amount) FROM orders WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
USING
SELECT /*+ read_from_storage(tiflash[orders]) */ SUM(amount)
FROM orders WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR);

-- 查看Binding
SHOW SESSION BINDINGS;

4.4 看板刷新策略

-- 看板查询建议使用READ ONLY模式,避免和OLTP争抢资源
SET SESSION tidb_isolation_read_engines = 'tiflash';

-- 限制看板查询的资源消耗
SET SESSION max_execution_time = 30000;

-- 使用prepared statement减少解析开销
PREPARE dashboard_gmv FROM
SELECT
    DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
    SUM(amount) AS gmv
FROM orders
WHERE created_at >= ?
  AND status != 'cancelled'
GROUP BY hour_slot
ORDER BY hour_slot;

EXECUTE dashboard_gmv USING DATE_SUB(NOW(), INTERVAL 24 HOUR);

模式5:物化视图构建数据集市

TiDB 8.x支持物化视图(Materialized View),可以预计算聚合数据,构建面向业务的数据集市。

5.1 创建物化视图

-- 创建日销售汇总物化视图
CREATE MATERIALIZED VIEW mv_daily_sales
REFRESH EVERY 1 HOUR
AS
SELECT
    DATE(created_at) AS sale_date,
    c.category_name,
    p.product_name,
    SUM(oi.quantity) AS total_quantity,
    SUM(oi.quantity * oi.unit_price) AS total_revenue,
    COUNT(DISTINCT o.user_id) AS unique_buyers,
    AVG(oi.quantity * oi.unit_price) AS avg_price
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.status IN ('paid', 'shipped', 'completed')
GROUP BY DATE(created_at), c.category_name, p.product_name;

5.2 创建用户行为物化视图

-- 用户行为漏斗物化视图
CREATE MATERIALIZED VIEW mv_user_funnel
REFRESH EVERY 5 MINUTE
AS
SELECT
    DATE(created_at) AS event_date,
    COUNT(DISTINCT CASE WHEN event_type = 'page_view' THEN user_id END) AS page_views,
    COUNT(DISTINCT CASE WHEN event_type = 'add_to_cart' THEN user_id END) AS add_to_carts,
    COUNT(DISTINCT CASE WHEN event_type = 'checkout' THEN user_id END) AS checkouts,
    COUNT(DISTINCT CASE WHEN event_type = 'payment' THEN user_id END) AS payments,
    ROUND(
        COUNT(DISTINCT CASE WHEN event_type = 'payment' THEN user_id END) * 100.0
        / NULLIF(COUNT(DISTINCT CASE WHEN event_type = 'page_view' THEN user_id END), 0),
        2
    ) AS view_to_pay_rate
FROM user_events
GROUP BY DATE(created_at);

5.3 管理物化视图

-- 查看物化视图状态
SHOW MATERIALIZED VIEWS;

-- 手动刷新物化视图
ALTER MATERIALIZED VIEW mv_daily_sales REFRESH;

-- 暂停自动刷新
ALTER MATERIALIZED VIEW mv_daily_sales DISABLE;

-- 恢复自动刷新
ALTER MATERIALIZED VIEW mv_daily_sales ENABLE;

-- 删除物化视图
DROP MATERIALIZED VIEW IF EXISTS mv_daily_sales;

5.4 基于物化视图的看板查询

-- 查询物化视图(毫秒级响应)
SELECT
    sale_date,
    category_name,
    total_revenue,
    unique_buyers
FROM mv_daily_sales
WHERE sale_date >= DATE_SUB(CURDATE(), INTERVAL 7 DAY)
ORDER BY total_revenue DESC
LIMIT 20;

模式6:生产HTAP部署与监控

生产环境的HTAP部署需要考虑资源隔离、监控告警和故障恢复。

6.1 资源管控配置

-- 创建OLTP资源组(高优先级)
CREATE RESOURCE GROUP IF NOT EXISTS oltp_group
  RU_PER_SEC = 20000
  PRIORITY = HIGH;

-- 创建OLAP资源组(低优先级,限制资源)
CREATE RESOURCE GROUP IF NOT EXISTS olap_group
  RU_PER_SEC = 8000
  PRIORITY = LOW;

-- 将用户绑定到资源组
ALTER USER app_user RESOURCE GROUP oltp_group;
ALTER USER dashboard_user RESOURCE GROUP olap_group;
ALTER USER etl_user RESOURCE GROUP olap_group;

-- 查看资源组配置
SELECT * FROM information_schema.resource_groups;

6.2 TiFlash关键监控指标

# Grafana Dashboard 关键指标
# TiFlash Summary Dashboard

tiflash_system_profile:
  - metric: tiflash_storage_write_duration_seconds
    alert_threshold: "> 5s"
    description: "TiFlash写入延迟"
  
  - metric: tiflash_mpp_task_duration_seconds
    alert_threshold: "> 60s"
    description: "MPP任务执行时长"
  
  - metric: tiflash_storage_segment_count
    alert_threshold: "> 10000"
    description: "Segment数量(过多需compaction)"
  
  - metric: tiflash_cpu_usage
    alert_threshold: "> 80%"
    description: "TiFlash CPU使用率"
  
  - metric: tiflash_storage_disk_usage_ratio
    alert_threshold: "> 0.8"
    description: "TiFlash磁盘使用率"

6.3 Prometheus告警规则

# tidb-htap-alerts.yaml
groups:
  - name: tidb_htap
    rules:
      - alert: TiFlashReplicaSyncLag
        expr: |
          tiflash_replica_available == 0
          or (tiflash_replica_progress < 0.95 and tiflash_replica_progress > 0)
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "TiFlash副本同步延迟"
          description: "表 {{ $labels.table_name }} 的TiFlash副本同步进度仅 {{ $value }}"

      - alert: TiFlashHighWriteDuration
        expr: |
          histogram_quantile(0.95,
            rate(tiflash_storage_write_duration_seconds_bucket[5m])
          ) > 5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "TiFlash写入延迟过高"
          description: "P95写入延迟 {{ $value }}s"

      - alert: MPPQueryTimeout
        expr: |
          rate(tiflash_mpp_task_error_count{reason="timeout"}[5m]) > 0.1
        for: 3m
        labels:
          severity: critical
        annotations:
          summary: "MPP查询超时率过高"
          description: "MPP查询超时率 {{ $value }}/s"

      - alert: TiFlashDiskSpaceLow
        expr: |
          tiflash_storage_disk_usage_ratio > 0.85
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "TiFlash磁盘空间不足"
          description: "磁盘使用率 {{ $value }}"

6.4 日常运维脚本

#!/bin/bash
# tidb-htap-healthcheck.sh
# TiDB HTAP健康检查脚本

CLUSTER_NAME="tidb-prod"
LOG_FILE="/var/log/tidb-htap-healthcheck.log"

echo "=== TiDB HTAP Health Check $(date) ===" >> $LOG_FILE

# 检查TiFlash节点状态
echo "--- TiFlash Node Status ---" >> $LOG_FILE
tiup cluster display $CLUSTER_NAME 2>/dev/null | grep tiflash >> $LOG_FILE

# 检查TiFlash副本同步状态
echo "--- TiFlash Replica Sync Status ---" >> $LOG_FILE
mysql -h10.0.1.10 -P4000 -uroot -e "
SELECT table_schema, table_name, replica_count, available, progress
FROM information_schema.tiflash_replica
WHERE available = 0 OR progress < 1.0;
" >> $LOG_FILE 2>&1

# 检查TiCDC同步延迟
echo "--- TiCDC Sync Lag ---" >> $LOG_FILE
cdc cli changefeed list --server=http://10.0.1.31:8300 >> $LOG_FILE 2>&1

# 检查MPP查询性能
echo "--- MPP Query Performance ---" >> $LOG_FILE
mysql -h10.0.1.10 -P4000 -uroot -e "
SELECT DIGEST_TEXT, AVG_LATENCY, EXEC_COUNT
FROM information_schema.statements_summary
WHERE DIGEST_TEXT LIKE '%GROUP BY%'
  AND AVG_LATENCY > 1000000000
ORDER BY AVG_LATENCY DESC
LIMIT 10;
" >> $LOG_FILE 2>&1

echo "=== Health Check Complete ===" >> $LOG_FILE

5大踩坑指南

❌ 坑1:TiFlash副本未同步就查询

-- ❌ 错误:刚添加TiFlash副本就查询,数据不完整
ALTER TABLE orders SET TIFLASH REPLICA 1;
-- 立即查询,可能读到不完整数据
SELECT COUNT(*) FROM orders;

-- ✅ 正确:等待副本同步完成
ALTER TABLE orders SET TIFLASH REPLICA 1;

-- 检查同步进度
SELECT table_name, progress, available
FROM information_schema.tiflash_replica
WHERE table_name = 'orders';

-- 等待 available = 1 后再查询
SELECT COUNT(*) FROM orders;

❌ 坑2:小表也加TiFlash副本

-- ❌ 错误:小表加TiFlash副本反而更慢(网络开销 > 扫描开销)
ALTER TABLE config_dict SET TIFLASH REPLICA 1;

-- ✅ 正确:小表保持在TiKV,只为大表添加TiFlash副本
-- 建议规则:数据量 > 100万行才考虑TiFlash副本
-- 小表通过Hint强制走TiKV
SELECT /*+ read_from_storage(tikv[config_dict]) */
    * FROM config_dict WHERE type = 'payment';

❌ 坑3:OLAP查询影响OLTP性能

-- ❌ 错误:OLAP和OLTP使用同一用户,资源无隔离
SELECT user, resource_group FROM mysql.user WHERE user = 'app_user';
-- app_user | default

-- ✅ 正确:使用Resource Group隔离
CREATE RESOURCE GROUP IF NOT EXISTS oltp_group
  RU_PER_SEC = 20000 PRIORITY = HIGH;
CREATE RESOURCE GROUP IF NOT EXISTS olap_group
  RU_PER_SEC = 8000 PRIORITY = LOW;

ALTER USER app_user RESOURCE GROUP oltp_group;
ALTER USER dashboard_user RESOURCE GROUP olap_group;

-- OLAP查询设置超时
SET SESSION max_execution_time = 60000;

❌ 坑4:TiFlash节点磁盘空间不足

# ❌ 错误:TiFlash数据目录和系统盘共用
tiflash_servers:
  - host: 10.0.1.21
    data_dir: /var/lib/tiflash/data  # 系统盘空间有限

# ✅ 正确:TiFlash使用独立数据盘,预留30%空间
tiflash_servers:
  - host: 10.0.1.21
    data_dir: /data/tiflash/data    # 独立数据盘
    config:
      storage:
        main:
          capacity: "2TB"           # 明确容量限制
        latest:
          capacity: "200GB"

❌ 坑5:忽略TiCDC同步延迟告警

# ❌ 错误:不监控TiCDC延迟,数据同步断了都不知道
# 没有任何告警配置

# ✅ 正确:配置TiCDC延迟告警
# Prometheus告警规则
- alert: TiCDCReplicationLag
  expr: |
    tidb_cdc_checkpoint_ts_lag_seconds > 300
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "TiCDC同步延迟超过5分钟"
    description: "changefeed {{ $labels.changefeed }} 延迟 {{ $value }}s"

错误排查速查表

错误现象 可能原因 排查命令/方法 解决方案
TiFlash副本同步卡住 TiFlash节点磁盘满或宕机 SELECT * FROM information_schema.tiflash_replica WHERE progress < 1.0; 扩容磁盘或重启TiFlash节点
MPP查询报 Too many MPP tasks 并发MPP查询过多 SHOW CONFIG WHERE name LIKE '%mpp%'; 调整 tidb_mpp_task_limit 或降低并发
查询未走TiFlash 优化器判断TiKV更优 EXPLAIN SELECT ...; 查看算子 使用 /*+ read_from_storage(tiflash[t]) */ Hint
TiCDC同步延迟增大 下游写入慢或网络抖动 cdc cli changefeed query --changefeed-id=xxx 扩容下游或增加TiCDC worker数
TiFlash OOM 查询数据量过大 grep OOM /data/tiflash/log/tiflash.log 增加 max_memory_usage 或缩小查询范围
看板查询超时 未限制查询资源 SHOW PROCESSLIST; 设置 max_execution_time 和 Resource Group
物化视图刷新失败 源表数据量过大或锁冲突 SHOW MATERIALIZED VIEWS; 调整刷新频率或优化源表索引
TiFlash写入延迟高 Raft Learner同步压力大 Grafana TiFlash Dashboard 扩容TiFlash节点或调整 raft-store.sync-log
MPP Join数据倾斜 Join Key分布不均 EXPLAIN ANALYZE 查看各节点数据量 使用 /*+ shuffle_join() */ 或增加分桶数
TiFlash磁盘IO高 Compaction或大量扫描 iostat -x 1 检查IO 调整 bg_task_io_limit 或扩容磁盘

高级优化技巧

3.1 TiFlash存储引擎调优

-- 调整TiFlash Segment分裂阈值(减少小Segment)
-- 在TiFlash配置中修改
# tiflash-config.yaml
storage:
  main:
    segment_delta_rows: 8192
    segment_delta_size: 536870912
    segment_stable_rows: 10485760
  latest:
    segment_delta_rows: 4096
    segment_delta_size: 134217728

raft:
  apply_batch: 1024
  sync_log: false

engine:
  dt:
    enable_read_thread: true
    read_thread_count: 8
    enable_leaf_dt_segment_optimize: true

3.2 MPP查询高级优化

-- 1. 使用Runtime Filter减少Join数据传输
-- TiDB 8.x默认开启,确认状态
SET SESSION tidb_runtime_filter_mode = "LOCAL";

-- 2. 调整MPP Exchange数据压缩
SET SESSION tidb_exchange_compression = 'ON';
SET SESSION tidb_exchange_compression_level = 3;

-- 3. 大表Join优化:使用Hash Join Hint
SELECT /*+ hash_join_build(o) */
    c.category_name,
    SUM(oi.quantity * oi.unit_price) AS total_sales
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN categories c ON oi.product_id = c.category_id
GROUP BY c.category_name;

-- 4. 使用Index Lookup Join替代Hash Join(适合小表驱动大表)
SELECT /*+ inl_join(p) */
    o.order_id, p.product_name
FROM orders o
JOIN products p ON o.product_id = p.product_id
WHERE o.created_at >= CURDATE();

3.3 多租户HTAP资源隔离

-- 为不同业务线创建独立资源组
CREATE RESOURCE GROUP IF NOT EXISTS rg_ecommerce
  RU_PER_SEC = 15000 PRIORITY = HIGH
  QUERY_LIMIT = (EXECUTION_TIME = 30s);

CREATE RESOURCE GROUP IF NOT EXISTS rg_analytics
  RU_PER_SEC = 10000 PRIORITY = LOW
  QUERY_LIMIT = (EXECUTION_TIME = 120s);

CREATE RESOURCE GROUP IF NOT EXISTS rg_data_science
  RU_PER_SEC = 5000 PRIORITY = LOW
  QUERY_LIMIT = (EXECUTION_TIME = 600s);

-- 绑定用户到资源组
ALTER USER ecommerce_app RESOURCE GROUP rg_ecommerce;
ALTER USER analytics_user RESOURCE GROUP rg_analytics;
ALTER USER ds_user RESOURCE GROUP rg_data_science;

-- 监控资源组使用情况
SELECT * FROM information_schema.resource_group_usage;

HTAP方案对比

维度 TiDB HTAP ClickHouse + MySQL Snowflake Apache Doris SingleStore
架构 一体化HTAP 双系统拼接 云原生OLAP 一体化HTAP 一体化HTAP
OLTP能力 强(MySQL兼容) MySQL原生 中等
OLAP性能 强(MPP) 极强 极强
数据同步 Raft Learner(自动) 需自建ETL 需导入 自动 自动
实时性 秒级 分钟级 分钟级 秒级 秒级
MySQL兼容 95%+ 原生 不兼容 中等 中等
水平扩展 透明扩展 各自扩展 自动扩展 透明扩展 透明扩展
运维复杂度 中等 中等 中等
成本 中等 高(双系统) 高(按量付费) 中等
开源 完全开源 部分开源 闭源 完全开源 部分开源
适用场景 通用HTAP 已有MySQL+需强OLAP 纯分析 实时数仓 实时分析

总结

TiDB HTAP不是"既能做OLTP又能做OLAP"的营销话术,而是通过Raft Learner实现行存到列存的实时同步,通过MPP引擎实现分布式并行计算,通过Resource Control实现OLTP/OLAP资源隔离的工程化解决方案。2026年的今天,如果你还在维护MySQL + ClickHouse的双系统架构,是时候认真评估TiDB HTAP了——一套系统、一份运维、零数据延迟,这才是实时分析该有的样子。


推荐工具

本站提供浏览器本地工具,免注册即可试用 →

#TiDB#HTAP#实时分析#TiFlash#分布式数据库#2026#列存