TiDB HTAP实时分析实战:从TiFlash到实时看板的6种生产模式
T+1分析已死,实时洞察才是未来
凌晨3点,线上订单暴涨10倍,CEO问:"现在各品类实时GMV多少?"——你只能回答:"数据还在ETL管道里,明天才能看到。"
这就是传统T+1分析的困境:
- ETL管道脆弱:Kafka → Flink → ClickHouse,任何一环故障,数据断流
- 数据延迟不可控:从OLTP到OLAP,分钟级到小时级延迟是常态
- 双系统运维噩梦:MySQL负责交易,ClickHouse负责分析,数据一致性靠人肉
- 成本失控:两套存储、两套计算、两套运维,资源利用率低下
- 实时性不足:业务决策需要秒级响应,T+1报表已成历史遗留
TiDB HTAP的承诺:一套系统,同时承载OLTP和OLAP,数据零延迟从行存同步到列存。这不是PPT架构,而是已经在数千家企业生产验证的方案。
HTAP核心概念速查
| 概念 | 全称 | 核心作用 | 类比 |
|---|---|---|---|
| HTAP | Hybrid Transactional/Analytical Processing | 一套系统同时处理事务和分析 | 一人分饰交易员和分析师 |
| TiFlash | TiDB Flash | 列式存储引擎,承载OLAP查询 | OLAP的专属加速器 |
| TiKV | TiDB Key-Value | 行式存储引擎,承载OLTP事务 | OLTP的基石 |
| MPP | Massively Parallel Processing | 多节点并行计算框架 | 多人同时算不同分片 |
| 列式存储 | Columnar Storage | 按列存储数据,分析查询只读需要的列 | 只翻需要的书页 |
| Raft Learner | Raft一致性协议学习者角色 | 异步复制数据到TiFlash,不影响主流程 | 影子抄写员 |
| 数据集市 | Data Mart | 面向特定业务域的预聚合数据集合 | 为每个部门定制的报表橱窗 |
💡 使用 JSON 格式化 工具格式化 TiDB 配置文件,便于排查问题。
传统实时分析架构的5大痛点
痛点1:ETL管道是最大的单点故障
MySQL → Canal → Kafka → Flink → ClickHouse
│ │ │ │ │
│ │ │ │ └─ 写入失败,数据丢失
│ │ │ └─ 作业崩溃,数据积压
│ │ └─ 分区不均,消费延迟
│ └─ Binlog解析异常
└─ 主从延迟
任何一个环节出问题,整条链路断裂。凌晨告警不断,运维苦不堪言。
痛点2:数据一致性无法保证
OLTP和OLAP是两套独立系统,数据同步存在延迟窗口:
-- MySQL: 订单已退款
UPDATE orders SET status = 'refunded' WHERE id = 10086;
-- ClickHouse: 5分钟后才看到退款状态
-- 这5分钟内,报表数据是错的
痛点3:双系统运维成本翻倍
| 运维项 | MySQL | ClickHouse | 合计 |
|---|---|---|---|
| 监控告警 | Grafana + Prometheus | Grafana + Prometheus | 2套 |
| 备份恢复 | mysqldump / XtraBackup | clickhouse-backup | 2套 |
| 容量规划 | 主从 + 分库分表 | 分布式表 + 副本 | 2套 |
| 版本升级 | 滚动升级 | 滚动升级 | 2套 |
| 人员要求 | DBA | ClickHouse专家 | 2个团队 |
痛点4:查询性能与数据新鲜度的矛盾
- ClickHouse追求查询性能,需要MergeTree合并,数据有延迟
- 要实时性就要用ReplacingMergeTree,但查询需要FINAL,性能暴跌
- 实时性和性能,鱼和熊掌不可兼得
痛点5:开发效率低下
// 开发者需要维护两套数据模型
public class OrderService {
// OLTP: MySQL
@Transactional
public void createOrder(Order order) {
orderMapper.insert(order);
}
// OLAP: ClickHouse - 还要写一遍
public void syncToClickHouse(Order order) {
clickHouseMapper.insert(order);
// 字段映射、类型转换、异常处理...
}
}
6种生产级HTAP实战模式
模式1:TiFlash列存副本搭建
TiFlash通过Raft Learner角色异步接收TiKV的数据变更,实现行存到列存的实时同步。
1.1 部署TiFlash节点
# tiup topology.yaml 片段
tiflash_servers:
- host: 10.0.1.21
tcp_port: 9000
http_port: 8123
flash_service_port: 3930
flash_proxy_port: 20170
flash_proxy_status_port: 20292
metrics_interval: 15
data_dir: /data/tiflash/data
deploy_dir: /data/tiflash/deploy
config:
logger:
level: information
storage:
main:
capacity: "2TB"
latest:
capacity: "200GB"
- host: 10.0.1.22
tcp_port: 9000
http_port: 8123
flash_service_port: 3930
flash_proxy_port: 20170
flash_proxy_status_port: 20292
data_dir: /data/tiflash/data
deploy_dir: /data/tiflash/deploy
# 扩容TiFlash节点
tiup cluster scale-out tidb-prod topology-tiflash.yaml
# 验证TiFlash节点状态
tiup cluster display tidb-prod | grep tiflash
1.2 为表添加TiFlash副本
-- 为整张表添加TiFlash副本(默认1副本)
ALTER TABLE orders SET TIFLASH REPLICA 1;
-- 为表添加2个TiFlash副本(高可用)
ALTER TABLE orders SET TIFLASH REPLICA 2;
-- 为分区表添加TiFlash副本
ALTER TABLE orders_partition SET TIFLASH REPLICA 1;
-- 查看副本同步进度
SELECT * FROM information_schema.tiflash_replica
WHERE table_name = 'orders';
同步进度输出示例:
+----------+------------+----------+---------------+------------------+-----------+
| TABLE_ID | TABLE_NAME | REPLICA | LOCATION_LABELS| AVAILABLE | PROGRESS |
+----------+------------+----------+---------------+------------------+-----------+
| 85 | orders | 1 | | True | 1.0 |
+----------+------------+----------+---------------+------------------+-----------+
1.3 按列添加TiFlash副本(节省存储)
-- 只为分析需要的列创建TiFlash副本
ALTER TABLE orders SET TIFLASH REPLICA 1
COLUMNS (order_id, user_id, product_id, amount, status, created_at);
-- 验证列级副本
SHOW CREATE TABLE orders;
模式2:MPP分布式查询加速
TiDB的MPP引擎将分析查询下推到多个TiFlash节点并行执行,实现接近ClickHouse的查询性能。
2.1 启用MPP模式
-- 查看当前MPP模式
SHOW VARIABLES LIKE 'tidb_allow_mpp';
-- 启用MPP(默认已启用)
SET SESSION tidb_allow_mpp = 1;
-- 强制使用MPP(用于测试验证)
SET SESSION tidb_enforce_mpp = 1;
-- 设置MPP查询的Exchange压缩
SET SESSION tidb_exchange_compression = 'ON';
2.2 MPP查询实战:实时销售分析
-- 场景:实时统计各品类今日销售额Top10
-- MPP会自动将计算下推到TiFlash节点并行执行
SELECT
c.category_name,
SUM(oi.quantity * oi.unit_price) AS total_sales,
COUNT(DISTINCT o.user_id) AS unique_buyers,
AVG(oi.quantity * oi.unit_price) AS avg_order_value
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.created_at >= CURDATE()
AND o.status IN ('paid', 'shipped', 'completed')
GROUP BY c.category_name
ORDER BY total_sales DESC
LIMIT 10;
2.3 验证MPP执行计划
-- 查看执行计划,确认MPP算子
EXPLAIN ANALYZE
SELECT
c.category_name,
SUM(oi.quantity * oi.unit_price) AS total_sales
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.created_at >= CURDATE()
GROUP BY c.category_name;
关键执行计划特征:
+------------------------------------------+----------+---------+--------------+
| id | task | rows | operator info|
+------------------------------------------+----------+---------+--------------+
| TableReader_45 | root | 10 | ... |
| └─ExchangeSender_44 | mpp[tif] | 10 | ExchangeType |
| └─Projection_42 | mpp[tif] | 10 | ... |
| └─HashAgg_43 | mpp[tif] | 10 | group by:... |
| └─ExchangeReceiver_38 | mpp[tif] | 100 | ... |
| └─ExchangeSender_37 | mpp[tif] | 100 | ExchangeType |
| └─HashJoin_36 | mpp[tif] | 100 | inner join |
| ├─Selection_25(Build) | mpp[tif] | 500 | ... |
| │ └─TableFullScan_24 | mpp[tif] | 10000 | tikv:tiflash |
| └─Selection_30(Probe) | mpp[tif] | 800 | ... |
| └─TableFullScan_29 | mpp[tif] | 20000 | tikv:tiflash |
+------------------------------------------+----------+---------+--------------+
看到
mpp[tif]和ExchangeSender/Receiver说明MPP已生效。
2.4 MPP性能对比
-- 关闭MPP,使用传统TiKV扫描
SET SESSION tidb_allow_mpp = 0;
EXPLAIN ANALYZE
SELECT c.category_name, SUM(oi.quantity * oi.unit_price) AS total_sales
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.created_at >= CURDATE()
GROUP BY c.category_name;
-- 重新启用MPP
SET SESSION tidb_allow_mpp = 1;
| 查询模式 | 数据量 | 执行时间 | 扫描行数 |
|---|---|---|---|
| TiKV扫描 | 1亿行 | 45.2s | 100,000,000 |
| TiFlash非MPP | 1亿行 | 12.8s | 100,000,000 |
| TiFlash MPP | 1亿行 | 3.2s | 100,000,000 |
模式3:TiCDC实时ETL替代
TiCDC(Change Data Capture)可以替代传统ETL管道,将TiDB的数据变更实时同步到下游系统。
3.1 部署TiCDC
# tiup topology.yaml 片段
cdc_servers:
- host: 10.0.1.31
port: 8300
deploy_dir: /data/cdc/deploy
data_dir: /data/cdc/data
config:
per-table-memory-quota: 10485760
sink-memory-quota: 536870912
- host: 10.0.1.32
port: 8300
deploy_dir: /data/cdc/deploy
data_dir: /data/cdc/data
# 扩容TiCDC
tiup cluster scale-out tidb-prod topology-cdc.yaml
# 验证TiCDC状态
tiup cluster display tidb-prod | grep cdc
3.2 创建同步任务:替代MySQL → Kafka → Flink链路
# 创建TiCDC同步任务:TiDB → Kafka(替代Canal + Kafka)
cdc cli changefeed create \
--server=http://10.0.1.31:8300 \
--sink-uri="kafka://kafka-cluster:9092/tidb-orders?partition-num=6&max-message-bytes=10485760&replication-factor=3" \
--changefeed-id="orders-to-kafka" \
--config=cdc-orders.yaml
# cdc-orders.yaml
filter:
rules:
- 'ecommerce.orders'
- 'ecommerce.order_items'
- 'ecommerce.payments'
ignore_txn_start_ts: []
mounter:
worker-num: 16
sink:
protocol: "canal-json"
dispatchers:
- matcher: ['ecommerce.*']
partition: "index-value"
selector:
rules:
- 'ecommerce.orders'
- 'ecommerce.order_items'
- 'ecommerce.payments'
3.3 直接同步到ClickHouse(如果仍需保留)
# TiCDC → ClickHouse 直连
cdc cli changefeed create \
--server=http://10.0.1.31:8300 \
--sink-uri="clickhouse://clickhouse-server:8123?database=analytics&username=default&password=***" \
--changefeed-id="orders-to-clickhouse" \
--config=cdc-ch.yaml
3.4 TiCDC同步到S3(数据湖方案)
# TiCDC → S3 数据湖
cdc cli changefeed create \
--server=http://10.0.1.31:8300 \
--sink-uri="s3://data-lake/tidb-cdc/?endpoint=https://s3.amazonaws.com&access-key=***&secret-access-key=***" \
--changefeed-id="orders-to-s3" \
--config=cdc-s3.yaml
# cdc-s3.yaml
sink:
protocol: "csv"
column-selectors:
- matcher: ['ecommerce.orders']
columns: ['order_id', 'user_id', 'amount', 'status', 'created_at']
file:
size: 67108864
interval: 5m
filter:
rules:
- 'ecommerce.orders'
- 'ecommerce.order_items'
3.5 监控TiCDC同步延迟
-- 查看同步任务状态
cdc cli changefeed list \
--server=http://10.0.1.31:8300;
-- 查看同步延迟
cdc cli changefeed query \
--changefeed-id=orders-to-kafka \
--server=http://10.0.1.31:8300;
关键指标:
{
"state": "normal",
"tso": 446685327530688513,
"checkpoint": "2026-06-15 10:30:00.123",
"lag": "1.2s"
}
模式4:看板查询优化
实时看板是HTAP最直接的应用场景,但查询性能直接影响用户体验。
4.1 看板查询模式识别
-- 典型看板查询1:实时GMV
SELECT
DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
SUM(amount) AS gmv,
COUNT(*) AS order_count
FROM orders
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
AND status != 'cancelled'
GROUP BY hour_slot
ORDER BY hour_slot;
-- 典型看板查询2:实时用户活跃
SELECT
DATE_FORMAT(login_time, '%Y-%m-%d %H:00') AS hour_slot,
COUNT(DISTINCT user_id) AS dau,
COUNT(*) AS total_actions
FROM user_actions
WHERE login_time >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY hour_slot
ORDER BY hour_slot;
-- 典型看板查询3:实时转化漏斗
SELECT
COUNT(DISTINCT CASE WHEN event = 'view' THEN user_id END) AS views,
COUNT(DISTINCT CASE WHEN event = 'cart' THEN user_id END) AS carts,
COUNT(DISTINCT CASE WHEN event = 'pay' THEN user_id END) AS pays,
ROUND(COUNT(DISTINCT CASE WHEN event = 'pay' THEN user_id END) * 100.0
/ NULLIF(COUNT(DISTINCT CASE WHEN event = 'view' THEN user_id END), 0), 2) AS conversion_rate
FROM user_events
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 1 HOUR);
4.2 强制查询走TiFlash
-- 方式1:通过Hint强制走TiFlash
SELECT /*+ read_from_storage(tiflash[orders]) */
DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
SUM(amount) AS gmv
FROM orders
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY hour_slot;
-- 方式2:设置Session变量
SET SESSION tidb_isolation_read_engines = 'tiflash';
SELECT
DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
SUM(amount) AS gmv
FROM orders
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY hour_slot;
SET SESSION tidb_isolation_read_engines = 'tikv,tiflash';
-- 方式3:通过Resource Group隔离OLAP流量
CREATE RESOURCE GROUP IF NOT EXISTS olap_group
RU_PER_SEC = 5000
PRIORITY = LOW;
ALTER USER dashboard_user RESOURCE GROUP olap_group;
4.3 看板查询缓存优化
-- 使用Binding固定执行计划
CREATE SESSION BINDING FOR
SELECT SUM(amount) FROM orders WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
USING
SELECT /*+ read_from_storage(tiflash[orders]) */ SUM(amount)
FROM orders WHERE created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR);
-- 查看Binding
SHOW SESSION BINDINGS;
4.4 看板刷新策略
-- 看板查询建议使用READ ONLY模式,避免和OLTP争抢资源
SET SESSION tidb_isolation_read_engines = 'tiflash';
-- 限制看板查询的资源消耗
SET SESSION max_execution_time = 30000;
-- 使用prepared statement减少解析开销
PREPARE dashboard_gmv FROM
SELECT
DATE_FORMAT(created_at, '%Y-%m-%d %H:00') AS hour_slot,
SUM(amount) AS gmv
FROM orders
WHERE created_at >= ?
AND status != 'cancelled'
GROUP BY hour_slot
ORDER BY hour_slot;
EXECUTE dashboard_gmv USING DATE_SUB(NOW(), INTERVAL 24 HOUR);
模式5:物化视图构建数据集市
TiDB 8.x支持物化视图(Materialized View),可以预计算聚合数据,构建面向业务的数据集市。
5.1 创建物化视图
-- 创建日销售汇总物化视图
CREATE MATERIALIZED VIEW mv_daily_sales
REFRESH EVERY 1 HOUR
AS
SELECT
DATE(created_at) AS sale_date,
c.category_name,
p.product_name,
SUM(oi.quantity) AS total_quantity,
SUM(oi.quantity * oi.unit_price) AS total_revenue,
COUNT(DISTINCT o.user_id) AS unique_buyers,
AVG(oi.quantity * oi.unit_price) AS avg_price
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.status IN ('paid', 'shipped', 'completed')
GROUP BY DATE(created_at), c.category_name, p.product_name;
5.2 创建用户行为物化视图
-- 用户行为漏斗物化视图
CREATE MATERIALIZED VIEW mv_user_funnel
REFRESH EVERY 5 MINUTE
AS
SELECT
DATE(created_at) AS event_date,
COUNT(DISTINCT CASE WHEN event_type = 'page_view' THEN user_id END) AS page_views,
COUNT(DISTINCT CASE WHEN event_type = 'add_to_cart' THEN user_id END) AS add_to_carts,
COUNT(DISTINCT CASE WHEN event_type = 'checkout' THEN user_id END) AS checkouts,
COUNT(DISTINCT CASE WHEN event_type = 'payment' THEN user_id END) AS payments,
ROUND(
COUNT(DISTINCT CASE WHEN event_type = 'payment' THEN user_id END) * 100.0
/ NULLIF(COUNT(DISTINCT CASE WHEN event_type = 'page_view' THEN user_id END), 0),
2
) AS view_to_pay_rate
FROM user_events
GROUP BY DATE(created_at);
5.3 管理物化视图
-- 查看物化视图状态
SHOW MATERIALIZED VIEWS;
-- 手动刷新物化视图
ALTER MATERIALIZED VIEW mv_daily_sales REFRESH;
-- 暂停自动刷新
ALTER MATERIALIZED VIEW mv_daily_sales DISABLE;
-- 恢复自动刷新
ALTER MATERIALIZED VIEW mv_daily_sales ENABLE;
-- 删除物化视图
DROP MATERIALIZED VIEW IF EXISTS mv_daily_sales;
5.4 基于物化视图的看板查询
-- 查询物化视图(毫秒级响应)
SELECT
sale_date,
category_name,
total_revenue,
unique_buyers
FROM mv_daily_sales
WHERE sale_date >= DATE_SUB(CURDATE(), INTERVAL 7 DAY)
ORDER BY total_revenue DESC
LIMIT 20;
模式6:生产HTAP部署与监控
生产环境的HTAP部署需要考虑资源隔离、监控告警和故障恢复。
6.1 资源管控配置
-- 创建OLTP资源组(高优先级)
CREATE RESOURCE GROUP IF NOT EXISTS oltp_group
RU_PER_SEC = 20000
PRIORITY = HIGH;
-- 创建OLAP资源组(低优先级,限制资源)
CREATE RESOURCE GROUP IF NOT EXISTS olap_group
RU_PER_SEC = 8000
PRIORITY = LOW;
-- 将用户绑定到资源组
ALTER USER app_user RESOURCE GROUP oltp_group;
ALTER USER dashboard_user RESOURCE GROUP olap_group;
ALTER USER etl_user RESOURCE GROUP olap_group;
-- 查看资源组配置
SELECT * FROM information_schema.resource_groups;
6.2 TiFlash关键监控指标
# Grafana Dashboard 关键指标
# TiFlash Summary Dashboard
tiflash_system_profile:
- metric: tiflash_storage_write_duration_seconds
alert_threshold: "> 5s"
description: "TiFlash写入延迟"
- metric: tiflash_mpp_task_duration_seconds
alert_threshold: "> 60s"
description: "MPP任务执行时长"
- metric: tiflash_storage_segment_count
alert_threshold: "> 10000"
description: "Segment数量(过多需compaction)"
- metric: tiflash_cpu_usage
alert_threshold: "> 80%"
description: "TiFlash CPU使用率"
- metric: tiflash_storage_disk_usage_ratio
alert_threshold: "> 0.8"
description: "TiFlash磁盘使用率"
6.3 Prometheus告警规则
# tidb-htap-alerts.yaml
groups:
- name: tidb_htap
rules:
- alert: TiFlashReplicaSyncLag
expr: |
tiflash_replica_available == 0
or (tiflash_replica_progress < 0.95 and tiflash_replica_progress > 0)
for: 10m
labels:
severity: warning
annotations:
summary: "TiFlash副本同步延迟"
description: "表 {{ $labels.table_name }} 的TiFlash副本同步进度仅 {{ $value }}"
- alert: TiFlashHighWriteDuration
expr: |
histogram_quantile(0.95,
rate(tiflash_storage_write_duration_seconds_bucket[5m])
) > 5
for: 5m
labels:
severity: warning
annotations:
summary: "TiFlash写入延迟过高"
description: "P95写入延迟 {{ $value }}s"
- alert: MPPQueryTimeout
expr: |
rate(tiflash_mpp_task_error_count{reason="timeout"}[5m]) > 0.1
for: 3m
labels:
severity: critical
annotations:
summary: "MPP查询超时率过高"
description: "MPP查询超时率 {{ $value }}/s"
- alert: TiFlashDiskSpaceLow
expr: |
tiflash_storage_disk_usage_ratio > 0.85
for: 5m
labels:
severity: critical
annotations:
summary: "TiFlash磁盘空间不足"
description: "磁盘使用率 {{ $value }}"
6.4 日常运维脚本
#!/bin/bash
# tidb-htap-healthcheck.sh
# TiDB HTAP健康检查脚本
CLUSTER_NAME="tidb-prod"
LOG_FILE="/var/log/tidb-htap-healthcheck.log"
echo "=== TiDB HTAP Health Check $(date) ===" >> $LOG_FILE
# 检查TiFlash节点状态
echo "--- TiFlash Node Status ---" >> $LOG_FILE
tiup cluster display $CLUSTER_NAME 2>/dev/null | grep tiflash >> $LOG_FILE
# 检查TiFlash副本同步状态
echo "--- TiFlash Replica Sync Status ---" >> $LOG_FILE
mysql -h10.0.1.10 -P4000 -uroot -e "
SELECT table_schema, table_name, replica_count, available, progress
FROM information_schema.tiflash_replica
WHERE available = 0 OR progress < 1.0;
" >> $LOG_FILE 2>&1
# 检查TiCDC同步延迟
echo "--- TiCDC Sync Lag ---" >> $LOG_FILE
cdc cli changefeed list --server=http://10.0.1.31:8300 >> $LOG_FILE 2>&1
# 检查MPP查询性能
echo "--- MPP Query Performance ---" >> $LOG_FILE
mysql -h10.0.1.10 -P4000 -uroot -e "
SELECT DIGEST_TEXT, AVG_LATENCY, EXEC_COUNT
FROM information_schema.statements_summary
WHERE DIGEST_TEXT LIKE '%GROUP BY%'
AND AVG_LATENCY > 1000000000
ORDER BY AVG_LATENCY DESC
LIMIT 10;
" >> $LOG_FILE 2>&1
echo "=== Health Check Complete ===" >> $LOG_FILE
5大踩坑指南
❌ 坑1:TiFlash副本未同步就查询
-- ❌ 错误:刚添加TiFlash副本就查询,数据不完整
ALTER TABLE orders SET TIFLASH REPLICA 1;
-- 立即查询,可能读到不完整数据
SELECT COUNT(*) FROM orders;
-- ✅ 正确:等待副本同步完成
ALTER TABLE orders SET TIFLASH REPLICA 1;
-- 检查同步进度
SELECT table_name, progress, available
FROM information_schema.tiflash_replica
WHERE table_name = 'orders';
-- 等待 available = 1 后再查询
SELECT COUNT(*) FROM orders;
❌ 坑2:小表也加TiFlash副本
-- ❌ 错误:小表加TiFlash副本反而更慢(网络开销 > 扫描开销)
ALTER TABLE config_dict SET TIFLASH REPLICA 1;
-- ✅ 正确:小表保持在TiKV,只为大表添加TiFlash副本
-- 建议规则:数据量 > 100万行才考虑TiFlash副本
-- 小表通过Hint强制走TiKV
SELECT /*+ read_from_storage(tikv[config_dict]) */
* FROM config_dict WHERE type = 'payment';
❌ 坑3:OLAP查询影响OLTP性能
-- ❌ 错误:OLAP和OLTP使用同一用户,资源无隔离
SELECT user, resource_group FROM mysql.user WHERE user = 'app_user';
-- app_user | default
-- ✅ 正确:使用Resource Group隔离
CREATE RESOURCE GROUP IF NOT EXISTS oltp_group
RU_PER_SEC = 20000 PRIORITY = HIGH;
CREATE RESOURCE GROUP IF NOT EXISTS olap_group
RU_PER_SEC = 8000 PRIORITY = LOW;
ALTER USER app_user RESOURCE GROUP oltp_group;
ALTER USER dashboard_user RESOURCE GROUP olap_group;
-- OLAP查询设置超时
SET SESSION max_execution_time = 60000;
❌ 坑4:TiFlash节点磁盘空间不足
# ❌ 错误:TiFlash数据目录和系统盘共用
tiflash_servers:
- host: 10.0.1.21
data_dir: /var/lib/tiflash/data # 系统盘空间有限
# ✅ 正确:TiFlash使用独立数据盘,预留30%空间
tiflash_servers:
- host: 10.0.1.21
data_dir: /data/tiflash/data # 独立数据盘
config:
storage:
main:
capacity: "2TB" # 明确容量限制
latest:
capacity: "200GB"
❌ 坑5:忽略TiCDC同步延迟告警
# ❌ 错误:不监控TiCDC延迟,数据同步断了都不知道
# 没有任何告警配置
# ✅ 正确:配置TiCDC延迟告警
# Prometheus告警规则
- alert: TiCDCReplicationLag
expr: |
tidb_cdc_checkpoint_ts_lag_seconds > 300
for: 5m
labels:
severity: critical
annotations:
summary: "TiCDC同步延迟超过5分钟"
description: "changefeed {{ $labels.changefeed }} 延迟 {{ $value }}s"
错误排查速查表
| 错误现象 | 可能原因 | 排查命令/方法 | 解决方案 |
|---|---|---|---|
| TiFlash副本同步卡住 | TiFlash节点磁盘满或宕机 | SELECT * FROM information_schema.tiflash_replica WHERE progress < 1.0; |
扩容磁盘或重启TiFlash节点 |
MPP查询报 Too many MPP tasks |
并发MPP查询过多 | SHOW CONFIG WHERE name LIKE '%mpp%'; |
调整 tidb_mpp_task_limit 或降低并发 |
| 查询未走TiFlash | 优化器判断TiKV更优 | EXPLAIN SELECT ...; 查看算子 |
使用 /*+ read_from_storage(tiflash[t]) */ Hint |
| TiCDC同步延迟增大 | 下游写入慢或网络抖动 | cdc cli changefeed query --changefeed-id=xxx |
扩容下游或增加TiCDC worker数 |
| TiFlash OOM | 查询数据量过大 | grep OOM /data/tiflash/log/tiflash.log |
增加 max_memory_usage 或缩小查询范围 |
| 看板查询超时 | 未限制查询资源 | SHOW PROCESSLIST; |
设置 max_execution_time 和 Resource Group |
| 物化视图刷新失败 | 源表数据量过大或锁冲突 | SHOW MATERIALIZED VIEWS; |
调整刷新频率或优化源表索引 |
| TiFlash写入延迟高 | Raft Learner同步压力大 | Grafana TiFlash Dashboard | 扩容TiFlash节点或调整 raft-store.sync-log |
| MPP Join数据倾斜 | Join Key分布不均 | EXPLAIN ANALYZE 查看各节点数据量 |
使用 /*+ shuffle_join() */ 或增加分桶数 |
| TiFlash磁盘IO高 | Compaction或大量扫描 | iostat -x 1 检查IO |
调整 bg_task_io_limit 或扩容磁盘 |
高级优化技巧
3.1 TiFlash存储引擎调优
-- 调整TiFlash Segment分裂阈值(减少小Segment)
-- 在TiFlash配置中修改
# tiflash-config.yaml
storage:
main:
segment_delta_rows: 8192
segment_delta_size: 536870912
segment_stable_rows: 10485760
latest:
segment_delta_rows: 4096
segment_delta_size: 134217728
raft:
apply_batch: 1024
sync_log: false
engine:
dt:
enable_read_thread: true
read_thread_count: 8
enable_leaf_dt_segment_optimize: true
3.2 MPP查询高级优化
-- 1. 使用Runtime Filter减少Join数据传输
-- TiDB 8.x默认开启,确认状态
SET SESSION tidb_runtime_filter_mode = "LOCAL";
-- 2. 调整MPP Exchange数据压缩
SET SESSION tidb_exchange_compression = 'ON';
SET SESSION tidb_exchange_compression_level = 3;
-- 3. 大表Join优化:使用Hash Join Hint
SELECT /*+ hash_join_build(o) */
c.category_name,
SUM(oi.quantity * oi.unit_price) AS total_sales
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN categories c ON oi.product_id = c.category_id
GROUP BY c.category_name;
-- 4. 使用Index Lookup Join替代Hash Join(适合小表驱动大表)
SELECT /*+ inl_join(p) */
o.order_id, p.product_name
FROM orders o
JOIN products p ON o.product_id = p.product_id
WHERE o.created_at >= CURDATE();
3.3 多租户HTAP资源隔离
-- 为不同业务线创建独立资源组
CREATE RESOURCE GROUP IF NOT EXISTS rg_ecommerce
RU_PER_SEC = 15000 PRIORITY = HIGH
QUERY_LIMIT = (EXECUTION_TIME = 30s);
CREATE RESOURCE GROUP IF NOT EXISTS rg_analytics
RU_PER_SEC = 10000 PRIORITY = LOW
QUERY_LIMIT = (EXECUTION_TIME = 120s);
CREATE RESOURCE GROUP IF NOT EXISTS rg_data_science
RU_PER_SEC = 5000 PRIORITY = LOW
QUERY_LIMIT = (EXECUTION_TIME = 600s);
-- 绑定用户到资源组
ALTER USER ecommerce_app RESOURCE GROUP rg_ecommerce;
ALTER USER analytics_user RESOURCE GROUP rg_analytics;
ALTER USER ds_user RESOURCE GROUP rg_data_science;
-- 监控资源组使用情况
SELECT * FROM information_schema.resource_group_usage;
HTAP方案对比
| 维度 | TiDB HTAP | ClickHouse + MySQL | Snowflake | Apache Doris | SingleStore |
|---|---|---|---|---|---|
| 架构 | 一体化HTAP | 双系统拼接 | 云原生OLAP | 一体化HTAP | 一体化HTAP |
| OLTP能力 | 强(MySQL兼容) | MySQL原生 | 无 | 弱 | 中等 |
| OLAP性能 | 强(MPP) | 极强 | 极强 | 强 | 强 |
| 数据同步 | Raft Learner(自动) | 需自建ETL | 需导入 | 自动 | 自动 |
| 实时性 | 秒级 | 分钟级 | 分钟级 | 秒级 | 秒级 |
| MySQL兼容 | 95%+ | 原生 | 不兼容 | 中等 | 中等 |
| 水平扩展 | 透明扩展 | 各自扩展 | 自动扩展 | 透明扩展 | 透明扩展 |
| 运维复杂度 | 中等 | 高 | 低 | 中等 | 中等 |
| 成本 | 中等 | 高(双系统) | 高(按量付费) | 中等 | 高 |
| 开源 | 完全开源 | 部分开源 | 闭源 | 完全开源 | 部分开源 |
| 适用场景 | 通用HTAP | 已有MySQL+需强OLAP | 纯分析 | 实时数仓 | 实时分析 |
总结
TiDB HTAP不是"既能做OLTP又能做OLAP"的营销话术,而是通过Raft Learner实现行存到列存的实时同步,通过MPP引擎实现分布式并行计算,通过Resource Control实现OLTP/OLAP资源隔离的工程化解决方案。2026年的今天,如果你还在维护MySQL + ClickHouse的双系统架构,是时候认真评估TiDB HTAP了——一套系统、一份运维、零数据延迟,这才是实时分析该有的样子。
推荐工具
- JSON 格式化 - 格式化TiDB配置JSON文件
- Base64 编解码 - 处理TiCDC证书编码
- Hash 计算 - 校验数据同步一致性
本站提供浏览器本地工具,免注册即可试用 →