CockroachDB Distributed SQL: Complete Online Migration from MySQL to NewSQL in 2026

编程语言

Are You Exhausted by MySQL Sharding?

Wrong shard keys causing hotspots, cross-shard JOIN performance disasters, distributed transactions cobbled together with message queues, scaling requiring downtime for data migration — when your MySQL cluster reaches TB scale, sharding operational costs far exceed business value. In 2026, NewSQL databases offer a more elegant solution: CockroachDB, a natively distributed, strongly consistent SQL database compatible with PostgreSQL protocol.

This article guides you through a zero-downtime online migration from MySQL to CockroachDB, from architecture principles to hands-on steps, from distributed transactions to multi-region deployment.


CockroachDB Architecture Core

Concept Description MySQL Comparison
Range Basic unit of data sharding by range (default 64MB) Like MySQL Partition, but auto-managed
Replica 3 replicas per Range by default, using Raft consensus Similar to MySQL MGR, but more mature
Lease Holder Read/write leader for a Range, handles local reads Like MySQL primary, but finer granularity
DistSQL Distributed SQL engine, auto-parallelizes queries MySQL requires manual sharding + middleware
HLC Hybrid Logical Clock, provides global ordering MySQL relies on GTID, no global ordering across shards

Distributed Transaction ACID Guarantees

CockroachDB uses Parallel Commits protocol for distributed transactions, eliminating the 2PC coordinator single point:

Transaction Flow:
1. Write Intent to all involved Ranges
2. Send PREPARE and COMMIT in parallel to all Ranges
3. If any node crashes, others recover via Write Intent
4. Transaction latency = 1 network round trip (vs 2 for 2PC)

Problem Analysis: MySQL to CockroachDB Migration Challenges

  1. SQL Compatibility: CockroachDB is PostgreSQL-compatible, MySQL syntax needs rewriting
  2. Auto-increment ID: MySQL's AUTO_INCREMENT is non-sequential in distributed mode, use UUID or SEQUENCE
  3. Foreign Key Cascades: Supported but not recommended (performance issues)
  4. Storage Engine Difference: InnoDB clustered index vs CockroachDB LSM-tree
  5. Character Set: MySQL's utf8mb4 maps to CockroachDB's UTF8 (default)

Step-by-Step: MySQL Online Migration

Step 1: Install CockroachDB

# Linux
curl -L https://binaries.cockroachdb.com/cockroach-v24.3.0.linux-amd64.tgz | tar -xz
sudo cp cockroach-v24.3.0.linux-amd64/cockroach /usr/local/bin/

# Docker
docker pull cockroachdb/cockroach:v24.3.0

# Start single node (dev/test)
cockroach start-single-node --insecure --listen-addr=localhost:26257 --store=path=/data/cockroach

Step 2: Start Production Cluster

# Node 1
cockroach start \
  --insecure \
  --store=path=/data/cockroach1 \
  --listen-addr=192.168.1.10:26257 \
  --http-addr=192.168.1.10:8080 \
  --join=192.168.1.10:26257,192.168.1.11:26257,192.168.1.12:26257 \
  --background

# Initialize cluster (first time only)
cockroach init --insecure --host=192.168.1.10:26257

Step 3: Use MOLT Migration Tool

# Install MOLT
curl -L https://binaries.cockroachdb.com/molt/v1.0.0/molt_linux_amd64.tar.gz | tar -xz
sudo cp molt /usr/local/bin/

# Assess migration feasibility
molt assess \
  --source-type=mysql \
  --source="root:password@tcp(mysql-host:3306)/mydb" \
  --target="postgresql://root@cockroach-host:26257/mydb?sslmode=disable"

# Export Schema
molt dump-schema \
  --source-type=mysql \
  --source="root:password@tcp(mysql-host:3306)/mydb" \
  --output-dir=./schema

# Convert Schema (MySQL→CockroachDB)
molt convert-schema \
  --input-dir=./schema \
  --output-dir=./converted-schema

# Apply Schema to CockroachDB
cockroach sql --insecure --host=cockroach-host < ./converted-schema/schema.sql

# Online data migration (CDC mode, zero downtime)
molt migrate-data \
  --source-type=mysql \
  --source="root:password@tcp(mysql-host:3306)/mydb" \
  --target="postgresql://root@cockroach-host:26257/mydb?sslmode=disable" \
  --mode=cdc \
  --batch-size=10000 \
  --concurrency=8

Step 4: SQL Syntax Rewriting

-- MySQL: AUTO_INCREMENT → CockroachDB: SERIAL
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100)
);

-- MySQL: LIMIT offset, count → CockroachDB: LIMIT count OFFSET offset
SELECT * FROM orders LIMIT 20 OFFSET 10;

-- MySQL: GROUP_CONCAT → CockroachDB: STRING_AGG
SELECT dept, STRING_AGG(name, ',') FROM employees GROUP BY dept;

-- MySQL: ON DUPLICATE KEY UPDATE → CockroachDB: UPSERT
UPSERT INTO users (id, name) VALUES (1, 'Alice');

Complete Code: Application Migration Example

package main

import (
    "context"
    "database/sql"
    "fmt"
    "log"
    "time"

    _ "github.com/lib/pq"
)

type Order struct {
    ID        int
    UserID    int
    Amount    float64
    Status    string
    CreatedAt time.Time
}

type OrderRepo struct {
    db *sql.DB
}

func NewOrderRepo(connStr string) (*OrderRepo, error) {
    db, err := sql.Open("postgres", connStr)
    if err != nil {
        return nil, fmt.Errorf("open db: %w", err)
    }
    db.SetMaxOpenConns(25)
    db.SetMaxIdleConns(10)
    db.SetConnMaxLifetime(5 * time.Minute)
    return &OrderRepo{db: db}, nil
}

func (r *OrderRepo) CreateOrder(ctx context.Context, order *Order) error {
    query := `
        INSERT INTO orders (user_id, amount, status, created_at)
        VALUES ($1, $2, $3, $4)
        RETURNING id`
    return r.db.QueryRowContext(ctx, query,
        order.UserID, order.Amount, order.Status, order.CreatedAt,
    ).Scan(&order.ID)
}

func (r *OrderRepo) TransferOrder(ctx context.Context, fromUserID, toUserID int, amount float64) error {
    tx, err := r.db.BeginTx(ctx, nil)
    if err != nil {
        return fmt.Errorf("begin tx: %w", err)
    }
    defer tx.Rollback()

    deductQuery := `UPDATE accounts SET balance = balance - $1 WHERE user_id = $2 AND balance >= $1`
    result, err := tx.ExecContext(ctx, deductQuery, amount, fromUserID)
    if err != nil {
        return fmt.Errorf("deduct: %w", err)
    }
    if n, _ := result.RowsAffected(); n == 0 {
        return fmt.Errorf("insufficient balance")
    }

    creditQuery := `UPDATE accounts SET balance = balance + $1 WHERE user_id = $2`
    if _, err := tx.ExecContext(ctx, creditQuery, amount, toUserID); err != nil {
        return fmt.Errorf("credit: %w", err)
    }

    logQuery := `INSERT INTO transfer_logs (from_user, to_user, amount, created_at) VALUES ($1, $2, $3, NOW())`
    if _, err := tx.ExecContext(ctx, logQuery, fromUserID, toUserID, amount); err != nil {
        return fmt.Errorf("log: %w", err)
    }

    return tx.Commit()
}

func main() {
    repo, err := NewOrderRepo("postgresql://root@localhost:26257/mydb?sslmode=disable")
    if err != nil {
        log.Fatal(err)
    }
    defer repo.db.Close()

    ctx := context.Background()
    err = repo.TransferOrder(ctx, 1, 2, 100.50)
    if err != nil {
        log.Fatal("Transfer failed:", err)
    }
    fmt.Println("Transfer succeeded")
}

Multi-Region Deployment

ALTER RANGE default CONFIGURE ZONE USING num_replicas = 5, constraints = '{"+us-east-1":2, "+eu-west-1":2, "+ap-southeast-1":1}';

ALTER DATABASE mydb CONFIGURE ZONE USING num_replicas = 5, constraints = '{"+us-east-1":2, "+eu-west-1":2, "+ap-southeast-1":1}';

ALTER TABLE users CONFIGURE ZONE USING num_replicas = 5, lease_preferences = '[[+us-east-1]]';

ALTER PARTITION us_east OF TABLE orders CONFIGURE ZONE USING constraints = '{"+us-east-1":2, "+eu-west-1":1}';
ALTER PARTITION eu_west OF TABLE orders CONFIGURE ZONE USING constraints = '{"+eu-west-1":2, "+us-east-1":1}';

Pitfall Guide

Pitfall 1: SERIAL IDs Are Not Sequential

-- ❌ Expected: 1, 2, 3, 4...
-- ✅ Actual: SERIAL uses unique_rowid(), generates large numbers
-- Fix: Use SEQUENCE for sequential IDs
CREATE SEQUENCE order_seq;
CREATE TABLE orders (id BIGINT DEFAULT nextval('order_seq') PRIMARY KEY, ...);

Pitfall 2: Foreign Key Cascade Performance

-- ❌ Cascade updates trigger cross-Range transactions
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_id INT REFERENCES users(id) ON UPDATE CASCADE
);

-- ✅ Don't use cascades, handle in application
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_id INT REFERENCES users(id) ON DELETE RESTRICT
);

Pitfall 3: Large Transaction Timeout

-- ❌ Single transaction updating 100K rows exceeds 5min timeout
BEGIN;
UPDATE large_table SET status = 'processed' WHERE created_at < '2026-01-01';
COMMIT;

-- ✅ Batch updates
BEGIN;
UPDATE large_table SET status = 'processed' WHERE created_at < '2026-01-01' LIMIT 1000;
COMMIT;

Pitfall 4: SELECT FOR UPDATE Causing Range Lock Contention

-- ❌ Locks entire Range
SELECT * FROM orders WHERE user_id = 1 FOR UPDATE;

-- ✅ Use AS OF SYSTEM TIME for historical snapshot
SELECT * FROM orders AS OF SYSTEM TIME '-5s' WHERE user_id = 1;

Pitfall 5: Ignoring Zone Configuration Leading to High Read Latency

-- ❌ Default 3 replicas randomly distributed, cross-region latency 200ms+
-- ✅ Configure lease_preferences for local reads
ALTER TABLE users CONFIGURE ZONE USING lease_preferences = '[[+us-east-1]]';

Error Troubleshooting

# Error Message Cause Solution
1 TransactionRetryWithProtoRefreshError Transaction conflict, needs retry Use CRDB transaction retry wrapper, or reduce concurrency
2 transaction deadline exceeded Transaction exceeded 5min timeout Split large transactions into batches
3 duplicate key value violates unique constraint SERIAL ID collision (rare) Use UUID or explicit SEQUENCE
4 foreign key violation FK constraint check failed Ensure referenced data exists, check insert order
5 value type tuple doesn't match type of column MySQL implicit type conversion not supported Explicit CAST type conversion
6 unimplemented: this syntax is not supported MySQL-specific syntax not supported Check compatibility docs, rewrite to PG syntax
7 memory budget exceeded Query intermediate results exceed memory Add LIMIT, optimize JOIN order, increase sql.memory.cluster_client
8 node is draining Node is shutting down Wait for drain to complete or connect to another node
9 replica not available Insufficient replicas (node down) Ensure live nodes >= quorum, wait for replica recovery
10 incompatible schema change Online schema change conflict Wait for previous schema change to complete

Advanced Optimization

1. Read Historical Data to Avoid Lock Contention

SELECT * FROM orders AS OF SYSTEM TIME INTERVAL '-5s'
WHERE status = 'pending';

2. Bulk Insert Optimization

IMPORT INTO orders (id, user_id, amount, status)
CSV DATA (
    'nfs://data/orders_part1.csv',
    'nfs://data/orders_part2.csv'
) WITH skip = 1;

3. Auto-Retry Transaction Wrapper

func RetryTx(ctx context.Context, db *sql.DB, fn func(*sql.Tx) error) error {
    for i := 0; i < 5; i++ {
        tx, err := db.BeginTx(ctx, nil)
        if err != nil {
            return err
        }
        err = fn(tx)
        if err == nil {
            return tx.Commit()
        }
        tx.Rollback()
        if !isRetryableError(err) {
            return err
        }
        time.Sleep(time.Duration(100*(i+1)) * time.Millisecond)
    }
    return fmt.Errorf("max retries exceeded")
}

Comparison Analysis

Dimension MySQL Sharding CockroachDB TiDB Spanner
Distributed Tx Needs middleware Native ACID Native ACID Native ACID
SQL Compatible MySQL PostgreSQL MySQL PostgreSQL
Online Scaling Needs downtime Auto Rebalance Auto Rebalance Auto
Multi-Region Self-built Native support Limited Native
Consistency Eventual Strong (Serializable) RC/SI Strong
Ops Complexity High Medium Medium Low (managed)
Cost Low Medium Medium High
Open Source ✅ (BSL)

Summary: Migrating from MySQL to CockroachDB isn't just "switching databases" — it's an architectural upgrade from "sharding + middleware" to "native distributed SQL". Key path: online migration with MOLT → rewrite incompatible SQL → configure multi-region Zones → batch large transactions → configure transaction retries. CockroachDB's strong consistency and auto-rebalancing reduce ops costs by 80%, but only if you understand its distributed transaction mechanics and SQL compatibility boundaries.


Try these browser-local tools — no sign-up required →

#CockroachDB#分布式SQL#NewSQL#分布式事务#多区域#在线迁移#高可用#一致性