CockroachDB Distributed SQL: Complete Online Migration from MySQL to NewSQL in 2026
Are You Exhausted by MySQL Sharding?
Wrong shard keys causing hotspots, cross-shard JOIN performance disasters, distributed transactions cobbled together with message queues, scaling requiring downtime for data migration — when your MySQL cluster reaches TB scale, sharding operational costs far exceed business value. In 2026, NewSQL databases offer a more elegant solution: CockroachDB, a natively distributed, strongly consistent SQL database compatible with PostgreSQL protocol.
This article guides you through a zero-downtime online migration from MySQL to CockroachDB, from architecture principles to hands-on steps, from distributed transactions to multi-region deployment.
CockroachDB Architecture Core
| Concept | Description | MySQL Comparison |
|---|---|---|
| Range | Basic unit of data sharding by range (default 64MB) | Like MySQL Partition, but auto-managed |
| Replica | 3 replicas per Range by default, using Raft consensus | Similar to MySQL MGR, but more mature |
| Lease Holder | Read/write leader for a Range, handles local reads | Like MySQL primary, but finer granularity |
| DistSQL | Distributed SQL engine, auto-parallelizes queries | MySQL requires manual sharding + middleware |
| HLC | Hybrid Logical Clock, provides global ordering | MySQL relies on GTID, no global ordering across shards |
Distributed Transaction ACID Guarantees
CockroachDB uses Parallel Commits protocol for distributed transactions, eliminating the 2PC coordinator single point:
Transaction Flow:
1. Write Intent to all involved Ranges
2. Send PREPARE and COMMIT in parallel to all Ranges
3. If any node crashes, others recover via Write Intent
4. Transaction latency = 1 network round trip (vs 2 for 2PC)
Problem Analysis: MySQL to CockroachDB Migration Challenges
- SQL Compatibility: CockroachDB is PostgreSQL-compatible, MySQL syntax needs rewriting
- Auto-increment ID: MySQL's AUTO_INCREMENT is non-sequential in distributed mode, use UUID or SEQUENCE
- Foreign Key Cascades: Supported but not recommended (performance issues)
- Storage Engine Difference: InnoDB clustered index vs CockroachDB LSM-tree
- Character Set: MySQL's utf8mb4 maps to CockroachDB's UTF8 (default)
Step-by-Step: MySQL Online Migration
Step 1: Install CockroachDB
# Linux
curl -L https://binaries.cockroachdb.com/cockroach-v24.3.0.linux-amd64.tgz | tar -xz
sudo cp cockroach-v24.3.0.linux-amd64/cockroach /usr/local/bin/
# Docker
docker pull cockroachdb/cockroach:v24.3.0
# Start single node (dev/test)
cockroach start-single-node --insecure --listen-addr=localhost:26257 --store=path=/data/cockroach
Step 2: Start Production Cluster
# Node 1
cockroach start \
--insecure \
--store=path=/data/cockroach1 \
--listen-addr=192.168.1.10:26257 \
--http-addr=192.168.1.10:8080 \
--join=192.168.1.10:26257,192.168.1.11:26257,192.168.1.12:26257 \
--background
# Initialize cluster (first time only)
cockroach init --insecure --host=192.168.1.10:26257
Step 3: Use MOLT Migration Tool
# Install MOLT
curl -L https://binaries.cockroachdb.com/molt/v1.0.0/molt_linux_amd64.tar.gz | tar -xz
sudo cp molt /usr/local/bin/
# Assess migration feasibility
molt assess \
--source-type=mysql \
--source="root:password@tcp(mysql-host:3306)/mydb" \
--target="postgresql://root@cockroach-host:26257/mydb?sslmode=disable"
# Export Schema
molt dump-schema \
--source-type=mysql \
--source="root:password@tcp(mysql-host:3306)/mydb" \
--output-dir=./schema
# Convert Schema (MySQL→CockroachDB)
molt convert-schema \
--input-dir=./schema \
--output-dir=./converted-schema
# Apply Schema to CockroachDB
cockroach sql --insecure --host=cockroach-host < ./converted-schema/schema.sql
# Online data migration (CDC mode, zero downtime)
molt migrate-data \
--source-type=mysql \
--source="root:password@tcp(mysql-host:3306)/mydb" \
--target="postgresql://root@cockroach-host:26257/mydb?sslmode=disable" \
--mode=cdc \
--batch-size=10000 \
--concurrency=8
Step 4: SQL Syntax Rewriting
-- MySQL: AUTO_INCREMENT → CockroachDB: SERIAL
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(100)
);
-- MySQL: LIMIT offset, count → CockroachDB: LIMIT count OFFSET offset
SELECT * FROM orders LIMIT 20 OFFSET 10;
-- MySQL: GROUP_CONCAT → CockroachDB: STRING_AGG
SELECT dept, STRING_AGG(name, ',') FROM employees GROUP BY dept;
-- MySQL: ON DUPLICATE KEY UPDATE → CockroachDB: UPSERT
UPSERT INTO users (id, name) VALUES (1, 'Alice');
Complete Code: Application Migration Example
package main
import (
"context"
"database/sql"
"fmt"
"log"
"time"
_ "github.com/lib/pq"
)
type Order struct {
ID int
UserID int
Amount float64
Status string
CreatedAt time.Time
}
type OrderRepo struct {
db *sql.DB
}
func NewOrderRepo(connStr string) (*OrderRepo, error) {
db, err := sql.Open("postgres", connStr)
if err != nil {
return nil, fmt.Errorf("open db: %w", err)
}
db.SetMaxOpenConns(25)
db.SetMaxIdleConns(10)
db.SetConnMaxLifetime(5 * time.Minute)
return &OrderRepo{db: db}, nil
}
func (r *OrderRepo) CreateOrder(ctx context.Context, order *Order) error {
query := `
INSERT INTO orders (user_id, amount, status, created_at)
VALUES ($1, $2, $3, $4)
RETURNING id`
return r.db.QueryRowContext(ctx, query,
order.UserID, order.Amount, order.Status, order.CreatedAt,
).Scan(&order.ID)
}
func (r *OrderRepo) TransferOrder(ctx context.Context, fromUserID, toUserID int, amount float64) error {
tx, err := r.db.BeginTx(ctx, nil)
if err != nil {
return fmt.Errorf("begin tx: %w", err)
}
defer tx.Rollback()
deductQuery := `UPDATE accounts SET balance = balance - $1 WHERE user_id = $2 AND balance >= $1`
result, err := tx.ExecContext(ctx, deductQuery, amount, fromUserID)
if err != nil {
return fmt.Errorf("deduct: %w", err)
}
if n, _ := result.RowsAffected(); n == 0 {
return fmt.Errorf("insufficient balance")
}
creditQuery := `UPDATE accounts SET balance = balance + $1 WHERE user_id = $2`
if _, err := tx.ExecContext(ctx, creditQuery, amount, toUserID); err != nil {
return fmt.Errorf("credit: %w", err)
}
logQuery := `INSERT INTO transfer_logs (from_user, to_user, amount, created_at) VALUES ($1, $2, $3, NOW())`
if _, err := tx.ExecContext(ctx, logQuery, fromUserID, toUserID, amount); err != nil {
return fmt.Errorf("log: %w", err)
}
return tx.Commit()
}
func main() {
repo, err := NewOrderRepo("postgresql://root@localhost:26257/mydb?sslmode=disable")
if err != nil {
log.Fatal(err)
}
defer repo.db.Close()
ctx := context.Background()
err = repo.TransferOrder(ctx, 1, 2, 100.50)
if err != nil {
log.Fatal("Transfer failed:", err)
}
fmt.Println("Transfer succeeded")
}
Multi-Region Deployment
ALTER RANGE default CONFIGURE ZONE USING num_replicas = 5, constraints = '{"+us-east-1":2, "+eu-west-1":2, "+ap-southeast-1":1}';
ALTER DATABASE mydb CONFIGURE ZONE USING num_replicas = 5, constraints = '{"+us-east-1":2, "+eu-west-1":2, "+ap-southeast-1":1}';
ALTER TABLE users CONFIGURE ZONE USING num_replicas = 5, lease_preferences = '[[+us-east-1]]';
ALTER PARTITION us_east OF TABLE orders CONFIGURE ZONE USING constraints = '{"+us-east-1":2, "+eu-west-1":1}';
ALTER PARTITION eu_west OF TABLE orders CONFIGURE ZONE USING constraints = '{"+eu-west-1":2, "+us-east-1":1}';
Pitfall Guide
Pitfall 1: SERIAL IDs Are Not Sequential
-- ❌ Expected: 1, 2, 3, 4...
-- ✅ Actual: SERIAL uses unique_rowid(), generates large numbers
-- Fix: Use SEQUENCE for sequential IDs
CREATE SEQUENCE order_seq;
CREATE TABLE orders (id BIGINT DEFAULT nextval('order_seq') PRIMARY KEY, ...);
Pitfall 2: Foreign Key Cascade Performance
-- ❌ Cascade updates trigger cross-Range transactions
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INT REFERENCES users(id) ON UPDATE CASCADE
);
-- ✅ Don't use cascades, handle in application
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INT REFERENCES users(id) ON DELETE RESTRICT
);
Pitfall 3: Large Transaction Timeout
-- ❌ Single transaction updating 100K rows exceeds 5min timeout
BEGIN;
UPDATE large_table SET status = 'processed' WHERE created_at < '2026-01-01';
COMMIT;
-- ✅ Batch updates
BEGIN;
UPDATE large_table SET status = 'processed' WHERE created_at < '2026-01-01' LIMIT 1000;
COMMIT;
Pitfall 4: SELECT FOR UPDATE Causing Range Lock Contention
-- ❌ Locks entire Range
SELECT * FROM orders WHERE user_id = 1 FOR UPDATE;
-- ✅ Use AS OF SYSTEM TIME for historical snapshot
SELECT * FROM orders AS OF SYSTEM TIME '-5s' WHERE user_id = 1;
Pitfall 5: Ignoring Zone Configuration Leading to High Read Latency
-- ❌ Default 3 replicas randomly distributed, cross-region latency 200ms+
-- ✅ Configure lease_preferences for local reads
ALTER TABLE users CONFIGURE ZONE USING lease_preferences = '[[+us-east-1]]';
Error Troubleshooting
| # | Error Message | Cause | Solution |
|---|---|---|---|
| 1 | TransactionRetryWithProtoRefreshError |
Transaction conflict, needs retry | Use CRDB transaction retry wrapper, or reduce concurrency |
| 2 | transaction deadline exceeded |
Transaction exceeded 5min timeout | Split large transactions into batches |
| 3 | duplicate key value violates unique constraint |
SERIAL ID collision (rare) | Use UUID or explicit SEQUENCE |
| 4 | foreign key violation |
FK constraint check failed | Ensure referenced data exists, check insert order |
| 5 | value type tuple doesn't match type of column |
MySQL implicit type conversion not supported | Explicit CAST type conversion |
| 6 | unimplemented: this syntax is not supported |
MySQL-specific syntax not supported | Check compatibility docs, rewrite to PG syntax |
| 7 | memory budget exceeded |
Query intermediate results exceed memory | Add LIMIT, optimize JOIN order, increase sql.memory.cluster_client |
| 8 | node is draining |
Node is shutting down | Wait for drain to complete or connect to another node |
| 9 | replica not available |
Insufficient replicas (node down) | Ensure live nodes >= quorum, wait for replica recovery |
| 10 | incompatible schema change |
Online schema change conflict | Wait for previous schema change to complete |
Advanced Optimization
1. Read Historical Data to Avoid Lock Contention
SELECT * FROM orders AS OF SYSTEM TIME INTERVAL '-5s'
WHERE status = 'pending';
2. Bulk Insert Optimization
IMPORT INTO orders (id, user_id, amount, status)
CSV DATA (
'nfs://data/orders_part1.csv',
'nfs://data/orders_part2.csv'
) WITH skip = 1;
3. Auto-Retry Transaction Wrapper
func RetryTx(ctx context.Context, db *sql.DB, fn func(*sql.Tx) error) error {
for i := 0; i < 5; i++ {
tx, err := db.BeginTx(ctx, nil)
if err != nil {
return err
}
err = fn(tx)
if err == nil {
return tx.Commit()
}
tx.Rollback()
if !isRetryableError(err) {
return err
}
time.Sleep(time.Duration(100*(i+1)) * time.Millisecond)
}
return fmt.Errorf("max retries exceeded")
}
Comparison Analysis
| Dimension | MySQL Sharding | CockroachDB | TiDB | Spanner |
|---|---|---|---|---|
| Distributed Tx | Needs middleware | Native ACID | Native ACID | Native ACID |
| SQL Compatible | MySQL | PostgreSQL | MySQL | PostgreSQL |
| Online Scaling | Needs downtime | Auto Rebalance | Auto Rebalance | Auto |
| Multi-Region | Self-built | Native support | Limited | Native |
| Consistency | Eventual | Strong (Serializable) | RC/SI | Strong |
| Ops Complexity | High | Medium | Medium | Low (managed) |
| Cost | Low | Medium | Medium | High |
| Open Source | ✅ | ✅ (BSL) | ✅ | ❌ |
Summary: Migrating from MySQL to CockroachDB isn't just "switching databases" — it's an architectural upgrade from "sharding + middleware" to "native distributed SQL". Key path: online migration with MOLT → rewrite incompatible SQL → configure multi-region Zones → batch large transactions → configure transaction retries. CockroachDB's strong consistency and auto-rebalancing reduce ops costs by 80%, but only if you understand its distributed transaction mechanics and SQL compatibility boundaries.
Recommended Online Tools
- JSON Formatter: /en/json/format
- Base64 Encode/Decode: /en/encode/base64
- cURL to Code: /en/dev/curl-to-code
- SQL Formatter: /en/dev/sql-format
Try these browser-local tools — no sign-up required →