PostgreSQL Index Optimization: 6 Key Tuning Strategies from Slow Queries to Millisecond Response

数据库

A Query Takes 47 Seconds, the Boss Says Replace the Database

Your order list query went from 2 seconds to 47 seconds, users are complaining wildly. You added an index, and the query got even slower. EXPLAIN ANALYZE outputs a bunch of incomprehensible nodes, and you don't know where the problem is. In 2026, PostgreSQL index optimization is the key from slow queries to millisecond response — choosing the right index type, understanding execution plans, and avoiding index invalidation can improve query performance by 1000x.

This article starts from PostgreSQL index principles and guides you through 6 key tuning strategies, from index type selection to execution plan analysis, from partial indexes to concurrent index creation.


PostgreSQL Index Core Concepts

Concept Description
B-tree Index Default index type, suitable for equality, range queries, and sorting
Hash Index Only for equality queries, no sort info, limited use cases
GIN Index Inverted index, suitable for full-text search, JSONB, arrays
GiST Index Generalized Search Tree, suitable for geospatial data, range types
BRIN Index Block Range Index, suitable for physically ordered large tables, very small size
Partial Index Only indexes rows matching a condition, reduces size and maintenance cost
Covering Index INCLUDE columns store extra data, enables Index-Only Scan
Concurrent Index Creates index without locking the table, essential for production

Index Type Selection Decision Tree

Query Type Decision:
├── Equality (=) → B-tree (default) or Hash
├── Range (>, <, BETWEEN) → B-tree
├── Full-text search (tsvector) → GIN
├── JSONB query (@>, ?) → GIN
├── Array contains (@>, &&) → GIN
├── Geospatial (ST_Contains) → GiST
├── Range overlap (&&) → GiST
├── Physically ordered large table (time series) → BRIN
└── Fuzzy search (LIKE 'abc%') → B-tree (prefix match)

Problem Analysis: 5 Major PostgreSQL Index Optimization Challenges

  1. Index selection difficulty: 5 index types each with specific use cases, wrong type leads to full table scan
  2. Index invalidation: Function conversion, implicit type casting, OR conditions prevent index usage
  3. Execution plan complexity: EXPLAIN ANALYZE output is complex, Seq Scan vs Index Scan choice unclear
  4. Index bloat: Frequent UPDATE/DELETE causes index fragmentation, query performance degrades
  5. Production index creation locks table: CREATE INDEX defaults to write lock, large table indexing causes downtime

Step-by-Step: 6 Key Tuning Strategies

Strategy 1: B-tree Index — The Most Common Optimization

CREATE TABLE orders (
    id BIGSERIAL PRIMARY KEY,
    user_id BIGINT NOT NULL,
    status VARCHAR(20) NOT NULL,
    total_amount DECIMAL(12,2),
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ
);

INSERT INTO orders (user_id, status, total_amount, created_at)
SELECT
    (random() * 100000)::bigint,
    (ARRAY['pending','paid','shipped','completed','cancelled'])[floor(random()*5+1)::int],
    (random() * 10000)::decimal(12,2),
    NOW() - (random() * interval '365 days')
FROM generate_series(1, 5000000);

CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_created_at ON orders(created_at);
CREATE INDEX idx_orders_status_created ON orders(status, created_at);

EXPLAIN ANALYZE
SELECT * FROM orders WHERE user_id = 12345;

EXPLAIN ANALYZE
SELECT * FROM orders
WHERE status = 'pending' AND created_at > '2025-01-01'
ORDER BY created_at DESC LIMIT 50;
CREATE TABLE products (
    id BIGSERIAL PRIMARY KEY,
    name VARCHAR(200) NOT NULL,
    attributes JSONB NOT NULL DEFAULT '{}',
    tags TEXT[] DEFAULT '{}',
    search_vector TSVECTOR
);

INSERT INTO products (name, attributes, tags)
SELECT
    'Product ' || i,
    jsonb_build_object(
        'color', (ARRAY['red','blue','green','black','white'])[floor(random()*5+1)::int],
        'size', (ARRAY['S','M','L','XL'])[floor(random()*4+1)::int],
        'price', (random() * 500)::numeric(10,2),
        'in_stock', random() > 0.3
    ),
    ARRAY[(ARRAY['electronics','clothing','food','toys'])[floor(random()*4+1)::int]]
FROM generate_series(1, 1000000) AS i;

CREATE INDEX idx_products_attributes ON products USING GIN (attributes);
CREATE INDEX idx_products_tags ON products USING GIN (tags);

EXPLAIN ANALYZE
SELECT * FROM products WHERE attributes @> '{"color": "red", "in_stock": true}';

EXPLAIN ANALYZE
SELECT * FROM products WHERE tags @> ARRAY['electronics'];

ALTER TABLE products ADD COLUMN search_vector TSVECTOR
    GENERATED ALWAYS AS (to_tsvector('english', name)) STORED;

CREATE INDEX idx_products_search ON products USING GIN (search_vector);

EXPLAIN ANALYZE
SELECT * FROM products
WHERE search_vector @@ to_tsquery('english', 'product & 999')
ORDER BY ts_rank(search_vector, to_tsquery('english', 'product & 999')) DESC
LIMIT 20;

Strategy 3: BRIN Index — Time Series Large Tables

CREATE TABLE sensor_data (
    id BIGSERIAL,
    device_id INT NOT NULL,
    temperature DECIMAL(5,2),
    humidity DECIMAL(5,2),
    recorded_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (recorded_at);

CREATE TABLE sensor_data_2026_q1 PARTITION OF sensor_data
    FOR VALUES FROM ('2026-01-01') TO ('2026-04-01');
CREATE TABLE sensor_data_2026_q2 PARTITION OF sensor_data
    FOR VALUES FROM ('2026-04-01') TO ('2026-07-01');
CREATE TABLE sensor_data_2026_q3 PARTITION OF sensor_data
    FOR VALUES FROM ('2026-07-01') TO ('2026-10-01');
CREATE TABLE sensor_data_2026_q4 PARTITION OF sensor_data
    FOR VALUES FROM ('2026-10-01') TO ('2027-01-01');

INSERT INTO sensor_data (device_id, temperature, humidity, recorded_at)
SELECT
    (random() * 100)::int,
    (15 + random() * 25)::decimal(5,2),
    (30 + random() * 50)::decimal(5,2),
    '2026-01-01'::timestamptz + (random() * interval '365 days')
FROM generate_series(1, 20000000);

CREATE INDEX idx_sensor_brin_recorded ON sensor_data
    USING BRIN (recorded_at) WITH (pages_per_range = 32);

CREATE INDEX idx_sensor_brin_device ON sensor_data
    USING BRIN (device_id) WITH (pages_per_range = 32);

EXPLAIN ANALYZE
SELECT * FROM sensor_data
WHERE recorded_at BETWEEN '2026-03-01' AND '2026-03-15'
ORDER BY recorded_at;

SELECT
    pg_size_pretty(pg_relation_size('idx_sensor_brin_recorded')) AS brin_size,
    pg_size_pretty(pg_relation_size('sensor_data')) AS table_size;

Strategy 4: Partial Index and Covering Index

CREATE INDEX idx_orders_pending ON orders(created_at)
WHERE status = 'pending';

EXPLAIN ANALYZE
SELECT * FROM orders WHERE status = 'pending' AND created_at > '2026-01-01';

CREATE INDEX idx_orders_covering ON orders(user_id)
INCLUDE (status, total_amount, created_at);

EXPLAIN ANALYZE
SELECT user_id, status, total_amount, created_at
FROM orders WHERE user_id = 12345;

CREATE INDEX idx_orders_active_user ON orders(user_id, created_at DESC)
INCLUDE (status, total_amount)
WHERE status IN ('pending', 'paid', 'shipped');

EXPLAIN ANALYZE
SELECT user_id, status, total_amount, created_at
FROM orders
WHERE user_id = 12345
  AND status IN ('pending', 'paid', 'shipped')
ORDER BY created_at DESC
LIMIT 20;

Strategy 5: EXPLAIN ANALYZE Deep Dive

EXPLAIN (ANYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.total_amount, o.created_at
FROM orders o
JOIN (
    SELECT user_id, MAX(created_at) AS last_order
    FROM orders
    WHERE status = 'completed'
    GROUP BY user_id
) latest ON o.user_id = latest.user_id AND o.created_at = latest.last_order
WHERE o.total_amount > 1000
ORDER BY o.total_amount DESC
LIMIT 50;

SELECT pg_stat_user_indexes.schemaname,
       pg_stat_user_indexes.relname AS table_name,
       pg_stat_user_indexes.indexrelname AS index_name,
       pg_stat_user_indexes.idx_scan AS index_scans,
       pg_stat_user_indexes.idx_tup_read AS tuples_read,
       pg_stat_user_indexes.idx_tup_fetch AS tuples_fetched,
       pg_indexes.indexdef AS index_definition
FROM pg_stat_user_indexes
JOIN pg_indexes ON pg_stat_user_indexes.indexrelname = pg_indexes.indexname
WHERE pg_stat_user_indexes.relname = 'orders'
ORDER BY pg_stat_user_indexes.idx_scan ASC;

SELECT schemaname, relname AS table_name, indexrelname AS index_name,
       idx_scan AS scans, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;

Strategy 6: Concurrent Index Creation and Maintenance

CREATE INDEX CONCURRENTLY idx_orders_updated_at ON orders(updated_at);

CREATE INDEX CONCURRENTLY idx_products_name_trgm ON products
    USING GIN (name gin_trgm_ops);

REINDEX INDEX CONCURRENTLY idx_orders_user_id;

VACUUM ANALYZE orders;

SELECT indexrelname AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
       idx_scan AS scans
FROM pg_stat_user_indexes
WHERE relname = 'orders'
ORDER BY pg_relation_size(indexrelid) DESC;

SELECT pg_size_pretty(pg_total_relation_size('orders')) AS total_size,
       pg_size_pretty(pg_relation_size('orders')) AS table_size,
       pg_size_pretty(pg_indexes_size('orders')) AS indexes_size;

CREATE OR REPLACE FUNCTION maintain_indexes()
RETURNS void AS $$
DECLARE
    idx_record RECORD;
    bloat_ratio NUMERIC;
BEGIN
    FOR idx_record IN
        SELECT schemaname, relname, indexrelname, indexrelid
        FROM pg_stat_user_indexes
        WHERE schemaname = 'public'
    LOOP
        SELECT COALESCE(
            (pg_relation_size(idx_record.indexrelid)::numeric /
             NULLIF(pg_relation_size(
                (SELECT relfilenode FROM pg_class WHERE oid = idx_record.indexrelid)
             ), 0)) * 100,
            0
        ) INTO bloat_ratio;

        IF bloat_ratio > 50 THEN
            EXECUTE format('REINDEX INDEX CONCURRENTLY %I', idx_record.indexrelname);
            RAISE NOTICE 'Reindexed % (bloat: %%%)', idx_record.indexrelname, bloat_ratio;
        END IF;
    END LOOP;

    ANALYZE;
END;
$$ LANGUAGE plpgsql;

Pitfall Guide

Pitfall 1: Using Functions on Indexed Columns

-- ❌ Wrong: Function on indexed column, PostgreSQL can't use B-tree index
SELECT * FROM orders WHERE DATE(created_at) = '2026-01-15';

-- ✅ Correct: Use range query instead of function conversion
SELECT * FROM orders
WHERE created_at >= '2026-01-15' AND created_at < '2026-01-16';

Pitfall 2: Implicit Type Casting

-- ❌ Wrong: varchar column compared with text may cause implicit cast
SELECT * FROM orders WHERE status = 'pending'::text;

-- ✅ Correct: Ensure comparison type matches column type
SELECT * FROM orders WHERE status::text = 'pending';
-- Or better: use the same type directly
SELECT * FROM orders WHERE status = 'pending';

Pitfall 3: LIKE '%abc%' Can't Use B-tree Index

-- ❌ Wrong: Leading wildcard can't use B-tree index
SELECT * FROM products WHERE name LIKE '%phone%';

-- ✅ Correct: Use pg_trgm GIN index for fuzzy search
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_products_name_trgm ON products USING GIN (name gin_trgm_ops);
SELECT * FROM products WHERE name % 'phone';

Pitfall 4: OR Conditions Cause Index Selection Issues

-- ❌ Wrong: OR condition may cause optimizer to abandon index
SELECT * FROM orders WHERE user_id = 123 OR status = 'pending';

-- ✅ Correct: Split query with UNION ALL
SELECT * FROM orders WHERE user_id = 123
UNION ALL
SELECT * FROM orders WHERE status = 'pending' AND user_id != 123;

Pitfall 5: Over-indexing Degrades Write Performance

-- ❌ Wrong: Index every column, INSERT/UPDATE performance tanks
CREATE INDEX idx_orders_col1 ON orders(col1);
CREATE INDEX idx_orders_col2 ON orders(col2);
CREATE INDEX idx_orders_col3 ON orders(col3);
CREATE INDEX idx_orders_col4 ON orders(col4);
CREATE INDEX idx_orders_col5 ON orders(col5);

-- ✅ Correct: Only index columns used in queries, regularly clean unused indexes
SELECT indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE relname = 'orders' AND idx_scan = 0;

Error Troubleshooting

# Error Message Cause Solution
1 could not create unique index, duplicate key Duplicate values in unique index column Deduplicate first, or use CREATE UNIQUE INDEX ... WHERE
2 index row size exceeds maximum GIN index row too large Use gin_pending_list_limit or optimize data
3 concurrent index creation failed Table modified during CONCURRENTLY creation Retry, check for long transactions blocking
4 cannot create index on partitioned table Direct indexing not supported on partitioned tables Create index on each partition, or use PG11+ partition indexes
5 operator does not exist: jsonb @> text JSONB query operator type mismatch Ensure right side is also jsonb: '{"key":"val"}'::jsonb
6 function gin_trgm_ops does not exist pg_trgm extension not installed CREATE EXTENSION pg_trgm;
7 out of memory Insufficient memory for large table indexing Increase maintenance_work_mem, or use CONCURRENTLY
8 relation already exists Index name conflict Use IF NOT EXISTS or change index name
9 access exclusive lock Table locked during index creation Use CREATE INDEX CONCURRENTLY to avoid locking
10 index "xxx" is not valid CONCURRENTLY creation failed, index invalid DROP INDEX xxx; then recreate

Advanced Optimization

1. Index Usage Monitoring Dashboard

CREATE OR REPLACE VIEW index_health_dashboard AS
SELECT
    schemaname,
    relname AS table_name,
    indexrelname AS index_name,
    idx_scan AS total_scans,
    idx_tup_read AS tuples_read,
    idx_tup_fetch AS tuples_fetched,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
    CASE
        WHEN idx_scan = 0 THEN 'UNUSED'
        WHEN idx_scan < 100 THEN 'LOW_USAGE'
        ELSE 'ACTIVE'
    END AS health_status,
    COALESCE(
        ROUND(
            idx_tup_fetch::numeric / NULLIF(idx_scan, 0),
            2
        ),
        0
    ) AS avg_tuples_per_scan
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY
    CASE health_status
        WHEN 'UNUSED' THEN 0
        WHEN 'LOW_USAGE' THEN 1
        ELSE 2
    END,
    pg_relation_size(indexrelid) DESC;

SELECT * FROM index_health_dashboard LIMIT 20;

2. Slow Query Auto-Capture and Index Suggestions

ALTER SYSTEM SET log_min_duration_statement = 1000;
ALTER SYSTEM SET auto_explain.log_min_duration = 1000;
ALTER SYSTEM SET auto_explain.log_analyze = true;
SELECT pg_reload_conf();

CREATE TABLE slow_query_log (
    id BIGSERIAL PRIMARY KEY,
    query_text TEXT NOT NULL,
    duration_ms NUMERIC NOT NULL,
    plan_text TEXT,
    captured_at TIMESTAMPTZ DEFAULT NOW(),
    suggested_index TEXT
);

CREATE OR REPLACE FUNCTION capture_slow_query()
RETURNS TRIGGER AS $$
BEGIN
    IF NEW.duration_ms > 5000 THEN
        INSERT INTO slow_query_log (query_text, duration_ms, suggested_index)
        VALUES (
            NEW.query_text,
            NEW.duration_ms,
            CASE
                WHEN NEW.query_text ~* 'WHERE\s+\w+\s*=' THEN
                    'Consider index on equality column'
                WHEN NEW.query_text ~* 'ORDER BY' THEN
                    'Consider index on sort column'
                WHEN NEW.query_text ~* 'LIKE' THEN
                    'Consider pg_trgm GIN index'
                ELSE 'Review EXPLAIN ANALYZE'
            END
        );
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

3. Index Bloat Detection and Auto-Maintenance

CREATE OR REPLACE FUNCTION check_index_bloat(
    p_schema TEXT DEFAULT 'public',
    p_bloat_threshold NUMERIC DEFAULT 30
)
RETURNS TABLE(
    table_name TEXT,
    index_name TEXT,
    index_size TEXT,
    bloat_pct NUMERIC,
    action TEXT
) AS $$
BEGIN
    RETURN QUERY
    SELECT
        schemaname::TEXT,
        indexrelname::TEXT,
        pg_size_pretty(pg_relation_size(indexrelid))::TEXT,
        COALESCE(
            ROUND(
                100.0 * (pg_relation_size(indexrelid) -
                    COALESCE(pg_stat_user_indexes.idx_tup_fetch, 0) * 8
                ) / NULLIF(pg_relation_size(indexrelid), 0),
                1
            ),
            0
        ),
        CASE
            WHEN COALESCE(
                ROUND(
                    100.0 * (pg_relation_size(indexrelid) -
                        COALESCE(pg_stat_user_indexes.idx_tup_fetch, 0) * 8
                    ) / NULLIF(pg_relation_size(indexrelid), 0),
                    1
                ),
                0
            ) > p_bloat_threshold
            THEN 'REINDEX CONCURRENTLY ' || quote_ident(indexrelname)
            ELSE 'OK'
        END::TEXT
    FROM pg_stat_user_indexes
    WHERE schemaname = p_schema
    ORDER BY pg_relation_size(indexrelid) DESC;
END;
$$ LANGUAGE plpgsql;

SELECT * FROM check_index_bloat('public', 30);

Comparison Analysis

Dimension B-tree Hash GIN GiST BRIN
Equality query ⭐ Fast ⭐ Fast ⚠️ Slow ⚠️ Slow ⚠️ Rough
Range query ⭐ Fast ❌ No ❌ No ⚠️ Limited ⚠️ Rough
Sorting ⭐ Yes ❌ No ❌ No ❌ No ❌ No
Full-text search ❌ No ❌ No ⭐ Best ⚠️ Possible ❌ No
JSONB ❌ Limited ❌ No ⭐ Best ⚠️ Possible ❌ No
Geospatial ❌ No ❌ No ⚠️ Limited ⭐ Best ❌ No
Index size Medium Small Large Large Tiny
Build speed Fast Fast Slow Slow Very fast
Maintenance cost Low Low High High Very low
Data volume M~B M M~B M B+

Summary: PostgreSQL index optimization isn't just "add an index" — choosing the right type is the prerequisite, understanding execution plans is the key, and avoiding index invalidation is the baseline. The 6 key strategies build progressively: 1) B-tree basics, 2) GIN for JSONB/full-text, 3) BRIN for time-series large tables, 4) Partial and covering indexes to reduce size, 5) EXPLAIN ANALYZE deep dive, 6) Concurrent index creation and auto-maintenance. Core principle: more indexes isn't better — each index has a write cost. Regularly clean unused indexes with pg_stat_user_indexes, use CONCURRENTLY to avoid table locks.


Try these browser-local tools — no sign-up required →

#PostgreSQL#索引优化#慢查询#数据库性能#SQL调优#2026#执行计划