PostgreSQL Performance Tuning Complete Guide

Why Choose PostgreSQL in 2026?

PostgreSQL has firmly established itself as the world's most popular open-source relational database in 2026. With its powerful extensibility, rich data types, and excellent SQL standard compliance, it has become the database of choice from startups to large enterprises.

PostgreSQL Core Advantages

Dimension	PostgreSQL	MySQL	SQL Server	Oracle
Extensibility	Extremely strong (extension framework)	Medium	Medium	Strong
Data types	Extremely rich (JSONB, GIS, arrays)	Medium	Rich	Rich
SQL standard compliance	Closest to standard	Medium	High	High
Open source	Fully open (PostgreSQL License)	GPL	Partially open	Closed
Concurrency control	MVCC (excellent)	MVCC	MVCC	MVCC
Partitioning	Native declarative partitioning	Native partitioning	Native partitioning	Native partitioning

Key PostgreSQL Milestones in 2026

PostgreSQL 18: Further enhanced parallel queries, significantly improved logical replication performance
Incremental Sort optimization: 3-5x performance improvement in multi-column sort scenarios
JSON path expression enhancements: More powerful JSONB query and index support
Async I/O support: Underlying I/O architecture improvements,大幅提升 high-concurrency throughput
Logical replication improvements: DDL replication support, greatly simplifying data sync architecture

💡 Use the SQL Formatter tool to beautify PostgreSQL queries for easier execution plan analysis.

EXPLAIN ANALYZE Deep Dive

EXPLAIN ANALYZE is PostgreSQL's most critical performance tuning tool. It not only displays the query plan but also executes the query and returns actual timing statistics.

Basic Usage

-- View execution plan (without executing)
EXPLAIN SELECT o.order_id, c.customer_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= '2026-01-01'
  AND o.status = 'shipped';

-- Execute query and return actual statistics
EXPLAIN ANALYZE SELECT o.order_id, c.customer_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= '2026-01-01'
  AND o.status = 'shipped';

-- View detailed output (with buffer statistics)
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders WHERE status = 'shipped';

Scan Types Explained

Sequential Scan (Seq Scan)

EXPLAIN ANALYZE SELECT * FROM orders WHERE amount > 1000;
-- Seq Scan on orders  (cost=0.00..15432.00 rows=100000 width=88)
--   Filter: (amount > 1000)

Use case: Small table, query returns most rows, no suitable index
Optimization: If returned rows < 5% of total, consider adding an index

Index Scan

EXPLAIN ANALYZE SELECT * FROM orders WHERE order_id = 12345;
-- Index Scan using orders_pkey on orders  (cost=0.42..8.44 rows=1 width=88)
--   Index Cond: (order_id = 12345)

Use case: Equality queries, high selectivity conditions, returning few rows
Optimization: Ensure index column order matches query conditions

Bitmap Scan

EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'shipped';
-- Bitmap Heap Scan on orders  (cost=523.12..8234.56 rows=15000 width=88)
--   Recheck Cond: (status = 'shipped')
--   -> Bitmap Index Scan on idx_orders_status  (cost=0.00..519.37 rows=15000)

Use case: Returning moderate rows (5%-15% of table), multi-condition combined queries
Optimization: Bitmap Index Scan + Bitmap Heap Scan combination, suitable for medium selectivity

Join Types Explained

Nested Loop Join

EXPLAIN ANALYZE
SELECT * FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
WHERE o.customer_id = 100;
-- Nested Loop  (cost=0.85..234.56 rows=50 width=200)
--   -> Index Scan using idx_orders_customer on orders o
--   -> Index Scan using idx_items_order on order_items oi

Use case: Small table driving large table, inner table has index, returning few rows
Optimization: Ensure inner table join column has an index

Hash Join

EXPLAIN ANALYZE
SELECT * FROM orders o
JOIN customers c ON o.customer_id = c.customer_id;
-- Hash Join  (cost=345.67..12345.89 rows=100000 width=200)
--   Hash Cond: (o.customer_id = c.customer_id)
--   -> Seq Scan on orders o
--   -> Hash
--       -> Seq Scan on customers c

Use case: Large table joining large table, equality join, no sort requirement
Optimization: Ensure Build side (Hash table) fits in work_mem

Merge Join

EXPLAIN ANALYZE
SELECT * FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
ORDER BY o.customer_id;
-- Merge Join  (cost=0.86..15678.90 rows=100000 width=200)
--   Merge Cond: (o.customer_id = c.customer_id)
--   -> Index Scan using idx_orders_customer on orders o
--   -> Index Scan using customers_pkey on customers c

Use case: Both sides are sorted, equality join, large data volumes
Optimization: Ensure both sides' join columns have indexes (sorted)

EXPLAIN Output Key Fields

Field	Meaning	Focus
cost	Estimated cost (startup..total)	High total cost needs optimization
rows	Estimated returned rows	Large deviation from actual requires ANALYZE
width	Average row width (bytes)	Too wide — consider reducing SELECT columns
actual time	Actual time (milliseconds)	Watch for slow nodes
actual rows	Actually returned rows	Large deviation from estimate — update statistics
loops	Execution count	In nested loops: loops × time = total time
Buffers	Buffer hit statistics	High shared hit = cache-friendly

💡 Use the JSON Formatter tool to parse FORMAT JSON EXPLAIN output.

Index Types & Selection Practice

PostgreSQL offers a rich set of index types. Choosing the right index type is key to performance tuning.

B-tree Index (Default)

-- Create B-tree index
CREATE INDEX idx_orders_date ON orders (order_date);

-- Composite B-tree index
CREATE INDEX idx_orders_status_date ON orders (status, order_date);

-- Use case: Equality queries, range queries, sorting, LIKE 'prefix%'
-- Most commonly used index type, covering 90%+ scenarios

GIN Index (Inverted Index)

-- JSONB field index
CREATE INDEX idx_products_attrs ON products USING GIN (attributes);

-- Array field index
CREATE INDEX idx_articles_tags ON articles USING GIN (tags);

-- Full-text search index
CREATE INDEX idx_docs_content ON documents USING GIN (to_tsvector('english', content));

-- Query examples
SELECT * FROM products WHERE attributes @> '{"color": "red"}';
SELECT * FROM articles WHERE tags @> ARRAY['postgresql', 'performance'];
SELECT * FROM documents WHERE to_tsvector('english', content) @@ to_tsquery('postgresql & tuning');

Use case: JSONB, arrays, full-text search
Note: GIN indexes are slow to build but fast to query; consider fastupdate = on for faster writes

GiST Index (Generalized Search Tree)

-- Geospatial index (PostGIS)
CREATE INDEX idx_stores_location ON stores USING GiST (location);

-- Range type index
CREATE INDEX idx_events_period ON events USING GiST (time_range);

-- Query examples
SELECT * FROM stores WHERE ST_DWithin(location, ST_Point(116.4, 39.9), 5000);
SELECT * FROM events WHERE time_range && '[2026-01-01, 2026-06-30)'::daterange;

Use case: Geospatial, range types, tree structures
Note: GiST supports custom operator classes

BRIN Index (Block Range Index)

-- Time-series data index
CREATE INDEX idx_logs_created ON logs USING BRIN (created_at);

-- Custom block size
CREATE INDEX idx_logs_created ON logs USING BRIN (created_at) WITH (pages_per_range = 32);

-- Query example
SELECT * FROM logs WHERE created_at >= '2026-06-01' ORDER BY created_at DESC LIMIT 100;

Use case: Naturally sorted columns in large tables (time-series, auto-increment IDs)
Advantage: Extremely small index size (typically 1/100 of B-tree)
Note: Only suitable for physically ordered columns; selectivity queries are worse than B-tree

Hash Index

-- Create Hash index
CREATE INDEX idx_users_email_hash ON users USING HASH (email);

-- Use case: Pure equality queries
-- Note: PostgreSQL 10+ Hash indexes support WAL and are safe to use
-- But B-tree is usually sufficient; Hash index use cases are limited

Index Type Selection Decision Table

Scenario	Recommended Index	Reason
Equality/range queries	B-tree	Most versatile, stable performance
JSONB containment queries	GIN	Native @> operator support
Full-text search	GIN	Efficient retrieval with tsvector
Geospatial queries	GiST	PostGIS standard pairing
Range overlap queries	GiST	Native && operator support
Time-series range queries	BRIN	Tiny index, suitable for physically ordered data
Pure equality queries (huge tables)	Hash	Slightly faster than B-tree for equality

Partial Indexes & Expression Indexes

Partial Index

-- Only index pending orders (reduce index size)
CREATE INDEX idx_orders_pending ON orders (customer_id, order_date)
WHERE status = 'pending';

-- Only index active users
CREATE INDEX idx_users_active ON users (last_login_at)
WHERE is_active = true;

-- Query automatically uses partial index
SELECT * FROM orders WHERE status = 'pending' AND customer_id = 100;

Advantage: Significantly reduced index size, lower write overhead
Note: Query conditions must satisfy the WHERE clause to use the index

Expression Index

-- Case-insensitive query
CREATE INDEX idx_users_email_lower ON users (LOWER(email));

-- Date truncation query
CREATE INDEX idx_orders_date_trunc ON orders (DATE(order_date));

-- JSONB specific key index
CREATE INDEX idx_products_color ON products ((attributes->>'color'));

-- Query examples
SELECT * FROM users WHERE LOWER(email) = 'admin@example.com';
SELECT * FROM orders WHERE DATE(order_date) = '2026-06-11';
SELECT * FROM products WHERE attributes->>'color' = 'red';

Advantage: Avoids computing expressions at query time, uses index directly
Note: Query conditions must exactly match the index expression

Composite Index Column Order Principles

-- Follow: equality columns → range columns → sort columns
CREATE INDEX idx_orders_optimal ON orders (status, customer_id, order_date DESC);

-- Covering index: include all queried columns to avoid table lookup
CREATE INDEX idx_orders_covering ON orders (status, order_date)
INCLUDE (customer_id, amount);

Query Optimization Patterns

Avoid SELECT *

-- Anti-pattern
SELECT * FROM orders WHERE customer_id = 100;

-- Optimization: only query needed columns
SELECT order_id, order_date, amount
FROM orders WHERE customer_id = 100;

Pagination Optimization

-- Anti-pattern: OFFSET performs poorly at large offsets
SELECT * FROM orders ORDER BY order_id OFFSET 100000 LIMIT 20;

-- Optimization 1: Keyset pagination (cursor pagination)
SELECT * FROM orders WHERE order_id > 100000 ORDER BY order_id LIMIT 20;

-- Optimization 2: Covering index + JOIN
SELECT o.* FROM orders o
JOIN (SELECT order_id FROM orders ORDER BY order_id OFFSET 100000 LIMIT 20) t
ON o.order_id = t.order_id;

Subquery Optimization

-- Anti-pattern: correlated subquery
SELECT * FROM orders o
WHERE EXISTS (SELECT 1 FROM order_items oi WHERE oi.order_id = o.order_id AND oi.price > 100);

-- Optimization: rewrite as JOIN
SELECT DISTINCT o.* FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
WHERE oi.price > 100;

CTE Optimization

-- PostgreSQL 12+: CTE can be inlined by default
WITH recent_orders AS (
  SELECT * FROM orders WHERE order_date >= '2026-01-01'
)
SELECT ro.*, c.customer_name
FROM recent_orders ro
JOIN customers c ON ro.customer_id = c.customer_id;

-- Explicitly materialize CTE when needed
WITH MATERIALIZED expensive_calc AS (
  SELECT customer_id, SUM(amount) AS total
  FROM orders GROUP BY customer_id
)
SELECT * FROM expensive_calc WHERE total > 10000;

Batch Operation Optimization

-- Batch insert (single INSERT with multiple values)
INSERT INTO orders (customer_id, amount, status)
VALUES
  (1, 100.00, 'pending'),
  (2, 200.00, 'shipped'),
  (3, 150.00, 'pending');

-- Batch update using CASE
UPDATE orders SET status = CASE
  WHEN order_id = 1 THEN 'shipped'
  WHEN order_id = 2 THEN 'delivered'
  WHEN order_id = 3 THEN 'cancelled'
  ELSE status
END
WHERE order_id IN (1, 2, 3);

-- Fast import using COPY
COPY orders FROM '/data/orders.csv' WITH (FORMAT csv, HEADER true);

Connection Pooling (PgBouncer)

PgBouncer is PostgreSQL's most popular connection pool, effectively solving performance issues caused by too many connections.

Why Connection Pooling?

Each PostgreSQL connection forks a process, consuming ~10MB of memory. 1000 connections = 10GB memory overhead, plus high process switching costs.

PgBouncer Installation & Configuration

# Install PgBouncer
sudo apt install pgbouncer

# Start PgBouncer
sudo systemctl start pgbouncer
sudo systemctl enable pgbouncer

; /etc/pgbouncer/pgbouncer.ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 10000
default_pool_size = 50
min_pool_size = 10
reserve_pool_size = 10
reserve_pool_timeout = 3
server_idle_timeout = 300
server_lifetime = 3600
log_connections = 0
log_disconnections = 0
stats_period = 60

Connection Pool Modes

Mode	Feature	Use Case
session	Session-level reuse, connection bound to client	Needs session variables, temp tables
transaction	Transaction-level reuse, connection returned after transaction	Most web apps (recommended)
statement	Statement-level reuse, returned after each statement	Short queries, no transactions

Connection Planning

-- View current connection count
SELECT count(*) FROM pg_stat_activity;

-- Group by state
SELECT state, count(*) FROM pg_stat_activity GROUP BY state ORDER BY count DESC;

-- View PostgreSQL max connections
SHOW max_connections;

-- Recommended formula
-- PgBouncer default_pool_size = CPU cores × 2 + disk count
-- PostgreSQL max_connections = sum of all PgBouncer pool_size + reserve

Core Parameter Tuning

shared_buffers

-- Shared buffer size, PostgreSQL's most critical parameter
-- Recommended: 25% of total system memory, but not exceeding 8GB (Linux)
ALTER SYSTEM SET shared_buffers = '4GB';

work_mem

-- Memory limit for sort and hash operations
-- Recommended: total memory / (max_connections × 3), typically 16MB-256MB
ALTER SYSTEM SET work_mem = '64MB';

-- Check current work_mem causing disk sorts
SELECT query, calls, temp_blks_written
FROM pg_stat_statements
WHERE temp_blks_written > 0
ORDER BY temp_blks_written DESC LIMIT 10;

effective_cache_size

-- Query optimizer's estimate of system cache (doesn't actually allocate memory)
-- Recommended: 50%-75% of total system memory
ALTER SYSTEM SET effective_cache_size = '12GB';

-- WAL write method: open_datasync (Linux default), fdatasync, fsync, etc.
ALTER SYSTEM SET wal_sync_method = 'open_datasync';

-- WAL buffer size
ALTER SYSTEM SET wal_buffers = '64MB';

-- Checkpoint interval
ALTER SYSTEM SET checkpoint_timeout = '10min';
ALTER SYSTEM SET max_wal_size = '2GB';
ALTER SYSTEM SET min_wal_size = '512MB';

-- Integrity guarantee (must be ON in production)
ALTER SYSTEM SET fsync = 'on';
ALTER SYSTEM SET synchronous_commit = 'on';

Key Parameter Quick Reference

Parameter	Recommended (16GB RAM)	Description
shared_buffers	4GB	Shared buffers
work_mem	64MB	Sort/hash memory
effective_cache_size	12GB	Optimizer cache estimate
maintenance_work_mem	1GB	Maintenance operation memory
wal_buffers	64MB	WAL buffer
max_wal_size	2GB	Max WAL size
checkpoint_timeout	10min	Checkpoint interval
random_page_cost	1.1 (SSD)	Random page cost
effective_io_concurrency	200 (SSD)	Concurrent I/O count
max_connections	100-200	Max connections

# Use pgtune to auto-generate recommended configuration
pgtune -i postgresql.conf -M 16GB -T web -c 200

Partitioning Strategies

PostgreSQL 10+ supports native declarative partitioning, the core approach for handling large tables.

Range Partitioning

-- Partition orders table by month
CREATE TABLE orders (
  order_id BIGINT,
  customer_id INT,
  order_date DATE,
  amount DECIMAL(10,2),
  status TEXT
) PARTITION BY RANGE (order_date);

CREATE TABLE orders_2026_q1 PARTITION OF orders
  FOR VALUES FROM ('2026-01-01') TO ('2026-04-01');
CREATE TABLE orders_2026_q2 PARTITION OF orders
  FOR VALUES FROM ('2026-04-01') TO ('2026-07-01');
CREATE TABLE orders_2026_q3 PARTITION OF orders
  FOR VALUES FROM ('2026-07-01') TO ('2026-10-01');
CREATE TABLE orders_2026_q4 PARTITION OF orders
  FOR VALUES FROM ('2026-10-01') TO ('2027-01-01');
CREATE TABLE orders_default PARTITION OF orders DEFAULT;

List Partitioning

-- Partition by region
CREATE TABLE customers (
  customer_id BIGINT,
  name TEXT,
  region TEXT,
  created_at TIMESTAMP
) PARTITION BY LIST (region);

CREATE TABLE customers_north PARTITION OF customers
  FOR VALUES IN ('Beijing', 'Tianjin', 'Hebei');
CREATE TABLE customers_south PARTITION OF customers
  FOR VALUES IN ('Guangdong', 'Fujian', 'Hainan');
CREATE TABLE customers_default PARTITION OF customers DEFAULT;

Hash Partitioning

-- Hash partitioning (even distribution)
CREATE TABLE access_logs (
  log_id BIGINT,
  user_id INT,
  path TEXT,
  created_at TIMESTAMP
) PARTITION BY HASH (user_id);

CREATE TABLE access_logs_p0 PARTITION OF access_logs
  FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE access_logs_p1 PARTITION OF access_logs
  FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE access_logs_p2 PARTITION OF access_logs
  FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE access_logs_p3 PARTITION OF access_logs
  FOR VALUES WITH (MODULUS 4, REMAINDER 3);

Partition Maintenance

-- Auto-create future partitions (using pg_partman extension)
CREATE EXTENSION pg_partman;
SELECT partman.create_parent(
  p_parent_table := 'public.orders',
  p_control := 'order_date',
  p_type := 'range',
  p_interval := '1 month',
  p_premake := 6
);

-- Verify partition pruning
EXPLAIN SELECT * FROM orders WHERE order_date = '2026-06-11';
-- Should only scan orders_2026_q2 partition

-- Drop old partition (after archiving)
DROP TABLE orders_2024_q1;

Parallel Query

PostgreSQL 9.6+ introduced parallel queries, and version 18 has significantly enhanced them.

Enabling Parallel Query

-- Enable parallel queries
ALTER SYSTEM SET max_parallel_workers_per_gather = 4;
ALTER SYSTEM SET max_parallel_workers = 8;
ALTER SYSTEM SET parallel_tuple_cost = 0.01;
ALTER SYSTEM SET parallel_setup_cost = 100;

-- Minimum table size to trigger parallelism (default 8MB)
ALTER SYSTEM SET min_parallel_table_scan_size = '8MB';

-- Parallel worker count scales with table size
ALTER SYSTEM SET min_parallel_index_scan_size = '512kB';

Parallel Query Verification

EXPLAIN ANALYZE SELECT COUNT(*), SUM(amount)
FROM orders WHERE order_date >= '2026-01-01';
-- Gather Merge
--   -> Sort
--       -> Partial Aggregate
--           -> Parallel Seq Scan on orders

Parallel Query Applicable Scenarios

Scenario	Parallel Effect	Notes
Large table full scan	Excellent	Parallel Seq Scan
Large table aggregation	Excellent	Parallel Aggregation
Large table JOIN	Good	Parallel Hash Join
Index scan	Moderate	Parallel Bitmap Scan
Small table query	No effect	Below min_parallel_table_scan_size
CTE (materialized)	Not supported	MATERIALIZED CTE not parallelized

VACUUM & autovacuum Tuning

PostgreSQL's MVCC mechanism produces dead tuples that must be reclaimed through VACUUM.

Manual VACUUM

-- Regular VACUUM (doesn't reclaim disk space, only marks reusable)
VACUUM orders;

-- VACUUM FULL (reclaims disk space, locks table!)
VACUUM FULL orders;

-- Parallel VACUUM (PostgreSQL 12+, no table lock)
VACUUM (PARALLEL 4) orders;

-- Analyze statistics
ANALYZE orders;

-- VACUUM + ANALYZE together
VACUUM ANALYZE orders;

autovacuum Tuning

-- Enable autovacuum (enabled by default)
ALTER SYSTEM SET autovacuum = 'on';

-- autovacuum worker count
ALTER SYSTEM SET autovacuum_max_workers = 4;

-- Trigger threshold (fires when dead tuples exceed this value)
-- Formula: autovacuum_vacuum_scale_factor × table rows + autovacuum_vacuum_threshold
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.1;
ALTER SYSTEM SET autovacuum_vacuum_threshold = 50;

-- Analyze trigger threshold
ALTER SYSTEM SET autovacuum_analyze_scale_factor = 0.05;
ALTER SYSTEM SET autovacuum_analyze_threshold = 50;

-- autovacuum per-cycle cost limit
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '2ms';

Per-Table autovacuum Tuning

-- Lower trigger threshold for frequently updated tables
ALTER TABLE orders SET (
  autovacuum_vacuum_scale_factor = 0.01,
  autovacuum_analyze_scale_factor = 0.01,
  autovacuum_vacuum_cost_limit = 5000
);

-- View table VACUUM statistics
SELECT relname, n_live_tup, n_dead_tup,
  last_vacuum, last_autovacuum,
  last_analyze, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC LIMIT 10;

Monitoring VACUUM Progress

-- PostgreSQL 12+ view VACUUM progress
SELECT pid, datname, relid::regclass, phase,
  heap_blks_total, heap_blks_scanned, heap_blks_vacuumed,
  index_vacuum_count, num_dead_item_ids
FROM pg_stat_progress_vacuum;

pg_stat_statements for Slow Query Analysis

pg_stat_statements is PostgreSQL's most powerful slow query analysis tool.

Installation & Enabling

-- Load extension
CREATE EXTENSION pg_stat_statements;

-- Modify postgresql.conf
-- shared_preload_libraries = 'pg_stat_statements'
-- pg_stat_statements.track = all
-- pg_stat_statements.max = 10000

Top N Slow Queries

-- Sort by total time
SELECT query, calls, total_exec_time, mean_exec_time,
  min_exec_time, max_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC LIMIT 10;

-- Sort by average time
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
WHERE calls > 10
ORDER BY mean_exec_time DESC LIMIT 10;

-- Sort by disk I/O
SELECT query, calls, shared_blks_read, shared_blks_hit,
  ROUND(shared_blks_hit::numeric / NULLIF(shared_blks_hit + shared_blks_read, 0), 4) AS cache_hit_ratio
FROM pg_stat_statements
WHERE shared_blks_read > 0
ORDER BY shared_blks_read DESC LIMIT 10;

-- Reset statistics
SELECT pg_stat_statements_reset();

Cache Hit Ratio Analysis

-- Overall cache hit ratio (should be > 99%)
SELECT
  sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) AS cache_hit_ratio
FROM pg_statio_user_tables;

-- Per-table cache hit ratio
SELECT relname,
  heap_blks_hit, heap_blks_read,
  ROUND(heap_blks_hit::numeric / NULLIF(heap_blks_hit + heap_blks_read, 0), 4) AS cache_hit_ratio
FROM pg_statio_user_tables
WHERE heap_blks_read > 0
ORDER BY heap_blks_read DESC;

Common Performance Anti-Patterns

Anti-Pattern 1: Over-Indexing

-- Problem: Every index adds write overhead
-- Check unused indexes
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;

-- Drop unused indexes
DROP INDEX CONCURRENTLY idx_orders_unused;

Anti-Pattern 2: Large Transactions

-- Problem: Long transactions prevent VACUUM from reclaiming dead tuples, causing table bloat
-- Check long transactions
SELECT pid, now() - xact_start AS duration, query, state
FROM pg_stat_activity
WHERE state IN ('idle in transaction', 'active')
  AND xact_start IS NOT NULL
ORDER BY duration DESC;

-- Set transaction timeout (PostgreSQL 17+)
ALTER SYSTEM SET idle_in_transaction_session_timeout = '60000';

Anti-Pattern 3: Type Mismatch

-- Problem: Implicit type conversion prevents index usage
-- Anti-pattern
SELECT * FROM orders WHERE order_id = '12345';  -- VARCHAR vs BIGINT mismatch

-- Optimization: ensure types match
SELECT * FROM orders WHERE order_id = 12345;

Anti-Pattern 4: OR Condition Abuse

-- Anti-pattern
SELECT * FROM orders WHERE customer_id = 100 OR amount > 5000;

-- Optimization: use UNION ALL
SELECT * FROM orders WHERE customer_id = 100
UNION ALL
SELECT * FROM orders WHERE amount > 5000 AND customer_id != 100;

Anti-Pattern 5: Functions Invalidating Indexes

-- Anti-pattern: using function on indexed column
SELECT * FROM orders WHERE DATE(created_at) = '2026-06-11';

-- Optimization: use range query
SELECT * FROM orders WHERE created_at >= '2026-06-11' AND created_at < '2026-06-12';

-- Or use expression index
CREATE INDEX idx_orders_created_date ON orders (DATE(created_at));

Prometheus + Grafana Monitoring

postgres_exporter Deployment

# Install postgres_exporter
wget https://github.com/prometheus-community/postgres_exporter/releases/download/v0.15.0/postgres_exporter-0.15.0.linux-amd64.tar.gz
tar xzf postgres_exporter-0.15.0.linux-amd64.tar.gz

# Configure data source
export DATA_SOURCE_NAME="postgresql://monitor:password@localhost:5432/mydb?sslmode=disable"

# Start
./postgres_exporter --web.listen-address=:9187

Prometheus Configuration

scrape_configs:
  - job_name: 'postgresql'
    static_configs:
      - targets: ['localhost:9187']
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: 'pg_.*'
        action: keep

Key Monitoring Metrics

Metric	Meaning	Alert Threshold
pg_stat_activity_count	Current connections	> max_connections × 80%
pg_stat_database_deadlocks	Deadlock count	> 0
pg_stat_user_tables_n_dead_tup	Dead tuple count	> total rows × 10%
pg_stat_statements_mean_exec_time	Average query time	P99 > 500ms
pg_replication_lag	Replication lag	> 30s
pg_database_size_bytes	Database size	> disk 85%

Alert Rule Examples

groups:
  - name: postgresql_alerts
    rules:
      - alert: PostgreSQLHighConnections
        expr: pg_stat_activity_count > 160
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "PostgreSQL connections too high"
          description: "Current connections exceed 160 for 5 minutes"

      - alert: PostgreSQLReplicationLag
        expr: pg_replication_lag > 30
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "PostgreSQL replication lag too high"
          description: "Replication lag exceeds 30 seconds"

      - alert: PostgreSQLDeadTuplesHigh
        expr: pg_stat_user_tables_n_dead_tup / pg_stat_user_tables_n_live_tup > 0.1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "PostgreSQL dead tuple ratio too high"
          description: "Dead tuples exceed 10%, VACUUM needed"

💡 Use the Hash Encryption tool to generate signing keys for monitoring alert webhooks.

FAQ

Q1: How large should shared_buffers be?

Recommended setting is 25% of total system memory, but not exceeding 8GB (Linux). This is because PostgreSQL relies on the OS page cache; overly large shared_buffers competes with OS cache. For a 16GB RAM server, set shared_buffers = 4GB.

Q2: When should I use BRIN index instead of B-tree?

When a table is very large (> 100 million rows) and column values correlate with physical storage order (like timestamps, auto-increment IDs), BRIN index size is only 1/100 of B-tree, with comparable query performance. Typical use case: time columns in log tables, collection timestamps in time-series data.

Q3: How to determine if partitioning is needed?

When a single table exceeds 10 million rows and queries typically involve only partial data (e.g., time range queries), consider partitioning. Partition pruning can significantly reduce scanned data volume.

Q4: When to use VACUUM FULL?

VACUUM FULL locks the table and rebuilds it entirely, reclaiming all fragmented space. Use only when table bloat is severe (dead tuples > 50%) and regular VACUUM cannot reclaim space. The pg_repack extension is recommended as an online alternative.

Q5: What are the limitations of PgBouncer's transaction mode?

In transaction mode, connections are returned after the transaction ends, so: cannot use temp tables (cross-transaction), cannot use SET for session-level parameters (cross-transaction), cannot use LISTEN/NOTIFY (cross-transaction), cannot use prepared statements (cross-transaction). Use session mode if you need these features.

Q6: How to quickly identify slow queries?

Enable pg_stat_statements extension
Query pg_stat_statements sorted by total_exec_time
Use EXPLAIN ANALYZE on slow queries to analyze execution plans
Check for Seq Scan, high-cost nodes
Add indexes or rewrite queries accordingly

Q7: What should random_page_cost be set to?

SSD/NVMe: Set to 1.1 (close to sequential I/O cost)
HDD: Keep default 4.0
This parameter directly affects the optimizer's choice between index scan and sequential scan

Q8: How to determine PostgreSQL connection limit?

Each connection consumes ~10MB of memory. Recommended formula: max_connections = available_memory(GB) × 100, but actual connections should be managed through PgBouncer pooling. PostgreSQL's max_connections can be set to 100-200; external connections are managed by PgBouncer.

Summary

PostgreSQL remains the most powerful open-source relational database in 2026. Performance tuning requires a systematic approach across four dimensions: query analysis, index design, parameter configuration, and architecture optimization. Key takeaways:

EXPLAIN ANALYZE is the starting point for performance tuning; understand each scan and join type's applicable scenarios
Index selection is critical: B-tree for general use, GIN for JSONB/arrays, BRIN for time-series data
Parameter tuning four cores: shared_buffers, work_mem, effective_cache_size, WAL configuration
Partitioning strategy is the foundation for large table management: range partitioning most common, hash partitioning for uniformity
VACUUM tuning prevents table bloat: autovacuum must be enabled and properly configured
Monitoring system is the safeguard: pg_stat_statements + postgres_exporter + Grafana trio