PostgreSQL Performance Tuning: From Good to Great
Practical strategies for optimizing PostgreSQL queries and database performance in production.
Harshit Shrivastav
Contributor
PostgreSQL is fast out of the box, but with proper tuning, it can be blazing fast. Here's how to optimize your database for production workloads.
Understanding the Basics
Before optimizing, understand how Postgres works:
- Shared buffers: In-memory cache for data
- Work mem: Memory for sort/hash operations
- WAL buffers: Write-ahead log buffers
- Query planner: Chooses execution strategy
Essential Configuration
shared_buffers
Set to 25% of system RAM:
shared_buffers = 8GB # For 32GB system
effective_cache_size
Set to 50-75% of system RAM:
effective_cache_size = 24GB # For 32GB system
work_mem
Memory per operation (sort, hash):
work_mem = 64MB # Adjust based on concurrent connections
maintenance_work_mem
For VACUUM, CREATE INDEX:
maintenance_work_mem = 2GB
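The percentages above can be turned into concrete values with a small script. This is an illustrative sketch of the rules of thumb in this section; the function name is mine, not part of any PostgreSQL tooling, and the work_mem heuristic (a quarter of RAM divided across connections) is one common approach, so its output may differ from the 64MB example above.

```python
def suggest_settings(ram_gb: int, max_connections: int = 100) -> dict:
    """Rough postgresql.conf suggestions from the rules of thumb above.

    Illustrative only -- always validate against your actual workload.
    """
    return {
        # 25% of RAM for the shared buffer cache
        "shared_buffers": f"{ram_gb // 4}GB",
        # 50-75% of RAM; this uses 75% (a planner hint, not an allocation)
        "effective_cache_size": f"{ram_gb * 3 // 4}GB",
        # Per sort/hash operation; divide a RAM budget by expected concurrency
        "work_mem": f"{ram_gb * 1024 // 4 // max_connections}MB",
        # VACUUM / CREATE INDEX; a few percent of RAM is typical, capped at 2GB
        "maintenance_work_mem": f"{min(2, max(1, ram_gb // 16))}GB",
    }

print(suggest_settings(32))
```

Plugging in 32GB reproduces the values used in the examples above: shared_buffers 8GB, effective_cache_size 24GB, maintenance_work_mem 2GB.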
Index Strategy
The Right Indexes
-- B-tree index (default, good for most cases)
CREATE INDEX idx_users_email ON users(email);
-- Partial index (for filtered queries)
CREATE INDEX idx_active_users ON users(email)
WHERE active = true;
-- Composite index (order matters!)
CREATE INDEX idx_posts_user_created ON posts(user_id, created_at);
-- Expression index
CREATE INDEX idx_users_lower_email ON users(LOWER(email));
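Why does column order in a composite index matter? The index is sorted by the first column, then the second, so a query that filters on the leading column and orders by the next one needs no separate sort step. You don't need a running Postgres instance to see this effect; the stdlib sqlite3 module exhibits the same planner behavior (table and index names here mirror the examples above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER, user_id INTEGER, created_at TEXT)")
conn.execute("CREATE INDEX idx_posts_user_created ON posts(user_id, created_at)")

# user_id (the leading column) filters; created_at (the second) is already
# sorted within each user_id, so the planner needs no separate sort step.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM posts WHERE user_id = ? ORDER BY created_at DESC LIMIT 10",
    (123,),
).fetchall()
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)
```

The plan shows a search using idx_posts_user_created and, crucially, no temporary B-tree for the ORDER BY. An index on (created_at, user_id) instead would not serve this query nearly as well.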
Avoid Index Bloat
-- Check index usage
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0;
-- Remove unused indexes
DROP INDEX IF EXISTS unused_index;
Query Optimization
Use EXPLAIN ANALYZE
EXPLAIN ANALYZE
SELECT * FROM posts
WHERE user_id = 123
ORDER BY created_at DESC
LIMIT 10;
Look for:
- Sequential scans on large tables (bad)
- Index scans (good)
- High execution time
- Large row estimates vs actual
Optimize JOIN Operations
-- Bad: Missing index
SELECT p.*, u.name
FROM posts p
JOIN users u ON p.user_id = u.id;
-- Good: Add index
CREATE INDEX idx_posts_user_id ON posts(user_id);
Avoid N+1 Queries
// Bad: N+1 in application code
for (const post of posts) {
  const author = await db.query('SELECT * FROM users WHERE id = $1', [post.user_id]);
}
-- Good: Single JOIN
SELECT p.*, u.*
FROM posts p
JOIN users u ON p.user_id = u.id;
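The cost difference is easy to measure by counting statements. A minimal sketch using the stdlib sqlite3 module (a stand-in for Postgres here; the trace callback counts every statement the application sends):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, content TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
""")

queries = []
conn.set_trace_callback(queries.append)  # record every statement sent

# N+1: one query for posts, then one per post for its author
posts = conn.execute("SELECT id, user_id FROM posts").fetchall()
for _, user_id in posts:
    conn.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
n_plus_one = len(queries)

queries.clear()
# Single JOIN: one statement fetches posts and authors together
rows = conn.execute(
    "SELECT p.id, u.name FROM posts p JOIN users u ON p.user_id = u.id"
).fetchall()
joined = len(queries)

print(n_plus_one, joined)  # 4 1
```

With 3 posts the N+1 pattern issues 4 statements; the JOIN issues 1. The gap grows linearly with the row count, and each extra statement is a network round trip in a real deployment.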
Connection Pooling
Use PgBouncer for connection pooling:
# pgbouncer.ini
[databases]
mydb = host=localhost dbname=mydb
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
Benefits:
- Reduce connection overhead
- Better resource utilization
- Handle more concurrent clients
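PgBouncer sits outside the application, but the core idea, reusing a fixed set of connections instead of opening one per request, can be sketched in a few lines. This is a toy illustration of the concept, not how PgBouncer is implemented:

```python
import queue

class ConnectionPool:
    """Toy pool: hand out pre-opened connections, block when exhausted."""

    def __init__(self, connect, size: int):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):                  # pay the connection cost once, up front
            self._pool.put(connect())

    def acquire(self, timeout: float = 5.0):
        return self._pool.get(timeout=timeout) # block instead of opening more

    def release(self, conn) -> None:
        self._pool.put(conn)                   # return for reuse, don't close

# Usage with a stand-in "connection" factory (a real application would pass
# an actual database connect function here):
opened = []
pool = ConnectionPool(lambda: opened.append(object()) or opened[-1], size=2)
for _ in range(10):                            # 10 "requests" ...
    c = pool.acquire()
    pool.release(c)
print(len(opened))                             # ... but only 2 connections opened
```

Ten requests, two connections: that is the overhead reduction in miniature. PgBouncer's transaction pool_mode adds a further trick, returning the connection to the pool between transactions rather than holding it for the client's whole session.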
Vacuuming and Maintenance
Auto-vacuum Settings
autovacuum = on
autovacuum_max_workers = 4
autovacuum_naptime = 10s
Manual VACUUM
-- Regular vacuum
VACUUM ANALYZE posts;
-- Full vacuum (locks table)
VACUUM FULL posts;
-- Check table sizes (largest first, a starting point for finding bloat)
SELECT schemaname, tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
Monitoring Queries
Slow Query Log
log_min_duration_statement = 1000 # Log queries > 1s (value in milliseconds)
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '
pg_stat_statements
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Find slow queries (PostgreSQL 13+; on older versions the
-- columns are total_time, mean_time, and max_time)
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
max_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
Partitioning Large Tables
For tables with millions of rows:
-- Create partitioned table
CREATE TABLE posts_partitioned (
id SERIAL,
content TEXT,
created_at TIMESTAMP
) PARTITION BY RANGE (created_at);
-- Create partitions
CREATE TABLE posts_2024_01 PARTITION OF posts_partitioned
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE posts_2024_02 PARTITION OF posts_partitioned
FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
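Monthly partitions like the two above get tedious to write by hand; a small generator can emit the DDL for a whole year. The helper name here is mine, not a Postgres feature:

```python
from datetime import date

def monthly_partitions(parent: str, year: int) -> list[str]:
    """Emit CREATE TABLE ... PARTITION OF statements for each month of a year."""
    stmts = []
    for month in range(1, 13):
        start = date(year, month, 1)
        # First day of the next month (range upper bounds are exclusive)
        end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
        stmts.append(
            f"CREATE TABLE {parent}_{year}_{month:02d} PARTITION OF {parent}\n"
            f"FOR VALUES FROM ('{start}') TO ('{end}');"
        )
    return stmts

ddl = monthly_partitions("posts_partitioned", 2024)
print(ddl[0])
```

The first statement it prints matches the posts_2024_01 partition above (with the parent name prefixed). Feed the output to psql, or run it from a scheduled job that creates next month's partition ahead of time.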
Replication and High Availability
Streaming Replication
Primary server:
wal_level = replica
max_wal_senders = 3
wal_keep_size = 64MB
Standby server:
hot_standby = on
Real-World Checklist
Before going to production:
- Proper indexes on foreign keys
- Indexes on WHERE clause columns
- Connection pooling configured
- Autovacuum enabled
- Slow query logging enabled
- Regular backups configured
- Monitoring in place
- Query performance tested under load
Common Pitfalls
- Too many indexes: Each index slows writes
- No indexes: Kills read performance
- Default configuration: Tune for your hardware
- Ignoring VACUUM: Leads to bloat
- Not monitoring: Can't optimize what you don't measure
Conclusion
PostgreSQL performance is about:
- Proper indexing strategy
- Correct configuration for your hardware
- Regular maintenance
- Monitoring and iteration
Start with these optimizations, measure the impact, and iterate. Performance tuning is continuous, not one-time.