PostgreSQL How-tos

Performance Optimization

Keeping PostgreSQL fast requires a combination of query analysis, indexing strategy, and configuration tuning. These guides cover the most impactful techniques you can apply to your databases today.

Identifying Slow Queries with pg_stat_statements

The pg_stat_statements extension is the single most valuable tool for finding performance bottlenecks. It tracks execution statistics for all SQL statements executed by the server.

To enable it, add the library to shared_preload_libraries in postgresql.conf:

shared_preload_libraries = 'pg_stat_statements'

After restarting PostgreSQL and creating the extension with CREATE EXTENSION pg_stat_statements;, you can query the pg_stat_statements view to find the most time-consuming queries. Its key columns:

  • total_exec_time — total time spent executing the query across all calls
  • calls — number of times the query has been executed
  • mean_exec_time — average execution time per call
  • rows — total number of rows returned or affected

Sort by total_exec_time DESC to find queries consuming the most overall resources, or by mean_exec_time DESC to find individually slow queries. DBA SaaS automatically collects and analyzes these statistics for you.
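As a sketch, the two sort orders look like this (column names are from PostgreSQL 13+; versions 12 and earlier use total_time and mean_time instead):

```sql
-- Top 10 queries by total execution time across all calls
SELECT query, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Top 10 individually slow queries
SELECT query, calls, mean_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
```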

Index Strategy: B-tree, GIN, GiST

Choosing the right index type is critical for query performance. PostgreSQL supports several index types, each optimized for different access patterns:

  • B-tree (default) — best for equality and range queries (=, <, >, BETWEEN, ORDER BY). Use this for most columns.
  • GIN (Generalized Inverted Index) — ideal for full-text search, JSONB containment queries (@>), and array operations. Use when a column contains composite values you need to search within.
  • GiST (Generalized Search Tree) — best for geometric data, range types, and full-text search when you need nearest-neighbor queries. Also supports exclusion constraints.
  • BRIN (Block Range Index) — extremely compact index for large tables where data is naturally ordered (e.g., timestamp columns in append-only tables).

Avoid over-indexing. Every index adds write overhead and consumes storage. Focus on indexes that support your most frequent and expensive queries.
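One illustrative statement per index type (table and column names are hypothetical; the GIN example assumes a jsonb column, the GiST example a range-type column):

```sql
-- B-tree (default): equality, range queries, ORDER BY
CREATE INDEX idx_orders_created ON orders (created_at);

-- GIN: JSONB containment (@>) queries
CREATE INDEX idx_docs_body ON docs USING gin (body jsonb_path_ops);

-- GiST: range types, overlap queries, exclusion constraints
CREATE INDEX idx_bookings_during ON bookings USING gist (during);

-- BRIN: very large append-only table, naturally ordered by time
CREATE INDEX idx_events_ts ON events USING brin (event_time);
```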

VACUUM and Autovacuum Tuning

PostgreSQL uses MVCC (Multi-Version Concurrency Control), which means deleted and updated rows leave behind dead tuples. VACUUM reclaims this space and updates visibility maps.

Key autovacuum parameters to tune:

  • autovacuum_vacuum_threshold — minimum number of dead tuples before triggering vacuum (default: 50)
  • autovacuum_vacuum_scale_factor — fraction of table size to add to threshold (default: 0.2)
  • autovacuum_naptime — delay between autovacuum runs (default: 1 minute)
  • autovacuum_max_workers — number of concurrent autovacuum processes (default: 3)

For large, high-churn tables, reduce autovacuum_vacuum_scale_factor to 0.01 or lower, or set it to 0 and rely on a fixed autovacuum_vacuum_threshold. Autovacuum triggers when dead tuples exceed threshold + scale_factor × row count, so the default 0.2 lets a 100-million-row table accumulate 20 million dead tuples before vacuuming.
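These parameters can also be overridden per table, which is usually preferable to changing the global defaults. A sketch for a hypothetical high-churn events table:

```sql
-- Vacuum after roughly 1% of rows are dead instead of the default 20%
ALTER TABLE events SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_threshold    = 1000
);
```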

Connection Pooling with PgBouncer

Each PostgreSQL connection consumes memory (typically 5-10 MB). Applications with hundreds of connections can exhaust server resources. PgBouncer sits between your application and PostgreSQL, multiplexing many client connections onto fewer server connections.

Three pooling modes are available:

  • Session pooling — server connection assigned for the entire client session. Safest but least efficient.
  • Transaction pooling — server connection assigned per transaction. Best balance of safety and efficiency. Recommended for most applications.
  • Statement pooling — server connection assigned per statement. Most efficient but incompatible with multi-statement transactions.
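A minimal pgbouncer.ini sketch for transaction pooling (host, database name, and pool sizes are placeholders to adapt to your environment):

```ini
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
```

With this shape, up to 1000 client connections share only 20 server connections per database/user pair.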

work_mem and shared_buffers Tuning

Two of the most impactful configuration parameters:

  • shared_buffers — memory PostgreSQL uses for caching data. Set to 25% of total RAM as a starting point (e.g., 4 GB for a 16 GB server). Going beyond 40% rarely helps.
  • work_mem — memory per sort/hash operation. Default (4 MB) is often too low. For OLTP workloads, try 16-64 MB. For analytical queries, 256 MB or more. Be cautious: this is per-operation, not per-connection.
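For the 16 GB example above, a sketch of the corresponding settings; work_mem can also be raised just for a session that runs known-heavy queries:

```sql
-- postgresql.conf (shared_buffers requires a restart to change)
-- shared_buffers = 4GB
-- work_mem = 32MB

-- Raise work_mem only for the current session before a big report
SET work_mem = '256MB';
```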

DBA SaaS analyzes your workload patterns and recommends optimal values for these settings based on your specific hardware and query mix.

Database Administration

Reliable administration practices form the foundation of any production PostgreSQL deployment. These guides cover essential day-to-day operations.

Backup Strategies

A robust backup strategy combines multiple approaches:

  • pg_dump — logical backup of individual databases. Best for small to medium databases, allows selective table restoration, and produces portable SQL or custom-format archives.
  • pg_basebackup — physical backup of the entire cluster. Required for point-in-time recovery (PITR). Faster than pg_dump for large databases.
  • WAL archiving — continuous archiving of write-ahead logs. Combined with pg_basebackup, enables PITR to any point in time. Configure archive_mode = on and set archive_command to copy WAL files to a safe location.

Best practice: use pg_basebackup + WAL archiving for disaster recovery, and pg_dump for logical backups you can inspect and selectively restore.
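A sketch of the three approaches (paths and database names are placeholders):

```shell
# Logical backup of one database in custom format (compressed, selectively restorable)
pg_dump -Fc -f /backups/appdb.dump appdb

# Physical base backup of the whole cluster, streaming WAL alongside it
pg_basebackup -D /backups/base -X stream -P

# postgresql.conf settings for continuous WAL archiving
# archive_mode = on
# archive_command = 'cp %p /archive/%f'
```

In production, replace the cp in archive_command with a command that copies to durable, off-host storage and fails loudly on error.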

Replication Setup

PostgreSQL supports two main replication types:

  • Streaming replication — physical replication that sends WAL records to standby servers in real time. Provides exact byte-for-byte copies. Ideal for high availability and read scaling.
  • Logical replication — publishes changes at the row level, allowing selective table replication, cross-version replication, and data transformation. Configured using CREATE PUBLICATION and CREATE SUBSCRIPTION.

For high availability, use streaming replication with synchronous commit for zero data loss, or asynchronous for better performance with minimal lag.
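A minimal logical replication sketch (connection string and table names are placeholders); streaming replication is instead configured on the standby via primary_conninfo and a standby.signal file:

```sql
-- On the publisher
CREATE PUBLICATION orders_pub FOR TABLE orders, customers;

-- On the subscriber
CREATE SUBSCRIPTION orders_sub
    CONNECTION 'host=primary.example.com dbname=appdb user=repl'
    PUBLICATION orders_pub;
```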

Role and Permission Management

Follow the principle of least privilege:

  • Create separate roles for applications, analytics, and administration
  • Use GRANT to assign specific privileges on schemas and tables
  • Use ALTER DEFAULT PRIVILEGES to automatically grant permissions on new objects
  • Never use the postgres superuser role for application connections
  • Use pg_read_all_data and pg_write_all_data roles (PostgreSQL 14+) for broad read/write access without superuser
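A sketch of an application role following these rules (role, schema, and password are placeholders):

```sql
-- Dedicated login role for the application, no superuser rights
CREATE ROLE app_rw LOGIN PASSWORD 'change-me';

GRANT USAGE ON SCHEMA app TO app_rw;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA app TO app_rw;

-- Ensure tables created later receive the same grants
ALTER DEFAULT PRIVILEGES IN SCHEMA app
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO app_rw;
```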

Tablespace Management

Tablespaces allow you to control the physical location of database objects on disk:

  • Place frequently accessed indexes on fast SSD storage
  • Move archive tables to slower, larger disks
  • Create tablespaces with CREATE TABLESPACE name LOCATION '/path/to/directory'
  • Assign tables and indexes with ALTER TABLE ... SET TABLESPACE
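Putting those commands together (paths and names are placeholders; the directories must already exist and be owned by the postgres OS user):

```sql
CREATE TABLESPACE fast_ssd LOCATION '/mnt/nvme/pgdata';
CREATE TABLESPACE cold_hdd LOCATION '/mnt/hdd/pgdata';

-- Hot index on fast storage, archive table on cheap storage
ALTER INDEX idx_orders_created SET TABLESPACE fast_ssd;
ALTER TABLE orders_archive SET TABLESPACE cold_hdd;
```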

Monitoring and Troubleshooting

Effective monitoring is the key to catching problems before they affect users. PostgreSQL exposes rich statistics through system views.

Key System Views

  • pg_stat_activity — shows all current connections, their queries, wait events, and states. Essential for identifying long-running queries and blocked sessions.
  • pg_stat_user_tables — per-table statistics including sequential and index scans, live and dead tuples, and last vacuum/analyze times. Use this to identify tables needing indexes or vacuum tuning.
  • pg_stat_bgwriter — checkpoint and background writer statistics. High buffers_backend values indicate the background writer is not keeping up, causing performance degradation.
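A common pg_stat_activity check for long-running queries, as a sketch (the 5-minute cutoff is an arbitrary example):

```sql
-- Sessions that have been running a statement for more than 5 minutes
SELECT pid, now() - query_start AS runtime, state, wait_event_type, query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND now() - query_start > interval '5 minutes'
ORDER BY runtime DESC;
```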

Detecting Lock Contention

Lock contention is a common source of application slowdowns. To detect it:

  • Query pg_stat_activity for sessions with wait_event_type = 'Lock'
  • Join pg_locks with pg_stat_activity to identify which sessions are blocking others
  • Look for long-running transactions holding AccessExclusiveLock (taken by most DDL), which blocks all access to the table; even the RowExclusiveLock taken by ordinary DML will block DDL queued behind it
  • Consider using lock_timeout to prevent sessions from waiting indefinitely
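For the common case, the pg_blocking_pids() function (PostgreSQL 9.6+) avoids a manual join against pg_locks; a sketch:

```sql
-- Waiting sessions and the PIDs of the sessions blocking them
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```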

DBA SaaS automatically detects lock contention patterns and alerts you when blocking chains form.

Analyzing Query Plans with EXPLAIN ANALYZE

Use EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) to understand how PostgreSQL executes a query:

  • Seq Scan — full table scan. Acceptable for small tables, a red flag for large ones.
  • Index Scan / Index Only Scan — uses an index to find rows. Index Only Scan is faster because it answers the query from the index alone, skipping heap fetches when the visibility map allows.
  • Nested Loop / Hash Join / Merge Join — different join strategies. The planner chooses based on estimated row counts and available indexes.
  • Sort / Hash Agg — watch for sorts or aggregations that spill to disk (indicated by Sort Method: external merge). Increase work_mem if this happens frequently.

Compare estimated rows vs actual rows. Large discrepancies indicate stale statistics — run ANALYZE on the affected tables.
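An invocation sketch (the query and table names are placeholders):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at > now() - interval '1 day';

-- If estimated vs actual rows diverge badly, refresh statistics
ANALYZE orders;
```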

Common Error Patterns and Fixes

  • "too many connections" — increase max_connections or, better yet, implement connection pooling with PgBouncer
  • "deadlock detected" — ensure transactions acquire locks in a consistent order across your application
  • "could not extend file" — disk space exhausted. Free space, add storage, or move tablespaces to a larger volume
  • "canceling statement due to statement timeout" — query exceeded the configured statement_timeout. Optimize the query or increase the timeout for specific sessions
  • High replication lag — check network bandwidth between primary and replica, verify that wal_sender_timeout is not dropping slow replication connections, and ensure the replica has sufficient I/O capacity
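The two timeouts mentioned above can be set per session or per role rather than globally; a sketch:

```sql
-- Abort any statement in this session that runs longer than 30 seconds
SET statement_timeout = '30s';

-- Give up on lock waits after 5 seconds instead of queueing indefinitely
SET lock_timeout = '5s';
```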