Postgres

Postgres 膨胀与 autovacuum:2026 调优指南

表与索引膨胀的真实成本、Pg 16+ 中 autovacuum 的工作机制、关键参数 — 以及在 OOM 前找出最严重表的查询。

Published 2026-04-25·13 min read

Postgres tables grow faster than the row count would suggest. Indexes lose half their efficiency every few months. Autovacuum can’t keep up. Most teams find out when latency drift becomes pages, not before. This is a tuning guide, not theory — every parameter listed has a recommended value and the reasoning behind it.

On this page

How autovacuum actually works
Find your worst-bloated tables
The 7 parameters that matter
Index bloat is its own beast
Already in trouble?
What to monitor
FAQ

How autovacuum actually works (a working model)

Postgres’s MVCC keeps old row versions. When you UPDATE a row, the old version becomes a dead tuple. Autovacuum scans tables, removes dead tuples, returns space to the free-space map. Two key thresholds trigger it per table:

# A table is autovacuumed when:
n_dead_tup > autovacuum_vacuum_threshold
              + autovacuum_vacuum_scale_factor * reltuples

# Defaults: 50 + 0.2 * reltuples
# A 100M-row table needs 20M dead tuples before autovacuum starts. That's a lot.

The default scale factor of 0.2 is fine for small tables, catastrophic for large ones. Lowering it on big tables is the single most-impactful tuning change most teams can make.

Find your most-bloated tables

SELECT
  schemaname || '.' || relname AS table,
  pg_size_pretty(pg_total_relation_size(relid)) AS total,
  n_live_tup,
  n_dead_tup,
  ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
  last_vacuum,
  last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC
LIMIT 25;

Anything with dead_pct > 20% is a problem. Anything with last_autovacuum > 7 days ago and active writes is a bigger problem.

The 7 parameters that matter

Parameter	Default	Recommended for big tables
`autovacuum_vacuum_scale_factor`	0.2	0.05 (table-level on hot tables)
`autovacuum_vacuum_threshold`	50	1000 (avoids vacuuming tiny tables)
`autovacuum_naptime`	1min	15s on busy fleets
`autovacuum_max_workers`	3	5-8 on multi-core hosts
`autovacuum_vacuum_cost_limit`	200	1000-2000 (autovacuum runs faster)
`maintenance_work_mem`	64MB	1-2GB on hosts with the RAM
`autovacuum_freeze_max_age`	200M	1B (delay aggressive vacuum) — see wraparound post

Index bloat is its own beast

Vacuum doesn’t fix index bloat. Indexes accumulate empty pages from deleted/updated rows; Postgres marks pages reusable but doesn’t reclaim space.

-- Find biggest indexes (the ones that benefit most from REINDEX)
SELECT schemaname || '.' || indexrelname AS index,
       pg_size_pretty(pg_relation_size(indexrelid)) AS size,
       idx_scan
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 20;

-- Online reindex (Pg 12+)
REINDEX INDEX CONCURRENTLY idx_orders_customer_id;

Already in trouble: pg_repack and VACUUM FULL

pg_repack is the right tool 95% of the time. Online (no exclusive lock), creates a duplicate, swaps. Plan for 2× the table’s current size in temp disk.
VACUUM FULL takes an AccessExclusiveLock for the duration. Use only in maintenance windows on tables you don’t need read-available.
CLUSTER is VACUUM FULL plus reordering by an index. Same lock cost.

What to monitor (alert thresholds)

Per-table dead_tup_pct trending; alert at > 25% sustained 24h.
autovacuum_count over rolling window; sudden drop on a busy table = autovacuum stuck.
maintenance_work_mem utilization (rough proxy: current_setting('maintenance_work_mem') vs vacuum scan time).
pg_stat_progress_vacuum to see live vacuum progress.

FAQ

How often should I run VACUUM manually?+

On a well-tuned cluster, never. Manual vacuum is for emergencies. If autovacuum can't keep up, the answer is to tune autovacuum, not run vacuum on a cron.

Does TOAST get vacuumed?+

Yes — autovacuum picks up the TOAST table when its own thresholds trigger. Watch pg_stat_user_tables for the corresponding pg_toast.* entries.

pg_repack vs pg_squeeze?+

pg_repack is more mature and widely deployed. pg_squeeze (Pg 13+) avoids the temp-table approach and is friendlier on disk; pick it if disk is the constraint.

Keep reading

Postgres

Postgres transaction-ID wraparound: 4 hours from your worst Saturday

When pg_stat_activity shows 'autovacuum (to prevent wraparound)' and your write rate stops, you have 4 hours of work to do correctly or your DB goes read-only. Here's the real runbook.

Postgres

Postgres slow queries: 12 causes and how to find each one

A field-tested playbook for diagnosing a slow Postgres query in production — from missing indexes to plan flips to bloated tables — with the SQL to find each cause and the fix.

← All posts

Postgres 膨胀与 autovacuum:2026 调优指南

How autovacuum actually works (a working model)

Find your most-bloated tables

The 7 parameters that matter

Index bloat is its own beast

Already in trouble: pg_repack and VACUUM FULL

What to monitor (alert thresholds)

FAQ

像监控服务一样监控你的数据库。