Obsfly
postgres / bloat · live vs dead tuplesliveorders22% deadevents58% deadusers8% deadsessions40% deadaudit_log72% deadproducts4% dead

Postgres

Postgres 膨胀与 autovacuum:2026 调优指南

表与索引膨胀的真实成本、Pg 16+ 中 autovacuum 的工作机制、关键参数 — 以及在 OOM 前找出最严重表的查询。

Published ·13 min read

Postgres tables grow faster than the row count would suggest. Indexes lose half their efficiency every few months. Autovacuum can’t keep up. Most teams find out when latency drift becomes pages, not before. This is a tuning guide, not theory — every parameter listed has a recommended value and the reasoning behind it.

On this page
  1. How autovacuum actually works
  2. Find your worst-bloated tables
  3. The 7 parameters that matter
  4. Index bloat is its own beast
  5. Already in trouble?
  6. What to monitor
  7. FAQ

How autovacuum actually works (a working model)

Postgres’s MVCC keeps old row versions. When you UPDATE a row, the old version becomes a dead tuple. Autovacuum scans tables, removes dead tuples, returns space to the free-space map. Two key thresholds trigger it per table:

# A table is autovacuumed when:
n_dead_tup > autovacuum_vacuum_threshold
              + autovacuum_vacuum_scale_factor * reltuples

# Defaults: 50 + 0.2 * reltuples
# A 100M-row table needs 20M dead tuples before autovacuum starts. That's a lot.

The default scale factor of 0.2 is fine for small tables, catastrophic for large ones. Lowering it on big tables is the single most-impactful tuning change most teams can make.

Find your most-bloated tables

SELECT
  schemaname || '.' || relname AS table,
  pg_size_pretty(pg_total_relation_size(relid)) AS total,
  n_live_tup,
  n_dead_tup,
  ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
  last_vacuum,
  last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC
LIMIT 25;

Anything with dead_pct > 20% is a problem. Anything with last_autovacuum > 7 days ago and active writes is a bigger problem.

The 7 parameters that matter

ParameterDefaultRecommended for big tables
autovacuum_vacuum_scale_factor0.20.05 (table-level on hot tables)
autovacuum_vacuum_threshold501000 (avoids vacuuming tiny tables)
autovacuum_naptime1min15s on busy fleets
autovacuum_max_workers35-8 on multi-core hosts
autovacuum_vacuum_cost_limit2001000-2000 (autovacuum runs faster)
maintenance_work_mem64MB1-2GB on hosts with the RAM
autovacuum_freeze_max_age200M1B (delay aggressive vacuum) — see wraparound post

Index bloat is its own beast

Vacuum doesn’t fix index bloat. Indexes accumulate empty pages from deleted/updated rows; Postgres marks pages reusable but doesn’t reclaim space.

-- Find biggest indexes (the ones that benefit most from REINDEX)
SELECT schemaname || '.' || indexrelname AS index,
       pg_size_pretty(pg_relation_size(indexrelid)) AS size,
       idx_scan
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 20;

-- Online reindex (Pg 12+)
REINDEX INDEX CONCURRENTLY idx_orders_customer_id;

Already in trouble: pg_repack and VACUUM FULL

  • pg_repack is the right tool 95% of the time. Online (no exclusive lock), creates a duplicate, swaps. Plan for 2× the table’s current size in temp disk.
  • VACUUM FULL takes an AccessExclusiveLock for the duration. Use only in maintenance windows on tables you don’t need read-available.
  • CLUSTER is VACUUM FULL plus reordering by an index. Same lock cost.

What to monitor (alert thresholds)

  • Per-table dead_tup_pct trending; alert at > 25% sustained 24h.
  • autovacuum_count over rolling window; sudden drop on a busy table = autovacuum stuck.
  • maintenance_work_mem utilization (rough proxy: current_setting('maintenance_work_mem') vs vacuum scan time).
  • pg_stat_progress_vacuum to see live vacuum progress.

FAQ

How often should I run VACUUM manually?+
On a well-tuned cluster, never. Manual vacuum is for emergencies. If autovacuum can't keep up, the answer is to tune autovacuum, not run vacuum on a cron.
Does TOAST get vacuumed?+
Yes — autovacuum picks up the TOAST table when its own thresholds trigger. Watch pg_stat_user_tables for the corresponding pg_toast.* entries.
pg_repack vs pg_squeeze?+
pg_repack is more mature and widely deployed. pg_squeeze (Pg 13+) avoids the temp-table approach and is friendlier on disk; pick it if disk is the constraint.

Keep reading

· · ·

像监控服务一样监控你的数据库。

预约 30 分钟演示。我们一起规划你的数据库规模,并报出第一个 30 天合作的报价。

Postgres 膨胀与 autovacuum:2026 调优指南 · Obsfly