Obsfly
redis / SLOWLOG GET 5 (live)

id     ms      command
8124   412.4   KEYS user:*
8123   182.0   SMEMBERS active:online
8122    94.8   SORT cart:9912 BY price
8121    62.1   ZRANGEBYSCORE leaderboard 0 +inf
8120    44.0   HGETALL session:8a9c12


Redis SLOWLOG: the underrated telemetry that catches half of your incidents

Most teams run Redis with the default SLOWLOG settings and never look at it. Here is how to tune it, what to extract, and 3 classes of incidents that only show up there.

Published ·8 min read

Most teams ship Redis with the defaults, slowlog-log-slower-than = 10000 (microseconds, i.e. 10 ms) and slowlog-max-len = 128, then never look at the log. On a busy instance those 128 entries cycle in seconds, so by the time anyone checks, the commands from the incident have already been overwritten. Tuned properly, SLOWLOG is among the cheapest Redis instrumentation you can turn on.

On this page
  1. Tune SLOWLOG once
  2. Three Redis incidents only SLOWLOG catches
  3. Latency monitor pairs with it
  4. Four queries to extract signal
  5. FAQ

Tune SLOWLOG once

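# In redis.conf (persistent). Keep comments on their own lines:
# redis.conf does not accept a trailing comment after a directive.
# The threshold is in microseconds: 1000 = 1 ms, catches all interesting commands.
slowlog-log-slower-than 1000
# 1024 entries, enough to survive a burst.
slowlog-max-len 1024

# Apply at runtime if needed (in redis-cli)
CONFIG SET slowlog-log-slower-than 1000
CONFIG SET slowlog-max-len 1024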
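
To get a feel for how quickly the buffer cycles, a rough retention estimate can help. This is a minimal sketch that assumes a steady rate of commands crossing the threshold; the function name and the rate of 20 slow commands per second are illustrative, not measurements.

```python
def buffer_retention_seconds(max_len, slow_commands_per_sec):
    """Seconds until the oldest SLOWLOG entry is overwritten.

    Assumes a constant rate of commands exceeding slowlog-log-slower-than.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if slow_commands_per_sec <= 0:
        return float("inf")  # nothing is logged, so nothing is evicted
    return max_len / slow_commands_per_sec


if __name__ == "__main__":
    # Hypothetical rate: 20 commands per second exceed the threshold.
    for max_len in (128, 1024):
        secs = buffer_retention_seconds(max_len, 20)
        print(f"slowlog-max-len={max_len}: ~{secs:.1f}s of history")
```

Note that lowering the threshold raises the rate of logged commands, so the two settings are best sized together.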

Three Redis incidents that only SLOWLOG catches

  • KEYS *. A junior engineer added a debug script that runs KEYS * against a 50M-key instance every minute. Each call blocks Redis for 600ms. CPU graphs show normal utilization: the single command thread is fully loaded for 600ms and then idle for the rest of the minute, which averages out to about 1%. SLOWLOG shows the call.
  • SUNIONSTORE on huge sets. Application code computes a daily union of two big sets; this is O(N+M). A 50ms stall once a day is invisible on metrics and very visible in SLOWLOG.
  • Lua script with surprise complexity. A Lua script loops over a hash with HKEYS. It was fine when the hash had 100 fields; now the hash has 100k, and the script call lands in SLOWLOG.
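
One way to surface these patterns is to group SLOWLOG entries by command name. The sketch below is illustrative only: it assumes each entry is a dict with `command` (the full command line) and `duration` (microseconds), which mirrors what common client libraries return for SLOWLOG GET. The `SUSPECT_COMMANDS` set and the sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical watch list based on the three incidents above.
SUSPECT_COMMANDS = {"KEYS", "SUNIONSTORE", "EVAL", "EVALSHA"}


def command_name(entry):
    """First word of the logged command, upper-cased."""
    raw = entry["command"]
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8", errors="replace")
    parts = raw.split()
    return parts[0].upper() if parts else ""


def summarize(entries):
    """Group entries by command name, largest total time first."""
    buckets = defaultdict(lambda: {"count": 0, "total_us": 0, "max_us": 0})
    for entry in entries:
        bucket = buckets[command_name(entry)]
        bucket["count"] += 1
        bucket["total_us"] += entry["duration"]
        bucket["max_us"] = max(bucket["max_us"], entry["duration"])
    rows = [
        {"command": name, "suspect": name in SUSPECT_COMMANDS, **stats}
        for name, stats in buckets.items()
    ]
    return sorted(rows, key=lambda row: row["total_us"], reverse=True)


if __name__ == "__main__":
    sample = [  # made-up entries for illustration
        {"command": "KEYS *", "duration": 600_000},
        {"command": "KEYS *", "duration": 610_000},
        {"command": "SUNIONSTORE daily a b", "duration": 50_000},
        {"command": b"GET foo", "duration": 1_200},
    ]
    for row in summarize(sample):
        print(row)
```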

Latency monitor: complementary, not redundant

SLOWLOG records commands. LATENCY MONITOR records events. Different categories:

  • fork — bg-save / AOF rewrite forks. On a 30 GB instance these can take 1.5s.
  • aof-write — fsync stalls.
  • expire-cycle — the active expiration scan. Slow when the keyspace has many keys with TTL.
  • fast-command — a command that should be O(1) took longer. Often a sign of memory pressure or fragmentation.

# Threshold is in milliseconds; 0 (the default) disables the monitor
CONFIG SET latency-monitor-threshold 100
LATENCY HISTORY fork
LATENCY HISTORY expire-cycle
LATENCY RESET fork
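
LATENCY HISTORY replies with a series of (timestamp, latency in milliseconds) pairs for one event. A minimal sketch of reducing that series to a few numbers worth alerting on might look like this; the function name, the output fields, and the sample values are assumptions for illustration.

```python
def summarize_latency(samples):
    """Summarize (unix_timestamp, latency_ms) pairs for one latency event.

    Returns None when there are no samples.
    """
    samples = sorted((int(ts), int(ms)) for ts, ms in samples)
    if not samples:
        return None
    latencies = [ms for _, ms in samples]
    worst_ts, worst_ms = max(samples, key=lambda sample: sample[1])
    return {
        "count": len(samples),
        "max_ms": worst_ms,
        "max_at": worst_ts,
        "avg_ms": sum(latencies) / len(latencies),
        "first": samples[0][0],
        "last": samples[-1][0],
    }


if __name__ == "__main__":
    # Made-up fork samples, shaped like a LATENCY HISTORY fork reply.
    fork_history = [(1700000000, 120), (1700000060, 1500), (1700000120, 180)]
    print(summarize_latency(fork_history))
```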

Four queries to extract signal

# 1. Most recent 50 entries in the SLOWLOG buffer (newest first; sort by duration for a top-N)
SLOWLOG GET 50

# 2. Per-command frequency
INFO commandstats

# 3. Per-database key counts (suspect KEYS / SCAN abuse on huge DBs)
INFO keyspace

# 4. Client-side breakdown: look for clients with large idle / qbuf-free
CLIENT LIST
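
The INFO commandstats reply is plain text with one `cmdstat_<name>:calls=...,usec=...,usec_per_call=...` line per command, so it is easy to rank offline. The following is a rough sketch of that parsing; the function names and sample numbers are illustrative, and newer Redis versions may append further fields, which the parser simply carries along.

```python
def parse_commandstats(info_text):
    """Parse INFO commandstats text into {COMMAND: {field: float}}."""
    stats = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line.startswith("cmdstat_"):
            continue  # skip the section header and blank lines
        name, _, fields = line.partition(":")
        values = {}
        for pair in fields.split(","):
            key, _, value = pair.partition("=")
            if key and value:
                values[key] = float(value)
        stats[name[len("cmdstat_"):].upper()] = values
    return stats


def slowest_on_average(stats, n=5):
    """Commands with the highest usec_per_call, worst first."""
    ranked = sorted(
        stats.items(),
        key=lambda item: item[1].get("usec_per_call", 0.0),
        reverse=True,
    )
    return ranked[:n]


if __name__ == "__main__":
    sample = """# Commandstats
cmdstat_get:calls=1000,usec=2000,usec_per_call=2.00
cmdstat_keys:calls=3,usec=1800000,usec_per_call=600000.00
cmdstat_hgetall:calls=50,usec=2200,usec_per_call=44.00
"""
    for name, values in slowest_on_average(parse_commandstats(sample)):
        print(name, values["calls"], values["usec_per_call"])
```

A command with few calls but a very high usec_per_call, like the made-up KEYS line here, is the kind of outlier that averages on a dashboard tend to hide.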

FAQ

What's the overhead of slowlog-log-slower-than = 1000?
Negligible. Each entry is ~150 bytes; 1024 entries is ~150KB. The threshold check is a single comparison per command.
Should I export SLOWLOG continuously?
Yes: agent-side scrape every 30 seconds, drain via SLOWLOG GET + RESET, ship to your DBM. Otherwise the ring buffer overwrites entries you'd want.
Does Redis Cluster have a unified SLOWLOG?
No, each node has its own. Aggregate cluster-wide in your monitoring layer; per-node is rarely the right view.
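
For the continuous export and cluster-wide aggregation described above, an agent needs to avoid shipping the same entry twice and to tag entries with their node. The sketch below is one possible approach, not a description of any particular agent: it relies on SLOWLOG entry IDs increasing monotonically until the server restarts, and assumes entries are dicts with `id` and `duration` keys. The function names and node names are hypothetical.

```python
def new_entries(entries, last_seen_id=-1):
    """Return (unseen entries oldest-first, updated last_seen_id).

    If every ID in the buffer is below last_seen_id, the IDs went
    backwards, which this sketch treats as a server restart.
    """
    if entries and max(e["id"] for e in entries) < last_seen_id:
        last_seen_id = -1
    fresh = sorted(
        (e for e in entries if e["id"] > last_seen_id),
        key=lambda e: e["id"],
    )
    if fresh:
        last_seen_id = fresh[-1]["id"]
    return fresh, last_seen_id


def merge_nodes(per_node):
    """Flatten {node: [entries]} into one list tagged by node, slowest first."""
    merged = [
        {**entry, "node": node}
        for node, entries in per_node.items()
        for entry in entries
    ]
    return sorted(merged, key=lambda e: e["duration"], reverse=True)


if __name__ == "__main__":
    scrape_1 = [{"id": 8124, "duration": 412_400}, {"id": 8123, "duration": 182_000}]
    scrape_2 = [{"id": 8125, "duration": 94_800}] + scrape_1
    fresh, cursor = new_entries(scrape_1)
    print(len(fresh), cursor)
    fresh, cursor = new_entries(scrape_2, cursor)
    print(len(fresh), cursor)
    print(merge_nodes({"node-a": scrape_1, "node-b": [{"id": 7, "duration": 900_000}]})[0])
```

Deduplicating by ID sidesteps the window between SLOWLOG GET and SLOWLOG RESET in which a newly logged entry could be cleared before it is read.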
