How can I proactively monitor for query plan flips and significant planner misestimates in PostgreSQL to prevent outages?
11:29 22 Feb 2026

I recently read a postmortem here about a production outage caused by a PostgreSQL query plan flip. An automatic ANALYZE left the planner with insufficient statistics, so it chose a bad execution plan, which caused severe performance degradation and HTTP 429 responses at the frontend.

I want to improve monitoring/alerting so that I can detect or predict these kinds of harmful planner decisions before they fully impact users.

Specifically:

  1. What PostgreSQL metrics or planner internals should I be collecting and alerting on?

  2. Are there tools/plugins that can detect significant plan changes or statistics anomalies (e.g., a huge difference between estimated and actual row counts)?

  3. Are there best practices for setting early warning alerts without generating too much noise?
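
To make the "estimates vs actual row counts" signal in point 2 concrete, here is a minimal sketch of the kind of check I have in mind. It assumes the plan was captured with `EXPLAIN (ANALYZE, FORMAT JSON)` and walks the plan tree using the `Plan Rows`, `Actual Rows`, `Actual Loops` and `Plans` keys of that output. The function names (`q_error`, `worst_misestimates`) and the threshold of 100 are my own placeholders, not part of any existing tool.

```python
import json


def q_error(estimated, actual):
    """Symmetric misestimate factor; 1.0 means the estimate was exact.

    Both values are clamped to at least 1 row so that zero-row nodes
    do not cause a division by zero.
    """
    est = max(float(estimated), 1.0)
    act = max(float(actual), 1.0)
    return max(est / act, act / est)


def worst_misestimates(explain_json, threshold=100.0):
    """Return plan nodes whose row estimate is off by at least `threshold`x.

    `explain_json` is the output of EXPLAIN (ANALYZE, FORMAT JSON), either
    as a JSON string or as the already-parsed list. The threshold is an
    arbitrary placeholder value, not a recommendation.
    """
    doc = json.loads(explain_json) if isinstance(explain_json, str) else explain_json
    flagged = []
    stack = [doc[0]["Plan"]]  # the output is a one-element list holding the root plan
    while stack:
        node = stack.pop()
        # Skip nodes that never ran; they have no meaningful actual row count.
        if node.get("Actual Loops", 1) > 0:
            err = q_error(node["Plan Rows"], node["Actual Rows"])
            if err >= threshold:
                flagged.append(
                    (node["Node Type"], node["Plan Rows"], node["Actual Rows"], err)
                )
        stack.extend(node.get("Plans", []))
    # Largest misestimate first.
    return sorted(flagged, key=lambda item: item[3], reverse=True)
```

Something like this could flag individual queries, but I don't know whether a ratio like this is a sensible thing to alert on, or what threshold would keep the noise down.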

postgresql monitoring query-planner