Slow Query Diagnosis Workflow

Diagnosing slow queries follows a repeatable workflow: identify the query, capture the plan and execution context, then improve the schema or SQL. This article outlines the steps and tools (EXPLAIN, the slow query log, metrics) and gives a small checklist and worked examples so you can fix slow queries systematically.

Overview

  • Identify: Use the slow query log, APM, or performance_schema to find which SQL statements and which endpoints are slow. Focus on high total time (frequency × latency), not only single-query latency; see the sketch after this list.
  • Capture: Get the exact SQL (with literal values, or in normalized form), the execution plan (EXPLAIN / EXPLAIN ANALYZE), and the context (concurrency, data volume, existing indexes). Reproduce in a test or staging environment if possible.
  • Improve: Interpret the plan (full table scan, wrong index, temporary table, filesort), then add or change indexes, rewrite the query, or adjust the schema. Re-measure after each change.
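
For the Identify step, performance_schema aggregates normalized statements in events_statements_summary_by_digest; a minimal sketch follows (timer columns are in picoseconds, and the 'mydb' schema filter is a placeholder to adapt):

SQL
-- Top statements by total time (frequency × latency)
SELECT DIGEST_TEXT,
       COUNT_STAR AS exec_count,
       ROUND(SUM_TIMER_WAIT / 1e12, 2) AS total_sec,
       ROUND(AVG_TIMER_WAIT / 1e9, 2)  AS avg_ms
FROM performance_schema.events_statements_summary_by_digest
WHERE SCHEMA_NAME = 'mydb'  -- placeholder: filter to your application schema
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;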

Examples

Example 1: Enabling and using the slow query log

SQL
-- Check current settings (slow_query_log% also shows slow_query_log_file, the log location)
SHOW VARIABLES LIKE 'slow_query_log%';
SHOW VARIABLES LIKE 'long_query_time';

-- Enable (example: log queries that take longer than 1 second)
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
-- Note: the GLOBAL value of long_query_time applies only to new connections;
-- existing sessions keep their session-level value.

  • After reproducing the slowness, inspect the slow log for the exact SQL and its duration. Use pt-query-digest or a similar tool to aggregate entries and surface the most expensive queries.

Example 2: EXPLAIN and what to look for

SQL
EXPLAIN SELECT * FROM orders WHERE user_id = 100 ORDER BY created_at DESC LIMIT 20;

  • type: Prefer const, eq_ref, ref, or range; avoid ALL (a full table scan) on large tables.
  • key: The index actually used; NULL means no index was used.
  • rows: Estimated rows examined; a very large value suggests a missing or wrong index.
  • Extra: Using filesort or Using temporary can indicate an expensive sort or temporary table; Using index is good (the index covers the query).
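
If the plan's estimates look implausible, EXPLAIN ANALYZE (MySQL 8.0.18 and later) actually executes the statement and reports real row counts and timings for each plan step:

SQL
-- EXPLAIN ANALYZE runs the query, so use staging or be careful in production
EXPLAIN ANALYZE
SELECT * FROM orders WHERE user_id = 100 ORDER BY created_at DESC LIMIT 20;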

Example 3: Adding an index and re-checking

SQL
-- Before: full scan or filesort
EXPLAIN SELECT id, status FROM orders WHERE user_id = 100 AND created_at >= '2025-01-01';

-- Add composite index for the predicate and covering columns
ALTER TABLE orders ADD KEY idx_user_created (user_id, created_at, id, status);
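-- Note: if id is the InnoDB primary key, secondary indexes already store it
-- internally, so listing it explicitly is optional; status makes the index covering.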

-- After: type=range, key=idx_user_created, Extra=Using index
EXPLAIN SELECT id, status FROM orders WHERE user_id = 100 AND created_at >= '2025-01-01';

  • After adding the index, run EXPLAIN again and re-measure latency and throughput; one way to verify the work done per query is shown below.
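
A quick way to confirm the index reduced work is to compare Handler counters before and after the change: Handler_read_next counts rows read in index order, while Handler_read_rnd_next counts rows read during a table scan.

SQL
-- Reset session counters, run the query, then see how many rows were touched
FLUSH STATUS;
SELECT id, status FROM orders WHERE user_id = 100 AND created_at >= '2025-01-01';
SHOW SESSION STATUS LIKE 'Handler_read%';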

Core Mechanism / Behavior

  • Optimizer: MySQL chooses an index (or none) based on statistics, selectivity, and cost estimates. Stale or skewed statistics can lead to bad plans; refreshing statistics, hinting, or rewriting can help (see the sketch after this list).
  • Execution: Large rows, filesort, or temporary tables usually mean more I/O and CPU. Reducing scanned rows (index) and avoiding filesort (matching ORDER BY to index) often gives the biggest gain.
  • Locking: Long-running queries can hold locks or be blocked by other transactions; if slow queries coincide with contention, check information_schema / performance_schema for lock waits (a query for this follows below).
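
When the optimizer picks a bad plan on stale statistics, refreshing them is cheap to try; forcing an index is a last resort. This sketch reuses idx_user_created from Example 3:

SQL
-- Refresh index statistics (InnoDB samples pages; estimates can drift)
ANALYZE TABLE orders;

-- Last resort: override the optimizer's index choice
SELECT id, status FROM orders FORCE INDEX (idx_user_created)
WHERE user_id = 100 AND created_at >= '2025-01-01';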
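
For lock waits, the sys schema (bundled since MySQL 5.7) provides a ready-made view; on MySQL 8.0 the raw data also lives in performance_schema:

SQL
-- One row per blocked session: who waits, who blocks, and for how long
SELECT * FROM sys.innodb_lock_waits;

-- Raw lock-wait pairs (MySQL 8.0)
SELECT * FROM performance_schema.data_lock_waits;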

Checklist

  Step      Action
  1. Find   Slow log, APM, or performance_schema to get the slow SQL and its caller
  2. Plan   EXPLAIN (and EXPLAIN ANALYZE if available) for the query
  3. Check  type, key, rows, Extra; look for a full scan, wrong index, or filesort
  4. Fix    Add/change an index, rewrite the SQL, or change the schema; then re-measure

Key Rules

  • Always get the real execution plan (EXPLAIN) for the exact SQL and data shape; plans can differ with different values and data volume.
  • Prefer index-based solutions (a covering index, or a composite index matching the predicate and sort order) before rewriting application logic, but remove redundant indexes that slow down writes (see the check after this list).
  • Re-run the query and monitor after every change; use the same environment and data volume where possible so before/after comparisons are meaningful.
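
The sys schema has ready-made views for the redundant-index check in the second rule; 'mydb' below is a placeholder schema name:

SQL
-- Duplicate or redundant index definitions
SELECT * FROM sys.schema_redundant_indexes WHERE table_schema = 'mydb';

-- Indexes with no reads since server restart (interpret with care)
SELECT * FROM sys.schema_unused_indexes WHERE object_schema = 'mydb';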

What's Next

See Index Fundamentals for B-Tree and covering indexes. See Pagination - Limit/Offset vs Seek for optimizing list queries. See MVCC and Locks if slowness is related to locking or isolation.