Slow Query Diagnosis Workflow
Diagnosing slow queries follows a repeatable workflow: identify the query, capture the plan and execution context, then improve the schema or SQL. This article outlines the steps and tools (EXPLAIN, the slow query log, metrics) and gives a small checklist and examples so you can fix slow queries systematically.
Overview
- Identify: Use the slow query log, APM, or `performance_schema` to find which SQL and which endpoints are slow. Focus on high total time (frequency × latency), not only single-query time (see the sketch after this list).
- Capture: Get the exact SQL (with literal values, or a normalized form), the execution plan (EXPLAIN / EXPLAIN ANALYZE), and context (concurrency, data volume, indexes). Reproduce in a test or staging environment if possible.
- Improve: Interpret the plan (full table scan, wrong index, temporary table, filesort), then add or change indexes, rewrite the query, or adjust schema. Re-measure after each change.
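If `performance_schema` is enabled (MySQL 5.7+), one way to rank statements by total time is the statement digest summary table. A minimal sketch; timers are reported in picoseconds, so the divisors below are only unit conversions:

```sql
-- Top 10 normalized statements by total time (frequency × latency)
SELECT DIGEST_TEXT,
       COUNT_STAR                       AS calls,
       ROUND(SUM_TIMER_WAIT / 1e12, 2)  AS total_seconds,
       ROUND(AVG_TIMER_WAIT / 1e9, 2)   AS avg_ms,
       SUM_ROWS_EXAMINED                AS rows_examined
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;
```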
Example
Example 1: Enabling and using slow query log
```sql
-- Check current settings
SHOW VARIABLES LIKE 'slow_query_log%';
SHOW VARIABLES LIKE 'long_query_time';

-- Enable (example: log queries > 1s)
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;

-- Log file location: slow_query_log_file
```
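Note that SET GLOBAL does not survive a server restart. On MySQL 8.0+ you can persist the settings from SQL (on older versions, set them in the server configuration instead); a minimal sketch:

```sql
-- MySQL 8.0+: persists to mysqld-auto.cnf so the settings survive restarts
SET PERSIST slow_query_log = 'ON';
SET PERSIST long_query_time = 1;
```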
- After reproduction, inspect the slow log for the exact SQL and duration. Use `pt-query-digest` or a similar tool to aggregate the log and find the most expensive queries.
Example 2: EXPLAIN and what to look for
```sql
EXPLAIN SELECT * FROM orders
WHERE user_id = 100
ORDER BY created_at DESC
LIMIT 20;
```
- type: Prefer `const`, `eq_ref`, `ref`, or `range`; avoid `ALL` (full table scan) on large tables.
- key: Which index is used; NULL means no index was used.
- rows: Estimated rows examined; a very large value suggests a missing or wrong index.
- Extra: `Using filesort` or `Using temporary` can indicate an expensive sort or temp table; `Using index` is good (covering index).
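EXPLAIN shows only the estimated plan. On MySQL 8.0.18+, EXPLAIN ANALYZE actually executes the query and reports real times and row counts per plan step, which confirms where time is spent:

```sql
-- MySQL 8.0.18+: runs the query and shows actual timings per plan step
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 100
ORDER BY created_at DESC
LIMIT 20;
```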
Example 3: Adding an index and re-checking
```sql
-- Before: full scan or filesort
EXPLAIN SELECT id, status FROM orders
WHERE user_id = 100 AND created_at >= '2025-01-01';

-- Add composite index for the predicate and covering columns
ALTER TABLE orders ADD KEY idx_user_created (user_id, created_at, id, status);

-- After: type=range, key=idx_user_created, Extra=Using index
EXPLAIN SELECT id, status FROM orders
WHERE user_id = 100 AND created_at >= '2025-01-01';
```
- After adding the index, run EXPLAIN again and re-measure latency and throughput.
Core Mechanism / Behavior
- Optimizer: MySQL chooses an index (or none) based on statistics, selectivity, and cost. Wrong or outdated statistics can lead to bad plans; sometimes you need to refresh statistics, add an index hint, or rewrite the query (see the first sketch after this list).
- Execution: Large `rows`, a filesort, or a temporary table usually means more I/O and CPU. Reducing scanned rows (via an index) and avoiding filesort (matching ORDER BY to the index) often gives the biggest gain.
- Locking: Long-running queries can hold locks or be blocked; check `information_schema` / `performance_schema` for lock waits if slow queries coincide with contention (see the second sketch after this list).
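A minimal sketch of the statistics and hint options, assuming the orders table and the idx_user_created index from Example 3:

```sql
-- Refresh InnoDB statistics if the optimizer's estimates look stale
ANALYZE TABLE orders;

-- Last resort: force a specific index, then verify the plan with EXPLAIN
SELECT id, status
FROM orders FORCE INDEX (idx_user_created)
WHERE user_id = 100 AND created_at >= '2025-01-01';
```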
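For the lock-wait check, the sys schema (bundled with MySQL 5.7+) exposes a convenient view showing which session is blocking which:

```sql
-- Who is blocking whom right now (InnoDB row locks)
SELECT waiting_pid, waiting_query,
       blocking_pid, blocking_query,
       wait_age
FROM sys.innodb_lock_waits;
```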
| Step | Action |
|---|---|
| 1 Find | Slow log, APM, or performance_schema to get slow SQL and caller |
| 2 Plan | EXPLAIN (and EXPLAIN ANALYZE if available) for the query |
| 3 Check | type, key, rows, Extra; look for full scan, wrong index, filesort |
| 4 Fix | Add/change index, rewrite SQL, or change schema; then re-measure |
Key Rules
- Always get the real execution plan (EXPLAIN) for the exact SQL and data shape; plans can differ with different values and data volume.
- Prefer index-based solutions (a covering index, or a composite index matching predicate + order) before rewriting application logic, but remove redundant indexes that hurt writes (see the sketch after this list).
- Re-run the query and monitor after changes; use the same environment and data volume when possible to avoid regressions.
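One way to spot redundant indexes, again using the sys schema, is the schema_redundant_indexes view; a sketch:

```sql
-- Indexes whose columns are covered (dominated) by another index
SELECT table_schema, table_name,
       redundant_index_name, dominant_index_name
FROM sys.schema_redundant_indexes;
```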
What's Next
See Index Fundamentals for B-Tree and covering indexes. See Pagination - Limit/Offset vs Seek for optimizing list queries. See MVCC and Locks if slowness is related to locking or isolation.