Slow Query Diagnosis Workflow

Diagnosing slow queries follows a repeatable workflow: identify the query, capture the plan and execution context, then improve the schema or SQL. This article outlines the steps and tools (EXPLAIN, the slow query log, metrics) and gives a small checklist and worked examples so you can fix slow queries systematically.

Overview

  • Identify: Use the slow query log, APM, or performance_schema to find which SQL statements and which endpoints are slow. Focus on high total time (frequency × latency), not only single-query latency; see the sketch after this list.
  • Capture: Get the exact SQL (with literal values, or in normalized form), the execution plan (EXPLAIN / EXPLAIN ANALYZE), and the context (concurrency, data volume, existing indexes). Reproduce in a test or staging environment if possible.
  • Improve: Interpret the plan (full table scan, wrong index, temporary table, filesort), then add or change indexes, rewrite the query, or adjust the schema. Re-measure after each change.
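
For the Identify step, performance_schema aggregates normalized statements in events_statements_summary_by_digest; a minimal sketch follows (timer columns are in picoseconds, and the 'mydb' schema filter is a placeholder to adapt):

SQL
-- Top statements by total time (frequency × latency)
SELECT DIGEST_TEXT,
       COUNT_STAR AS exec_count,
       ROUND(SUM_TIMER_WAIT / 1e12, 2) AS total_sec,
       ROUND(AVG_TIMER_WAIT / 1e9, 2)  AS avg_ms
FROM performance_schema.events_statements_summary_by_digest
WHERE SCHEMA_NAME = 'mydb'  -- placeholder: filter to your application schema
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;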

Examples

Example 1: Enabling and using the slow query log

SQL
-- Check current settings (slow_query_log% also shows slow_query_log_file, the log location)
SHOW VARIABLES LIKE 'slow_query_log%';
SHOW VARIABLES LIKE 'long_query_time';

-- Enable (example: log queries that take longer than 1 second)
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
-- Note: the GLOBAL value of long_query_time applies only to new connections;
-- existing sessions keep their session-level value.

  • After reproducing the slowness, inspect the slow log for the exact SQL and its duration. Use pt-query-digest or a similar tool to aggregate entries and surface the most expensive queries.

Example 2: EXPLAIN and what to look for

SQL
EXPLAIN SELECT * FROM orders WHERE user_id = 100 ORDER BY created_at DESC LIMIT 20;

  • type: Prefer const, eq_ref, ref, or range; avoid ALL (a full table scan) on large tables.
  • key: The index actually used; NULL means no index was used.
  • rows: Estimated rows examined; a very large value suggests a missing or wrong index.
  • Extra: Using filesort or Using temporary can indicate an expensive sort or temporary table; Using index is good (the index covers the query).
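
If the plan's estimates look implausible, EXPLAIN ANALYZE (MySQL 8.0.18 and later) actually executes the statement and reports real row counts and timings for each plan step:

SQL
-- EXPLAIN ANALYZE runs the query, so use staging or be careful in production
EXPLAIN ANALYZE
SELECT * FROM orders WHERE user_id = 100 ORDER BY created_at DESC LIMIT 20;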

Example 3: Adding an index and re-checking

SQL
-- Before: full scan or filesort
EXPLAIN SELECT id, status FROM orders WHERE user_id = 100 AND created_at >= '2025-01-01';

-- Add composite index for the predicate and covering columns
ALTER TABLE orders ADD KEY idx_user_created (user_id, created_at, id, status);
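-- Note: if id is the InnoDB primary key, secondary indexes already store it
-- internally, so listing it explicitly is optional; status makes the index covering.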

-- After: type=range, key=idx_user_created, Extra=Using index
EXPLAIN SELECT id, status FROM orders WHERE user_id = 100 AND created_at >= '2025-01-01';

  • After adding the index, run EXPLAIN again and re-measure latency and throughput; one way to verify the work done per query is shown below.
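
A quick way to confirm the index reduced work is to compare Handler counters before and after the change: Handler_read_next counts rows read in index order, while Handler_read_rnd_next counts rows read during a table scan.

SQL
-- Reset session counters, run the query, then see how many rows were touched
FLUSH STATUS;
SELECT id, status FROM orders WHERE user_id = 100 AND created_at >= '2025-01-01';
SHOW SESSION STATUS LIKE 'Handler_read%';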

Core Mechanism / Behavior

  • Optimizer: MySQL chooses an index (or none) based on statistics, selectivity, and cost estimates. Stale or skewed statistics can lead to bad plans; refreshing statistics, hinting, or rewriting can help (see the sketch after this list).
  • Execution: Large rows, filesort, or temporary tables usually mean more I/O and CPU. Reducing scanned rows (index) and avoiding filesort (matching ORDER BY to index) often gives the biggest gain.
  • Locking: Long-running queries can hold locks or be blocked by other transactions; if slow queries coincide with contention, check information_schema / performance_schema for lock waits (a query for this follows below).
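
When the optimizer picks a bad plan on stale statistics, refreshing them is cheap to try; forcing an index is a last resort. This sketch reuses idx_user_created from Example 3:

SQL
-- Refresh index statistics (InnoDB samples pages; estimates can drift)
ANALYZE TABLE orders;

-- Last resort: override the optimizer's index choice
SELECT id, status FROM orders FORCE INDEX (idx_user_created)
WHERE user_id = 100 AND created_at >= '2025-01-01';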
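
For lock waits, the sys schema (bundled since MySQL 5.7) provides a ready-made view; on MySQL 8.0 the raw data also lives in performance_schema:

SQL
-- One row per blocked session: who waits, who blocks, and for how long
SELECT * FROM sys.innodb_lock_waits;

-- Raw lock-wait pairs (MySQL 8.0)
SELECT * FROM performance_schema.data_lock_waits;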

Checklist

  Step      Action
  1. Find   Slow log, APM, or performance_schema to get the slow SQL and its caller
  2. Plan   EXPLAIN (and EXPLAIN ANALYZE if available) for the query
  3. Check  type, key, rows, Extra; look for a full scan, wrong index, or filesort
  4. Fix    Add/change an index, rewrite the SQL, or change the schema; then re-measure

Key Rules

  • Always get the real execution plan (EXPLAIN) for the exact SQL and data shape; plans can differ with different values and data volume.
  • Prefer index-based solutions (a covering index, or a composite index matching the predicate and sort order) before rewriting application logic, but remove redundant indexes that slow down writes (see the check after this list).
  • Re-run the query and monitor after every change; use the same environment and data volume where possible so before/after comparisons are meaningful.
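
The sys schema has ready-made views for the redundant-index check in the second rule; 'mydb' below is a placeholder schema name:

SQL
-- Duplicate or redundant index definitions
SELECT * FROM sys.schema_redundant_indexes WHERE table_schema = 'mydb';

-- Indexes with no reads since server restart (interpret with care)
SELECT * FROM sys.schema_unused_indexes WHERE object_schema = 'mydb';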

What's Next

See Index Fundamentals for B-Tree and covering indexes. See Pagination - Limit/Offset vs Seek for optimizing list queries. See MVCC and Locks if slowness is related to locking or isolation.