Index Fundamentals - B-Tree, Covering Index
Indexes in MySQL (InnoDB) speed up lookups and range scans by organizing data in B-Tree structures. A covering index contains all columns needed by a query so the engine can satisfy it without reading the table. This article explains how B-Tree indexes work and when to use covering indexes for better performance.
Overview
- B-Tree index: InnoDB uses B+ trees. Keys are sorted; you can do equality and range lookups (e.g.
WHERE id = 5,WHERE created_at BETWEEN ...). Leaf nodes hold the index key and either a primary key value (for secondary indexes) or the full row (for the clustered index). - Clustered index: The primary key is the clustered index; table rows are stored in the leaf level. One table has one clustered index.
- Secondary index: Any other index. Its leaf stores (index columns + primary key). A lookup by secondary index often requires a table lookup (lookup into the clustered index) to get the rest of the row.
- Covering index: An index that contains every column the query needs (including
SELECTandWHERE). No need to go back to the table; the query is satisfied from the index alone.
Example
Example 1: Table and a secondary index
SQLCREATE TABLE orders ( id BIGINT PRIMARY KEY AUTO_INCREMENT, user_id BIGINT NOT NULL, status VARCHAR(20), created_at DATETIME, KEY idx_user_created (user_id, created_at) );
- Lookup by
user_idor(user_id, created_at)can useidx_user_created. If youSELECT *, InnoDB uses the index to find matching primary keys, then fetches full rows from the clustered index (table lookup).
Example 2: Covering index — no table lookup
SQL-- Query: list order ids and status for a user in a time range SELECT id, status FROM orders WHERE user_id = 100 AND created_at >= '2025-01-01'; -- Covering index: (user_id, created_at) + include (id, status) -- In MySQL 8.0 you can add columns for covering: ALTER TABLE orders ADD KEY idx_user_created_cover (user_id, created_at, id, status);
- For this query,
idx_user_created_covercontainsuser_id,created_at,id,status. The optimizer can satisfy the query from the index only (no table read).
Example 3: EXPLAIN shows "Using index"
SQLEXPLAIN SELECT id, status FROM orders WHERE user_id = 100 AND created_at >= '2025-01-01';
- If you see Using index in the
Extracolumn, the query is covered and avoids table access.
Leftmost Prefix Rule
A composite index (A, B, C) can be used for:
WHERE A = ?WHERE A = ? AND B = ?WHERE A = ? AND B = ? AND C = ?WHERE A = ? AND B > ?(range on B stops use of C)WHERE A = ? ORDER BY B, C
It cannot be used for:
WHERE B = ?(no leading column A)WHERE C = ?(no leading columns)WHERE B = ? AND C = ?(no A)
Column order matters. Put equality columns first, then range columns. For WHERE user_id = ? AND created_at >= ?, the index (user_id, created_at) is optimal.
Core Mechanism / Behavior
- B+ tree: Sorted structure; range scans are efficient because leaf nodes are linked. Lookup cost is roughly O(log N) in the number of index entries.
- Covering: If the index includes all columns in the query (SELECT, WHERE, ORDER BY, GROUP BY), the executor does not need to read the table. Fewer random I/Os and better cache use.
| Query pattern | Index type | Note |
|---|---|---|
WHERE id = ? | Primary / unique | Direct lookup |
WHERE user_id = ? | Secondary on user_id | May require table lookup |
SELECT id, status WHERE user_id = ? | Covering (user_id, id, status) | No table lookup |
ORDER BY created_at | Index with created_at | Avoid filesort if used |
Key Rules
- Prefer covering indexes for hot queries: add needed columns to the index so the query is satisfied from the index only.
- Order columns in a composite index by selectivity and predicate usage (e.g. equality columns first, then range).
- Use EXPLAIN to confirm "Using index" for covering and to see which index is chosen. Avoid
SELECT *when a covering index can serve a subset of columns.
What's Next
For lock behavior with indexes, see Locks - Row Lock vs Gap Lock. For query tuning, see Slow Query Diagnosis Workflow and Pagination - Limit/Offset vs Seek Method.