Index Fundamentals - B-Tree, Covering Index

Indexes in MySQL (InnoDB) speed up lookups and range scans by organizing data in B-Tree structures. A covering index contains all columns needed by a query so the engine can satisfy it without reading the table. This article explains how B-Tree indexes work and when to use covering indexes for better performance.

Overview

  • B-Tree index: InnoDB uses B+ trees. Keys are sorted; you can do equality and range lookups (e.g. WHERE id = 5, WHERE created_at BETWEEN ...). Leaf nodes hold the index key and either a primary key value (for secondary indexes) or the full row (for the clustered index).
  • Clustered index: The primary key is the clustered index; table rows are stored in the leaf level. One table has one clustered index.
  • Secondary index: Any other index. Its leaf stores (index columns + primary key). A lookup by secondary index often requires a table lookup (lookup into the clustered index) to get the rest of the row.
  • Covering index: An index that contains every column the query needs (including SELECT and WHERE). No need to go back to the table; the query is satisfied from the index alone.

Example

Example 1: Table and a secondary index

SQL
CREATE TABLE orders (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    user_id BIGINT NOT NULL,
    status VARCHAR(20),
    created_at DATETIME,
    KEY idx_user_created (user_id, created_at)
);
  • Lookup by user_id or (user_id, created_at) can use idx_user_created. If you SELECT *, InnoDB uses the index to find matching primary keys, then fetches full rows from the clustered index (table lookup).

Example 2: Covering index — no table lookup

SQL
-- Query: list order ids and status for a user in a time range
SELECT id, status
FROM orders
WHERE user_id = 100 AND created_at >= '2025-01-01';

-- Covering index: (user_id, created_at) + include (id, status)
-- In MySQL 8.0 you can add columns for covering:
ALTER TABLE orders ADD KEY idx_user_created_cover (user_id, created_at, id, status);
  • For this query, idx_user_created_cover contains user_id, created_at, id, status. The optimizer can satisfy the query from the index only (no table read).

Example 3: EXPLAIN shows "Using index"

SQL
EXPLAIN SELECT id, status FROM orders
WHERE user_id = 100 AND created_at >= '2025-01-01';
  • If you see Using index in the Extra column, the query is covered and avoids table access.

Leftmost Prefix Rule

A composite index (A, B, C) can be used for:

  • WHERE A = ?
  • WHERE A = ? AND B = ?
  • WHERE A = ? AND B = ? AND C = ?
  • WHERE A = ? AND B > ? (range on B stops use of C)
  • WHERE A = ? ORDER BY B, C

It cannot be used for:

  • WHERE B = ? (no leading column A)
  • WHERE C = ? (no leading columns)
  • WHERE B = ? AND C = ? (no A)

Column order matters. Put equality columns first, then range columns. For WHERE user_id = ? AND created_at >= ?, the index (user_id, created_at) is optimal.

Core Mechanism / Behavior

  • B+ tree: Sorted structure; range scans are efficient because leaf nodes are linked. Lookup cost is roughly O(log N) in the number of index entries.
  • Covering: If the index includes all columns in the query (SELECT, WHERE, ORDER BY, GROUP BY), the executor does not need to read the table. Fewer random I/Os and better cache use.
Query patternIndex typeNote
WHERE id = ?Primary / uniqueDirect lookup
WHERE user_id = ?Secondary on user_idMay require table lookup
SELECT id, status WHERE user_id = ?Covering (user_id, id, status)No table lookup
ORDER BY created_atIndex with created_atAvoid filesort if used

Key Rules

  • Prefer covering indexes for hot queries: add needed columns to the index so the query is satisfied from the index only.
  • Order columns in a composite index by selectivity and predicate usage (e.g. equality columns first, then range).
  • Use EXPLAIN to confirm "Using index" for covering and to see which index is chosen. Avoid SELECT * when a covering index can serve a subset of columns.

What's Next

For lock behavior with indexes, see Locks - Row Lock vs Gap Lock. For query tuning, see Slow Query Diagnosis Workflow and Pagination - Limit/Offset vs Seek Method.