How Database Indexing Works
What is Indexing?β
πWithout Index β Full Table Scan
P1
βP2
βP3
β...
βP6
βP7
Read every page sequentially until you find what you need β like flipping through an entire book page by page.
7 pages checkedO(n)
πWith Index β Direct Lookup
INDEX
βP7
Look up the term in the index at the back of the book, then jump directly to the right page.
1 page checkedO(log n)
π―
What is an Index?
A separate data structure that stores a sorted copy of selected columns alongside pointers to the actual rows.
β‘
Why use Indexes?
Without an index, the database must scan every row. With one, it traverses a tree β reducing reads from billions to ~5.
βοΈ
How does it work?
A B-tree keeps data sorted. Each level narrows the search range by following the correct branch β like a decision tree.
Visualizationβ
Watch both approaches side by side for the same query β see how many rows the full table scan checks vs how few nodes the B-tree visits:
0 / 13
Without Index β Full Table Scan
SELECT * FROM users WHERE email = 'alice@mail.com'
| # | Name | |
|---|---|---|
| 1 | Emma | emma@mail.com |
| 2 | Liam | liam@mail.com |
| 3 | Sophia | sophia@mail.com |
| 4 | Noah | noah@mail.com |
| 5 | Olivia | olivia@mail.com |
| 6 | James | james@mail.com |
| 7 | Alice | alice@mail.com |
| 8 | Bob | bob@mail.com |
| 9 | Carol | carol@mail.com |
| 10 | Dave | dave@mail.com |
With B-tree Index β Tree Traversal
SELECT * FROM users WHERE email = 'alice@mail.com'
[ dave | sophia ]
βββββββββββββββΌββββββββββββββ
[ bob ]
[ james ]
[ noah ]
ββββββΌβββββ
alice, bob
carol, dave
emma, james
Waiting for table scan to progress...
Types of Indexesβ
π³
B-tree
Default in most databases. Sorted structure supports range queries.
WHERE, ORDER BY, JOIN#οΈβ£
Hash
O(1) exact match lookup using a hash table.
Equality only (=)π
Bitmap
Bit arrays for low-cardinality columns.
Data warehousingπ
Composite
Index on multiple columns β column order matters.
Multi-column WHEREπ¦
Covering
Includes all queried columns β avoids table lookups entirely.
Index-only scansπ
Unique
Enforces uniqueness on column values.
Primary keysIndex Scan Strategiesβ
The query optimizer chooses between these strategies based on cost estimates:
Index SeekFastest
B-tree traversal to find specific rows directly.
Index ScanFast
Read the entire index in order. Faster than table scan when index covers the query.
Full Table ScanSlowest
Read every row sequentially. Sometimes faster than index when selecting most rows.
Key Takeawaysβ
π’
Without Index
Every row is checked sequentially β O(n) β painfully slow for large tables.
π
With B-tree Index
The tree narrows the search at each level β only a few nodes visited β O(log n).
βοΈ
Trade-off
Indexes speed up reads but slow down writes β every INSERT/UPDATE/DELETE must update the index too.
πΎ
Storage Cost
Indexes consume additional disk space β typically 10-20% of table size.
π²
B-tree Depth
For 1 billion rows with branching factor 100 β only ~4-5 levels needed.
When to Index vs When NOT toβ
β
When to Index
βπColumns frequently used in WHERE clauses
βπColumns used in JOIN conditions
βπColumns used in ORDER BY or GROUP BY
βπHigh cardinality columns (many distinct values)
β When NOT to Index
βπTables with few rows (scan is fast enough)
βπLow cardinality (e.g. boolean β index won't help)
ββοΈHeavy write workloads (each write updates every index)
βπFrequently updated columns (maintenance cost is high)
How B-tree Indexes Work (Deep Dive)β
The most common index type is the B-tree (Balanced Tree). It keeps data sorted and allows searches, insertions, and deletions in O(log n) time:
- The root node contains keys that divide the data into ranges
- Each child node narrows the range further
- Leaf nodes contain the actual values and row pointers
- The tree stays balanced β all leaf nodes are at the same depth
For example, searching 1 billion rows with a branching factor of 100 requires only ~4-5 levels β that's 4-5 disk reads instead of 1 billion.
Learn Moreβ
Read the full theory in Database Design > Indexing.