B+ Trees

A B+ Tree is a self-balancing tree data structure and a variation of the B-Tree, optimized for disk-based storage systems that read and write data in fixed-size blocks. Two characteristics distinguish it from a regular B-Tree: all actual data records are stored exclusively in leaf nodes, and those leaf nodes are linked together into a sorted linked list.

This design makes range queries extremely efficient — once you locate the starting key, you simply follow the linked list forward rather than repeatedly traversing the tree from the root. B+ Trees are the most widely used index structure in database management systems, including MySQL (InnoDB), PostgreSQL, Oracle, and SQL Server.

Rendering diagram...

B+ Tree vs B-Tree#

Understanding the structural differences between B-Trees and B+ Trees clarifies why B+ Trees are preferred for database indexing.

Feature	B-Tree	B+ Tree
Data location	All nodes store data	Only leaves store data
Internal nodes	Keys + data + child pointers	Keys + child pointers only
Leaf nodes	No sibling links	Linked list for sequential access
Key duplication	No duplicates	Internal keys duplicated in leaves
Range queries	Tree traversal required	Single leaf scan
Fan-out	Lower (data in all nodes)	Higher (more keys per internal node)

B-Tree: Data in All Nodes

Rendering diagram...

B+ Tree: Data Only in Leaves

Rendering diagram...

B+ Tree Properties#

A valid B+ Tree must satisfy the following structural invariants at all times. Violating any of them — even temporarily during an insert or delete — can corrupt the tree.

Property	Description
Internal nodes	Store only keys and child pointers (no data)
Leaf nodes	Store keys and data (or pointers to data)
Leaf linking	All leaves connected via next pointers (optionally prev for bidirectional scans)
Key copies	Internal keys are copies of keys in leaves
All leaves	At the same depth (perfectly balanced)
Order m	Internal: max m children; Leaf: max m-1 key-value pairs

B+ Tree Node Structure

Internal nodes and leaf nodes have different structures

B+ Tree node with separate internal and leaf structures

Search Operation#

Search in a B+ Tree always ends at a leaf node. This predictability matters: regardless of what key you're looking for, the algorithm always traverses from the root down to a leaf, so search time is always proportional to the tree height — never more, never less. This is more consistent than a B-Tree, where a key stored in an internal node can be found before reaching a leaf.

Searching for key 35

Rendering diagram...

B+ Tree Search

Search always traverses to a leaf node

Time:O(log n)Space:O(1)Best:O(log n)Worst:O(log n)

Range Queries#

The leaf linked list is what makes B+ Trees so powerful for range queries. Instead of repeatedly navigating the tree for each key in the range, you navigate to the starting leaf once, then scan forward through the linked list until you pass the end of the range. This is why SQL queries with BETWEEN, >, or ORDER BY on indexed columns are fast — they map directly to this scan pattern.

Range Query: 25 to 45

Rendering diagram...

Range Query

Find all keys in a given range using leaf links

Time:O(log n + k) where k = number of resultsSpace:O(k) for storing resultsBest:O(log n) when range is emptyWorst:O(log n + n) when all keys are in range

Insertion#

Insertion in a B+ Tree always targets a leaf node. The process is: navigate to the correct leaf, insert the new key in sorted order, then check whether the leaf has overflowed (reached the order limit). If it has, split the leaf in half and push the first key of the new right leaf up to the parent as a separator. If the parent also overflows, the split propagates upward — potentially all the way to creating a new root.

Before Insert 35 (Order 4)

Rendering diagram...

After Insert 35: Leaf Splits

Rendering diagram...

B+ Tree Insert

Insert key-value pair, splitting nodes as needed

Time:O(log n)Space:O(1)Best:O(log n)Worst:O(log n)

Deletion#

Deletion from a B+ Tree is more involved than insertion because each node must maintain a minimum occupancy — at least ⌈order/2⌉ - 1 keys. If removing a key drops a node below this minimum, you must either borrow a key from a neighboring sibling (redistribution) or merge the node with a sibling and remove the corresponding separator from the parent. There is also a second concern: if the deleted key was being used as a routing key in an internal node, that separator must be updated to remain consistent with the leaf contents.

Case	Condition	Action
Simple delete	Leaf has enough keys after deletion	Just remove the key
Borrow from sibling	Sibling has extra keys	Redistribute keys, update parent
Merge with sibling	Both at minimum	Combine nodes, remove parent key
Update separator	Deleted key was separator	Replace with new first key of right child

B+ Tree Delete

Delete a key while maintaining tree properties

Time:O(log n)Space:O(1)Best:O(log n)Worst:O(log n)

Complete Implementation#

The following implementation brings together all the operations covered above — node structure, search, range query, insertion, leaf splitting, and internal splitting — into a single working class. Reading through it in full is a good way to see how the pieces fit together and how the leaf linked list is threaded through every operation.

Complete B+ Tree

Full implementation with all operations

Complete B+ Tree with range queries

Applications#

B+ Trees are the default index structure in most relational databases and many file systems. The following examples show where you are likely encountering B+ Trees in practice.

Application	Why B+ Tree?
MySQL InnoDB	Primary and secondary indexes, clustered index
PostgreSQL	Default index type for most data types
SQLite	All table and index storage
MongoDB	WiredTiger storage engine indexes
File Systems (NTFS, ext4)	Directory and file metadata indexing
LevelDB / RocksDB	Sorted string tables (SSTables) with B+ tree-like index blocks

Reviewing AI-Generated Code#

B+ Trees have two structurally distinct node types and a critical linked list connecting the leaves. These details are easy to overlook, and AI code generators frequently miss them. The most common mistakes are mixing up internal and leaf node structures, and failing to update the leaf linked list during a split. If either invariant breaks, the tree may still pass basic point-lookup tests while silently returning wrong results for range queries.

Issue	What to Check	Common AI Mistake
Node types	Internal nodes have children, leaves have values	Mixing up node structures
Leaf linking	Leaves must maintain next pointers	Breaking linked list during split
Key duplication	Internal keys are copies from leaves	Not copying keys correctly
Search termination	Search always ends at leaf	Returning early from internal node
Range query	Use leaf links, don't traverse tree	Tree traversal for range query

Spotting AI B+ Tree Bugs

Common errors in AI-generated B+ Tree code

Correct leaf split maintaining the linked list

Questions to Ask AI About B+ Tree Code#

"How is the leaf linked list maintained?" — The next pointer on every leaf node must be updated during every split. Trace through the split code to verify this.
"What's the time complexity of the range query?" — The correct answer is O(log n + k), where k is the number of results. If the implementation traverses the entire tree, it is O(n) and is not using the leaf links at all.
"Where is data stored?" — Data must appear only in leaf nodes. If values are stored in internal nodes, the implementation is a B-Tree, not a B+ Tree.
"What gets pushed up to the parent when a leaf splits?" — The first key of the new right leaf becomes the separator in the parent. For internal node splits, the middle key is removed from the node and pushed up — it is not copied to the new node.

Summary#

Concept	Key Point
Structure	Internal nodes: keys only; Leaves: keys + data
Leaf Linking	Leaves form linked list for sequential access
Search	Always traverses to leaf: O(log n)
Range Query	Find start leaf, then scan: O(log n + k)
Insert	Always at leaf, split upward if needed
Delete	Remove from leaf, borrow/merge if underflow
Fan-out	Higher than B-Tree (no data in internal nodes)
Applications	Database indexes, file systems

B+ Trees are the cornerstone of database indexing. Their efficient range queries — enabled by the leaf linked list — and high fan-out — enabled by keeping data out of internal nodes — make them ideal for disk-based storage systems where every I/O operation has a measurable cost. Every time you run a WHERE clause, a JOIN, or an ORDER BY on an indexed column in a relational database, a B+ Tree is doing the work behind the scenes.

PreviousB-Tree

NextTrie