B+ Trees

A B+ Tree is a self-balancing tree data structure and a variation of the B-Tree, optimized for disk-based storage systems that read and write data in fixed-size blocks. Two characteristics distinguish it from a regular B-Tree: all actual data records are stored exclusively in leaf nodes, and those leaf nodes are linked together into a sorted linked list.

This design makes range queries extremely efficient — once you locate the starting key, you simply follow the linked list forward rather than repeatedly traversing the tree from the root. B+ Trees are the most widely used index structure in database management systems, including MySQL (InnoDB), PostgreSQL, Oracle, and SQL Server.

Rendering diagram...

B+ Tree vs B-Tree#

Understanding the structural differences between B-Trees and B+ Trees clarifies why B+ Trees are preferred for database indexing.

FeatureB-TreeB+ Tree
Data locationAll nodes store dataOnly leaves store data
Internal nodesKeys + data + child pointersKeys + child pointers only
Leaf nodesNo sibling linksLinked list for sequential access
Key duplicationNo duplicatesInternal keys duplicated in leaves
Range queriesTree traversal requiredSingle leaf scan
Fan-outLower (data in all nodes)Higher (more keys per internal node)
B-Tree: Data in All Nodes
Rendering diagram...
B+ Tree: Data Only in Leaves
Rendering diagram...

B+ Tree Properties#

A valid B+ Tree must satisfy the following structural invariants at all times. Violating any of them — even temporarily during an insert or delete — can corrupt the tree.

PropertyDescription
Internal nodesStore only keys and child pointers (no data)
Leaf nodesStore keys and data (or pointers to data)
Leaf linkingAll leaves connected via next pointers (optionally prev for bidirectional scans)
Key copiesInternal keys are copies of keys in leaves
All leavesAt the same depth (perfectly balanced)
Order mInternal: max m children; Leaf: max m-1 key-value pairs

B+ Tree Node Structure

Internal nodes and leaf nodes have different structures

B+ Tree node with separate internal and leaf structures

Search Operation#

Search in a B+ Tree always ends at a leaf node. This predictability matters: regardless of what key you're looking for, the algorithm always traverses from the root down to a leaf, so search time is always proportional to the tree height — never more, never less. This is more consistent than a B-Tree, where a key stored in an internal node can be found before reaching a leaf.

Searching for key 35
Rendering diagram...

B+ Tree Search

Search always traverses to a leaf node

Time:O(log n)Space:O(1)Best:O(log n)Worst:O(log n)

Range Queries#

The leaf linked list is what makes B+ Trees so powerful for range queries. Instead of repeatedly navigating the tree for each key in the range, you navigate to the starting leaf once, then scan forward through the linked list until you pass the end of the range. This is why SQL queries with BETWEEN, >, or ORDER BY on indexed columns are fast — they map directly to this scan pattern.

Range Query: 25 to 45
Rendering diagram...
Rendering diagram...

Range Query

Find all keys in a given range using leaf links

Time:O(log n + k) where k = number of resultsSpace:O(k) for storing resultsBest:O(log n) when range is emptyWorst:O(log n + n) when all keys are in range

Insertion#

Insertion in a B+ Tree always targets a leaf node. The process is: navigate to the correct leaf, insert the new key in sorted order, then check whether the leaf has overflowed (reached the order limit). If it has, split the leaf in half and push the first key of the new right leaf up to the parent as a separator. If the parent also overflows, the split propagates upward — potentially all the way to creating a new root.

Before Insert 35 (Order 4)
Rendering diagram...
After Insert 35: Leaf Splits
Rendering diagram...

B+ Tree Insert

Insert key-value pair, splitting nodes as needed

Time:O(log n)Space:O(1)Best:O(log n)Worst:O(log n)

Deletion#

Deletion from a B+ Tree is more involved than insertion because each node must maintain a minimum occupancy — at least ⌈order/2⌉ - 1 keys. If removing a key drops a node below this minimum, you must either borrow a key from a neighboring sibling (redistribution) or merge the node with a sibling and remove the corresponding separator from the parent. There is also a second concern: if the deleted key was being used as a routing key in an internal node, that separator must be updated to remain consistent with the leaf contents.

CaseConditionAction
Simple deleteLeaf has enough keys after deletionJust remove the key
Borrow from siblingSibling has extra keysRedistribute keys, update parent
Merge with siblingBoth at minimumCombine nodes, remove parent key
Update separatorDeleted key was separatorReplace with new first key of right child

B+ Tree Delete

Delete a key while maintaining tree properties

Time:O(log n)Space:O(1)Best:O(log n)Worst:O(log n)

Complete Implementation#

The following implementation brings together all the operations covered above — node structure, search, range query, insertion, leaf splitting, and internal splitting — into a single working class. Reading through it in full is a good way to see how the pieces fit together and how the leaf linked list is threaded through every operation.

Complete B+ Tree

Full implementation with all operations

Complete B+ Tree with range queries

Applications#

B+ Trees are the default index structure in most relational databases and many file systems. The following examples show where you are likely encountering B+ Trees in practice.

ApplicationWhy B+ Tree?
MySQL InnoDBPrimary and secondary indexes, clustered index
PostgreSQLDefault index type for most data types
SQLiteAll table and index storage
MongoDBWiredTiger storage engine indexes
File Systems (NTFS, ext4)Directory and file metadata indexing
LevelDB / RocksDBSorted string tables (SSTables) with B+ tree-like index blocks

Reviewing AI-Generated Code#

B+ Trees have two structurally distinct node types and a critical linked list connecting the leaves. These details are easy to overlook, and AI code generators frequently miss them. The most common mistakes are mixing up internal and leaf node structures, and failing to update the leaf linked list during a split. If either invariant breaks, the tree may still pass basic point-lookup tests while silently returning wrong results for range queries.

IssueWhat to CheckCommon AI Mistake
Node typesInternal nodes have children, leaves have valuesMixing up node structures
Leaf linkingLeaves must maintain next pointersBreaking linked list during split
Key duplicationInternal keys are copies from leavesNot copying keys correctly
Search terminationSearch always ends at leafReturning early from internal node
Range queryUse leaf links, don't traverse treeTree traversal for range query

Spotting AI B+ Tree Bugs

Common errors in AI-generated B+ Tree code

Correct leaf split maintaining the linked list

Questions to Ask AI About B+ Tree Code#

  1. "How is the leaf linked list maintained?" — The next pointer on every leaf node must be updated during every split. Trace through the split code to verify this.
  2. "What's the time complexity of the range query?" — The correct answer is O(log n + k), where k is the number of results. If the implementation traverses the entire tree, it is O(n) and is not using the leaf links at all.
  3. "Where is data stored?" — Data must appear only in leaf nodes. If values are stored in internal nodes, the implementation is a B-Tree, not a B+ Tree.
  4. "What gets pushed up to the parent when a leaf splits?" — The first key of the new right leaf becomes the separator in the parent. For internal node splits, the middle key is removed from the node and pushed up — it is not copied to the new node.

Summary#

ConceptKey Point
StructureInternal nodes: keys only; Leaves: keys + data
Leaf LinkingLeaves form linked list for sequential access
SearchAlways traverses to leaf: O(log n)
Range QueryFind start leaf, then scan: O(log n + k)
InsertAlways at leaf, split upward if needed
DeleteRemove from leaf, borrow/merge if underflow
Fan-outHigher than B-Tree (no data in internal nodes)
ApplicationsDatabase indexes, file systems

B+ Trees are the cornerstone of database indexing. Their efficient range queries — enabled by the leaf linked list — and high fan-out — enabled by keeping data out of internal nodes — make them ideal for disk-based storage systems where every I/O operation has a measurable cost. Every time you run a WHERE clause, a JOIN, or an ORDER BY on an indexed column in a relational database, a B+ Tree is doing the work behind the scenes.