Skip to content

B Trees

Chapter 14 of Open Data Structures

Motivation for B-Trees

  • Index structures for large datasets cannot fit in main memory.

  • Disk and SSD storage require a different approach to efficiency.

  • Modern SSDs have random access times of approximately 100--200 microseconds.

  • One SSD access takes more time than 500,000--2,000,000 CPU instructions (on modern multi-GHz processors).

  • One SSD access is approximately 1,000--10,000 times slower than reading from main memory (50--100 ns).

  • Suppose an AVL tree is used to store 20 million
    records on an SSD.

  • This very deep binary tree requires lots of SSD accesses; accesses
    microseconds per access milliseconds.

  • We know we can't improve on the lower bound on the search for a binary tree.

  • The solution is to use more branches and, thus, reduce the tree height!

    • As branching increases, depth decreases, so fewer expensive storage accesses are needed.

Realistic Worst-Case Scan

You need to scan all records in a 1-billion row table.
Assume each random access is microseconds.

  • AVL/BST lookup per key: random reads (height ).

  • AVL I/O time: days
    (or minutes at parallelism).

  • B-tree (order ) height: ; therefore, reads/key.

  • B-tree I/O time: days
    (or minutes at parallelism).

Takeaway: Even with parallelism, reducing tree height from 30 to 5 saves I/O work. This is why B-Trees dominate for large datasets.

Web Search Scale

Live system: queries/sec, each needs one key lookup in B pages.

  • AVL tree depth: .
    SSD accesses/query = ms/query at /access.

  • B-tree height (order ): .
    SSD accesses/query = ms/query.

  • 10k q/s: AVL = seconds of SSD work per second (queueing backlog).
    B-tree = seconds of SSD work per second.

AVL-style access causes user-visible stalls under load, while a B-tree design maintains low-latency response.

Example B-Tree, Order-5 with 26 Elements
Example B-Tree, Order-5 with 26 Elements
Example B-Tree, Order-5 with 26 Elements
Example B-Tree, Order-5 with 26 Elements

Definition of a B-tree

  • A B-tree of order is an -way tree (i.e., a tree where each node may have up to children) in which:

    1. All leaves are on the same level.

    2. The number of children of a non-leaf node is always one more than the number of keys.

    3. All keys of a node are sorted in increasing order. These keys partition the child keys as a search tree.

    4. The root is either a leaf node or has two to children.

    5. Other non-leaf nodes have at least children.

    6. A node contains no more than keys.

  • The number should always be odd.

Insert

Constructing a B-Tree

Starting with an empty Order-5 B-tree, let’s insert the
following sequence of keys:
1, 12, 8, 2, 25, 6, 14, 28, 17, 7, 52, 16, 48, 68, 3, 26, 29, 55, 45.

  • The first four items (1, 12, 8, 2) go into the root:

    1 / 1
  • Putting the fifth item (25) in the root would violate condition 5.

  • Therefore, when 25 arrives, pick the middle key to make a new root.

    1 / 1
  • 6, 14, and 28 get added to the leaf nodes.

  • Adding 17 to the right leaf node would over-fill it.

    1 / 1
  • So, the middle key is promoted (to the root), splitting the leaves.

  • We can now add 7, 52, 16, 48 to the leaf nodes.

  • Adding 68 to the right leaf node would over-fill it.

  • So we split the rightmost leaf, promoting 48 to the root.

  • Adding 3 to the left leaf node would over-fill it

  • Adding 3 causes us to split the leftmost leaf, promoting 3 to the root.

  • 26, 29, 55 then go into the leaves.

  • Adding 45 causes a split of the 4th leaf.

  • Promote 28 to the root, which causes the root to split.

  • Promote 28 to the root; doesn’t fit.

  • Promote the 17 as the new root.

  • Attempt to insert the new key into a leaf.

  • If this would result in that leaf becoming too big, split the leaf into two, promoting the middle key to the leaf’s parent.

  • If this would result in the parent becoming too big, split the parent into two, promoting the middle key.

  • This strategy may need to repeat to the root.

  • If necessary, the root is split in two and the middle key is promoted to a new root, making the tree one level higher.

Exercise in Inserting a B-Tree

Insert the following keys to a 5-way B-tree:
3, 7, 9, 23, 45, 1, 5, 14, 25, 24, 13, 11, 8, 19, 4, 31, 35, 56, 2, 6, 12.

Deletion from a B-Tree

There are four removal cases to consider in order.

  1. Key is in a leaf: Delete the key.
  2. Key is not in a leaf: Its predecessor and successor must be in leaves. Delete the key and promote either the predecessor or successor (implementor's choice) to replace it in the non-leaf position.

After performing Case 1 or 2, a leaf node may now have too few keys (underflow). We will fix this underflow with Case 3 or 4.

When removing a key would cause underflow (too few keys), handle it as follows:

  1. Sibling has spare keys — Rotate:
    • Pull a key from a sibling up to the parent.
    • Demote the parent key down into the lacking leaf.
  2. Neither sibling has spare keys — Merge:
    • Combine the lacking node, sibling, and parent key into one node.
    • If the parent now underflows, recursively apply Cases 3 or 4 to it.

Type 1: Simple Removal from Leaf

Assuming a 5-way B-Tree (as before), delete 2.

Because there are enough keys in the node, just delete it.

Type 2: Simple Removal from Non-Leaf

Assuming a 5-way B-Tree (as before), delete 52.

Replace with the closest value from a leaf with enough values.

Type 4: Too few keys in node and its siblings

Delete 72

Type 3: Enough siblings

Delete 22

1 / 1

Exercise in Removal a B-Tree

Given a 5-way B-tree created by these data (last exercise):
3, 7, 9, 23, 45, 1, 5, 14, 25, 24, 13, 11, 8, 19, 4, 31, 35, 56, 2, 6, 12
Delete these keys: 4, 5, 7, 3, 14, 45

Analysis

Maximum Keys in an Order- B-Tree

Max keys per level:

LevelKey Count
root

So, the maximum number of items is:

When and , this gives .

Big- of B-Trees

OperationBalanced Binary Tree-Way B Tree
Insert
Find
Delete

The benefits are:

  1. Reduced hard-drive access (but the same number of instructions) and
  2. Reduced memory usage (fewer pointers).

Reasons for Using B-Trees

When searching tables held on a hard drive, the cost of each drive transfer is high but doesn’t depend much on the amount of data transferred, especially if consecutive items are transferred.

  • For example, an order 101 B-tree can transfer each node in one drive read operation.

  • An order 101 B-tree and height 3 can hold items (approximately 100 million) and any item can be accessed with 3 drive reads (assuming we hold the root in memory).

If we take , we get a 2–3 Tree, in which non-leaf nodes have two or three children (i.e., one or two keys).

  • B-Trees are always balanced (since the leaves are all at the same level), so 2-3 Trees make a good type of balanced tree.

Comparing Trees

  • Binary trees

    • Can become unbalanced and lose their good time complexity (big ).

    • AVL Trees are strict binary trees that overcome the balance problem.

    • Heaps remain balanced but are unsorted.

  • Multi-way trees

    • B-Trees are -way, with any (odd) number of children.

    • The 3-Way B-Tree (a.k.a 2–3 Tree) approximates a self-balancing binary tree, exchanging the AVL tree’s balancing operations for insertion and (more complex) deletion operations.

What does the “B” in B-Tree stand for?

  • No one knows.

  • The inventors of the B-Tree will not say what it means.

  • It could mean: Boeing (where it was invented), balanced, between, broad, bushy, and Bayer (the inventors are Bayer and McCreight).

  • “The more you think about what the ‘B’ in B-trees means, the better you understand B-trees.”
    — Edward M. McCreight (2013)

Variations

Introduction to B*-Trees

B*-Trees are a B-Tree variant that increases node occupancy for better space utilization.

  • B*-Trees require non-root nodes to be at least full (vs. full in B-Trees).

  • Insertions may redistribute keys among siblings before splitting, reducing splits overall.

  • This typically yields denser nodes and fewer I/O operations in large storage systems.

  • B*-Trees preserve B-Tree search costs while improving storage efficiency.

Introduction to B+Trees

B+Trees are a variant of B-Trees that further optimize for disk-based storage and range queries.

  • All data (or pointers to data) are stored only in leaf nodes.

  • Internal nodes store only keys (no records); therefore, they can hold more keys.

  • Leaf nodes are linked sequentially, enabling fast full-range scans and ordered iteration.

  • Higher fan-out and fewer disk reads per access make B+Trees popular in databases and file systems.

  • Insertion and deletion keep leaves at the same depth, preserving strong balance and performance.

Advanced B-Tree Variations

Beyond B+Trees and B*-Trees, researchers and systems developers have created specialized variants:

  • Bε-Trees: Write-optimized variants with buffered internal nodes to batch insertions, useful for heavy write workloads in key-value stores.

  • B-Link Trees: Concurrency-friendly variants with sibling pointers and latch-free search paths, common in database index implementations.

  • Cache-aware/Cache-oblivious B-Trees: Variants optimized for modern memory hierarchies (L1/L2/L3 caches) rather than just disk I/O.

Lab 20: B-Trees

Let’s look at the lab based on today’s lecture.