Fractional Nested Sets: Solving a 30-Year-Old Tree Problem with Floats

The Problem with Nested Sets

If you've ever stored a tree in a relational database, you've probably encountered the Modified Preorder Tree Traversal (MPTT) pattern — also known as nested sets. It's Joe Celko's classic: every node gets a lft and rgt integer, and a subtree query becomes a simple range scan:

SELECT * FROM categories WHERE lft > 2 AND rgt < 11;

Beautiful for reads. Brutal for writes.

Insert "Tablets" under "Electronics" in a 1,000-node tree? MPTT needs to shift the lft and rgt values of every node to the right of the insertion point. That's potentially 500+ UPDATE statements for a single insert. Move a subtree? Same story. The write cost is O(n), and in production, with triggers, indexes, and replication — it hurts.

This is a problem the industry has lived with since Celko published it in the early 1990s. The usual advice is "use adjacency lists if you write a lot" or "use materialized paths." Each has its own trade-offs. None fix the core issue with nested sets.

The Insight: What If lft and rgt Were Floats?

Here's the thing — there's nothing inherently integer about nested set boundaries. The only requirement is that a node's lft is less than its rgt, and that parent boundaries contain child boundaries. Order, not cardinality.

So what if, instead of shifting every node to make room for an integer, we just... placed the new node in the gap?

Electronics  [1, 12]
  Phones     [2, 7]
  Laptops    [8, 11]
  Tablets    [11.5, 11.75]  ← placed in the gap. Nothing else moves.

One row inserted. Zero rows updated. The tree is still valid — every nested set invariant holds.

This is the core idea behind Fractional Nested Sets (FNS): use IEEE 754 floating-point numbers for lft and rgt, and compute new positions by bisecting the gap between adjacent boundaries.

How Bisection Works

When inserting a new child, the algorithm finds the gap between the last sibling's rgt and the parent's rgt, then places the new node at calculated fractions of that gap:

// Simplified insertion logic
const gap = parentRgt - lastSiblingRgt;
const newLft = lastSiblingRgt + gap * 0.33;
const newRgt = lastSiblingRgt + gap * 0.66;

For moves, same principle — find the gap at the destination, bisect, update the moved node. One UPDATE instead of hundreds.

The bisect() utility handles the math:

bisect(1, 2);   // → 1.5
bisect(1, 1.5); // → 1.25
bisect(1, 1.25); // → 1.125

Each successive insertion halves the available gap. IEEE 754 float64 gives us ~52 bits of mantissa precision, which means roughly 52 successive halvings before we lose precision. In practice, that translates to thousands of operations before any concern arises.

The Precision Question

"But floats are imprecise!" — yes, and this is the part people get stuck on. The key insight is that we don't need exact values. We need correct ordering. As long as node.lft < node.rgt and parent boundaries contain child boundaries, the tree is valid. Whether lft is 3.7500000 or 3.7500001 doesn't matter.

That said, gaps do shrink over time. After many insertions in the same spot, adjacent boundaries converge. The library monitors this:

const PRECISION_THRESHOLD = 1e-7;

function hasAdequateGap(a: number, b: number): boolean {
  return Math.abs(b - a) > PRECISION_THRESHOLD;
}

When gaps drop below the threshold, a rebalance is triggered — a single DFS pass that reassigns clean integer values to every node:

function rebalance(nodes: TreeNode[]): void {
  // DFS walk, assign lft = counter++, rgt = counter++
  // after visiting all children
  // Result: clean integers, maximum gaps restored
}

This rebalance is O(n), but it happens rarely — roughly every few hundred mutations in a hot spot. Amortized, every operation is O(1).

The Numbers

I benchmarked FNS against traditional MPPT with a 200-node tree and 50 move operations. The metric that matters is nodesModified — how many nodes had their lft/rgt values rewritten, which directly maps to database UPDATEs:

Operation	FNS	MPPT	Reduction
200 inserts	285	8,799	97%
50 moves	50	8,052	99%

With MPPT, 50 moves touched over 8,000 nodes. With FNS, exactly 50 — one per move. That's not a micro-optimization; that's a fundamentally different cost model.

The API

The library implements a clean TreeOperations interface that all three strategies (FNS, MPPT, Adjacency List) share:

import { FractionalNestedSet } from '@abeedoo/fractional-nested-sets';

const tree = new FractionalNestedSet();

tree.addRoot('electronics');
tree.addChild('electronics', 'phones');
tree.addChild('electronics', 'laptops');
tree.addChild('phones', 'iphones');
tree.addChild('phones', 'android');

// Subtree query — still just a range scan
tree.getSubtree('phones');
// → [phones, iphones, android]

// Ancestors — path from root
tree.getAncestors('iphones');
// → [electronics, phones, iphones]

// Move — only the moved node changes
const result = tree.moveNode('android', 'laptops');
console.log(result.nodesModified); // → 1

A StorageAdapter interface makes it database-agnostic:

const adapter: StorageAdapter = {
  get(id: string): TreeNode | undefined { /* your DB read */ },
  getAll(): TreeNode[] { /* your DB scan */ },
  set(node: TreeNode): void { /* your DB upsert */ },
  delete(id: string): void { /* your DB delete */ },
  clear(): void { /* your DB truncate */ },
};

const tree = new FractionalNestedSet(adapter);

Database Schema

For persistence, your table needs float columns instead of integers — that's the only schema change:

CREATE TABLE categories (
  id        TEXT PRIMARY KEY,
  parent_id TEXT REFERENCES categories(id),
  lft       DOUBLE PRECISION NOT NULL,
  rgt       DOUBLE PRECISION NOT NULL,
  depth     INTEGER NOT NULL DEFAULT 0
);

CREATE INDEX idx_lft ON categories (lft);
CREATE INDEX idx_rgt ON categories (rgt);

Every nested set query you already know still works:

-- Get all descendants of 'electronics'
SELECT * FROM categories
WHERE lft > 1.0 AND rgt < 12.0
ORDER BY lft;

-- Get ancestors of a node
SELECT * FROM categories
WHERE lft < 3.5 AND rgt > 4.25
ORDER BY lft;

-- Count children
SELECT COUNT(*) FROM categories
WHERE lft > 1.0 AND rgt < 12.0;

In Production

This isn't theoretical. Fractional nested sets power the taxonomy and menu systems in radish-commerce, handling category hierarchies and drag-and-drop menu builders where move operations need to be fast and non-disruptive.

Why Hasn't Anyone Done This Before?

Honestly, I'm not sure. The individual pieces aren't new — nested sets have been around since the early '90s, and floating-point arithmetic is as old as computing. But I haven't found this combination in Celko's work, Django MPTT, Drupal's taxonomy system, or any implementation I've looked at. They all use integers. They all pay the O(n) write tax.

Maybe it's because "use floats for database keys" sounds like bad advice. And in many contexts, it is. But nested set boundaries aren't keys — they're ordering values. The precision requirements are different, and IEEE 754 has more than enough for this use case.

Try It

The library is MIT-licensed, zero dependencies, and available on npm:

npm install @abeedoo/fractional-nested-sets

Run the benchmarks yourself:

git clone https://github.com/abeedoolabs/fractional-nested-sets
cd fractional-nested-sets
npm install
npm run bench

The conformance test suite runs identical operation sequences across all three implementations (FNS, MPPT, Adjacency List) to verify structural equivalence — because the best way to prove a new approach works is to show it produces the same tree as the proven ones.

Fractional Nested Sets is part of the Abeedoo open-source ecosystem.