postgresqlreferral-treesengineeringltree

Modeling Referral Trees With PostgreSQL ltree

Romain Prevost

May 6, 2026 · 5 min read

A multi-level referral program is a tree. Affiliate A recruits B, B recruits C, and a conversion at C may pay commission up the chain to B and A. The data model you pick for that tree decides whether your commission engine runs in single-digit milliseconds or melts under a recursive query at month-end payout.

At Argus Grape we store the entire referral hierarchy in a single PostgreSQL column using the ltree extension. No adjacency-list recursion, no nested-set rebalancing on every insert. Here is why ltree wins for this workload, and exactly which operators and indexes make upline and downline queries fast.

The problem with adjacency lists

The obvious schema gives each affiliate a parent_id. It is clean to write but punishing to read. Finding every ancestor of an affiliate, the upline you must pay, means a recursive CTE that walks one row at a time. Finding an entire downline subtree, every affiliate someone recruited directly or transitively, is the same recursion in the other direction. Both get slower as the tree deepens, and both fight your index instead of using it.

A path is a label, not a join

ltree stores a materialized path as a dotted label, for example 7c3a.91ff.b204, where each segment is an affiliate identifier and position encodes depth. The root affiliate is the first segment; the rightmost segment is the node itself. Instead of joining rows to reconstruct the chain, you read the path directly. Depth is just the number of labels, available through the nlevel function.

Ancestors and descendants in one operator

The whole model pays off with two containment operators. The @> operator means is-an-ancestor-of; the <@ operator means is-a-descendant-of. To find the upline of a node, you ask which paths contain it. To find a downline, you ask which paths are contained by it. Both are set operations over the path column, not row-by-row recursion.

Upline payout: select rows where path @> the converting affiliate's path, ordered by nlevel, to walk from root down to the node.
Downline reporting: select rows where path <@ a given affiliate's path to pull the full subtree in one scan.
Depth-capped commissions: filter with nlevel to pay only N levels of upline, matching your commission policy.
Subtree moves: reparent an affiliate and its descendants by rewriting the path prefix, no per-row updates of parent pointers.

The GIST index is what makes it fast

Containment operators are only cheap because ltree ships a GIST operator class. A GIST index on the path column lets PostgreSQL answer @> and <@ as index scans rather than sequential scans, so an upline lookup touches a handful of pages regardless of how many millions of affiliates exist. Build it once, and the planner uses it for both ancestor and descendant queries. You can layer an lquery pattern search on top for label matching without changing the index.

The right data model turns month-end payout from a recursive crawl into an index scan you can run on every conversion.

How this fits the rest of the pipeline

When a signed conversion webhook lands, we resolve the converting affiliate's path, run a single @> query to fetch the upline, apply depth caps and risk scoring, then write the commission rows. Because the lookup is an index scan, attribution stays fast enough to run inline rather than as a batch job, and shadow-banned nodes are excluded from payout with a flag filter on the same query. The trade-offs are real, paths must stay within ltree length limits and large reparents rewrite many rows, but for read-dominated affiliate trees that trade is firmly in our favor. The full schema, operators, and query shapes are documented in our API reference.

Last updated May 6, 2026.