Sub-10ms Affiliate Redirects at the Edge: Architecture

Romain Prevost
· 4 min read

An affiliate redirect sits directly in the conversion path. A visitor clicks a partner link, hits your tracker, and only then lands on the merchant page. Every millisecond you add there is latency the merchant did not ask for and the affiliate cannot see. At scale, slow redirects quietly suppress conversions and make partners blame your link, not their own funnel.

Our budget for the redirect hop is a hard 10 milliseconds at p99, measured at the edge before the 302 leaves the building. This is how we hit it without dropping a single click record, and what we deliberately gave up to get there.

The hot path does almost nothing

The redirect handler is written in Go and deployed to edge points of presence close to the click. When a request arrives, the handler decodes the short link slug, looks up the destination, stamps an Argus CID for cookieless attribution, and returns a 302. That is the entire synchronous path. No relational query, no commission math, no fraud scoring, no webhook fan-out. All of that is real work, but none of it belongs between the click and the redirect.

Destination lookups resolve against Redis, not PostgreSQL: each edge POP keeps a warm replica holding the slug-to-destination map plus per-link state such as active, paused, or shadow-banned. A lookup is a single key fetch on a connection pulled from a pre-warmed pool that is sized at boot and reused for the life of the process.
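To make the shape of that synchronous path concrete, here is a minimal sketch, not our production handler. It assumes a few things the text above does not specify: the `Store` interface and its in-memory `mapStore` stand in for the Redis replica, the slug is simply the URL path, the CID is a random hex string carried in a `cid` query parameter, and `defaultDest` is a hypothetical configured fallback.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// Store abstracts the edge-local slug lookup. The article uses a Redis
// replica; this sketch uses an in-memory map so it runs standalone.
type Store interface {
	Lookup(slug string) (dest string, ok bool)
}

type mapStore map[string]string

func (m mapStore) Lookup(slug string) (string, bool) {
	dest, ok := m[slug]
	return dest, ok
}

// defaultDest is a hypothetical configured fallback for unknown slugs.
const defaultDest = "https://merchant.example/"

// newCID returns a random hex ID. The real Argus CID format is not described
// here, so this is only a placeholder.
func newCID() string {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		return "unknown" // never block the redirect on ID generation
	}
	return hex.EncodeToString(b)
}

// stampCID adds the CID as a query parameter. The name "cid" is an assumption.
func stampCID(dest, cid string) string {
	u, err := url.Parse(dest)
	if err != nil {
		return dest // redirect unstamped rather than fail
	}
	q := u.Query()
	q.Set("cid", cid)
	u.RawQuery = q.Encode()
	return u.String()
}

// resolveTarget is the whole synchronous path: slug, lookup, stamp.
func resolveTarget(s Store, path, cid string) string {
	slug := strings.Trim(path, "/")
	dest, ok := s.Lookup(slug)
	if !ok {
		dest = defaultDest
	}
	return stampCID(dest, cid)
}

func redirectHandler(s Store) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, resolveTarget(s, r.URL.Path, newCID()), http.StatusFound)
	}
}

func main() {
	h := redirectHandler(mapStore{"abc123": "https://merchant.example/deal?ref=partner"})
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/abc123", nil))
	fmt.Println(rec.Code, rec.Header().Get("Location"))
}
```

Keeping `resolveTarget` free of I/O beyond the single `Lookup` call makes it easy to see that nothing slow can sneak into the hot path.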

Writes happen after the redirect, not before

We never block the visitor on a durable write. Once the 302 is queued, the handler hands the click event to an async stream writer over a buffered channel and returns immediately. A pool of background workers drains that channel, batches events, and appends them to a durable log. From there the heavy machinery runs out of band.
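The hand-off can be sketched as a buffered channel with a non-blocking send on one side and batching workers on the other. Everything named here is illustrative: `ClickEvent`, `StreamWriter`, and the `Sink` callback (a stand-in for the durable log append) are assumptions, as are the buffer, batch, and flush-interval values.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// ClickEvent is a hypothetical event shape.
type ClickEvent struct {
	Slug string
	CID  string
	At   time.Time
}

// Sink stands in for the append to a durable log.
type Sink func(batch []ClickEvent)

type StreamWriter struct {
	ch chan ClickEvent
	wg sync.WaitGroup
}

func NewStreamWriter(buf, workers, batchSize int, flushEvery time.Duration, sink Sink) *StreamWriter {
	w := &StreamWriter{ch: make(chan ClickEvent, buf)}
	for i := 0; i < workers; i++ {
		w.wg.Add(1)
		go w.drain(batchSize, flushEvery, sink)
	}
	return w
}

// Enqueue never blocks the caller. It reports false when the buffer is full.
func (w *StreamWriter) Enqueue(e ClickEvent) bool {
	select {
	case w.ch <- e:
		return true
	default:
		return false
	}
}

// drain batches events and flushes on size, on a timer, or on shutdown.
func (w *StreamWriter) drain(batchSize int, flushEvery time.Duration, sink Sink) {
	defer w.wg.Done()
	batch := make([]ClickEvent, 0, batchSize)
	flush := func() {
		if len(batch) > 0 {
			sink(batch)
			batch = make([]ClickEvent, 0, batchSize)
		}
	}
	ticker := time.NewTicker(flushEvery)
	defer ticker.Stop()
	for {
		select {
		case e, ok := <-w.ch:
			if !ok { // channel closed: flush what is left and exit
				flush()
				return
			}
			batch = append(batch, e)
			if len(batch) >= batchSize {
				flush()
			}
		case <-ticker.C:
			flush()
		}
	}
}

// Close stops intake and waits for the workers to flush.
func (w *StreamWriter) Close() {
	close(w.ch)
	w.wg.Wait()
}

func main() {
	var mu sync.Mutex
	total := 0
	w := NewStreamWriter(1024, 2, 100, 50*time.Millisecond, func(b []ClickEvent) {
		mu.Lock()
		total += len(b)
		mu.Unlock()
	})
	for i := 0; i < 250; i++ {
		w.Enqueue(ClickEvent{Slug: "abc123", CID: fmt.Sprint(i), At: time.Now()})
	}
	w.Close()
	fmt.Println("events written:", total)
}
```

The `select` with a `default` branch is what keeps the handler's cost at a single channel send, regardless of how the workers are doing.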

That deferred pipeline is where the expensive work lives:

  • Fraud signals: IP reputation, user-agent and TLS fingerprint checks, and click-velocity windows that feed shadow-ban decisions
  • Attribution: binding the click to a CID and, on conversion, walking the PostgreSQL ltree referral tree for multi-level commission splits
  • Webhook dispatch: signed, idempotent conversion callbacks to the merchant, retried with backoff
  • Aggregation: rollups for the dashboards partners actually watch, like click-to-signup and reversal rates
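As one example of the kind of check that fits comfortably in the deferred pipeline, a click-velocity window can be sketched as a sliding window of timestamps per key. This is an illustration rather than our scoring logic: the `VelocityWindow` type, the millisecond timestamps, the choice of key, and the thresholds are all assumptions, and it expects to be driven by a single worker with events in time order per key.

```go
package main

import "fmt"

// VelocityWindow flags keys (for example an IP or an affiliate ID) that
// exceed a click limit inside a sliding window. Not safe for concurrent use.
type VelocityWindow struct {
	windowMs int64
	limit    int
	seen     map[string][]int64
}

func NewVelocityWindow(windowMs int64, limit int) *VelocityWindow {
	return &VelocityWindow{windowMs: windowMs, limit: limit, seen: make(map[string][]int64)}
}

// Observe records a click at atMs (Unix milliseconds) and reports whether
// the key now has more than limit clicks inside the window.
func (v *VelocityWindow) Observe(key string, atMs int64) bool {
	cutoff := atMs - v.windowMs
	ts := v.seen[key]
	// Skip timestamps that have aged out of the window.
	i := 0
	for i < len(ts) && ts[i] <= cutoff {
		i++
	}
	ts = append(ts[i:], atMs)
	v.seen[key] = ts
	return len(ts) > v.limit
}

func main() {
	v := NewVelocityWindow(1000, 3) // at most 3 clicks per second per key
	for _, at := range []int64{0, 100, 200, 300, 5000} {
		fmt.Printf("t=%dms flagged=%v\n", at, v.Observe("203.0.113.7", at))
	}
}
```

Because this runs after the redirect, the window can hold as much history as the signal needs without any effect on redirect latency.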

Fail-open is a design decision, not a fallback

The redirect must succeed even when our own systems are degraded. If the Redis replica is slow or the lookup misses, the handler does not retry into a timeout. It falls back to a cached destination or a configured default and still returns the 302 inside budget. If the stream writer's channel is full because a downstream consumer stalled, we divert the event to a spillover buffer rather than apply backpressure to the visitor.
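Both fail-open behaviors can be sketched in a few lines. The assumptions are worth spelling out: `lookupFn` is a hypothetical stand-in for the Redis read, `missLookup` and `slowLookup` are stubs that simulate a miss and a slow replica, the local cache is a plain map, and the final "dropped" outcome when the spillover is also full is a choice made for this sketch, not something stated above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// lookupFn stands in for the Redis read (hypothetical signature).
type lookupFn func(ctx context.Context, slug string) (string, error)

var errMiss = errors.New("slug not found")

// missLookup simulates a lookup miss.
func missLookup(ctx context.Context, slug string) (string, error) { return "", errMiss }

// slowLookup simulates a replica that answers far outside the budget.
func slowLookup(ctx context.Context, slug string) (string, error) {
	select {
	case <-time.After(50 * time.Millisecond):
		return "https://merchant.example/fresh", nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// resolveFailOpen tries the primary lookup under a deadline, then a local
// cache, then a default. It always returns a destination, never an error.
func resolveFailOpen(primary lookupFn, cache map[string]string, def, slug string, budget time.Duration) string {
	ctx, cancel := context.WithTimeout(context.Background(), budget)
	defer cancel()
	type result struct {
		dest string
		err  error
	}
	done := make(chan result, 1) // buffered so a late lookup can still exit
	go func() {
		d, err := primary(ctx, slug)
		done <- result{d, err}
	}()
	select {
	case r := <-done:
		if r.err == nil {
			return r.dest
		}
	case <-ctx.Done(): // over budget: do not retry, fall through
	}
	if d, ok := cache[slug]; ok {
		return d
	}
	return def
}

// enqueueOrSpill tries the main queue, then the spillover, without blocking.
func enqueueOrSpill(queue, spill chan<- string, ev string) string {
	select {
	case queue <- ev:
		return "queued"
	default:
	}
	select {
	case spill <- ev:
		return "spilled"
	default:
		return "dropped"
	}
}

func main() {
	cache := map[string]string{"abc123": "https://merchant.example/cached"}
	fmt.Println(resolveFailOpen(slowLookup, cache, "https://merchant.example/", "abc123", 2*time.Millisecond))

	queue := make(chan string, 1)
	spill := make(chan string, 1)
	for i := 0; i < 3; i++ {
		fmt.Println(enqueueOrSpill(queue, spill, fmt.Sprintf("click-%d", i)))
	}
}
```

Note that neither function can return an error to the handler. That is the point: every failure mode resolves to some destination and some non-blocking outcome for the event.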

A dropped analytics event costs us a row. A blocked redirect costs the merchant a sale. We never confuse the two.

This is also why shadow-bans are silent. A flagged click still gets a clean 302 to the real destination. The visitor and the fraudulent affiliate see normal behavior; the difference is that the event is quarantined in the async pipeline and never credited. Enforcement lives off the hot path, so it can be as sophisticated as we want without ever touching redirect latency.
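In code terms, the ban decision is simply never an input to the redirect. It only matters when a worker decides what to do with the event. A minimal sketch of that split, with a hypothetical `ClickEvent` shape, a `Ledger` type invented for illustration, and a banned set keyed by affiliate ID:

```go
package main

import "fmt"

// ClickEvent is a hypothetical event shape; field names are assumptions.
type ClickEvent struct {
	AffiliateID string
	CID         string
}

// Ledger separates clicks eligible for credit from quarantined ones.
type Ledger struct {
	Credited    []ClickEvent
	Quarantined []ClickEvent
}

// Record runs in the async pipeline, after the visitor already has a 302.
// Flagged affiliates keep generating events; those events are just never credited.
func (l *Ledger) Record(e ClickEvent, banned map[string]bool) {
	if banned[e.AffiliateID] {
		l.Quarantined = append(l.Quarantined, e)
		return
	}
	l.Credited = append(l.Credited, e)
}

func main() {
	banned := map[string]bool{"aff-bad": true}
	var l Ledger
	l.Record(ClickEvent{AffiliateID: "aff-good", CID: "c1"}, banned)
	l.Record(ClickEvent{AffiliateID: "aff-bad", CID: "c2"}, banned)
	fmt.Println("credited:", len(l.Credited), "quarantined:", len(l.Quarantined))
}
```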

What the budget bought

Keeping the synchronous path to a slug decode, one Redis read, a CID stamp, and a channel send is what holds p99 under 10ms. The cost is eventual consistency: a brand-new link can take a beat to propagate to every edge replica, and click counts settle a few seconds behind real time. For redirects, that trade is obviously correct. The visitor gets sent where they are going, and the truth about who earned what is reconciled durably, a moment later, where it is safe to be slow.

Last updated May 20, 2026.