Building a multi-signal fingerprint scorer that doesn't lie

What goes into a probabilistic match? Walk through the weights, the gotchas, and the test harness we use to keep accuracy honest.

Daniel Okafor, Principal Engineer · Apr 2, 2026 · 11 min read

Fingerprint attribution sounds like dark magic. It is not. It is five fields, a weighted similarity score, and a confidence threshold. The hard part is the discipline: choosing your weights, holding the line on the threshold, and writing the tests that keep both honest.

The five fields that earn their keep

IP address: the strongest single signal, but carrier-grade NAT on mobile networks puts many devices behind one address. Don't weight it above 0.5.
User-Agent: fine-grained on browsers, coarse on apps. Useful but noisy.
Screen dimensions: a device-class signal. Strong when paired with model hints.
Timezone: narrows geography without revealing it.
Locale: language + region. Cheap to compare, surprisingly discriminating.

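// Assumed shape: the post does not show Signals, so all five fields are taken to be strings.
export interface Signals { ip: string; ua: string; screen: string; timezone: string; locale: string }

// uaSimilarity is not shown in this post; it is expected to return a value in 0..1.
declare function uaSimilarity(a: string, b: string): number;

// Weighted similarity in 0..1: every signal is in 0..1 and the weights sum to 1.0.
export function score(click: Signals, install: Signals): number {
  const ip = click.ip === install.ip ? 1 : 0;
  const ua = uaSimilarity(click.ua, install.ua); // 0..1
  const sc = click.screen === install.screen ? 1 : 0;
  const tz = click.timezone === install.timezone ? 1 : 0;
  const lo = click.locale === install.locale ? 1 : 0;
  return 0.40 * ip + 0.25 * ua + 0.15 * sc + 0.10 * tz + 0.10 * lo;
}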
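The scorer above leaves two pieces unshown: uaSimilarity and the confidence threshold. Here is a minimal, self-contained sketch of how they might fit together. The Signals field types, the token-overlap (Jaccard) version of uaSimilarity, the isMatch helper, and the 0.7 threshold are all assumptions made for illustration, not values from our production scorer. Only the score function and its weights are copied from the snippet above.

```typescript
// Assumed shape of the five signals; all fields are treated as strings here.
interface Signals {
  ip: string;
  ua: string;
  screen: string;
  timezone: string;
  locale: string;
}

// Split a User-Agent into lowercase tokens, dropping empties and duplicates.
function uaTokens(ua: string): string[] {
  const raw = ua.toLowerCase().split(/[\s;()]+/);
  return raw.filter((t, i) => t.length > 0 && raw.indexOf(t) === i);
}

// Hypothetical uaSimilarity: Jaccard overlap of the two token sets, in 0..1.
function uaSimilarity(a: string, b: string): number {
  const ta = uaTokens(a);
  const tb = uaTokens(b);
  if (ta.length === 0 || tb.length === 0) return 0; // no evidence, no credit
  const shared = ta.filter((t) => tb.indexOf(t) !== -1).length;
  return shared / (ta.length + tb.length - shared);
}

// Same weights as the scorer in the post.
function score(click: Signals, install: Signals): number {
  const ip = click.ip === install.ip ? 1 : 0;
  const ua = uaSimilarity(click.ua, install.ua);
  const sc = click.screen === install.screen ? 1 : 0;
  const tz = click.timezone === install.timezone ? 1 : 0;
  const lo = click.locale === install.locale ? 1 : 0;
  return 0.40 * ip + 0.25 * ua + 0.15 * sc + 0.10 * tz + 0.10 * lo;
}

// Hypothetical threshold; the post does not state the real one.
const MATCH_THRESHOLD = 0.7;

// Attribute the install to the click only when the score clears the threshold.
function isMatch(
  click: Signals,
  install: Signals,
  threshold: number = MATCH_THRESHOLD
): boolean {
  return score(click, install) >= threshold;
}
```

One consequence of these weights is worth noting: when the IP differs, the other four signals can contribute at most 0.25 + 0.15 + 0.10 + 0.10 = 0.60. Any threshold above 0.60 therefore makes an IP match mandatory, whatever the other fields say.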

The test harness

We keep two reference datasets: one of known-attributed installs (deterministic match available, fingerprint also computed) and one of known-organic installs (no preceding click). We compute precision and recall every release and refuse to ship a scoring change that regresses either by more than half a percent.
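A release gate of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not our actual harness: the LabeledScore and Metrics types and the evaluate and canShip functions are hypothetical names. "Half a percent" is read here as an absolute drop of 0.005 in either metric. The sketch also counts any above-threshold fingerprint match on a known-attributed install as a true positive, without checking that it picked the same click as the deterministic match.

```typescript
// One reference install: its fingerprint score and which dataset it came from.
interface LabeledScore {
  score: number;       // best fingerprint score against candidate clicks
  attributed: boolean; // true: known-attributed set, false: known-organic set
}

interface Metrics {
  precision: number;
  recall: number;
}

// Compute precision and recall of "score >= threshold" over both datasets.
function evaluate(samples: LabeledScore[], threshold: number): Metrics {
  let tp = 0;
  let fp = 0;
  let fn = 0;
  for (const s of samples) {
    const predicted = s.score >= threshold;
    if (predicted && s.attributed) tp++;
    else if (predicted && !s.attributed) fp++;
    else if (!predicted && s.attributed) fn++;
  }
  return {
    // With nothing predicted (or nothing to find), report 1 by convention.
    precision: tp + fp === 0 ? 1 : tp / (tp + fp),
    recall: tp + fn === 0 ? 1 : tp / (tp + fn),
  };
}

// Assumption: "half a percent" means 0.5 percentage points, i.e. 0.005 absolute.
const MAX_REGRESSION = 0.005;

// Refuse to ship when either metric drops by more than the allowed amount.
function canShip(
  baseline: Metrics,
  candidate: Metrics,
  maxDrop: number = MAX_REGRESSION
): boolean {
  return (
    baseline.precision - candidate.precision <= maxDrop &&
    baseline.recall - candidate.recall <= maxDrop
  );
}
```

Checking both metrics matters because a scoring change can trade one for the other: raising the threshold tends to lift precision while quietly losing recall, and a gate on precision alone would let that through.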

“If your fingerprint scorer doesn't have a precision-recall harness, you don't have a scorer — you have a vibe.”

Tagged: #Engineering #Fingerprinting #Testing
