What It Is
Dots is a payouts platform for online platforms that pay their users — marketplaces, contractor networks, gig apps, esports leagues, content platforms. A merchant calls POST /v2/transfers with a recipient and an amount; Dots picks the right rail (ACH, RTP, PayPal, Venmo, Cash App, Wise for international corridors, gift cards via Tremendous), screens the recipient against sanctions and identity vendors, collects the right tax form for their country, and fires a signed webhook back when settlement clears. The payee onboarding flow runs each user through a per-merchant step list — identity verification, OFAC screen, tax form, payout setup — themed per merchant and resumable via webhook callback.
Architecture
The system is shaped around the fact that a single POST /v2/transfers fans out across four asynchronous parties: the merchant who initiated it, the rail that executes it, the compliance vendor that gates it, and the merchant's own webhook listener that reacts when it lands. Every layer is built to absorb failure on those edges. A Flask API on Fargate scopes everything by api_app_id; wallet writes run at Postgres SERIALIZABLE isolation under Redis distributed locks acquired in sorted order, so two concurrent payouts on the same wallet can't interleave.
Celery workers pick up issue_payout and dispatch through a Protocol-based provider abstraction into the right rail — Orum for ACH/RTP, Wise for international corridors, PayPal Payouts for Venmo and Cash App, Tremendous for gift cards. Provider webhooks come back signed (RSA for the bank rails, HMAC elsewhere) and advance the transfer's seven-state machine. Outbound merchant webhooks delegate to Svix. A daily Beat job diffs internal state against each provider over the last week as a reconciliation safety net. The Onboard widget is a server-driven Flow engine: each merchant declares a step list (onboard → id-verify → ofac → compliance → payout), and the iframe walks the payee through it, each step an idempotent server-side state advance.
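A Protocol-based provider abstraction of the kind described above might look roughly like the following. This is a sketch under assumptions: only the names issue_payout and get_external_transfer appear in the description, and their signatures here are guesses. The Transfer fields, the send method, the rail keys, and InMemoryProvider are hypothetical and exist only to make the example runnable.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Transfer:
    id: int
    rail: str          # hypothetical rail key, e.g. "ach" or "gift_card"
    amount_cents: int


class PayoutProvider(Protocol):
    """Structural interface: any class with these methods qualifies."""

    def send(self, transfer: Transfer) -> str: ...
    def get_external_transfer(self, external_id: str) -> dict: ...


class InMemoryProvider:
    """Fake rail used for illustration; real providers would call a vendor API."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sent: dict[str, dict] = {}

    def send(self, transfer: Transfer) -> str:
        external_id = f"{self.name}-{transfer.id}"
        self.sent[external_id] = {
            "status": "submitted",
            "amount_cents": transfer.amount_cents,
        }
        return external_id

    def get_external_transfer(self, external_id: str) -> dict:
        return self.sent[external_id]


# Rail -> provider registry (assumed shape; ACH and RTP share one provider).
_orum = InMemoryProvider("orum")
PROVIDERS: dict[str, PayoutProvider] = {
    "ach": _orum,
    "rtp": _orum,
    "gift_card": InMemoryProvider("tremendous"),
}


def issue_payout(transfer: Transfer) -> str:
    """Dispatch a transfer to the provider registered for its rail."""
    provider = PROVIDERS.get(transfer.rail)
    if provider is None:
        raise ValueError(f"no provider registered for rail {transfer.rail!r}")
    return provider.send(transfer)
```

Because Protocol typing is structural, a new rail only needs a class with matching methods and a registry entry; the dispatch code does not change.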
Technical Deep Dive
1. Withdrawable Balance
A user has $200 in their Dots wallet — $150 they topped up with a credit card yesterday, $50 a friend sent them through a peer-to-peer transfer on the platform. They click "Withdraw to bank." Dots offers $50. The card-funded $150 is blocked from external payout but still spendable inside the platform: they can pay an invoice with it, or send it to another user. The split isn't a time-based hold — the user can spend the full $200 internally tomorrow even if no one else sends them money.
The naive "wallet has $N, withdraw $N" design opens a fraud loop: deposit with a stolen card, pay out immediately, and when the chargeback hits days later, Dots eats the loss. The obvious fix is a time-based hold à la Stripe, but that is bad UX: most users want to spend refilled funds on P2P or invoice payments, not withdraw them externally, and a blanket hold blocks both.
To solve this, Dots classifies each Transaction by source. A single treasury wallet sits at the top of the system (one User row, username='dots'); every refill flows treasury → user, every payout flows user → treasury. Wallet.get_withdrawable_balance() doesn't trust the wallet's amount column — it re-derives from the transaction journal each call, walking transactions in ascending id order and classifying each credit into one of two buckets: deposited (source is the treasury, potentially fraud-tainted) and received (source is another user's wallet).
Outbound debits drain received first, then deposited. Withdrawable balance = received. Refilled money can sit indefinitely — spendable on P2P or invoice payments — but not externally paid out until equal value arrives from another user. A Redis snapshot of (last_id, deposited, received) lets warm calls resume from the last seen id and walk only the journal's tail; the wallet's amount is essentially a denormalized side-effect of the same journal.
The taxonomy is fragile, and reversals are the seam. An edge case surfaced where a reversed payout — funds returned to a wallet after a failed transfer — was being classified as a deposit (the source was the treasury), pushing the user's own withdrawable money into the non-withdrawable pool. Adding transaction.reverses is None to the deposit check fixed it. Every new Transaction type — refill fees, FX legs, promo credits — has to be slotted into this classification explicitly. There's no time-based dimension to coordinate; one method, one classifier.
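The classification described in this section can be sketched as a single journal walk. This is an illustrative reconstruction, not the actual Wallet.get_withdrawable_balance(): the Txn fields, the walk_journal name, and the snapshot tuple layout are assumptions, amounts are in cents, and the journal is assumed never to overdraw a wallet.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

TREASURY = "dots"  # the single treasury wallet


@dataclass
class Txn:
    id: int
    source: str
    dest: str
    amount: int                     # cents
    reverses: Optional[int] = None  # id of the transaction this one reverses


def walk_journal(
    wallet: str,
    journal: Iterable[Txn],
    snapshot: tuple[int, int, int] = (0, 0, 0),
) -> tuple[int, int, int]:
    """Return (last_id, deposited, received), resuming from a prior snapshot."""
    last_id, deposited, received = snapshot
    for t in sorted(journal, key=lambda t: t.id):
        if t.id <= last_id:
            continue  # already folded into the snapshot
        if t.dest == wallet:
            # Treasury credits are deposits unless they reverse an earlier payout.
            if t.source == TREASURY and t.reverses is None:
                deposited += t.amount
            else:
                received += t.amount
        elif t.source == wallet:
            # Debits drain the received bucket first, then deposited.
            from_received = min(received, t.amount)
            received -= from_received
            deposited -= t.amount - from_received
        last_id = t.id
    return last_id, deposited, received


# The $200 wallet from the example: $150 card refill, $50 from a friend.
journal = [
    Txn(1, TREASURY, "alice", 15000),
    Txn(2, "bob", "alice", 5000),
]
_, deposited, received = walk_journal("alice", journal)
# Withdrawable balance is the received bucket: 5000 cents, i.e. $50.
```

Passing the previous return value back in as snapshot mirrors the warm-call path: only transactions with an id above last_id are walked.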
2. Payout Reconciliation
A merchant calls POST /v2/transfers to pay a creator $500. From their side, the transfer either lands in the creator's bank account or doesn't — and a transfer.completed webhook from Dots tells them which, exactly once. From Dots' side, the answer is harder: the $500 leaves through one of ~10 third-party rails, and each rail reports settlement on its own quirky timeline.
The asymmetry is the hard part. Some rails return success synchronously (gift cards via Tremendous). Some confirm via webhook hours later (Wise, Orum, Method, Increase, PayPal Payouts, AirTM, Tilled), each authenticated differently: RSA signatures, HMAC, IP allowlists, no two alike. And some send no terminal signal at all: ACH-via-Orum fires only a submitted webhook, then nothing. The seven-state machine has to look the same to the merchant either way.
The naive design is to trust the primary rail signal. It fails in two ways: webhooks drop or arrive twice (every provider has its own delivery quirks and outage profile), and the no-terminal-signal rails leave transfers stuck in pending indefinitely. So the state machine has three feed-ins instead of one. The rail signal handles the happy path. An hourly Beat job, complete_pending_transfers, force-completes any pending transfer whose complete_at has passed; the Orum webhook writes that ETA from its estimated_funds_delivery_date, so ACH transfers settle on schedule even with no settlement webhook. And a daily Beat job, audit_transfers, walks every transfer touched in the last seven days, calls provider.get_external_transfer() for each, and posts any drift to Slack for a human to review (only AirTM auto-corrects; everything else escalates).
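The two scheduled feed-ins can be sketched as plain functions over an in-memory list. This is a simplified illustration under assumptions: the real jobs run as Celery Beat tasks against the database, the Transfer fields other than status and complete_at are hypothetical, and fetch_status is a made-up stand-in for whatever provider.get_external_transfer() returns.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, Optional


@dataclass
class Transfer:
    id: int
    status: str
    updated_at: datetime
    complete_at: Optional[datetime] = None  # ETA written by the rail webhook


def complete_pending_transfers(
    transfers: Iterable[Transfer], now: datetime
) -> list[int]:
    """Hourly sweep: complete pending transfers whose ETA has passed."""
    completed = []
    for t in transfers:
        if (
            t.status == "pending"
            and t.complete_at is not None
            and t.complete_at <= now
        ):
            t.status = "completed"
            completed.append(t.id)
    return completed


def audit_transfers(
    transfers: Iterable[Transfer],
    now: datetime,
    fetch_status: Callable[[Transfer], str],
    window: timedelta = timedelta(days=7),
) -> list[tuple[int, str, str]]:
    """Daily audit: report (id, internal, external) where the two disagree.

    Only transfers touched within the window are checked. Drift is returned
    rather than corrected, matching the escalate-to-a-human policy.
    """
    drift = []
    for t in transfers:
        if now - t.updated_at > window:
            continue
        external = fetch_status(t)
        if external != t.status:
            drift.append((t.id, t.status, external))
    return drift
```

Transfers with no complete_at are left alone by the sweep, so a rail that never reported an ETA is caught only by the audit.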
State advances are guarded by a Redis transfer:{id}:status_lock and a no_clobber=True flag, added after a duplicate-webhook incident in which two deliveries of the same event each called set_status('completed') on the same transfer — both fired the user's settlement notification, both fired the merchant's outbound flow.updated webhook, both ran the per-completion fee charge. The fix double-checks status before and after acquiring the Redis lock; the second arrival sees self.status == 'completed' either way and silently no-ops.
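The check-lock-recheck pattern can be illustrated with a small in-process sketch. It assumes no_clobber means "do nothing if the transfer is already in the target status", uses a threading.Lock as a stand-in for the Redis transfer:{id}:status_lock, and counts side effects with a hypothetical side_effects field in place of the real notification, webhook, and fee charge.

```python
import threading


class Transfer:
    def __init__(self, transfer_id: int) -> None:
        self.id = transfer_id
        self.status = "pending"
        self.side_effects = 0  # stand-in for notification, webhook, fee charge
        # Stand-in for the Redis lock keyed transfer:{id}:status_lock.
        self._status_lock = threading.Lock()

    def set_status(self, new_status: str, no_clobber: bool = True) -> bool:
        """Advance status once; return False if this call was a no-op."""
        # First check: cheap early exit without taking the lock.
        if no_clobber and self.status == new_status:
            return False
        with self._status_lock:
            # Second check: another delivery may have won the race while
            # this one was waiting for the lock.
            if no_clobber and self.status == new_status:
                return False
            self.status = new_status
            self.side_effects += 1
            return True


# Two deliveries of the same webhook race to complete the same transfer.
transfer = Transfer(42)
workers = [
    threading.Thread(target=transfer.set_status, args=("completed",))
    for _ in range(2)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
# Exactly one delivery ran the side effects; the other returned False.
```

The check inside the lock is the one that guarantees correctness; the check before it only avoids lock contention for the common duplicate case.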
