SYSTEM OVERVIEW
Dunning & Recovery
What happens when payment fails — from first retry through bad debt flagging to customer recovery.
The dunning brain is a single idempotent function in subscriptions-api. On every trigger it reads all six stores, evaluates account-type rules, computes the minimal diff from current state, and dispatches typed consequence actions. Gather → Compute → Apply → Decide. One code path for all dunning decisions.
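As a rough illustration of what "idempotent" and "minimal diff" mean here, the sketch below gathers a simplified state, computes the target state, and emits only the actions still needed. The `AccountState` fields, the action names, and the `evaluate` function are hypothetical stand-ins; the real function reads six stores whose shapes are not described in this overview.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


# Hypothetical, simplified view of the gathered state (the real brain reads six stores).
@dataclass(frozen=True)
class AccountState:
    bad_debt: bool = False
    banned: bool = False
    subscription_active: bool = True


def compute_desired(current: AccountState, uncollectible: bool) -> AccountState:
    """Compute: derive the target state from the gathered facts."""
    if uncollectible:
        return AccountState(bad_debt=True, banned=True, subscription_active=False)
    return current


def minimal_diff(current: AccountState, desired: AccountState) -> List[str]:
    """Return only the actions needed to move from current to desired."""
    actions = []
    if desired.bad_debt and not current.bad_debt:
        actions.append("flag")
    if desired.banned and not current.banned:
        actions.append("ban")
    if current.subscription_active and not desired.subscription_active:
        actions.append("cancel")
    return actions


def apply_actions(state: AccountState, actions: List[str]) -> AccountState:
    """Apply: fold each action into the state (stands in for real dispatch)."""
    for action in actions:
        if action == "flag":
            state = replace(state, bad_debt=True)
        elif action == "ban":
            state = replace(state, banned=True)
        elif action == "cancel":
            state = replace(state, subscription_active=False)
    return state


def evaluate(current: AccountState, uncollectible: bool) -> Tuple[AccountState, List[str]]:
    """One trigger: compute the target, diff against current, apply the diff."""
    desired = compute_desired(current, uncollectible)
    actions = minimal_diff(current, desired)
    return apply_actions(current, actions), actions
```

Replaying the same trigger against the resulting state produces an empty action list, which is the property that makes duplicate webhook deliveries harmless in this sketch.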
Dunning is event-driven, not cron-driven. Stripe fires webhooks → Pub/Sub → billing-webhooks → POST /client/v4/dunning/events → subscriptions-api. Five event types reach the dunning brain: four can lead to a dunning action, and invoice.finalized is recorded for the audit trail only.
| Stripe Event | Dunning Effect |
|---|---|
| invoice.payment_failed | Evaluate retry: increment retry count, apply account-type retry rules |
| invoice.marked_uncollectible | Flag + ban + cancel coordinated: two parallel Pub/Sub messages |
| invoice.payment_succeeded | Evaluate lifting: check all outstanding invoices, potentially unflag |
| invoice.finalized | Track invoice state: no dunning action, audit trail only |
| charge.refunded | Evaluate impact: reassess dunning status after refund |
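The table above can be read as a routing map from Stripe event type to dunning effect. A minimal sketch follows; the event type strings are the Stripe names from the table, while the effect labels and the `route_event` helper are hypothetical names chosen for illustration.

```python
from typing import Optional

# Stripe event type -> dunning effect label (labels are illustrative, not real handler names).
EVENT_EFFECTS = {
    "invoice.payment_failed": "evaluate_retry",
    "invoice.marked_uncollectible": "flag_ban_cancel",
    "invoice.payment_succeeded": "evaluate_lifting",
    "invoice.finalized": "audit_only",
    "charge.refunded": "evaluate_impact",
}


def route_event(event_type: str) -> Optional[str]:
    """Return the dunning effect for an event, or None if dunning ignores it."""
    return EVENT_EFFECTS.get(event_type)
```

Returning `None` for unknown types is a design choice of this sketch: events outside the five listed simply never reach a dunning decision.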
When Stripe marks an invoice uncollectible, two parallel Pub/Sub messages are emitted: flag-as-bad-debt (operational — triggers cancel, flag, and ban) and dunning-event (audit — records to subs_dunning). The operational path sets bad_debt=true in subs_account, bans the account, and cancels active subscriptions. All three actions are dispatched as a coordinated unit — never partial.
Flag, ban, and cancel happen together or not at all. The consequence orchestration layer dispatches all three as a single atomic decision. Partial state — flagged but not banned, or banned but not cancelled — indicates a bug, not a race.
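The all-or-nothing rule suggests a simple invariant check: an account's three consequences are either all applied or all absent. The sketch below assumes a hypothetical `Consequences` record and also shows the two parallel messages as plain dictionaries. The topic names come from the text above; the payload fields are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Consequences:
    flagged: bool
    banned: bool
    cancelled: bool


def is_partial(c: Consequences) -> bool:
    """True when some, but not all, of flag/ban/cancel are applied (a bug state)."""
    applied = (c.flagged, c.banned, c.cancelled)
    return any(applied) and not all(applied)


def messages_for_uncollectible(account_id: str, invoice_id: str) -> List[Dict[str, str]]:
    """Build the two parallel messages; payload fields are hypothetical."""
    payload = {"account_id": account_id, "invoice_id": invoice_id}
    return [
        {"topic": "flag-as-bad-debt", **payload},  # operational: cancel, flag, ban
        {"topic": "dunning-event", **payload},     # audit: recorded to subs_dunning
    ]
```

A check like `is_partial` could run in monitoring or tests; since the text treats partial state as a bug rather than a race, a positive result would warrant investigation rather than a retry.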
When a payment succeeds on a previously uncollectible invoice, PayBadDebt sets bad_debt=lifting — a transitional state, not clearance. Actual unflagging happens via drift remediation, not directly from the payment event. The time between lifting and cleared is undefined: there is no guaranteed SLA, no scheduled job, and no alert if the transition stalls.
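The lifecycle described above behaves like a three-state machine in which a successful payment can only move a flagged account to lifting, never directly to cleared. A minimal sketch follows, assuming the stored values are `false`, `true`, and `lifting`, and using hypothetical trigger names.

```python
from enum import Enum


class BadDebt(Enum):
    CLEAR = "false"      # assumed encoding of the unflagged state
    FLAGGED = "true"
    LIFTING = "lifting"  # transitional: paid, but not yet cleared


# (current state, trigger) -> next state; trigger names are illustrative.
TRANSITIONS = {
    (BadDebt.CLEAR, "marked_uncollectible"): BadDebt.FLAGGED,
    (BadDebt.FLAGGED, "payment_succeeded"): BadDebt.LIFTING,
    (BadDebt.LIFTING, "drift_remediation"): BadDebt.CLEAR,
}


def next_state(state: BadDebt, trigger: str) -> BadDebt:
    """Return the next state, or the unchanged state if the trigger does not apply."""
    return TRANSITIONS.get((state, trigger), state)
```

Note that no entry maps a payment event to `CLEAR`: in this sketch, as in the text, only drift remediation completes the transition, so an account can remain in `LIFTING` indefinitely.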
| Account Type | Dunning Treatment |
|---|---|
| PayGo Standard | Full dunning — all retry rules, bad debt flagging, ban, cancel |
| Startup | Same as PayGo Standard — full dunning applies |
| Enterprise Legacy | Excluded from dunning — no retry, no flag, no ban |
| Contract | Excluded from flagAsBadDebt — retries may still apply |
| Partner PayGo | Skipped entirely — dunning engine returns early |
| Academic / Nonprofit | Single retry only, then manual review — no automatic escalation |
| Unsupported: cloudflare_ent, msp, partners_pay_go, sfcc | RestrictionUnsupportedAccountType returned — no action taken |
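One way to picture the account-type rules is a lookup from account type to treatment, consulted before any consequence is dispatched. In the sketch below, the four unsupported identifiers are taken from the table; every other key, the `Treatment` enum, and `may_flag_bad_debt` are hypothetical names for illustration.

```python
from enum import Enum
from typing import Optional


class Treatment(Enum):
    FULL = "full"                  # all retry rules, flag, ban, cancel
    EXCLUDED = "excluded"          # no retry, no flag, no ban
    NO_FLAG = "no_flag"            # excluded from flagAsBadDebt; retries may still apply
    SKIPPED = "skipped"            # engine returns early
    SINGLE_RETRY = "single_retry"  # one retry, then manual review
    UNSUPPORTED = "unsupported"    # RestrictionUnsupportedAccountType


TREATMENTS = {
    # Hypothetical identifiers for the named account types.
    "paygo_standard": Treatment.FULL,
    "startup": Treatment.FULL,
    "enterprise_legacy": Treatment.EXCLUDED,
    "contract": Treatment.NO_FLAG,
    "partner_paygo": Treatment.SKIPPED,
    "academic_nonprofit": Treatment.SINGLE_RETRY,
    # Identifiers listed as unsupported in the table.
    "cloudflare_ent": Treatment.UNSUPPORTED,
    "msp": Treatment.UNSUPPORTED,
    "partners_pay_go": Treatment.UNSUPPORTED,
    "sfcc": Treatment.UNSUPPORTED,
}


def treatment_for(account_type: str) -> Optional[Treatment]:
    """Look up the treatment; None means the type is not in this sketch's table."""
    return TREATMENTS.get(account_type)


def may_flag_bad_debt(account_type: str) -> bool:
    """Only full-dunning account types are eligible for automatic bad debt flagging."""
    return treatment_for(account_type) is Treatment.FULL
```

Keeping the rules in data rather than branching logic makes it easy to see at a glance which account types can ever reach the flag, ban, and cancel path.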
PayPal payments use Stripe's send_invoice flow rather than automatic collection. When a PayPal payment fails, Stripe's automatic dunning never fires — the invoice.payment_failed webhook is never emitted. The dunning brain never sees the failure. No retry is scheduled, no bad debt flag is set, no notification is sent to the customer.
Dunning is event-driven: five Stripe webhook types trigger one idempotent brain that reads all six stores. The current system handles PayGo accounts end-to-end. Enterprise, contract, and partner accounts follow different paths — see the account type table above.