Q2 2026 TECH SPEC
Honest Invoicing
Credit notes replace metadata. The brain replaces racing handlers. The invoice amount becomes the truth. Users pay what they see — one button, the invoice they recognize.
| Status | DRAFT — contingent on credit note surface audit (M1) and brain expansion Phase 1 |
|---|---|
| Driver | Jared Goguen |
| Reviewers | FinTech Engineering Review — Mar 30 |
| Repos | subscriptions-api (credit notes, brain destination), billing-webhooks (brain source) |
| Staffing | 2 engineers, 1 quarter |
| Dependencies | Brain Phase 2 (flag lifting) enables credit note M3 (pay-original-invoice) |
When payment fails and the invoice goes uncollectible, the user ends up with two invoices in their dashboard: the original they recognize ($25.47) and a synthetic "consolidated" one they don't ($19.66). Two buttons, one confusing. Meanwhile, four handlers in billing-webhooks race to make dunning decisions with partial state, and five stores disagree on what the dunning state actually is.
The invoice shows $25.47. The real debt is $19.66, hidden in bad_debt_amount metadata. Five Stripe metadata keys govern bad-debt state — metadata is not accounting.
Two API endpoints return overlapping data. GetBillingBadDebt returns prorated amounts. GetAccountBillingHistory returns original amounts. The dashboard shows both.
Four handlers in billing-webhooks make dunning decisions with partial state. They race with the forwarding path that sends events to the engine. Decisions split across two services with no coordination.
subs_dunning, subs_customer, subs_account, Stripe metadata, and IAPI ban state — five stores hold dunning-related state that must agree. They don't always. 90 tickets in 12 weeks, stable volume.
Two forward changes, one retroactive migration. First, issue a credit note for unused service at uncollectible time — the invoice amount adjusts to the real debt. Second, move dunning logic into the brain — one service, full state, coordinated decisions. Third, migrate 73,517 existing accounts from metadata to credit notes. Together: the invoice shows the truth, and the system responds correctly when it's paid.
| Moment | Today | Honest |
|---|---|---|
| T0 | Invoice created: $25.47 ($20 Pro + $5.47 Workers). Payment fails. | Same — both systems agree. |
| T2 | Prorate and encode to metadata. Write bad_debt_amount=$19.66. Invoice still shows $25.47. | Issue credit note for $5.81 (9 unused days of $20/mo Pro). Invoice adjusts to $19.66. |
| T3 | Dead period. No service. No payment. | Same — both systems agree. |
| T4 | Create consolidated invoice summing bad_debt_amount values. Link via metadata. User sees two invoices. | User pays the original invoice at $19.66. One button. The invoice they recognize. |
| | Today | After |
|---|---|---|
| What user sees | $32.26 on an invoice they've never seen | Their $200 invoice → $167.74 credit → pay $32.26 |
| Payment target | Synthetic consolidated Stripe invoice | Original Stripe invoice (amount_remaining adjusted) |
| Audit trail | consolidated → bad_debt_invoices → originals | invoice → credit_note → payment |
| Drift detection | bad_debt_amount correct across 4 stores? | Credit note exists? amount_remaining matches proration? |
| Revenue recognition | Custom bad_debt_amount handler + CN pipeline | CN pipeline only (credit_note_processor.go) |
| Automation | Partial — flag lift may need manual intervention | Full — pay → lift → new sub (Brain Phase 2) |
Three credit note issuance paths already exist in subscriptions-api, plus an approval pipeline and revenue recognition processor. We're adding a fourth path — triggered when an invoice is marked uncollectible.
Classify each line item as in-advance (fixed-cost, future period) or in-arrears (usage, past consumption). Usage charges get no credit; they're owed in full.
For each in-advance item: (days_remaining / total_days) × amount. Sum to get the credit note total.
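The classification and proration steps can be sketched in Go as follows. This is a minimal illustration, not the subscriptions-api implementation: the `LineItem` type, the function names, integer cents, and half-up rounding are all assumptions. The 31-day period is also an assumption, inferred because 9 unused days of a $20 plan yields the $5.81 credit from the timeline only when the period is 31 days long.

```go
package main

import "fmt"

// LineItem is a hypothetical, simplified invoice line item.
// Amounts are integer cents to avoid floating-point drift.
type LineItem struct {
	Description string
	AmountCents int64
	InAdvance   bool // true: fixed-cost, future period; false: usage billed in arrears
}

// prorateCents returns amount × daysRemaining / totalDays, rounded half up.
// The rounding rule is an assumption; the spec does not state one.
func prorateCents(amountCents, daysRemaining, totalDays int64) int64 {
	if totalDays <= 0 || daysRemaining <= 0 {
		return 0
	}
	if daysRemaining > totalDays {
		daysRemaining = totalDays
	}
	return (2*amountCents*daysRemaining + totalDays) / (2 * totalDays)
}

// creditNoteTotalCents sums the prorated credit over in-advance items only.
// In-arrears (usage) items are skipped because they are owed in full.
func creditNoteTotalCents(items []LineItem, daysRemaining, totalDays int64) int64 {
	var total int64
	for _, it := range items {
		if !it.InAdvance {
			continue
		}
		total += prorateCents(it.AmountCents, daysRemaining, totalDays)
	}
	return total
}

func main() {
	items := []LineItem{
		{Description: "Pro plan", AmountCents: 2000, InAdvance: true},
		{Description: "Workers usage", AmountCents: 547, InAdvance: false},
	}
	credit := creditNoteTotalCents(items, 9, 31)
	fmt.Printf("credit=%d remaining=%d\n", credit, 2547-credit)
}
```

With the $25.47 invoice from the timeline, this yields a 581-cent credit and 1966 cents remaining, matching the $5.81 and $19.66 figures.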
Stripe API: POST /v1/credit_notes. Reason: order_change. Line items reference originals. Tax credited proportionally.
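As a rough sketch, the request body for that call could be assembled like this. The parameter names (`invoice`, `reason`, `lines[n][type]`, `lines[n][invoice_line_item]`, `lines[n][amount]`) reflect our reading of Stripe's credit note API and should be checked against the current reference. The `CreditLine` type, the helper name, and the example IDs are hypothetical. Tax handling is deliberately left out.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// CreditLine pairs an original invoice line item ID with the amount to credit, in cents.
// Hypothetical type for illustration.
type CreditLine struct {
	InvoiceLineItemID string
	AmountCents       int64
}

// creditNoteForm builds a form-encoded body for POST /v1/credit_notes.
// Each credit line references the original invoice line item it adjusts.
func creditNoteForm(invoiceID string, lines []CreditLine) url.Values {
	form := url.Values{}
	form.Set("invoice", invoiceID)
	form.Set("reason", "order_change")
	for i, l := range lines {
		prefix := "lines[" + strconv.Itoa(i) + "]"
		form.Set(prefix+"[type]", "invoice_line_item")
		form.Set(prefix+"[invoice_line_item]", l.InvoiceLineItemID)
		form.Set(prefix+"[amount]", strconv.FormatInt(l.AmountCents, 10))
	}
	return form
}

func main() {
	// Placeholder IDs; real ones come from the uncollectible invoice.
	form := creditNoteForm("in_example", []CreditLine{
		{InvoiceLineItemID: "il_example", AmountCents: 581},
	})
	fmt.Println(form.Encode())
}
```

Building the body as plain `url.Values` keeps the sketch independent of any particular Stripe client library version.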
amount_remaining adjusts from $25.47 to $19.66. That number is the debt. One button. Payable.
Reinstatement: user sees original invoice at adjusted amount, pays via Unified Checkout, brain lifts the bad-debt flag (Phase 2), new subscription starts with a fresh billing cycle anchor.
11 surfaces read or write bad debt metadata. Each one changes or disappears. M1 (surface audit) must produce a complete manifest before implementation begins.
| Surface | R/W | What Changes |
|---|---|---|
| GetBillingBadDebt API | R | Creates consolidated invoices from bad_debt_amount. Replaced by credit-note-adjusted originals. |
| GetAccountBillingHistory | R | Displays all invoices including consolidated. Consolidated invoices disappear. |
| stripe_invoice_utils.go | R | Uses bad_debt_paid to transform uncollectible → OPEN/CLOSED. Needs credit note awareness. |
| Calculator.GetBadDebtAmount | R | Priority chain reading 5 metadata keys. Replaced by credit note amount_remaining. |
| createConsolidatedInvoice | W | Dedup via bad_debt_invoices metadata. Entire path deleted. |
| FlagAsBadDebt handler | W | Writes bad_debt_amount at uncollectible time. Replaced by credit note issuance. |
| CheckFlagAsBadDebt / badDebtLift | R | Reads bad_debt_paid for lifting. Needs credit-note-aware lift trigger. |
| Drift detection (BQ) | R | Checks bad_debt_amount correctness. Replaced by credit note existence + amount check. |
| HealAccount / applystripe | R | Reconciliation must understand credit-note-adjusted invoices are not drifting. |
| Dashboard payment flow | R | Routes to consolidated vs original. With credit notes, always routes to original. |
| Revenue recognition | R | credit_note_processor.go already handles CNs. New issuance path feeds into existing pipeline. |
Honest invoicing changes what the payment target is. The dunning brain changes how the system responds when it's paid. Four handlers in billing-webhooks make dunning decisions with partial state. The work: move them into subscriptions-api, pin their behavior with tests, reshape into the brain's architecture, delete the originals.
- 4 handlers to move from billing-webhooks
- 20 test scenarios across flagging, lifting, and amount calculation
- 6 state stores the brain reads for full context
- 3 RPCs that become direct calls (hop eliminated)
Phase 1 (Copy + Test): Move 4 handlers + supporting functions into subs-api. Three RPCs become direct calls. 20 integration tests pin every branch across flagging, lifting, and amount calculation. Copying and testing overlap rather than running back to back.
Phase 2 (Redesign): Read-evaluate-delta-decide. Idempotent. Incremental: flagging first, then lifting, then account type rules. Largest phase.
Phase 3 (Delete): Code audit, verify test coverage, delete originals. billing-webhooks becomes a pure forwarder.
The redesign target. Old handlers make isolated decisions with partial state. The brain reads everything, evaluates per-type rules, computes the minimal delta, and produces one coordinated set of consequences.
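The read-evaluate-delta-decide shape can be illustrated with a deliberately tiny Go sketch. Everything here is hypothetical: the `State` fields, the `Action` names, and the single rule "flag if and only if there is unpaid debt" stand in for the real per-type rules. The point is the idempotency property: once the emitted actions are applied, a second pass produces no further actions.

```go
package main

import "fmt"

// State is a hypothetical snapshot of what the brain reads before deciding.
type State struct {
	HasUnpaidDebt  bool
	BadDebtFlagged bool
}

// Action is one consequence the brain emits. Names are illustrative.
type Action string

const (
	ActionFlag Action = "flag_bad_debt"
	ActionLift Action = "lift_bad_debt"
)

// desiredFlag evaluates the (placeholder) rule against the snapshot.
func desiredFlag(s State) bool { return s.HasUnpaidDebt }

// decide computes the minimal delta between current and desired state.
// It returns nil when the two already agree.
func decide(s State) []Action {
	want := desiredFlag(s)
	switch {
	case want && !s.BadDebtFlagged:
		return []Action{ActionFlag}
	case !want && s.BadDebtFlagged:
		return []Action{ActionLift}
	default:
		return nil
	}
}

// apply returns the state after the actions have been carried out.
func apply(s State, actions []Action) State {
	for _, a := range actions {
		switch a {
		case ActionFlag:
			s.BadDebtFlagged = true
		case ActionLift:
			s.BadDebtFlagged = false
		}
	}
	return s
}

func main() {
	s := State{HasUnpaidDebt: false, BadDebtFlagged: true} // debt paid, flag still set
	first := decide(s)
	s = apply(s, first)
	second := decide(s) // nothing left to do
	fmt.Println(first, len(second))
}
```

Because the decision is a pure function of the snapshot, replayed or duplicated webhook events converge on the same result instead of racing.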
| Work | Estimate | Staffing | Notes |
|---|---|---|---|
| Phases 1–3 (critical path) | 10–14 weeks | 1 engineer | Sequential. Copy + Test → Redesign → Delete. |
| Segment Rules | 3–4 weeks | 1 engineer | Discovery parallel from w1. Impl after Phase 3. |
| PayPal spike | 1–2 weeks | 1 engineer | Parallel. Bounds brain scope early. |
Separate workstream, starts in parallel with brain expansion. Audit all account type handling scattered across the codebase, define per-type dunning rules with Product and Finance, implement in the brain. Unknown types error loudly — no silent PayGo defaults.
- 12 account types with full dunning treatment today
- 16 types excluded from dunning (need explicit rules)
- 0 tolerance for unknown types falling through silently
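
One way to make unknown types fail loudly is a rule lookup that returns an error instead of a default. The sketch below is an assumption about shape only: the `DunningRule` fields, the registry contents, and the `enterprise` entry are placeholders, and the real catalog comes out of the discovery work with Product and Finance.

```go
package main

import (
	"errors"
	"fmt"
)

// DunningRule is a hypothetical per-type rule. Real rules will carry more than one field.
type DunningRule struct {
	Dunnable bool
}

// ErrUnknownAccountType is returned for any type without an explicit rule.
var ErrUnknownAccountType = errors.New("unknown account type")

// rules is a placeholder registry; the entries are illustrative, not the real catalog.
var rules = map[string]DunningRule{
	"paygo":      {Dunnable: true},
	"enterprise": {Dunnable: false}, // explicitly excluded, not silently skipped
}

// ruleFor returns the explicit rule for a type, or an error.
// There is deliberately no fallback to PayGo behavior.
func ruleFor(accountType string) (DunningRule, error) {
	r, ok := rules[accountType]
	if !ok {
		return DunningRule{}, fmt.Errorf("%w: %q", ErrUnknownAccountType, accountType)
	}
	return r, nil
}

func main() {
	if _, err := ruleFor("mystery"); err != nil {
		fmt.Println("error:", err)
	}
}
```

Excluded types still get an entry, so "excluded on purpose" and "nobody decided" stay distinguishable.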
The migration population is every account with uncollectible invoices where debt is encoded as Stripe metadata instead of credit notes — 73,517 accounts as of March 2026. The forward mechanism (M1–M4) ships first; the migration (M5) fixes the existing stock.
- 73,517 accounts with lying invoices
- ~9,000 in active drift at any time
- Late Q2, after M4 proves the mechanics
Sizing inputs are gathered during M1–M4: reinstatement rate by account age, Stripe credit note behavior on uncollectible invoices, and the brain's coverage of legacy metadata accounts. These determine batch strategy and timeline.
Four execution paths based on account state. Sub-cohort sizes are produced by the Analyze & Size phase.
- One uncollectible invoice, no consolidated invoice, no override. Largest sub-cohort. Bulk migration.
- Two or more uncollectible invoices. One credit note per invoice. Multi-invoice policy required first.
- User already attempted payment. Void the consolidated invoice and issue credit notes on the originals.
- bad_debt_note set by support. Individual review. Not bulk-migrated.
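
The routing into these four paths might look like the following Go sketch. The struct fields and path names are hypothetical, and the precedence (support override first, then consolidated invoice, then invoice count) is an assumption; the spec does not say which path wins when an account matches more than one.

```go
package main

import "fmt"

// AccountDebt is a hypothetical summary of one account's legacy bad-debt state.
type AccountDebt struct {
	UncollectibleInvoices int  // invoices in uncollectible status
	HasConsolidated       bool // a consolidated invoice exists (user attempted payment)
	HasSupportOverride    bool // bad_debt_note was set by support
}

// Path names one of the migration execution paths. Names are illustrative.
type Path string

const (
	PathManualReview Path = "manual_review"
	PathConsolidated Path = "void_consolidated"
	PathMultiInvoice Path = "multi_invoice"
	PathSingle       Path = "single_invoice_bulk"
	PathNone         Path = "not_in_scope"
)

// classify picks one path per account. The case order encodes the assumed precedence.
func classify(a AccountDebt) Path {
	switch {
	case a.UncollectibleInvoices == 0:
		return PathNone
	case a.HasSupportOverride:
		return PathManualReview
	case a.HasConsolidated:
		return PathConsolidated
	case a.UncollectibleInvoices >= 2:
		return PathMultiInvoice
	default:
		return PathSingle
	}
}

func main() {
	fmt.Println(classify(AccountDebt{UncollectibleInvoices: 1}))
	fmt.Println(classify(AccountDebt{UncollectibleInvoices: 3, HasSupportOverride: true}))
}
```

Running a classifier like this over the full population is one way the Analyze & Size phase could produce sub-cohort counts.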
Drift detection is where the complexity lives. The unified inconsistency report (cloudflare_account_inconsistencies) unions four sources. Three check flags — they're untouched. One checks amounts — it's replaced entirely. Three new invariants from this workstream, each enforced at three tiers.
Three of four drift sources (subscription, flags, never-paid) are unchanged because they check flags and payment status, not amounts. The fourth, invoice metadata drift, is replaced entirely. The old check asked whether bad_debt_amount was correct across 4 stores; the new check asks whether a credit note exists and amount_remaining matches the proration.
| # | Invariant | Property |
|---|---|---|
| I2 | Every uncollectible has a credit note | For every invoice in uncollectible status (post-migration), a credit note exists. amount_remaining = original − credit. |
| I3 | Blocked customer has recovery path | Every account with bad_debt flag has at least one payable invoice. Payment clears all flags. |
| I5 | Dunning state consistent across stores | subs_dunning, subs_customer, subs_account, Stripe metadata, and IAPI ban state agree on dunning status. |
I2. Inline: uncollectible handler issues CN atomically. Batch: BQ query for uncollectible invoices without CNs (post-migration). Alert: counter, PagerDuty if > 0.
I3. Inline: brain verifies payable invoice exists before flagging. Batch: BQ query for flagged accounts without recovery path. Alert: counter, PagerDuty if > 0.
I5. Inline: recompute engine reads all 6 stores before writes. Batch: existing BQ consistency check. Alert: inconsistency count trending up.
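As an illustration of the batch tier for I2, the check reduces to two conditions per uncollectible invoice: a credit note exists, and amount_remaining equals original minus credit. The Go sketch below assumes a simplified `Invoice` shape with amounts in cents and ignores partial payments. The real batch check is a BQ query, so this only shows the logic.

```go
package main

import "fmt"

// Invoice is a hypothetical, flattened view of an invoice and its credit notes.
type Invoice struct {
	ID                   string
	Status               string
	OriginalCents        int64
	CreditNoteCents      []int64 // one entry per credit note; empty if none
	AmountRemainingCents int64
}

// checkI2 returns an error when an uncollectible invoice violates invariant I2.
// Invoices in any other status are out of scope and always pass.
func checkI2(inv Invoice) error {
	if inv.Status != "uncollectible" {
		return nil
	}
	if len(inv.CreditNoteCents) == 0 {
		return fmt.Errorf("invoice %s: uncollectible with no credit note", inv.ID)
	}
	var credited int64
	for _, c := range inv.CreditNoteCents {
		credited += c
	}
	if want := inv.OriginalCents - credited; inv.AmountRemainingCents != want {
		return fmt.Errorf("invoice %s: amount_remaining=%d, want %d", inv.ID, inv.AmountRemainingCents, want)
	}
	return nil
}

// countViolations is the number that would feed the alert counter.
func countViolations(invs []Invoice) int {
	n := 0
	for _, inv := range invs {
		if checkI2(inv) != nil {
			n++
		}
	}
	return n
}

func main() {
	invs := []Invoice{
		{ID: "in_ok", Status: "uncollectible", OriginalCents: 2547, CreditNoteCents: []int64{581}, AmountRemainingCents: 1966},
		{ID: "in_missing", Status: "uncollectible", OriginalCents: 2547, AmountRemainingCents: 2547},
	}
	fmt.Println("violations:", countViolations(invs))
}
```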
- 2 engineers
- 1 quarter
- 10 milestones across 3 tracks
| Deliverable | Ships | Depends On | Δ/mo | Estimate |
|---|---|---|---|---|
| Brain Phase 1 · Copy + Test | April | — | — | 4–6 weeks |
| CN M1 · Surface Audit | April | — | — | 1–2 weeks |
| CN M2 · Issuance | Apr–May | CN M1 | — | 2–3 weeks |
| Brain Phase 2 · Redesign | May | Brain Phase 1 | — | 4–6 weeks |
| CN M3 · Pay-Original | May | CN M2, Brain Phase 2 | — | 2–3 weeks |
| CN M4 · Validation | Jun | CN M3 | — | 2–3 weeks |
| Brain Phase 3 · Delete | Jun | Brain Phase 2 | — | 1–2 weeks |
| Segment Rules | Jul | Brain Phase 3 | — | 3–4 weeks |
| Retroactive Migration | Jun | CN M4 | — | 3–4 weeks (73K accounts) |
| Brain + CN + Cleanup | April | Brain Phase 1 | −100 | (stale flag cleanup) |
| # | Decision | Owner | By When | Status | Recommendation | Risk if Wrong |
|---|---|---|---|---|---|---|
| 1 | Surface audit completeness | Engineering | Apr 1 (CN M1) | OPEN | Code search + brain graph | Critical — silent break |
| 2 | Multi-invoice reinstatement policy | Product + Finance | May 12 (CN M3) | OPEN | Progressive (latest first) | High — delays M3 |
| 3 | BCA reset vs. inherit old anchor | Eng + Product | May 12 (CN M3) | OPEN | Reset | Low — comms needed |
| 4 | Credit note reason code | Engineering | Apr 14 (CN M2) | DECIDED | order_change | Low |
| 5 | PayPal dunning logic scope | Engineering | Apr 1 (Brain P1) | OPEN | Spike to bound scope | High — scope expansion |
| 6 | Per-type dunning rules | Product + Finance | Jun 30 (Segments) | OPEN | Explicit rules per type | Medium — business rules |
| 7 | Retroactive CN accounting limits | Finance | Jun 16 (Migration) | OPEN | Research Stripe limits | Medium — old invoices |
| 8 | Dual-mode coexistence during migration | Engineering | Jun 2 (CN M4) | OPEN | CN first, fall back to metadata | Medium — inconsistent |
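
If decision 8 lands on the recommended option (credit note first, fall back to metadata), the debt lookup during the migration window could be sketched as below. This is only one possible reading of that recommendation: the `InvoiceView` type, the function name, and the source labels are hypothetical, and the decision itself is still open.

```go
package main

import (
	"errors"
	"fmt"
)

// InvoiceView is a hypothetical read model covering both the new and the legacy encoding.
type InvoiceView struct {
	HasCreditNote        bool
	AmountRemainingCents int64
	BadDebtAmountCents   *int64 // parsed from bad_debt_amount metadata; nil when absent
}

// ErrNoDebtSource signals an invoice with neither encoding, which should surface as drift.
var ErrNoDebtSource = errors.New("no credit note and no bad_debt_amount metadata")

// debtCents prefers the credit-note-adjusted amount and falls back to legacy metadata.
// It also reports which source was used, so callers can track migration progress.
func debtCents(v InvoiceView) (amount int64, source string, err error) {
	if v.HasCreditNote {
		return v.AmountRemainingCents, "credit_note", nil
	}
	if v.BadDebtAmountCents != nil {
		return *v.BadDebtAmountCents, "metadata", nil
	}
	return 0, "", ErrNoDebtSource
}

func main() {
	legacy := int64(1966)
	amt, src, _ := debtCents(InvoiceView{AmountRemainingCents: 2547, BadDebtAmountCents: &legacy})
	fmt.Println(amt, src)
}
```

Returning the source alongside the amount makes it straightforward to count how many reads still depend on metadata before the fallback is removed.
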
| Risk | Impact | Owner | Mitigation |
|---|---|---|---|
| Undiscovered handlers | Critical | CN M1 lead | Surface audit (CN M1) + code search. Any missed handler breaks silently. |
| Premature deletion | Critical | Brain lead | Brain Phase 3 gated on Phase 2 test suite passing. No deletion without coverage. |
| PayPal scope expansion | High | Brain lead | Spike bounds brain scope before Phase 1. If PayPal logic is larger than expected, brain timeline grows. |
| Stripe CN limits on uncollectible | High | CN lead | Load-bearing unknown. If Stripe rejects CNs on uncollectible invoices, fallback is 1:1 replacement invoices. Fallback adds ~2 weeks. |
| Business rule disagreement | Medium | Product | Per-type rules need Product + Finance alignment. Start discovery in parallel. |
| What | Why |
|---|---|
| Payment retry logic | Distinct concern from invoice adjustment and brain architecture. Stays in billing-webhooks. |
| Notification system | Dunning notifications are downstream of brain decisions. Separate service, separate timeline. |
| Billing portal surfaces | Dashboard changes are downstream of the API work. Separate surface, separate timeline. |
| Probation surfaces | Probation is policy enforcement. Brain doesn't change probation behavior. |
| Shadow billing / Reactor | Separate workstream. Validates the billing engine, not dunning or invoicing. |
Code inventory (4 handlers), test matrices (20 scenarios), deletion manifest, account type catalog, and credit note infrastructure details.
The invoice shows $19.66. The debt is $19.66. The user pays $19.66. One button. The invoice they recognize. The brain decides, the credit note adjusts, the system agrees.