Observability & Data

BIGQUERY DATA PIPELINES

The observability layer is powered by 39 Dataform definitions, all of them views with no materializations. Every query is evaluated at read time against the nightly mirrors of Stripe snapshots, so there is no stale cache to invalidate; results are as fresh as the most recent mirror.

Metadata extraction follows a three-layer pattern: raw Stripe tables → _metadata_json (extracted JSON string) → _metadata (typed fields). Key sources are Stripe raw tables (invoices, subscriptions, items, prices) and the OPE store. Element classification tags each line item as one of: expected_in_stripe, bad_debt_item, deleted_from_ope, contract, delayed_upgrade, or missing_price. These classifications feed the drift detection and reconciliation views downstream.
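The three-layer pattern can be illustrated outside of Dataform. The sketch below is a Python analogy, not the pipeline's actual SQL: the field names account_id and bad_debt, the InvoiceMetadata type, and both function names are hypothetical. It relies only on the fact that Stripe metadata values are stored as strings, which is why the typed layer has to convert them.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class InvoiceMetadata:
    # Hypothetical typed fields; the real _metadata views may expose others.
    account_id: Optional[str]
    is_bad_debt: bool


def extract_metadata_json(raw_row: dict) -> str:
    """Layer 2 (_metadata_json): pull the metadata out as a JSON string."""
    metadata = raw_row.get("metadata")
    if metadata is None:
        return "{}"
    if isinstance(metadata, str):
        return metadata
    return json.dumps(metadata, sort_keys=True)


def parse_metadata(metadata_json: str) -> InvoiceMetadata:
    """Layer 3 (_metadata): turn the JSON string into typed fields."""
    try:
        fields = json.loads(metadata_json)
    except json.JSONDecodeError:
        fields = {}
    if not isinstance(fields, dict):
        fields = {}
    return InvoiceMetadata(
        account_id=fields.get("account_id"),
        # Stripe metadata values are strings, so "true" must be converted.
        is_bad_debt=str(fields.get("bad_debt", "")).lower() == "true",
    )
```

Keeping the JSON string as its own layer means a malformed or missing metadata blob degrades to empty typed fields instead of failing every view built on top of it.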

DRIFT DETECTION

Two separate drift systems operate independently, with different data sources, different candidate selection, and different remediation paths. Stripe drift compares OPE subscriptions against Stripe subscriptions. WWW drift compares zone plan levels in the www database against subs-api records. Both are batch-only and run against daily BigQuery snapshots, so drift that occurs between snapshots is invisible until the next day. New drift is identified by comparing today's snapshot against yesterday's: an account that is drifted in both is treated as persistent drift and is excluded from automatic remediation by design.
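The snapshot comparison amounts to a set difference. A minimal sketch, assuming each daily snapshot has already been reduced to a set of drifted account IDs (the function name and input shape are hypothetical):

```python
from typing import Set, Tuple


def split_drift(today: Set[str], yesterday: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Split today's drifted accounts into new and persistent drift."""
    new_drift = today - yesterday          # drifted today, clean yesterday
    persistent_drift = today & yesterday   # drifted in both snapshots
    return new_drift, persistent_drift
```

Only the first set would be handed to automatic remediation; the second is left for manual review.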

DRIFT REMEDIATION

applystripe is the primary engine: a gather-plan-apply pipeline that reads account state from seven or more sources into a unified view, computes what the correct state should be, and applies the minimal diff. driftcontrol runs the daily cron with safety valves. auto_resync_www handles WWW drift. Enterprise account types (JDC, SFCC, IBM) are excluded because their state is set via contract paths, and automatic remediation would break that intended state.
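The gather-plan-apply shape can be sketched in a few lines. This is an illustration under stated assumptions, not applystripe's code: state is modeled as a flat dict of string values, and gather, plan_changes, and apply_changes are hypothetical names.

```python
from typing import Callable, Dict, Optional

State = Dict[str, str]
Changes = Dict[str, Optional[str]]


def gather(account_id: str, readers: Dict[str, Callable[[str], State]]) -> Dict[str, State]:
    """Read every source once and key the results by source name."""
    return {name: read(account_id) for name, read in readers.items()}


def plan_changes(current: State, desired: State) -> Changes:
    """Return only the keys that differ; None marks a key to remove."""
    changes: Changes = {}
    for key in current.keys() | desired.keys():
        if current.get(key) != desired.get(key):
            changes[key] = desired.get(key)
    return changes


def apply_changes(current: State, changes: Changes) -> State:
    """Apply the planned diff to a copy of the current state."""
    updated = dict(current)
    for key, value in changes.items():
        if value is None:
            updated.pop(key, None)
        else:
            updated[key] = value
    return updated
```

Separating the plan from the apply step makes the pipeline idempotent: planning again after a successful apply yields an empty diff, and a plan can be inspected or blocked before anything is written.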

Restriction	Effect
UnsupportedAccountType	Blocks entire plan — cloudflare_ent, msp, partners_pay_go, sfcc excluded
MergedAccount	Multiple accountIDs map to one Stripe customer — ambiguous, skipped
InvalidPaymentMethod	No payment method but changes cost money — cannot apply
InvoiceDateTooNear	Next invoice within N days — customer comms concern (BILLSUB-16)
NoStripeCustomer	No Stripe customer record found for account
BadDebtDriftRemediation gate off	Per-account feature gate controls bad debt flag remediation
InvoiceMetadataRemediation gate off	Per-account gate controls invoice metadata correction
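
The account-level restrictions in the table can be read as a list of checks run against the gathered view before a plan is applied. The sketch below is hypothetical: the AccountView fields and the find_restrictions function are illustrative names, the two per-account feature gates are omitted, and the InvoiceDateTooNear threshold is left as a parameter because the source does not state N.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

# Account types listed under UnsupportedAccountType in the table above.
UNSUPPORTED_ACCOUNT_TYPES = {"cloudflare_ent", "msp", "partners_pay_go", "sfcc"}


@dataclass
class AccountView:
    # Hypothetical fields standing in for the unified account view.
    account_type: str
    stripe_customer_id: Optional[str]
    accounts_on_customer: int      # account IDs mapped to this Stripe customer
    has_payment_method: bool
    plan_costs_money: bool
    next_invoice_date: Optional[date]


def find_restrictions(view: AccountView, today: date, min_days_to_invoice: int) -> List[str]:
    """Return the names of every restriction that blocks this plan."""
    found: List[str] = []
    if view.account_type in UNSUPPORTED_ACCOUNT_TYPES:
        found.append("UnsupportedAccountType")
    if view.stripe_customer_id is None:
        found.append("NoStripeCustomer")
    if view.accounts_on_customer > 1:
        found.append("MergedAccount")
    if view.plan_costs_money and not view.has_payment_method:
        found.append("InvalidPaymentMethod")
    if (
        view.next_invoice_date is not None
        and view.next_invoice_date - today < timedelta(days=min_days_to_invoice)
    ):
        found.append("InvoiceDateTooNear")
    return found
```

Returning every matching restriction, rather than stopping at the first, gives an investigator the full reason a plan was skipped.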

WEEKLY TICKET DIGEST

An automated pipeline classifies BILL and CUSTESC Jira tickets weekly using the V6 taxonomy — 13 atoms organized into 6 molecules. The digest covers 27 weeks of data (~1,586 tickets) with a 96.3% classification rate. This is the primary signal connecting engineering decisions to customer pain.

Molecule	Atoms	~Weekly Volume
Subscription State	annual-monthly-confusion, cancellation-failure, stuck-processing, ghost-subscription	~15 tickets
Payment Loop	checkout-blocked, bad-debt-state	~16 tickets
Invoice Quality	void-waive, invoice-confusion	~9 tickets
Reconciliation	entitlement-drift, ordering-duplicate	~6 tickets
Customer Impact	refund-dispute	~4 tickets
Operational	operational	~5 tickets

INVESTIGATING AN ACCOUNT

The dunning brain's gather phase reads all six stores (subs_dunning, subs_account, subs_customer, billing_prod, cf_prod, and Stripe metadata), making it the most complete single view of account state available. For live investigation, triggering a gather on the account surfaces the full state diff.

NinjaPanel provides the support-facing view for customer accounts. BigQuery drift tables (cloudflare_account_inconsistencies_summary, cloudflare_inconsistent_bad_debt_accounts) surface accounts with cross-store disagreements. The Stripe dashboard provides the authoritative payment history. Every healing attempt is logged to cfbill_migrations_audit with its params, result, and a failed boolean. Check this table first when investigating why remediation didn't fire.
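
A first pass over the audit rows can separate "never attempted" from "attempted and failed". The sketch below assumes the rows have already been fetched as dicts; the failed key comes from the table description above, while the account_id key and the triage function are assumptions made for illustration.

```python
from typing import Any, Dict, List


def triage(rows: List[Dict[str, Any]], account_id: str) -> str:
    """Summarize what the audit log says about one account."""
    # account_id is an assumed column; the real key may live inside params.
    attempts = [row for row in rows if row.get("account_id") == account_id]
    if not attempts:
        return "no attempt logged"
    if all(row["failed"] for row in attempts):
        return "attempted, all failed"
    return "attempted, at least one succeeded"
```

If no attempt was logged, the account was probably never selected or was blocked before apply, so the restrictions and persistent-drift exclusion are the next things to check. If every attempt failed, the result column of those rows is the place to look.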