Silver + Gold for 22 Measures — A real warehouse plan (anonymized)

Note for readers: This is a real operational planning document from a live engagement, anonymized for publication. Company names, vendor names, region codes, and people have been replaced with placeholders ("Acme", "the accounting system", "the fleet system", "Region A/B", "the controller"). Technical content, architecture, measure counts, and slice plans are unchanged. Published as a case study at eidosagi.com/case-studies/warehouse-planning.

SUPERSEDES silver-gold-22-measures-plan.html (v1, written 90 minutes earlier). Three structural smells were caught in the v1 plan during stakeholder review: fragile leading-digit account join, measure-per-entity identifier explosion, and conflated measure / metric / dimension semantics. v2 reframes the entire gold layer to follow the dbt MetricFlow pattern that Airbnb, Uber, Google, and dbt itself converged on independently.

Silver + Gold for 22 Measures v2 — dbt MetricFlow + Dimensional

Dimensional modeling done correctly: one silver fact, ~7 metric definitions, one materialized gold fact, one stakeholder bridge. Built on the open-source dbt MetricFlow framework — the same pattern Airbnb's Minerva, Uber's uMetric, Google's LookML, and Snowflake's Semantic Views all implement.

Repo

data-pipeline-v5

Branch

silver-gold-22-measures-2026-05-10

Total Estimate

~3.5 hours

Standard

dbt MetricFlow (OSS)

§1 The Four Discoveries That Reshaped The Plan

v1 of this plan was a brute-force port of v4's measure list into per-measure dbt models. Stakeholder review (and a cept consultation) surfaced four issues that compounded into "the wrong shape" — not a wrong implementation, but a wrong architecture. v2 restarts from the right shape.

FIND-1 The Ledger Gold Parity workflow has been red 9 days — but unrelated

Investigated

dashboard-migrations repo's Ledger Gold Parity workflow has been failing every day since 2026-05-02 against commit 184e9a8. Job exits 1 at the validator step in 10 seconds.

Investigation: validator at validation/ledger_excel_parity.py compares fixture JSON (committed snapshots from spreadsheet-lab-v1, the v1 spreadsheet lab) against live ledger_gold.* materialized views in Supabase. This is a different gold layer (v1's, in Supabase) from what we're building (warehouse-postgres). The failure does NOT signal v4 baseline drift. It signals that the v1 lab's fixture-vs-live parity has bit-rotted, which is expected once we stopped maintaining the v1 layer.

Action: file an issue in dashboard-migrations to archive or update the workflow, but it does not gate this plan. Our v4 baseline comes from v4's reconcile output, not from ledger_gold.

FIND-2 Leading-digit join in fct_gl_entry is a fragile heuristic disguised as a dim join

Caught by stakeholder

v1's silver fact joined account categories via LEFT(accountno, 1) = gl_leading_digit. This is string-prefix matching presented as a dimensional join. Three real problems:

Assumes invariants the accounting system doesn't guarantee. 5-digit account numbers, leading digit always classifies. The accounting system could issue a 6-digit account or repurpose a digit and we'd silently miscategorize.
Hides exceptions. Accounts 55110/55120/56130/56140 (Capitalized Truck/Container/Labor) start with 5, which the heuristic says = "direct cost / COGS," but the controller EXCLUDES them from R&M and Total Costs. The exclusions live as per-measure NOT IN clauses scattered across gold model SQL.
Misses the actual dimension. The accounting system's gl_accounts bronze table (already in sources.yml) literally has every account with its chart-of-accounts category. We should join against THAT.

Fix: new silver model silver/dim_gl_account that resolves account → category once, with flags like is_capitalized, is_in_total_costs, is_revenue. Gold/MetricFlow filter expressions then read WHERE is_revenue, not WHERE accountno LIKE '4%'.

FIND-3 measure-per-entity is identifier explosion, not dimensional modeling

Caught by stakeholder

v1 had 22 distinct measure_ids like consolidated_revenue_region_a, disposal_cost_roll_off_region_b, etc. The strings encode three things: the measure (revenue), the entity (region_a), and the controller's stakeholder language ("consolidated").

Why it's wrong:

Combinatorial explosion baked in: 95 future measures × 2 entities × 4 LOBs = 760 distinct identifiers with overlapping concepts. Unmaintainable.
Filter logic copy-pasted: "exclude capitalized" lives in 4 SQL files instead of 1 metric definition.
Dimension semantics hidden in names: "Show me Region B for everything" requires regex on identifiers — never the answer.
Hierarchy lost: consolidated_revenue_total IS revenue rolled across entity_id values. Treating it as a separate measure makes the relationship invisible to the model.

FIND-4 "Measure" and "metric" are different concepts, and v1 conflated them

Caught by stakeholder

Dimensional modeling vocabulary, as used by every analytics platform:

Measure = an atomic numeric column you can SUM/AVG/COUNT. For accounting-system GL data, the measure is signed_amount. ONE measure.
Metric = a measure + a filter + an aggregation + the dimensions you can slice by. ~7 distinct metrics across all 95 published items: revenue, disposal_cost, rm_cost, total_costs, lob_revenue, front_line_truck, personnel.
Dimensions = the things you slice metrics by. For us: entity (region_a/region_b), lob (roll_off/fec/residential/portable), period (2025-MM).
Stakeholder name = the label the controller puts on a metric × dimension combination. "Consolidated Revenue Region A" is a name, not a metric.

v1 jammed all four into a single string identifier. That's the category error. v2 separates them.

§2 The Right Architecture (4 Layers)

The architecture below is what every major analytics platform converged on independently. We're catching up to industry, not inventing.

graph TB subgraph "BRONZE — raw vendor data (existing)" B1[dp5_ledger_bronze.gl_entries] B2[dp5_ledger_bronze.gl_accounts] B3[dp5_fleet_bronze.vehicles] end subgraph "SILVER — dims and facts (mostly existing, dim_gl_account is new)" S1[silver.dim_gl_account
NEW: replaces leading-digit heuristic] S2[silver.dim_entity] S3[silver.dim_lob] S4[silver.dim_period] S5[silver.fct_gl_entry
signed_amount with all dims joined] S6[silver.fct_fleet_vehicle] S7[silver.gl_exclusions
NEW: 1:N exclusion table] end subgraph "SEMANTIC — YAML, MetricFlow-managed" SM1[semantic_models.yml
measures + entities + dimensions] SM2[metrics.yml
filter exprs, aggs, sliceable_by] end subgraph "GOLD — materialized values" G1[gold.metric_value
metric_id × period × entity × lob × value] end subgraph "REGISTRY — stakeholder bridge" R1[registry.metric_published
controller PDF row → metric × dim tuple] end B1 --> S5 B2 --> S1 B3 --> S6 S1 --> S5 S2 --> S5 S3 --> S5 S5 --> SM1 S6 --> SM1 SM1 --> SM2 SM2 --> G1 G1 --> R1 style S1 fill:#1f5b2a,stroke:#3fb950 style S7 fill:#1f5b2a,stroke:#3fb950 style SM1 fill:#3a2c52,stroke:#bc8cff style SM2 fill:#3a2c52,stroke:#bc8cff style G1 fill:#1f5b2a,stroke:#3fb950 style R1 fill:#1f5b2a,stroke:#3fb950

Concept-by-concept: how 22 measures collapse to ~7 metrics × dimensions

graph LR subgraph "OLD: 22 measure_ids (v1)" O1[consolidated_revenue_region_a] O2[consolidated_revenue_region_b] O3[consolidated_revenue_total] O4[disposal_cost_region_a] O5[disposal_cost_roll_off_region_a] O6[lob_revenue_residential] O7[rm_cost_region_a] O8[total_costs_cogs_sga_minus_da_region_a] O9[front_line_truck_portable_lob] OX[15 more strings ...] end subgraph "NEW: 7 metrics × 3 dimensions (v2)" N1[metric: revenue] N2[metric: disposal_cost] N3[metric: rm_cost] N4[metric: total_costs] N5[metric: front_line_truck] N6[metric: weeks_paychecks] N7[metric: workdays_per_month] DX[× dim: entity_id] DY[× dim: lob_id] DZ[× dim: period] end O1 --> N1 O2 --> N1 O3 --> N1 O4 --> N2 O5 --> N2 O6 --> N1 O7 --> N3 O8 --> N4 O9 --> N5 style O1 fill:#3a1f1f,stroke:#f85149 style O2 fill:#3a1f1f,stroke:#f85149 style O3 fill:#3a1f1f,stroke:#f85149 style O4 fill:#3a1f1f,stroke:#f85149 style O5 fill:#3a1f1f,stroke:#f85149 style O6 fill:#3a1f1f,stroke:#f85149 style O7 fill:#3a1f1f,stroke:#f85149 style O8 fill:#3a1f1f,stroke:#f85149 style O9 fill:#3a1f1f,stroke:#f85149 style OX fill:#3a1f1f,stroke:#f85149 style N1 fill:#1f5b2a,stroke:#3fb950 style N2 fill:#1f5b2a,stroke:#3fb950 style N3 fill:#1f5b2a,stroke:#3fb950 style N4 fill:#1f5b2a,stroke:#3fb950 style N5 fill:#1f5b2a,stroke:#3fb950 style N6 fill:#1f5b2a,stroke:#3fb950 style N7 fill:#1f5b2a,stroke:#3fb950

The data flow, end to end

sequenceDiagram participant V as Vendor APIs participant DLT as dlt (data-pipeline-v5) participant B as Bronze (warehouse) participant DBT as dbt participant S as Silver participant MF as MetricFlow participant G as Gold participant R as Registry participant TEST as dbt tests V->>DLT: extract DLT->>B: dp5_ledger_bronze.* B->>DBT: source DBT->>S: build dim_gl_account, fct_gl_entry, gl_exclusions S->>MF: semantic_models read silver tables MF->>DBT: compile metrics → SQL DBT->>G: materialize gold.metric_value G->>R: registry.metric_published joins gold for ratification R->>TEST: dbt test: every published row within ±$0.10 of v4 TEST-->>DBT: pass / fail Note over G,R: Dashboard reads registry + gold by JOIN

§3 Industry Justification — Why This Is The Right Pattern

This isn't a team-invented pattern or a company-specific compromise. Every serious analytics platform converged on this independently because the alternatives don't scale.

graph LR subgraph "Companies (internal platforms)" AB[Airbnb
Minerva] UB[Uber
uMetric] LK[LinkedIn
Unified Metrics Platform] NF[Netflix
Metric Catalog] end subgraph "Vendors (products)" LM[Looker
LookML measures] DM[dbt MetricFlow
OSS - what we'll use] SF[Snowflake
Semantic Views 2024] DB[Databricks
Unity Catalog semantics] CB[Cube.dev
OSS] end subgraph "Common pattern" P1[1\. central metric registry] P2[2\. measures + dimensions modeled separately] P3[3\. metrics = measure + filter + agg + slice-by] P4[4\. materialized cubes downstream] P5[5\. stakeholder names map TO metric × dim tuples] end AB --> P1 UB --> P1 LK --> P1 NF --> P1 LM --> P1 DM --> P1 SF --> P1 DB --> P1 CB --> P1 style DM fill:#1f5b2a,stroke:#3fb950,stroke-width:3px style P1 fill:#3a2c52,stroke:#bc8cff style P2 fill:#3a2c52,stroke:#bc8cff style P3 fill:#3a2c52,stroke:#bc8cff style P4 fill:#3a2c52,stroke:#bc8cff style P5 fill:#3a2c52,stroke:#bc8cff

Why every one of them did it (the forcing functions)

Forcing function	Real-world example
Metric consistency	Airbnb pre-Minerva: ~5 different "DAU" calculations across teams, all called "DAU," all giving different numbers. Boards getting conflicting reports. Minerva's whole purpose was forcing ONE definition per metric. Acme equivalent waits in the wings: when Marketing builds a "revenue" view that includes one-off Other Income but Finance builds one that excludes it, both called "revenue," trust breaks.
Discoverability at scale	Uber's uMetric catalog has ~15,000 metrics. "What metrics exist?" must be a query against a registry table, not `ls */.sql`. At Acme's eventual scale (95 → 500+ over next 24 months as the next region comes online + new vendors land), a registry is mandatory.
Versioning + lineage	LinkedIn UMP: every metric definition is versioned. When a definition changes, downstream consumers are notified, dashboards display the version they reference, comparisons across versions are queryable. SQL files in a git log don't have this.
Combinatorial scale	Looker LookML: dimensions and measures are ORTHOGONAL. You don't write `revenue_by_region_by_quarter`; you define `revenue` and `region` and `quarter`, then queries combine them at runtime. Hand-writing per-metric-per-dimension SQL doesn't scale past about 30 distinct metric × dim combinations.
Materialization strategy	Netflix Metric Catalog: hot queries hit pre-aggregated cubes; cold queries run on-demand against silver. The catalog decides materialization based on query patterns. Per-metric tables can't be optimized this way — you've already committed to a fixed materialization shape.

★ The Airbnb Minerva post-mortem (paraphrased)

Airbnb built Minerva (~2018) specifically to migrate AWAY from per-team-per-metric SQL files. Their pre-Minerva state was effectively consolidated_revenue_region_a × 5,000 — every team had its own variant of the same metric, scattered across SQL files, all subtly different.

The migration took years and was expensive precisely because they had let the wrong shape compound. Acme catches this at 22 measures, not 5,000. This is the moment to fix the shape.

§4 dbt MetricFlow — The Implementation Standard We'll Use

dbt MetricFlow is the open-source implementation of this pattern, designed to live inside an existing dbt project. We're already using dbt; adding MetricFlow is a pip install dbt-metricflow + YAML files. No new infrastructure.

⌬ What runs on-prem vs in dbt Cloud

Open source (we use this): the dbt-metricflow Python library — compiles semantic_models + metrics YAML into SQL, materializes via dbt build. Runs anywhere dbt Core runs (our existing Dagster setup).

dbt Cloud only (we don't need): the hosted Semantic Layer API (REST/GraphQL endpoints for BI tools to query metrics dynamically). Our consumers (dashboard) query Postgres tables directly — no REST API needed.

Note on a contradiction in published commentary: some recent articles (e.g., Medium / Reliable Data Engineering, Apr 2026) state that the dbt Semantic Layer "requires dbt Cloud (no dbt Core option)." This conflates the dbt Semantic Layer SERVICE (the hosted Cloud-only API surface) with the dbt-metricflow LIBRARY (open source, runs with dbt Core). For our pattern — materialize to Postgres tables, dashboard reads tables directly — we only need the OSS library. The Cloud-only restriction does not apply.

⚠ Critical: MATERIALIZE, do not dynamic-compute

MetricFlow defaults to dynamic SQL generation — every query against the Semantic Layer compiles fresh SQL and runs it against silver. That's fine for ad-hoc analyst queries; it's wrong for a dashboard hit by every page load.

Slice 6.8v2 must materialize gold.metric_value as a real dbt-built table on every dbt run. The dashboard reads the materialized table; it NEVER queries MetricFlow's compile-on-demand path. Pattern (per dbt MetricFlow docs):

dbt build --select +metric:revenue +metric:disposal_cost ...
# materializes gold.metric_value with all combinations

This decision is locked in: Phase 3 dashboard wiring queries materialized tables only. Dynamic-compute path is reserved for ad-hoc analyst use, never for user-facing reads.

How it works (data + control flow)

flowchart TD Y1[semantic_models.yml
name: gl_entries
measures: signed_amount
entities: entity_id, lob_id
dimensions: period] Y2[metrics.yml
name: revenue
type: simple
filter: is_revenue = true
agg: sum] CMD[dbt build
--select +metric:revenue] COMP[MetricFlow compiles:
SELECT period, entity_id, SUM signed_amount
FROM silver.fct_gl_entry
JOIN silver.dim_gl_account
WHERE is_revenue
GROUP BY 1, 2] MAT[(gold.metric_value
materialized table)] DB[(analytics-warehouse-postgres)] Y1 --> CMD Y2 --> CMD CMD --> COMP COMP --> MAT MAT --> DB style Y1 fill:#3a2c52,stroke:#bc8cff style Y2 fill:#3a2c52,stroke:#bc8cff style CMD fill:#1c2230,stroke:#58a6ff style COMP fill:#1c2230,stroke:#58a6ff style MAT fill:#1f5b2a,stroke:#3fb950

What our YAML will look like

One semantic_model file (the GL fact):

semantic_models:
  - name: gl_entry
    description: "GL line items with all dimensions resolved."
    model: ref('fct_gl_entry')

    entities:
      - name: entity        # region_a | region_b
        type: foreign
        expr: entity_id
      - name: lob           # roll_off | fec | residential | portable
        type: foreign
        expr: lob_id

    dimensions:
      - name: period
        type: time
        type_params:
          time_granularity: month
        expr: period
      - name: account_category
        type: categorical
      - name: is_capitalized
        type: categorical
      - name: is_revenue
        type: categorical

    measures:
      - name: signed_amount
        agg: sum
        expr: signed_amount
        agg_time_dimension: period

One metrics file (the 7 business metrics):

metrics:
  - name: revenue
    description: "Total revenue (4xxxx accounts), Acme-side. Negate for display."
    type: simple
    type_params:
      measure: signed_amount
    filter: "{{ Dimension('gl_entry__is_revenue') }} = true"
    # Multiply by -1 in the published view (revenue is stored as credits).

  - name: disposal_cost
    description: "Disposal cost (53xxx accounts)."
    type: simple
    type_params:
      measure: signed_amount
    filter: "{{ Dimension('gl_entry__account_category') }} = 'disposal'"

  - name: rm_cost
    description: "R&M (55xxx + 56xxx + 57xxx, EXCLUDING capitalized)."
    type: simple
    type_params:
      measure: signed_amount
    filter: |
      {{ Dimension('gl_entry__account_category') }} IN ('truck_rm', 'container_rm', 'portable_rm')
      AND {{ Dimension('gl_entry__is_capitalized') }} = false

  - name: total_costs_cogs_sga_minus_da
    description: "COGS + SG&A − D&A (5xxxx + 6xxxx, excluding D&A and Capitalized)."
    type: simple
    type_params:
      measure: signed_amount
    filter: |
      {{ Dimension('gl_entry__is_in_total_costs') }} = true

Then dbt build --select +metric:revenue compiles these into SQL, materializes the rows, and puts them in gold.metric_value.

The stakeholder bridge — how the controller's PDF rows map to metric × dim tuples

-- registry.metric_published (seed, then evolved as we ratify more)
-- One row per controller-published metric. The bridge from stakeholder language
-- to dimensional language.

published_id, published_label, source_doc, metric_id, entity_id, lob_id, ratified
"M001", "Consolidated Revenue Region A", "controller_2026_02_11.pdf", "revenue", "region_a", NULL, true
"M002", "Consolidated Revenue Region B", "controller_2026_02_11.pdf", "revenue", "region_b", NULL, true
"M003", "Consolidated Revenue Total", "controller_2026_02_11.pdf", "revenue", NULL, NULL, true
"M004", "Disposal Cost Roll Off Region A", "controller_2026_02_11.pdf", "disposal_cost", "region_a", "roll_off", true
"M005", "R&M Cost Region A", "controller_2026_02_11.pdf", "rm_cost", "region_a", NULL, true
... 17 more ratified, 73 unratified (ratified=false)

Dashboard query becomes trivial:

SELECT p.published_label, g.value
FROM registry.metric_published p
JOIN gold.metric_value g
  ON g.metric_id = p.metric_id
  AND COALESCE(g.entity_id, '') = COALESCE(p.entity_id, '')
  AND COALESCE(g.lob_id, '') = COALESCE(p.lob_id, '')
WHERE g.period = '2025-12'
  AND p.ratified = true
ORDER BY p.published_id;

§5 What Gets Reverted (Slices 6.2 + 6.3)

The SQL logic in slices 6.2 + 6.3 (filter expressions, exclusion handling, reassignment overrides) ports directly into MetricFlow filter expressions and silver dim flags. The structure they encode (per-measure tables, column-based exclusions, leading-digit join) is wrong and gets rebuilt.

6.2 silver.fct_gl_entry (v1 shape)

Revert (commit a5276a6)

Logic ports: dimension joins, signed_amount calculation, date parsing, reassignment overlay. Structure changes: remove the leading-digit account_category join (replaced by dim_gl_account FK); remove the excluded_from_measure column (replaced by silver.gl_exclusions). The keystone-fact pattern is preserved; the wiring is corrected.

6.3 gold.measures_accounting_monthly (v1 shape)

Revert (commit 5bb856f)

19 CTEs unioned into one monolithic table with measure-per-entity identifiers. Replaced entirely: the 19 filter expressions become 7 metrics in metrics.yml, the 22 published combinations become 22 rows in registry.metric_published, and gold.metric_value is materialized once by MetricFlow. The generate_schema_name.sql macro stays (still useful — gold lands in gold. schema).

6.1 v4 mappings as v5 seeds (KEPT)

Stays committed (396e6c3)

All 5 seeds (department_reassignments, known_exclusions, fleet_vehicle_type_to_lob, dim_period, dim_holiday) plus their schema.yml entries remain valid. They're consumed by the new silver dims and by MetricFlow filter expressions exactly as they were in v1. No revert needed.

§6 The Revised Slice Plan (v2)

Slice	What it builds	Estimate	Cumulative	Status
`6.1`	v4 mappings as v5 seeds — 5 CSVs + schema.yml	30 min	0:30	Done
`6.0r`	Revert slices 6.2 + 6.3 cleanly (git revert + verify branch state)	10 min	0:40	Next
`6.2v2`	silver/dim_gl_account.sql — joins gl_accounts bronze + account_categories seed, derives is_capitalized + is_in_total_costs flags	30 min	1:10	Next
`6.3v2`	silver/fct_gl_entry.sql (rewritten) — joins dim_gl_account by accountno (not leading-digit), drops excluded_from_measure column. Plus silver/gl_exclusions.sql (1:N table)	45 min	1:55	Next
`6.4v2`	Add dbt-metricflow to v5 dependencies; create models/semantic/semantic_models.yml + metrics.yml with 7 metric definitions	45 min	2:40	Next
`6.5v2`	silver/fct_fleet_vehicle.sql + fleet semantic_model + front_line_truck metric	25 min	3:05	Next
`6.6v2`	Calendar metrics (weeks_paychecks, workdays_per_month) — semantic_model on dim_period, two metrics	20 min	3:25	Next
`6.7v2`	seeds/metric_published.csv — 22 ratified rows mapping the controller's PDF labels → (metric_id, entity_id, lob_id) tuples	20 min	3:45	Next
`6.8v2`	Materialize gold.metric_value via dbt build --select +metrics; wire to Dagster asset graph	25 min	4:10	Next
`6.9v2`	dbt tests: every published row matches v4 within ±$0.10; dim_period freshness warning; row counts; not_null on value	35 min	4:45	Next
`6.10v2`	Push branch + open PR + trigger Dagster run + verify in warehouse	25 min	5:10	Next
`6.11v2`	Sync gold.metric_value + registry.metric_published from warehouse to Supabase (read-only mirror). Dashboard reads Supabase tables directly.	35 min	5:45	Next (CRITICAL — dashboard depends on this)

⏱ Net time vs v1 plan

v1 estimated ~3:30 to "22 measures in per-measure gold tables in warehouse only." v2 estimates ~5:45 to "1 silver fact + 7 metrics + 22 registry rows + 1 materialized gold fact + passing dbt tests + Supabase mirror that the dashboard can query." Adds ~2 hours for a fundamentally better architecture that scales to 95 metrics without rewrite AND completes the dashboard-readable surface in the same body of work. Worth it.

⚠ Why slice 6.11v2 is non-optional

Earlier draft of this plan deferred the Supabase sync as "Phase 2." That was wrong. The dashboard can't query analytics-warehouse-postgres directly (mesh-only). Without slice 6.11v2, gold tables exist but no human ever sees them. Phase 2 (dashboard wire-up) and the Supabase sync are the same step; collapsing them avoids a 30-day "v2 is built but invisible" gap.

§7 Definition of Done (v2)

✓ silver.dim_gl_account exists with one row per accounting account, carrying account_category_id, is_revenue, is_capitalized, is_in_total_costs flags derived ONCE from the chart of accounts.
✓ silver.fct_gl_entry joins dim_gl_account by accountno (not leading-digit). The excluded_from_measure column is gone.
✓ silver.gl_exclusions exists as a 1:N table (recordno, measure_id).
✓ semantic_models.yml defines 2 semantic models (gl_entry, fleet_vehicle, dim_period) with measures + entities + dimensions.
✓ metrics.yml defines 7 metrics: revenue, disposal_cost, rm_cost, total_costs_cogs_sga_minus_da, front_line_truck, weeks_paychecks, workdays_per_month.
✓ gold.metric_value is materialized by MetricFlow via dbt build --select +metric:*; ~7 metrics × periods × dim combos rows.
✓ registry.metric_published seed + table has 22 rows mapping the controller's PDF labels to (metric_id, entity_id, lob_id) tuples; ratified=true on all 22.
✓ dbt tests pass: every published row's gold value matches v4 known-good within ±$0.10 for every 2025 period.
✓ Dagster asset graph shows the gold model as part of materialization; rebuild on bronze refresh is automatic.
✓ PR merged to data-pipeline-v5 main.
✓ Supabase has mirror tables: mirror_gold.metric_value and mirror_gold.metric_published — synced from the warehouse via a sync script (slice 6.11v2). Dashboard can JOIN these directly with no warehouse access.
✓ Dashboard CAN read from Supabase mirror by JOIN (separate slice to wire the dashboard's metrics-registry.ts, but the data is queryable today).

The validation queries (post-deploy smoke test)

-- 1. Metric definitions are intact: 7 metrics
SELECT metric_id, COUNT(*) AS rows_materialized
FROM gold.metric_value
GROUP BY metric_id
ORDER BY metric_id;
-- expect 7 rows

-- 2. Published metrics are intact: 22 ratified rows
SELECT COUNT(*) FROM registry.metric_published WHERE ratified = true;
-- expect 22

-- 3. Every published metric has corresponding gold value for Dec 2025
SELECT p.published_label, g.value, p.ratified
FROM registry.metric_published p
LEFT JOIN gold.metric_value g
  ON g.metric_id = p.metric_id
  AND COALESCE(g.entity_id, '') = COALESCE(p.entity_id, '')
  AND COALESCE(g.lob_id, '') = COALESCE(p.lob_id, '')
WHERE g.period = '2025-12' AND p.ratified = true
ORDER BY p.published_id;
-- expect 22 rows, no NULL values

-- 4. Sentinel cents-precision check
SELECT g.value
FROM registry.metric_published p
JOIN gold.metric_value g USING (metric_id)
WHERE p.published_label = 'Consolidated Revenue Region B'
  AND g.entity_id = 'region_b' AND g.period = '2025-12';
-- expect 872850.23 (matches the controller's December PDF to the cent)

§8 What This Plan Explicitly Does NOT Do

Update dashboard's metrics-registry.ts to read from Supabase mirror — separate slice 6.12+ if/when. Currently the dashboard reads hand-curated TS; switching it to query the Supabase mirror is the next phase AFTER the data lands in Supabase. Dashboard wire-up is ~1.5 hr.
Tackle the 73 unratified measures — they become registry rows (ratified=false → ratified=true after each ratification). Adding measures is faster now: define published_label, point at existing metric × dim, run dbt test, flip to ratified=true. No new gold tables.
Implement dbt MetricFlow Service / GraphQL API — Cloud-only, we don't need it. Postgres tables are our query surface.
Build excelebro bonsai — apical violation; deferred until after this ships and we have evidence of what needs pruning.
Fix the dashboard-migrations Ledger Gold Parity workflow — unrelated to this work (different gold layer in Supabase). File an issue, let it ride.

§9 Risks Specific To MetricFlow Adoption

Risk	Likelihood	Mitigation
`dbt-metricflow` Postgres adapter has rough edges (newer than Snowflake/BigQuery support)	Medium	If we hit a blocker, fall back to hand-rolling the 4-layer pattern as plain dbt models (the same shape, manually compiled). MetricFlow is preferred because it's the standard, but the architecture works either way.
Compiled MetricFlow SQL doesn't honor known_exclusions cleanly	Medium	Encode exclusions as `silver.gl_exclusions` with a LEFT JOIN ... IS NULL pattern in fct_gl_entry. MetricFlow then filters against the resolved silver table, not against bronze. Tested in slice 6.3v2 before defining metrics.
The 'all_4xxx_revenue' wildcard exclusion (Sep 2025 HTN $82,932.50) doesn't fit the standard exclusion pattern	Medium	This is one specific row excluded from ALL revenue metrics for a specific period. Encode as: gl_exclusions row with `measure_id='all_revenue'` + period filter; MetricFlow's revenue metric definition adds a JOIN to exclude any (recordno, period) tuple in that exclusion class.
analytics-warehouse-postgres connection blocked from local agent (mesh-only)	High (already encountered)	Build all SQL/YAML locally; deploy + verify via Dagster (slice 6.10v2). Same approach as v1; well-understood.
v4 reconcile output (the test fixture) has Decimal precision that drifts from Postgres numeric round-trip	Low	Test tolerance is ±$0.10, well above any precision drift. Investigate only if a measure misses by >$0.50.
The 7-metric assumption is wrong (some published rows need a metric we haven't identified)	Low	Slice 6.7v2 builds metric_published from v4's MEASURES list; if any row can't be expressed as (existing_metric, entity, lob), we add a metric. The 7 are an estimate; could end up being 8 or 9. Architecture handles either.

§10 v1 vs v2 — The Diff Summarized

Dimension	v1 plan	v2 plan
Account category resolution	LEFT(accountno, 1) heuristic	silver.dim_gl_account FK join
Exclusion handling	Column on fct_gl_entry (1:1 only)	silver.gl_exclusions table (1:N)
Identifier shape	22 measure_ids encoding entity + LOB in name	~7 metric_ids + dimension columns
Metric definition source	Hand-written SQL in 19 CTEs in one .sql file	YAML metric definitions in metrics.yml
Materialization	22 per-measure gold tables OR 1 monolithic table (debate)	1 gold.metric_value table (MetricFlow-managed)
Stakeholder bridge	None — measure_id IS the stakeholder name	registry.metric_published (named bridge)
Adding measures #23-95	22 new gold tables × ~1 hour each = ~73 hours of model SQL	~73 new registry rows + occasional new metric (~1-2 hours per truly-new concept) = closer to ~10-20 hours for the lot
Cross-vendor measures (e.g., "revenue per truck")	Would require new gold model joining accounting + fleet	Define new metric referencing both semantic_models; MetricFlow handles join
Discoverability	Read 22 SQL files	SELECT * FROM registry.metric_published
Industry pattern	None — invented locally	dbt MetricFlow / Airbnb Minerva / LookML / Snowflake Semantic Views

§11 Governance Principle: Supabase Is Never The Semantic Source Of Truth

Locking this in now, before Phase 3 dashboard wire-up creates the temptation to define metrics in Supabase migrations.

Layer	Lives in	Owner	Allowed?
Metric definitions (YAML)	`data-pipeline-v5/dbt_project/models/semantic/`	dbt + agents (under PR review)	✅ Canonical
Materialized metric values	`gold.metric_value` in analytics-warehouse-postgres	dbt build (automated)	✅ Canonical
Stakeholder bridge	`registry.metric_published` in analytics-warehouse-postgres	dbt seed + engineer (versioned in repo)	✅ Canonical
App-facing denormalized aggregates	Supabase (Phase 3 sync from warehouse)	Sync script — read from gold, write to Supabase tables	✅ Read-only mirror
NEW metrics defined in Supabase migrations	—	—	❌ Never
Aggregations / business logic in dashboard app code	—	—	❌ Never
Dashboard tile reads metric directly from silver	—	—	❌ Never (always read gold or Supabase mirror)

The rule: if a future dashboard need wants a number that doesn't exist in gold.metric_value, the answer is "add a metric to MetricFlow YAML + republish," NOT "write a SQL view in Supabase" and NOT "compute it in the app's React code." Single source of truth for every business number; one path for adding new ones.

Why this matters: the moment Supabase has its own "revenue" calculation that drifts from gold.metric_value, stakeholders see two different numbers in two different views and trust collapses. Same Minerva lesson Airbnb learned the hard way.

§12 What Ships To Supabase, What Stays In The Warehouse

This is the operational picture. Supabase is the dashboard's read surface; the warehouse is the metric factory. Knowing exactly what crosses the boundary (and what does NOT) is the difference between clean architecture and accidental sprawl.

graph LR subgraph "WAREHOUSE — analytics-warehouse-postgres (Railway, mesh-only)" direction TB W_BR[Bronze
dp5_ledger_bronze.*
dp5_fleet_bronze.*
dp5_crm_bronze.*
~1.4M rows] W_SLD[Silver dims
dim_gl_account
dim_entity, dim_lob
dim_period, dim_holiday] W_SLF[Silver facts
fct_gl_entry
fct_fleet_vehicle
gl_exclusions] W_SEM[Semantic YAML
semantic_models.yml
metrics.yml] W_GD[Gold materialized
gold.metric_value
~1260 rows] W_REG[Registry
registry.metric_published
22 ratified rows] W_BR --> W_SLF W_SLD --> W_SLF W_SLF --> W_SEM W_SEM --> W_GD W_GD --> W_REG end subgraph "SUPABASE — Acme prod (RBAC, dashboard-readable)" direction TB S_GD[mirror_gold.metric_value
mirror — read-only] S_REG[mirror_gold.metric_published
mirror — read-only] S_RBAC[RBAC tables
auth.users, user_roles
existing, untouched] S_OTHER[Other app tables
weekly_reports, etc.
existing, untouched] end subgraph "Sync Pipeline" SYNC[Slice 6.11v2 sync script
warehouse → Supabase
via dlt or psycopg copy] end W_GD -- "ships" --> SYNC W_REG -- "ships" --> SYNC SYNC --> S_GD SYNC --> S_REG style W_BR fill:#3a1f1f,stroke:#f85149 style W_SLD fill:#3a1f1f,stroke:#f85149 style W_SLF fill:#3a1f1f,stroke:#f85149 style W_SEM fill:#3a1f1f,stroke:#f85149 style W_GD fill:#1f5b2a,stroke:#3fb950,stroke-width:3px style W_REG fill:#1f5b2a,stroke:#3fb950,stroke-width:3px style S_GD fill:#1f5b2a,stroke:#3fb950,stroke-width:3px style S_REG fill:#1f5b2a,stroke:#3fb950,stroke-width:3px style S_RBAC fill:#1c2230,stroke:#8b949e style S_OTHER fill:#1c2230,stroke:#8b949e style SYNC fill:#3a2c52,stroke:#bc8cff,stroke-width:3px

The ship/no-ship matrix

Object	Lives in	Ships to Supabase?	Reason
`dp5_ledger_bronze.*` (gl_entries, ap_bills, etc.)	warehouse only	NO	~1M+ rows. Cost prohibitive in Supabase. Not needed by dashboard. Raw vendor data shouldn't sit in app DB (compliance).
`dp5_fleet_bronze.*` (vehicles, status_changes, etc.)	warehouse only	NO	Same reasoning. Bronze is internal-only.
`dp5_crm_bronze.*`	warehouse only	NO	Same reasoning.
`silver.dim_gl_account`	warehouse only	NO	Internal modeling artifact. Dashboard doesn't need raw account categorization — that logic is encoded in metric definitions.
`silver.dim_entity`, `silver.dim_lob`, `silver.dim_period`, `silver.dim_holiday`	warehouse only	NO	Internal modeling. Dashboard reads dimensional values via gold.metric_value's entity_id/lob_id/period columns; doesn't need the dim tables themselves.
`silver.fct_gl_entry`, `silver.fct_fleet_vehicle`	warehouse only	NO	Hundreds of thousands of rows. Not used by dashboard directly. Letting Supabase consume silver would create a back-channel for defining metrics outside the canonical place — violates §11 governance.
`silver.gl_exclusions`	warehouse only	NO	Internal. Exclusion logic is applied IN gold materialization; downstream Supabase only sees the result.
`semantic_models.yml` + `metrics.yml`	git repo (data-pipeline-v5)	NO	YAML files are version-controlled DEFINITIONS, not data. They live in git, get compiled by MetricFlow, and produce gold rows.
dbt seeds (department_reassignments, known_exclusions, etc.)	git repo + warehouse	NO	Internal modeling input. Their effect propagates to gold; the seeds themselves don't need to be queryable from Supabase.
`dbt_artifacts.*` (run history, test results)	warehouse only	NO	Operational telemetry. Could ship to Supabase if QA dashboard wants it, but not in v2 scope.
`gold.metric_value`	warehouse + Supabase mirror	YES — slice 6.11v2	Dashboard reads this. ~1260 rows (7 metrics × 12 periods × dim combos). Trivial sync cost. Source of every number on the dashboard.
`registry.metric_published`	warehouse + Supabase mirror	YES — slice 6.11v2	Dashboard joins this to translate the controller's PDF labels to metric × dimension tuples. ~22 rows now, ~95 at full scale. Trivial cost.
dashboard app's existing tables (`weekly_reports`, `auth.users`, RBAC tables, etc.)	Supabase only	EXISTING — untouched	App-owned state. v2 doesn't change anything here. RBAC + auth + dashboard chrome stays exactly as it is.
future "denormalized aggregates for performance" (Phase 3?)	Supabase only	DEFERRED	If/when dashboard query patterns demand pre-aggregated cubes (e.g., "monthly revenue × LOB matrix" for a chart), we'd build them as MATERIALIZED VIEWS in Supabase derived from the mirror. Same governance: derived from gold, never defines new metrics.

The two rules that keep this clean

Rule 1 — Supabase mirror is read-only. The sync script writes mirror_gold.metric_value + mirror_gold.metric_published on a schedule (or on dbt-build trigger). Nothing in dashboard app code or in Supabase migrations modifies these tables. If a dashboard developer wants to change a metric value, the answer is "edit the metric definition in data-pipeline-v5/dbt_project/models/semantic/metrics.yml, push, dbt builds, sync runs, mirror updates."

Rule 2 — If it's not in the matrix above, it doesn't ship. When a future need surfaces ("can we get raw GL entries in Supabase for X?"), the default answer is NO — the request gets reframed: "what NEW metric do you need in gold that would answer X?" If the answer is "I want to query bronze ad hoc," that's an analyst use case for the warehouse directly (via warehouse-mcp or psql), not a dashboard use case.

Why this matters operationally

Cost containment. Supabase priced by data volume. Mirror is small (≤2K rows for v2 scope). Bronze is large (~1.4M rows). The matrix prevents accidental million-row Supabase loads.
Governance enforcement. §11 said "Supabase is never the semantic source of truth." This matrix is HOW that's enforced — by listing exactly what Supabase contains, anything else is by definition not supposed to be there.
Sync surface stays narrow. Slice 6.11v2's sync script is responsible for 2 tables. Adding a third = explicit code change + governance review. No accidental sprawl.
Dashboard predictability. Dashboard developers know exactly what tables exist in Supabase under mirror_gold. Schema is stable: gold.metric_value's columns don't drift because they're materialized by MetricFlow against versioned YAML.

§13 Decisions to confirm

Three places to confirm or modify

Approve the architecture switch (v1 → v2)? v2 is ~1 hour more work this session but eliminates a refactor that v1 would force at ~50 measures. The cept consultation + stakeholder review surfaced the smells; the right call is to fix shape now, not later.
dbt MetricFlow vs hand-rolled 4-layer? Default recommendation: MetricFlow (it's the open-source standard, runs on-prem, we already use dbt). Fallback if MetricFlow has rough edges on Postgres: hand-roll the same 4-layer pattern as plain dbt models. Architecture is the same; only the implementation tooling changes.
File the dashboard-migrations accounting_parity issue or ignore? Default: file an issue — the workflow has been red 9 days, and even though it's unrelated to this work, leaving a red CI signal in a sibling repo is bad hygiene. ~5 min to file. Or ignore if you've already decided to retire that validator.