Note for readers: This is a real operational planning document from a live engagement, anonymized for publication. Company names, vendor names, region codes, and people have been replaced with placeholders ("Acme", "the accounting system", "the fleet system", "Region A/B", "the controller"). Technical content, architecture, measure counts, and slice plans are unchanged. Published as a case study at eidosagi.com/case-studies/warehouse-planning.

SUPERSEDES silver-gold-22-measures-plan.html (v1, written 90 minutes earlier). Three structural smells were caught in the v1 plan during stakeholder review: fragile leading-digit account join, measure-per-entity identifier explosion, and conflated measure / metric / dimension semantics. v2 reframes the entire gold layer to follow the dbt MetricFlow pattern that Airbnb, Uber, Google, and dbt itself converged on independently.
Silver + Gold for 22 Measures v2 — dbt MetricFlow + Dimensional
Dimensional modeling done correctly: one silver fact, ~7 metric definitions, one materialized gold fact, one stakeholder bridge. Built on the open-source dbt MetricFlow framework — the same pattern Airbnb's Minerva, Uber's uMetric, Google's LookML, and Snowflake's Semantic Views all implement.
Repo
data-pipeline-v5
Branch
silver-gold-22-measures-2026-05-10
Total Estimate
~3.5 hours
Standard
dbt MetricFlow (OSS)

§1 The Four Discoveries That Reshaped The Plan

v1 of this plan was a brute-force port of v4's measure list into per-measure dbt models. Stakeholder review (and a cept consultation) surfaced four issues that compounded into "the wrong shape" — not a wrong implementation, but a wrong architecture. v2 restarts from the right shape.

FIND-1 The Ledger Gold Parity workflow has been red 9 days — but unrelated
Investigated

dashboard-migrations repo's Ledger Gold Parity workflow has been failing every day since 2026-05-02 against commit 184e9a8. Job exits 1 at the validator step in 10 seconds.

Investigation: validator at validation/ledger_excel_parity.py compares fixture JSON (committed snapshots from spreadsheet-lab-v1, the v1 spreadsheet lab) against live ledger_gold.* materialized views in Supabase. This is a different gold layer (v1's, in Supabase) from what we're building (warehouse-postgres). The failure does NOT signal v4 baseline drift. It signals that the v1 lab's fixture-vs-live parity has bit-rotted, which is expected once we stopped maintaining the v1 layer.

Action: file an issue in dashboard-migrations to archive or update the workflow, but it does not gate this plan. Our v4 baseline comes from v4's reconcile output, not from ledger_gold.

FIND-2 Leading-digit join in fct_gl_entry is a fragile heuristic disguised as a dim join
Caught by stakeholder

v1's silver fact joined account categories via LEFT(accountno, 1) = gl_leading_digit. This is string-prefix matching presented as a dimensional join. Three real problems:

  1. Assumes invariants the accounting system doesn't guarantee. 5-digit account numbers, leading digit always classifies. The accounting system could issue a 6-digit account or repurpose a digit and we'd silently miscategorize.
  2. Hides exceptions. Accounts 55110/55120/56130/56140 (Capitalized Truck/Container/Labor) start with 5, which the heuristic says = "direct cost / COGS," but the controller EXCLUDES them from R&M and Total Costs. The exclusions live as per-measure NOT IN clauses scattered across gold model SQL.
  3. Misses the actual dimension. The accounting system's gl_accounts bronze table (already in sources.yml) literally has every account with its chart-of-accounts category. We should join against THAT.

Fix: new silver model silver/dim_gl_account that resolves account → category once, with flags like is_capitalized, is_in_total_costs, is_revenue. Gold/MetricFlow filter expressions then read WHERE is_revenue, not WHERE accountno LIKE '4%'.

FIND-3 measure-per-entity is identifier explosion, not dimensional modeling
Caught by stakeholder

v1 had 22 distinct measure_ids like consolidated_revenue_region_a, disposal_cost_roll_off_region_b, etc. The strings encode three things: the measure (revenue), the entity (region_a), and the controller's stakeholder language ("consolidated").

Why it's wrong:

FIND-4 "Measure" and "metric" are different concepts, and v1 conflated them
Caught by stakeholder

Dimensional modeling vocabulary, as used by every analytics platform:

v1 jammed all four into a single string identifier. That's the category error. v2 separates them.

§2 The Right Architecture (4 Layers)

The architecture below is what every major analytics platform converged on independently. We're catching up to industry, not inventing.

graph TB subgraph "BRONZE — raw vendor data (existing)" B1[dp5_ledger_bronze.gl_entries] B2[dp5_ledger_bronze.gl_accounts] B3[dp5_fleet_bronze.vehicles] end subgraph "SILVER — dims and facts (mostly existing, dim_gl_account is new)" S1[silver.dim_gl_account
NEW: replaces leading-digit heuristic] S2[silver.dim_entity] S3[silver.dim_lob] S4[silver.dim_period] S5[silver.fct_gl_entry
signed_amount with all dims joined] S6[silver.fct_fleet_vehicle] S7[silver.gl_exclusions
NEW: 1:N exclusion table] end subgraph "SEMANTIC — YAML, MetricFlow-managed" SM1[semantic_models.yml
measures + entities + dimensions] SM2[metrics.yml
filter exprs, aggs, sliceable_by] end subgraph "GOLD — materialized values" G1[gold.metric_value
metric_id × period × entity × lob × value] end subgraph "REGISTRY — stakeholder bridge" R1[registry.metric_published
controller PDF row → metric × dim tuple] end B1 --> S5 B2 --> S1 B3 --> S6 S1 --> S5 S2 --> S5 S3 --> S5 S5 --> SM1 S6 --> SM1 SM1 --> SM2 SM2 --> G1 G1 --> R1 style S1 fill:#1f5b2a,stroke:#3fb950 style S7 fill:#1f5b2a,stroke:#3fb950 style SM1 fill:#3a2c52,stroke:#bc8cff style SM2 fill:#3a2c52,stroke:#bc8cff style G1 fill:#1f5b2a,stroke:#3fb950 style R1 fill:#1f5b2a,stroke:#3fb950

Concept-by-concept: how 22 measures collapse to ~7 metrics × dimensions

graph LR subgraph "OLD: 22 measure_ids (v1)" O1[consolidated_revenue_region_a] O2[consolidated_revenue_region_b] O3[consolidated_revenue_total] O4[disposal_cost_region_a] O5[disposal_cost_roll_off_region_a] O6[lob_revenue_residential] O7[rm_cost_region_a] O8[total_costs_cogs_sga_minus_da_region_a] O9[front_line_truck_portable_lob] OX[15 more strings ...] end subgraph "NEW: 7 metrics × 3 dimensions (v2)" N1[metric: revenue] N2[metric: disposal_cost] N3[metric: rm_cost] N4[metric: total_costs] N5[metric: front_line_truck] N6[metric: weeks_paychecks] N7[metric: workdays_per_month] DX[× dim: entity_id] DY[× dim: lob_id] DZ[× dim: period] end O1 --> N1 O2 --> N1 O3 --> N1 O4 --> N2 O5 --> N2 O6 --> N1 O7 --> N3 O8 --> N4 O9 --> N5 style O1 fill:#3a1f1f,stroke:#f85149 style O2 fill:#3a1f1f,stroke:#f85149 style O3 fill:#3a1f1f,stroke:#f85149 style O4 fill:#3a1f1f,stroke:#f85149 style O5 fill:#3a1f1f,stroke:#f85149 style O6 fill:#3a1f1f,stroke:#f85149 style O7 fill:#3a1f1f,stroke:#f85149 style O8 fill:#3a1f1f,stroke:#f85149 style O9 fill:#3a1f1f,stroke:#f85149 style OX fill:#3a1f1f,stroke:#f85149 style N1 fill:#1f5b2a,stroke:#3fb950 style N2 fill:#1f5b2a,stroke:#3fb950 style N3 fill:#1f5b2a,stroke:#3fb950 style N4 fill:#1f5b2a,stroke:#3fb950 style N5 fill:#1f5b2a,stroke:#3fb950 style N6 fill:#1f5b2a,stroke:#3fb950 style N7 fill:#1f5b2a,stroke:#3fb950

The data flow, end to end

sequenceDiagram participant V as Vendor APIs participant DLT as dlt (data-pipeline-v5) participant B as Bronze (warehouse) participant DBT as dbt participant S as Silver participant MF as MetricFlow participant G as Gold participant R as Registry participant TEST as dbt tests V->>DLT: extract DLT->>B: dp5_ledger_bronze.* B->>DBT: source DBT->>S: build dim_gl_account, fct_gl_entry, gl_exclusions S->>MF: semantic_models read silver tables MF->>DBT: compile metrics → SQL DBT->>G: materialize gold.metric_value G->>R: registry.metric_published joins gold for ratification R->>TEST: dbt test: every published row within ±$0.10 of v4 TEST-->>DBT: pass / fail Note over G,R: Dashboard reads registry + gold by JOIN

§3 Industry Justification — Why This Is The Right Pattern

This isn't a team-invented pattern or a company-specific compromise. Every serious analytics platform converged on this independently because the alternatives don't scale.

graph LR subgraph "Companies (internal platforms)" AB[Airbnb
Minerva] UB[Uber
uMetric] LK[LinkedIn
Unified Metrics Platform] NF[Netflix
Metric Catalog] end subgraph "Vendors (products)" LM[Looker
LookML measures] DM[dbt MetricFlow
OSS - what we'll use] SF[Snowflake
Semantic Views 2024] DB[Databricks
Unity Catalog semantics] CB[Cube.dev
OSS] end subgraph "Common pattern" P1[1\. central metric registry] P2[2\. measures + dimensions modeled separately] P3[3\. metrics = measure + filter + agg + slice-by] P4[4\. materialized cubes downstream] P5[5\. stakeholder names map TO metric × dim tuples] end AB --> P1 UB --> P1 LK --> P1 NF --> P1 LM --> P1 DM --> P1 SF --> P1 DB --> P1 CB --> P1 style DM fill:#1f5b2a,stroke:#3fb950,stroke-width:3px style P1 fill:#3a2c52,stroke:#bc8cff style P2 fill:#3a2c52,stroke:#bc8cff style P3 fill:#3a2c52,stroke:#bc8cff style P4 fill:#3a2c52,stroke:#bc8cff style P5 fill:#3a2c52,stroke:#bc8cff

Why every one of them did it (the forcing functions)

Forcing functionReal-world example
Metric consistency Airbnb pre-Minerva: ~5 different "DAU" calculations across teams, all called "DAU," all giving different numbers. Boards getting conflicting reports. Minerva's whole purpose was forcing ONE definition per metric. Acme equivalent waits in the wings: when Marketing builds a "revenue" view that includes one-off Other Income but Finance builds one that excludes it, both called "revenue," trust breaks.
Discoverability at scale Uber's uMetric catalog has ~15,000 metrics. "What metrics exist?" must be a query against a registry table, not ls **/*.sql. At Acme's eventual scale (95 → 500+ over next 24 months as the next region comes online + new vendors land), a registry is mandatory.
Versioning + lineage LinkedIn UMP: every metric definition is versioned. When a definition changes, downstream consumers are notified, dashboards display the version they reference, comparisons across versions are queryable. SQL files in a git log don't have this.
Combinatorial scale Looker LookML: dimensions and measures are ORTHOGONAL. You don't write revenue_by_region_by_quarter; you define revenue and region and quarter, then queries combine them at runtime. Hand-writing per-metric-per-dimension SQL doesn't scale past about 30 distinct metric × dim combinations.
Materialization strategy Netflix Metric Catalog: hot queries hit pre-aggregated cubes; cold queries run on-demand against silver. The catalog decides materialization based on query patterns. Per-metric tables can't be optimized this way — you've already committed to a fixed materialization shape.
★ The Airbnb Minerva post-mortem (paraphrased)

Airbnb built Minerva (~2018) specifically to migrate AWAY from per-team-per-metric SQL files. Their pre-Minerva state was effectively consolidated_revenue_region_a × 5,000 — every team had its own variant of the same metric, scattered across SQL files, all subtly different.

The migration took years and was expensive precisely because they had let the wrong shape compound. Acme catches this at 22 measures, not 5,000. This is the moment to fix the shape.

§4 dbt MetricFlow — The Implementation Standard We'll Use

dbt MetricFlow is the open-source implementation of this pattern, designed to live inside an existing dbt project. We're already using dbt; adding MetricFlow is a pip install dbt-metricflow + YAML files. No new infrastructure.

⌬ What runs on-prem vs in dbt Cloud

Open source (we use this): the dbt-metricflow Python library — compiles semantic_models + metrics YAML into SQL, materializes via dbt build. Runs anywhere dbt Core runs (our existing Dagster setup).

dbt Cloud only (we don't need): the hosted Semantic Layer API (REST/GraphQL endpoints for BI tools to query metrics dynamically). Our consumers (dashboard) query Postgres tables directly — no REST API needed.

Note on a contradiction in published commentary: some recent articles (e.g., Medium / Reliable Data Engineering, Apr 2026) state that the dbt Semantic Layer "requires dbt Cloud (no dbt Core option)." This conflates the dbt Semantic Layer SERVICE (the hosted Cloud-only API surface) with the dbt-metricflow LIBRARY (open source, runs with dbt Core). For our pattern — materialize to Postgres tables, dashboard reads tables directly — we only need the OSS library. The Cloud-only restriction does not apply.

⚠ Critical: MATERIALIZE, do not dynamic-compute

MetricFlow defaults to dynamic SQL generation — every query against the Semantic Layer compiles fresh SQL and runs it against silver. That's fine for ad-hoc analyst queries; it's wrong for a dashboard hit by every page load.

Slice 6.8v2 must materialize gold.metric_value as a real dbt-built table on every dbt run. The dashboard reads the materialized table; it NEVER queries MetricFlow's compile-on-demand path. Pattern (per dbt MetricFlow docs):

dbt build --select +metric:revenue +metric:disposal_cost ...
# materializes gold.metric_value with all combinations

This decision is locked in: Phase 3 dashboard wiring queries materialized tables only. Dynamic-compute path is reserved for ad-hoc analyst use, never for user-facing reads.

How it works (data + control flow)

flowchart TD Y1[semantic_models.yml
name: gl_entries
measures: signed_amount
entities: entity_id, lob_id
dimensions: period] Y2[metrics.yml
name: revenue
type: simple
filter: is_revenue = true
agg: sum] CMD[dbt build
--select +metric:revenue] COMP[MetricFlow compiles:
SELECT period, entity_id, SUM signed_amount
FROM silver.fct_gl_entry
JOIN silver.dim_gl_account
WHERE is_revenue
GROUP BY 1, 2] MAT[(gold.metric_value
materialized table)] DB[(analytics-warehouse-postgres)] Y1 --> CMD Y2 --> CMD CMD --> COMP COMP --> MAT MAT --> DB style Y1 fill:#3a2c52,stroke:#bc8cff style Y2 fill:#3a2c52,stroke:#bc8cff style CMD fill:#1c2230,stroke:#58a6ff style COMP fill:#1c2230,stroke:#58a6ff style MAT fill:#1f5b2a,stroke:#3fb950

What our YAML will look like

One semantic_model file (the GL fact):

semantic_models:
  - name: gl_entry
    description: "GL line items with all dimensions resolved."
    model: ref('fct_gl_entry')

    entities:
      - name: entity        # region_a | region_b
        type: foreign
        expr: entity_id
      - name: lob           # roll_off | fec | residential | portable
        type: foreign
        expr: lob_id

    dimensions:
      - name: period
        type: time
        type_params:
          time_granularity: month
        expr: period
      - name: account_category
        type: categorical
      - name: is_capitalized
        type: categorical
      - name: is_revenue
        type: categorical

    measures:
      - name: signed_amount
        agg: sum
        expr: signed_amount
        agg_time_dimension: period

One metrics file (the 7 business metrics):

metrics:
  - name: revenue
    description: "Total revenue (4xxxx accounts), Acme-side. Negate for display."
    type: simple
    type_params:
      measure: signed_amount
    filter: "{{ Dimension('gl_entry__is_revenue') }} = true"
    # Multiply by -1 in the published view (revenue is stored as credits).

  - name: disposal_cost
    description: "Disposal cost (53xxx accounts)."
    type: simple
    type_params:
      measure: signed_amount
    filter: "{{ Dimension('gl_entry__account_category') }} = 'disposal'"

  - name: rm_cost
    description: "R&M (55xxx + 56xxx + 57xxx, EXCLUDING capitalized)."
    type: simple
    type_params:
      measure: signed_amount
    filter: |
      {{ Dimension('gl_entry__account_category') }} IN ('truck_rm', 'container_rm', 'portable_rm')
      AND {{ Dimension('gl_entry__is_capitalized') }} = false

  - name: total_costs_cogs_sga_minus_da
    description: "COGS + SG&A − D&A (5xxxx + 6xxxx, excluding D&A and Capitalized)."
    type: simple
    type_params:
      measure: signed_amount
    filter: |
      {{ Dimension('gl_entry__is_in_total_costs') }} = true

Then dbt build --select +metric:revenue compiles these into SQL, materializes the rows, and puts them in gold.metric_value.

The stakeholder bridge — how the controller's PDF rows map to metric × dim tuples

-- registry.metric_published (seed, then evolved as we ratify more)
-- One row per controller-published metric. The bridge from stakeholder language
-- to dimensional language.

published_id, published_label, source_doc, metric_id, entity_id, lob_id, ratified
"M001", "Consolidated Revenue Region A", "controller_2026_02_11.pdf", "revenue", "region_a", NULL, true
"M002", "Consolidated Revenue Region B", "controller_2026_02_11.pdf", "revenue", "region_b", NULL, true
"M003", "Consolidated Revenue Total", "controller_2026_02_11.pdf", "revenue", NULL, NULL, true
"M004", "Disposal Cost Roll Off Region A", "controller_2026_02_11.pdf", "disposal_cost", "region_a", "roll_off", true
"M005", "R&M Cost Region A", "controller_2026_02_11.pdf", "rm_cost", "region_a", NULL, true
... 17 more ratified, 73 unratified (ratified=false)

Dashboard query becomes trivial:

SELECT p.published_label, g.value
FROM registry.metric_published p
JOIN gold.metric_value g
  ON g.metric_id = p.metric_id
  AND COALESCE(g.entity_id, '') = COALESCE(p.entity_id, '')
  AND COALESCE(g.lob_id, '') = COALESCE(p.lob_id, '')
WHERE g.period = '2025-12'
  AND p.ratified = true
ORDER BY p.published_id;

§5 What Gets Reverted (Slices 6.2 + 6.3)

The SQL logic in slices 6.2 + 6.3 (filter expressions, exclusion handling, reassignment overrides) ports directly into MetricFlow filter expressions and silver dim flags. The structure they encode (per-measure tables, column-based exclusions, leading-digit join) is wrong and gets rebuilt.

6.2 silver.fct_gl_entry (v1 shape)
Revert (commit a5276a6)

Logic ports: dimension joins, signed_amount calculation, date parsing, reassignment overlay. Structure changes: remove the leading-digit account_category join (replaced by dim_gl_account FK); remove the excluded_from_measure column (replaced by silver.gl_exclusions). The keystone-fact pattern is preserved; the wiring is corrected.

6.3 gold.measures_accounting_monthly (v1 shape)
Revert (commit 5bb856f)

19 CTEs unioned into one monolithic table with measure-per-entity identifiers. Replaced entirely: the 19 filter expressions become 7 metrics in metrics.yml, the 22 published combinations become 22 rows in registry.metric_published, and gold.metric_value is materialized once by MetricFlow. The generate_schema_name.sql macro stays (still useful — gold lands in gold. schema).

6.1 v4 mappings as v5 seeds (KEPT)
Stays committed (396e6c3)

All 5 seeds (department_reassignments, known_exclusions, fleet_vehicle_type_to_lob, dim_period, dim_holiday) plus their schema.yml entries remain valid. They're consumed by the new silver dims and by MetricFlow filter expressions exactly as they were in v1. No revert needed.

§6 The Revised Slice Plan (v2)

Slice What it builds Estimate Cumulative Status
6.1 v4 mappings as v5 seeds — 5 CSVs + schema.yml 30 min 0:30 Done
6.0r Revert slices 6.2 + 6.3 cleanly (git revert + verify branch state) 10 min 0:40 Next
6.2v2 silver/dim_gl_account.sql — joins gl_accounts bronze + account_categories seed, derives is_capitalized + is_in_total_costs flags 30 min 1:10 Next
6.3v2 silver/fct_gl_entry.sql (rewritten) — joins dim_gl_account by accountno (not leading-digit), drops excluded_from_measure column. Plus silver/gl_exclusions.sql (1:N table) 45 min 1:55 Next
6.4v2 Add dbt-metricflow to v5 dependencies; create models/semantic/semantic_models.yml + metrics.yml with 7 metric definitions 45 min 2:40 Next
6.5v2 silver/fct_fleet_vehicle.sql + fleet semantic_model + front_line_truck metric 25 min 3:05 Next
6.6v2 Calendar metrics (weeks_paychecks, workdays_per_month) — semantic_model on dim_period, two metrics 20 min 3:25 Next
6.7v2 seeds/metric_published.csv — 22 ratified rows mapping the controller's PDF labels → (metric_id, entity_id, lob_id) tuples 20 min 3:45 Next
6.8v2 Materialize gold.metric_value via dbt build --select +metrics; wire to Dagster asset graph 25 min 4:10 Next
6.9v2 dbt tests: every published row matches v4 within ±$0.10; dim_period freshness warning; row counts; not_null on value 35 min 4:45 Next
6.10v2 Push branch + open PR + trigger Dagster run + verify in warehouse 25 min 5:10 Next
6.11v2 Sync gold.metric_value + registry.metric_published from warehouse to Supabase (read-only mirror). Dashboard reads Supabase tables directly. 35 min 5:45 Next (CRITICAL — dashboard depends on this)
⏱ Net time vs v1 plan
v1 estimated ~3:30 to "22 measures in per-measure gold tables in warehouse only." v2 estimates ~5:45 to "1 silver fact + 7 metrics + 22 registry rows + 1 materialized gold fact + passing dbt tests + Supabase mirror that the dashboard can query." Adds ~2 hours for a fundamentally better architecture that scales to 95 metrics without rewrite AND completes the dashboard-readable surface in the same body of work. Worth it.
⚠ Why slice 6.11v2 is non-optional
Earlier draft of this plan deferred the Supabase sync as "Phase 2." That was wrong. The dashboard can't query analytics-warehouse-postgres directly (mesh-only). Without slice 6.11v2, gold tables exist but no human ever sees them. Phase 2 (dashboard wire-up) and the Supabase sync are the same step; collapsing them avoids a 30-day "v2 is built but invisible" gap.

§7 Definition of Done (v2)

The validation queries (post-deploy smoke test)

-- 1. Metric definitions are intact: 7 metrics
SELECT metric_id, COUNT(*) AS rows_materialized
FROM gold.metric_value
GROUP BY metric_id
ORDER BY metric_id;
-- expect 7 rows

-- 2. Published metrics are intact: 22 ratified rows
SELECT COUNT(*) FROM registry.metric_published WHERE ratified = true;
-- expect 22

-- 3. Every published metric has corresponding gold value for Dec 2025
SELECT p.published_label, g.value, p.ratified
FROM registry.metric_published p
LEFT JOIN gold.metric_value g
  ON g.metric_id = p.metric_id
  AND COALESCE(g.entity_id, '') = COALESCE(p.entity_id, '')
  AND COALESCE(g.lob_id, '') = COALESCE(p.lob_id, '')
WHERE g.period = '2025-12' AND p.ratified = true
ORDER BY p.published_id;
-- expect 22 rows, no NULL values

-- 4. Sentinel cents-precision check
SELECT g.value
FROM registry.metric_published p
JOIN gold.metric_value g USING (metric_id)
WHERE p.published_label = 'Consolidated Revenue Region B'
  AND g.entity_id = 'region_b' AND g.period = '2025-12';
-- expect 872850.23 (matches the controller's December PDF to the cent)

§8 What This Plan Explicitly Does NOT Do

§9 Risks Specific To MetricFlow Adoption

Risk Likelihood Mitigation
dbt-metricflow Postgres adapter has rough edges (newer than Snowflake/BigQuery support) Medium If we hit a blocker, fall back to hand-rolling the 4-layer pattern as plain dbt models (the same shape, manually compiled). MetricFlow is preferred because it's the standard, but the architecture works either way.
Compiled MetricFlow SQL doesn't honor known_exclusions cleanly Medium Encode exclusions as silver.gl_exclusions with a LEFT JOIN ... IS NULL pattern in fct_gl_entry. MetricFlow then filters against the resolved silver table, not against bronze. Tested in slice 6.3v2 before defining metrics.
The 'all_4xxx_revenue' wildcard exclusion (Sep 2025 HTN $82,932.50) doesn't fit the standard exclusion pattern Medium This is one specific row excluded from ALL revenue metrics for a specific period. Encode as: gl_exclusions row with measure_id='all_revenue' + period filter; MetricFlow's revenue metric definition adds a JOIN to exclude any (recordno, period) tuple in that exclusion class.
analytics-warehouse-postgres connection blocked from local agent (mesh-only) High (already encountered) Build all SQL/YAML locally; deploy + verify via Dagster (slice 6.10v2). Same approach as v1; well-understood.
v4 reconcile output (the test fixture) has Decimal precision that drifts from Postgres numeric round-trip Low Test tolerance is ±$0.10, well above any precision drift. Investigate only if a measure misses by >$0.50.
The 7-metric assumption is wrong (some published rows need a metric we haven't identified) Low Slice 6.7v2 builds metric_published from v4's MEASURES list; if any row can't be expressed as (existing_metric, entity, lob), we add a metric. The 7 are an estimate; could end up being 8 or 9. Architecture handles either.

§10 v1 vs v2 — The Diff Summarized

Dimensionv1 planv2 plan
Account category resolution LEFT(accountno, 1) heuristic silver.dim_gl_account FK join
Exclusion handling Column on fct_gl_entry (1:1 only) silver.gl_exclusions table (1:N)
Identifier shape 22 measure_ids encoding entity + LOB in name ~7 metric_ids + dimension columns
Metric definition source Hand-written SQL in 19 CTEs in one .sql file YAML metric definitions in metrics.yml
Materialization 22 per-measure gold tables OR 1 monolithic table (debate) 1 gold.metric_value table (MetricFlow-managed)
Stakeholder bridge None — measure_id IS the stakeholder name registry.metric_published (named bridge)
Adding measures #23-95 22 new gold tables × ~1 hour each = ~73 hours of model SQL ~73 new registry rows + occasional new metric (~1-2 hours per truly-new concept) = closer to ~10-20 hours for the lot
Cross-vendor measures (e.g., "revenue per truck") Would require new gold model joining accounting + fleet Define new metric referencing both semantic_models; MetricFlow handles join
Discoverability Read 22 SQL files SELECT * FROM registry.metric_published
Industry pattern None — invented locally dbt MetricFlow / Airbnb Minerva / LookML / Snowflake Semantic Views

§11 Governance Principle: Supabase Is Never The Semantic Source Of Truth

Locking this in now, before Phase 3 dashboard wire-up creates the temptation to define metrics in Supabase migrations.

LayerLives inOwnerAllowed?
Metric definitions (YAML) data-pipeline-v5/dbt_project/models/semantic/ dbt + agents (under PR review) ✅ Canonical
Materialized metric values gold.metric_value in analytics-warehouse-postgres dbt build (automated) ✅ Canonical
Stakeholder bridge registry.metric_published in analytics-warehouse-postgres dbt seed + engineer (versioned in repo) ✅ Canonical
App-facing denormalized aggregates Supabase (Phase 3 sync from warehouse) Sync script — read from gold, write to Supabase tables ✅ Read-only mirror
NEW metrics defined in Supabase migrations ❌ Never
Aggregations / business logic in dashboard app code ❌ Never
Dashboard tile reads metric directly from silver ❌ Never (always read gold or Supabase mirror)

The rule: if a future dashboard need wants a number that doesn't exist in gold.metric_value, the answer is "add a metric to MetricFlow YAML + republish," NOT "write a SQL view in Supabase" and NOT "compute it in the app's React code." Single source of truth for every business number; one path for adding new ones.

Why this matters: the moment Supabase has its own "revenue" calculation that drifts from gold.metric_value, stakeholders see two different numbers in two different views and trust collapses. Same Minerva lesson Airbnb learned the hard way.

§12 What Ships To Supabase, What Stays In The Warehouse

This is the operational picture. Supabase is the dashboard's read surface; the warehouse is the metric factory. Knowing exactly what crosses the boundary (and what does NOT) is the difference between clean architecture and accidental sprawl.

graph LR subgraph "WAREHOUSE — analytics-warehouse-postgres (Railway, mesh-only)" direction TB W_BR[Bronze
dp5_ledger_bronze.*
dp5_fleet_bronze.*
dp5_crm_bronze.*
~1.4M rows] W_SLD[Silver dims
dim_gl_account
dim_entity, dim_lob
dim_period, dim_holiday] W_SLF[Silver facts
fct_gl_entry
fct_fleet_vehicle
gl_exclusions] W_SEM[Semantic YAML
semantic_models.yml
metrics.yml] W_GD[Gold materialized
gold.metric_value
~1260 rows] W_REG[Registry
registry.metric_published
22 ratified rows] W_BR --> W_SLF W_SLD --> W_SLF W_SLF --> W_SEM W_SEM --> W_GD W_GD --> W_REG end subgraph "SUPABASE — Acme prod (RBAC, dashboard-readable)" direction TB S_GD[mirror_gold.metric_value
mirror — read-only] S_REG[mirror_gold.metric_published
mirror — read-only] S_RBAC[RBAC tables
auth.users, user_roles
existing, untouched] S_OTHER[Other app tables
weekly_reports, etc.
existing, untouched] end subgraph "Sync Pipeline" SYNC[Slice 6.11v2 sync script
warehouse → Supabase
via dlt or psycopg copy] end W_GD -- "ships" --> SYNC W_REG -- "ships" --> SYNC SYNC --> S_GD SYNC --> S_REG style W_BR fill:#3a1f1f,stroke:#f85149 style W_SLD fill:#3a1f1f,stroke:#f85149 style W_SLF fill:#3a1f1f,stroke:#f85149 style W_SEM fill:#3a1f1f,stroke:#f85149 style W_GD fill:#1f5b2a,stroke:#3fb950,stroke-width:3px style W_REG fill:#1f5b2a,stroke:#3fb950,stroke-width:3px style S_GD fill:#1f5b2a,stroke:#3fb950,stroke-width:3px style S_REG fill:#1f5b2a,stroke:#3fb950,stroke-width:3px style S_RBAC fill:#1c2230,stroke:#8b949e style S_OTHER fill:#1c2230,stroke:#8b949e style SYNC fill:#3a2c52,stroke:#bc8cff,stroke-width:3px

The ship/no-ship matrix

Object Lives in Ships to Supabase? Reason
dp5_ledger_bronze.* (gl_entries, ap_bills, etc.) warehouse only NO ~1M+ rows. Cost prohibitive in Supabase. Not needed by dashboard. Raw vendor data shouldn't sit in app DB (compliance).
dp5_fleet_bronze.* (vehicles, status_changes, etc.) warehouse only NO Same reasoning. Bronze is internal-only.
dp5_crm_bronze.* warehouse only NO Same reasoning.
silver.dim_gl_account warehouse only NO Internal modeling artifact. Dashboard doesn't need raw account categorization — that logic is encoded in metric definitions.
silver.dim_entity, silver.dim_lob, silver.dim_period, silver.dim_holiday warehouse only NO Internal modeling. Dashboard reads dimensional values via gold.metric_value's entity_id/lob_id/period columns; doesn't need the dim tables themselves.
silver.fct_gl_entry, silver.fct_fleet_vehicle warehouse only NO Hundreds of thousands of rows. Not used by dashboard directly. Letting Supabase consume silver would create a back-channel for defining metrics outside the canonical place — violates §11 governance.
silver.gl_exclusions warehouse only NO Internal. Exclusion logic is applied IN gold materialization; downstream Supabase only sees the result.
semantic_models.yml + metrics.yml git repo (data-pipeline-v5) NO YAML files are version-controlled DEFINITIONS, not data. They live in git, get compiled by MetricFlow, and produce gold rows.
dbt seeds (department_reassignments, known_exclusions, etc.) git repo + warehouse NO Internal modeling input. Their effect propagates to gold; the seeds themselves don't need to be queryable from Supabase.
dbt_artifacts.* (run history, test results) warehouse only NO Operational telemetry. Could ship to Supabase if QA dashboard wants it, but not in v2 scope.
gold.metric_value warehouse + Supabase mirror YES — slice 6.11v2 Dashboard reads this. ~1260 rows (7 metrics × 12 periods × dim combos). Trivial sync cost. Source of every number on the dashboard.
registry.metric_published warehouse + Supabase mirror YES — slice 6.11v2 Dashboard joins this to translate the controller's PDF labels to metric × dimension tuples. ~22 rows now, ~95 at full scale. Trivial cost.
dashboard app's existing tables (weekly_reports, auth.users, RBAC tables, etc.) Supabase only EXISTING — untouched App-owned state. v2 doesn't change anything here. RBAC + auth + dashboard chrome stays exactly as it is.
future "denormalized aggregates for performance" (Phase 3?) Supabase only DEFERRED If/when dashboard query patterns demand pre-aggregated cubes (e.g., "monthly revenue × LOB matrix" for a chart), we'd build them as MATERIALIZED VIEWS in Supabase derived from the mirror. Same governance: derived from gold, never defines new metrics.

The two rules that keep this clean

Rule 1 — Supabase mirror is read-only. The sync script writes mirror_gold.metric_value + mirror_gold.metric_published on a schedule (or on dbt-build trigger). Nothing in dashboard app code or in Supabase migrations modifies these tables. If a dashboard developer wants to change a metric value, the answer is "edit the metric definition in data-pipeline-v5/dbt_project/models/semantic/metrics.yml, push, dbt builds, sync runs, mirror updates."

Rule 2 — If it's not in the matrix above, it doesn't ship. When a future need surfaces ("can we get raw GL entries in Supabase for X?"), the default answer is NO — the request gets reframed: "what NEW metric do you need in gold that would answer X?" If the answer is "I want to query bronze ad hoc," that's an analyst use case for the warehouse directly (via warehouse-mcp or psql), not a dashboard use case.

Why this matters operationally

§13 Decisions to confirm

Three places to confirm or modify

  1. Approve the architecture switch (v1 → v2)? v2 is ~1 hour more work this session but eliminates a refactor that v1 would force at ~50 measures. The cept consultation + stakeholder review surfaced the smells; the right call is to fix shape now, not later.
  2. dbt MetricFlow vs hand-rolled 4-layer? Default recommendation: MetricFlow (it's the open-source standard, runs on-prem, we already use dbt). Fallback if MetricFlow has rough edges on Postgres: hand-roll the same 4-layer pattern as plain dbt models. Architecture is the same; only the implementation tooling changes.
  3. File the dashboard-migrations accounting_parity issue or ignore? Default: file an issue — the workflow has been red 9 days, and even though it's unrelated to this work, leaving a red CI signal in a sibling repo is bad hygiene. ~5 min to file. Or ignore if you've already decided to retire that validator.