Note for readers: This is a real operational planning document from a live engagement, anonymized for publication. Company names, vendor names, region codes, and people have been replaced with placeholders ("Acme", "the accounting system", "the fleet system", "Region A/B", "the controller"). Technical content, architecture, measure counts, and slice plans are unchanged. Published as a case study at eidosagi.com/case-studies/warehouse-planning.
silver-gold-22-measures-plan.html (v1, written 90 minutes earlier).
Three structural smells were caught in the v1 plan during stakeholder review:
fragile leading-digit account join, measure-per-entity identifier explosion,
and conflated measure / metric / dimension semantics. v2 reframes the entire
gold layer to follow the dbt MetricFlow pattern that Airbnb, Uber, Google,
and dbt itself converged on independently.
v1 of this plan was a brute-force port of v4's measure list into per-measure dbt models. Stakeholder review (and a cept consultation) surfaced four issues that compounded into "the wrong shape" — not a wrong implementation, but a wrong architecture. v2 restarts from the right shape.
dashboard-migrations repo's Ledger Gold Parity workflow has
been failing every day since 2026-05-02 against commit 184e9a8. Job
exits 1 at the validator step in 10 seconds.
Investigation: validator at validation/ledger_excel_parity.py
compares fixture JSON (committed snapshots from spreadsheet-lab-v1,
the v1 spreadsheet lab) against live ledger_gold.* materialized
views in Supabase. This is a different gold layer (v1's, in Supabase) from
what we're building (warehouse-postgres). The failure does NOT signal v4 baseline
drift. It signals that the v1 lab's fixture-vs-live parity has bit-rotted, which
is expected once we stopped maintaining the v1 layer.
Action: file an issue in dashboard-migrations to
archive or update the workflow, but it does not gate this plan. Our v4 baseline
comes from v4's reconcile output, not from ledger_gold.
v1's silver fact joined account categories via LEFT(accountno, 1) = gl_leading_digit.
This is string-prefix matching presented as a dimensional join. Three real problems:
gl_accounts
bronze table (already in sources.yml) literally has every account with its
chart-of-accounts category. We should join against THAT.
Fix: new silver model silver/dim_gl_account that
resolves account → category once, with flags like is_capitalized,
is_in_total_costs, is_revenue. Gold/MetricFlow filter
expressions then read WHERE is_revenue, not
WHERE accountno LIKE '4%'.
v1 had 22 distinct measure_ids like consolidated_revenue_region_a,
disposal_cost_roll_off_region_b, etc. The strings encode three things:
the measure (revenue), the entity (region_a), and the controller's stakeholder language ("consolidated").
Why it's wrong:
consolidated_revenue_total IS
revenue rolled across entity_id values. Treating it as
a separate measure makes the relationship invisible to the model.Dimensional modeling vocabulary, as used by every analytics platform:
signed_amount. ONE measure.v1 jammed all four into a single string identifier. That's the category error. v2 separates them.
The architecture below is what every major analytics platform converged on independently. We're catching up to industry, not inventing.
This isn't a team-invented pattern or a company-specific compromise. Every serious analytics platform converged on this independently because the alternatives don't scale.
| Forcing function | Real-world example |
|---|---|
| Metric consistency | Airbnb pre-Minerva: ~5 different "DAU" calculations across teams, all called "DAU," all giving different numbers. Boards getting conflicting reports. Minerva's whole purpose was forcing ONE definition per metric. Acme equivalent waits in the wings: when Marketing builds a "revenue" view that includes one-off Other Income but Finance builds one that excludes it, both called "revenue," trust breaks. |
| Discoverability at scale | Uber's uMetric catalog has ~15,000 metrics. "What metrics exist?" must be a
query against a registry table, not ls **/*.sql. At Acme's eventual
scale (95 → 500+ over next 24 months as the next region comes online + new vendors land),
a registry is mandatory. |
| Versioning + lineage | LinkedIn UMP: every metric definition is versioned. When a definition changes, downstream consumers are notified, dashboards display the version they reference, comparisons across versions are queryable. SQL files in a git log don't have this. |
| Combinatorial scale | Looker LookML: dimensions and measures are ORTHOGONAL. You don't write
revenue_by_region_by_quarter; you define revenue and
region and quarter, then queries combine them at runtime.
Hand-writing per-metric-per-dimension SQL doesn't scale past about 30 distinct
metric × dim combinations. |
| Materialization strategy | Netflix Metric Catalog: hot queries hit pre-aggregated cubes; cold queries run on-demand against silver. The catalog decides materialization based on query patterns. Per-metric tables can't be optimized this way — you've already committed to a fixed materialization shape. |
Airbnb built Minerva (~2018) specifically to migrate AWAY from per-team-per-metric
SQL files. Their pre-Minerva state was effectively
consolidated_revenue_region_a × 5,000 — every team had its own
variant of the same metric, scattered across SQL files, all subtly different.
The migration took years and was expensive precisely because they had let the wrong shape compound. Acme catches this at 22 measures, not 5,000. This is the moment to fix the shape.
dbt MetricFlow is the open-source implementation of this pattern, designed to live
inside an existing dbt project. We're already using dbt; adding MetricFlow is a
pip install dbt-metricflow + YAML files. No new infrastructure.
Open source (we use this): the dbt-metricflow Python
library — compiles semantic_models + metrics YAML into SQL, materializes via
dbt build. Runs anywhere dbt Core runs (our existing Dagster setup).
dbt Cloud only (we don't need): the hosted Semantic Layer API (REST/GraphQL endpoints for BI tools to query metrics dynamically). Our consumers (dashboard) query Postgres tables directly — no REST API needed.
Note on a contradiction in published commentary: some recent
articles (e.g., Medium / Reliable Data Engineering, Apr 2026) state that
the dbt Semantic Layer "requires dbt Cloud (no dbt Core option)." This conflates
the dbt Semantic Layer SERVICE (the hosted Cloud-only API surface) with the
dbt-metricflow LIBRARY (open source, runs with dbt Core). For our
pattern — materialize to Postgres tables, dashboard reads tables directly — we
only need the OSS library. The Cloud-only restriction does not apply.
MetricFlow defaults to dynamic SQL generation — every query against the Semantic Layer compiles fresh SQL and runs it against silver. That's fine for ad-hoc analyst queries; it's wrong for a dashboard hit by every page load.
Slice 6.8v2 must materialize gold.metric_value as a real dbt-built
table on every dbt run. The dashboard reads the materialized table; it
NEVER queries MetricFlow's compile-on-demand path. Pattern (per dbt MetricFlow
docs):
dbt build --select +metric:revenue +metric:disposal_cost ...
# materializes gold.metric_value with all combinations
This decision is locked in: Phase 3 dashboard wiring queries materialized tables only. Dynamic-compute path is reserved for ad-hoc analyst use, never for user-facing reads.
One semantic_model file (the GL fact):
semantic_models:
- name: gl_entry
description: "GL line items with all dimensions resolved."
model: ref('fct_gl_entry')
entities:
- name: entity # region_a | region_b
type: foreign
expr: entity_id
- name: lob # roll_off | fec | residential | portable
type: foreign
expr: lob_id
dimensions:
- name: period
type: time
type_params:
time_granularity: month
expr: period
- name: account_category
type: categorical
- name: is_capitalized
type: categorical
- name: is_revenue
type: categorical
measures:
- name: signed_amount
agg: sum
expr: signed_amount
agg_time_dimension: period
One metrics file (the 7 business metrics):
metrics:
- name: revenue
description: "Total revenue (4xxxx accounts), Acme-side. Negate for display."
type: simple
type_params:
measure: signed_amount
filter: "{{ Dimension('gl_entry__is_revenue') }} = true"
# Multiply by -1 in the published view (revenue is stored as credits).
- name: disposal_cost
description: "Disposal cost (53xxx accounts)."
type: simple
type_params:
measure: signed_amount
filter: "{{ Dimension('gl_entry__account_category') }} = 'disposal'"
- name: rm_cost
description: "R&M (55xxx + 56xxx + 57xxx, EXCLUDING capitalized)."
type: simple
type_params:
measure: signed_amount
filter: |
{{ Dimension('gl_entry__account_category') }} IN ('truck_rm', 'container_rm', 'portable_rm')
AND {{ Dimension('gl_entry__is_capitalized') }} = false
- name: total_costs_cogs_sga_minus_da
description: "COGS + SG&A − D&A (5xxxx + 6xxxx, excluding D&A and Capitalized)."
type: simple
type_params:
measure: signed_amount
filter: |
{{ Dimension('gl_entry__is_in_total_costs') }} = true
Then dbt build --select +metric:revenue compiles these into SQL,
materializes the rows, and puts them in gold.metric_value.
-- registry.metric_published (seed, then evolved as we ratify more)
-- One row per controller-published metric. The bridge from stakeholder language
-- to dimensional language.
published_id, published_label, source_doc, metric_id, entity_id, lob_id, ratified
"M001", "Consolidated Revenue Region A", "controller_2026_02_11.pdf", "revenue", "region_a", NULL, true
"M002", "Consolidated Revenue Region B", "controller_2026_02_11.pdf", "revenue", "region_b", NULL, true
"M003", "Consolidated Revenue Total", "controller_2026_02_11.pdf", "revenue", NULL, NULL, true
"M004", "Disposal Cost Roll Off Region A", "controller_2026_02_11.pdf", "disposal_cost", "region_a", "roll_off", true
"M005", "R&M Cost Region A", "controller_2026_02_11.pdf", "rm_cost", "region_a", NULL, true
... 17 more ratified, 73 unratified (ratified=false)
Dashboard query becomes trivial:
SELECT p.published_label, g.value
FROM registry.metric_published p
JOIN gold.metric_value g
ON g.metric_id = p.metric_id
AND COALESCE(g.entity_id, '') = COALESCE(p.entity_id, '')
AND COALESCE(g.lob_id, '') = COALESCE(p.lob_id, '')
WHERE g.period = '2025-12'
AND p.ratified = true
ORDER BY p.published_id;
The SQL logic in slices 6.2 + 6.3 (filter expressions, exclusion handling, reassignment overrides) ports directly into MetricFlow filter expressions and silver dim flags. The structure they encode (per-measure tables, column-based exclusions, leading-digit join) is wrong and gets rebuilt.
Logic ports: dimension joins, signed_amount calculation, date parsing, reassignment overlay. Structure changes: remove the leading-digit account_category join (replaced by dim_gl_account FK); remove the excluded_from_measure column (replaced by silver.gl_exclusions). The keystone-fact pattern is preserved; the wiring is corrected.
19 CTEs unioned into one monolithic table with measure-per-entity identifiers.
Replaced entirely: the 19 filter expressions become 7 metrics in
metrics.yml, the 22 published combinations become 22 rows in
registry.metric_published, and gold.metric_value is
materialized once by MetricFlow. The generate_schema_name.sql macro
stays (still useful — gold lands in gold. schema).
All 5 seeds (department_reassignments, known_exclusions, fleet_vehicle_type_to_lob, dim_period, dim_holiday) plus their schema.yml entries remain valid. They're consumed by the new silver dims and by MetricFlow filter expressions exactly as they were in v1. No revert needed.
| Slice | What it builds | Estimate | Cumulative | Status |
|---|---|---|---|---|
6.1 |
v4 mappings as v5 seeds — 5 CSVs + schema.yml | 30 min | 0:30 | Done |
6.0r |
Revert slices 6.2 + 6.3 cleanly (git revert + verify branch state) | 10 min | 0:40 | Next |
6.2v2 |
silver/dim_gl_account.sql — joins gl_accounts bronze + account_categories seed, derives is_capitalized + is_in_total_costs flags | 30 min | 1:10 | Next |
6.3v2 |
silver/fct_gl_entry.sql (rewritten) — joins dim_gl_account by accountno (not leading-digit), drops excluded_from_measure column. Plus silver/gl_exclusions.sql (1:N table) | 45 min | 1:55 | Next |
6.4v2 |
Add dbt-metricflow to v5 dependencies; create models/semantic/semantic_models.yml + metrics.yml with 7 metric definitions | 45 min | 2:40 | Next |
6.5v2 |
silver/fct_fleet_vehicle.sql + fleet semantic_model + front_line_truck metric | 25 min | 3:05 | Next |
6.6v2 |
Calendar metrics (weeks_paychecks, workdays_per_month) — semantic_model on dim_period, two metrics | 20 min | 3:25 | Next |
6.7v2 |
seeds/metric_published.csv — 22 ratified rows mapping the controller's PDF labels → (metric_id, entity_id, lob_id) tuples | 20 min | 3:45 | Next |
6.8v2 |
Materialize gold.metric_value via dbt build --select +metrics; wire to Dagster asset graph | 25 min | 4:10 | Next |
6.9v2 |
dbt tests: every published row matches v4 within ±$0.10; dim_period freshness warning; row counts; not_null on value | 35 min | 4:45 | Next |
6.10v2 |
Push branch + open PR + trigger Dagster run + verify in warehouse | 25 min | 5:10 | Next |
6.11v2 |
Sync gold.metric_value + registry.metric_published from warehouse to Supabase (read-only mirror). Dashboard reads Supabase tables directly. | 35 min | 5:45 | Next (CRITICAL — dashboard depends on this) |
silver.dim_gl_account exists with one row per accounting account,
carrying account_category_id, is_revenue,
is_capitalized, is_in_total_costs flags
derived ONCE from the chart of accounts.silver.fct_gl_entry joins dim_gl_account by accountno (not
leading-digit). The excluded_from_measure column is gone.silver.gl_exclusions exists as a 1:N table (recordno, measure_id).semantic_models.yml defines 2 semantic models (gl_entry,
fleet_vehicle, dim_period) with measures + entities + dimensions.metrics.yml defines 7 metrics: revenue, disposal_cost,
rm_cost, total_costs_cogs_sga_minus_da, front_line_truck, weeks_paychecks,
workdays_per_month.gold.metric_value is materialized by MetricFlow via
dbt build --select +metric:*; ~7 metrics × periods × dim
combos rows.registry.metric_published seed + table has 22 rows mapping
the controller's PDF labels to (metric_id, entity_id, lob_id) tuples; ratified=true on all 22.mirror_gold.metric_value
and mirror_gold.metric_published — synced from the
warehouse via a sync script (slice 6.11v2). Dashboard can JOIN these directly
with no warehouse access.-- 1. Metric definitions are intact: 7 metrics
SELECT metric_id, COUNT(*) AS rows_materialized
FROM gold.metric_value
GROUP BY metric_id
ORDER BY metric_id;
-- expect 7 rows
-- 2. Published metrics are intact: 22 ratified rows
SELECT COUNT(*) FROM registry.metric_published WHERE ratified = true;
-- expect 22
-- 3. Every published metric has corresponding gold value for Dec 2025
SELECT p.published_label, g.value, p.ratified
FROM registry.metric_published p
LEFT JOIN gold.metric_value g
ON g.metric_id = p.metric_id
AND COALESCE(g.entity_id, '') = COALESCE(p.entity_id, '')
AND COALESCE(g.lob_id, '') = COALESCE(p.lob_id, '')
WHERE g.period = '2025-12' AND p.ratified = true
ORDER BY p.published_id;
-- expect 22 rows, no NULL values
-- 4. Sentinel cents-precision check
SELECT g.value
FROM registry.metric_published p
JOIN gold.metric_value g USING (metric_id)
WHERE p.published_label = 'Consolidated Revenue Region B'
AND g.entity_id = 'region_b' AND g.period = '2025-12';
-- expect 872850.23 (matches the controller's December PDF to the cent)
| Risk | Likelihood | Mitigation |
|---|---|---|
dbt-metricflow Postgres adapter has rough edges (newer than Snowflake/BigQuery support) |
Medium | If we hit a blocker, fall back to hand-rolling the 4-layer pattern as plain dbt models (the same shape, manually compiled). MetricFlow is preferred because it's the standard, but the architecture works either way. |
| Compiled MetricFlow SQL doesn't honor known_exclusions cleanly | Medium | Encode exclusions as silver.gl_exclusions with a LEFT JOIN ... IS NULL pattern in fct_gl_entry. MetricFlow then filters against the resolved silver table, not against bronze. Tested in slice 6.3v2 before defining metrics. |
| The 'all_4xxx_revenue' wildcard exclusion (Sep 2025 HTN $82,932.50) doesn't fit the standard exclusion pattern | Medium | This is one specific row excluded from ALL revenue metrics for a specific period. Encode as: gl_exclusions row with measure_id='all_revenue' + period filter; MetricFlow's revenue metric definition adds a JOIN to exclude any (recordno, period) tuple in that exclusion class. |
| analytics-warehouse-postgres connection blocked from local agent (mesh-only) | High (already encountered) | Build all SQL/YAML locally; deploy + verify via Dagster (slice 6.10v2). Same approach as v1; well-understood. |
| v4 reconcile output (the test fixture) has Decimal precision that drifts from Postgres numeric round-trip | Low | Test tolerance is ±$0.10, well above any precision drift. Investigate only if a measure misses by >$0.50. |
| The 7-metric assumption is wrong (some published rows need a metric we haven't identified) | Low | Slice 6.7v2 builds metric_published from v4's MEASURES list; if any row can't be expressed as (existing_metric, entity, lob), we add a metric. The 7 are an estimate; could end up being 8 or 9. Architecture handles either. |
| Dimension | v1 plan | v2 plan |
|---|---|---|
| Account category resolution | LEFT(accountno, 1) heuristic | silver.dim_gl_account FK join |
| Exclusion handling | Column on fct_gl_entry (1:1 only) | silver.gl_exclusions table (1:N) |
| Identifier shape | 22 measure_ids encoding entity + LOB in name | ~7 metric_ids + dimension columns |
| Metric definition source | Hand-written SQL in 19 CTEs in one .sql file | YAML metric definitions in metrics.yml |
| Materialization | 22 per-measure gold tables OR 1 monolithic table (debate) | 1 gold.metric_value table (MetricFlow-managed) |
| Stakeholder bridge | None — measure_id IS the stakeholder name | registry.metric_published (named bridge) |
| Adding measures #23-95 | 22 new gold tables × ~1 hour each = ~73 hours of model SQL | ~73 new registry rows + occasional new metric (~1-2 hours per truly-new concept) = closer to ~10-20 hours for the lot |
| Cross-vendor measures (e.g., "revenue per truck") | Would require new gold model joining accounting + fleet | Define new metric referencing both semantic_models; MetricFlow handles join |
| Discoverability | Read 22 SQL files | SELECT * FROM registry.metric_published |
| Industry pattern | None — invented locally | dbt MetricFlow / Airbnb Minerva / LookML / Snowflake Semantic Views |
Locking this in now, before Phase 3 dashboard wire-up creates the temptation to define metrics in Supabase migrations.
| Layer | Lives in | Owner | Allowed? |
|---|---|---|---|
| Metric definitions (YAML) | data-pipeline-v5/dbt_project/models/semantic/ |
dbt + agents (under PR review) | ✅ Canonical |
| Materialized metric values | gold.metric_value in analytics-warehouse-postgres |
dbt build (automated) | ✅ Canonical |
| Stakeholder bridge | registry.metric_published in analytics-warehouse-postgres |
dbt seed + engineer (versioned in repo) | ✅ Canonical |
| App-facing denormalized aggregates | Supabase (Phase 3 sync from warehouse) | Sync script — read from gold, write to Supabase tables | ✅ Read-only mirror |
| NEW metrics defined in Supabase migrations | — | — | ❌ Never |
| Aggregations / business logic in dashboard app code | — | — | ❌ Never |
| Dashboard tile reads metric directly from silver | — | — | ❌ Never (always read gold or Supabase mirror) |
The rule: if a future dashboard need wants a number that doesn't
exist in gold.metric_value, the answer is "add a metric to MetricFlow
YAML + republish," NOT "write a SQL view in Supabase" and NOT "compute it in the
app's React code." Single source of truth for every business number; one path
for adding new ones.
Why this matters: the moment Supabase has its own "revenue"
calculation that drifts from gold.metric_value, stakeholders see
two different numbers in two different views and trust collapses. Same Minerva
lesson Airbnb learned the hard way.
This is the operational picture. Supabase is the dashboard's read surface; the warehouse is the metric factory. Knowing exactly what crosses the boundary (and what does NOT) is the difference between clean architecture and accidental sprawl.
| Object | Lives in | Ships to Supabase? | Reason |
|---|---|---|---|
dp5_ledger_bronze.* (gl_entries, ap_bills, etc.) |
warehouse only | NO | ~1M+ rows. Cost prohibitive in Supabase. Not needed by dashboard. Raw vendor data shouldn't sit in app DB (compliance). |
dp5_fleet_bronze.* (vehicles, status_changes, etc.) |
warehouse only | NO | Same reasoning. Bronze is internal-only. |
dp5_crm_bronze.* |
warehouse only | NO | Same reasoning. |
silver.dim_gl_account |
warehouse only | NO | Internal modeling artifact. Dashboard doesn't need raw account categorization — that logic is encoded in metric definitions. |
silver.dim_entity, silver.dim_lob,
silver.dim_period, silver.dim_holiday |
warehouse only | NO | Internal modeling. Dashboard reads dimensional values via gold.metric_value's entity_id/lob_id/period columns; doesn't need the dim tables themselves. |
silver.fct_gl_entry, silver.fct_fleet_vehicle |
warehouse only | NO | Hundreds of thousands of rows. Not used by dashboard directly. Letting Supabase consume silver would create a back-channel for defining metrics outside the canonical place — violates §11 governance. |
silver.gl_exclusions |
warehouse only | NO | Internal. Exclusion logic is applied IN gold materialization; downstream Supabase only sees the result. |
semantic_models.yml + metrics.yml |
git repo (data-pipeline-v5) | NO | YAML files are version-controlled DEFINITIONS, not data. They live in git, get compiled by MetricFlow, and produce gold rows. |
| dbt seeds (department_reassignments, known_exclusions, etc.) | git repo + warehouse | NO | Internal modeling input. Their effect propagates to gold; the seeds themselves don't need to be queryable from Supabase. |
dbt_artifacts.* (run history, test results) |
warehouse only | NO | Operational telemetry. Could ship to Supabase if QA dashboard wants it, but not in v2 scope. |
gold.metric_value |
warehouse + Supabase mirror | YES — slice 6.11v2 | Dashboard reads this. ~1260 rows (7 metrics × 12 periods × dim combos). Trivial sync cost. Source of every number on the dashboard. |
registry.metric_published |
warehouse + Supabase mirror | YES — slice 6.11v2 | Dashboard joins this to translate the controller's PDF labels to metric × dimension tuples. ~22 rows now, ~95 at full scale. Trivial cost. |
dashboard app's existing tables (weekly_reports,
auth.users, RBAC tables, etc.) |
Supabase only | EXISTING — untouched | App-owned state. v2 doesn't change anything here. RBAC + auth + dashboard chrome stays exactly as it is. |
| future "denormalized aggregates for performance" (Phase 3?) | Supabase only | DEFERRED | If/when dashboard query patterns demand pre-aggregated cubes (e.g., "monthly revenue × LOB matrix" for a chart), we'd build them as MATERIALIZED VIEWS in Supabase derived from the mirror. Same governance: derived from gold, never defines new metrics. |
Rule 1 — Supabase mirror is read-only. The sync script writes
mirror_gold.metric_value + mirror_gold.metric_published
on a schedule (or on dbt-build trigger). Nothing in dashboard app code or in Supabase
migrations modifies these tables. If a dashboard developer wants to change a metric
value, the answer is "edit the metric definition in
data-pipeline-v5/dbt_project/models/semantic/metrics.yml, push, dbt builds,
sync runs, mirror updates."
Rule 2 — If it's not in the matrix above, it doesn't ship. When a future need surfaces ("can we get raw GL entries in Supabase for X?"), the default answer is NO — the request gets reframed: "what NEW metric do you need in gold that would answer X?" If the answer is "I want to query bronze ad hoc," that's an analyst use case for the warehouse directly (via warehouse-mcp or psql), not a dashboard use case.
mirror_gold. Schema is stable: gold.metric_value's columns don't drift because they're materialized by MetricFlow against versioned YAML.