eidos agi

Minimum Viable Data

Data is the primitive, not the product.

The smallest structured dataset that produces the results you want is worth more than any app built to process it.

Minimum Viable Product was the right idea when delivery was expensive. Shipping software meant writing code, managing servers, hiring engineers, and burning months before you knew if anyone wanted what you built. MVP reduced the cost of being wrong.

That constraint is dissolving. AI agents can build software in hours. The hard parts of delivery are becoming cheap. What remains hard is knowing what to deliver: what the customer actually wants, what "done" looks like, what the constraints are. That knowledge is structured data, and given the right data, AI can produce the result without a product in between.

372 lines that order groceries

There is a GitHub repo that orders groceries every week. It contains no software, no database, and no deployment:

reeves-grocery/
  contracts/
    shopping-list.schema.json
    meal-plan.schema.json
    pantry-update.schema.json
  preferences/
    dietary.yaml
    cooking.yaml
    shopping.yaml
    household.yaml
    goals.yaml
  pantry/
    current.md
    reserve.md

The contracts define what a valid grocery order looks like. The preferences describe who the person is. The pantry tracks what's on hand. The agent that reads this data has changed three times — Claude Code, then Perplexity Computer, then something else. The data hasn't changed once. The repo doesn't care which AI reads it.
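To make "the contracts define what a valid grocery order looks like" concrete, here is a minimal sketch in Python. The repo itself contains no code; this only illustrates the kind of check a contract implies. The field names (`week_of`, `items`, `name`, `quantity`) are assumptions made for illustration, not the contents of the real shopping-list.schema.json, and the checker covers only a small subset of JSON Schema.

```python
# Hypothetical contract: field names are illustrative assumptions,
# not the contents of the real shopping-list.schema.json.
SHOPPING_LIST_CONTRACT = {
    "type": "object",
    "required": ["week_of", "items"],
    "properties": {
        "week_of": {"type": "string"},
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["name", "quantity"],
                "properties": {
                    "name": {"type": "string"},
                    "quantity": {"type": "number"},
                },
            },
        },
    },
}

# Maps JSON Schema type names to Python types (a small subset of the spec).
_TYPES = {"object": dict, "array": list, "string": str, "number": (int, float)}


def violations(value, schema, path="$"):
    """Return a list of contract violations; an empty list means valid.

    Handles only `type`, `required`, `properties`, and `items`.
    """
    expected = schema.get("type")
    if expected:
        ok = isinstance(value, _TYPES[expected])
        # bool is a subclass of int in Python, but it is not a JSON number.
        if expected == "number" and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    problems = []
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                problems.append(f"{path}: missing required field '{key}'")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                problems.extend(violations(value[key], sub, f"{path}.{key}"))
    if isinstance(value, list) and "items" in schema:
        for i, element in enumerate(value):
            problems.extend(violations(element, schema["items"], f"{path}[{i}]"))
    return problems
```

Whichever agent produced the order, the same contract decides whether the order is acceptable. In practice a full JSON Schema validator would replace the hand-written checker; it is written out here only to keep the sketch self-contained.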

That's what makes data the primitive. The software layer is interchangeable. The data layer is permanent.

MVP vs MVD

Minimum Viable Product asks how little software you can write to ship something. Minimum Viable Data asks how little data you need to define the results you want.

Product (code)
  • You write code that does the thing
  • You maintain dependencies, deployments, versions
  • It works with one platform, one runtime
  • When the model improves, the code stays the same
  • When the platform changes, the code breaks
  • It gets replaced every few years

Data (MVD)
  • You define what "done" looks like as a schema
  • You describe who you are as preferences
  • You capture current state as markdown or YAML
  • When the model improves, the results improve
  • When the platform changes, the data still works
  • It never needs replacing

Why this works

Data quality beats model quality

Andrew Ng's data-centric AI research found that improving data quality is more effective than improving model architecture. In one manufacturing study, model changes produced zero improvement while data quality changes improved accuracy by 16%.

Ng, IEEE Spectrum; MIT Sloan; Nature Scientific Reports (2024).

Code is accumulating faster than it can be audited

AI reportedly writes up to 30% of Microsoft's code and over 25% of Google's. Anthropic's 2026 report shows agents building entire applications over multiple days. The industry's response is to use agents to review agent-generated code. MVD sidesteps this for tasks where structured data is sufficient: with no generated code, there is nothing to audit.

MIT Technology Review, "10 Breakthrough Technologies 2026"; Anthropic, "Agentic Coding Trends 2026".

The three parts of an MVD

  1. Contracts — JSON Schemas defining what valid output looks like. A shopping list contract says what a grocery order must contain. A deployment contract says what "deployed" means.
  2. Preferences — YAML or markdown files describing who the user is. Dietary restrictions, budgets, brand preferences. These are stable and change slowly if at all.
  3. State — What's true right now. What's in the pantry, what was deployed last, what's in stock. This changes often but the format doesn't.

An AI reads all three and produces results. No code connects them. The contracts define the shape of valid output, the preferences define who it's for, and the state captures the current situation. The AI figures out the rest.
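One way to picture how the three parts reach an agent is the sketch below. It is an illustration under stated assumptions, not how any particular agent works: the folder-to-part mapping follows the reeves-grocery layout, while `build_context`, `run`, the two stub agents, and the placeholder file contents are all hypothetical. The point it shows is that the agent is just a parameter, and the data passed to it does not change when the agent does.

```python
from typing import Callable, Dict

# An agent is modeled as any function from a text context to a text result.
Agent = Callable[[str], str]

# Top-level folder -> which of the three MVD parts it holds.
PART_TITLES = {"contracts": "CONTRACTS", "preferences": "PREFERENCES", "pantry": "STATE"}

# Placeholder contents, invented for illustration only.
EXAMPLE_FILES = {
    "pantry/current.md": "- oats",
    "contracts/shopping-list.schema.json": '{"type": "object"}',
    "preferences/dietary.yaml": "vegetarian: true",
}


def build_context(files: Dict[str, str]) -> str:
    """Group files by top-level folder into contracts, preferences, and state."""
    sections = {name: [] for name in PART_TITLES}
    for path in sorted(files):
        top = path.split("/", 1)[0]
        if top in sections:  # files outside the three parts are skipped
            sections[top].append(f"--- {path} ---\n{files[path]}")
    return "\n\n".join(
        f"## {PART_TITLES[name]}\n" + "\n".join(parts)
        for name, parts in sections.items()
        if parts
    )


def run(agent: Agent, files: Dict[str, str]) -> str:
    """Hand the same data to whichever agent is in use."""
    return agent(build_context(files))


# Two stand-in agents; swapping one for the other leaves the data untouched.
def agent_a(context: str) -> str:
    return f"A read {len(context)} characters"


def agent_b(context: str) -> str:
    return f"B read {len(context)} characters"
```

Calling `run(agent_a, EXAMPLE_FILES)` and `run(agent_b, EXAMPLE_FILES)` passes an identical context to both stand-ins, which mirrors the claim that the repo does not care which AI reads it.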

What MVD does not replace

MVD applies to tasks where the output is a defined result for a specific person or team: ordering groceries, reviewing code, planning a deployment, filing a report. These are tasks with clear contracts where you can describe what "done" looks like.

It does not replace infrastructure software. The systems that Walmart uses to process millions of orders per second, the databases that store them, the networks that route them — these are engineering problems where code is the product. MVD is for the layer above, where someone decides what to order and how to describe what they want. The schema that defines what a voice recording looks like will outlive the app that writes it. The grocery preferences will outlive every grocery app.