Case Studies
Real-world tasks. Autonomous agents.
Each case study documents a real task that the Eidos system completed autonomously — the problem, the approach, the outcome, and what it reveals about how AI systems actually work.
These are not demos. They are production systems where our team of Eidos agents read structured data, plan their own work, and produce results — from ordering groceries to securing enterprise software approval.
Grocery ordering
A source repo with YAML preferences and JSON Schema contracts — 3 contracts, 5 preference files, zero lines of code. An AI reads them and orders groceries weekly.
Knox makes agents ask first
A local approval authority for Mac agents that want to spend money, reveal credentials, send email, or take destructive actions. The agent can ask; Knox makes the human approve the specific request.
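To make "approve the specific request" concrete, here is a minimal sketch of an approval bound to one exact request. It is not Knox's actual interface: `ActionRequest`, `ApprovalGate`, and every field name are hypothetical, and the sketch assumes approval is keyed on a hash of the request contents.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class ActionRequest:
    """Hypothetical request an agent submits before a sensitive action."""
    agent: str
    action: str    # e.g. "spend_money" or "send_email" (illustrative names)
    details: dict  # JSON-serializable parameters of the action

    def fingerprint(self) -> str:
        # Serialize with sorted keys so the same request always hashes the same way.
        payload = json.dumps(
            {"agent": self.agent, "action": self.action, "details": self.details},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ApprovalGate:
    """Hypothetical gate: a human approval covers one exact request, nothing broader."""

    def __init__(self) -> None:
        self._approved: set[str] = set()

    def approve(self, request: ActionRequest) -> None:
        # Called only after a human has reviewed this specific request.
        self._approved.add(request.fingerprint())

    def is_approved(self, request: ActionRequest) -> bool:
        return request.fingerprint() in self._approved


gate = ApprovalGate()
asked = ActionRequest("grocery-agent", "spend_money", {"amount_usd": 20, "payee": "store"})
gate.approve(asked)

# Same agent and action, different amount: the earlier approval does not carry over.
changed = ActionRequest("grocery-agent", "spend_money", {"amount_usd": 200, "payee": "store"})
```

In this sketch, `gate.is_approved(asked)` is true and `gate.is_approved(changed)` is false, because the approval attaches to the request's contents, not to the agent or the action type.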
Warehouse planning, end to end
The actual silver-and-gold redesign document a team wrote when 22 financial measures had to land in a dashboard. Anonymized for publication; the architecture, slice plan, hour estimate, and ship/no-ship matrix are unchanged.
Landfall finds the thing
A client relationship lived across email, texts, chat, PDFs, public records, and local notes. Landfall turned that scattered context into a repeatable messy-data import.
Proprioception in AI agents
An agent shipped three defensible commits and silently made five architectural decisions along the way. An outside-in consultation tool flagged the one the agent wanted to defer, a deferral that would have cost three to four hours of refactoring later.
What Flowering forgot about flowers
A published methodology for staged AI work met its first consequential decision. The agent demonstrated its failure mode in real time. The fix came from studying actual flower biology — growth without proportional pruning is decay disguised as productivity.
A $161 payment took 44 steps to prove
A real house-sale payable crossed texts, email, finance search, a bank portal, MFA, screenshots, and repo writeback. The payment succeeded; the lesson was that agents need payable lifecycles.
38% of agent deploys violated ceremony
AI agents deploying across five repos hit a ceremony gate on 38% of runs. Every violation was corrected in seconds. The cost model at scale reveals where the real value lives — and it's not in the immediate catches.
Git is a Type 2 dimension table
Manual dashboard edits were the right trade-off for a two-decade stretch. Agents change the calculus: they can only see what's in a file. Configuration-as-code becomes mandatory, and git history, it turns out, is already the right database for versioned config.
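The analogy can be sketched in a few lines. A Type 2 (slowly changing) dimension keeps one row per version of a value, each with a validity range, and a file's commit history carries the same information. The sketch below is illustrative only: `ConfigVersion`, `Type2Row`, and `to_type2_rows` are hypothetical names, the commit data is made up, and no real git command is called.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConfigVersion:
    """One commit that touched a config value (hypothetical shape)."""
    commit: str
    committed_at: str  # ISO 8601 date, so string order matches time order
    value: str


@dataclass(frozen=True)
class Type2Row:
    """One row of a Type 2 dimension: a value plus the range in which it held."""
    key: str
    value: str
    valid_from: str
    valid_to: Optional[str]  # None while the row is still current
    is_current: bool


def to_type2_rows(key: str, history: list[ConfigVersion]) -> list[Type2Row]:
    """Turn a commit history into Type 2 rows: each version ends when the next begins."""
    ordered = sorted(history, key=lambda v: v.committed_at)
    rows = []
    for i, version in enumerate(ordered):
        nxt = ordered[i + 1] if i + 1 < len(ordered) else None
        rows.append(
            Type2Row(
                key=key,
                value=version.value,
                valid_from=version.committed_at,
                valid_to=nxt.committed_at if nxt else None,
                is_current=nxt is None,
            )
        )
    return rows


# Made-up history for a single dashboard setting.
history = [
    ConfigVersion("a1", "2024-01-10", "monthly"),
    ConfigVersion("b2", "2024-03-02", "weekly"),
    ConfigVersion("c3", "2024-06-15", "daily"),
]
rows = to_type2_rows("refresh_interval", history)
```

Each row's `valid_to` equals the next row's `valid_from`, and only the latest row is current. A fuller version might also collapse consecutive commits that leave the value unchanged.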
Creating a website voice
An agent audited the existing site copy, identified the authentic voice patterns, diagnosed where new pages deviated, and produced concrete voice rules. It then rewrote two philosophy pages to match.
Enterprise software approved by Eidos
Eidos read the organization's contracts, planned a 10-stage approval pipeline, executed nine stages, and presented one decision to a human.
Coming next
Voice notes
A Mac app records voice, transcribes locally, classifies notes through an agent SDK, and stores them in recording buckets that a search engine indexes.
Project improvement
A forge scores any project across 10 dimensions, fixes the highest-impact gap, and records a snapshot. The snapshot schema is the contract.
Learning capture
A forge extracts learnings from sessions, routes project-level fixes directly, and proposes forge-level improvements to the overseer.
Bridges: metaphor meets machine
How one-sentence "bridges" between philosophy and engineering make technical ideas land — the psychology of why "a heartbeat is a cron job" works, and how to write them deliberately.
Kinetic metaphors: when animation IS the concept
The site's static SVGs were made kinetic — funnels that narrow, bars that fill, arrows that flow. Documenting which ones readers respond to and why motion works where prose doesn't.
The pattern across every case study: structured data goes in, an agent figures out how to produce the result, and the system gets better each time it runs.
How it works: The DAG of AI