EidosAGI

A $161 payment took 44 steps to prove.

The work was not sending money. The work was finding truth across email, texts, finance data, browser state, MFA, screenshots, and repo memory.

Reeves + Gmail + Wells Fargo Zelle + Computer Use + Scridos
[Image: Wells Fargo Zelle confirmation showing $161 sent to David Sanders, with invoice memo and confirmation code.]
Final proof: Wells Fargo Zelle confirmation for invoice 1135, copied into the house-sale repo as evidence.

A house-sale invoice for $161 should have been a five-minute task. Instead, it became a live case study in why personal agents need durable payable lifecycles: invoice ingestion, payment-status search, human approval, bank execution, evidence capture, and task writeback. The payment succeeded. The system learned why it was too hard.

The problem

The invoice was ordinary: R&D Residential Services, invoice 1135, $161 for an attic bathroom toilet repair at a house being prepared for sale. The vendor had asked the listing agent about payment. The owner wanted the bill resolved.

That sounds easy until the agent has to answer the questions a careful human asks before sending money: What invoice? Who sent it? Was it already paid? Which payment rail? Which account? Can we trust the recipient? Where is the confirmation? Did the task system learn the result?

The hard part was not the Zelle button. The hard part was that the truth lived across several surfaces: Heather's texts, Gmail, Reeves Email, Reeves Finance, Wells Fargo Zelle activity, Wells Fargo account activity, a browser session, an MFA checkpoint, a Desktop screenshot, and the 239 Eagle repo.

The approach

The agent treated the payment as a graph instead of a single browser action. First it found the invoice. Then it tried to prove non-payment. Then it asked for human authorization. Then it used the bank portal only after the evidence chain was strong enough. Finally it wrote the result back into the project substrate.
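The ordering described above can be sketched as a small lifecycle tracker. This is a minimal illustration, not the system that was actually used: the `Stage` names and the `PayableLifecycle` class are hypothetical, and the sketch assumes each stage must arrive in order with some evidence attached.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    INVOICE_FOUND = "invoice_found"
    NONPAYMENT_PROVEN = "nonpayment_proven"
    HUMAN_AUTHORIZED = "human_authorized"
    BANK_EXECUTED = "bank_executed"
    WRITTEN_BACK = "written_back"


# Enum members iterate in definition order, so this is the required sequence.
ORDER = list(Stage)


class PayableLifecycle:
    """Hypothetical tracker: stages cannot be skipped and each needs evidence."""

    def __init__(self, invoice_id: str, amount_cents: int) -> None:
        self.invoice_id = invoice_id
        self.amount_cents = amount_cents
        self.history = []  # list of (Stage, evidence) tuples, in order

    def next_stage(self) -> Optional[Stage]:
        """Return the stage that must happen next, or None when complete."""
        if len(self.history) == len(ORDER):
            return None
        return ORDER[len(self.history)]

    def advance(self, stage: Stage, evidence: str) -> None:
        """Record a stage only if it is the expected one and has evidence."""
        expected = self.next_stage()
        if expected is None:
            raise ValueError("lifecycle already complete")
        if stage is not expected:
            raise ValueError(f"expected {expected.name}, got {stage.name}")
        if not evidence.strip():
            raise ValueError(f"{stage.name} requires evidence")
        self.history.append((stage, evidence))
```

With this shape, an attempt to jump from invoice discovery straight to bank execution raises an error instead of sending money on an incomplete evidence chain.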

239EagleDr/wiki/239-eagle/work/david-sanders-invoice/evidence.md

What had to be proven

  • Heather's text referred to David Sanders / R&D Residential Services.
  • QuickBooks invoice 1135 existed and was for $161.00.
  • The work was the attic bathroom toilet repair.
  • The payment instructions were Zelle to 336-391-1973 or CashApp to $828davidsanders.

What had to be ruled out

  • Reeves Email did not find the invoice.
  • Gmail found the invoice but no receipt.
  • Reeves Finance found no earlier matching payment.
  • Wells Fargo Zelle and account history found no prior payment to the vendor.

Only after those checks did the human approve payment and choose Zelle. The bank still required a trust checkpoint and SMS verification. The successful state was eventually proven by a Desktop screenshot, not by a clean API event.
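One way to make "only after those checks" explicit is an evidence gate that refuses to request approval while any required check is missing or failed. The sketch below is an assumption about how such a gate could look; the `Check` type and the names in `REQUIRED` are hypothetical and only loosely follow the two lists above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    name: str     # what was checked
    source: str   # where the answer came from (Gmail, finance data, bank, ...)
    passed: bool  # True when the result supports paying


# Hypothetical check names, loosely following the proven / ruled-out lists.
REQUIRED = {
    "invoice_exists",
    "amount_matches",
    "payment_rail_known",
    "no_prior_payment_finance",
    "no_prior_payment_bank",
}


def missing_checks(checks: list[Check]) -> list[str]:
    """Return required check names that are absent or failed, sorted."""
    passed = {c.name for c in checks if c.passed}
    return sorted(REQUIRED - passed)


def ready_for_approval(checks: list[Check]) -> bool:
    """Ask the human to approve only once nothing required is missing."""
    return not missing_checks(checks)
```

The gate does not replace the human decision. It only ensures the human is asked to approve a payment whose evidence chain is complete, and it names what is still missing when it is not.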

The evidence

The reconstruction produced a 44-step work log. Ten of those steps required a human in the loop. Several were not "approval" in the formal sense; they were human judgment, source correction, auth, or steering around a broken automation surface.

Human-in-the-loop moments
1. Heather surfaced the payable.
2. Daniel challenged whether it had already been paid.
3. Daniel approved paying it.
4. Daniel chose Zelle over CashApp.
5. Daniel pushed for direct Wells Fargo / Zelle history.
6. Daniel steered the browser strategy toward Comet / Computer Use.
7. Daniel handled bank login and security context.
8. Daniel gave final "just send it" authorization.
9. Wells Fargo MFA required Daniel.
10. Daniel pointed the agent to the Desktop screenshot that proved completion.
| Phase | Work | Evidence | Human loop |
|---|---|---|---|
| Trigger | Heather said David Sanders was asking about payment. | Text context captured in the repo evidence log. | Yes |
| Invoice discovery | Reeves Email searches failed; Gmail found the QuickBooks invoice reminder. | Invoice 1135, $161.00, R&D Residential Services. | No |
| Payment-status proof | Reeves Finance searched amount, vendor, name, phone, CashApp handle, and invoice number. | No matching prior payment found. | Daniel challenged the sufficiency. |
| Bank portal proof | Computer Use and Comet navigated Wells Fargo Zelle and account activity. | No prior payment to Sanders / R&D / phone / $161. | Login and tooling steering. |
| Authorization | Payment prepared to David Sanders, $161.00, invoice memo. | Wells Fargo review screen. | Daniel approved final send. |
| Bank execution | Wells Fargo added recipient trust and MFA gates. | Security modal and final confirmation screenshot. | MFA / final bank boundary. |
| Writeback | Screenshot copied into repo; task and payable marked done/paid. | Confirmation WFCT1259CCPF; scridos task doctor passed. | No |

The final confirmation was copied into the repo, and the task system was checked with scridos task doctor. The important detail is that proof did not end at the bank. It ended when the project substrate knew not to pay the invoice again.
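The "knew not to pay the invoice again" property can be sketched as a paid-ledger lookup that runs before any payment attempt. This is an illustrative assumption, not how scridos stores tasks or payables: the ledger is a plain dictionary and the function names are hypothetical.

```python
def payable_key(vendor: str, invoice_number: str) -> str:
    """Normalize vendor and invoice number into one lookup key."""
    vendor_norm = " ".join(vendor.lower().split())  # fold case and whitespace
    return f"{vendor_norm}#{invoice_number.strip()}"


def should_pay(ledger: dict, vendor: str, invoice_number: str) -> bool:
    """True only if this vendor/invoice pair has no recorded payment."""
    return payable_key(vendor, invoice_number) not in ledger


def record_payment(ledger: dict, vendor: str, invoice_number: str,
                   amount_cents: int, confirmation: str) -> None:
    """Write the result back; refuse to record the same invoice twice."""
    key = payable_key(vendor, invoice_number)
    if key in ledger:
        raise ValueError(f"{key} is already recorded as paid")
    ledger[key] = {"amount_cents": amount_cents, "confirmation": confirmation}


# Usage with the values from this case study.
ledger = {}
record_payment(ledger, "R&D Residential Services", "1135", 16100, "WFCT1259CCPF")
```

After the writeback, a later run that sees the same invoice under slightly different casing or spacing still resolves to the same key and is told not to pay.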

The graph

Heather text → invoice search → payment search → approval → bank portal → MFA → screenshot proof → repo writeback

Before and after

What happened
  • The invoice lived in Gmail, the task lived in a repo, and the stale payment signal lived in finance sync.
  • The agent searched several surfaces manually, and the human had to keep asking whether the proof was good enough.
  • The bank step depended on browser state, Comet behavior, MFA, and a Desktop screenshot.
  • The repo learned the result only after a separate documentation pass.
What the agent system should become
  • A payable lifecycle ingests invoice evidence, extracts amount, vendor, work, due date, and payment rails automatically.
  • A finance facade searches all local money surfaces by amount tolerance, vendor aliases, phone numbers, handles, and memo text.
  • A payment runner stops at every human-only boundary: auth, MFA, final send, and legal or bank warnings.
  • After confirmation, evidence capture and task/payable writeback are part of the same workflow, not an afterthought.
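The finance facade in the second bullet above could start as a single matcher applied to every transaction from every money surface. The sketch below is a hypothetical shape, not an existing Reeves Finance API: the `Query` fields and the transaction dictionary keys (`amount_cents`, `counterparty`, `memo`) are assumptions. It deliberately favors recall, since the goal is to prove that no prior payment exists.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Query:
    amount_cents: int
    tolerance_cents: int = 0
    terms: list[str] = field(default_factory=list)   # vendor aliases, handles, memo text
    phones: list[str] = field(default_factory=list)  # any formatting


def digits(text: str) -> str:
    """Keep only the digits of a string."""
    return re.sub(r"\D", "", text)


def match_reasons(txn: dict, q: Query) -> list[str]:
    """Return why a transaction might be a prior payment (empty if no match)."""
    reasons = []
    # Compare magnitudes so debits stored as negative amounts still match.
    if abs(abs(txn["amount_cents"]) - q.amount_cents) <= q.tolerance_cents:
        reasons.append("amount")
    haystack = f"{txn.get('counterparty', '')} {txn.get('memo', '')}".lower()
    if any(t.lower() in haystack for t in q.terms):
        reasons.append("text")
    # All digits are concatenated, so this can over-match; that is acceptable
    # for a recall-oriented search whose hits a human will review.
    hay_digits = digits(haystack)
    if any(digits(p)[-10:] in hay_digits for p in q.phones if digits(p)):
        reasons.append("phone")
    return reasons


def find_matches(txns: list[dict], q: Query) -> list[tuple[dict, list[str]]]:
    """Return every transaction with at least one match reason."""
    hits = [(t, match_reasons(t, q)) for t in txns]
    return [(t, r) for t, r in hits if r]
```

Returning the reasons alongside each hit lets the human judge whether the proof is good enough, rather than trusting a bare "no results" from one surface.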

The result

The payment succeeded. David Sanders was paid $161.00 via Wells Fargo Zelle, with memo "Invoice 1135 - 239 Eagle attic bathroom toilet repair" and confirmation WFCT1259CCPF. The confirmation screenshot was preserved in the 239 Eagle repo, and the canonical task/payable records were marked done.

The stronger result is the case study itself. A tiny payment exposed the full shape of personal-agent operations: evidence, uncertainty, approval, execution, and memory. The agent did not need to be more clever at clicking Wells Fargo. It needed a better payable substrate.

That is the lesson. If a task has money, law, health, credentials, identity, or reputation attached to it, the agent does not just need tools. It needs a lifecycle with proof, human gates, and durable writeback.