Relay: A Control Plane for Agent Work
Why we added human approval, bounded briefs, and context layers before trying to scale multi-agent software work.
AI agents are becoming useful at doing real work. They can read code, make changes, run tests, explain tradeoffs, and hand results back to a human. That is no longer the surprising part.
The harder problem starts when agent work becomes continuous. One agent proposes work. Another implements it. Another reviews it. Another tests it. A human still needs to decide what is allowed to start, what needs changes, what is blocked, and what is actually done.
At that point, the problem is not just intelligence. It is coordination.
Relay is a small attempt to solve that coordination problem. It is an admin-first project board and CLI workflow for managing AI agent work. Agents can propose scoped cards. Humans approve or reject them. Developer, reviewer, and tester agents work through a shared state machine. Every handoff carries bounded context instead of asking the next agent to read everything again.
The goal is not to build Jira for agents. The goal is simpler: make agent work visible to humans and keep agent context small enough that each session does not start from zero.
Why Relay Exists
Single-agent workflows can be informal. You give an agent a task. It explores the repo. It makes a change. It reports back. That works for one prompt and one result.
It breaks down when work spans multiple agents, multiple roles, and multiple sessions. The same questions come up again and again:
What work is approved?
Who owns this card?
Why did the admin ask for changes?
What did the developer actually change?
What should the reviewer inspect first?
What evidence exists before we call this done?
What is blocked?
What should the next agent read?
Without a shared control surface, the default answer is usually a transcript. The next agent reads a long conversation. The human reads a long conversation. Everyone tries to reconstruct the current state from notes, tool output, corrections, and partial plans.
That is expensive. It burns tokens. It wastes time. It makes supervision harder. A human should not need to read the whole transcript to answer a basic operational question: what is waiting on me?
Relay exists because agent work needs a readable control loop.
What Relay Is
Relay has two main surfaces. Agents use a CLI. Humans use a local web board.
The CLI lets agents create cards, submit work for approval, claim approved work, read briefs, add notes, write context, move cards through the workflow, and check their inbox. The web board gives the human admin approval queues, active work, card timelines, dependencies, agent presence, inbox items, and missing-context signals.
Underneath that is a simple workflow: a shared state machine that every card moves through, from proposal to done.
The important part is the approval gate. Agents can propose work. Humans decide when work starts.
That boundary matters. Without it, a system of agents can generate and execute work continuously, even when the work is vague, low-value, too large, or not aligned with the human’s priorities.
Relay separates proposing from executing. PM agents can draft cards. Admins approve or send them back. Developer agents claim only approved work. Reviewers and testers move work forward only after reading the relevant context. Admins mark work done after reviewing evidence.
That is the core idea: make the state of agent work explicit enough that a human can supervise it.
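To make the approval gate concrete, here is a minimal sketch of a role-gated state machine. It is not Relay's actual implementation: the state names, role names, and the `TRANSITIONS` table are assumptions chosen to mirror the roles described above.

```python
from dataclasses import dataclass, field

# Hypothetical states and roles; Relay's real names may differ.
# Each key is (current state, target state); the value is the set of roles
# allowed to make that move.
TRANSITIONS = {
    ("draft", "pending_approval"): {"pm"},
    ("pending_approval", "approved"): {"admin"},
    ("pending_approval", "draft"): {"admin"},      # sent back for changes
    ("approved", "in_progress"): {"developer"},    # claiming approved work
    ("in_progress", "in_review"): {"developer"},
    ("in_review", "in_testing"): {"reviewer"},
    ("in_review", "in_progress"): {"reviewer"},    # changes requested
    ("in_testing", "awaiting_admin"): {"tester"},
    ("in_testing", "in_progress"): {"tester"},     # validation failed
    ("awaiting_admin", "done"): {"admin"},
}


class InvalidMove(Exception):
    """Raised when a role attempts a move the workflow does not allow."""


@dataclass
class Card:
    title: str
    state: str = "draft"
    events: list = field(default_factory=list)


def valid_moves(card, role):
    """List the target states this role may move the card to right now."""
    return sorted(
        target
        for (state, target), roles in TRANSITIONS.items()
        if state == card.state and role in roles
    )


def move(card, role, target):
    """Apply a transition if the role is allowed to make it, and record it."""
    if target not in valid_moves(card, role):
        raise InvalidMove(f"{role} cannot move card from {card.state} to {target}")
    card.events.append((role, card.state, target))
    card.state = target
    return card


card = Card("Add retry to upload client")
move(card, "pm", "pending_approval")
print(valid_moves(card, "developer"))  # [] because nothing is claimable before approval
move(card, "admin", "approved")
print(valid_moves(card, "developer"))  # ['in_progress']
```

In this sketch the gate is just data: no transition leads from a proposed card to active work except through an admin move, so proposing and executing stay separate by construction.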
How Relay Solves The Context Problem
The naive way to coordinate agents is to let every agent read everything. That works at first. It does not scale.
A transcript contains useful discoveries, mistakes, shell output, corrections, test failures, implementation details, final decisions, and noise. The next agent has to separate signal from noise again.
Relay does not remove the event history. It changes the default read path. Instead of making the transcript the main handoff artifact, Relay uses context layers and briefs.
Context layers are typed summaries:
Feature briefs explain product context.
Project maps explain repository structure, commands, and conventions.
Implementation notes explain what changed and why.
Validation evidence records what was tested and what remains uncertain.
Handoff intent tells the next role what to do first.
Human review summaries translate technical work into plain English.
These layers are bounded. They have character caps. They are not allowed to become another endless transcript. They are also durable. If a layer changes, the new one supersedes the old one. The latest layer is easy to read, but older context remains available for audit.
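The two properties above, bounded and durable, can be sketched as an append-only store for one card's layers. This is an illustration under stated assumptions: the cap values, the layer kind names, and the `LayerStore` class are hypothetical, since the actual limits and schema are not given here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical character caps per layer kind. Layers are capped,
# but these specific numbers are made up for illustration.
CHAR_CAPS = {
    "feature_brief": 2000,
    "project_map": 3000,
    "implementation_notes": 2000,
    "validation_evidence": 2000,
    "handoff_intent": 600,
    "human_review_summary": 1000,
}


@dataclass
class Layer:
    kind: str
    text: str
    author: str
    superseded: bool = False


class LayerStore:
    """Append-only store: new layers supersede old ones, nothing is deleted."""

    def __init__(self):
        self._layers = []

    def write(self, kind, text, author):
        if kind not in CHAR_CAPS:
            raise ValueError(f"unknown layer kind: {kind}")
        if len(text) > CHAR_CAPS[kind]:
            raise ValueError(
                f"{kind} exceeds {CHAR_CAPS[kind]} characters; summarize instead"
            )
        # Mark earlier layers of the same kind as superseded, but keep them.
        for layer in self._layers:
            if layer.kind == kind:
                layer.superseded = True
        new_layer = Layer(kind, text, author)
        self._layers.append(new_layer)
        return new_layer

    def latest(self, kind) -> Optional[Layer]:
        """Return the current layer of this kind, or None if none was written."""
        for layer in reversed(self._layers):
            if layer.kind == kind and not layer.superseded:
                return layer
        return None

    def history(self, kind):
        """Return every layer of this kind, oldest first, for audit."""
        return [layer for layer in self._layers if layer.kind == kind]
```

Rejecting an oversized write, rather than truncating it, pushes the author to summarize, which is the behavior the cap is meant to encourage.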
When an agent claims a card, Relay returns a role-specific brief. A developer gets the context needed to start implementation. A reviewer gets implementation notes, validation evidence, and human summaries. A tester gets validation context and the implementation story. An admin gets the broadest view.
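A role-specific brief can be thought of as a filter over the latest layers plus a short tail of recent events. The sketch below assumes a hypothetical `BRIEF_LAYERS` mapping and plain dictionaries for cards and layers; the developer's selection in particular is a guess at what "context needed to start implementation" means.

```python
ALL_KINDS = [
    "feature_brief", "project_map", "implementation_notes",
    "validation_evidence", "handoff_intent", "human_review_summary",
]

# Hypothetical mapping from role to the layer kinds its brief includes.
BRIEF_LAYERS = {
    "developer": ["feature_brief", "project_map", "handoff_intent"],
    "reviewer": ["implementation_notes", "validation_evidence",
                 "human_review_summary", "handoff_intent"],
    "tester": ["validation_evidence", "implementation_notes", "handoff_intent"],
    "admin": ALL_KINDS,  # the broadest view
}


def build_brief(card, layers, events, role, max_events=5):
    """Assemble a bounded brief for one role.

    card:   dict with "title" and "acceptance" keys
    layers: dict mapping layer kind to its latest text
    events: list of event descriptions, oldest first
    """
    if role not in BRIEF_LAYERS:
        raise ValueError(f"unknown role: {role}")
    start = max(0, len(events) - max_events)
    return {
        "card": {"title": card["title"], "acceptance": card["acceptance"]},
        # Only include layers that exist; absent ones are simply left out.
        "layers": {k: layers[k] for k in BRIEF_LAYERS[role] if k in layers},
        "recent_events": events[start:],
    }
```

The full timeline stays available elsewhere; the brief only decides what an agent sees first.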
Agents can still inspect the repo. They can still read the timeline. Relay is not hiding information. It is changing the default from:
read everything, then figure out what matters
to:
read the brief, then inspect what matters
That is the main productivity bet. The system should not ask every agent to rediscover the same project map, the same history, and the same handoff intent before doing useful work.
The Philosophy
Relay is built around a few simple beliefs.
Human control should be visible
If agents are doing real work, humans need a place to approve, pause, reject, and close that work. Human control should not be hidden inside chat messages. It should be visible as state.
Approved work should look different from proposed work. Blocked work should be obvious. A card waiting for admin review should not be buried in a transcript.
Agent context should be bounded
More context is not always better. At some point, more context becomes more noise.
Relay tries to keep the default agent read path bounded: card fields, relevant layers, recent events, decisions, and next action. If an agent needs more, it can ask for more. The default should be small enough to be useful.
Workflows should be explicit
Agents should not learn the process by failing commands. Relay exposes valid transitions. It records events. It keeps ownership visible. It makes dependencies first-class.
The workflow is not just documentation. It is enforced by the tool.
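As one example of what first-class dependencies could mean in practice, the sketch below derives "blocked" from card data instead of from a note in a transcript. The rule used here, that a card is blocked while any dependency is not done, and the dictionary shape are assumptions for illustration.

```python
def blocked_by(card_id, cards):
    """Return the ids of this card's dependencies that are not done yet."""
    return sorted(
        dep for dep in cards[card_id]["depends_on"]
        if cards[dep]["state"] != "done"
    )


def claimable(cards):
    """Return ids of approved cards with no unfinished dependencies."""
    return sorted(
        card_id for card_id, card in cards.items()
        if card["state"] == "approved" and not blocked_by(card_id, cards)
    )


# Hypothetical board: C3 waits on C2, so only C4 can be claimed.
cards = {
    "C1": {"state": "done", "depends_on": []},
    "C2": {"state": "in_progress", "depends_on": ["C1"]},
    "C3": {"state": "approved", "depends_on": ["C1", "C2"]},
    "C4": {"state": "approved", "depends_on": ["C1"]},
}
print(blocked_by("C3", cards))  # ['C2']
print(claimable(cards))         # ['C4']
```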
Summaries are part of the work
For agent systems, a good handoff is not optional. If a developer finishes implementation but does not explain what changed, the reviewer starts with archaeology. If a tester validates behavior but does not write evidence, the admin has to trust a claim.
Relay treats implementation notes, validation evidence, handoff intent, and human summaries as part of the work.
Keep the tool small
Relay is intentionally local and small. It uses Node.js, SQLite, a plain HTTP server, and static web files. The current package has no runtime npm dependencies.
That is a product choice. The goal is not to build a massive platform before the workflow is proven. The goal is to make the coordination loop inspectable, hackable, and easy to run.
Challenges
Relay does not make the hard parts disappear. It makes some of them explicit.
Agents still need to comply
Relay works when agents follow the loop:
check inbox -> claim work -> read brief -> work -> write context -> move with handoff
The tool can make this path easy. It can return the brief during claim. It can show missing context. It can warn when a move lacks evidence. But it cannot guarantee that every agent will write a useful summary.
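A warning on a move that lacks evidence might look like the sketch below. The `EXPECTED_BEFORE` table, the state names, and the `min_chars` threshold are hypothetical; the length check is a deliberately crude proxy, which is exactly the limit described above.

```python
# Hypothetical: layer kinds expected to exist before a card enters a state.
EXPECTED_BEFORE = {
    "in_review": ["implementation_notes", "handoff_intent"],
    "in_testing": ["handoff_intent"],
    "awaiting_admin": ["validation_evidence", "human_review_summary"],
}


def handoff_warnings(target_state, layers, min_chars=40):
    """Return human-readable warnings for a move into target_state.

    layers maps layer kind to its latest text. The function only checks
    presence and length; it cannot tell whether a summary is actually useful.
    """
    warnings = []
    for kind in EXPECTED_BEFORE.get(target_state, []):
        text = layers.get(kind, "").strip()
        if not text:
            warnings.append(f"missing {kind}")
        elif len(text) < min_chars:
            warnings.append(f"{kind} is very short ({len(text)} chars)")
    return warnings
```

Returning warnings instead of raising keeps the move possible while making the gap visible to the admin, which matches a tool that warns but does not guarantee.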
Agent compliance is still a real product risk.
Schema does not guarantee quality
A field called “implementation notes” does not guarantee good implementation notes. A card with acceptance criteria can still be vague. A human review summary can still be shallow.
Relay can enforce structure. It can cap size. It can surface gaps. It can make the right behavior obvious. It cannot replace judgment.
Context can go stale
A project map is useful until the repository changes. A feature brief is useful until the product direction changes.
Relay supports superseding old context, but it does not yet solve the larger question of staleness. That policy probably needs to come from real usage, not theory.
Local coordination is not security
Relay is a coordination protocol, not a security boundary. Roles are self-declared in the local workflow. Any process with access to the SQLite database can perform actions.
For the current local-first use case, that is acceptable. For remote agents and hosted workflows, Relay would need real authentication, authorization, and audit boundaries.
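The point about self-declared roles is easy to demonstrate. The sketch below uses Python's standard `sqlite3` module with a made-up `events` table (not Relay's actual schema) to show that when the role is only a column value, any writer can claim any role.

```python
import sqlite3

# An in-memory database stands in for the shared local database file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (card_id TEXT, actor TEXT, role TEXT, action TEXT)"
)


def record(conn, card_id, actor, role, action):
    """Append an event. Nothing here verifies that actor really holds role."""
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?)",
        (card_id, actor, role, action),
    )
    conn.commit()


record(conn, "C1", "pm-agent", "pm", "submit")
# Accepted without complaint: the role is only a claim made by the writer.
record(conn, "C1", "some-script", "admin", "approve")

rows = conn.execute(
    "SELECT actor, role, action FROM events ORDER BY rowid"
).fetchall()
print(rows)
```

Closing this gap would mean checking identity somewhere the writer cannot bypass, which is why a hosted version needs real authentication and authorization instead of a column value.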
The value still needs proof
We ran a small experiment comparing transcript-heavy handoffs with Relay briefs and context layers. The Relay version reduced estimated starting context by roughly 64 percent in that setup.
That is useful evidence, but it is not final proof. It does not prove exact billing savings. It does not prove better outcomes. It does not prove long-term quality improvements.
The real test is whether humans can supervise more work with less confusion and whether agents can start useful work without repeatedly rebuilding the same context.
Where This Goes
Relay is not a smarter agent. It is the operational layer around agents.
It gives humans a board, approvals, dependencies, evidence, and timelines. It gives agents a CLI, briefs, context layers, inboxes, and a predictable workflow. Both sides share the same state machine.
That is the point.
As agents become more capable, coordination becomes more important, not less. Speed without coordination creates noise. Relay is an attempt to turn agent speed into work a human can actually supervise.
Comments and suggestions are welcome. Let me know what you think!





