Deepak Sarda, the CTO of Endowus, a digital wealth platform in Asia, named it this week, in a short piece called "Agents lack Agency." Within hours a thread formed beneath it. A head of technology infrastructure at a global bank wrote that he spends more time worrying about whether his estate is clean than about deploying agents. An engineering lead at a fintech wrote that the real question is not humans supervising every action but organizations redesigning accountability itself. Sarda's own line was the sharpest in the room. Organizations scale through delegation, and delegation was never really about getting more tasks done. It was about creating more places where judgment can live.
Where the responsibility goes.
Agents can take on execution. They cannot take on responsibility.
When we say a person is responsible for an area, we mean something deeper than "please finish these tasks." We mean: understand the intent, use judgment, escalate when needed, carry the context over time. Their reputation, their duty, their future credibility are all attached to the outcome. An agent can produce the sentence "I've got this." It cannot own it.
So when agents absorb more execution but cannot hold the obligation, that obligation does not disappear. It moves upward, to the human approver, the process owner, the executive, the board. Sarda draws the distinction precisely. Accountability is where the question lands after the fact. Responsibility is the obligation carried inside the work, before and during it. This is why the "flatter org, run by agents" story feels incomplete. You can scale execution without scaling delegation at all. You just pile the unscaled part on fewer shoulders.
The missing seat.
Broad accountability works in large firms because some of its nodes are deliberately independent of the work.
Here is the part worth pulling on, because it is where the answer hides. A CEO is accountable for the whole company and inspects almost none of it. That works because underneath sits a human responsibility network: risk owners, finance controllers, legal counsel, engineering leads, internal audit. And the load-bearing fact about that network is that some of those nodes are independent by design. Internal audit, risk, the external auditor. Their value is not that they execute the work well. Their value is that they can take a position on the work that no one inside it can take about themselves.
A control system wrapped around a human sits on top of something deeper: role expectations, professional norms, reputation, consequence. The controls do not need to carry the full load, because the obligation underneath them carries part of it. A control system wrapped around an agent is doing something narrower. Logs, permissions, and evals are necessary and useful, and they constrain behaviour from the outside. They do not create obligation inside the agent.
So as execution moves to agents, the scarce resource is not supervision capacity. It is independent, evidenced judgment that a board can actually rely on. An independent read on an AI estate is not a dashboard. It needs four things:
- A board-ready position, not a live feed. A dashboard shows activity. A board needs a signed read: is this AI compounding value on the balance sheet or exposing it, and where are the exceptions?
- A materiality threshold set before the work, not after. Agreed at the start, so a finding is either an exception that matters or an item for the appendix. Otherwise every reviewer relitigates what "bad enough" means.
- Evidence a third party can re-examine. Production traces and working papers, not screenshots. The point of evidence is that someone who did not do the work can check it.
- A seat that did not build the thing it is judging. The team that builds the harness cannot be the independent read on its own harness. Independence is the whole point.
Put those four together and you have the missing seat for AI in finance: an independent AI audit, an evidenced operating read that ends in an opinion a board can sign its name to. Almost no one has built it yet.
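To make the shape of that read concrete, here is a minimal Python sketch of how the threshold, evidence, and independence requirements might be encoded. Everything in it is an assumption made for illustration: the names `Finding`, `Engagement`, and `classify` are hypothetical, and treating materiality as a single exposure number is a simplification, since real thresholds are often partly qualitative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    description: str
    exposure: float        # estimated exposure in an agreed unit (hypothetical)
    evidence_refs: tuple   # pointers to traces or working papers, not screenshots


@dataclass(frozen=True)
class Engagement:
    materiality: float     # threshold agreed before the work starts
    builders: frozenset    # teams that built the system under review
    reviewer: str          # the seat issuing the read


def is_independent(engagement: Engagement) -> bool:
    """The reviewer must not be one of the builders."""
    return engagement.reviewer not in engagement.builders


def classify(finding: Finding, engagement: Engagement) -> str:
    """Sort a finding into 'exception', 'appendix', or 'unevidenced'."""
    if not is_independent(engagement):
        raise ValueError("reviewer built the system; the read is not independent")
    if not finding.evidence_refs:
        # Nothing a third party could re-examine, so it cannot support an opinion.
        return "unevidenced"
    if finding.exposure >= engagement.materiality:
        return "exception"
    return "appendix"
```

Because the threshold lives on the engagement rather than on the finding, it is fixed before any finding exists, and a reviewer cannot quietly move it afterwards. The sketch covers only the mechanical part. The board-ready position itself remains a judgment that a person has to sign.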
What boards can rely on.
The audit opinion the board relied on was never written by the desk that placed the trades.
In twelve years across Goldman Sachs and JPMorgan, the thing I watched hold up under regulatory scrutiny was the three lines of defense. The first line owns the workflow. The second sets the controls. The third issues an opinion precisely because it did not do the work. That separation was not bureaucracy. It was the mechanism that let a board sign its name to something thousands of people had touched.
Agents compress first-line execution fast. They leave the third line untouched, and the third line was never built to opine on non-deterministic output it cannot reproduce. A regulated firm running its own in-house eval harness runs straight into this. The harness can be excellent and the obligation still moves up to the person who built it, because a builder cannot be the independent read on the build. The instruments the third line needs for AI do not exist yet in most firms. That gap is the work.
Who can still answer for it.
Agents can scale what gets done. They cannot scale who answers for it.
The firms that treat that as a measurement problem, and build the independent, evidenced read before a regulator or a board asks for it, are the ones that get to keep delegating. The rest will quietly shrink their ambitions back to the size of the responsibility their existing people can carry, which is the opposite of what the technology promised them.
Sarda ended with the right question. We may be scaling execution. We are not necessarily scaling delegation. The answer turns on a small, old test. Who can still say "I've got this," and mean it? For the parts we hand to agents, the honest answer today is no one. Building that seat, with evidence behind it, is how firms keep delegating as more work moves to agents.
With credit to Deepak Sarda, CTO of Endowus, whose piece "Agents lack Agency" prompted this one. Worth reading in full.
