What operations people actually use agents for
Not the things in the pitch. The things they come back to every day without thinking about it. From what we have seen across a dozen retainer clients over the past two years, the durable use cases cluster around a few patterns.
Routing is the most common. Not clever routing, just basic routing. A new form submission arrives, the agent reads it, determines which team or person it belongs to, and sends it there. This sounds trivial, but for most of these clients it was a task that took a human twenty minutes a day. The agent does not make better decisions than the human did. It just does not get distracted, forget, or go on holiday.
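To make "basic routing" concrete, here is a minimal sketch. It is not the implementation used for any client: the keyword table, the team names, and the `triage` fallback queue are all hypothetical, and a simple keyword match stands in for whatever the agent actually does when it reads a submission.

```python
from dataclasses import dataclass

# Hypothetical routing table (keyword -> team). Real rules would come from the client.
ROUTES = {
    "invoice": "finance",
    "refund": "finance",
    "login": "support",
    "password": "support",
    "quote": "sales",
}
FALLBACK = "triage"  # hypothetical human-reviewed queue for anything unmatched


@dataclass
class Submission:
    subject: str
    body: str


def route(submission: Submission) -> str:
    """Return the team for a submission, or FALLBACK if no keyword matches."""
    text = f"{submission.subject} {submission.body}".lower()
    for keyword, team in ROUTES.items():
        if keyword in text:
            return team
    return FALLBACK
```

The explicit fallback matters more than the matching logic. Anything the rules do not recognize goes to a person instead of being guessed at.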
The second durable use case is status summarization. An agent pulls data from Airtable or HubSpot or whatever system of record the team uses, and produces a short summary at a set interval. Not a dashboard — a text summary sent to a Slack channel or email thread. People actually read these, which surprised us.
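A sketch of the summarization step, assuming the records have already been fetched from the system of record. The field names (`name`, `status`, `owner`) and the `blocked` status are assumptions for illustration. Fetching from Airtable or HubSpot and posting to Slack are deliberately left out.

```python
from collections import Counter


def summarize(records: list[dict]) -> str:
    """Build a short plain-text status summary.

    Each record is assumed to have 'name', 'status' and 'owner' keys.
    """
    counts = Counter(r["status"] for r in records)
    lines = [f"Status summary: {len(records)} items"]
    # One line per status, in alphabetical order so the output is stable.
    for status, n in sorted(counts.items()):
        lines.append(f"- {status}: {n}")
    # Call out blocked items by name, since those are what readers act on.
    for r in records:
        if r["status"] == "blocked":
            lines.append(f"  blocked: {r['name']} (owner: {r['owner']})")
    return "\n".join(lines)
```

The output is plain text on purpose, so it can be pasted into a channel message or an email body without any formatting step.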
Where they disappoint six months in
An agent built to handle edge cases usually ends up not being used for them. The team finds a workaround that involves less friction than invoking the agent correctly. This is a design problem, not a technology problem, but it presents as a technology problem.
Anything that requires the agent to understand context it was not given when it was built also degrades. A client changed their internal project naming convention eight months after deployment. The agent kept working, sort of, but started making categorization errors that nobody noticed for two weeks. This is exactly the kind of thing a maintenance retainer catches: we updated the system prompt and added a monitoring step that would catch the same pattern in the future.
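One way a monitoring step of this kind could work is to track how many incoming project names no longer match the convention the agent was built around, and alert when that share rises. This is a sketch under assumptions, not the check that was actually deployed. The pattern (`OPS-0142` style names) and the 10% threshold are hypothetical.

```python
import re

# Hypothetical naming convention: two or more capitals, a hyphen, four digits.
KNOWN_PATTERN = re.compile(r"^[A-Z]{2,}-\d{4}$")


def unrecognized_rate(project_names: list[str]) -> float:
    """Fraction of names that do not match the known convention."""
    if not project_names:
        return 0.0
    misses = sum(1 for name in project_names if not KNOWN_PATTERN.match(name))
    return misses / len(project_names)


def should_alert(project_names: list[str], threshold: float = 0.1) -> bool:
    """True when the share of unrecognized names exceeds the threshold."""
    return unrecognized_rate(project_names) > threshold
```

A check like this does not fix the categorization. It shortens the time between the environment changing and someone noticing.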
The gap between "the agent runs" and "the agent is useful" is real and persistent. We have seen agents that had a 95% completion rate on their assigned tasks, but the 5% failures were the ones that cost the most time to recover from. Completion rate is the wrong metric. Recovery cost is the right one.
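The difference between the two metrics is easy to see with arithmetic. The numbers below are invented for illustration and do not come from any client: one agent completes 95% of tasks but each failure takes an hour to untangle, while another completes 90% with failures that take five minutes to fix.

```python
def completion_rate(total: int, completed: int) -> float:
    """Share of assigned tasks the agent finished."""
    return completed / total


def recovery_cost(total: int, completed: int, minutes_per_failure: int) -> int:
    """Total human minutes spent recovering from failed tasks."""
    failures = total - completed
    return failures * minutes_per_failure


# Hypothetical agents, 1000 tasks each.
agent_a = recovery_cost(total=1000, completed=950, minutes_per_failure=60)
agent_b = recovery_cost(total=1000, completed=900, minutes_per_failure=5)
```

Here `agent_a` comes to 3000 minutes and `agent_b` to 500. The agent with the better completion rate costs six times as much human time.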
A note on expectations
The teams that get the most from agents after six months are the ones who treated the deployment as the start of an iteration cycle, not the end of a project. They report breakdowns. They ask for changes to the scope when the original design does not match how the work actually moves. They do not expect the agent to be intuitive.
The teams that were disappointed had generally expected something more like a new employee — something that would learn the role, adapt, improve without intervention. Agents do not do that. They do what they were built to do, consistently, until something in the environment changes. Reliability in a narrow lane, not generalization.