Agent FTE estimates the effective engineering capacity that AI coding agents add to a team, expressed in full-time-equivalent units and grounded in accepted, durable outcomes such as merged pull requests, completed tasks, and verified changes. It answers how much real engineering capacity agents contribute, in a unit that leaders and finance already understand.
Agent FTE is a team- and system-level estimate. It is not a headcount-replacement claim, and it is not a way to rank individual engineers. It starts from raw agent activity, then discounts it down to the work the engineering system actually accepted and kept.
Most agent reporting stops at usage. Sessions run, prompts sent, and tokens consumed all show activity. None of them show capacity. Agent FTE converts accepted output into a familiar denominator so leaders can compare agent contribution against team size, hiring plans, and budget.
| Finding | What It Means |
|---|---|
| Agent FTE is capacity, not activity. | It is estimated from accepted, durable outcomes, not from sessions, prompts, or token volume. |
| It uses a unit leaders already have. | Expressing agent contribution in full-time-equivalent terms lets it sit next to headcount, hiring plans, and budget. |
| It is an estimate, not a precise count. | Agent FTE depends on stated assumptions about human-equivalent effort per accepted outcome. Treat it as a modeled range. |
| It is bounded by the environment. | A weak agent environment lowers real Agent FTE because fewer sessions convert into accepted work. |
| It is not a headcount-replacement number. | Agent FTE describes added engineering capacity, not engineers to remove. Using it that way breaks the metric and the team. |
Agent FTE is a modeled estimate, not a direct measurement. The method is to start from accepted outcomes, translate those into human-equivalent effort, express the result as full-time-equivalent capacity, and then price it. Each step carries an assumption that should be stated and revisited.
The calculation moves through four layers:
| Layer | Input | What It Produces |
|---|---|---|
| Accepted outcomes | Merged AI-assisted PRs, completed tasks, verified changes that survive review and follow-up. | The durable work agents actually contributed, with retries, dead ends, and rewritten output filtered out. |
| Human-equivalent effort | Estimated effort a team would spend to produce the same accepted outcomes, by work type. | Accepted agent work restated in engineer-hours or engineer-days. |
| Agent FTE estimate | Human-equivalent effort divided by the working capacity of one full-time engineer over the same period. | Effective added capacity in full-time-equivalent units. |
| Cost per Agent FTE | Token and platform cost over the period, divided by the Agent FTE estimate. | The efficiency view: what one unit of added agent capacity costs. |
The first layer is the one that keeps the metric honest. Only accepted, durable outcomes count, which is why Agent FTE is downstream of Engineer-Agent Effectiveness. If sessions do not convert into work the system keeps, they do not add to Agent FTE, no matter how many tokens they burned.
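The four layers can be sketched in a few lines of Python. This is a minimal illustration, not Larridin's implementation: the `AcceptedOutcome` record, the work-type labels, the per-type effort table, and the 160 hours per engineer-month are all assumptions introduced here and should be replaced with a team's own figures.

```python
from dataclasses import dataclass

# Assumption: working capacity of one full-time engineer, in hours per month.
# The source gives no figure; 160 is a placeholder.
HOURS_PER_ENGINEER_MONTH = 160.0


@dataclass
class AcceptedOutcome:
    work_type: str  # hypothetical label, e.g. "config_fix" or "feature"
    durable: bool   # True if not reworked, reverted, or rewritten soon after


def human_equivalent_hours(outcomes, effort_hours_by_type):
    """Layers 1 and 2: keep durable accepted outcomes, restate them as engineer-hours."""
    return sum(effort_hours_by_type[o.work_type] for o in outcomes if o.durable)


def agent_fte(effort_hours, period_months, hours_per_month=HOURS_PER_ENGINEER_MONTH):
    """Layer 3: effort divided by one full-time engineer's capacity over the period."""
    return effort_hours / (hours_per_month * period_months)


def cost_per_agent_fte(total_cost, fte):
    """Layer 4: token and platform cost divided by the Agent FTE estimate."""
    if fte <= 0:
        raise ValueError("cost per Agent FTE is undefined when Agent FTE is zero")
    return total_cost / fte


if __name__ == "__main__":
    outcomes = [
        AcceptedOutcome("config_fix", durable=True),
        AcceptedOutcome("feature", durable=True),
        AcceptedOutcome("feature", durable=False),  # merged, then reverted: not counted
    ]
    effort = {"config_fix": 1.0, "feature": 40.0}  # made-up hours per outcome
    hours = human_equivalent_hours(outcomes, effort)
    fte = agent_fte(hours, period_months=3)
    print(hours, fte, cost_per_agent_fte(500.0, fte))
```

Note that the non-durable outcome contributes nothing, however much agent activity went into it. That filter is the first layer doing its job.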
The estimate is sensitive to assumptions. Human-equivalent effort per accepted outcome varies by work type, repository, and reviewer standard. A one-line config fix and a new service are both accepted outcomes, and they are not the same amount of engineering. Credible Agent FTE reporting shows the assumptions, segments by work type, and presents a range rather than a single confident number.
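One way to report a range rather than a single number is to carry a low and a high effort assumption for each work type through the calculation. The sketch below assumes hypothetical work-type names and made-up effort bounds. It shows the shape of the approach and is not a calibrated model.

```python
def agent_fte_range(counts_by_type, effort_range_by_type, period_hours):
    """Return per-work-type and total (low, high) Agent FTE.

    counts_by_type:       accepted, durable outcomes per work type
    effort_range_by_type: (low_hours, high_hours) assumed per outcome, per work type
    period_hours:         one full-time engineer's working hours over the period
    """
    by_type = {}
    total_low = total_high = 0.0
    for work_type, count in counts_by_type.items():
        low_hours, high_hours = effort_range_by_type[work_type]
        low = count * low_hours / period_hours
        high = count * high_hours / period_hours
        by_type[work_type] = (low, high)
        total_low += low
        total_high += high
    return by_type, (total_low, total_high)


if __name__ == "__main__":
    # All figures below are illustrative assumptions, not benchmarks.
    counts = {"config_fix": 30, "new_service": 2}
    effort = {"config_fix": (0.5, 2.0), "new_service": (80.0, 240.0)}
    by_type, total = agent_fte_range(counts, effort, period_hours=480.0)
    for work_type, (low, high) in by_type.items():
        print(f"{work_type}: {low:.2f} to {high:.2f} Agent FTE")
    print(f"total: {total[0]:.2f} to {total[1]:.2f} Agent FTE")
```

Keeping the per-type breakdown next to the total makes the assumptions visible: a reader can see that two new services dominate the estimate even though config fixes dominate the outcome count.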
Larridin measures the inputs for this estimate through AI workforce views: agent-hours against human-equivalent hours, accepted outcomes by team and repository, and cost per accepted outcome. Those signals feed the layers above. The output is an operating estimate, reviewed on a cadence, not a fixed figure.
A VP Engineering runs an organization of about 100 engineers. A vendor pitch claims the coding agent "replaces 20 engineers." Rolled up from raw activity, the numbers look supportive: agents ran thousands of sessions and produced code volume that, on paper, maps to the output of roughly 20 engineers.
The accepted-outcome view is different.
Start from what merged and survived. Over the quarter, agents contributed to accepted work that, by the team's own effort estimates, would have taken about 8 engineer-months. Spread across the 3-month period, that is 8 ÷ 3, or roughly 2.7 Agent FTE of added capacity, not 20. The gap is the difference between generated output and durable output. Much of the raw session volume was exploration, retries, rewritten diffs, and code that reviewers rejected or replaced within weeks.
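The arithmetic behind this example is short enough to check directly. The sketch below uses only the illustrative figures from the scenario; the function name is a hypothetical helper.

```python
def fte_from_engineer_months(engineer_months, period_months):
    """Human-equivalent effort divided by one engineer's capacity over the period."""
    return engineer_months / period_months


accepted_fte = fte_from_engineer_months(engineer_months=8, period_months=3)
vendor_claim_fte = 20

print(f"{accepted_fte:.1f}")                     # 2.7
print(f"{accepted_fte / vendor_claim_fte:.0%}")  # 13%
```

On these figures, the accepted-outcome estimate is about 13% of the activity-based claim. The ratio describes this one example and should not be read as a general conversion rate.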
The naive claim counted activity. The Agent FTE estimate counted accepted, durable work and stated its assumptions.
That reframes the decision. The VP is not deciding whether to remove 20 engineers. The VP is deciding how to grow the roughly 2.7 Agent FTE the team already earns, which teams and repositories convert agent work best, and what raising Agent Readiness would add. The number is smaller and far more useful. The figures here are illustrative: they show the shape of the calculation and are not benchmarks.
Agent FTE sits at the end of a conversion funnel. Each stage discards work that does not turn into durable capacity.
| Stage | Question | What Survives |
|---|---|---|
| Activity | Are agents being used? | Sessions, prompts, tokens across platforms. |
| Engagement | Are agents used on real work? | Task-linked sessions, AI-assisted PRs. |
| Accepted outcomes | Does the work get accepted? | Merged PRs, completed tasks, verified changes. |
| Durable outcomes | Does the work survive? | Accepted work that is not reworked, reverted, or rewritten soon after. |
| Agent FTE | How much capacity is that? | Human-equivalent effort restated in full-time-equivalent units. |
Segment the estimate rather than reporting one company-wide figure. Agent FTE by team, by repository, and by work type shows where agent capacity is real and where it is thin. A well-tested service with clear conventions will convert agent sessions into accepted work at a higher rate than a legacy repo with flaky CI and missing context.
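The funnel and the segmentation can be combined into a per-repository conversion view. The sketch below is a rough illustration under several assumptions: the repository names and counts are invented, and each stage is assumed to be counted in comparable task-linked units so that stage-to-stage ratios are meaningful.

```python
# Funnel stages that come before the Agent FTE estimate, in order.
FUNNEL_STAGES = ["activity", "engagement", "accepted", "durable"]


def stage_conversion(counts):
    """Conversion rate from each funnel stage to the next."""
    rates = {}
    for prev, nxt in zip(FUNNEL_STAGES, FUNNEL_STAGES[1:]):
        rates[f"{prev}->{nxt}"] = counts[nxt] / counts[prev] if counts[prev] else 0.0
    return rates


def durable_yield(counts):
    """Share of raw activity that ends up as durable accepted work."""
    return counts["durable"] / counts["activity"] if counts["activity"] else 0.0


if __name__ == "__main__":
    # Hypothetical repositories with made-up counts.
    repos = {
        "well-tested-service": {"activity": 400, "engagement": 300, "accepted": 150, "durable": 120},
        "legacy-repo": {"activity": 400, "engagement": 200, "accepted": 40, "durable": 20},
    }
    for name, counts in repos.items():
        print(name, f"durable yield {durable_yield(counts):.0%}", stage_conversion(counts))
```

Two repositories with identical activity can differ sharply in durable yield. The stage where the rate drops points to where the environment, rather than the agent, is the constraint.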
A useful Agent FTE view should answer:
The first caveat is precision. Agent FTE is an estimate built on effort assumptions, and it should be reported as a range with those assumptions visible. A single confident decimal implies a measurement the method does not support.
The most common failure mode is treating Agent FTE as a headcount-replacement number. It describes added capacity from accepted agent work. It does not identify engineers to remove, and using it that way pushes teams to inflate accepted-outcome counts and game the estimate. A second failure mode is individual ranking. Agent FTE is a team and system estimate. Attributing a personal Agent FTE to each engineer rewards visible agent usage over durable engineering.
Two more failure modes are worth watching. Counting generated volume instead of accepted, durable outcomes inflates the number toward the naive vendor claim. Ignoring the environment hides the real constraint, because a weak setup lowers real Agent FTE regardless of how capable the agent is.
| Bad framing | Better framing |
|---|---|
| "The agent replaces N engineers." | "Agents add roughly N Agent FTE of accepted capacity this quarter." |
| "This engineer is worth 3 Agent FTE." | "This repository converts agent work into Agent FTE at a high rate." |
| "Agent FTE is up, so we can cut headcount." | "Agent FTE is up. Where should we point the added capacity?" |
| "We generated 20 FTE of code." | "We accepted and kept work equal to about 3 Agent FTE." |
Start by separating activity from capacity. Pull agent sessions, prompts, and tokens to one side, and accepted, durable outcomes to the other. Agent FTE is estimated only from the second side.
Then put the estimate in context:
Agent FTE is most useful as a grounded estimate you can defend in a budget review, not a headline you use to justify a cut.
No. Agent FTE estimates the accepted engineering capacity agents add, in full-time-equivalent units. It describes added capacity, not engineers to remove, and it is not a headcount-replacement claim.
It starts from accepted, durable outcomes, translates those into human-equivalent effort by work type, restates that effort as full-time-equivalent capacity, and then divides cost by the estimate to get cost per Agent FTE. It is a modeled estimate with stated assumptions.
Sessions, prompts, and token counts show activity, not capacity. Much of that activity is retries, exploration, and rewritten output that the system never keeps. Agent FTE counts only accepted, durable work, which is why it tracks real contribution more closely.
No. It is a team and system estimate. Assigning a personal Agent FTE rewards visible agent usage over durable engineering and misreads what the metric measures.