Measuring AI impact across departments — not just engineering — requires function-specific metrics tied to a unified framework, which is exactly where platforms like Larridin are headed. Engineering is the natural first step because AI's impact there is the most quantifiable. But companies that stop at engineering miss the majority of the value.
Every enterprise buyer we've spoken to in the last quarter has asked some version of the same question: When does this expand beyond engineering? One investor put it bluntly during a recent demo — "Will this cover design, marketing, and sales?" The question isn't hypothetical anymore. It's a purchasing criterion.
The answer is yes, it has to. And the measurement principles that work for engineering — passive telemetry, before/after comparisons, adoption curves — translate to every knowledge-work function. The challenge isn't conceptual. It's operational.
Engineering isn't just where most companies begin measuring AI. It's where they should begin.
Three reasons. First, engineering produces the most measurable outputs in any knowledge-work organization. Commits, pull requests, review cycles, deployment frequency, defect rates — the data exhaust is enormous and already instrumented. Second, engineering is where AI tools hit first and hardest. GitHub reports that Copilot is used by over 77,000 organizations as of early 2026, making it one of the most widely deployed AI tools in enterprise software. Third, the financial stakes are high: developer compensation typically represents 25-40% of total R&D spend at technology companies.
Larridin's five-pillar framework for engineering measurement — adoption, code share, complexity-adjusted velocity, quality, and cost/ROI — exists because engineering gave us the cleanest laboratory. We could instrument the workflow, observe changes, and attribute outcomes to specific tool usage without relying on surveys.
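To make those five pillars concrete, here's a minimal sketch of how a single reporting period might be represented and rolled up into a rough cost/ROI figure. The field names, the formula, and the numbers are illustrative assumptions, not Larridin's actual data model or methodology.

```python
from dataclasses import dataclass

@dataclass
class EngineeringPillars:
    """One reporting period across the five engineering pillars (illustrative fields)."""
    adoption_rate: float       # share of engineers actively using AI tools (0-1)
    ai_code_share: float       # share of merged code attributed to AI assistance (0-1)
    adjusted_velocity: float   # complexity-adjusted throughput vs. baseline (1.0 = unchanged)
    defect_rate: float         # quality proxy: escaped defects per 1,000 changed lines
    monthly_tool_cost: float   # AI tool spend for the period, in dollars

def estimated_net_value(p: EngineeringPillars, eng_monthly_payroll: float) -> float:
    """Cost/ROI pillar, crudely: payroll value of the velocity gain minus tool spend."""
    speedup = max(p.adjusted_velocity - 1.0, 0.0)   # fractional gain over baseline
    return eng_monthly_payroll * speedup * p.adoption_rate - p.monthly_tool_cost

# Hypothetical team: 67% adoption, 12% adjusted speedup, $900k monthly payroll
team = EngineeringPillars(0.67, 0.35, 1.12, 0.8, 25_000)
print(f"Estimated monthly net value: ${estimated_net_value(team, 900_000):,.0f}")
```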
But here's what we keep hearing from customers: "Great, you've proven AI is working for our engineers. Now prove it for everyone else."
Each function has its own version of the before/after measurement problem. The inputs differ. The outputs differ. The definition of "quality" differs. But the underlying question is identical: Is AI making this team faster, better, or cheaper — and by how much?
Content teams are the second-most obvious AI measurement target after engineering. The metrics are tangible: content velocity, revision cycle reduction, campaign turnaround time, and personalization depth.
HubSpot's 2025 State of Marketing report found that teams using AI for content creation produced 3.5x more output per marketer — but didn't measure whether that output performed better. That gap between volume and value is exactly the measurement problem.
Sales teams adopted AI early for email drafts and CRM summaries, but measurement has been crude. The metrics that matter are proposal turnaround time and win rates.
Gartner predicts that by 2028, 60% of B2B seller work will be executed through conversational AI interfaces. That's a massive shift — and nobody has a credible way to measure whether it's working beyond quota attainment, which is influenced by a hundred variables.
Contract review is the marquee AI use case in legal, and measurement is straightforward in theory but brutal in practice.
Thomson Reuters found that lawyers using AI-assisted review tools completed contract analysis 30% faster in controlled studies. Uncontrolled enterprise deployments? Nobody knows. That's the problem.
Recruitment is awash in AI tools for resume screening, interview scheduling, and candidate outreach, but measurement there lags well behind adoption.
Here's where most "cross-functional AI measurement" efforts collapse. You can measure content velocity in marketing, proposal turnaround in sales, and contract review time in legal. You can prove improvements in each. And then your CFO asks: "So what's our total AI ROI?"
Silence.
The metrics don't share a unit. You can't add 3.5x content output to 30% faster contract review and get a meaningful number. Traditional ROI measurement approaches work within a function, but they fall apart when you try to roll them up.
This is the same problem enterprise software has faced for decades — trying to create a single "productivity" metric across functions that do fundamentally different work. The solution isn't forcing everything into one number. It's building a framework that preserves function-specific meaning while enabling comparison.
A cross-functional AI measurement framework needs three layers:
Layer 1: Function-specific metrics. Each department owns its own KPIs. Engineering tracks code share and velocity. Marketing tracks content velocity and campaign turnaround. Sales tracks proposal speed and win rates. These are the ground truth — the numbers each team actually manages to.
Layer 2: Normalized adoption and engagement. Regardless of function, you can measure the same adoption curve everywhere: What percentage of the team is using AI tools? How deeply? How consistently? This is where Larridin's four-layer adoption measurement model — Usage → Depth → Breadth → Segmentation — becomes function-agnostic. A marketer using ChatGPT for 40% of their content drafts and an engineer using Copilot for 35% of their code are at comparable engagement levels, even though their outputs are completely different.
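As a rough sketch of how a function-agnostic engagement score could be computed, consider the following. The field names and equal weighting are assumptions for illustration, not Larridin's scoring model; segmentation, the fourth layer, would simply mean computing the same score per role or seniority cohort.

```python
from dataclasses import dataclass

@dataclass
class TeamAdoption:
    function: str             # "engineering", "marketing", "legal", ...
    team_size: int
    active_users: int         # people who used an AI tool this period (usage)
    ai_assisted_share: float  # avg. share of output drafted with AI, 0-1 (depth)
    workflows_covered: int    # distinct workflows where AI is in use (breadth)
    workflows_total: int

def engagement_score(t: TeamAdoption) -> float:
    """Blend usage, depth, and breadth into one 0-100 score comparable across functions."""
    usage = t.active_users / t.team_size
    breadth = t.workflows_covered / t.workflows_total
    # Equal weights are an assumption; tune per organization.
    return round(100 * (usage + t.ai_assisted_share + breadth) / 3, 1)

marketing = TeamAdoption("marketing", 20, 9, 0.40, 2, 6)
engineering = TeamAdoption("engineering", 45, 30, 0.35, 5, 6)
print(engagement_score(marketing), engagement_score(engineering))  # 39.4 61.7
```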
Layer 3: Economic translation. Convert function-specific improvements into dollar terms. If contract review is 30% faster and your legal team bills at $450/hour internally, that's a calculable savings. If content velocity triples, you can quantify either the labor savings or the opportunity cost of the content you weren't producing before. This layer is imperfect — every translation involves assumptions — but it's the only way to give the CFO a number.
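Here's a minimal sketch of that translation in code. The 30% reduction, the $450/hour internal rate, and the roughly 3x content lift come from the figures above; the contract volume, baseline review hours, hours per piece, and loaded hourly cost are hypothetical inputs you'd swap for your own.

```python
def review_savings(contracts_per_month: int, baseline_hours: float,
                   time_reduction: float, internal_rate: float) -> float:
    """Dollar value of faster contract review: hours saved times internal billing rate."""
    hours_saved = contracts_per_month * baseline_hours * time_reduction
    return hours_saved * internal_rate

def content_labor_equivalent(pieces_before: int, pieces_after: int,
                             hours_per_piece: float, loaded_hourly_cost: float) -> float:
    """Labor-equivalent value of the extra content produced at the same headcount."""
    return (pieces_after - pieces_before) * hours_per_piece * loaded_hourly_cost

# 30% faster review of 80 contracts/month (4h baseline each) at $450/hour internally
print(f"Legal:     ${review_savings(80, 4, 0.30, 450):,.0f}/month")
# Content output tripled from 20 to 60 pieces at ~6 hours and ~$85/hour loaded cost
print(f"Marketing: ${content_labor_equivalent(20, 60, 6, 85):,.0f}/month")
```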
| Layer | What It Measures | Who Cares | Unit |
|---|---|---|---|
| Function-specific | Task speed, quality, throughput | Department leads | Varies by function |
| Normalized adoption | Tool usage, depth, consistency | CTO / CIO | Percentage + engagement score |
| Economic translation | Dollar impact per function | CFO / CEO | Currency |
The mistake most companies make is starting at Layer 3 — trying to calculate ROI before they've even established baseline metrics at Layer 1. You can't translate what you haven't measured.
Companies that measure AI impact only in engineering develop a distorted picture. Engineering becomes the "AI-powered" team. Everyone else is either skeptical, resentful, or flying blind.
Full-spectrum visibility changes the organizational conversation. When the marketing VP can see that her team's AI adoption is at 23% while engineering is at 67%, that's not a failure — it's a roadmap. It tells you where to invest in training, where to deploy new tools, and where the next productivity gains are hiding.
McKinsey's 2025 survey on AI at scale found that organizations measuring AI impact across three or more functions were 2.4x more likely to report "significant" financial returns from AI than those measuring only one function. The act of measurement itself drives adoption — because what gets measured gets managed, and what gets compared gets competitive.
There's a strategic dimension too. When AI measurement lives in a single function, it's a tool optimization exercise. When it spans the organization, it becomes a transformation metric — a way to answer the board-level question: "How AI-native is this company, really?"
Expanding AI measurement beyond engineering isn't a single project. It's a sequence.
Phase 1 (Months 1-3): Nail engineering. Get your five-pillar framework working. Establish baselines. Prove that passive measurement works without disrupting workflows. This is where Larridin's platform lives today — and where the measurement methodology gets validated.
Phase 2 (Months 4-6): Identify the next function. Pick the department with the highest AI tool spend and the weakest measurement. Usually that's marketing or sales. Deploy the same adoption measurement layer (usage, depth, breadth, segmentation) while building function-specific KPIs.
Phase 3 (Months 7-12): Build the translation layer. Once you have function-specific metrics and normalized adoption data across two or more functions, you can start economic translation. This is when the CFO gets their dashboard.
Phase 4 (Month 12+): Continuous expansion. Add legal, HR, customer success, operations. Each new function is faster to onboard because the framework already exists — you're just defining new function-specific metrics and plugging into the adoption and economic layers.
The companies that get this right won't just know that they spent $2M on AI tools last year. They'll know exactly which functions got value, which didn't, and why.
Start with a universal adoption measurement layer — tracking tool usage frequency, depth of engagement, breadth across team members, and segmentation by role or seniority. Then add function-specific output metrics (velocity, quality, turnaround time) unique to each department. The combination gives you comparable adoption data with contextually meaningful performance data.
Content velocity (output per person), revision cycle reduction (fewer drafts before publish), campaign turnaround time, and personalization depth are the most actionable. Avoid measuring only volume — a 3x increase in content output means nothing if engagement metrics don't follow.
Engineering produces the most quantifiable outputs in any knowledge-work organization — commits, PRs, deployments, defect rates — and already has extensive instrumentation. AI coding tools like GitHub Copilot also have the highest enterprise penetration of any AI category, making before/after measurement feasible without new infrastructure.
Not directly. Function-specific metrics use different units, so you can't add "faster contract review" to "more content output" in a meaningful way. The workaround is a three-layer approach: function-specific metrics, normalized adoption scores, and economic translation into dollar terms. The dollar layer enables cross-function comparison, but it always involves assumptions.
Most organizations need 3-6 months to establish robust engineering measurement, another 3-6 months to add a second function, and 6-12 months to build the economic translation layer. Full cross-functional measurement across four or more departments typically takes 12-18 months from a standing start.
Trying to calculate enterprise-wide ROI before establishing baseline metrics in any single function. Companies that jump straight to "what's our total AI return?" end up with meaningless numbers built on unvalidated assumptions. Start with one function, prove the methodology, then expand.
Stop guessing where to deploy AI next.
Larridin's AI Opportunity Discovery finds high-impact automation opportunities hiding in your workflows — in minutes, not months.
Discover AI Opportunities →