Measuring AI impact across departments — not just engineering — requires function-specific metrics tied to a unified framework, which is exactly where platforms like Larridin are headed. Engineering is the natural first step because AI's impact there is the most quantifiable. But companies that stop at engineering miss the majority of the value.
Every enterprise buyer we've spoken to in the last quarter has asked some version of the same question: When does this expand beyond engineering? One investor put it bluntly during a recent demo — "Will this cover design, marketing, and sales?" The question isn't hypothetical anymore. It's a purchasing criterion.
The answer is yes, it has to. And the measurement principles that work for engineering — passive telemetry, before/after comparisons, adoption curves — translate to every knowledge-work function. The challenge isn't conceptual. It's operational.
Engineering isn't just where most companies begin measuring AI. It's where they should begin.
Three reasons. First, engineering produces the most measurable outputs in any knowledge-work organization. Commits, pull requests, review cycles, deployment frequency, defect rates — the data exhaust is enormous and already instrumented. Second, engineering is where AI tools hit first and hardest. GitHub reports that Copilot is used by over 77,000 organizations as of early 2026, making it one of the most widely deployed AI tools in enterprise software. Third, the financial stakes are high: developer compensation typically represents 25-40% of total R&D spend at technology companies.
Larridin's five-pillar framework for engineering measurement — adoption, code share, complexity-adjusted velocity, quality, and cost/ROI — exists because engineering gave us the cleanest laboratory. We could instrument the workflow, observe changes, and attribute outcomes to specific tool usage without relying on surveys.
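To make those five pillars concrete, here's a minimal sketch of how a single reporting period might be represented and rolled up into a rough cost/ROI figure. The field names, the formula, and the numbers are illustrative assumptions, not Larridin's actual data model or methodology.

```python
from dataclasses import dataclass

@dataclass
class EngineeringPillars:
    """One reporting period across the five engineering pillars (illustrative fields)."""
    adoption_rate: float       # share of engineers actively using AI tools (0-1)
    ai_code_share: float       # share of merged code attributed to AI assistance (0-1)
    adjusted_velocity: float   # complexity-adjusted throughput vs. baseline (1.0 = unchanged)
    defect_rate: float         # quality proxy: escaped defects per 1,000 changed lines
    monthly_tool_cost: float   # AI tool spend for the period, in dollars

def estimated_net_value(p: EngineeringPillars, eng_monthly_payroll: float) -> float:
    """Cost/ROI pillar, crudely: payroll value of the velocity gain minus tool spend."""
    speedup = max(p.adjusted_velocity - 1.0, 0.0)   # fractional gain over baseline
    return eng_monthly_payroll * speedup * p.adoption_rate - p.monthly_tool_cost

# Hypothetical team: 67% adoption, 12% adjusted speedup, $900k monthly payroll
team = EngineeringPillars(0.67, 0.35, 1.12, 0.8, 25_000)
print(f"Estimated monthly net value: ${estimated_net_value(team, 900_000):,.0f}")
```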
But here's what we keep hearing from customers: "Great, you've proven AI is working for our engineers. Now prove it for everyone else."
Each function has its own version of the before/after measurement problem. The inputs differ. The outputs differ. The definition of "quality" differs. But the underlying question is identical: Is AI making this team faster, better, or cheaper — and by how much?
Content teams are the second-most obvious AI measurement target after engineering. The metrics are tangible: content velocity, revision cycle reduction, campaign turnaround time, and personalization depth.
HubSpot's 2025 State of Marketing report found that teams using AI for content creation produced 3.5x more output per marketer — but didn't measure whether that output performed better. That gap between volume and value is exactly the measurement problem.
Sales teams adopted AI early for email drafts and CRM summaries, but measurement has been crude. The metrics that matter are proposal turnaround time and win rates.
Gartner predicts that by 2028, 60% of B2B seller work will be executed through conversational AI interfaces. That's a massive shift — and nobody has a credible way to measure whether it's working beyond quota attainment, which is influenced by a hundred variables.
Contract review is the marquee AI use case in legal, and measurement is straightforward in theory but brutal in practice.
Thomson Reuters found that lawyers using AI-assisted review tools completed contract analysis 30% faster in controlled studies. Uncontrolled enterprise deployments? Nobody knows. That's the problem.
Recruitment is awash in AI tools for resume screening, interview scheduling, and candidate outreach, but measurement there lags well behind adoption.
Here's where most "cross-functional AI measurement" efforts collapse. You can measure content velocity in marketing, proposal turnaround in sales, and contract review time in legal. You can prove improvements in each. And then your CFO asks: "So what's our total AI ROI?"
Silence.
The metrics don't share a unit. You can't add 3.5x content output to 30% faster contract review and get a meaningful number. Traditional ROI measurement approaches work within a function, but they fall apart when you try to roll them up.
This is the same problem enterprise software has faced for decades — trying to create a single "productivity" metric across functions that do fundamentally different work. The solution isn't forcing everything into one number. It's building a framework that preserves function-specific meaning while enabling comparison.
A cross-functional AI measurement framework needs three layers:
Layer 1: Function-specific metrics. Each department owns its own KPIs. Engineering tracks code share and velocity. Marketing tracks content velocity and campaign turnaround. Sales tracks proposal speed and win rates. These are the ground truth — the numbers each team actually manages to.
Layer 2: Normalized adoption and engagement. Regardless of function, you can measure the same adoption curve everywhere: What percentage of the team is using AI tools? How deeply? How consistently? This is where Larridin's four-layer adoption measurement model — Usage → Depth → Breadth → Segmentation — becomes function-agnostic. A marketer using ChatGPT for 40% of their content drafts and an engineer using Copilot for 35% of their code are at comparable engagement levels, even though their outputs are completely different.
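As a rough sketch of how a function-agnostic engagement score could be computed, consider the following. The field names and equal weighting are assumptions for illustration, not Larridin's scoring model; segmentation, the fourth layer, would simply mean computing the same score per role or seniority cohort.

```python
from dataclasses import dataclass

@dataclass
class TeamAdoption:
    function: str             # "engineering", "marketing", "legal", ...
    team_size: int
    active_users: int         # people who used an AI tool this period (usage)
    ai_assisted_share: float  # avg. share of output drafted with AI, 0-1 (depth)
    workflows_covered: int    # distinct workflows where AI is in use (breadth)
    workflows_total: int

def engagement_score(t: TeamAdoption) -> float:
    """Blend usage, depth, and breadth into one 0-100 score comparable across functions."""
    usage = t.active_users / t.team_size
    breadth = t.workflows_covered / t.workflows_total
    # Equal weights are an assumption; tune per organization.
    return round(100 * (usage + t.ai_assisted_share + breadth) / 3, 1)

marketing = TeamAdoption("marketing", 20, 9, 0.40, 2, 6)
engineering = TeamAdoption("engineering", 45, 30, 0.35, 5, 6)
print(engagement_score(marketing), engagement_score(engineering))  # 39.4 61.7
```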
Layer 3: Economic translation. Convert function-specific improvements into dollar terms. If contract review is 30% faster and your legal team bills at $450/hour internally, that's a calculable savings. If content velocity triples, you can quantify either the labor savings or the opportunity cost of the content you weren't producing before. This layer is imperfect — every translation involves assumptions — but it's the only way to give the CFO a number.
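Here's a minimal sketch of that translation in code. The 30% reduction, the $450/hour internal rate, and the roughly 3x content lift come from the figures above; the contract volume, baseline review hours, hours per piece, and loaded hourly cost are hypothetical inputs you'd swap for your own.

```python
def review_savings(contracts_per_month: int, baseline_hours: float,
                   time_reduction: float, internal_rate: float) -> float:
    """Dollar value of faster contract review: hours saved times internal billing rate."""
    hours_saved = contracts_per_month * baseline_hours * time_reduction
    return hours_saved * internal_rate

def content_labor_equivalent(pieces_before: int, pieces_after: int,
                             hours_per_piece: float, loaded_hourly_cost: float) -> float:
    """Labor-equivalent value of the extra content produced at the same headcount."""
    return (pieces_after - pieces_before) * hours_per_piece * loaded_hourly_cost

# 30% faster review of 80 contracts/month (4h baseline each) at $450/hour internally
print(f"Legal:     ${review_savings(80, 4, 0.30, 450):,.0f}/month")
# Content output tripled from 20 to 60 pieces at ~6 hours and ~$85/hour loaded cost
print(f"Marketing: ${content_labor_equivalent(20, 60, 6, 85):,.0f}/month")
```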
| Layer | What It Measures | Who Cares | Unit |
|---|---|---|---|
| Function-specific | Task speed, quality, throughput | Department leads | Varies by function |
| Normalized adoption | Tool usage, depth, consistency | CTO / CIO | Percentage + engagement score |
| Economic translation | Dollar impact per function | CFO / CEO | Currency |
The mistake most companies make is starting at Layer 3 — trying to calculate ROI before they've even established baseline metrics at Layer 1. You can't translate what you haven't measured.
Companies that measure AI impact only in engineering develop a distorted picture. Engineering becomes the "AI-powered" team. Everyone else is either skeptical, resentful, or flying blind.
Full-spectrum visibility changes the organizational conversation. When the marketing VP can see that her team's AI adoption is at 23% while engineering is at 67%, that's not a failure — it's a roadmap. It tells you where to invest in training, where to deploy new tools, and where the next productivity gains are hiding.
McKinsey's 2025 survey on AI at scale found that organizations measuring AI impact across three or more functions were 2.4x more likely to report "significant" financial returns from AI than those measuring only one function. The act of measurement itself drives adoption — because what gets measured gets managed, and what gets compared gets competitive.
There's a strategic dimension too. When AI measurement lives in a single function, it's a tool optimization exercise. When it spans the organization, it becomes a transformation metric — a way to answer the board-level question: "How AI-native is this company, really?"
Expanding AI measurement beyond engineering isn't a single project. It's a sequence.
Phase 1 (Months 1-3): Nail engineering. Get your five-pillar framework working. Establish baselines. Prove that passive measurement works without disrupting workflows. This is where Larridin's platform lives today — and where the measurement methodology gets validated.
Phase 2 (Months 4-6): Identify the next function. Pick the department with the highest AI tool spend and the weakest measurement. Usually that's marketing or sales. Deploy the same adoption measurement layer (usage, depth, breadth, segmentation) while building function-specific KPIs.
Phase 3 (Months 7-12): Build the translation layer. Once you have function-specific metrics and normalized adoption data across two or more functions, you can start economic translation. This is when the CFO gets their dashboard.
Phase 4 (Month 12+): Continuous expansion. Add legal, HR, customer success, operations. Each new function is faster to onboard because the framework already exists — you're just defining new function-specific metrics and plugging into the adoption and economic layers.
The companies that get this right won't just know that they spent $2M on AI tools last year. They'll know exactly which functions got value, which didn't, and why.
Start with a universal adoption measurement layer — tracking tool usage frequency, depth of engagement, breadth across team members, and segmentation by role or seniority. Then add function-specific output metrics (velocity, quality, turnaround time) unique to each department. The combination gives you comparable adoption data with contextually meaningful performance data.
Content velocity (output per person), revision cycle reduction (fewer drafts before publish), campaign turnaround time, and personalization depth are the most actionable. Avoid measuring only volume — a 3x increase in content output means nothing if engagement metrics don't follow.
Engineering produces the most quantifiable outputs in any knowledge-work organization — commits, PRs, deployments, defect rates — and already has extensive instrumentation. AI coding tools like GitHub Copilot also have the highest enterprise penetration of any AI category, making before/after measurement feasible without new infrastructure.
Not directly. Function-specific metrics use different units, so you can't add "faster contract review" to "more content output" in a meaningful way. The workaround is a three-layer approach: function-specific metrics, normalized adoption scores, and economic translation into dollar terms. The dollar layer enables cross-function comparison, but it always involves assumptions.
Most organizations need 3-6 months to establish robust engineering measurement, another 3-6 months to add a second function, and 6-12 months to build the economic translation layer. Full cross-functional measurement across four or more departments typically takes 12-18 months from a standing start.
Trying to calculate enterprise-wide ROI before establishing baseline metrics in any single function. Companies that jump straight to "what's our total AI return?" end up with meaningless numbers built on unvalidated assumptions. Start with one function, prove the methodology, then expand.
Stop guessing where to deploy AI next.
Larridin's AI Opportunity Discovery finds high-impact automation opportunities hiding in your workflows — in minutes, not months.
Discover AI Opportunities →