The AI Productivity Paradox: Why Individual Output Increases But Organization Impact Stalls


February 16, 2026

“In a system, improving one part does not guarantee better outcomes overall. In fact, local improvements can be blocked, diluted, or even reversed if the rest of the system isn’t able to adapt.”
State of AI-assisted Software Development, DORA Research 2025

Introduction

Generative AI promises to revolutionize software development, with studies from Microsoft, GitHub, and McKinsey reporting individual developer productivity gains of up to 2x. Yet for the vast majority of companies weaving this new capability into an existing business, individual and organizational impact do not mirror each other. Despite widespread adoption of AI coding assistants, most established companies, startups, and scale-ups see no measurable improvement in overall software delivery performance, according to DORA research. The TL;DR? Teams are automating chaos: bolting AI tools onto broken processes and expecting magic.

As we work with companies looking to make better use of generative AI, we are reminded that the very human process of collectively creating and building products requires thoughtfulness and empathy. In this post we share our observations and build on our research and experience to propose a Crawl-Walk-Run AI Adoption Framework as a means to side-step the “AI Productivity Paradox”, a term coined by Faros AI.

Part 1: The Problem - Automating Chaos

The core challenge facing organizations is that AI acts as a powerful amplifier, magnifying both existing strengths and existing weaknesses across the entire development lifecycle.

First, it exposes flaws at the very beginning of the process. Poorly defined requirements or unclear context are simply a bad “prompt” for your engineering team, resulting in code that is fundamentally misaligned with the actual need. Would you really like to build the wrong thing faster with AI?

Second, even when requirements are clear, when underlying development tooling and processes are flawed, introducing AI tools can lead to larger, more frequent, and lower-quality pull requests. This overwhelms downstream processes like code review and testing which creates new bottlenecks, increases cognitive load on senior developers, and ultimately results in flat or even declining delivery metrics.

These two problems feed each other in a vicious cycle. Bad inputs create more code, which then clogs the already broken downstream pipes. The result is the paradox we see in the data: a frenzy of individual activity that fails to translate into meaningful organizational progress.

"AI’s primary role is as an AMPLIFIER - magnifying existing strengths AND weaknesses. The greatest returns come from a strategic focus on the organizational system, not the tools themselves." - DORA Report 2025

The AI Productivity Paradox in Numbers

The evidence for this disconnect is compelling. In July 2025, Faros AI published telemetry analysis from over 10,000 developers and found that while individual productivity metrics soared with AI adoption, organizational bottlenecks worsened significantly.

| Individual Productivity | Change | Organizational Bottlenecks | Change |
| --- | --- | --- | --- |
| Tasks Completed | +21% | Code Review Time | +91% |
| Pull Requests Merged | +98% | Pull Request Size | +154% |
| PRs Handled Daily | +47% | Software Delivery Performance | FLAT |

Table 1: The AI Productivity Paradox. Data from the Faros AI analysis for the DORA 2025 Report highlights how individual gains create systemic bottlenecks. [4]

This data is corroborated by a 2025 Bain Technology Report, which found that while teams using AI assistants see 10-15% productivity boosts, the time saved is often not redirected to higher-value work, limiting ROI. The report concludes that real value emerges only when GenAI is applied across the entire software development lifecycle, which requires fundamental process changes.

Part 2: The Crawl-Walk-Run AI Adoption Framework

To address the challenge, we are experimenting with a Crawl-Walk-Run approach that balances quick wins with long-term structural improvements. We see two potential paths: 1) assume the team is representative of the industry and tackle the highest-leverage processes (e.g., PR reviews, testing), or 2) map the software development processes actually in place, from requirements gathering through deployment and maintenance. We will focus on 2) for the remainder of this post.

By analyzing their SDLC, teams can pinpoint the highest-leverage change while documenting the process as a whole. A great place to start is DORA’s Value Stream Mapping, with the ultimate goal of improving four key metrics: a) Lead Time (commit to production), b) Deployment Frequency (time between releases), c) Change Failure Rate, and d) Recovery Time.
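To make the four metrics concrete, here is a minimal sketch of how they can be computed from deployment records. The record shape (`commit`, `deployed`, `failed`) and the sample data are assumptions for illustration; real teams would pull these events from their CI/CD and incident tooling.

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical deployment records: when the change was committed,
# when it reached production, and whether the deployment failed.
deploys = [
    {"commit": datetime(2026, 2, 1, 9), "deployed": datetime(2026, 2, 2, 9), "failed": False},
    {"commit": datetime(2026, 2, 3, 9), "deployed": datetime(2026, 2, 5, 9), "failed": True},
    {"commit": datetime(2026, 2, 6, 9), "deployed": datetime(2026, 2, 7, 9), "failed": False},
]

# Lead Time: how long a change takes to go from commit to production.
lead_times = [d["deployed"] - d["commit"] for d in deploys]
median_lead_time = median(lead_times)

# Deployment Frequency: deployments per week over the observed window.
window = max(d["deployed"] for d in deploys) - min(d["deployed"] for d in deploys)
deploys_per_week = len(deploys) / (window / timedelta(weeks=1))

# Change Failure Rate: share of deployments that caused a failure.
change_failure_rate = sum(d["failed"] for d in deploys) / len(deploys)
```

Recovery Time would be computed the same way from incident open/close timestamps; it is omitted here only to keep the sketch short.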

An example value stream map showing the flow of a backlog item to production.

Crawl Phase: Stabilize Human Processes

Once the SDLC has been mapped, the Crawl phase focuses exclusively on leveraging AI to improve the human-driven elements of the software development lifecycle - there are NO coding agents at this stage; it’s all about process. The goal is to create a predictable, collaborative, and high-trust environment. In our opinion, rushing this phase is the primary cause of the AI Productivity Paradox.

Examples:

  • GenAI-Assisted Requirements Gathering:
    • Systematic Questioning: Deploy an AI agent to guide product managers through a systematic framework of questions to ensure all aspects of a requirement are considered and to generate standard documentation for steps down the line.
  • GenAI-Powered Sprint Management:
    • Context Assembly: Use AI to automatically assemble technical context (from architecture docs, PRDs, past sprints) for the upcoming sprint, reducing prep time.
    • Ceremony Automation: Use AI to generate summaries of retrospectives, identify action items, and help schedule and manage sprint ceremonies.
  • Standardized User Story Writing:
    • Standardized Stories: Use genAI to standardize the creation of user stories to ensure they are clear, concise, and testable.
    • Acceptance Criteria Generation: Use genAI tools to automatically generate clear, concise, and testable acceptance criteria for each user story, ensuring accountability and reducing ambiguity.
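As a flavor of the Crawl phase, acceptance-criteria generation can be as simple as a well-structured prompt. The sketch below builds such a prompt from a user story; the story text, constraints, and Given/When/Then format are illustrative assumptions, and the resulting string can be sent to whichever chat-completion API the team already uses.

```python
def acceptance_criteria_prompt(story: str, constraints: list[str]) -> str:
    """Build a prompt asking an LLM for testable Given/When/Then acceptance criteria."""
    bullet_list = "\n".join(f"- {c}" for c in constraints)
    return (
        "You are helping a product manager refine a user story.\n"
        f"User story:\n{story}\n\n"
        f"Known constraints:\n{bullet_list}\n\n"
        "Write 3-5 acceptance criteria in Given/When/Then form. "
        "Each criterion must be independently testable and unambiguous."
    )

# Hypothetical story used for illustration.
prompt = acceptance_criteria_prompt(
    "As a shopper, I want to save items to a wishlist so I can buy them later.",
    ["Wishlists are per-account", "Maximum 100 items per wishlist"],
)
```

Keeping the prompt in code (rather than ad hoc chat) is what makes the output standardized and reviewable, which is the point of the Crawl phase.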

Walk Phase: Automate Development Processes

Once human processes are stable and some quick wins have been achieved, the Walk phase introduces automation to improve the speed, reliability, and efficiency of the development pipeline. The goal is to create fast, dependable feedback loops that free up developers from manual toil.

Examples:

  • GenAI-Powered Test Automation:
    • Test Generation: Use AI to automatically generate unit tests and boilerplate mock data for new code, ensuring coverage from the start.
    • Edge Case Identification: Prompt AI to suggest non-obvious edge cases and security vulnerabilities based on the code's logic.
  • GenAI-Enhanced CI/CD:
    • Pipeline as Code: Use AI to generate and debug pipeline configurations (e.g., GitHub Actions workflows), reducing setup time.
    • Failure Analysis: Feed pipeline error logs to an AI model to get instant root cause analysis and suggested fixes.
  • Intelligent Code Review:
    • Automated PR Summaries: Use AI to generate human-readable summaries of complex pull requests, accelerating the review process.
    • Semantic Review: Go beyond linting. Use AI to review for logical flaws, potential race conditions, and adherence to architectural patterns, allowing human reviewers to focus on strategic fit.
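The failure-analysis idea above can be sketched in a few lines: take the tail of a failed pipeline log and wrap it in a root-cause-analysis prompt. Truncating to the last `max_lines` lines is a simplifying assumption to stay within a model's context window; real pipelines may need smarter filtering, and the function name is hypothetical.

```python
def failure_analysis_prompt(log_text: str, max_lines: int = 40) -> str:
    """Assemble a root-cause-analysis prompt from the tail of a CI log."""
    # Errors usually surface near the end of the log, so keep only the tail.
    tail = "\n".join(log_text.splitlines()[-max_lines:])
    return (
        "The following is the tail of a failed CI pipeline log.\n"
        "Identify the most likely root cause and suggest a concrete fix.\n\n"
        f"```\n{tail}\n```"
    )
```

Wired into a pipeline's failure hook, this turns every red build into an immediate, reviewable diagnosis instead of a manual log-spelunking session.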

Run Phase: Introduce AI-Assisted Coding

Only after establishing stable human processes and robust automation can an organization safely and effectively introduce AI coding tools. The Run phase focuses on leveraging AI as a productivity multiplier on a solid foundation.

Examples:

  • Autonomous Bug Fixes:
    • Agentic Triage & Root Cause Analysis: An AI agent receives a bug report, autonomously reproduces the issue, traces the root cause through the codebase, and identifies the exact files and lines of code to be fixed.
    • Automated Remediation: The agent writes the fix, generates the corresponding unit tests, runs the test suite, and, if all tests pass, opens a pull request for human approval.
  • Agentic Feature Development:
    • From Ticket to PR: A developer assigns a well-defined user story to an AI agent. The agent autonomously breaks down the task, writes the code, creates the tests, and opens a pull request for review.
    • Human-in-the-Loop: The developer’s role shifts from writing code to reviewing the agent’s proposed solution, providing feedback, and approving the final merge.
  • Autonomous Tech Debt Refactoring:
    • Automated Codebase Analysis: An AI agent continuously scans the codebase to identify, categorize, and prioritize high-impact technical debt.
    • Autonomous Refactoring: The agent autonomously executes safe refactoring strategies for the most critical areas of tech debt, opening small, targeted pull requests for human review.
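The control flow behind the autonomous bug-fix example can be sketched as a small loop. Every function passed in here is a hypothetical stub standing in for a real tool integration (issue tracker, test runner, version control); only the structure — reproduce, attempt a fix, gate on tests, escalate if stuck — is the point.

```python
def agent_fix_bug(report, reproduce, propose_fix, run_tests, open_pr, max_attempts=3):
    """Try up to max_attempts fixes; open a PR for human review only if tests pass."""
    if not reproduce(report):
        return "cannot-reproduce"  # never patch a bug the agent can't observe
    for _ in range(max_attempts):
        patch = propose_fix(report)
        if run_tests(patch):
            return open_pr(patch)  # human-in-the-loop approval happens on the PR
    return "escalate-to-human"  # bounded attempts keep the agent from thrashing

# Illustrative stubs: the second proposed patch is the one that passes tests.
attempts = []
def flaky_fix(report):
    attempts.append(report)
    return f"patch-{len(attempts)}"

result = agent_fix_bug(
    "bug-42",
    reproduce=lambda r: True,
    propose_fix=flaky_fix,
    run_tests=lambda p: p == "patch-2",
    open_pr=lambda p: f"PR({p})",
)
```

Note that the tests gating `open_pr` are exactly the ones built in the Walk phase — which is why skipping straight to this loop on a weak foundation fails.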

Part 3: The Missing Piece - A Minimum Viable Platform

This framework focuses on the people and process side of AI adoption, but we recognize that's only part of the story. As the DORA research rightly points out, a modern development platform is a critical prerequisite for realizing the full value of AI, especially for the agentic workflows in the Run phase. However, this doesn't mean you need a massive, upfront platform investment. We believe in a progressive platform investment model, with a Minimum Viable Platform (MVP) that evolves with each phase:

  • Crawl: Starts with simple API access and cloud-based LLMs.
  • Walk: Evolves to include a modern CI/CD pipeline and container infrastructure.
  • Run: Matures into a full agentic platform with orchestration, real-time data, and robust governance.

Defining the specifics of each MVP is a complex topic that deserves its own deep dive. In a forthcoming article, we'll explore this question and share our findings on building the right platform at the right time to support a truly agentic organization.

Conclusion

The promise of generative AI in software development is real, but it is not a silver bullet. The AI Productivity Paradox demonstrates that technology alone, when applied to a flawed system, only serves to create new and more complex problems. The path to unlocking AI's potential does not begin with buying a tool; it begins with the disciplined, often unglamorous work of fixing foundational human processes.

The Crawl-Walk-Run framework provides a pragmatic roadmap for leaders. By first stabilizing human processes, then building robust automation, and only then introducing AI coding tools, organizations can ensure they are amplifying strength, not chaos. By starting from a value stream map of the existing SDLC and pairing each phase with a progressively maturing Minimum Viable Platform, the framework becomes more than a simple sequence: it is a system for building lasting organizational capability. This structured, patient, process-first approach is the most effective strategy for crossing the chasm from impressive AI demos to transformative business impact.

References

  1. Microsoft Research, "The Impact of AI on Developer Productivity: Evidence from GitHub Copilot." https://www.microsoft.com/en-us/research/publication/the-impact-of-ai-on-developer-productivity-evidence-from-github-copilot/
  2. GitHub, "Research: measuring the impact of AI on developer productivity and happiness." https://github.blog/2022-09-07-research-measuring-the-impact-of-ai-on-developer-productivity-and-happiness/
  3. McKinsey & Company, "Yes, you can measure software developer productivity." https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/yes-you-can-measure-software-developer-productivity
  4. DORA, "State of AI-assisted Software Development, DORA Report 2025." https://dora.dev/research/2025/dora-report/
  5. Faros AI, "Key Takeaways from the DORA Report 2025 & The AI Productivity Paradox." https://www.faros.ai/blog/key-takeaways-from-the-dora-report-2025
  6. Bain & Company, "For Generative AI, Pilot Projects Are the Easy Part." https://www.bain.com/insights/for-generative-ai-pilot-projects-are-the-easy-part-bain-technology-report-2025/
  7. DORA, "Value stream management." https://dora.dev/guides/value-stream-management/

Ready to Build Something Amazing?

Let's discuss how we can help you avoid the common pitfalls and build products that people love and trust.