AI Writes Half Our Code. We're Working Harder Than Ever.

March 26, 2026

I use AI coding tools every day. As CTO of a YC-backed startup, I’ve watched our team adopt Cursor, Copilot, and Claude over the past two years. I’ve seen a junior engineer scaffold an entire microservice in an afternoon. I’ve also seen that same service cause a production incident three weeks later because nobody fully understood what it was doing.

The industry narrative is straightforward: AI writes code faster, developers are more productive, companies need fewer engineers. The data tells a stranger story. AI is writing more of our code than ever. Developers think they’re faster. They’re measurably not. Companies are laying people off based on what they believe AI will do, not what it actually does. And the engineers who remain are working longer hours, not shorter ones.

I wanted to understand why.

The numbers are real

GitHub Copilot now generates 46% of code in files where it’s enabled, rising to 61% in Java projects. Over 20 million developers use it. Ninety percent of Fortune 100 companies have deployed it. AI-authored production code across all developers hit 26.9% in February 2026, up from 22% the quarter before.

Autonomous agents are gaining traction too. Cognition’s Devin published its 2025 performance review after 18 months in production. Its PR merge rate doubled from 34% to 67%. On bounded tasks like security patches, it runs 20 times faster than a human. On test generation, 10 to 14 times faster. Thousands of companies, including Goldman Sachs and Nubank, now use it. But it struggles with ambiguous requirements (25% success rate) and new architecture (15%). The distinction matters.

GitHub’s platform metrics tell the output story: pull requests merged up 23% year over year in 2025, commits up 25%. New iOS apps increased 50%. New websites increased 40%. We are producing more software than ever before.

These are real numbers. AI is doing real work. But raw output is a misleading metric for productivity, and what’s happening beneath the surface complicates the story.

The study nobody predicted

In July 2025, METR, a nonprofit AI research lab, published a randomized controlled trial that landed like a brick in the industry discourse. Sixteen experienced open-source developers completed 246 tasks on codebases they knew well — some having contributed for five years or more. Tasks were randomly assigned as AI-allowed or AI-prohibited. The developers used Cursor Pro with Claude 3.5 and 3.7 Sonnet. They were paid $150 an hour.

The result: developers were 19% slower with AI.

Before the study, those same developers predicted AI would make them 24% faster. After experiencing the slowdown, they still believed they had been 20% faster. That’s a 39-percentage-point gap between what happened and what they thought happened. And 69% said they’d keep using AI anyway.
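The gap arithmetic is worth spelling out, because it compares the post-hoc belief to the measured result, not to the pre-study forecast:

```python
predicted = +24   # % speedup developers forecast before the study
actual    = -19   # % change actually measured (a slowdown)
perceived = +20   # % speedup developers still believed afterward

# The 39-point gap is perception versus measurement:
gap = perceived - actual
print(gap)  # 39
```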

There’s a reason this happens. An experienced developer working on a familiar codebase carries context that no autocomplete can match. They know which abstractions leak, which tests are flaky, which module was written by someone who left two years ago. AI doesn’t have this context. So the developer ends up in a loop: prompt, wait, read the suggestion, decide it misses something, fix it, re-prompt. That cycle feels productive because things are happening on screen. The wall clock tells a different story.
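A back-of-envelope sketch of that loop shows how it can lose on the wall clock even when every step feels fast. Every minute count below is a hypothetical assumption for illustration, not a number from METR:

```python
# Hypothetical timings for one task. All values here are assumptions
# chosen to illustrate the mechanism, not measured data.

def direct_write_minutes(task: float = 30.0) -> float:
    """A developer who already knows the codebase just writes the code."""
    return task

def ai_loop_minutes(cycles: int = 3,
                    prompt_and_wait: float = 2.0,
                    review_suggestion: float = 6.0,
                    final_fixup: float = 12.0) -> float:
    """Prompt, wait, read the suggestion, decide it misses something,
    re-prompt -- then fix up whatever finally comes out."""
    return cycles * (prompt_and_wait + review_suggestion) + final_fixup

print(direct_write_minutes())  # 30.0
print(ai_loop_minutes())       # 36.0 -- slower, despite constant activity
```

With these made-up inputs the loop is 20% slower, in the same ballpark as METR's 19% finding; the point is only that many short, busy-feeling cycles can sum past the time it takes to write the code directly.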

A difference-in-differences study on Cursor adoption found the same pattern over a longer timeframe. Initial velocity went up. Then code complexity crept up. Static analysis warnings accumulated. Long-term velocity slowed down. The first sprint looked great. The sixth sprint paid for it.

This doesn’t mean AI coding tools are useless. GitHub measured completion speed on isolated tasks. METR measured experienced developers on real codebases. Both findings can coexist. AI helps when you’re working outside your expertise, on unfamiliar code, or on mechanical tasks. It slows you down when you already know what you’re doing.

The code quality problem

A December 2025 CodeRabbit study analyzed 470 open-source pull requests — 320 AI-co-authored, 150 human-only. AI-generated PRs had 1.7 times more defects: 75% more logic errors, up to twice the security vulnerabilities, and nearly eight times more excessive I/O issues. The PR acceptance rate: 32.7% for AI code versus 84.4% for human code.

That gap is the cost of what I call the 70% problem. AI gets you most of the way fast. The remaining 30% — error handling, edge cases, production hardening — is where the actual engineering lives. That part takes just as long as it always did.

Google’s DORA 2025 report confirmed this at the organizational level: 90% of developers use AI, over 80% believe it increases their productivity, but AI adoption correlates negatively with software delivery stability. Teams ship faster and break more. The report’s central finding: “AI doesn’t fix teams. It amplifies existing strengths and weaknesses.”

The layoffs are based on a forecast, not a finding

Half a million tech workers have been laid off since 2022. In 2025, roughly 127,000 lost their jobs in the U.S. About 70,000 of those cuts were directly linked to AI adoption.

But here’s what the aggregate numbers hide: 60% of executives reduced headcount in anticipation of AI’s future impact. Only 2% made large layoffs because AI had actually replaced the work. Companies aren’t cutting because AI proved it could do the job. They’re cutting because they expect it will.

The perception gap from the METR study — where developers believe they are 39 percentage points more productive than they actually are — may propagate upward. If engineers think AI makes them significantly faster, and managers observe faster demos, the signal reaching executives is distorted before it arrives.

Where did the displaced engineers go? LinkedIn tracked over 500 laid-off workers: 42% landed at other tech companies (fintech, AI, healthtech), 28% moved to non-tech industries, 15% joined startups. One number stood out — 63% of laid-off tech workers in 2026 reported starting their own companies.

Rehiring speed varies wildly by specialization. AI/ML engineers find new roles in 1.4 months. Frontend engineers take 4.2 months. Product managers take 4.8 months. The market is telling you which skills it considers substitutable and which ones it considers complementary to AI.

Junior engineering hiring is down 30%. Overall developer employment is projected to grow 15-18% through 2034. The market isn’t shrinking. It’s hollowing out the bottom rungs of the ladder.

Why everyone is busier

This is the part that keeps nagging at me. If AI makes us more productive, we should have more time. Every piece of evidence says the opposite.

A UC Berkeley research team embedded with a tech company for nine months, tracking 40 workers for Harvard Business Review. What they found: workers “worked at a faster pace, took on a broader scope of tasks, and extended work into more hours of the day, often without being asked to do so.” They filled breaks, evenings, early mornings with additional work. Nobody told them to. The tools made it possible, so they did it.

The Harness report from March 2026 quantified the damage. Among developers who use AI coding tools very frequently, 96% work evenings or weekends multiple times a month, compared to 66% of occasional users. Very frequent users also report longer incident recovery times: 7.6 hours versus 6.3 hours. The people leaning hardest into AI are burning out fastest. TechCrunch ran a piece in February with the headline: “The first signs of burnout are coming from the people who embrace AI the most.”

The mechanics are predictable. A feature that used to take two weeks now takes four days. The sprint doesn’t get lighter. Three more features fill the gap. The new pace becomes the baseline, and nobody adjusts expectations down. Management sees the demos coming faster. Individual contributors are the ones reviewing AI output, catching the bugs it introduced, explaining to QA why a function silently fails on null input.

I wrote about this same dynamic in AI SRE: managers see cleaner reports and faster summaries; ICs see more code to review, more production issues, and more surface area to maintain. Both are telling the truth. They’re looking at different parts of the system.

In METR’s February 2026 update, 30-50% of developers declined to participate without AI access. They’ve become dependent on a tool that measurably makes them slower. That should worry us.

Jevons was right

In 1865, economist William Stanley Jevons observed that when James Watt made the steam engine more fuel-efficient, England’s coal consumption didn’t decrease. It increased. Cheaper energy unlocked uses that hadn’t been economical before. Factories that couldn’t justify steam power suddenly could. Total demand grew because the unit cost dropped.

Alex Palcuie, who runs AI reliability engineering at Anthropic, called this “the favorite paradox in the AI industry” at QCon London. He was talking about operations, but the same logic applies to development. AI makes code cheaper to produce. Organizations produce more of it. More code means more integration points, more failure modes, more things to monitor, test, and maintain. “All the improvements in the tooling will be cancelled by this ever-growing complexity.”
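Jevons' logic can be sketched with a constant-elasticity demand curve. The numbers below are illustrative assumptions, not measurements; the only claim is that when demand for software is elastic (eps > 1), halving the unit cost of code more than doubles the amount produced, so total engineering effort rises:

```python
def software_demanded(unit_cost: float, k: float = 100.0, eps: float = 1.5) -> float:
    """Constant-elasticity demand: Q = k * cost**(-eps).
    eps > 1 means demand is elastic -- the Jevons condition.
    k and eps are arbitrary illustrative values."""
    return k * unit_cost ** (-eps)

before = software_demanded(1.0)     # 100.0 units at the old cost
after = software_demanded(0.5)      # ~282.8 units once AI halves the cost

total_effort_before = 1.0 * before  # 100.0
total_effort_after = 0.5 * after    # ~141.4: cheaper code, more total work
```

This is the same shape as Watt and coal: efficiency lowered the unit cost, elastic demand did the rest, and total consumption went up.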

The evidence is accumulating. GitHub commits are up 25% year over year. New applications are being built at rates that would have been impossible three years ago. But mean time to recovery has gotten worse every year since 2021. Organizations taking more than an hour to recover rose from 47% to 82% over the same period.

This is Jevons playing out in real time. AI lowers the cost of writing code, and the world responds by wanting more software. Not a little more. Dramatically more. The Bureau of Labor Statistics projects 15-18% growth in developer employment through 2034, even as AI handles an increasing share of the typing.

There’s an optimistic version of this: AI enables a long tail of software that was never viable before. Small businesses get custom tools. Niche problems get purpose-built solutions. More people can build, not just professional engineers.

There’s a less comfortable version too. If AI handles the tasks that junior engineers used to cut their teeth on — boilerplate, bug fixes, test writing — the pipeline that produces senior engineers breaks. Today’s staff engineer got there by spending years writing bad code, debugging at 2 AM, learning what error handling means by watching it fail. If that apprenticeship disappears, we get experienced engineers who can’t be replaced and a generation that never got the reps. Lisanne Bainbridge predicted exactly this in 1983: automate the training tasks, and you undermine the human capability the system needs when automation fails.

The DORA report’s finding bears repeating: AI doesn’t fix teams, it amplifies what’s already there. AI is a multiplier, and multipliers are indifferent to what they multiply.

What I actually think

I’ve spent two years watching this from both sides — as someone who writes code with AI every day and as someone who manages a team that does the same.

AI is doing real work. That’s not in dispute. But the industry is measuring the wrong thing. Lines generated, tasks completed, sprint velocity — those capture output. They don’t capture understanding. And understanding is what makes software work in production over months and years.

The METR finding, that experienced developers are slower with AI but believe they’re faster, should worry us more than it does. We’re building organizational strategies on top of a perception gap. Companies are laying people off based on productivity gains that haven’t materialized in controlled measurement. They’re burning out their strongest AI adopters. They’re accumulating code faster than they’re accumulating the judgment to maintain it.

Jevons tells us this won’t resolve by producing less code. It never does. When something gets cheaper, you get more of it. The resolution has to come from recognizing that code was never the bottleneck. Judgment was. It still is.

The companies that get this right will use AI the way good pilots use autopilot: as leverage that frees human attention for the decisions machines can’t make. The ones that get it wrong will learn what the METR study already showed — that speed you can’t perceive accurately is speed you can’t manage.