Avoiding AI-Driven Complexity Inflation

Updated: April 26th, 2026

At this point, in late April 2026 and with 4.7 and 5.5 in the wild, agentic tooling - whether it's Claude Code, Codex, Cursor, Kiro, whatever - should be adding some level of capacity to your software development team. If it's not, start here, and if your calibration was from before about November 2025 and based on what LLM tools could do prior to Claude 4.5, a recalibration on capabilities and how they've exploded in the last ~6 months is worth your time.

Great code is unlikely to emerge in places it wasn't previously, but a good or great engineer using the tools well seems able to produce something roughly 1.5x to 3x what they could writing code by hand.

One thing I don't see being discussed as much is what to do with that extra capacity. From experience, I'm seeing three primary output paths:

More features. This is the simplest and easiest to measure path. A team that could produce 10 features in n increments can now produce 15, or 20, or more.
Higher quality features. The added edge case handling, tests and testability, observability, resilience, performance makes those same 10 features higher quality.
More feature complexity. The same features, but instead of treating complexity as the apex predator it is, and only spending it where absolutely needed, each feature becomes more complex.

Humans being humans, my fear is that way too much of the excess capacity gained from LLMs is going to end up in that third bucket. Time and energy is still a thing humans naturally conserve, and much like letter writing, it takes effort and rigor to make software as simple as it can be and no more complex than it has to be.

What's the right balance if you're able to be intentional about it? If I had a prescription, it would depend - as always - on the company or product stage. Something like this:

Early stage / pre-sales startup: 80% more features, 20% more quality. If any existing features are low quality and critical to early sales / referrals, prioritize rebuilding them applying first principles thinking.

Mid-stage scale-up in a single vertical: 60% more features. 40% more quality. Target some retrofitting or rebuilding of existing low quality features if they might lead to frustration or churn.

Mature companies / products: 20% more features, 80% more quality. Mostly retrofitting existing low-quality features that are leading to frustration or churn.

Of note: I wouldn't allocate anything deliberately to more feature complexity. The canonical rules of making software don't change whether humans or agents are writing code: DevX is important, testability is important, and as grug says: complexity very, very bad.

posted in Artificial Intelligence