Judgment > Code

January 5, 2026

Code is getting cheap. Judgment is not.

Last month I watched a junior engineer spin up a full OAuth flow with refresh tokens, session management, and role-based access in about an hour. Prompts, iterations, done. That used to be a week of reading docs and debugging redirect URIs.

My first reaction was something like awe. My second was: okay, so what actually matters now?

The shift I'm feeling

As the marginal cost of implementation collapses, the bottleneck shifts to specification, verification, and operation—the parts of engineering that decide whether software survives contact with reality.

This pattern isn't new. Every few decades, something that used to be hard becomes abundant. Compute. Storage. Bandwidth. Each time, the skills that mattered shifted—not because the old skills became useless, but because they stopped being the constraint.

I've lived through a couple of these shifts. Not as many as some people, but enough to recognize the feeling. That mix of "this changes everything" and "wait, the hard parts are still hard."

We're watching the same thing happen to code.

The conflation

For most of the industry's history, writing code was the bottleneck. Because it was expensive, we built our culture around it. We valorized implementation. We treated "shipping" as the achievement and "maintenance" as someone else's problem. (I definitely had this attitude early in my career. Ship it, move on, let future-me deal with the consequences. Future-me was not happy.)

This led to a conflation: we started treating "software engineering" as synonymous with "writing code."

But software engineering is the discipline of making software hold up in the real world. Writing code was always a means to an end. We just treated it as the end because it was the expensive part.

The economics now

LLMs are compressing the marginal cost of producing code. I've felt it in my own work: tasks that used to be about implementation are now about specification and review. The time moves from writing to reviewing.

When code is cheap to produce, three things happen:

Surface area explodes. More code gets written, faster. More code means more interactions, more edge cases, more ways to fail. The hard bugs live in the seams. I've debugged enough of these to know—the bug is almost never in the code you just wrote. It's in how that code interacts with everything else.

Verification becomes the bottleneck. If you can generate code in minutes but it takes hours to test, review, and validate—the constraint shifts. The question isn't "can we build it?" but "how do we know it's right?"

Ownership matters more than authorship. When anyone can generate code, the credit isn't in writing it. The accountability is in operating it. The system doesn't care who wrote the code. It cares whether someone is keeping it working at 3am when it breaks.

The pipeline

I think of software engineering as a pipeline:

Intent → Specification → Implementation → Verification → Operation

Until recently, implementation was the bottleneck. Now it's getting compressed. The constraint is moving upstream (specification) and downstream (verification and operation):

Specification: define intent precisely. Most bugs aren't in the code; they're in the gap between what you asked for and what you meant. I'd estimate 70% of the bugs I've fixed in my career were specification bugs disguised as code bugs.

Verification: prove behavior under constraints. Tests, types, contracts, observability.

Operation: own outcomes over time. Monitoring, incident response, migrations, deprecations. The stuff that's boring until it's not.
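To make "verification" concrete: a sketch of a spec written as executable invariants rather than a list of happy-path examples. The function and its names are hypothetical, invented for illustration:

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Apply a percentage discount, rounding down to whole cents."""
    if not (0 <= percent <= 100):
        raise ValueError("percent must be between 0 and 100")
    return price_cents * (100 - percent) // 100

def check_invariants(price_cents: int, percent: int) -> None:
    """Verification as executable specification: properties, not examples.

    These assertions pin down intent ("a discount never raises the
    price") instead of a single input/output pair, so they catch the
    specification bugs that example-based tests slide past.
    """
    result = apply_discount(price_cents, percent)
    assert 0 <= result <= price_cents                     # never raises the price
    assert apply_discount(price_cents, 0) == price_cents  # 0% is the identity
    assert apply_discount(price_cents, 100) == 0          # 100% means free
```

The point isn't this particular function; it's that the invariants encode what you meant, which is exactly where most of the bugs live.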

Models compress implementation. They don't compress judgment.

What judgment means

By judgment, I mean the ability to make good decisions under constraints:

Coherence. Does the system make sense as a whole? Can someone new understand it? (Can you understand it in six months?)

Correctness. Does it do what it's supposed to—not just in the happy path, but in the edge cases and failure modes?

Safe change control. Can you modify this system without breaking it? Do you have tests, rollback, observability?

Constraints. Reliability. Security. Performance. Maintainability. Cost. The things that don't show up in demos but determine whether the system survives production.

A concrete example from last year: I reviewed a generated retry loop that looked correct. Clean code, sensible structure. But under load, it would have overwhelmed a downstream service—no backoff, no circuit breaker, just hammering away. The code was syntactically fine. The judgment about failure modes was missing. That's a production incident waiting to happen.
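For contrast, here's a minimal sketch of what the review should have demanded: exponential backoff with jitter and a retry cap. This is not the code I reviewed; names and parameters are illustrative:

```python
import random
import time

def call_with_backoff(fn, max_retries=5, base_delay=0.1, max_delay=10.0):
    """Retry fn on connection failure, backing off exponentially.

    Without the backoff, every failure triggers an immediate retry,
    so a struggling downstream service gets hammered harder the worse
    it's doing. The random jitter spreads retries out so a fleet of
    callers doesn't retry in lockstep.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the failure, don't loop forever
            # cap the exponential, then sleep a random fraction of it
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0, delay))
```

A real implementation would also want a circuit breaker so repeated failures stop sending traffic entirely, but even this much is the difference between a blip and an outage.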

When code is abundant, engineering becomes the art of constraints: defining what should happen, proving it did, and owning what happens when it doesn't.

Demo loops vs. production loops

There's a narrative that "vibe coding" is the future—that you don't need to understand what the code is doing, you just prompt your way to something that works.

This is true for demos. It's false for production.

A demo runs once, on your machine, with your inputs, in front of an audience that wants it to succeed. A system in production keeps running—through changing requirements, unexpected load, malicious inputs, dependency updates, team turnover.

The feedback loops are different. In a demo, you know immediately if it works. In production, failures are delayed and distributed. The code that "worked" in November causes an incident in February, and no one remembers why it was written that way. (Ask me how I know.)

Demos reward speed. Production rewards judgment.

This has happened before

When compilers automated machine code, assembly programmers worried about obsolescence. What happened? The demand for software exploded. More people could build things, so more things got built, and the people who understood systems were more valuable than ever.

When cloud computing made infrastructure abundant, you didn't need someone to rack servers—but you desperately needed someone who understood distributed systems, failure modes, and operational complexity at scale. I remember thinking "well, I guess I don't need to understand networking anymore" when I first moved to the cloud. I was wrong. I needed to understand it differently.

Each time, the total demand didn't decrease—it increased. But the nature of the skill changed. The commodity layer got automated. The judgment layer got more valuable.

This is just the beginning

Moore's Law trained us to expect exponential improvement—to watch today's breakthrough become tomorrow's commodity. AI capability doesn't follow the same curve (the drivers are messier: model scale, data, inference costs, tooling), but the pattern rhymes: what feels like magic this year becomes a baseline next year.

We're already seeing this in agentic coding tools and scaffolders. The boundary of what can be automated will keep moving. Anyone who tells you they know exactly where it stops is guessing.

The question isn't whether the shift is coming—it's what remains valuable on the other side. My bet: context, accountability, and judgment under uncertainty. The things that require owning outcomes, not just producing artifacts.

What to do Monday

If you're early in your career, don't stop at code. Learn to write tests, read logs, and do incident reviews. Build things that have to work, not just things that have to demo.

If you're experienced: the skills you've built—systems thinking, failure modes, operational discipline—are exactly what's getting scarce. Build the review, eval, and rollout machinery that makes judgment scale. This is what I'm trying to do with Noēsis.

If you're building tools: ship verification surfaces, not just generators. The world will be drowning in generated code. What it needs is help knowing whether that code is any good.

The work now

We're in a transition. The old model hasn't fully ended; the new one hasn't fully arrived. It's uncomfortable. It's also kind of exciting if you squint at it right.

But the direction is clear. The bottleneck is moving from production to judgment. The value is accruing to people who can keep systems coherent under constraints.

Code will be plentiful. Judgment will be scarce.

The job is the same as it's always been: make software hold up in the real world. That's the work. That's always been the work.