jonoropeza.com

Making software and software development teams. Mostly the people parts.


On Using AI To Make Software (in early 2026)

1

"Watch out for other people's changemaking campaigns" is the type of advice I've come to like giving, as I push well into my third decade in tech.

So when I hear the roar and fuss of AI hype - the tone and assertions of all it can do, the suggestions of what tomorrow might look like, the manic motions to hitch your own content to the train (like this post of mine), the tangible impact the hype has had on the market, and the resistance forming against it, all of which seems to be building toward the biggest IPOs in history, public offerings with the unstated suggestion of "we shall capture every inch of margin in every meaningful knowledge work vertical" - my antennae are up: this is likely the awareness phase of an attempt to effect massive change.

Where are we on a hype cycle model? It's hard to pick an exact inflection point, but sometime between the releases of Opus 4.5 + GPT 5.2 and now 4.6 and 5.4, it became imperative for tech companies (the little slice of the world I'm most exposed to) to signal (a) you're an AI company, not an xyz company, and (b) your engineers (or you) are using xyz tooling to ship features and fix bugs at a tremendous rate. In under four years, we've gone from "IntelliSense is so good now it's almost creepy" to CEOs bragging about their best engineers not writing any code. Extrapolating even linear growth from here makes 2030 seem like a wildly different place. And none of the suggestions are self-limiting to linearity.

Yes, there are massive incentives for OpenAI and Anthropic in telling a story of exponential competence growth. And there are historical precedents for populations pushing back against rapid technological change, Luddism and general technophobia being no new things. Those two poles will attract the masses. I'm more interested in the grey areas where most of us live: Where go we makers of software who cringe at both extremes and instead want to understand the nuances, utility and limits of the thing?

This middle ground is what I will attempt to find in the rest of this piece.


2

Acknowledging the hype and the distaste it can leave in your mouth, it's apparent to me that there's been a shift over the last six months. Andrej Karpathy said it really well here, emphasis mine:

It is hard to communicate how much programming has changed due to AI in the last 2-3 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn’t work before December [2025] and basically work since - the models have significantly higher quality, long-term coherence and tenacity and they can power through large and long tasks, well past enough that it is extremely disruptive to the default programming workflow.

A lot of energy is going into discourse on code generation, but I'm finding LLMs useful across the entire software lifecycle:

  • Discovery / ideation - AI is much less likely than a human to fall for recency bias, power of personality or its own whims when parsing customer feedback. Gathering evidence for what competitor roadmaps might look like is a full-time role at many big companies; now a full table with your choice of columns is available within a 20-minute Extended Think. These are all divergent activities a human could do - they just take a lot of time, and unlike a human, an agent won't 80/20 the thing. Diligent task completion will be a theme...
  • Requirements & Design - I don't like AI for doing the actual writing of docs like PRDs or tech specs, but LLMs are a phenomenally persistent editing partner. This is another theme: I feel strongly about the words in text pieces being written by the writer, with the LLM as coach.
  • Implementation - Yes, Opus 4.6 and GPT 5.4 code really well, at least in my experience in the TypeScript & Pythonic web services and experiences I work in. And can be very very good when prompted well inside of a well-thought-out loop... I'll say more about that in a moment.
  • Verification (testing & QA) - Building and maintaining run-in-any-env, deterministic integration test suites and building towards natural language e2e tests is my current thinking. I never want to debug or add yet another await to another flaky Cypress script again, and why should any of us, when agentic click-throughs will be better, more thorough testers anyway? Again this idea of diligence...
  • Code Review & Refinement - Combo of hand-rolled skills and tools like Copilot or CodeRabbit. By hand-rolled, I mean collaborated on with an agent. Yet another theme: Everything is meta in this world, everything is turtles all the way down. The best way to learn to use the tools better is to ask the tool...
  • Release and Operations (monitoring, rollback, runbooks, triaging bugs, etc) - Again, a lot of AI's usefulness isn't that it can do what humans can't, it's that it can do what humans won't. Cursor and Claude Code using the DataDog MCP or the pup cli have the conscientiousness and orderliness you always wished your on-call engineers had. They monitor without fail and dig deep on every anomaly.
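The "deterministic over flaky" idea in the Verification bullet can be sketched as a small polling helper: instead of sprinkling arbitrary awaits or sleeps through a test, poll an explicit condition against a hard deadline. This is a minimal sketch in Python; the names are my own, not from any particular test framework.

```python
import time

def wait_for(condition, timeout=5.0, interval=0.05):
    """Poll `condition` until it returns a truthy value or `timeout` elapses.

    Returns the truthy result, or raises TimeoutError. The hard deadline
    makes failures loud and fast instead of hiding behind lucky sleeps.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1fs" % timeout)
        time.sleep(interval)
```

The point is the explicit condition plus the deadline: the test either observes the state it's waiting for or fails with a clear error, rather than passing or failing on the luck of an arbitrary wait.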

Is AI performing magic in any of these aspects? No. As I go through my day to day, maybe waiting for a prompt or four to resolve, I'll ponder what's happening here: I have a small army of extremely diligent assistants, whose feelings don't get hurt if I have to countermand my instructions or abandon a thread unfinished. I'm not sure what this means for human relations - I suspect we will struggle as more and more of our textual and eventually voice interactions play out this way, and we slip into talking to our peer humans this way, too.

Does using AI ever slow me down? Yes. Whether it's inducing analysis paralysis in the ideation phase, following bad paths that a sycophantic agent is reinforcing, falling for red herrings in monitoring, or picking tiny nits in code reviews, there are plenty of ways using AI can actually make a task more tedious and the output less accurate and slower to produce.

Given all of this, is using AI net-net making me a more productive builder in a real world, established, brownfield environment with plenty of legacy code to work in and around?

Yes.

Read on for an example of how I'm using LLMs in the programming itself.


3

So: What does a real-world coding loop that works in a productionized, highly in-use codebase where bugs and mistakes cost real money look like?

Here's a loop I'm having success with. This will change as the wrapper apps (Cursor, Claude Code, Kiro et al) adapt and subsume logic like this into their layers. Guessing this will look completely archaic when I read this in 2027, or maybe even by mid-2026:

  • Develop a few specialist skills. Or just steal Garry Tan's.
  • Make a PLAN.md which is a mostly empty markdown file, maybe one header that says "Next steps for project".
  • Make a HISTORY.md which will be a decision and change and how-we-did-this log.
  • Make an AGENTS.md or a CLAUDE.md, and reference the skills, PLAN.md and HISTORY.md and what they're each for.
  • New chat, "/plan-ceo-review I'm thinking of a new feature to do x, y, z, help me fill out PLAN.md", have a dialogue, end result is a filled out PLAN.md
  • New chat, "/plan-end-review review the next plan", have a dialogue, at the end of the dialogue "build it".
  • Next to last step of the build: Ask it to add to CLAUDE.md and HISTORY.md, and to clear PLAN.md
  • Last step of the build: "/review", go through review and testing

Why is this working for me?

  • I have not lost touch with the code or turned over parts of the codebase to the agents; I still look at every diff and make decisions at the which-module-goes-where level. For that reason, I currently prefer Cursor for most tasks, though Claude Code has its spots too.
  • Because of that involvement, this works much better in parts of our codebase that I know well, versus working in something new.
  • Because I know the roadmap and structure of most of the projects I'm working on (we use about half a dozen different patterns through our TypeScript & Python codebases), I'm able to start each PLAN myself and assess the iterations the agent is making against both reality and desired future shape.
  • Because I know most of the code myself or my team is working on well, I'm able to assess and pre-review each specific diff and prune out undesirable directions or offer quick corrections, and usually assess the quality of the reviews as well.
  • Knowing all this helps me understand when an addition to the CLAUDE file is likely worthwhile or likely a bad path.

All of this comes together at a higher level in ways like the three-layered system Gurwinder suggests here. My iterations are faster, and I'm able to much more quickly assess whether a change is feasible or likely to lead to a mess - staying out of one of the classic traps of software making, the dreaded three-point ticket that turns into a thirteen.

Note that I'm not using AI much for little shortcuts. Creating a commit message and pushing the code takes me, what, 15 seconds? For now I prefer to do it myself. I'm using it for tasks that might take me minutes, but often would take me hours, and require determination and continuous focus over that period of time.

What happens when I don't use a loop like this, when I just fire into a new codebase and prompt Claude or a model in Cursor to go do something? Sometimes it gets it. Other times I end up in one or more of the traps I mentioned above under "Does using AI ever slow me down". This week a naive prompt in a part of the codebase I know poorly took down a component in a dev env. And then because I didn't have permissions to restart the component, I had to shoulder tap a teammate who does. My poor usage of AI slowed us both down.

I facepalmed humbly and went back to working on my loop.


4

So that's a workflow that works in at least one mature codebase. What about all these breathless claims we read regularly on social about one-shotting someone's version of some app or another?

My typical first response to a breathless post is to click the bio. If it says founder of an AI company that's a wrapper around the big provider APIs, well, there you go: someone else's changemaking framework at work.

The rest? Well, it's often the first 80/20 of a perhaps-demoable web or mobile app. And that's always been easy to stand up. Yes, it's easier now. But we shouldn't mistake a lightweight, happy-path-only web app for a full stack+data software system that is observable, secure, resilient, testable, accessible and extensible, with great DevX, the right abstractions and the right consistency / performance tradeoffs. That's always been where the difficulty of software engineering has been.

And those aspects of software that make for durable systems and delightful experiences sit downstream in importance from product direction, principles, taste, market fit. And even those product aspects sit downstream of distribution, founder/market fit, right place / right time.

It's funny even having to remind people of that. I see excitement over demos of one or two aspects of a broad software surface, and yes I know the excitement comes from projecting what else might become possible if the trajectory holds. But that doesn't mean that making high quality, high uptime productionized software systems no longer requires specific knowledge.


5

I can't close this line of thinking without taking a stab at which skills I see getting more valuable in this new world.

The work of software development has most often been like a game of chess: a series of puzzles that, when solved in sequence, results in a winning system. As a chess player, you have ideas and themes in mind, but you'll also pivot if you discover another opportunity or your opponent makes a blunder. LLMs writing software don't work that way (at least not today). They play the game best when you start with what you want. Often the best way to define what you want seems to be Q&A with a model.

Pulling up a layer of abstraction has always been about this. Lower-level programmers have always been critical of how those working at the higher levels are ignorant of their memory and compute usage, etc.

One good way to think about how writing code might change in the back half of the 20s is to look into how the tools are being made. I suggest looking over OpenClaw's architecture, either a summary like this or reading over the codebase with your LLM of choice. I haven't spun up my own lobster yet - this is not the kind of thing you're likely to see allowed in a productionized environment yet - but I've learned a lot just from how the workspace config files are laid out and how dynamic context is managed.

What kinds of skills will be more valuable as syntax production gets cheaper and faster? This seems a likely place to start:

  • knowing what you want
  • decomposing problems
  • creating good context
  • recognizing bad output
  • verifying rigorously
  • exercising taste
  • owning outcomes
  • designing loops and guardrails
  • laying out workspaces for ergonomics and efficiency


6

So, despite being annoyed by the tone of the hype, I'm an eager adopter of AI. At my core I'm a builder, a software maker. It's a thing in the world I've felt drawn to since I was a small boy. In college, I studied CS, not business. So this idea that I can make complex software systems working solo or in a small team is very exciting.

A lot of technologists reacted with distaste to the code academy / bootcamp boom of the 2010s. I embraced it. Some of my favorite peers got into the industry via the bootcamp route. A few things they have in common: they took the profession seriously, they've continued furthering their educations (including pursuing CS degrees), and they each got a helping hand into the profession via entry level, internship or junior roles.

So now I ask myself this: If you come through a bootcamp, or even if you've just gotten a CS degree, how do you get into this profession anymore with entry level roles being slashed? While it's true that, like mate selection, an individual usually only needs one job, there's something to be said for abundance states and what that does for the psyche of the searcher. And, with the end of the ZIRP era, we made a hard pivot from wild abundance in 2021ish to a scary scarcity of entry level / pre-senior roles here in 2026.

The CS background, never a hard requirement, does represent depth of knowledge that bootcamps don't and won't approach. Shallow breadth was valued simply because supply of even basic competence in web engineering, the ability to glue this to that, lagged the demand - demand fueled by near-zero-interest-rate investment philosophies.

It seems like LLMs are now at that point where their output can fill much of the remaining demand. The ability to stitch together a few React components - absent any deep knowledge of systems design, how to balance consistency with availability, how to achieve quality in all its dimensions - will no longer be sufficiently rare as to warrant huge offers to mid-level web software engineers. The wave, having crested in the final throes of ZIRP, will begin to recede. And that is causing, and will continue to cause, emotional reactions.

I have an emotional reaction: I get sad when I see folks advising people not to go to a bootcamp or even not to study Computer Science in 2026. Not least because people told me when I was in high school in the 90s that "all the software developer jobs were going away" - overseas, according to that narrative. Also because WTF; in a university system that produces thousands of art history majors a year, why would there not be value in studying Comp Sci? Degree programs have never focused on applied knowledge. You are learning how to learn and think; gaining deep subject knowledge is a second order effect. Acquiring applied programming skills mostly happens at the workbench.

So it is today with AI.

Yes, the role may look much different by the time you get there.

But I think the wise move is to trust that this hiring slowdown on juniors is either temporary, or that there will be new paths for the bolder and brighter to follow. Just as the "all the software developer jobs are going away" narrative was ultimately washed away by the rise of something nobody expected: the dot com boom. And how much of it is part of the hype, anyway? After all, as someone on Reddit pointed out, even Spotify, with its latest breathless claim this winter that its best software engineers write no code, is still hiring software engineers.


7

So that's where I'm at, as of March 2026: Between the hype and the fear, I believe there's huge and hugely enjoyable leverage to be found in using agentic tools in every aspect of the SDLC to make software. Used poorly, they can slow you down or lead you down bad paths.

Definitive conclusions seem ridiculous when each day lately brings fresh capabilities or ideas about them. So I'm going to let this fan out in a few directions. Will any or all of them seem silly in a few years, or even a few months? Seems likely. Which? Seems really hard to say.

  1. Something I mentioned above: I don't think using an LLM to write text is the right move. AI writing currently has a few tells that, as soon as I encounter them on socials or in work Slack, make my brain shut off. I deeply believe in writing to think, and if you skip the writing of the words, you're skipping the most important part. Maybe I'm like the teacher telling the students to write out their essays by hand: I'll admit to being open to being wrong here. But for now, every word you read on here - or, if you're my coworker reading this, on Slack or in Confluence - I wrote myself.
  2. Turtles all the way down: Something else I mentioned above: The best way to learn how to use AI better might be to ask AI. And the better you are at using AI, the better you are at asking AI how to use AI. So there's a huge advantage to putting in 60, 30, even 10 minutes a day getting better, and letting that compound.
  3. Any task where a primary mode of failure is that the human who's doing it might give up halfway due to tedium or an 80/20 assumption or just natural laziness? Top candidate for using AI.
  4. To get better results from tasking up an LLM, let the LLM task you up. A well-written markdown doc helps the agent understand from your first principles what you really want. And a Q&A session is a great way to get there.
  5. Dunning-Kruger and Parkinson's Law will be amplified in the world of LLMs. If there's anything I know about human nature, it's that humans hate thinking any more than we have to. Given a task expected to take an hour, once you've accomplished it / gotten to an acceptable answer in 5 minutes with Claude or ChatGPT, there is a very low probability that the next move will be (a) to go 11x deeper on the subject or (b) to accomplish 11 more tasks. It is our nature to use the 55 minutes in relaxation, or writing another breathless post on something we know a little about but have a lot of articulate words to say about - thanks to AI. Only those who have gone deep into the hurt locker of programming themselves - or have the 'gift' of deep anxiety, or an actual forcing function like their company being about to go bankrupt - will be able to power through their biology to get to either depth or breadth in a problem space.
  6. This might be nothing, or it might be my most profound idea about using AI: Using AI resembles petitionary prayer, where the most valuable result is the forcing function for the user to decide what it is they actually want, and whether or not it's delivered to them might be best considered a second order effect.
  7. Our deep desire to not think any harder than we have to will remain a primary limiting factor in the application of agentic tooling.
  8. The jerkiness and time compression is real right now. I started musing on this piece in January. And then 4.6 came out, and I had to rewrite it to reflect a new point of view. If this continues, this piece and the points of view it contains will seem archaic in 6-9 months.
  9. All of this assumes LLMs will remain a tool, and that AGI including full agency is not coming in this wave. If I'm wrong about that, then little of this matters anyway, because everything changes. 
posted in Artificial Intelligence


Claude 4.6

This is something else entirely. I've been working on a long-form piece on using AI in a large, in-production codebase for the last month or so. My tack has been towards caution and towards measured usage of Claude 4.5 and GPT 5.2 in specific, targeted aspects of the SDLC.

The last week working with 4.6 has been eye-opening. It's completing and even occasionally one-shotting tasks - including coding - that I previously would have expected to require a long, sometimes painful, sometimes slower-than-just-doing-it-myself iteration with the agents.

Not saying this to scare anyone or create FOMO. More to put a bookmark in time - here is where I felt a shift, a passing of the tipping point of utility. A moment before which I could say that using AI as a software developer was a very good thing to explore tactically, and after which I may end up saying that not using AI as a software developer makes little sense strategically.

I'll be back in a few weeks with that more thorough post...

Edit: You can find it here.

posted in Artificial Intelligence


The Best Things I Read In 2025

The reason I'm picky - and getting pickier - each year about what I read is that what I read is what's available to think about.

I walk a fair amount: 10,000+ steps a day. If your goal is having more time to think, I recommend a 20-30 minute walk each day. And then there's the question of what to think about. For me, I either give my mind something to think about, or it picks something out of the Worry Stack. Iterating your Worry Stack regularly is important when done intentionally, but do it constantly and/or mindlessly, and you'll burn out on whatever's causing the worries. If you've got a job as a software engineering leader, that'll probably be the source of many of the worries on the stack, and a thing to guard against is coming to hate or resent "the job" through lack of discipline in controlling your thoughts.

So an antidote is to choose. Reading provides my conscious mind choices, and the subconscious better paths to influence the choice.

Attention, awareness of agency and the mechanics of the mind were primary themes again this year, as you'll see. Here are the best things I read in 2025:

Rao Reading Algorithm by Arun Rao

I've been thinking a lot about how the machinery of my systems & regular activities works, and also how complacency hinders it, and then finally what effortful outcomes look like. This might be it for reading: "...reading is practice directing attention, causing your awareness to curate what comes into that awareness, for a purpose... I’ve rewired my brain so reading to learn is highly pleasurable"

The essay also offers an answer to Why Read The Classics, via Jacques Barzun who lived to 104 and, if you follow just that Wikipedia link and the author's advice to build out your semantic tree of knowledge, offers at least an afternoon's worth of new nodes, "in order to live in a wider world".

The AI Reflex by Blake Graham

Another post about technique and foundational machinery. This time it's the toolset used while working. Work as in coding and writing.

Speaking of writing: One thing I realized this year is how many of us work as professional writers, or at least the core output at work is writing to persuade. Whether it's short form on Slack/Discord, responding to a customer email, or a longform doc describing a problem and remediation plan, or a well crafted series of prompts that causes code to be written or reviewed, writing well is the primary method we use to effect change and ultimately deliver value.

The author on why his git commit graph has trended up by an order of magnitude: "Yes, the models improved, but that is not why my commit graph changed. I built a reflex. I stopped deciding whether to use AI and started reaching for it the way I reach for a light switch. The better models enabled the transformation. The reflex made it automatic."

Cognitive load is what matters by Artem Zakirullin

A deep dive into developing architecture and writing code that's optimized for reasoning and comprehensibility.

One of the odd things about making software is how averse we are to admitting when the software is getting too complex, because of what it implies: that you are struggling, that you can't handle the load, that maybe you're not smart enough.

In most shops, the people who get things done couple raw horsepower with time in seat and good long term storage. I was blessed with really good recall. Especially when I try. So I can remember how the thing works, either because I wrote it myself or because I spent an afternoon tracing through a callstack and developing a dependency hierarchy in my head.

But long term storage is not working memory. Even amongst those who can get in there and "work the lines", as they say in chess, there is a reluctance to do so unless/until absolutely needed. A piece that just missed this list is actually a comment on Reddit, with this quote: "Chess is a constant struggle between my desire not to lose and my desire not to think."

"Good engineering management" is a fad by Will Larson

Software engineering management is weird. By engineering management I - and I think most authors - mean any role, regardless of title (Director, etc) where you (a) have people who report to you (b) are not in the leadership room with the CEO (aka you're not an executive).

I've been focused on these roles for the last seven years of my career now, after a decade or so of working as an IC, an executive and a founder. Mid-level management roles are exciting, and also weird in a specific way: They're all similar, but also they're all different, and nobody tells you that, nor does anyone typically tell you exactly what's expected of you until it's too late.

When a peer or friend or mentee asks a question about performance reviews etc., I usually answer this: Engineering Leaders deliver the product roadmap (aka their squads/teams ship projects on time), maintain their systems (aka minimize bugs and user frustrations and don't get hacked) and develop their people (aka hire, fire and promote people). Very few do all three well. If you are doing all three well, you're probably getting at least a decent perf review.

During ZIRP, you could get away with doing two or even one of these well; while my favorite leaders to report to are good at all three, I've had quite a few peers who fit what the author calls "orchestrators". As he points out, the specific post-ZIRP expectations are different. And they always will be, as tides turn and conditions change and expectations for a startup shift.

As an industry, venture funded SaaS has just been through one of the most severe shifts I've ever seen in my career: From prioritizing topline growth at nearly all costs to prioritizing profitability at the expense of nearly all growth.

This is easily the most immediately useful piece on this list: a steal-and-adapt framework for guiding your own focus and skillbuilding.

a simple mechanistic theory of jhanas by Bayes

Most of these pieces are pure pleasure to reread and re-parse. I realized this year that I don't even know what the best things I read are until I reread them and process them in close contact with each other. So a primary reason I write and post the best things I read each year? It's to know what the best things I read are. See also this and also this for more on that topic.

Rereading this piece bummed me out at first, because another year has gone by in which I haven't progressed into the jhanas. I still can only do the first one, and even that only sometimes. I still think this is a fantastic guide to getting into the jhanas, and it's totally on me for not making the time to practice.

So why is this piece on here? Because it unlocked for me two things. One, a way of diagnosing negative feedback loops as they're beginning in my mind. And two, a mechanistic way of hacking my own pleasure system. That's all: just a simple mental model that the author presents that led to two big insights. The second of which was key to me for enabling action related to the next item on the list...

Viscerality by Simon Sarris and How to like everything more by Sasha Chapin and The Consolation Of Apricots by Diane Ackerman

A trio on cultivating pleasure.

Of all the things we learn from our parents and school in the primordial brain soup called "growing up", there's one that I think would have the maximum impact on humanity if introduced broadly: Liking things is a skill that can be built, emotionally reacting to art, circumstance and deliciousness can be cultivated and programmed, and you can "develop a crush on the creator".

Focusing attention and awareness on pleasure can also be a way to keep the Worry Stack at bay.

What To Do by Paul Graham

My eldest niece turned 21 this year, and is graduating college next year. Class of '26. Over Christmas I tried my best not to give unsolicited advice or ask too many questions like "so what are you going to do next?" Maybe the most curious reaction, though, is that thinking about the world through a young person's eyes has inspired me to think about what my own long trajectory looks like. This is a foolish statement, but here goes: Deep into my 40s, I feel youthful, and as though life still holds many new and interesting turns and possibilities.

One thing I've never been good at: Asking for advice. I feel as though I'm burdening the recipient with my own failure to comprehend. "Feel" is the right word there - my belly aches and I have a tinge of vertigo just sitting here thinking of asking someone for advice. But people really like being asked for advice! It's not a burden for them at all. Especially if they're given a moment to think. Being asked for advice gives the recipient free rein to talk about themselves and share stories.

Anyway, when I'm lucky enough to be asked for advice by a young person, my go to is to ask "Have you read Paul Graham's essays yet?"

Dandelion Wine by Ray Bradbury

The story of a summer 100 years ago. Though I read several novels this year, this is the one that really moved and stuck with me. Fascinating as a comparison with today, a reminder of things that have changed and things that haven't. Beautiful for its deep sadness amid an underlying optimism. A reviewer on Goodreads has already said exactly what I want to say, concisely: "I've never thought so much about my own mortality without running away from the subject in fear and forced-naivete. I've never felt more fulfilled by a reading experience on both an intellectual and spiritual level as I was with Dandelion Wine."

How The System Works by Charles C. Mann

The intro essay We Live Like Royalty and Don’t Know It offers a free preview to the one and only paywalled piece on my list. I fit squarely into the author's Most People These Days, as I understand only superficially what it takes to create civilization from scratch. One of the things I continuously wonder about - even working as I do so closely to real estate lending - is why real estate has become so expensive.

A lot of this I feel is my own ignorance: I failed at a goal. In 2009 when I moved to the Pacific Northwest, one of my intentions was to buy some land and build a cabin. Part of the reason was the satisfaction of doing a thing like that. But the biggest aspect for me is that I primarily learn by doing. And by building a cabin from scratch, I supposed I would learn how all the things like power and water and sewage work, at least at the scale of one.

Moving from purely building with bits into being able to reason about building with atoms too was a theme of 2025 for me, and while I'd like to explore 3D printing, I'm increasingly interested in ways that mechanical control flow and system orchestration can be handled with code.

Cities are routers in network society by Gordon Brander

A piece of writing usually makes my end of year list primarily because it either catalyzed a new way of looking at something, or because it provided a launching point for an order of magnitude or more of exploration.

This piece is very much in the latter camp. To start, any story whose timeline traces back to Westphalia has my attention - that's a tree I've been building for a while: Western Civ II, the last 500 years. And then it references Stewart Brand and Marshall McLuhan. If you've never explored Brand or the Whole Earth Catalog, there's an afternoon of branch-building for you - one that'll probably include Amazoning a book that just missed this list, What The Dormouse Said by John Markoff.

Most of all, I've always loved cities. I live in a West Coast city at a time when a lot of my friends and family, having long abandoned city life for the peace of the suburbs, wonder what's wrong with me. They appreciate aspects of cities - especially the job opportunities - but they like to keep the mess at arm's length. A lot of that has been by design, sadly.

Long explorations of interesting subjects are another way of avoiding spending time on the Worry Stack. I'll close with a few of the branches from this piece that turned into hours of exploration:

  • Dark Factories
  • The intersection of bits (which I've spent my career manipulating) and atoms (kind of like the dark side of the moon to me, a fascination)
  • Jugaad, and really this whole paragraph: "AI and automation increasingly absorb the scalable aspects of production. That leaves us to putty over the cracks. “Work” takes on a DIY/jugaad quality, focused on creatively hacking together powerful resources to solve contextual problems." - which describes what I'm good at remarkably well.
  • “The line is blurring between remote workers and tourists,” - I was doing this in 2004. I had a laptop and I would fly to San Francisco on a Thursday night, by myself, and instead of taking time off I would work from coffee shops. People thought I was crazy.
  • https://ephemerisle.org/index.php/Ephemerisle & https://www.mars.college/

That's it, the best things I read in 2025. Tempting to include a list of pieces that just missed the list, but I'm going to resist that. Besides, your LLM of choice can probably generate a great related list if you liked any of these. Find previous years here. Happy 2026!

posted in Reading List


Look Upstream

Whatever problem has you frustrated today, it's almost certainly worth a few moments to consider: "What would need to be true for us to not have this problem at all?"

Generally speaking, the way to not have a problem is to look upstream and see what conditions led to you having the problem in the first place.

Running out of gas is a real world example of a problem best solved by not having the problem at all. Upstream of running out of gas is habitually stopping to get gas when the tank is below half.

That example calls out something else: there are problems, and there are problem patterns. Looking upstream doesn't solve the acute problem of being out of gas right now - too late for that; now the problem must be solved with a call to AAA, a walk to a gas station, or a helpful Samaritan in a pickup. But the pattern can be curtailed: always getting gas when the tank is below half means you never have to solve the "I'm out of gas" problem ever again.

Heading upstream is also good at preventing related problems you don't have yet. I think of this as preventing families of problems. Let's say a natural disaster hits your community - hurricane, fire, flood, take your pick - and authorities issue a "GO NOW". If you habitually fill the gas tank when it dips below half, you're good. If you don't, you might now be running out of gas somewhere AAA isn't going to help you, and now you have a Life In Danger problem on your hands.

ChatGPT, Claude and Gemini are all really good at helping us brainstorm ideas. "What would need to be true for this to not be a problem in the first place?" is a great prompt to wrap up a problem-solving thread with an LLM.

Part of that is being willing to timebox outlandish-sounding ideas and earnestly consider them before (usually) discarding them. Twenty minutes is often enough. Do that five times a month and you'll have spent a little over an hour and a half - roughly 1% of a standard work month. If you think you're good at generating ideas now, it's a good test.

You can be a great problem solver, and great problem solvers are almost always rewarded in software development. But much more valuable is the person who prevents problems, or whole families of problems, from needing solving at all. If you're interested in your job being AI-proof as an engineering leader - and I don't know whether we should be worried about that, but plenty of people seem to be - I can't think of an easier way.

posted in Problem Solving


Tasting 2023 In Oregon

TLDR: 2023 Oregon Pinot to me will be remembered for wines that are open and beautiful at release. An echo of 2012, 2014, 2015.

Not a lot of things in wine get me as excited as the variability between Oregon pinot noir vintages. Living here for 16 years now means each one conjures memories of that year.

The movements between vintages can be as elegant and surprising as the movement on the palate in the wines themselves. 2021s with their massive (for Oregon) structures, the smaller 2022s when so many lead with their earth or savory notes, and now 2023.

I've heard some pooh-poohing of the 2023s - "they're all the same!" said to me by more than one winemaker or distributor - but I'm always aware that even in my own mind there's a contrarian tendency to claim love for the unloved and snub the easily loved. 2007, 2011, 2019: those were hard vintages to love at release. Now they're favorites for many of us.

How I'm approaching 2023 so far is contextual to what I've done with the previous two vintages:

2021s: I bought good quantities of my favorite single-vineyard wines to hold.

2022s: I mostly avoided this year, which I acknowledge that I may regret if they evolve like 2007s, 2011s and 2019s.

I'm really enjoying the 2023s, and have been buying them in reasonable quantities - quite a bit more than 2022. I'm especially excited about the entry-level wines - ~$25 and sometimes blessedly still below that - and have stocked up to PnP over the next 2-3 years.

A few specific standouts:

  • Evesham's 2023 WV. Ridiculously multi-dimensional for this price: prettiness, ruggedness, brightness, balance, length. This is one you can (or could) get outside the region at Whole Foods around $25.
  • Cameron's 2023 Clos Electrique is more open at this point than any vintage I remember. Maybe 2015. Their WV and Dundee Hills are both ready to rock as well.
  • Walter Scott's 2023 La Combe Verte. Such bright red fruit, herbal, wide open.

Some winemakers' notes on 2023

posted in Wine


Ephemeral Test Suites

I realized something the other day that now feels obvious: Opus 4.1 and other models are so good at writing bash, it makes sense to rethink how we treat tests. Unit tests are naturally 'write once, run forever'. But integration and e2e tests don’t scale that way—each one wants its own environment setup, fixture resets, and timeouts. Some of them can now be 'write once, run once, throw away'.

So when making a change that might affect many interfaces - say, a major version update of your web app framework (Next.js, Rails, Express, etc.) - you can just ask Cursor or Claude Code to walk your route table, write bash to hit each endpoint, and make a plan to fix whatever problems it finds.
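A throwaway suite of that kind can be as simple as a shell script. This is a minimal sketch, not anything an agent actually produced; the base URL and route list are hypothetical stand-ins for whatever gets extracted from your route table:

```shell
#!/usr/bin/env sh
# Ephemeral smoke suite: hit every route and flag non-2xx responses.

classify() {
  # Map an HTTP status code to a verdict; anything outside 2xx fails.
  case "$1" in
    2*) echo "OK" ;;
    *)  echo "FAIL" ;;
  esac
}

smoke() {
  base="$1"; shift
  rc=0
  for route in "$@"; do
    status=$(curl -s -o /dev/null -w '%{http_code}' "$base$route")
    verdict=$(classify "$status")
    echo "$verdict $status $route"
    [ "$verdict" = "FAIL" ] && rc=1
  done
  return $rc
}

# Example run (assumes a local dev server; routes are hypothetical):
# smoke "http://localhost:3000" /health /api/users /api/orders
```

The script itself isn't the point. The point is that an agent can generate one like it, tailored to your actual routes, in minutes - which is what makes throwing it away painless.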

Or to take another common angle: when making a change to one endpoint that you're worried might have downstream impacts on subsequent calls in the same real-world workflow, you can pivot 90 degrees and ask the LLM to produce many inputs for the same sequence and get into some ugly edge cases.
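Here's one shape that kind of throwaway suite can take. Everything below - the endpoints, the field name, the specific edge cases, the use of jq - is a hypothetical sketch; the real value comes from having the LLM generate dozens of inputs, not five:

```shell
#!/usr/bin/env sh
# Replay one real-world workflow (create a record, then fetch it)
# across many generated edge-case inputs.

edge_inputs() {
  # A few edge-case values of the kind an LLM might propose, one per
  # line: empty, minimal, long, non-ASCII, embedded quote.
  printf '%s\n' \
    "" \
    "a" \
    "$(head -c 64 /dev/zero | tr '\0' 'x')" \
    "名前" \
    "O'Brien"
}

replay() {
  base="$1"
  edge_inputs | while IFS= read -r name; do
    # Step 1: create a record with this input (endpoint is hypothetical).
    id=$(curl -s -X POST "$base/api/users" \
      -H 'Content-Type: application/json' \
      -d "{\"name\": \"$name\"}" | jq -r '.id')
    # Step 2: exercise the downstream call that consumes the new record.
    curl -s -o /dev/null -w "%{http_code} name=$name\n" "$base/api/users/$id"
  done
}

# Example run (assumes a local dev server and jq installed):
# replay "http://localhost:3000"
```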

In the former case, I might want to commit those tests to the codebase and get them running in CI. In the latter case, this might be such a bespoke use case that it makes more sense to just throw away the suite.

Test suite ephemerality becomes appealing when writing a suite takes an agent a few minutes rather than the hours it might require by hand. If I'm putting hours or even days into writing tests, I'm going to want to treat them as precious things to be held onto and run forever. Knowing they can be regenerated on a whim, I'm much more likely to just throw them away.

A lightweight framework for deciding might look like this:

  • Add it to CI when: you already have a way to run integration tests deterministically in CI (not a given in every codebase, sadly), the tests are performant (not easy), the contract is stable, the risk is recurring, or you're fixing a regression that would be horrendous egg-on-face if it came back. (I've written before about not really being a fan of integration tests in CI, so if this list skews toward exclusiveness, there's that to consider.)
  • Keep it ephemeral when: it's exploratory, migration-specific, brittle across multiple systems, you're mapping blast radius, or (probably most common) it might be a useful suite but making it deterministic, performant, and safe to run in parallel with other tests is intractable.


posted in Software Development


On Building AI Applications

A coworker shared this video called 12-Factor Agents: Patterns of reliable LLM applications, which for me was a total "ah ha!" moment. Not because any of it was new, but because it concisely synthesized lots of loose thoughts I've had while designing and building AI systems over the last few years - which to me is even more exciting. If you're building or thinking of building an AI system, I highly recommend watching it. It's 17 minutes well spent.

Here are the three most important ideas / principles that I wish I'd known before I started designing and developing agentic systems:

Key idea: Effective AI systems are built from a series of agents - distinct prompts and models - with each agent given specific context - broad or deep, but rarely both - and the system composed of multiple agents across both dimensions. The "monoprompt" approach is very rarely going to work well.

For a simple example: in a bot designed to discuss the pros and cons of various retirement planning strategies, one agent might summarize a user query broadly, while another dives deeply into financial datasets relevant to the asset classes being discussed, and a third dives deep into the user's own specific profile and goals. Only the first agent would be given broad user and multi-domain context in its prompt. The other two would be confined to context that supports depth in their focus areas.
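As a sketch of that shape only - `ask` is a stub standing in for whatever model API you'd actually call, and the model names are made up - the data flow between the three agents might look like:

```shell
#!/usr/bin/env sh
# Three agents, three prompts, three context slices. ask() is a stub;
# a real version would call your model provider here.

ask() { # ask MODEL PROMPT CONTEXT
  printf '[%s] %s :: %s\n' "$1" "$2" "$3"
}

answer_query() {
  query="$1"
  # Agent 1: broad context (the user's question, all domains), shallow depth.
  summary=$(ask general-model "Summarize the query and pick asset classes" "$query")
  # Agent 2: narrow, deep context - only data on the relevant asset classes.
  analysis=$(ask analyst-model "Analyze these asset classes" "$summary")
  # Agent 3: narrow, deep context - only this user's profile and goals.
  ask planner-model "Relate the analysis to the user's goals" "$analysis"
}

answer_query "Should I shift from stocks into bonds at 55?"
```

Each real agent gets its own prompt and its own context window; the stub just makes the composition visible.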

Key idea: AI product development makes skateboard-level prototyping really easy, almost too easy. It's tempting to get the first 80% working and assume the remaining 20% will come just as quickly. In that way it follows most software development: a competent web developer could build 80% of Facebook in a couple of weeks - CRUD with persistence on a handful of entities is a very solved problem - but the last 20% - an app that scales to billions of users - takes thousands of engineers upwards of a decade.

Again, be wary of the monoprompt. It's really easy to get 80% of the way there with one and assume the other 20% is simply edge case handling.

Key idea: Not every problem is a good problem for an agent-driven system. In the video the presenter frames it as "realize this isn't a good problem for agents". My experience is that there's more nuance, and that it's actually more of a distinction between building an AI-driven system with a human-in-the-loop, or a human-driven system with AI-in-the-loop. I've touched on this before, but I'll write an updated post soon. 

posted in AI Agentic Design