Recipes don't work for AI agents as well as you think

contents

Why recipes don’t survive contact with reality
This isn’t just one founder’s opinion
Where the recipe still has a place
The dividends principle

The most reliable line in my AI agent’s prompt isn’t an instruction. It’s a cultural value:

Our product always adheres to “the dividends principle”: we reward user investment immediately with delight or magic.

It sits at the top of the onboarding prompt, above 200 lines of explicit branching logic, and it handles more situations correctly than any of the rules underneath it. I didn’t expect that when I wrote it.

Almost every prompt I’ve seen for an AI agent looks like an instruction manual. Those 200-line prompt files produce agents that follow the script decently when the script fits and produce useless outcomes when it doesn’t. A well-written one-line value, oddly enough, generalizes better and produces more reliable outcomes.

Why recipes don’t survive contact with reality

The instinct to write a step-by-step recipe feels intuitive. The more I spell out, the more reliable the agent will be, right? That’s true some of the time, but anyone who has tried knows that agents don’t work as reliably as you would like. The pure recipe approach is wrong for the same reason that giving a new hire a 200-page manual is the wrong way to onboard them.

A new employee with a manual can probably handle the situations and examples covered in the document, but they most likely can’t improvise when the situation is off-pattern, can’t tell you why the policy exists, and will fall back to “I’ll ask my manager” the moment something is unfamiliar. Help them internalize your culture alongside the manual instead: what the company values, what good judgment looks like here, what you’d do in their seat if you had to choose. Then they’ll handle the situations the manual covers, plus a bunch of adjacent ones you never anticipated, all roughly the way you would have wanted.

The same thing happens with AI agents. The 200-line prompt gives the agent a guide for 80-90% of onboarding scenarios. The one-line value gives it a tiebreaker it can apply across every adjacent scenario, including the ones you didn’t anticipate.

This isn’t just one founder’s opinion

The strongest evidence in favor of the principles-vs-recipe point is that strong teams, across eras and disciplines, keep arriving at it from different angles.

Anthropic’s published guidance for CLAUDE.md files (the project-context file every Claude Code session reads) is unambiguous: as few instructions as possible, ideally only ones which are universally applicable to your task. If removing a line doesn’t result in an obvious mistake, remove it. HumanLayer’s published version recommends keeping the file under 60 lines ideally, and no more than 300. That’s advice from a team that ships agents, aimed directly at the people who operate them.
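If you want to hold a prompt file to that budget, a small check is enough. Below is a minimal Python sketch that uses the 60 and 300 figures quoted above as thresholds. Counting only non-blank lines is an assumption, and the function name `line_budget_status` is hypothetical; neither comes from the published guidance.

```python
def line_budget_status(text: str, ideal: int = 60, hard_cap: int = 300) -> str:
    """Classify a prompt file by its number of non-blank lines.

    The thresholds mirror the figures quoted in the post; counting only
    non-blank lines is an assumption made for this sketch.
    """
    count = sum(1 for line in text.splitlines() if line.strip())
    if count > hard_cap:
        return "over_cap"
    if count > ideal:
        return "over_ideal"
    return "ok"


# Example: a two-instruction file is comfortably inside the budget.
status = line_budget_status("Reward user investment immediately.\n\nUse YYYY-MM-DD dates.\n")
print(status)
```

In practice you might feed it `Path("CLAUDE.md").read_text()` from a pre-commit hook, so that every new line has to justify its place.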

Randall Bennett of Bolt Foundry, who has been building agent-reliability tooling for several years, frames the same idea from the other side: “agents are about culture; computers are about [instructions].” The system prompt is for values; the rest is what computers can already do.

This isn’t limited to modern-day employees and AI agents. People arrived at the same conclusion two centuries ago, on the battlefield. The Prussian army got beaten badly enough at Jena and Auerstedt in 1806 that they had to rethink how orders should work. The doctrine they landed on, eventually called mission command, has the same shape: subordinates are told what effect to achieve and why, then decide locally how. It’s a response to the same problem AI operators have today, where the principal can’t anticipate everything that happens in the field. Centralized intent, decentralized execution.

Where the recipe still has a place

I’m not saying you should write zero instructions. There’s still a floor of explicit constraints worth writing down, especially the ones that would be expensive to discover by trial and error. “Don’t refund without checking the user’s payment method first.” “All dates are in the format YYYY-MM-DD.” “We use Jira instead of Linear.” The reason they sit in the prompt instead of being inferred from a principle is that the cost of inferring wrong is high.

Beyond those specific instructions, the principles do most of the adjacent work. The explicit floor exists only to cover the cases where guessing wrong is too costly.
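One way to picture that split is a prompt assembled from two short lists: values first, then the floor of hard constraints. The Python sketch below is only an illustration. `PromptSpec`, `build_system_prompt`, and the section labels are hypothetical names, not part of any framework, and the example strings are the ones used in this post.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    # Values the agent should reason from when the rules run out.
    principles: list[str]
    # Explicit rules kept only where inferring wrong would be expensive.
    constraints: list[str] = field(default_factory=list)


def build_system_prompt(spec: PromptSpec) -> str:
    """Render values first, then the hard constraints, skipping empty sections."""
    sections = []
    if spec.principles:
        sections.append("Values:\n" + "\n".join(f"- {p}" for p in spec.principles))
    if spec.constraints:
        sections.append(
            "Hard constraints:\n" + "\n".join(f"- {c}" for c in spec.constraints)
        )
    return "\n\n".join(sections)


spec = PromptSpec(
    principles=[
        "The dividends principle: we reward user investment immediately "
        "with delight or magic.",
    ],
    constraints=[
        "Don't refund without checking the user's payment method first.",
        "All dates are in the format YYYY-MM-DD.",
        "We use Jira instead of Linear.",
    ],
)
prompt = build_system_prompt(spec)
print(prompt)
```

Keeping the two lists separate makes it easier to notice when the constraint list starts growing faster than the values list, which is usually the sign that a recipe is creeping back in.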

The other thing the principles-only framing misses is operator responsibility. This is the part I find easy to dismiss and hard to actually do. Refining the principles file is real work. You don’t write a values document once and walk away. You watch what the agent does, you notice the choices that surprise you, and you go back and update the intent. If the agent makes a choice you didn’t expect, your intent wasn’t good enough yet. I’d argue this is closer to managing a team than writing a spec.

The dividends principle

In my own agent-coded products, the example I mentioned above has been working so effectively that it’s the whole reason I wanted to write this post. The dividends principle: good products reward user investment immediately, not after some milestone. Whether it’s time investment that a user makes, or the willingness to give us information about themselves during onboarding, or an integration or configuration step, I want my product to reward the user with a moment of magic or delight. I added this to my onboarding agent’s system prompt as a value rather than a rule.

What surprised me was how durable it turned out to be. Different agents, different scenarios, but they kept making decisions that respected the principle, even when nothing in the prompt explicitly told them to apply it to that case. They also came up with some great new ideas I would never have thought of. The agent uses the cultural value for ideation as well as a tiebreaker when explicit instructions run out, which is most of the time in practice.

I’m still figuring out a few things. How many principles is the right number? Mine have been creeping; if I add one a week I’ll be back to a procedure manual by August. And how does this scale across teams? Can a principle that works for one operator be reused by another, or is it culture-bound the same way company values are?

As models continue to improve, I’d bet most of the prompts being written right now are going to get thrown out and rewritten as values-based documents within the next 18 months. The recipe-style prompts that survive will cover only cases where the total range of possible outcomes is small enough that exhaustive rules can cover them all. That’s a smaller category than most teams think it is.

The agents of the future will look less like compliance contractors and more like new hires who internalized the culture. The work shifts from writing rules to articulating intent, and from reviewing outputs to refining the intent when the output surprises you. It’s slower than it sounds, in the way that managing a team is slower than writing a spec, but the compounding will definitely show up later.
