
The secret loop of an agent

Mike Bermeo ⏱ 8 min read

In the previous post we saw that Claude Code is not a chat: on the outside it looks like a conversation, but on the inside it behaves like a system that observes, acts, and decides. And at the end we left an open question.

What is the mechanism that makes that behavior possible?

If the system doesn’t resolve everything in a single pass, there has to be some structure connecting each step. A structure that lets it look at something, act, see what happened, and decide again.

That structure is the loop. And it’s worth opening it up.

The sequence that drives everything

An agent doesn’t receive an instruction and resolve it in one shot. It enters a cycle. The basic sequence has four moments:

  1. Interprets — reads the available context and understands where it stands.
  2. Decides — chooses the most useful next action.
  3. Acts — executes that action using a tool.
  4. Evaluates — looks at the result and decides whether it has enough, or whether it needs another pass.

If it doesn’t have enough, it goes back to the beginning. If it does, it responds.

flowchart LR
  I(["🧠 Interprets"]):::model --> D
  D{{"What to do?"}}:::decision --> A
  A(["🔧 Acts"]):::tool --> E
  E(["📋 Evaluates"]):::result --> G
  G{{"Enough?"}}:::decision
  G -->|"No"| I
  G -->|"Yes"| R(["💬 Responds"]):::output

  classDef model    fill:#312e81,stroke:#818cf8,color:#e2e2f0,stroke-width:2px
  classDef decision fill:#1c1917,stroke:#f59e0b,color:#fbbf24
  classDef tool     fill:#164e63,stroke:#22d3ee,color:#e2e2f0
  classDef result   fill:#1c1917,stroke:#78716c,color:#a8a29e
  classDef output   fill:#14532d,stroke:#4ade80,color:#e2e2f0

  linkStyle 4 stroke:#818cf8,stroke-width:2px,stroke-dasharray:5
  linkStyle 5 stroke:#4ade80,stroke-width:2px

The arrow going backward is the most important piece of this diagram. Without it there’s no loop — just text generation. With it, the system can reach out, touch a tool, look at the world, and come back with new information.

That’s where it stops spinning only on what it already knew.
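
One way to read the diagram is as a tiny state machine. The sketch below is only an illustration: `Phase`, `nextPhase` and `trace` are hypothetical names, not anything taken from Claude Code. The point it tries to make is that the only branch sits at the evaluation step, which is where the backward arrow lives.

```typescript
// Hypothetical names: a minimal state-machine reading of the flowchart.
type Phase = "interprets" | "decides" | "acts" | "evaluates" | "responds";

// Returns the phase that follows `phase`. `enough` only matters at "evaluates".
function nextPhase(phase: Phase, enough: boolean): Phase {
  switch (phase) {
    case "interprets":
      return "decides";
    case "decides":
      return "acts";
    case "acts":
      return "evaluates";
    case "evaluates":
      // The backward arrow: not enough evidence sends the agent around again.
      return enough ? "responds" : "interprets";
    case "responds":
      return "responds"; // terminal: the answer has been given
  }
}

// Walks the cycle, treating the evidence as sufficient on pass `passesNeeded`.
function trace(passesNeeded: number): Phase[] {
  let phase = "interprets" as Phase;
  const visited: Phase[] = [phase];
  let passes = 0;
  while (phase !== "responds") {
    if (phase === "evaluates") passes += 1;
    phase = nextPhase(phase, passes >= passesNeeded);
    visited.push(phase);
  }
  return visited;
}
```

With `trace(1)` the agent goes straight through; with `trace(2)` the list of visited phases passes through "interprets" a second time before it reaches "responds".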

What actually happens in a task

Theory is one thing. Concrete behavior is another.

Imagine you ask the agent: “find out why this function is failing”. That’s not a question you can answer with intuition. You have to search. You have to look. You have to compare.

Here’s what the loop looks like in action for that task:

sequenceDiagram
  actor U as User
  participant M as 🧠 Model
  participant RF as readFile
  participant GR as grep
  participant BA as bash

  U->>M: "find out why handleAuth is failing"

  Note over M: Turn 1 — I need to see the code
  M->>RF: readFile("src/auth/handleAuth.ts")
  RF-->>M: function code (expects 2 args)

  Note over M: Turn 2 — who calls it and how
  M->>GR: grep("handleAuth(", "src/")
  GR-->>M: 3 references found

  Note over M: Turn 3 — one call with 1 arg looks suspicious
  M->>BA: bash("npm test -- --filter=auth")
  BA-->>M: FAIL: expected 2 arguments, got 1

  Note over M: I have concrete evidence now
  M->>U: "The problem is in api/routes.ts line 47..."

Four exchanges. Three different tools. Each turn changes what the model knows, and that change determines what it does next.

In code, each of those turns looks something like this:

Turn 1: read the function code

// Agent decides: I need to see what handleAuth receives
const file = await tool.readFile("src/auth/handleAuth.ts")
// Observes: function expects (userId: string, token: string)
// Decision: enough? No. I need to see who calls it.

Turn 2: find the references

// Agent decides: I need to see how it's called across the project
const refs = await tool.grep("handleAuth(", "src/")
// Observes: 3 results. One in api/routes.ts passes only 1 argument.
// Decision: enough? Almost. I need to confirm that's the failure.

Turn 3: confirm the hypothesis

// Agent decides: I'll run the test to verify
const test = await tool.bash("npm test -- --filter=auth")
// FAIL: handleAuth expected 2 arguments, got 1 at routes.ts:47
// Decision: enough? Yes. I have concrete evidence.

Notice: the agent didn’t invent the answer on turn 1. Not on turn 2 either. It built it through steps. It only responded when it had something more than a guess.

That’s what the loop changes.

Each turn is a decision

There’s something easy to miss in this mechanism: the loop is not mechanical repetition.

A for loop iterates a fixed number of times. A while loop keeps going until a condition changes. An agent’s loop looks much more like the second, but the condition isn’t a counter or a simple boolean. It’s an evaluation of the context gathered so far.

At each turn, the agent implicitly answers one question: do I already have enough to respond?

If the answer is “I’m missing something”, it picks a new action and goes back. If the answer is “I have enough”, it exits the cycle and responds.

In pseudocode, the logic looks like this:

the-loop-from-inside.ts

async function resolveTask(task: string) {
  let context = analyze(task)
  while (!haveEnough(context)) { // evaluates every turn
    const action = decideNextStep(context)
    const result = await execute(action) // touches the world
    context = incorporate(context, result)
  }
  return draftResponse(context)
}

The two commented lines are the heart of the system. The first, the while condition, is the question. The second, the call to execute, is the action. Everything else is context accumulating turn by turn.
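
To watch the loop run end to end, here is a self-contained sketch that fills in the helpers with stubs. Everything in it is an assumption made for illustration: the `Context` and `Action` shapes, the fixed three-step plan, and the canned tool outputs (borrowed from the handleAuth example earlier) stand in for what a real model and real tools would produce.

```typescript
// Illustrative stubs only: a real agent would ask a model to decide and call real tools.
type Action = { tool: "readFile" | "grep" | "bash"; input: string };
type Context = { task: string; observations: string[] };

// Canned results standing in for real tool output (hypothetical data).
const cannedResults: Record<Action["tool"], string> = {
  readFile: "handleAuth expects (userId: string, token: string)",
  grep: "3 references; api/routes.ts passes only 1 argument",
  bash: "FAIL: handleAuth expected 2 arguments, got 1 at routes.ts:47",
};

const plan: Action[] = [
  { tool: "readFile", input: "src/auth/handleAuth.ts" },
  { tool: "grep", input: "handleAuth(" },
  { tool: "bash", input: "npm test -- --filter=auth" },
];

function analyze(task: string): Context {
  return { task, observations: [] };
}

// The question asked every turn. In this stub: "have I seen a failing test yet?"
function haveEnough(context: Context): boolean {
  return context.observations.some((o) => o.startsWith("FAIL"));
}

// Stub policy: take the next unused step of the fixed plan.
function decideNextStep(context: Context): Action {
  return plan[context.observations.length];
}

// Stub execution: look up the canned output for the chosen tool.
async function execute(action: Action): Promise<string> {
  return cannedResults[action.tool];
}

function incorporate(context: Context, result: string): Context {
  return { ...context, observations: [...context.observations, result] };
}

function draftResponse(context: Context): string {
  const evidence = context.observations[context.observations.length - 1];
  return `Answer after ${context.observations.length} turns, based on: ${evidence}`;
}

async function resolveTask(task: string): Promise<string> {
  let context = analyze(task);
  while (!haveEnough(context)) { // evaluates every turn
    const action = decideNextStep(context);
    const result = await execute(action); // touches the (stubbed) world
    context = incorporate(context, result);
  }
  return draftResponse(context);
}
```

Calling `resolveTask("find out why handleAuth is failing")` goes around the loop three times before `haveEnough` returns true. Swapping the stubs for a model call and real tools would change what flows through the loop, not its shape.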

The loop is not infinite

Understanding the loop also means understanding its limits.

The cycle doesn’t exist for the agent to spin indefinitely. It exists so it can repeat only what’s necessary. And that means navigating between two real risks:

flowchart TD
  T(["📥 Task received"]):::neutral --> P & S & E

  P(["⚡ Responds now"]):::bad
  P --> PB(["No evidence"]):::bad
  PB --> PR(["❌ Likely wrong"]):::danger

  S(["✅ Healthy loop 2-5 turns"]):::good
  S --> SB(["Evidence verified"]):::good
  SB --> SR(["✓ Grounded response"]):::success

  E(["🌀 Keeps searching"]):::bad
  E --> EB(["10+ turns no convergence"]):::bad
  EB --> ER(["❌ No response"]):::danger

  classDef neutral fill:#1e1b4b,stroke:#818cf8,color:#e2e2f0
  classDef bad     fill:#1c1917,stroke:#78716c,color:#a8a29e
  classDef danger  fill:#450a0a,stroke:#f87171,color:#fca5a5
  classDef good    fill:#1a2e1a,stroke:#4ade80,color:#86efac
  classDef success fill:#14532d,stroke:#4ade80,color:#e2e2f0

  linkStyle 3,4 stroke:#f87171,stroke-dasharray:4
  linkStyle 5,6 stroke:#4ade80,stroke-width:2px
  linkStyle 7,8 stroke:#f87171,stroke-dasharray:4

The failure on the left is responding before looking: the agent returns something plausible, but without real evidence behind it. The failure on the right is continuing to investigate without converging: the agent keeps opening files, searching patterns, running commands, and never decides it has enough.

The number of turns isn’t what measures the quality of the work. What measures it is the quality of each decision to continue or stop.
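
A common guard against the second failure is a turn budget that sits next to the stop condition. The sketch below assumes a hypothetical `step` callback that performs one turn and reports whether the evidence is sufficient. `runWithBudget`, `maxTurns`, `StepResult` and `Outcome` are illustrative names, not part of any real agent API.

```typescript
// Hypothetical shapes for illustration.
type StepResult = { enough: boolean; evidence: string };
type Outcome =
  | { status: "answered"; turns: number; evidence: string[] }
  | { status: "gave_up"; turns: number; evidence: string[] };

// Runs `step` until it reports enough evidence or the budget runs out.
async function runWithBudget(
  step: (turn: number) => Promise<StepResult>,
  maxTurns: number,
): Promise<Outcome> {
  const evidence: string[] = [];
  for (let turn = 1; turn <= maxTurns; turn++) {
    const result = await step(turn);
    evidence.push(result.evidence);
    if (result.enough) {
      return { status: "answered", turns: turn, evidence };
    }
  }
  // Budget exhausted: report what was found instead of searching forever.
  return { status: "gave_up", turns: maxTurns, evidence };
}
```

The budget only bounds the cost of a loop that fails to converge. It says nothing about whether an early stop was justified, which still depends on the quality of each decision to continue or stop.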

Correction, not sophistication

There’s a tendency to see the loop as a sign that the system is “more sophisticated”. More steps, more impressive.

But that inverts the causality.

The loop doesn’t exist to seem more capable. It exists to be able to self-correct. Each turn is an opportunity to contrast what the model assumes against what the world shows. Instead of inventing, it can review. Instead of assuming, it can verify.

| | Without loop | With loop |
| --- | --- | --- |
| Source of the answer | Model’s intuition | Accumulated evidence |
| Error handling | Ignores or fabricates them | Discovers and corrects them |
| Confidence basis | “I think it’s this way” | “I verified it this way” |
| What happens if it fails | Response fails silently | The cycle detects and adjusts |
| Quality on complex tasks | Degrades quickly | Holds up with more steps |

The right column doesn’t require a smarter model. It requires the structure to be able to review itself.

That’s what the loop adds: not intelligence, but correction.
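
As a rough illustration of that correction step, the sketch below turns a failed tool call into an observation the next turn can react to, rather than letting it vanish. The `Observation` type, the `observe` wrapper and the fallback policy in `nextPath` are assumptions made for this example, and `read` is a stand-in for whatever file tool the agent has.

```typescript
// Hypothetical types: a tool call either yields output or a recorded error.
type Observation =
  | { ok: true; output: string }
  | { ok: false; error: string };

// Wraps any tool call so a failure becomes data for the next turn.
async function observe(call: () => Promise<string>): Promise<Observation> {
  try {
    return { ok: true, output: await call() };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { ok: false, error };
  }
}

// Stub policy: every recorded failure moves on to the next candidate path.
function nextPath(candidates: string[], history: Observation[]): string | undefined {
  const failures = history.filter((o) => !o.ok).length;
  return candidates[failures];
}

// Tries candidates in order until one read succeeds or none are left.
async function readFirstAvailable(
  candidates: string[],
  read: (path: string) => Promise<string>,
): Promise<{ history: Observation[]; output?: string }> {
  const history: Observation[] = [];
  let path = nextPath(candidates, history);
  while (path !== undefined) {
    const current = path;
    const observation = await observe(() => read(current));
    history.push(observation);
    if (observation.ok) return { history, output: observation.output };
    path = nextPath(candidates, history);
  }
  return { history };
}
```

The first guess about where the code lives can be wrong without the final answer being wrong, because the failed read stays in `history` and shapes the next decision.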

What changes when you see it this way

When you understand that Claude Code works with an internal cycle, several things that might have seemed odd start making sense.

You understand why it sometimes advances in stages instead of responding immediately. You understand why it reads a file before answering something that seemed obvious. You understand why a good response is usually the result of several micro-decisions you never see.

And it changes how you read what’s happening on screen.

You stop looking only at the output. You start seeing the process that produces it.

What comes next

The loop tells you how the agent works.

But we still haven’t opened the more concrete question that follows: what does “acting” actually mean? We’ve seen that the agent can read files, search text, and run commands, but we haven’t looked at any of that closely.

What happens when it decides to use a tool? What is a tool, where do tools come from, and how does it choose among them?

That’s what the next episode is about.