What it actually means to use a tool

The previous post left a specific question open: what actually happens when the agent decides to use a tool. We know the loop exists. We know the system picks an action on each turn. But we didn’t open the action itself.

That’s what this post is about.

And the answer has a counterintuitive part worth stating upfront:

when Claude Code “uses a tool”, the model doesn’t execute anything.

It requests.

That distinction seems small. It isn’t.

The model doesn’t execute. It emits a request.

When the model decides it needs to read a file, it doesn’t open the file.

What it produces is a structured output that says, in effect: I want to use this tool, with these parameters. Here’s what a real Claude Code request looks like:

{
  "type": "tool_use",
  "id": "toolu_01Hy3k9mX2pL8vRqN4wTzA",
  "name": "read_file",
  "input": {
    "path": "src/auth/handleAuth.ts"
  }
}

That’s all that comes out of the model. Not an action. A structured intent with an id to track the response.

The runtime receives that, opens the file, and returns the result tied to the same id:

{
  "type": "tool_result",
  "tool_use_id": "toolu_01Hy3k9mX2pL8vRqN4wTzA",
  "content": "export async function handleAuth(\n  userId: string,\n  token: string\n): Promise<Session> {\n  const user = await db.users.findById(userId);\n  if (!user) throw new AuthError('User not found');\n  return createSession(user, token);\n}"
}

The tool_use_id is the thread connecting the request to its result. The model doesn’t need to know how it was executed — it just receives the content and continues.

What happens next occurs outside the model. There’s another component of the system — the runtime — that receives that request and turns it into something real.

flowchart LR
  M(["🧠 Model"]):::model --> E(["📤 Emits request"]):::model
  E --> RV(["✔ Runtime validates"]):::runtime
  RV --> RX(["⚙ Runtime executes"]):::runtime
  RX --> RR(["📥 Returns result"]):::runtime
  RR --> MI(["🧠 Model interprets"]):::model
  MI --> O(["💬 Decides next step"]):::output

  classDef model   fill:#312e81,stroke:#818cf8,color:#e2e2f0,stroke-width:2px
  classDef runtime fill:#164e63,stroke:#22d3ee,color:#e2e2f0
  classDef output  fill:#14532d,stroke:#4ade80,color:#e2e2f0

  linkStyle 1,2,3 stroke:#22d3ee,stroke-width:2px
  linkStyle 4,5 stroke:#818cf8,stroke-width:2px

The model appears twice: first emitting the request, then interpreting the result. In between, the runtime does the physical work. That structure isn’t an implementation detail. It’s the central architecture of the system.

The layer that translates

Between the model’s language and the real action, there’s a gap. The model speaks in intentions. The operating system speaks in system calls. The runtime is what crosses that gap in both directions.

Outbound: it takes the model’s structured intent, validates it, converts it into a real operation, and runs it.

Inbound: it takes the raw result of that operation and converts it into something the model can read and use on the next turn.

flowchart LR
  MI(["🧠 Intent"]):::model
  RV(["✔ Validates"]):::runtime
  RE(["⚙ Executes"]):::runtime
  W(["🌍 Filesystem / Shell"]):::world
  RF(["📋 Formats"]):::runtime
  MR(["🧠 Interprets"]):::model

  MI --> RV --> RE --> W --> RF --> MR

  classDef model   fill:#312e81,stroke:#818cf8,color:#e2e2f0,stroke-width:2px
  classDef runtime fill:#164e63,stroke:#22d3ee,color:#e2e2f0
  classDef world   fill:#1c1917,stroke:#78716c,color:#a8a29e

  linkStyle 0,1 stroke:#22d3ee,stroke-width:2px
  linkStyle 2 stroke:#78716c,stroke-width:2px
  linkStyle 3,4 stroke:#818cf8,stroke-width:2px

The key word here is translation.

The model produces an intent. The runtime translates it into an action. The result of that action gets translated back into something the model can read. That translation cycle is what turns a tool from an idea inside a prompt into a real capability of the system.

Chat vs. agent: where language ends

This separation is also the deepest difference between a chat and an agent.

A classic chat lives entirely inside language. It can describe what it would do if it read a file. It can imagine what a command would return. It can produce text that sounds like it verified something.

But it didn’t verify anything.

An agent with real tools can cross that boundary. It stops imagining and starts checking. It can say: I’m not going to guess — I’m going to look.

	Chat	Agent
What it produces	Text about the world	Requests to the runtime
Source of the result	Training + inference	Real execution
Can be wrong about facts	Yes, without knowing	Only about interpretation
Touches files, shell, APIs	No	Yes, through the runtime
Verifies or imagines	Imagines	Verifies

The model still decides everything that matters

Up to this point it might seem like the model is passive — just emitting JSON while the runtime does the real work.

That’s not right.

The runtime executes. But the model decides:

which tool to use
what parameters to call it with
what the returned result means
whether that’s enough to respond or whether it needs another turn

flowchart LR
  D1{{"Which tool?"}}:::decision --> D2{{"What params?"}}:::decision
  D2 --> RT(["⚙ Runtime"]):::runtime
  RT --> D3{{"What does this mean?"}}:::decision
  D3 --> D4{{"Respond or continue?"}}:::decision
  D4 -->|"Continue"| D1
  D4 -->|"Respond"| R(["💬 Response"]):::output

  classDef decision fill:#1c1917,stroke:#f59e0b,color:#fbbf24
  classDef runtime  fill:#164e63,stroke:#22d3ee,color:#e2e2f0
  classDef output   fill:#14532d,stroke:#4ade80,color:#e2e2f0

  linkStyle 4 stroke:#818cf8,stroke-width:2px,stroke-dasharray:5
  linkStyle 5 stroke:#4ade80,stroke-width:2px

The runtime is a box inside that cycle. It executes what the model requests. But the decisions before and after that box belong to the model.

The relationship isn’t model versus runtime. It’s model to decide, runtime to execute.

Why this separation matters

The model/runtime split might look like an engineering detail. But it has direct consequences for three things.

Safety. The model can request to read a file outside the allowed directory. The runtime rejects it. That permission layer doesn’t live inside the model — it lives in the runtime. The model can’t grant itself access.

Reliability. The runtime can validate parameter formats before executing, retry on failure, and limit the scope of an operation. The model doesn’t have to handle those cases — the runtime absorbs them.

Extensibility. Adding a new tool doesn’t require retraining the model. You register it in the runtime and describe it correctly. The model learns to use it from the description, not from a new training run.

What comes next

Now we have the three base pieces of the architecture.

The first post explained that Claude Code is not a chat — it’s a system that works. The second opened the loop. This one opened the action inside that loop.

What we haven’t looked at closely are the most concrete tools the system has: the filesystem and the shell.

Those two tools define most of what an agent can actually do inside a real project. Not as concepts — as physical operations on your machine.

That’s what comes next.