
How Agentix Works

The core pattern

Every Agentix interaction follows the same three-step shape:

AgentixAgentOptions  →  AgentixClient  →  query()  →  stream of Messages
  1. AgentixAgentOptions — declare what the agent is: which provider, which model, which tools, what permissions, how persistent.
  2. AgentixClient — the runtime session. Holds state between turns and owns the connection to the LLM and any MCP servers.
  3. query(prompt) — an async generator. Call it with a prompt; iterate the yielded Message objects to consume the agent's response.
from agentix import AgentixAgentOptions, AgentixClient, AssistantMessage, ResultMessage, TextBlock

options = AgentixAgentOptions(
    provider="anthropic",
    model="claude-sonnet-4-20250514",
    system_prompt="You are a helpful assistant.",
)

async with AgentixClient(options) as client:
    async for msg in client.query("List the files in the current directory."):
        if isinstance(msg, AssistantMessage):
            for block in msg.content:
                if isinstance(block, TextBlock):
                    print(block.text)
        if isinstance(msg, ResultMessage):
            break

The agent loop

When you call query(), Agentix enters an agent loop — a cycle of LLM calls and tool executions that continues until the agent reaches a natural stopping point.

                        ┌─────────────────────────────────┐
                        │           Agent Loop            │
                        │                                 │
prompt ───────────────► │ 1. Build context window         │
                        │ 2. Call LLM                     │
                        │ 3. Parse response               │
                        │    ├─ Text only → stop          │
                        │    └─ Tool calls → execute      │
                        │       ├─ tool result back       │
                        │       └─ loop again ────────────┤
                        │                                 │
                        │ Limits: max_iterations, timeout │
                        └─────────────────────────────────┘
                                         │
                                         ▼
                                   ResultMessage

The loop runs up to max_iterations times (default: 20). Each iteration is one LLM call, which may produce zero or more tool calls. The loop exits when:

  • The LLM produces a response with no tool calls (stop_reason="end_turn")
  • max_iterations is reached (stop_reason="max_iterations")
  • client.interrupt() is called from another task (stop_reason="interrupt")
  • The LLM hits its token limit (stop_reason="max_tokens")
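The control flow above can be sketched in plain Python. This is an illustrative stand-in, not the real Agentix internals: call_llm and run_tool are hypothetical callables you supply, and the return value mimics the stop_reason values listed above.

```python
def agent_loop(call_llm, run_tool, prompt, max_iterations=20):
    """Toy sketch of the agent loop: call the LLM, run any requested
    tools, feed results back, and repeat until a stopping condition."""
    context = [("user", prompt)]
    for _ in range(max_iterations):
        response = call_llm(context)               # one iteration = one LLM call
        context.append(("assistant", response))
        tool_calls = response.get("tool_calls", [])
        if not tool_calls:                         # text only -> natural stop
            return "end_turn", context
        for call in tool_calls:                    # execute and feed results back
            context.append(("tool_result", run_tool(call)))
    return "max_iterations", context               # iteration limit reached
```

The real loop also honors timeouts and interrupts; this sketch only models the two most common exits.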

The message stream

query() yields messages in the order they occur. A complete run always produces this sequence:

UserMessage       ← your prompt (emitted once at the start)
AssistantMessage  ← LLM response for turn 1 (may contain ToolUseBlock)
AssistantMessage  ← LLM response for turn 2
...
ResultMessage     ← final summary; always the last message

When the agent uses tools, the AssistantMessage includes ToolUseBlock entries followed by ToolResultBlock entries showing what each tool returned.

Content blocks inside AssistantMessage

AssistantMessage.content  →  list of content blocks
├── TextBlock(text) — model's text output
├── ThinkingBlock(thinking, signature) — extended reasoning (Anthropic only)
├── ToolUseBlock(id, name, input) — model requesting a tool call
└── ToolResultBlock(tool_use_id, content, is_error) — result fed back to model
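A consumer typically dispatches on block type. The sketch below uses local dataclass stand-ins that mirror the block shapes above (the real classes live in the agentix package), so the dispatch pattern can be shown self-contained:

```python
from dataclasses import dataclass

# Local stand-ins mirroring the block shapes listed above.
@dataclass
class TextBlock:
    text: str

@dataclass
class ToolUseBlock:
    id: str
    name: str
    input: dict

def render(blocks) -> list[str]:
    """Flatten a content list into printable lines, one per block."""
    lines = []
    for block in blocks:
        if isinstance(block, TextBlock):
            lines.append(block.text)
        elif isinstance(block, ToolUseBlock):
            lines.append(f"[tool: {block.name}({block.input})]")
    return lines
```

The same isinstance dispatch extends to ThinkingBlock and ToolResultBlock if you want to surface reasoning or tool output.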

The ResultMessage

ResultMessage is always the last yielded message, regardless of success or failure. It is safe to access even on errors — msg.result never raises.

async for msg in client.query("..."):
    if isinstance(msg, ResultMessage):
        if msg.is_error:
            print(f"Failed: {msg.result} ({msg.subtype})")
        else:
            print(f"Done in {msg.num_turns} turns, {msg.duration_ms}ms")
stop_reason        Meaning
"end_turn"         LLM finished naturally
"max_tokens"       LLM token limit hit
"max_iterations"   Loop iteration limit reached
"interrupt"        client.interrupt() was called

subtype (on error)      Meaning
None                    Success
"error_max_turns"       Exceeded max_iterations
"error_max_budget_usd"  Exceeded max_tokens_budget
"error_timeout"         Agent timed out
"error_unknown"         Unexpected error

Session continuity

AgentixClient automatically maintains session state between calls. You don't track session IDs manually — the client remembers the last session_id from ResultMessage and reuses it on the next query().

async with AgentixClient(options) as client:
    # First turn: the client records the session_id from ResultMessage.
    async for msg in client.query("My name is Alice."):
        pass

    # Second turn reuses the same session, so the agent remembers "Alice".
    async for msg in client.query("What is my name?"):
        if isinstance(msg, ResultMessage):
            print(msg.result)  # "Your name is Alice."

To start fresh, call client.reset_session(). The next query() will open a new session.

Sessions are persisted to disk by default under ~/.agentix/projects. See Sessions for forking, resuming, and listing sessions.


Context window management

As a conversation grows, older messages are automatically summarized and compacted to keep the context window within max_context_tokens (default: 16,384 tokens). The memory_window setting (default: 10) controls how many recent messages are always kept verbatim.

Full history:  [msg1][msg2]...[msg20][msg21]...[msg30]   ← growing
                      │
                      ▼  (exceeds max_context_tokens)
Compacted:     [SUMMARY of msg1..msg20][msg21]...[msg30]

The PreCompact hook fires before compaction so you can log or intervene. See Sessions.
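The compaction rule itself is simple enough to sketch. This is a minimal illustration, not the Agentix implementation; summarize is a hypothetical helper standing in for the LLM-generated summary:

```python
def compact(messages, memory_window=10,
            summarize=lambda old: f"SUMMARY of {len(old)} messages"):
    """Replace everything older than the last `memory_window` messages
    with a single summary entry; recent messages stay verbatim."""
    if len(messages) <= memory_window:
        return list(messages)            # nothing old enough to fold
    old, recent = messages[:-memory_window], messages[-memory_window:]
    return [summarize(old)] + recent
```

In Agentix the trigger is token count (max_context_tokens), not message count; the sketch uses message count only to keep the example self-contained.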


Tools

Agentix comes with built-in tools (Bash, Read, Write, Grep, WebSearch, etc.) that the LLM can call during the agent loop. You can restrict which tools are available, add custom tools, or intercept every tool call via hooks or permission callbacks.

Control access:

options = AgentixAgentOptions(
    allowed_tools=["Read", "Glob", "Grep"],  # only these tools
    disallowed_tools=["Bash"],               # always blocked
)

Add a custom tool:

@client.tool(name="GetWeather", description="Get current weather for a city.")
async def get_weather(city: str) -> str:
    return f"Sunny, 22°C in {city}"

See Tools & Permissions for the full built-in tool list and permission modes.


One-shot vs. multi-turn

One-shot — use the top-level query() function when you don't need session continuity:

from agentix import query, AgentixAgentOptions, ResultMessage

async for msg in query("What is 2 + 2?", options=AgentixAgentOptions()):
    if isinstance(msg, ResultMessage):
        print(msg.result)

Each call to the top-level query() creates and destroys a new client. Use AgentixClient when you need multi-turn sessions, custom tools, or lifecycle control.


Key limits and their defaults

Setting              Default  What it controls
max_iterations       20       Max agent loop iterations per query()
max_tokens           4096     Max tokens per LLM response
max_context_tokens   16384    Triggers context compaction when exceeded
tool_timeout         120s     Max time a single tool call may take
retry_max_attempts   3        LLM call retries on transient failures
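All of these are set on AgentixAgentOptions. A sketch overriding each one, with field names taken from the table above (the specific values are illustrative):

```python
options = AgentixAgentOptions(
    max_iterations=10,         # stop the agent loop earlier
    max_tokens=2048,           # shorter LLM responses
    max_context_tokens=32768,  # compact the context later
    tool_timeout=60,           # seconds allowed per tool call
    retry_max_attempts=5,      # retry transient LLM failures harder
)
```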