How Agentix Works
The core pattern
Every Agentix interaction follows the same three-step shape:
AgentixAgentOptions → AgentixClient → query() → stream of Messages
- AgentixAgentOptions — declares what the agent is: which provider, which model, which tools, what permissions, how persistent.
- AgentixClient — the runtime session. Holds state between turns and owns the connection to the LLM and any MCP servers.
- query(prompt) — an async generator. Call it with a prompt; iterate the yielded Message objects to consume the agent's response.
from agentix import AgentixAgentOptions, AgentixClient, AssistantMessage, ResultMessage, TextBlock

options = AgentixAgentOptions(
    provider="anthropic",
    model="claude-sonnet-4-20250514",
    system_prompt="You are a helpful assistant.",
)

async with AgentixClient(options) as client:
    async for msg in client.query("List the files in the current directory."):
        if isinstance(msg, AssistantMessage):
            for block in msg.content:
                if isinstance(block, TextBlock):
                    print(block.text)
        if isinstance(msg, ResultMessage):
            break
The agent loop
When you call query(), Agentix enters an agent loop — a cycle of LLM calls and tool executions that continues until the agent reaches a natural stopping point.
┌─────────────────────────────────┐
│ Agent Loop │
│ │
prompt ──────────────►│ 1. Build context window │
│ 2. Call LLM │
│ 3. Parse response │
│ ├─ Text only → stop │
│ └─ Tool calls → execute │
│ ├─ tool result back │
│ └─ loop again ───────────┤
│ │
│ Limits: max_iterations, timeout │
└─────────────────────────────────┘
│
▼
ResultMessage
The loop runs up to max_iterations times (default: 20). Each iteration is one LLM call, which may produce zero or more tool calls. The loop exits when:
- The LLM produces a response with no tool calls (stop_reason="end_turn")
- max_iterations is reached (stop_reason="max_iterations")
- client.interrupt() is called from another task (stop_reason="interrupt")
- The LLM hits its token limit (stop_reason="max_tokens")
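The control flow above can be sketched in plain Python. Everything here is a stand-in (no Agentix imports; call_llm and run_tool are hypothetical callables), meant only to make the stop conditions concrete:

```python
from dataclasses import dataclass, field


@dataclass
class LLMResponse:
    text: str
    tool_calls: list = field(default_factory=list)  # names of requested tools


def agent_loop(prompt, call_llm, run_tool, max_iterations=20):
    """Minimal agent loop: call the LLM, run any requested tools,
    feed the results back, and stop on a natural end or the iteration cap."""
    context = [("user", prompt)]
    for _ in range(max_iterations):
        response = call_llm(context)            # steps 1-2: build context, call LLM
        context.append(("assistant", response.text))
        if not response.tool_calls:             # step 3: text only -> stop
            return context, "end_turn"
        for name in response.tool_calls:        # step 3: execute tools, loop again
            context.append(("tool_result", run_tool(name)))
    return context, "max_iterations"            # iteration cap reached
```

With a fake LLM that requests one tool and then finishes, the loop exits with "end_turn"; an LLM that always requests tools exhausts the cap and exits with "max_iterations".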
The message stream
query() yields messages in the order they occur. A full run always includes each of these message types:
UserMessage ← your prompt (emitted once at the start)
AssistantMessage ← LLM response for turn 1 (may contain ToolUseBlock)
AssistantMessage ← LLM response for turn 2
...
ResultMessage ← final summary; always the last message
When the agent uses tools, the AssistantMessage includes ToolUseBlock entries followed by ToolResultBlock entries showing what each tool returned.
Content blocks inside AssistantMessage
AssistantMessage.content → list of content blocks
├── TextBlock(text) — model's text output
├── ThinkingBlock(thinking, signature) — extended reasoning (Anthropic only)
├── ToolUseBlock(id, name, input) — model requesting a tool call
└── ToolResultBlock(tool_use_id, content, is_error) — result fed back to model
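Consumers typically dispatch on block type with isinstance checks. A minimal sketch using stand-in dataclasses (not the real Agentix classes, but carrying the same fields as the tree above):

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    text: str


@dataclass
class ToolUseBlock:
    id: str
    name: str
    input: dict


def render(blocks):
    """Collect printable text and pending tool requests from a content list."""
    text, tool_requests = [], []
    for block in blocks:
        if isinstance(block, TextBlock):
            text.append(block.text)
        elif isinstance(block, ToolUseBlock):
            tool_requests.append(block.name)
    return " ".join(text), tool_requests
```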
The ResultMessage
ResultMessage is always the last yielded message, regardless of success or failure. It is safe to access even on errors — msg.result never raises.
async for msg in client.query("..."):
    if isinstance(msg, ResultMessage):
        if msg.is_error:
            print(f"Failed: {msg.result} ({msg.subtype})")
        else:
            print(f"Done in {msg.num_turns} turns, {msg.duration_ms}ms")
| stop_reason | Meaning |
|---|---|
| "end_turn" | LLM finished naturally |
| "max_tokens" | LLM token limit hit |
| "max_iterations" | Loop iteration limit reached |
| "interrupt" | client.interrupt() was called |
| subtype (on error) | Meaning |
|---|---|
| None | Success |
| "error_max_turns" | Exceeded max_iterations |
| "error_max_budget_usd" | Exceeded max_tokens_budget |
| "error_timeout" | Agent timed out |
| "error_unknown" | Unexpected error |
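Taken together, the two tables suggest a small logging helper. A sketch in plain Python, using only the subtype strings listed above:

```python
ERROR_SUBTYPES = {
    "error_max_turns": "agent hit max_iterations",
    "error_max_budget_usd": "agent exceeded its budget",
    "error_timeout": "agent timed out",
    "error_unknown": "unexpected error",
}


def describe(subtype):
    """Map a ResultMessage subtype to a log-friendly description."""
    if subtype is None:
        return "success"
    return ERROR_SUBTYPES.get(subtype, "unexpected error")
```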
Session continuity
AgentixClient automatically maintains session state between calls. You don't track session IDs manually — the client remembers the last session_id from ResultMessage and reuses it on the next query().
async with AgentixClient(options) as client:
    async for msg in client.query("My name is Alice."):
        pass  # drain the first turn
    # Client reuses the same session — the agent remembers "Alice"
    async for msg in client.query("What is my name?"):
        if isinstance(msg, ResultMessage):
            print(msg.result)  # "Your name is Alice."
To start fresh: client.reset_session(). The next query() will open a new session.
Sessions are persisted to disk by default under ~/.agentix/projects. See Sessions for forking, resuming, and listing sessions.
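The bookkeeping described here amounts to remembering the last session id and clearing it on reset. A stand-in sketch of that behavior (not the real AgentixClient implementation):

```python
import uuid


class SessionTracker:
    """Mimics the client's session bookkeeping: reuse the last
    session id on the next query, or mint a new one after a reset."""

    def __init__(self):
        self.session_id = None

    def next_query_session(self):
        if self.session_id is None:
            self.session_id = str(uuid.uuid4())  # open a new session
        return self.session_id                   # reused across turns

    def reset_session(self):
        self.session_id = None
```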
Context window management
As a conversation grows, older messages are automatically summarized and compacted to keep the context window within max_context_tokens (default: 16,384 tokens). The memory_window setting (default: 10) controls how many recent messages are always kept verbatim.
Full history: [msg1][msg2]...[msg20][msg21]...[msg30] ← growing
│
▼ (exceeds max_context_tokens)
Compacted: [SUMMARY of msg1..msg20][msg21]...[msg30]
The PreCompact hook fires before compaction so you can log or intervene. See Sessions.
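The compaction rule can be sketched as: once the estimated token count exceeds the limit, fold everything older than the last memory_window messages into one summary entry. A simplified sketch, with stand-in token counting and summarization:

```python
def compact(messages, max_context_tokens=16384, memory_window=10,
            count_tokens=lambda m: len(m.split()),
            summarize=lambda msgs: "SUMMARY of %d messages" % len(msgs)):
    """Summarize older messages once the context exceeds the token limit,
    keeping the last `memory_window` messages verbatim."""
    total = sum(count_tokens(m) for m in messages)
    if total <= max_context_tokens or len(messages) <= memory_window:
        return messages
    old, recent = messages[:-memory_window], messages[-memory_window:]
    return [summarize(old)] + recent
```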
Tools
Agentix comes with built-in tools (Bash, Read, Write, Grep, WebSearch, etc.) that the LLM can call during the agent loop. You can restrict which tools are available, add custom tools, or intercept every tool call via hooks or permission callbacks.
Control access:
options = AgentixAgentOptions(
    allowed_tools=["Read", "Glob", "Grep"],  # only these tools
    disallowed_tools=["Bash"],               # always blocked
)
Add a custom tool:
@client.tool(name="GetWeather", description="Get current weather for a city.")
async def get_weather(city: str) -> str:
    return f"Sunny, 22°C in {city}"
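A decorator like this typically just records the function and its metadata in a registry that the agent loop consults. A stand-in sketch of that mechanic (synchronous for brevity, unlike the async tool above; not the real Agentix internals):

```python
class ToolRegistry:
    """Minimal sketch of a decorator-based tool registry."""

    def __init__(self):
        self.tools = {}

    def tool(self, name, description=""):
        def register(fn):
            # store the callable plus metadata under the tool's name
            self.tools[name] = {"fn": fn, "description": description}
            return fn
        return register


registry = ToolRegistry()


@registry.tool(name="GetWeather", description="Get current weather for a city.")
def get_weather(city):
    return f"Sunny, 22°C in {city}"
```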
See Tools & Permissions for the full built-in tool list and permission modes.
One-shot vs. multi-turn
One-shot — use the top-level query() function when you don't need session continuity:
from agentix import query, AgentixAgentOptions, ResultMessage
async for msg in query("What is 2 + 2?", options=AgentixAgentOptions()):
    if isinstance(msg, ResultMessage):
        print(msg.result)
Each call to the top-level query() creates and destroys a new client. Use AgentixClient when you need multi-turn sessions, custom tools, or lifecycle control.
Key limits and their defaults
| Setting | Default | What it controls |
|---|---|---|
| max_iterations | 20 | Max agent loop iterations per query() |
| max_tokens | 4096 | Max tokens per LLM response |
| max_context_tokens | 16384 | Triggers context compaction when exceeded |
| tool_timeout | 120s | Max time a single tool call may take |
| retry_max_attempts | 3 | LLM call retries on transient failures |