tzro — Local Offload Engine for AI Agents
A background engine that pairs with your primary AI agent hosts. Offload long-running codebase searches, multi-source web research, and bulk tool executions to local models—saving cloud costs and keeping your main context clean.
Delegate Heavy Tool-Use Sequences Locally
Traditional AI agents waste expensive cloud tokens and context slots running loops of standard tools (reading files, listing directories, scraping pages). tzro is built to run alongside your favorite agent client, delegating complex tool sequences to local, offline models that execute them cheaply, safely, and durably.
The Strategist
- Execution Frequency: Invoked exactly once at task startup.
- Abstract Graph Blueprint: Translates your high-level goal into a coarse, dependency-mapped JSON blueprint defining tool-use constraints.
- Minimal System Overhead: Only processes compressed skill metadata, completely avoiding cloud timeouts or context-window slot thrashing.
- Cost Efficiency: Yields up to 80% prompt reduction by offloading execution steps and raw outputs to local models.
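The blueprint schema is not shown here, so the following is only a minimal sketch of what a dependency-mapped JSON blueprint could look like when decoded in Go. The `Blueprint` and `Node` types and the `goal`, `id`, `tool`, and `depends_on` field names are assumptions made for illustration, not tzro's actual format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Node is a hypothetical blueprint node; the field names are assumptions,
// not tzro's actual schema.
type Node struct {
	ID        string   `json:"id"`
	Tool      string   `json:"tool"`
	DependsOn []string `json:"depends_on"`
}

// Blueprint is a hypothetical container for the planned graph.
type Blueprint struct {
	Goal  string `json:"goal"`
	Nodes []Node `json:"nodes"`
}

// parseBlueprint decodes a blueprint and rejects dependencies that point
// at nodes the blueprint does not define.
func parseBlueprint(raw []byte) (Blueprint, error) {
	var bp Blueprint
	if err := json.Unmarshal(raw, &bp); err != nil {
		return Blueprint{}, fmt.Errorf("decode blueprint: %w", err)
	}
	known := make(map[string]bool, len(bp.Nodes))
	for _, n := range bp.Nodes {
		known[n.ID] = true
	}
	for _, n := range bp.Nodes {
		for _, dep := range n.DependsOn {
			if !known[dep] {
				return Blueprint{}, fmt.Errorf("node %q depends on unknown node %q", n.ID, dep)
			}
		}
	}
	return bp, nil
}

func main() {
	raw := []byte(`{
	  "goal": "summarize repo",
	  "nodes": [
	    {"id": "list", "tool": "list_dir", "depends_on": []},
	    {"id": "read", "tool": "read_file", "depends_on": ["list"]}
	  ]
	}`)
	bp, err := parseBlueprint(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d nodes planned for %q\n", len(bp.Nodes), bp.Goal)
}
```

Validating dependency references at decode time keeps a malformed plan from reaching the executor.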
The Tactician
- Execution Frequency: Runs local tool-use loops in the background per compiled action node.
- GBNF-Constrained Completion: GBNF (GGML BNF) grammars constrain decoding so that generated tool arguments are always syntactically valid.
- 5-Layer Compaction: Compresses large tool responses (up to 85% reduction) on-device before summarizing results.
- Priority Preemption: Wipes background task context slots instantly to serve sub-450ms interactive human chats.
Model Context Protocol (MCP) Setup
Connect tzro directly to your daily AI assistant. Because tzro runs as an MCP server, your coding agent (Claude Desktop, Cursor, Antigravity) can offload long-running workflows, search memory databases, and run local commands in the background.
Build the MCP Server Binary
Compile the server binary target from the root repository directory:
go build -o bin/tzro-mcp ./cmd/tzro-mcp
Register in Client Configuration
Open your workspace or client configuration and add the server definition:
{
  "mcpServers": {
    "tzro": {
      "command": "/absolute/path/to/tzro/bin/tzro-mcp",
      "args": [],
      "env": {
        "PORT": "8080"
      }
    }
  }
}
Run the Handshake Verification Test
Test the server manually from the terminal to verify standard input/output JSON-RPC message framing. Launch the compiled binary:
./bin/tzro-mcp
Paste this initialization JSON block and press Enter. The server should respond with its protocol capabilities metadata:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": { "name": "test-client", "version": "1.0.0" }
  }
}
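MCP's stdio transport delimits messages with newlines, so clients normally send each JSON-RPC message as a single line. Whether tzro-mcp also accepts a pretty-printed, multi-line paste is not stated here. As a hedged sketch, the same initialize request can be produced in compact one-line form with Go's standard `encoding/json`. The struct and function names below are illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clientInfo mirrors the "clientInfo" object from the handshake above.
type clientInfo struct {
	Name    string `json:"name"`
	Version string `json:"version"`
}

// initializeParams mirrors the "params" object from the handshake above.
type initializeParams struct {
	ProtocolVersion string         `json:"protocolVersion"`
	Capabilities    map[string]any `json:"capabilities"`
	ClientInfo      clientInfo     `json:"clientInfo"`
}

// request is a minimal JSON-RPC 2.0 request envelope.
type request struct {
	JSONRPC string           `json:"jsonrpc"`
	ID      int              `json:"id"`
	Method  string           `json:"method"`
	Params  initializeParams `json:"params"`
}

// initializeLine returns the initialize request as compact JSON with no
// embedded newlines, suitable for newline-delimited framing.
func initializeLine() (string, error) {
	req := request{
		JSONRPC: "2.0",
		ID:      1,
		Method:  "initialize",
		Params: initializeParams{
			ProtocolVersion: "2024-11-05",
			Capabilities:    map[string]any{},
			ClientInfo:      clientInfo{Name: "test-client", Version: "1.0.0"},
		},
	}
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	line, err := initializeLine()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(line)
}
```

The printed line can be pasted into the running `./bin/tzro-mcp` session in place of the pretty-printed block.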
Safeguard the Stdio Pipe
Standard input and output are strictly reserved for JSON-RPC serialization framing. Any debug logs or print statements written to stdout will corrupt the protocol stream and drop client connections.
Exposed Protocol Tools
Once registered, tzro exposes its entire execution, memory, and model control capability suite:
- tzro_run: Run durable DAG workflows from natural language
- tzro_status: Inspect status and outcomes of a specific task
- tzro_list_tasks: List recent planning and execution tasks
- tzro_configure_tools: Dynamically register third-party stdio MCP daemons
- tzro_memory_query: Query memory graphs and facts semantically
- tzro_memory_ingest: Store memories in the SQLite vector database
- tzro_kg_neighborhood: Multi-hop graph traversal starting from an entity
- tzro_kg_add_entity: Create/update nodes and edge relationships
- tzro_rag_context: Retrieve graph-RAG context for prompting
- tzro_skills_add: Sync custom procedural micro-skills (SOPs)
- tzro_hook_approve: Resume paused human-in-the-loop task steps
- tzro_client_tool_submit: Submit outputs of client-side executed tools
- tzro_model_list: List downloaded & active local GGUF models
- tzro_model_set: Change active local tactician worker models
Autonomous Agent Offload & Wait Protocol
To avoid high-latency cloud context slot thrashing, client-side agents must delegate non-reasoning tasks to the local durable tzro engine.
The Offload Decision Rule
tzro_run (Local durable DAG executor)
Mandatory Delegation Areas
- Codebase Exploration & Analysis: Codebase navigation or file pattern searching must run via a local Probe Node utilizing reactive Thought Chains.
- Multi-Source Web Research: Batching web searches, scraping, and compiling findings.
- Memory Ingestion Pipelines: Bulk storing memories and vector representations.
Delegation Triggers
Evaluate DAG offloading when the agent is about to make 3 or more sequential external tool calls, or has already made 5 or more in-context calls.
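The trigger reduces to a two-threshold check. A minimal sketch in Go follows. The function and constant names are made up for illustration, and only the thresholds 3 and 5 come from the rule above.

```go
package main

import "fmt"

const (
	// Thresholds taken from the delegation rule above.
	plannedSequentialThreshold = 3
	inContextThreshold         = 5
)

// shouldOffload reports whether the agent should consider delegating to
// tzro_run, given how many sequential external tool calls it is about to
// make and how many in-context calls it has already made.
func shouldOffload(plannedSequential, madeInContext int) bool {
	return plannedSequential >= plannedSequentialThreshold ||
		madeInContext >= inContextThreshold
}

func main() {
	fmt.Println(shouldOffload(3, 0)) // three planned calls: offload
	fmt.Println(shouldOffload(1, 5)) // five calls already made: offload
	fmt.Println(shouldOffload(2, 4)) // below both thresholds: stay in context
}
```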
Wait Protocol & Prompts
The Wait Protocol
- Stop In-Context Execution: Do not call tools or execute parallel steps in the client session.
- Set One-Shot Timer: Schedule a wakeup reminder using the client's timer tool.
- Verify Complete: Check task status via tzro_status; resume when status returns completed.
- Consume Synthesized State: Read only the terminal_synthesis outcome. Avoid reading raw node output properties.
Suggested Prompt Templates
A. Research & Ingestion
"Use web_search to find the latest changes and trends in the AI orchestration space, compile the findings, and save the final structured summary to memory using the save_memory tool."
B. Multi-System Automation
"Execute a workflow to query recent lead records using salesforce_query, run deduplication check with postgres_insert, and post the execution report to the slack_message tool."
C. Codebase Exploration (Probe Node)
"Explore the project at /path/to/repo using a Probe Node. Read the top-level structure, then follow the most important files to understand the architecture. Produce a structured summary covering purpose, major components, key packages, and design patterns. Use read_file, list_dir, and search_files."
1. Background Tool Execution Simulator
Watch how tzro processes offloaded agent tasks. Under the hood, tzro compiles instructions into a Directed Acyclic Graph (DAG) and runs independent steps in parallel, satisfying dependencies dynamically while keeping your agent responsive.
| ID | Subsystem | Engine |
|---|---|---|
| 01 | Durable Execution | SQLite Checkpoints |
| 02 | Kahn Graph Compiler | DAG Topological Sorter |
| 03 | Relational Memory | FTS5 Vector ONNX |
| 04 | WASM Micro-Skills | Isolated Sandbox Wasmtime |
| 05 | Stdio MCP Gateway | Process Host Auto-Healing |
Background Execution Pipeline
Watch how offloaded tasks compile into dependency levels and execute concurrently on local threads.
2. Context Compaction & SQLite Caching
Keep your agent's context pristine. When tools produce large outputs, tzro compresses them on-device (reducing their size by up to 85%) or offloads massive datasets to a local SQLite cache.
Context Compaction & SQLite Cache Playground
Select raw tool data outputs and watch how tzro compresses them into clean, structured layers before returning them.
3. Go SDK & Synchronous Guardrail Hooks
Extend tzro's capabilities with a clean Go SDK. Register custom native tools, subscribe to real-time telemetry, and inject blocking middleware hooks to redact PII, enforce safety guardrails, or pause executions for supervisor checks.
Synchronous Hook Playground
Inject custom Go middleware hook events natively into the active DAG task loop to intercept executions dynamically.
Zero-Setup Quickstart
curl -sSL https://tzro.network/install.sh | bash
tzro --offline
go run cmd/tzrod/main.go