tzro
COMPLEMENTARY AGENTIC COMPANION

tzro — Local Offload Engine for AI Agents

A background engine that pairs with your primary AI agent hosts. Offload long-running codebase searches, multi-source web research, and bulk tool executions to local models, saving cloud costs and keeping your main context clean.

Architectural Shift

Delegate Heavy Tool-Use Sequences Locally

Traditional AI agents waste expensive cloud tokens and context slots running loops of standard tools (reading files, listing directories, scraping pages). tzro is built to run alongside your favorite agent client, delegating complex tool sequences to local, offline models that execute them cheaply, safely, and durably.

The Strategist

Cloud Frontier Agent (e.g. Gemini 3.5 Flash)
  • Execution Frequency: Invoked exactly once at task startup.
  • Abstract Graph Blueprint: Translates your high-level goal into a coarse, dependency-mapped JSON blueprint defining tool-use constraints.
  • Minimal System Overhead: Only processes compressed skill metadata, completely avoiding cloud timeouts or context-window slot thrashing.
  • Cost Efficiency: Yields up to 80% prompt reduction by offloading execution steps and raw outputs to local models.

The Tactician

Local GGUF llama-server (4B/8B Model)
  • Execution Frequency: Runs local tool-use loops in the background per compiled action node.
  • GBNF-Constrained Completion: GBNF grammars (llama.cpp's Backus-Naur-style format) constrain decoding so that tool arguments are always syntactically valid.
  • 5-Layer Compaction: Compresses large tool responses (up to 85% reduction) on-device before summarizing results.
  • Priority Preemption: Wipes background task context slots instantly to serve sub-450ms interactive human chats.
Protocol Integration

Model Context Protocol (MCP) Setup

Connect tzro directly to your daily AI assistant. Because tzro runs as an MCP server, your coding agent (Claude Desktop, Cursor, Antigravity) can offload long-running workflows, search memory databases, and run local commands in the background.

1. Build the MCP Server Binary

Compile the server binary target from the root repository directory:

go build -o bin/tzro-mcp ./cmd/tzro-mcp

2. Register in Client Configuration

Add the server definition to your client's or workspace's MCP configuration:

{
  "mcpServers": {
    "tzro": {
      "command": "/absolute/path/to/tzro/bin/tzro-mcp",
      "args": [],
      "env": {
        "PORT": "8080"
      }
    }
  }
}

3. Run the Handshake Verification Test

Test the server manually from the terminal to verify standard input/output JSON-RPC message framing. Launch the compiled binary:

./bin/tzro-mcp

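Paste this initialization JSON block and press Enter. The server should respond with its protocol capabilities metadata. MCP's stdio transport delimits messages with newlines, so if the server does not answer the multi-line form, collapse the request onto a single line first: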

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": { "name": "test-client", "version": "1.0.0" }
  }
}

4. Safeguard the Stdio Pipe

Standard input and output are strictly reserved for JSON-RPC serialization framing. Any debug logs or print statements written to stdout will corrupt the protocol stream and drop client connections.

CRITICAL WARNING: All internal logging, telemetry events, and initialization warnings in tzro are redirected to stderr. When writing custom middleware, tools, or extensions, always use fmt.Fprintf(os.Stderr, ...).

Exposed Protocol Tools

Once registered, tzro exposes its entire execution, memory, and model control capability suite:

  • tzro_run: Run durable DAG workflows from natural language
  • tzro_status: Inspect status and outcomes of a specific task
  • tzro_list_tasks: List recent planning and execution tasks
  • tzro_configure_tools: Dynamically register third-party stdio MCP daemons
  • tzro_memory_query: Query memory graphs and facts semantically
  • tzro_memory_ingest: Store memories in the SQLite vector database
  • tzro_kg_neighborhood: Multi-hop graph traversal starting from an entity
  • tzro_kg_add_entity: Create/update nodes and edge relationships
  • tzro_rag_context: Retrieve graph-RAG context for prompting
  • tzro_skills_add: Sync custom procedural micro-skills (SOPs)
  • tzro_hook_approve: Resume paused human-in-the-loop task steps
  • tzro_client_tool_submit: Submit outputs of client-side executed tools
  • tzro_model_list: List downloaded & active local GGUF models
  • tzro_model_set: Change active local tactician worker models
AI Agent Integration

Autonomous Agent Offload & Wait Protocol

To avoid high-latency cloud context slot thrashing, client-side agents must delegate non-reasoning tasks to the local durable tzro engine.

The Offload Decision Rule

"Do I need frontier-model-exclusive reasoning (code generation, complex architectural judgment, interactive human dialog) about intermediate outputs to decide the next step?"
  • YES: Keep execution in-context (frontier cloud reasoning).
  • NO: Offload to tzro_run (local durable DAG executor).

Mandatory Delegation Areas

  • Codebase Exploration & Analysis: Codebase navigation or file pattern searching must run via a local Probe Node utilizing reactive Thought Chains.
  • Multi-Source Web Research: Batching web searches, scraping, and compiling findings.
  • Memory Ingestion Pipelines: Bulk storing memories and vector representations.

Delegation Triggers

Evaluate DAG offloading when the agent is about to make 3 or more sequential external tool calls, or has already made 5 or more in-context calls.
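
The decision rule and the triggers can be combined into a single predicate. The sketch below is one reading of them, not tzro's implementation: it assumes the triggers decide when to evaluate and the decision rule decides the outcome. The names agentStep and shouldOffload are hypothetical; the thresholds 3 and 5 come from the text above.

```go
package main

import "fmt"

const (
	plannedCallThreshold   = 3 // "3 or more sequential external tool calls"
	inContextCallThreshold = 5 // "5 or more in-context calls"
)

// agentStep is a hypothetical summary of what the client agent knows
// before choosing its next action.
type agentStep struct {
	NeedsFrontierReasoning bool // answer to the Offload Decision Rule question
	PlannedSequentialCalls int  // external tool calls about to be made in sequence
	InContextCallsSoFar    int  // tool calls already made in the client session
}

// triggered reports whether either delegation trigger has fired.
func triggered(s agentStep) bool {
	return s.PlannedSequentialCalls >= plannedCallThreshold ||
		s.InContextCallsSoFar >= inContextCallThreshold
}

// shouldOffload applies the decision rule once a trigger has fired.
func shouldOffload(s agentStep) bool {
	return triggered(s) && !s.NeedsFrontierReasoning
}

func main() {
	search := agentStep{PlannedSequentialCalls: 4}
	design := agentStep{NeedsFrontierReasoning: true, PlannedSequentialCalls: 4}
	fmt.Println(shouldOffload(search), shouldOffload(design)) // true false
}
```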

Wait Protocol & Prompts

The Wait Protocol

  1. Stop In-Context Execution: Do not call tools or execute parallel steps in the client session.
  2. Set One-Shot Timer: Schedule a wakeup reminder using the client's timer tool.
  3. Verify Complete: Check task status via tzro_status; resume when status returns completed.
  4. Consume Synthesized State: Read only the terminal_synthesis outcome. Avoid reading raw node output properties.

Suggested Prompt Templates

A. Research & Ingestion
"Use web_search to find the latest changes and trends in the AI orchestration space, compile the findings, and save the final structured summary to memory using the save_memory tool."

B. Multi-System Automation
"Execute a workflow to query recent lead records using salesforce_query, run a deduplication check with postgres_insert, and post the execution report to the slack_message tool."

C. Codebase Exploration (Probe Node)
"Explore the project at /path/to/repo using a Probe Node. Read the top-level structure, then follow the most important files to understand the architecture. Produce a structured summary covering purpose, major components, key packages, and design patterns. Use read_file, list_dir, and search_files."
Engine Core

1. Background Tool Execution Simulator

Watch how tzro processes offloaded agent tasks. Under the hood, tzro compiles instructions into a Directed Acyclic Graph (DAG) and runs independent steps in parallel, satisfying dependencies dynamically while keeping your agent responsive.

tzro_daemon_monitor.sh
$ tzro status
[OK] SQLite Persistence Layer Initialized: ~/.tzro/data/tzro.db
[OK] Llama-Server Tactician Linked: port 36888 [Active Slots: 1]
[OK] Eino Strategist Adapter Ready: mode = cooperative
$ tzro list-subsystems
ID  Subsystem            Engine
01  Durable Execution    SQLite Checkpoints
02  Kahn Graph Compiler  DAG Topological Sorter
03  Relational Memory    FTS5 Vector ONNX
04  WASM Micro-Skills    Isolated Sandbox Wasmtime
05  Stdio MCP Gateway    Process Host Auto-Healing

Background Execution Pipeline

Watch how offloaded tasks compile into dependency levels and execute concurrently on local threads.

1. Cloud Plan: Strategist (Gemini), Goal Intent Routing
2. Level 0 (Parallel)
  • node_01 [Go]: archive_files
  • node_02 [WASM]: fetch_user_details
3. Level 1 (Dependent)
  • node_03 [MCP]: send_team_alert
Optimization

2. Context Compaction & SQLite Caching

Keep your agent's context pristine. When tools produce large outputs, tzro compresses them on-device (reducing their context footprint by up to 85%) or offloads massive datasets to a local SQLite cache.

Context Compaction & SQLite Cache Playground

Select raw tool data outputs and watch how tzro compresses them into clean, structured layers before returning them.

  • Layer 0: Binary Pruning
  • Layer 1: HTML-to-Markdown
  • Layer 2: TSV Tabular Hoisting
  • Layer 3: KV Flattening
  • Layer 4: Dot-Notation Compressor
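
The page names the compaction layers without describing their algorithms. As one plausible reading of the flattening and dot-notation layers, the sketch below collapses nested JSON into sorted dot-notation key=value lines. It is an assumption-laden illustration, not tzro's compactor; the functions flatten, joinKey, and compact are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// joinKey appends key to prefix using dot notation.
func joinKey(prefix, key string) string {
	if prefix == "" {
		return key
	}
	return prefix + "." + key
}

// flatten walks a decoded JSON value and records every leaf under its
// dot-notation path. Array elements are addressed by index.
func flatten(prefix string, v any, out map[string]string) {
	switch t := v.(type) {
	case map[string]any:
		for k, child := range t {
			flatten(joinKey(prefix, k), child, out)
		}
	case []any:
		for i, child := range t {
			flatten(joinKey(prefix, strconv.Itoa(i)), child, out)
		}
	default:
		out[prefix] = fmt.Sprint(t) // scalar leaf
	}
}

// compact renders raw JSON as sorted "path=value" lines.
func compact(raw []byte) (string, error) {
	var doc any
	if err := json.Unmarshal(raw, &doc); err != nil {
		return "", err
	}
	flat := map[string]string{}
	flatten("", doc, flat)

	keys := make([]string, 0, len(flat))
	for k := range flat {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable output regardless of map iteration order

	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%s=%s\n", k, flat[k])
	}
	return b.String(), nil
}

func main() {
	out, err := compact([]byte(`{"user":{"name":"ada","roles":["admin","dev"]},"active":true}`))
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

For the sample input this prints four lines: active=true, user.name=ada, user.roles.0=admin, and user.roles.1=dev. Empty objects and arrays produce no lines in this sketch, which a real compactor might want to handle differently.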


                    
                    
                  
Developer SDK

3. Go SDK & Synchronous Guardrail Hooks

Extend tzro's capabilities with a clean Go SDK. Register custom native tools, subscribe to real-time telemetry, and inject blocking middleware hooks to redact PII, enforce safety guardrails, or pause executions for supervisor checks.


Synchronous Hook Playground

Inject custom Go middleware hook events natively into the active DAG task loop to intercept executions dynamically.

  • Redacts credentials and SSN values inside AfterNode
  • Intercepts and skips delete_all_records tool nodes
  • Pauses compiled tasks at Level 1 to prompt supervisor checks
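
The first two hook behaviours can be sketched in plain Go. None of the types or function signatures below come from tzro's SDK, whose actual API is not shown on this page: nodeResult, decision, beforeNode, and afterNode are hypothetical stand-ins, with afterNode loosely named after the AfterNode hook mentioned above. The SSN pattern is a simple illustrative regex, not a complete PII detector.

```go
package main

import (
	"fmt"
	"regexp"
)

// nodeResult is a hypothetical stand-in for a node's tool output.
type nodeResult struct {
	Tool   string
	Output string
}

// decision is a hypothetical verdict returned by a blocking pre-execution hook.
type decision int

const (
	proceed decision = iota
	skip
)

// ssnPattern matches US SSN-shaped values such as 123-45-6789.
var ssnPattern = regexp.MustCompile(`\b\d{3}-\d{2}-\d{4}\b`)

// beforeNode blocks the destructive tool named in the text above.
func beforeNode(tool string) decision {
	if tool == "delete_all_records" {
		return skip
	}
	return proceed
}

// afterNode redacts SSN-shaped values before the output leaves the node.
func afterNode(r nodeResult) nodeResult {
	r.Output = ssnPattern.ReplaceAllString(r.Output, "[REDACTED-SSN]")
	return r
}

func main() {
	fmt.Println(beforeNode("delete_all_records") == skip)
	res := afterNode(nodeResult{Tool: "fetch_user_details", Output: "ssn 123-45-6789 on file"})
	fmt.Println(res.Output)
}
```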

Zero-Setup Quickstart

1. Install CLI Engine
curl -sSL https://tzro.network/install.sh | bash
2. Launch Telemetry TUI
tzro --offline
3. Start Web UI Daemon
go run cmd/tzrod/main.go