SEAT is a proof-of-concept HTTP service that lets agents build and execute their own tools at runtime. When a caller describes what a tool should do in plain language, SEAT prompts a language model to generate a Python function, validates the generated code against a strict AST-based security policy, persists the approved function in PostgreSQL, and executes it on demand inside an isolated subprocess — all without any human involvement between description and first run.
- Describe — a caller sends a plain-language description of the tool they need (e.g. "convert Celsius to Fahrenheit").
- Generate — SEAT sends the description to a configured LLM (Ollama, OpenAI, or any LLMWire-compatible provider), requesting a `GeneratedCode` structured response that includes the function name, source code, input schema, and output description.
- Validate — the generated code is parsed to an AST and checked against a list of forbidden module imports (`os`, `sys`, `subprocess`, `socket`, `http`, etc.) and forbidden builtin calls (`exec`, `eval`, `open`, `__import__`, etc.). Code that does not pass this check is rejected with HTTP 422.
- Register — the validated function is wrapped in a thin stdin/stdout harness via a Jinja2 template and persisted in the `tools` table with its status set to `active`.
- Execute — the caller posts input data to the tool's execute endpoint; SEAT writes the wrapper script to a temporary file, spawns it as a subprocess with the input piped as JSON to stdin, collects the JSON result from stdout, updates rolling execution statistics, and returns the result.
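The last two steps revolve around the stdin/stdout harness. The sketch below shows roughly what a wrapped tool looks like; the function body and the `harness` helper are illustrative stand-ins (the real wrapper is rendered from SEAT's Jinja2 template and reads `sys.stdin` directly):

```python
import json


def celsius_to_fahrenheit(celsius: float) -> float:
    """What the LLM might generate for 'convert Celsius to Fahrenheit'."""
    return celsius * 9 / 5 + 32


def harness(stdin_text: str) -> str:
    """One JSON object in on stdin, one JSON object out on stdout.

    Shown here as a pure function for clarity; the real harness reads
    sys.stdin, calls the tool with the payload's fields as keyword
    arguments, and prints the result to sys.stdout.
    """
    payload = json.loads(stdin_text)
    result = celsius_to_fahrenheit(**payload)
    return json.dumps({"output": result})
```

Because the contract is plain JSON over pipes, the executor needs no import hooks or RPC machinery — any Python interpreter on the PATH can run a tool.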
```shell
docker compose up -d
```

This starts PostgreSQL 17 on port 5433 with the `seat` user and database.
```shell
pip install -e ".[dev]"
cp .env.example .env   # edit LLM settings as needed
alembic upgrade head
uvicorn seat.api.app:build_app --factory --reload
```

The API is now available at http://localhost:8000. A built-in test GUI is served at the root URL — open http://localhost:8000/ in a browser. The OpenAPI interactive docs are at http://localhost:8000/docs.
```shell
curl -s -X POST http://localhost:8000/tools \
  -H "Content-Type: application/json" \
  -d '{"description": "Convert a temperature from Celsius to Fahrenheit"}' \
  | jq .
```

The response contains the tool's `id`, generated `code`, `input_schema`, and initial statistics.
```shell
TOOL_ID="<id from previous response>"
curl -s -X POST "http://localhost:8000/tools/$TOOL_ID/execute" \
  -H "Content-Type: application/json" \
  -d '{"input_data": {"celsius": 100}}' \
  | jq .
```

A successful response looks like:
```json
{
  "success": true,
  "output": "212.0",
  "error": null,
  "execution_time": 0.031
}
```

```shell
# List all active tools
curl -s "http://localhost:8000/tools?status=active" | jq .

# Search by keyword
curl -s "http://localhost:8000/tools/search?q=temperature" | jq .
```

Open http://localhost:8000/ in a browser. The dark-mode single-page interface provides four tabs:
- Create Tool — enter a description (and optional name), generate a tool via the LLM.
- Tools List — browse, search, filter by status, and delete tools.
- Tool Detail — inspect a tool's code, input schema, statistics, and metadata.
- Execute — select a tool, provide JSON input, run it, and view the result.
The GUI is a single HTML file (src/seat/static/index.html) with no external dependencies. It consumes the same REST API documented below.
| Method | Path | Description |
|---|---|---|
| `GET` | `/` | Serves the built-in test GUI (HTML) |
| `GET` | `/health` | Liveness check — returns `{"status": "ok"}` |
| `POST` | `/tools` | Generate, validate, and register a new tool |
| `GET` | `/tools` | List all tools; optional `?status=active\|deprecated` filter |
| `GET` | `/tools/search?q=<query>` | Case-insensitive substring search on name and description |
| `GET` | `/tools/{id}` | Retrieve a single tool by UUID |
| `POST` | `/tools/{id}/execute` | Execute a tool with JSON input data |
| `DELETE` | `/tools/{id}` | Remove a tool from the registry |
All endpoints return JSON. `POST /tools` returns 201 on success, or 422 when the generated code fails security validation. `DELETE /tools/{id}` returns 204 with no body. Endpoints that take a tool id return 404 when the tool does not exist.
```
HTTP client
     │
     ▼
┌────────────────────────────────────────────────────────┐
│           FastAPI application (seat.api.app)           │
│                                                        │
│   POST /tools              POST /tools/{id}/execute    │
│        │                           │                   │
│        ▼                           ▼                   │
│   ToolGenerator            ToolExecutor                │
│   ┌──────────────┐         ┌───────────────────────┐   │
│   │ LLMClient    │         │ tempfile + subprocess │   │
│   │ (LLMWire)    │         │ JSON stdin / stdout   │   │
│   │ Jinja2 wrap  │         │ configurable timeout  │   │
│   └──────────────┘         └───────────────────────┘   │
│        │                           │                   │
│        ▼                           ▼                   │
│   CodeValidator            ToolRegistry                │
│   ┌──────────────┐         ┌───────────────────────┐   │
│   │ AST parser   │         │ SQLAlchemy async ORM  │   │
│   │ import check │         │ PostgreSQL (asyncpg)  │   │
│   │ builtin check│         │ Alembic migrations    │   │
│   └──────────────┘         └───────────────────────┘   │
└────────────────────────────────────────────────────────┘
```
The lifespan hook wires the engine, session factory, ToolGenerator, and ToolExecutor into app.state once at startup. FastAPI's dependency injection pulls them into each request handler without any global state.
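The pattern can be illustrated framework-free. A minimal sketch of the lifespan idea, assuming nothing about SEAT's actual class or attribute names (the stand-in strings below mark where real services would go):

```python
import asyncio
from contextlib import asynccontextmanager
from types import SimpleNamespace


@asynccontextmanager
async def lifespan(app):
    # Build long-lived services once at startup and park them on
    # app.state; request handlers later pull them via dependencies.
    app.state.engine = "engine"              # stand-in for the SQLAlchemy engine
    app.state.tool_generator = "generator"   # stand-in for ToolGenerator
    app.state.tool_executor = "executor"     # stand-in for ToolExecutor
    try:
        yield  # the application serves requests while suspended here
    finally:
        # Teardown runs exactly once at shutdown.
        app.state.engine = None


async def demo() -> str:
    app = SimpleNamespace(state=SimpleNamespace())
    async with lifespan(app):
        # Inside the lifespan, handlers can read the shared state.
        return app.state.tool_generator
```

FastAPI accepts such an async context manager via `FastAPI(lifespan=...)`, which is what keeps the handlers free of module-level globals.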
SEAT applies a defence-in-depth approach to code that it did not write.
AST validation (pre-persistence). Before any code reaches the database it is parsed to a Python AST. The validator walks every node and rejects the submission if it finds:
- An `import` or `from ... import` statement whose top-level module is one of: `os`, `sys`, `subprocess`, `shutil`, `socket`, `http`, `urllib`, `requests`, `httpx`, `ctypes`, `multiprocessing`, `threading`, `signal`, `importlib`, `pickle`, `shelve`, or `sqlite3`.
- A function call whose name matches any of: `exec`, `eval`, `compile`, `__import__`, `open`, `globals`, `locals`, `getattr`, `setattr`, or `delattr`.
- Missing function definition — code that contains no `def` is not a valid tool.
Validation is purely static; no code is executed during this phase.
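The checks above fit in a compact walk over the parsed tree. A minimal sketch of the technique, with a deliberately shortened deny list (the names `FORBIDDEN_MODULES`, `FORBIDDEN_CALLS`, and `validate` are illustrative, not SEAT's actual identifiers):

```python
import ast

FORBIDDEN_MODULES = {"os", "sys", "subprocess", "socket", "http", "pickle"}
FORBIDDEN_CALLS = {"exec", "eval", "compile", "__import__", "open"}


def validate(source: str) -> list[str]:
    """Return a list of violations; an empty list means the code passed."""
    violations = []
    tree = ast.parse(source)  # static parse only; nothing is executed
    has_def = False
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            has_def = True
        elif isinstance(node, ast.Import):
            for alias in node.names:
                # Only the top-level package matters: 'os.path' -> 'os'
                if alias.name.split(".")[0] in FORBIDDEN_MODULES:
                    violations.append(f"forbidden import: {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in FORBIDDEN_MODULES:
                violations.append(f"forbidden import: {node.module}")
        elif isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name) and node.func.id in FORBIDDEN_CALLS:
                violations.append(f"forbidden call: {node.func.id}")
    if not has_def:
        violations.append("no function definition found")
    return violations
```

Walking every node (rather than only top-level statements) is what catches an `eval` hidden inside a nested function or comprehension.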
Subprocess sandbox (execution). Approved tools run as a separate `python3` process. SEAT passes input as JSON to the process's stdin and reads the result from stdout. The subprocess inherits no database credentials or application secrets. A configurable timeout (default 30 s, controlled by `EXECUTOR_TIMEOUT`) kills the process if it runs too long. The temporary script file is always deleted after execution, even on failure.
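The execution path can be sketched as: write the wrapper to a temp file, run it with a timeout, always clean up. The function name and shape below are illustrative, not SEAT's actual executor:

```python
import json
import os
import subprocess
import sys
import tempfile


def run_tool(wrapper_source: str, input_data: dict, timeout: float = 30.0) -> dict:
    """Run wrapper_source in a fresh interpreter, JSON in on stdin, JSON out on stdout."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(wrapper_source)
        proc = subprocess.run(
            [sys.executable, path],
            input=json.dumps(input_data),  # piped to the child's stdin
            capture_output=True,
            text=True,
            timeout=timeout,  # raises TimeoutExpired and kills the child
        )
        return json.loads(proc.stdout)
    finally:
        os.unlink(path)  # the temp script is deleted even on failure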
All settings are read from environment variables (or a .env file via pydantic-settings):
| Variable | Default | Description |
|---|---|---|
| `DATABASE_URL` | `postgresql+asyncpg://seat:seat@localhost:5433/seat` | Async-compatible PostgreSQL connection URL |
| `LLM_PROVIDER` | `ollama` | LLMWire provider name (`ollama`, `openai`, ...) |
| `LLM_MODEL` | `llama3` | Model identifier passed to the LLM provider |
| `LLM_API_KEY` | (empty) | API key for hosted providers; leave empty for Ollama |
| `EXECUTOR_TIMEOUT` | `30` | Maximum seconds to allow a tool subprocess to run |
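The resolution order is the usual one: an environment variable (or a `.env` entry loaded by pydantic-settings) wins over the documented default. A stdlib-only sketch of that behaviour — SEAT's real settings class is pydantic-based, and `setting`/`DEFAULTS` here are illustrative names:

```python
import os

DEFAULTS = {
    "DATABASE_URL": "postgresql+asyncpg://seat:seat@localhost:5433/seat",
    "LLM_PROVIDER": "ollama",
    "LLM_MODEL": "llama3",
    "LLM_API_KEY": "",
    "EXECUTOR_TIMEOUT": "30",
}


def setting(name: str) -> str:
    # The environment wins; otherwise fall back to the documented default.
    return os.environ.get(name, DEFAULTS[name])
```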
MCP server generation. Each registered tool will be automatically exposed as a Model Context Protocol (MCP) tool descriptor, allowing any MCP-compatible agent (Claude Desktop, Continue, etc.) to discover and invoke SEAT-generated tools without custom API integration.
Docker tool containers. The subprocess executor will gain an optional Docker backend. When enabled, each tool execution runs inside a disposable container with a read-only filesystem, no network access, and strict resource limits — providing OS-level isolation in addition to the existing AST and timeout safeguards.
- MCP Specification: https://spec.modelcontextprotocol.io/
- Toolformer — Language Models Can Teach Themselves to Use Tools: https://arxiv.org/abs/2302.04761
- ART — Automatic multi-step Reasoning and Tool-use for large language models: https://arxiv.org/abs/2303.09014
MIT — see LICENSE.