alexmar07/seat

SEAT — Self-Extending Agent Toolkit

SEAT is a proof-of-concept HTTP service that lets agents build and execute their own tools at runtime. When a caller describes what a tool should do in plain language, SEAT prompts a language model to generate a Python function, validates the generated code against a strict AST-based security policy, persists the approved function in PostgreSQL, and executes it on demand inside an isolated subprocess — all without any human involvement between description and first run.

How It Works

  1. Describe — a caller sends a plain-language description of the tool they need (e.g. "convert Celsius to Fahrenheit").
  2. Generate — SEAT sends the description to a configured LLM (Ollama, OpenAI, or any LLMWire-compatible provider) requesting a GeneratedCode structured response that includes the function name, source code, input schema, and output description.
  3. Validate — the generated code is parsed to an AST and checked against a list of forbidden module imports (os, sys, subprocess, socket, http, etc.) and forbidden builtin calls (exec, eval, open, __import__, etc.). Code that does not pass this check is rejected with HTTP 422.
  4. Register — the validated function is wrapped in a thin stdin/stdout harness via a Jinja2 template and persisted in the tools table with its status set to active.
  5. Execute — the caller posts input data to the tool's execute endpoint; SEAT writes the wrapper script to a temporary file, spawns it as a subprocess with the input piped as JSON to stdin, collects the JSON result from stdout, updates rolling execution statistics, and returns the result.
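
Steps 4 and 5 both depend on the stdin/stdout harness. The following is a minimal sketch of what such a wrapper might look like, not SEAT's actual template: it uses the standard library's string.Template in place of Jinja2 so that it runs without dependencies, and the names HARNESS_TEMPLATE and render_wrapper are hypothetical.

```python
import json
import subprocess
import sys
from string import Template

# Hypothetical harness: defines the tool function, reads JSON keyword
# arguments from stdin, and prints the JSON-encoded result to stdout.
HARNESS_TEMPLATE = Template('''\
import json
import sys

$source

if __name__ == "__main__":
    kwargs = json.load(sys.stdin)
    print(json.dumps($function_name(**kwargs)))
''')


def render_wrapper(function_name: str, source: str) -> str:
    """Embed already-validated tool source in the stdin/stdout harness."""
    return HARNESS_TEMPLATE.substitute(function_name=function_name, source=source)


if __name__ == "__main__":
    tool_source = (
        "def celsius_to_fahrenheit(celsius):\n"
        "    return celsius * 9 / 5 + 32\n"
    )
    script = render_wrapper("celsius_to_fahrenheit", tool_source)
    # Run the rendered script as step 5 describes: JSON in, JSON out.
    proc = subprocess.run(
        [sys.executable, "-c", script],
        input=json.dumps({"celsius": 100}),
        capture_output=True,
        text=True,
        check=True,
    )
    print(proc.stdout.strip())
```

Keeping the harness this thin means the generated function never touches I/O itself; the wrapper alone decides how input arrives and how output leaves.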

Quick Start

Start the database

docker compose up -d

This starts PostgreSQL 17 on port 5433 with the seat user and database.

Install and migrate

pip install -e ".[dev]"
cp .env.example .env          # edit LLM settings as needed
alembic upgrade head

Start the API server

uvicorn seat.api.app:build_app --factory --reload

The API is now available at http://localhost:8000. A built-in test GUI is served at the root URL — open http://localhost:8000/ in a browser. The OpenAPI interactive docs are at http://localhost:8000/docs.

Create a tool

curl -s -X POST http://localhost:8000/tools \
  -H "Content-Type: application/json" \
  -d '{"description": "Convert a temperature from Celsius to Fahrenheit"}' \
  | jq .

The response contains the tool's id, generated code, input_schema, and initial statistics.

Execute the tool

TOOL_ID="<id from previous response>"

curl -s -X POST "http://localhost:8000/tools/$TOOL_ID/execute" \
  -H "Content-Type: application/json" \
  -d '{"input_data": {"celsius": 100}}' \
  | jq .

A successful response looks like:

{
  "success": true,
  "output": "212.0",
  "error": null,
  "execution_time": 0.031
}
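
A caller may prefer to parse this response into a typed structure. The sketch below takes its field names from the response above; the ExecutionResult class and parse_execution_result function are hypothetical, and the field types are assumptions based on the single example shown.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExecutionResult:
    success: bool
    output: Optional[str]   # assumed to be a string, as in the example
    error: Optional[str]
    execution_time: float


def parse_execution_result(body: str) -> ExecutionResult:
    """Turn the JSON body of an execute response into an ExecutionResult."""
    data = json.loads(body)
    return ExecutionResult(
        success=bool(data["success"]),
        output=data.get("output"),
        error=data.get("error"),
        execution_time=float(data["execution_time"]),
    )


if __name__ == "__main__":
    body = '{"success": true, "output": "212.0", "error": null, "execution_time": 0.031}'
    result = parse_execution_result(body)
    print(result.success, result.output)
```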

List and search tools

# List all active tools
curl -s "http://localhost:8000/tools?status=active" | jq .

# Search by keyword
curl -s "http://localhost:8000/tools/search?q=temperature" | jq .

Test GUI

Open http://localhost:8000/ in a browser. The dark-mode single-page interface provides four tabs:

  • Create Tool — enter a description (and optional name), generate a tool via the LLM.
  • Tools List — browse, search, filter by status, and delete tools.
  • Tool Detail — inspect a tool's code, input schema, statistics, and metadata.
  • Execute — select a tool, provide JSON input, run it, and view the result.

The GUI is a single HTML file (src/seat/static/index.html) with no external dependencies. It consumes the same REST API documented below.

API Reference

Method  Path                     Description
GET     /                        Serves the built-in test GUI (HTML)
GET     /health                  Liveness check; returns {"status": "ok"}
POST    /tools                   Generate, validate, and register a new tool
GET     /tools                   List all tools; optional ?status=active|deprecated filter
GET     /tools/search?q=<query>  Case-insensitive substring search on name and description
GET     /tools/{id}              Retrieve a single tool by UUID
POST    /tools/{id}/execute      Execute a tool with JSON input data
DELETE  /tools/{id}              Remove a tool from the registry

Apart from GET /, which serves HTML, endpoints return JSON. POST /tools returns 201 on success and 422 when the generated code fails security validation. DELETE /tools/{id} returns 204 with no body. Endpoints that take a tool id return 404 when the tool does not exist.
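
For callers that would rather use Python than curl, here is a minimal client sketch built on the standard library's urllib. The helper names build_request and send are hypothetical, and BASE_URL assumes the default address from the Quick Start.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Optional, Tuple

BASE_URL = "http://localhost:8000"  # default address from the Quick Start


def build_request(method: str, path: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a request for a SEAT endpoint, JSON-encoding the payload if given."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


def send(request: urllib.request.Request) -> Tuple[int, Any]:
    """Send a request and return (status code, decoded body or None)."""
    try:
        with urllib.request.urlopen(request) as response:
            status, raw = response.status, response.read()
    except urllib.error.HTTPError as exc:
        # urllib raises for 4xx/5xx; the status code and body are still available.
        status, raw = exc.code, exc.read()
    if not raw:
        return status, None  # e.g. 204 from DELETE
    try:
        return status, json.loads(raw)
    except ValueError:
        return status, raw.decode("utf-8", errors="replace")  # e.g. HTML from GET /
```

With a server running, send(build_request("POST", "/tools", {"description": "Convert a temperature from Celsius to Fahrenheit"})) should yield status 201 and the new tool record, while a 404 or 422 comes back as an ordinary status code rather than an exception.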

Architecture

HTTP client
     │
     ▼
┌────────────────────────────────────────────────────────┐
│  FastAPI application  (seat.api.app)                   │
│                                                        │
│  POST /tools                POST /tools/{id}/execute   │
│       │                              │                 │
│       ▼                              ▼                 │
│  ToolGenerator              ToolExecutor               │
│  ┌──────────────┐           ┌───────────────────────┐  │
│  │ LLMClient    │           │ tempfile + subprocess │  │
│  │ (LLMWire)    │           │ JSON stdin / stdout   │  │
│  │ Jinja2 wrap  │           │ configurable timeout  │  │
│  └──────────────┘           └───────────────────────┘  │
│       │                              │                 │
│       ▼                              ▼                 │
│  CodeValidator              ToolRegistry               │
│  ┌──────────────┐           ┌───────────────────────┐  │
│  │ AST parser   │           │ SQLAlchemy async ORM  │  │
│  │ import check │           │ PostgreSQL (asyncpg)  │  │
│  │ builtin check│           │ Alembic migrations    │  │
│  └──────────────┘           └───────────────────────┘  │
└────────────────────────────────────────────────────────┘

The lifespan hook wires the engine, session factory, ToolGenerator, and ToolExecutor into app.state once at startup. FastAPI's dependency injection pulls them into each request handler without any global state.

Security Model

SEAT applies a defence-in-depth approach to code that it did not write.

AST validation (pre-persistence). Before any code reaches the database it is parsed to a Python AST. The validator walks every node and rejects the submission if it finds:

  • An import or from ... import statement whose top-level module is one of: os, sys, subprocess, shutil, socket, http, urllib, requests, httpx, ctypes, multiprocessing, threading, signal, importlib, pickle, shelve, or sqlite3.
  • A function call whose name matches any of: exec, eval, compile, __import__, open, globals, locals, getattr, setattr, or delattr.
  • No function definition: code that contains no def statement is not a valid tool.

Validation is purely static; no code is executed during this phase.
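
The three checks can be approximated with the standard library's ast module. This is a sketch under stated assumptions rather than SEAT's actual CodeValidator: the function name find_violations and the wording of its messages are hypothetical, while the two sets are copied from the lists above.

```python
import ast
from typing import List

FORBIDDEN_MODULES = {
    "os", "sys", "subprocess", "shutil", "socket", "http", "urllib",
    "requests", "httpx", "ctypes", "multiprocessing", "threading",
    "signal", "importlib", "pickle", "shelve", "sqlite3",
}
FORBIDDEN_CALLS = {
    "exec", "eval", "compile", "__import__", "open",
    "globals", "locals", "getattr", "setattr", "delattr",
}


def find_violations(source: str) -> List[str]:
    """Statically inspect source and return a list of policy violations."""
    try:
        tree = ast.parse(source)  # parsing only; nothing is executed
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    violations: List[str] = []
    has_function = False
    for node in ast.walk(tree):
        # Collect the module names this node imports, if any.
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]
        else:
            modules = []
        for module in modules:
            top_level = module.split(".")[0]
            if top_level in FORBIDDEN_MODULES:
                violations.append(f"forbidden import: {top_level}")

        # Only direct calls by bare name are matched here.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in FORBIDDEN_CALLS:
                violations.append(f"forbidden call: {node.func.id}")

        # Assumption: only a plain def counts as a function definition.
        if isinstance(node, ast.FunctionDef):
            has_function = True

    if not has_function:
        violations.append("no function definition found")
    return violations


if __name__ == "__main__":
    print(find_violations("import os\ndef f():\n    return os.getcwd()\n"))
```

Because this sketch matches calls by bare name only, it would not flag an aliased or attribute-based call such as builtins.eval(...); a name-based static check of this kind is one layer of defence, not a complete one.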

Subprocess sandbox (execution). Approved tools run as a separate python3 process. SEAT passes input as JSON to the process's stdin and reads the result from stdout. The subprocess inherits no database credentials or application secrets. A configurable timeout (default 30 s, controlled by EXECUTOR_TIMEOUT) kills the process if it runs too long. The temporary script file is always deleted after execution, even on failure.
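
The execution path described above can be sketched with tempfile and subprocess from the standard library. This is an illustration, not SEAT's ToolExecutor: run_tool_script and SENSITIVE_VARS are hypothetical names, and filtering the child's environment is only one assumed way of withholding credentials.

```python
import json
import os
import subprocess
import sys
import tempfile
import time

# Assumption: these are the variables worth withholding from the child;
# the names come from the Configuration table.
SENSITIVE_VARS = {"DATABASE_URL", "LLM_API_KEY"}


def run_tool_script(script: str, input_data: dict, timeout: float = 30.0) -> dict:
    """Run a wrapper script in a child process: JSON in on stdin, result on stdout."""
    child_env = {k: v for k, v in os.environ.items() if k not in SENSITIVE_VARS}
    # delete=False so the file is closed before the child opens it.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
        handle.write(script)
        path = handle.name
    started = time.perf_counter()
    output, error = None, None
    try:
        proc = subprocess.run(
            [sys.executable, path],
            input=json.dumps(input_data),
            capture_output=True,
            text=True,
            timeout=timeout,  # run() kills the child when this expires
            env=child_env,
        )
        if proc.returncode == 0:
            output = proc.stdout.strip()
        else:
            error = proc.stderr.strip() or f"exit code {proc.returncode}"
    except subprocess.TimeoutExpired:
        error = f"timed out after {timeout} s"
    finally:
        os.unlink(path)  # remove the script on success, failure, or timeout
    return {
        "success": error is None,
        "output": output,
        "error": error,
        "execution_time": round(time.perf_counter() - started, 3),
    }
```

The returned dictionary mirrors the shape of the execute response shown in the Quick Start, so a failure or a timeout is reported in the error field rather than raised.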

Configuration

All settings are read from environment variables (or a .env file via pydantic-settings):

Variable          Default                                             Description
DATABASE_URL      postgresql+asyncpg://seat:seat@localhost:5433/seat  Async-compatible PostgreSQL connection URL
LLM_PROVIDER      ollama                                              LLMWire provider name (ollama, openai, ...)
LLM_MODEL         llama3                                              Model identifier passed to the LLM provider
LLM_API_KEY       (empty)                                             API key for hosted providers; leave empty for Ollama
EXECUTOR_TIMEOUT  30                                                  Maximum seconds to allow a tool subprocess to run
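
To show how these variables and defaults fit together, here is a dependency-free sketch that reads them from the environment. SEAT itself uses pydantic-settings; the Settings dataclass and load_settings function below are hypothetical stand-ins, and this version does not parse a .env file.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    # Defaults copied from the table above.
    database_url: str = "postgresql+asyncpg://seat:seat@localhost:5433/seat"
    llm_provider: str = "ollama"
    llm_model: str = "llama3"
    llm_api_key: str = ""
    executor_timeout: int = 30


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    defaults = Settings()
    return Settings(
        database_url=env.get("DATABASE_URL", defaults.database_url),
        llm_provider=env.get("LLM_PROVIDER", defaults.llm_provider),
        llm_model=env.get("LLM_MODEL", defaults.llm_model),
        llm_api_key=env.get("LLM_API_KEY", defaults.llm_api_key),
        executor_timeout=int(env.get("EXECUTOR_TIMEOUT", defaults.executor_timeout)),
    )


if __name__ == "__main__":
    print(load_settings({"LLM_PROVIDER": "openai", "EXECUTOR_TIMEOUT": "10"}))
```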

Roadmap

v0.2.0

MCP server generation. Each registered tool will be automatically exposed as a Model Context Protocol (MCP) tool descriptor, allowing any MCP-compatible agent (Claude Desktop, Continue, etc.) to discover and invoke SEAT-generated tools without custom API integration.

Docker tool containers. The subprocess executor will gain an optional Docker backend. When enabled, each tool execution runs inside a disposable container with a read-only filesystem, no network access, and strict resource limits — providing OS-level isolation in addition to the existing AST and timeout safeguards.

License

MIT — see LICENSE.
