Conduit is a provider-aware Python LLM client for vLLM, Ollama, and OpenRouter.
It exposes strict typed configs, provider-specific request mapping, normalized tool calling, and unified chunk and event streaming interfaces.
## Installation

```bash
pip install -e .
```

## Quick start

```python
from conduit import Conduit, OllamaConfig, Message, Role, TextPart

config = OllamaConfig(model="llama3.1:8b")

async with Conduit(config) as client:
    response = await client.chat(
        messages=[
            Message(role=Role.SYSTEM, content=[TextPart(text="You are concise.")]),
            Message(
                role=Role.USER,
                content=[TextPart(text="Explain gradient descent briefly.")],
            ),
        ]
    )
    print(response.content)
```

## vLLM

- Uses `/chat/completions` under your configured base URL.
- Supports `response_format` (for example `{"type": "json_object"}`).
- Supports `stream_options` for streaming requests (for example `{"include_usage": true}`).
- Supports `structured_outputs` directly.
- Legacy `guided_json`, `guided_regex`, `guided_choice`, and `guided_grammar` are accepted and mapped to `structured_outputs` with deprecation warnings.
- `best_of` and `guided_decoding_backend` are intentionally not supported by this client.
- Tool choice values `auto`, `none`, `required`, and named function objects are supported.
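The legacy-parameter mapping described above can be sketched roughly as follows. This is a hypothetical helper for illustration only; the key names and function are not Conduit's actual internals.

```python
import warnings

# Illustrative mapping of legacy vLLM guided-decoding keys onto a
# structured_outputs dict (hypothetical sketch, not Conduit's real code).
_LEGACY_KEYS = {
    "guided_json": "json",
    "guided_regex": "regex",
    "guided_choice": "choice",
    "guided_grammar": "grammar",
}

def map_legacy_guided_params(request: dict) -> dict:
    """Move legacy guided_* keys into structured_outputs, warning once each."""
    out = dict(request)
    for legacy, target in _LEGACY_KEYS.items():
        if legacy in out:
            warnings.warn(
                f"{legacy} is deprecated; use structured_outputs instead",
                DeprecationWarning,
            )
            out.setdefault("structured_outputs", {})[target] = out.pop(legacy)
    return out
```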
## Ollama

- Uses `/api/chat` by default.
- Automatically uses `/api/generate` when `raw=True` or `suffix` is set.
- In generate mode, requests are completion-style: exactly one user message plus an optional system message, with no tools, tool messages, tool calls, or images.
- Sampling/runtime fields are sent in `options`.
- Tool choice supports `auto` and `none` semantics.
- Canonical `tool_call_id` values are synthesized as `ollama_call_{index}`.
- Tool result messages are converted to Ollama `tool_name` by matching prior assistant tool calls.
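The last two conventions can be sketched in plain Python. This is an illustrative sketch of the behavior described above, not Conduit's actual implementation:

```python
# Canonical tool_call_id values are synthesized positionally, and a tool
# result is matched back to a tool name via the prior assistant tool calls.
def synthesize_tool_call_id(index: int) -> str:
    return f"ollama_call_{index}"

def tool_name_for_result(tool_call_id: str, assistant_tool_calls: list):
    """Return the tool name whose synthesized id matches, else None."""
    for i, call in enumerate(assistant_tool_calls):
        if synthesize_tool_call_id(i) == tool_call_id:
            return call["name"]
    return None
```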
## OpenRouter

- Uses `https://openrouter.ai/api/v1/chat/completions` by default.
- `api_key` is required.
- Optional headers: `HTTP-Referer` from `app_url`, `X-Title` from `app_name`.
- Supports provider routing via `provider` and `route`.
- Supports passthrough `reasoning`, `transforms`, and `include`. `include` and `transforms` are provider/model/endpoint dependent and may not be honored by every upstream route.
- Supports tool strict mode passthrough via `ToolDefinition(strict=True)`.
- Supports optional request-context metadata mapping via the runtime override key `openrouter_context_metadata_fields`.
## Tool strict mode

Support matrix:

- OpenRouter: supported (passes `tools[].function.strict` through).
- vLLM: not supported; raises `ConfigValidationError` when any tool sets `strict=True`.
- Ollama: not supported; raises `ConfigValidationError` when tools are included and any tool sets `strict=True`.
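The validation behavior in the matrix above can be sketched like this. The check function and the locally defined error class are illustrative only; Conduit raises its own `ConfigValidationError` type.

```python
# Hypothetical sketch of the strict-mode support matrix: OpenRouter
# forwards strict, while vLLM and Ollama reject strict tools up front.
class ConfigValidationError(ValueError):
    pass

def check_strict_tools(provider: str, tools: list) -> None:
    """Reject strict=True tools on providers that cannot pass it through."""
    if provider == "openrouter":
        return  # strict is forwarded as tools[].function.strict
    if any(t.get("strict") for t in tools):
        raise ConfigValidationError(
            f"{provider} does not support ToolDefinition(strict=True)"
        )
```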
```python
from conduit import Conduit, Message, OpenRouterConfig, Role, TextPart, ToolDefinition

config = OpenRouterConfig(model="openai/gpt-4o-mini", api_key="k")

tool = ToolDefinition(
    name="get_weather",
    description="Get weather for a location",
    strict=True,
    parameters={
        "type": "object",
        "properties": {
            "location": {"type": "string"},
        },
        "required": ["location"],
    },
)

async with Conduit(config) as client:
    response = await client.chat(
        messages=[Message(role=Role.USER, content=[TextPart(text="Weather in SF?")])],
        tools=[tool],
        tool_choice="auto",
    )
```

## Overrides and context

Conduit separates provider config overrides from invocation-scoped runtime fields:

- `config_overrides`: strict, provider-typed request config. Unknown keys fail validation.
- `runtime_overrides`: untyped runtime namespace. Keys are provider opt-in and are never merged into provider config.
- `context`: invocation metadata (`thread_id`, `tags`, `metadata`) for middleware/logging and provider-specific mappings.
By default, unknown `runtime_overrides` keys are ignored. Set `strict_runtime_overrides=True` on `Conduit` or `SyncConduit` to raise instead.
```python
from conduit import Conduit, Message, RequestContext, Role, TextPart, VLLMConfig

client = Conduit(VLLMConfig(model="m"), strict_runtime_overrides=False)

response = await client.chat(
    messages=[Message(role=Role.USER, content=[TextPart(text="Say hi")])],
    context=RequestContext(thread_id="thread-123", tags=["demo"]),
    runtime_overrides={"ignored_by_vllm": True},
)
```

Per-request `config_overrides` must be nested dictionaries and must validate against the active provider config model. Unknown keys raise `ConfigValidationError`.
```python
await client.chat(
    messages=[Message(role=Role.USER, content=[TextPart(text="Say hi")])],
    config_overrides={"temperature": 0.1},
)
```

## Streaming

`chat_stream()` yields `ChatResponseChunk`.

`chat_events()` yields typed `StreamEvent` values with deterministic ordering per chunk:

1. `text_delta`
2. `tool_call_delta`
3. `tool_call_completed`
4. `usage`
5. `finish`

On streaming errors, `chat_events()` emits one `error` event and then re-raises.
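The per-chunk ordering above can be sketched as a pure function from a chunk to an event list. The chunk field names here are hypothetical placeholders, not Conduit's real `ChatResponseChunk` attributes:

```python
# Illustrative sketch of deterministic per-chunk event ordering:
# text deltas, then tool-call deltas and completions, then usage, then finish.
def events_for_chunk(chunk: dict) -> list:
    events = []
    if chunk.get("text"):
        events.append(("text_delta", chunk["text"]))
    for delta in chunk.get("tool_call_deltas", []):
        events.append(("tool_call_delta", delta))
    for call in chunk.get("completed_tool_calls", []):
        events.append(("tool_call_completed", call))
    if chunk.get("usage"):
        events.append(("usage", chunk["usage"]))
    if chunk.get("finish_reason"):
        events.append(("finish", chunk["finish_reason"]))
    return events
```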
## Content parts

`Message.content` supports content parts:

- `TextPart(type="text", text="...")`
- `ImageUrlPart(type="image_url", url="https://...")`
- Provider-specific dict parts for passthrough
## Environment configuration

```python
from conduit import Conduit

client = Conduit.from_env("openrouter")
```

Environment variables:

- vLLM: `CONDUIT_VLLM_MODEL`, `CONDUIT_VLLM_URL`, `CONDUIT_VLLM_API_KEY`
- Ollama: `CONDUIT_OLLAMA_MODEL`, `CONDUIT_OLLAMA_URL`, `CONDUIT_OLLAMA_API_KEY`
- OpenRouter: `CONDUIT_OPENROUTER_MODEL`, `CONDUIT_OPENROUTER_KEY`, `CONDUIT_OPENROUTER_URL`, `CONDUIT_OPENROUTER_APP_NAME`, `CONDUIT_OPENROUTER_APP_URL`
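A rough sketch of how `from_env` might assemble config kwargs from the variables listed above (the mapping table and helper function are hypothetical, and only the vLLM and Ollama rows are shown):

```python
import os

# Hypothetical env-to-kwargs mapping; absent variables are simply omitted
# so provider config defaults can apply.
_ENV_FIELDS = {
    "vllm": {
        "model": "CONDUIT_VLLM_MODEL",
        "url": "CONDUIT_VLLM_URL",
        "api_key": "CONDUIT_VLLM_API_KEY",
    },
    "ollama": {
        "model": "CONDUIT_OLLAMA_MODEL",
        "url": "CONDUIT_OLLAMA_URL",
        "api_key": "CONDUIT_OLLAMA_API_KEY",
    },
}

def config_kwargs_from_env(provider: str) -> dict:
    """Collect only the env vars that are actually set for this provider."""
    fields = _ENV_FIELDS[provider]
    return {k: os.environ[v] for k, v in fields.items() if v in os.environ}
```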
## Development

```bash
pip install -e .[dev]
pytest
```