| Date | Update |
|---|---|
| 6 Mar 2026 | v0.1.1 released: stabilization – fixed license/metadata consistency, improved error handling, added 6 examples, expanded test suite. See changelog |
| 1 Mar 2026 | v0.1.0 released: major feature release – 14 built-in tools, agent presets, plugin system, real streaming, memory integration, ACP/MCP protocols, CI/CD, and a comprehensive test suite. See changelog |
| 3 Feb 2026 | v0.0.2 released: vLLM backend fixes with automatic chat template support, GPU memory control, improved OOM error handling, and multi-model-family compatibility |
| 2 Feb 2026 | Preprint available: "effGen: Enabling Small Language Models as Capable Autonomous Agents" |
| 31 Jan 2026 | Initial release of the effGen framework (v0.0.1) |
effGen transforms Small Language Models into powerful AI agents. While most frameworks require massive LLMs, effGen is optimized from the ground up for efficient, smaller models – delivering fast, capable agents without the compute overhead.
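The memory savings from running small quantized models are easy to estimate: weights dominate a model's footprint, so bytes are roughly parameters times bits per weight divided by 8. A back-of-envelope sketch (ignoring activations, KV cache, and quantization overhead):

```python
def weight_memory_gb(n_params: float, bits: int) -> float:
    """Approximate weight memory in GB: params * bits per weight / 8 bits per byte."""
    return n_params * bits / 8 / 1e9

# Qwen2.5-1.5B-Instruct has ~1.5B parameters
fp16 = weight_memory_gb(1.5e9, 16)  # ~3.0 GB
int4 = weight_memory_gb(1.5e9, 4)   # ~0.75 GB

print(f"fp16: {fp16:.2f} GB, 4-bit: {int4:.2f} GB")
```

A 4-bit 1.5B model fits comfortably on consumer GPUs where a full-precision 70B agent model would not.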
```python
from effgen import Agent, load_model
from effgen.core.agent import AgentConfig
from effgen.tools.builtin import Calculator, PythonREPL

# Load a small but mighty model
model = load_model("Qwen/Qwen2.5-1.5B-Instruct", quantization="4bit")

# Create an agent with tools
config = AgentConfig(
    name="math_agent",
    model=model,
    tools=[Calculator(), PythonREPL()]
)
agent = Agent(config=config)

# Run a computation
result = agent.run("What is 24344 * 334?")
print(f"Answer: {result.output}")
```

Requires Python 3.10 or newer. Tested on Python 3.10, 3.11, 3.12, and 3.13.
```bash
# Install from PyPI
pip install effgen

# With the vLLM backend (quoted so the shell doesn't expand the brackets)
pip install "effgen[vllm]"
```

Or install from source:

```bash
git clone https://github.com/ctrl-gaurav/effGen.git
cd effGen

# Quick install
./install.sh

# Full install (includes vLLM + dev tools)
./install.sh --full

# Manual install
pip install -e .
```

```bash
# Run a task
effgen run "What is the capital of France?"

# Interactive chat
effgen chat

# Start the API server
effgen serve --port 8000

# List available presets
effgen presets

# Check infrastructure health
effgen health

# Interactive wizard
effgen
```

```python
from effgen import Agent, load_model
from effgen.core.agent import AgentConfig
from effgen.tools.builtin import Calculator

# Load model
model = load_model("Qwen/Qwen2.5-1.5B-Instruct", quantization="4bit")

# Configure the agent
config = AgentConfig(
    name="calculator_agent",
    model=model,
    tools=[Calculator()],
    system_prompt="You are a helpful math assistant."
)

# Create and run
agent = Agent(config=config)
result = agent.run("Calculate 15% tip on $85.50")
print(result.output)
```
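To sanity-check the agent's answer, the tip arithmetic itself is simple. A standalone check (not effGen code, just the math the Calculator tool is expected to reproduce), using `decimal` to avoid floating-point rounding surprises:

```python
from decimal import Decimal, ROUND_HALF_UP

bill = Decimal("85.50")
# 15% tip, rounded to cents with conventional half-up rounding
tip = (bill * Decimal("0.15")).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
total = bill + tip

print(f"Tip: ${tip}, Total: ${total}")  # Tip: $12.83, Total: $98.33
```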
Get started instantly with ready-to-use agent configurations:

```python
from effgen import load_model
from effgen.presets import create_agent

model = load_model("Qwen/Qwen2.5-3B-Instruct", quantization="4bit")

# One-line agent creation
math_agent = create_agent("math", model)          # Calculator + PythonREPL
research_agent = create_agent("research", model)  # WebSearch + URLFetch + Wikipedia
coding_agent = create_agent("coding", model)      # CodeExecutor + PythonREPL + FileOps + Bash
general_agent = create_agent("general", model)    # All 11 tools
minimal_agent = create_agent("minimal", model)    # Direct inference, no tools
```

Presets are also available from the CLI:

```bash
effgen run --preset math "What is sqrt(144)?"
effgen run --preset research "Tell me about quantum computing"
```
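Conceptually, a preset is just a named mapping from a string to a list of tools. A simplified, self-contained sketch of that pattern (illustrative only, with hypothetical stand-in classes; effGen's real implementation lives in `effgen.presets`):

```python
# Hypothetical stand-ins for tool classes, for illustration only
class Calculator: ...
class PythonREPL: ...
class WebSearch: ...

# A preset maps a name to the tool classes it bundles
PRESETS = {
    "math": [Calculator, PythonREPL],
    "research": [WebSearch],
    "minimal": [],
}

def create_preset_tools(preset: str) -> list:
    """Instantiate the tools for a named preset, failing loudly on typos."""
    try:
        return [tool_cls() for tool_cls in PRESETS[preset]]
    except KeyError:
        raise ValueError(f"Unknown preset: {preset!r}") from None

tools = create_preset_tools("math")
print([type(t).__name__ for t in tools])  # ['Calculator', 'PythonREPL']
```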
```bash
python examples/basic_agent.py          # Basic agent (Transformers backend)
python examples/basic_agent_vllm.py     # Basic agent (vLLM backend, 5-10x faster)
python examples/web_agent.py            # Web search agent
python examples/retrieval_agent.py      # RAG-based retrieval
python examples/agentic_search_agent.py # Grep-based agentic search
python examples/preset_agents.py        # Ready-to-use agent presets
python examples/streaming_agent.py      # Real-time token streaming
python examples/memory_agent.py         # Multi-turn memory
python examples/multi_tool_agent.py     # Multi-tool agent
python examples/weather_agent.py        # Weather via Open-Meteo (free)
python examples/plugin_example.py       # Custom tool plugins
```

**More Examples**
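The plugin example registers custom tools with an agent. The actual base-class interface is defined by effGen (see `examples/plugin_example.py`); as a self-contained illustration of the general shape of a tool, a name, a description the model sees, and a `run` method, here is a hypothetical example:

```python
class TipCalculator:
    """A hypothetical custom tool: computes a tip for a bill amount.

    Illustrative only -- effGen's real plugin API defines its own base class;
    see examples/plugin_example.py for the actual interface.
    """
    name = "tip_calculator"
    description = "Compute a tip. Args: amount (float), percent (float)."

    def run(self, amount: float, percent: float) -> str:
        tip = round(amount * percent / 100, 2)
        return f"A {percent}% tip on ${amount:.2f} is ${tip:.2f}"

tool = TipCalculator()
print(tool.run(85.50, 20))  # A 20% tip on $85.50 is $17.10
```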
**Multi-tool research agent**

```python
from effgen import Agent, load_model
from effgen.core.agent import AgentConfig
from effgen.tools.builtin import Calculator, WebSearch, PythonREPL

model = load_model("Qwen/Qwen2.5-3B-Instruct")

config = AgentConfig(
    name="research_agent",
    model=model,
    tools=[Calculator(), WebSearch(), PythonREPL()],
    system_prompt="You are a research assistant."
)
agent = Agent(config=config)
result = agent.run("Search for the population of Tokyo and calculate what percentage it is of Japan's total population")
```

**Streaming**

```python
from effgen import Agent, load_model
from effgen.core.agent import AgentConfig
from effgen.tools.builtin import Calculator

model = load_model("Qwen/Qwen2.5-3B-Instruct", quantization="4bit")

agent = Agent(config=AgentConfig(
    name="stream_demo", model=model,
    tools=[Calculator()], enable_streaming=True
))

for token in agent.stream("What is 2 + 2?"):
    print(token, end="", flush=True)
```

**Memory**

```python
agent = Agent(config=AgentConfig(
    name="memory_demo", model=model,
    tools=[], enable_memory=True
))

agent.run("My name is Alice and I'm working on quantum computing.")
result = agent.run("What's my name and what am I working on?")
# -> "Your name is Alice and you're working on quantum computing."
```

**Retrieval (RAG)**

```python
from effgen.tools.builtin import Retrieval

retrieval_tool = Retrieval(knowledge_base_path="./docs")

config = AgentConfig(name="qa_agent", model=model, tools=[retrieval_tool])
agent = Agent(config=config)
result = agent.run("What does the documentation say about configuration?")
```
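Conceptually, a retrieval tool scores documents against the query and feeds the best matches to the model as context. A deliberately naive, self-contained sketch of that idea using keyword overlap (effGen's actual `Retrieval` tool is more sophisticated; this is illustration only):

```python
def score(query: str, doc: str) -> int:
    """Count query words that appear in the document (naive keyword overlap)."""
    doc_words = set(doc.lower().split())
    return sum(1 for w in query.lower().split() if w in doc_words)

# A tiny in-memory "knowledge base" standing in for ./docs
docs = {
    "config.md": "configuration is loaded from effgen.yaml in the project root",
    "tools.md": "tools are registered on the agent via the tools list",
}

query = "what does the documentation say about configuration"
best = max(docs, key=lambda name: score(query, docs[name]))
print(best)  # config.md
```

Real retrieval replaces the overlap count with embedding similarity, but the control flow (score, rank, select) is the same.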
For security policies and vulnerability reporting, see SECURITY.md.
If you use effGen in your research, please cite our paper:
```bibtex
@software{srivastava2026effgen,
  title={effGen: Enabling Small Language Models as Capable Autonomous Agents},
  author={Gaurav Srivastava and Aafiya Hussain and Chi Wang and Yingyan Celine Lin and Xuan Wang},
  year={2026},
  eprint={2602.00887},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2602.00887},
}
```

Apache License 2.0 – see LICENSE for details.