Definers is a modular Python platform for building AI pipelines, multimodal media workflows, data preparation systems, and safe runtime automation from one codebase.
It is designed for developers, teams, and companies that need more than a narrow model wrapper or a one-off media script. Definers brings together audio, image, video, machine learning, NLP, web transfer, and platform-aware execution primitives behind a consistent Python-first API and launcher surface.
- Positioning
- Why Definers
- Capability Map
- Architecture Diagrams
- Installation
- Quick Start
- Applications And Launchers
- Docker Workflows
- Integrated API Reference
- Reliability And Safety
- Platform Requirements
- Development Workflow
- Troubleshooting
- Adoption Notes For Teams
- Contributing
- License
Definers is built for multimodal AI systems that need to move from experimentation to repeatable operation without changing toolsets. It gives you one repository and one package surface for:
- media analysis and transformation
- data preparation and training flows
- UI launchers and Dockerized applications
- resilient integrations and downloads
- safer command execution and platform-aware runtime handling
The project favors modular adoption. Teams can start with a narrow slice such as definers.data, definers.system, or one media domain, then expand into broader multimodal workflows when they are ready.
Definers is built around a practical view of AI infrastructure: model work, media work, data work, and system work should cooperate instead of living in disconnected libraries.
| Project Value | What It Means In Practice |
|---|---|
| Unified multimodal surface | Audio, image, video, ML, NLP, web, and launcher flows live in one toolkit |
| Modular adoption | Optional dependency groups keep installs targeted instead of forcing the full stack |
| Operational guardrails | Retry, backoff, circuit-breaker, and guarded command execution are part of the public API surface |
| Deployment flexibility | The same project supports direct imports, CLI usage, installed app launching, and Docker app folders |
| Team-friendly growth | Start small, then expand into richer workflows without replacing the surrounding ecosystem |
| Outcome | Primary Surface | Typical Work |
|---|---|---|
| Launch multimodal apps | definers.chat, definers.presentation.launchers, CLI commands | Start chat, audio, image, video, translate, train, animation, and FAISS interfaces |
| Process and analyze audio | definers.audio | Feature extraction, mastering, stem separation, transcription, DSP, preview, synthesis |
| Transform images and video | definers.image, definers.video | Upscaling, feature extraction, reconstruction, composition, rendering |
| Prepare datasets and train models | definers.data, definers.ml | Loading, splitting, vectorization, tokenization, training, inference, export |
| Orchestrate resilient integrations | definers.capabilities, definers.web | Retry, circuit breakers, transfer orchestration, downloads |
| Run tools safely across environments | definers.system | Path handling, process execution, filesystem operations, runtime checks |
| Domain | Core Public Entry Points | Representative Tasks |
|---|---|---|
| Audio | analyze_audio, analyze_audio_features, extract_audio_features, separate_stems, master | Analysis, mastering, transcription, generation, mixing |
| Image | extract_image_features, features_to_image, save_image, get_max_resolution | Feature extraction, reconstruction, upscaling, export |
| Video | features_to_video, video UI and composition flows | Render pipelines, architect-style composition, media generation |
| Data | prepare_data, fetch_dataset, files_to_dataset, create_vectorizer | Data ingestion, batching, splitting, vectorization |
| ML | train, fit, answer, extract_text_features | Training, inference, prediction, feature conversion |
| Capabilities | CircuitBreaker, ExponentialBackoffDelay, with_retry | Resilience boundaries for unstable downstreams |
| Web | download_file, download_and_unzip | Safe transfer orchestration and download workflows |
| System | run, run_linux, run_windows, secure_command | Guarded command execution and runtime integration |
flowchart TB
User[Developer or Team] --> Entry[Python API / CLI / Docker App]
Entry --> Presentation[Presentation Layer]
Entry --> Runtime[System Layer]
Presentation --> Apps[Chat / Audio / Image / Video / Translate / Train / Animation / FAISS]
Runtime --> Domains[Audio / Image / Video / ML / Data / Text / Web]
Domains --> Resilience[Capabilities]
Domains --> Catalogs[Catalogs]
Runtime --> Platform[Platform Services]
Platform --> Filesystem[Filesystem]
Platform --> Processes[Processes]
Platform --> Environment[Environment]
flowchart LR
Source[Local Files or Remote Dataset] --> Prepare[prepare_data]
Prepare --> TrainingData[TrainingData]
TrainingData --> Train[train or fit]
Train --> Model[Trained Artifact or Prediction Flow]
Model --> Export[ONNX / Runtime Usage / App Integration]
sequenceDiagram
participant User
participant CLI as CLI or Docker App
participant Dispatch as cli_dispatch
participant Registry as gui_registry
participant Chat as definers.chat.start
participant App as Target App
User->>CLI: definers start chat
CLI->>Dispatch: run_cli(argv, version)
Dispatch->>Registry: normalize project name
Dispatch->>Chat: start(project)
Chat->>App: launch registered GUI
App-->>User: interactive interface
| Goal | Command |
|---|---|
| Base install | pip install . |
| Audio workflows | pip install ".[audio]" |
| Image workflows | pip install ".[image]" |
| Video workflows | pip install ".[video]" |
| ML workflows | pip install ".[ml]" |
| NLP workflows | pip install ".[nlp]" |
| Web and UI workflows | pip install ".[web]" |
| Full optional stack | pip install ".[all]" |
| Contributor toolchain | pip install -e ".[dev]" |
| CUDA extras | pip install ".[cuda]" --extra-index-url https://pypi.nvidia.com |

pip install .
pip install ".[audio,ml,web]"
pip install -e ".[dev]"

Windows contributors can use scripts/install.bat, which prompts for install groups and includes the extra package index needed for NVIDIA-hosted packages.
| Extra | Purpose | Notes |
|---|---|---|
| audio | Audio analysis, DSP, separation, mastering, transcription, generation | Pulls in heavyweight audio and model dependencies |
| image | Image feature extraction, upscaling, reconstruction | Includes image and computer-vision dependencies |
| video | Video manipulation and rendering flows | Focused on video composition and export |
| ml | Model training, inference, embeddings, export | Includes substantial ML framework dependencies |
| nlp | Translation and language processing | Adds text and inference packages |
| web | Retrieval, scraping, Gradio UI, web utilities | Includes Gradio and scraping-related packages |
| cuda | GPU-oriented acceleration stack | Advanced path with host-level prerequisites |
| all | Aggregates the main domain extras | Does not include dev or cuda |
| dev | Pytest, Ruff, Poe, build, pre-commit, Vulture | Local contributor toolchain |
Definers is intentionally segmented so that narrow adoption does not require the entire dependency graph.
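One way to picture this segmentation: code that depends on an optional extra can check whether the underlying module is importable before it tries to use it. The sketch below relies only on the standard library's `importlib.util.find_spec`. The helper name `available_modules` and the module names passed to it are illustrative assumptions and are not part of Definers.

```python
from importlib.util import find_spec


def available_modules(candidates):
    """Map each top-level module name to whether it can be imported.

    Hypothetical helper for illustration; not a Definers API.
    """
    return {name: find_spec(name) is not None for name in candidates}


# "json" ships with Python; the second name is deliberately fake.
status = available_modules(["json", "module_that_is_not_installed_xyz"])
print(status)
```

A check like this lets a narrow install report a clear "extra not installed" message instead of failing deep inside an import chain.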
from definers.data import prepare_data
from definers.ml import train
dataset = prepare_data(
features=["/path/to/features.csv"],
drop=["unused_column"],
stratify="label",
val_frac=0.1,
test_frac=0.1,
batch_size=32,
)
print(dataset.metadata)
model_path = train(
features=["/path/to/features.csv"],
dataset_label_columns=["label"],
order_by="shuffle",
stratify="label",
val_frac=0.1,
test_frac=0.1,
batch_size=32,
)
print(model_path)

from definers.audio import analyze_audio_features, extract_audio_features
summary = analyze_audio_features("/path/to/song.wav")
vector = extract_audio_features("/path/to/song.wav")
print(summary)
print(None if vector is None else vector.shape)

from definers.image import extract_image_features
features = extract_image_features("/path/to/image.png")
print(None if features is None else features.shape)

from definers.presentation.launchers import launch_installed_project
launch_installed_project("chat")

from definers.system import run
run(["ffmpeg", "-i", "input.mp4", "output.wav"])

from definers.capabilities import with_retry
@with_retry(max_attempts=3)
async def fetch_remote_resource():
    ...

Definers exposes both installed-project launchers and direct CLI dispatch for the same application registry.
python -m definers --help
definers --version
definers start chat
definers audio
definers translate
definers music-video /path/to/song.wav 1920 1080 30
definers lyric-video /path/to/song.wav /path/to/background.mp4 /path/to/lyrics.txt bottom

| Project | Purpose | Startup Paths |
|---|---|---|
| chat | Multimodal chat interface | CLI, installed launcher, Docker |
| audio | Audio production and analysis workflows | CLI, installed launcher, Docker |
| image | Image generation and upscaling tools | CLI, installed launcher, Docker |
| video | Video composition and architect workflows | CLI, installed launcher, Docker |
| animation | Chunked image-to-animation workflow | CLI, installed launcher, Docker |
| translate | Translation and caption-oriented interface | CLI, installed launcher, Docker |
| train | Training and prediction interface | CLI, installed launcher, Docker |
| faiss | FAISS-oriented utility surface | CLI, installed launcher, Docker |
| Surface | Contract |
|---|---|
| definers.presentation.cli_dispatch.run_cli(argv, version) | Parses commands, normalizes app names, and routes to the right launcher or media command |
| definers.presentation.launchers.launch_installed_project(project) | Starts an installed application by project name |
| definers.application_shell.commands.parse_cli_command(...) | Normalizes incoming requests into typed command routing |
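The routing contract above (parse the command, normalize the app name, look it up in a registry, call the launcher) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the actual `cli_dispatch` implementation: `APP_REGISTRY`, `normalize`, `register`, and `dispatch` are hypothetical names, and the launcher simply returns a string.

```python
APP_REGISTRY = {}  # hypothetical registry: normalized app name -> launcher


def normalize(name):
    """Normalize an app name so 'Chat', ' chat ', and 'CHAT' all match."""
    return name.strip().lower().replace("_", "-")


def register(name):
    """Decorator that records a launcher under its normalized name."""
    def decorator(func):
        APP_REGISTRY[normalize(name)] = func
        return func
    return decorator


@register("chat")
def start_chat():
    return "chat started"


def dispatch(argv):
    """Route ['start', '<app>'] or ['<app>'] to a registered launcher."""
    args = list(argv)
    if args and args[0] == "start":
        args = args[1:]
    if not args:
        raise ValueError("usage: start <app>")
    app = normalize(args[0])
    if app not in APP_REGISTRY:
        raise ValueError(f"unknown app: {app}")
    return APP_REGISTRY[app]()


print(dispatch(["start", "chat"]))
```

Keeping the registry as the single lookup point is what would let a CLI, an installed launcher, and a Docker entrypoint resolve the same app names.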
Each app folder under docker/ contains a container entrypoint layout with app.py, Dockerfile, and docker-compose.yml.
cd docker/chat
docker compose up --build

The same folder shape exists for audio, image, video, animation, translate, train, and faiss.
| When To Use | Best Fit |
|---|---|
| Local import-heavy development | Install the package directly |
| Testing one app in an isolated runtime | Use the relevant Docker folder |
| Deployment-like app startup | Use Docker and the per-app container layout |
This section absorbs the former API specification document so the README can serve as the main technical front door.
Definers provides a modular utility toolkit with extension-oriented boundaries for reliability, data handling, media processing, and system orchestration.
| API | Contract |
|---|---|
| definers.capabilities.CircuitBreaker | Sync and async operation gate with CLOSED, OPEN, and HALF_OPEN state transitions and runtime snapshots |
| definers.capabilities.ExponentialBackoffDelay | Configurable backoff strategy with base delay, multiplier, max delay, and jitter |
| definers.capabilities.with_retry | Async retry decorator with selective exception boundaries and deterministic re-raise behavior |
| definers.web.ResourceRetrievalOrchestrator | Strategy-driven transfer orchestration for integration-heavy workflows |
| Pattern | Meaning |
|---|---|
| Resilience composition | Retry and circuit-breaker behavior can be layered to isolate unstable downstreams |
| Strategy-oriented integrations | Transfer and runtime boundaries are shaped around injectable strategies and protocols |
| Explicit failure surfaces | Runtime snapshots and typed failures are favored over silent degradation |
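To make the retry and backoff pattern concrete, here is a minimal standard-library sketch of an async retry decorator with exponential backoff. It is not the Definers implementation: `retry_async` and `backoff_delay` are hypothetical names, and the parameters (base delay, multiplier, max delay, jitter, retryable exception types) are assumptions modeled on the contract descriptions above.

```python
import asyncio
import functools
import random


def backoff_delay(attempt, base=0.1, multiplier=2.0, max_delay=5.0, jitter=0.0):
    """Delay before retry number `attempt` (1-based), capped at max_delay."""
    delay = min(base * multiplier ** (attempt - 1), max_delay)
    return delay + random.uniform(0, jitter)


def retry_async(max_attempts=3, retry_on=(ConnectionError,), base=0.1):
    """Retry an async callable, but only for the listed exception types."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return await func(*args, **kwargs)
                except retry_on:
                    if attempt == max_attempts:
                        raise  # re-raise the last failure unchanged
                    await asyncio.sleep(backoff_delay(attempt, base=base))
        return wrapper
    return decorator


calls = {"count": 0}


@retry_async(max_attempts=3, base=0.01)
async def flaky():
    # Fails twice with a retryable error, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient")
    return "ok"


result = asyncio.run(flaky())
print(result, calls["count"])
```

Exceptions outside `retry_on` propagate immediately, which keeps programming errors from being retried as if they were transient faults.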
flowchart TD
Call[Integration Call] --> Retry{Retryable Failure?}
Retry -- Yes --> Backoff[ExponentialBackoffDelay]
Backoff --> Call
Retry -- No --> Circuit{Circuit Threshold Reached?}
Circuit -- Yes --> Open[Open Circuit]
Circuit -- No --> Fail[Raise Failure]
Open --> Snapshot[Capture Circuit Snapshot]
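The circuit half of the flow can be sketched as a small state machine. The class below is a simplified, synchronous illustration of the CLOSED, OPEN, and HALF_OPEN transitions, not the actual `CircuitBreaker` API: `SimpleCircuitBreaker`, `CircuitOpenError`, and the `failure_threshold`, `reset_timeout`, and `clock` parameters are assumptions introduced for this example.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class SimpleCircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock  # injectable so tests can control time
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.state == "OPEN":
            if self._clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open")
            self.state = "HALF_OPEN"  # timeout elapsed: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"
                self.opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self.state = "CLOSED"
        self.failures = 0
        return result

    def snapshot(self):
        """Return the current state for logging or diagnostics."""
        return {"state": self.state, "failures": self.failures}


breaker = SimpleCircuitBreaker(failure_threshold=2)
print(breaker.call(lambda: "healthy"), breaker.snapshot())
```

Layering a retry decorator inside a breaker like this is one way to get the composition the diagram describes: retries absorb transient faults, and the breaker stops calls once failures keep repeating.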
| API | Contract |
|---|---|
| prepare_data(remote_src=None, features=None, labels=None, url_type=None, revision=None, drop=None, order_by=None, stratify=None, val_frac=0.0, test_frac=0.0, batch_size=1) | Loads remote or local data, applies optional drops and ordering, performs train/val/test splitting, and returns TrainingData |
| TrainingData | Dataclass with train, val, test, and metadata |
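The splitting contract can be illustrated with a plain-Python sketch. It shows how `val_frac` and `test_frac` might carve one collection into three partitions plus metadata. `SplitData` and `split_rows` are hypothetical stand-ins, not the library's `TrainingData` or `prepare_data`, and the sketch omits stratification, dropping, and batching.

```python
import random
from dataclasses import dataclass, field


@dataclass
class SplitData:  # hypothetical stand-in, not the library's TrainingData
    train: list
    val: list
    test: list
    metadata: dict = field(default_factory=dict)


def split_rows(rows, val_frac=0.0, test_frac=0.0, shuffle=True, seed=0):
    """Split rows into train/val/test partitions by fraction."""
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1.0:
        raise ValueError("fractions must be non-negative and sum to less than 1")
    rows = list(rows)
    if shuffle:
        random.Random(seed).shuffle(rows)  # seeded for reproducible splits
    n_val = int(len(rows) * val_frac)
    n_test = int(len(rows) * test_frac)
    n_train = len(rows) - n_val - n_test
    return SplitData(
        train=rows[:n_train],
        val=rows[n_train:n_train + n_val],
        test=rows[n_train + n_val:],
        metadata={"total": len(rows), "val_frac": val_frac, "test_frac": test_frac},
    )


split = split_rows(range(100), val_frac=0.1, test_frac=0.1)
print(len(split.train), len(split.val), len(split.test), split.metadata)
```

Every row lands in exactly one partition, which is the property to preserve whatever the real splitting strategy is.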
| API | Contract |
|---|---|
| analyze_audio(audio_path, hop_length=1024, duration=None, offset=0.0) | Returns dense waveform and frame-domain analysis data including tempo, beat frames, spectral data, RMS bands, and normalization helpers |
| analyze_audio_features(audio_path, txt=True) | Returns either a compact formatted summary such as C major (120 bpm) or a (key, mode, tempo) tuple |
| extract_audio_features(file_path, n_mfcc=20) | Returns an audio feature vector or None on load or extraction failure |
| API | Contract |
|---|---|
| extract_image_features(image_path) | Returns a visual feature vector or None if extraction fails |
| features_to_image(predicted_features, image_shape=(1024, 1024, 3)) | Reconstructs an image-like frame from feature input when shape expectations are met |
| get_max_resolution(width, height, mega_pixels=0.25, factor=16) | Calculates a bounded factored resolution target |
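As a rough illustration of what a "bounded factored resolution" could mean, the sketch below scales a width and height to fit a megapixel budget while preserving aspect ratio, then rounds each side down to a multiple of `factor`. This is one plausible reading of the contract, not the actual `get_max_resolution` implementation. The function name `bounded_resolution` and its exact rounding behavior are assumptions.

```python
import math


def bounded_resolution(width, height, mega_pixels=0.25, factor=16):
    """Fit (width, height) into a pixel budget, snapped to multiples of factor.

    Hypothetical sketch; the real function may round or clamp differently.
    """
    if width <= 0 or height <= 0 or factor <= 0 or mega_pixels <= 0:
        raise ValueError("all arguments must be positive")
    budget = mega_pixels * 1_000_000
    # Uniform scale that makes the area equal to the budget.
    scale = math.sqrt(budget / (width * height))
    # Round down to a multiple of factor, but never below one factor step.
    new_w = max(factor, int(width * scale) // factor * factor)
    new_h = max(factor, int(height * scale) // factor * factor)
    return new_w, new_h


print(bounded_resolution(1920, 1080))
```

Rounding down rather than to the nearest multiple keeps the result within the pixel budget, except for very small budgets where the one-step minimum takes over.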
Import concrete APIs from the implementation module you are using.
from definers.data import prepare_data
from definers.ml import train
from definers.system import run

The package root is intentionally narrow. It mainly exposes version metadata plus the optional sox probe through definers.sox and definers.has_sox().
| Registry | Contract |
|---|---|
| definers.catalogs.languages.LANGUAGE_CODES | Immutable language-code registry |
| definers.catalogs.languages.UNESCO_MAPPING | Immutable UNESCO mapping registry |
| definers.catalogs.tasks.TASKS | Immutable task-to-model registry consumed by constants |
| definers.catalogs.references.USER_AGENTS | Immutable user-agent registry |
| definers.catalogs.references.STYLE_CATALOG | Immutable style registry |
definers.catalogs and definers.catalogs.access expose these registries directly without a getter-function compatibility layer.
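For readers unfamiliar with immutable registries in Python, one common way to expose a read-only mapping without getter functions is `types.MappingProxyType`. The sketch below is an assumption about the general technique, not a description of how the Definers catalogs are built. `EXAMPLE_CODES` and its two entries are illustrative only.

```python
from types import MappingProxyType

# Private backing dict; entries are illustrative, not the real catalog.
_example_codes = {"en": "English", "fr": "French"}

# Read-only view: lookups work, item assignment raises TypeError.
EXAMPLE_CODES = MappingProxyType(_example_codes)

print(EXAMPLE_CODES["en"])

try:
    EXAMPLE_CODES["de"] = "German"
except TypeError as exc:
    print("write rejected:", exc)
```

Consumers can import such a registry and index it directly, while accidental mutation fails loudly at the call site.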
| API | Contract |
|---|---|
| definers.web.download_file(url, destination) | Thin wrapper over the transfer subsystem for file download execution |
| definers.web.download_and_unzip(url, extract_to) | Thin wrapper over the transfer subsystem for download-and-extract workflows |
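To show the shape of a download-and-extract workflow, here is a standard-library sketch that fetches a zip archive and extracts it only if every member stays inside the target folder. It does not reproduce the Definers transfer subsystem. `fetch_and_extract` is a hypothetical name, and the path-traversal check is an assumption about what "safe" extraction might involve.

```python
import shutil
import tempfile
import urllib.request
import zipfile
from pathlib import Path


def fetch_and_extract(url, extract_to):
    """Download a zip archive from `url` and extract it under `extract_to`."""
    extract_root = Path(extract_to).resolve()
    extract_root.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "download.zip"
        # Stream the response to disk instead of holding it in memory.
        with urllib.request.urlopen(url) as response, open(archive, "wb") as out:
            shutil.copyfileobj(response, out)
        with zipfile.ZipFile(archive) as zf:
            # Reject members that would escape the extraction root.
            for member in zf.namelist():
                target = (extract_root / member).resolve()
                if not target.is_relative_to(extract_root):
                    raise ValueError(f"unsafe archive member: {member}")
            zf.extractall(extract_root)
    return extract_root
```

A call such as `fetch_and_extract("https://example.com/data.zip", "data/")` would leave the extracted files under `data/`; the URL here is a placeholder.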
Operational guardrails are a first-class part of the design.
| Concern | Public Surface | Why It Matters |
|---|---|---|
| Unstable integrations | with_retry, ExponentialBackoffDelay | Reduces transient failure noise |
| Repeated downstream faults | CircuitBreaker | Prevents failure amplification and exposes state snapshots |
| Transfer orchestration | definers.web facades | Centralizes integration-heavy behavior |
| Platform differences | run_linux, run_windows, run | Makes runtime execution explicit |
All external command execution flows through definers.system.run() and its platform-specific delegates.
- Prefer list-form commands such as ["ffmpeg", "-i", "input.mp4", "output.wav"].
- Avoid free-form shell strings unless shell semantics are deliberately required.
- Guarded runtime execution rejects unsafe string command patterns such as shell separators in protected paths.
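The guidance above can be sketched with `subprocess`. The helper below runs list-form commands without a shell and refuses string commands that contain shell separators. It is an illustration of the idea, not the behavior of `definers.system.run` or `secure_command`. `guarded_run` and the `SHELL_SEPARATORS` tuple are hypothetical.

```python
import shlex
import subprocess
import sys

# Hypothetical deny-list of tokens that imply shell chaining or substitution.
SHELL_SEPARATORS = (";", "&&", "||", "|", "`", "$(")


def guarded_run(command):
    """Run a command without a shell and return its standard output."""
    if isinstance(command, str):
        if any(token in command for token in SHELL_SEPARATORS):
            raise ValueError("refusing string command with shell separators")
        command = shlex.split(command)  # tokenize; still no shell involved
    completed = subprocess.run(command, check=True, capture_output=True, text=True)
    return completed.stdout


# List form: each argument is passed to the program verbatim.
output = guarded_run([sys.executable, "-c", "print('ok')"])
print(output.strip())
```

Because no shell is involved, an argument containing `;` in list form is just data passed to the program, which is why list-form commands are the safer default.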
Regex behavior is centralized in definers.regex_utils so user-facing pattern work can be constrained instead of compiled ad hoc.
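As a sketch of what centralizing pattern handling can look like, the helper below funnels every compile through one function that enforces a length limit, caches compiled patterns, and converts `re.error` into a single failure type. This is an assumption about the general approach, not the contents of `definers.regex_utils`. `compile_pattern`, `find_literal`, and `MAX_PATTERN_LENGTH` are hypothetical.

```python
import re
from functools import lru_cache

MAX_PATTERN_LENGTH = 200  # hypothetical limit for user-supplied patterns


@lru_cache(maxsize=256)
def compile_pattern(pattern, flags=0):
    """Single entry point for compiling patterns, with basic constraints."""
    if len(pattern) > MAX_PATTERN_LENGTH:
        raise ValueError("pattern too long")
    try:
        return re.compile(pattern, flags)
    except re.error as exc:
        raise ValueError(f"invalid pattern: {exc}") from exc


def find_literal(text, needle):
    """Search for user text literally by escaping regex metacharacters."""
    return compile_pattern(re.escape(needle)).findall(text)


print(find_literal("a.b a.b axb", "a.b"))
```

Escaping user input by default means a `.` in a search term matches only a literal dot, so `axb` is not a match here.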
Some features fail lazily or degrade gracefully when optional dependencies are missing. A prominent example is sox, which is probed without forcing import definers itself to fail.
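A lazy probe for an external binary can be written with `shutil.which`, so that a missing tool only raises when a feature actually needs it. The helpers below illustrate that pattern. They are not the implementation behind `definers.has_sox()`, and `has_tool` and `require_tool` are hypothetical names.

```python
import shutil


def has_tool(name):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


def require_tool(name):
    """Return the tool's path, or fail at the point of use with a clear message."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(f"{name} is required for this feature but was not found on PATH")
    return path


# Probing is cheap and never raises; only require_tool fails, and only on use.
print("sox available:", has_tool("sox"))
```

Import-time code would call only `has_tool`, leaving `require_tool` for the code paths that truly depend on the binary.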
| Area | Current State |
|---|---|
| Python requirement | >=3.10 |
| Check workflow matrix | Python 3.10, 3.11, 3.12 |
| Quality workflow matrix | Python 3.10, 3.12 |
| External tools | FFmpeg often required, sox sometimes required, CUDA host support optional |
- FFmpeg is needed for many audio and video conversion paths.
- sox is optional but used in some audio conversion and loading flows.
- CUDA-oriented installs require host-level NVIDIA support beyond Python packages alone.
- Some extras pull direct Git dependencies and heavyweight frameworks, so installation time and environment complexity can rise quickly.
Definers includes a contributor-oriented Python quality pipeline.
pip install -e ".[dev]"
poe check

poe check runs cleanup, compile verification, dead-code scanning, source sanitization, pre-commit hooks, tests, and a final cleanup pass.
poe test
poe lint
poe format
poe build
poe hook

| Workflow | Purpose |
|---|---|
| check.yml | Pull-request gate across Python 3.10, 3.11, and 3.12 with poe check |
| quality.yml | Push and manual quality validation including lint, format check, and targeted resilience tests |
| Additional workflows | Publish, CodeQL, dependency review, stale automation, and repository maintenance |
sox is optional. Some audio-loading paths return None or fail only when the capability is used. Install the binary and place it on PATH if your workflow depends on it.
Install FFmpeg and ensure it is available on PATH. Definers can help orchestrate FFmpeg-driven workflows, but it cannot supply the binary automatically in every environment.
Treat cuda as an advanced path. Start with CPU-oriented or non-CUDA extras first, confirm your workflows, then layer in CUDA once host prerequisites are proven.
Install only the extras you need. The project is intentionally segmented so narrow adoption does not require the full stack.
Prefer list-form execution. If you need shell syntax, explicitly invoke the shell you intend to use.
Definers is easiest to adopt incrementally.
- Start with one slice such as definers.data, definers.system, or a single media domain.
- Add extras when the team is ready to absorb the relevant runtime and model dependencies.
- Standardize on the shared launcher and resilience surfaces as workflows expand.
That incremental adoption path is one of the project's strongest characteristics: teams can begin with a focused utility layer and grow into a broader multimodal platform without changing ecosystems.
Contributor workflow now lives in CONTRIBUTING.md. Use that guide for Python environment setup, local validation, branch hygiene, and pull request expectations.
Definers is licensed under the MIT License.
- Allowed: private use, commercial use, modification, redistribution, and deployment.
- Required: preservation of copyright and license notices.
- Not provided: warranty, liability coverage, or fitness guarantees.
See LICENSE for the full terms.
Definers is owned and maintained by Yaron Koresh. Contributions are welcome through issues and pull requests.