HiveNode

HiveNode is the worker component of the Hive system. It connects to a central HiveCore proxy and runs local inference using Ollama. By running HiveNode on any machine (on-premise, cloud, or behind firewalls), you can join that machine’s compute resources to the HiveCore network and serve requests routed by the central proxy.

Table of Contents

  1. Overview
  2. Key Features
  3. Installation & Setup
  4. Configuration
  5. Running
  6. How it Works
  7. Logging & Monitoring
  8. Contributing
  9. License

1. Overview

In the Hive architecture:

  • HiveCore serves as the central proxy and gateway, managing queues and distributing inference requests.
  • HiveNode runs on worker machines. It connects out to HiveCore (so the worker does not need to be publicly accessible). Once connected, HiveNode polls HiveCore for inference jobs and uses its local Ollama server to process them.

This design allows multiple machines, possibly spread across different networks, to operate as a single, unified inference cluster.

2. Key Features

  • Docker-First Ollama Runtime

    HiveNode primarily manages an ollama/ollama Docker container itself, including startup and in-place upgrades.

  • Bring Your Own Ollama

    If you already run Ollama yourself, HiveNode can target an external Ollama URL instead of managing Docker.

  • Multiple Concurrent Connections

    Each HiveNode can open several parallel connections to HiveCore, letting you scale inference throughput per worker.

  • Centralized Configuration & Scaling

    Workers require minimal configuration—just point them to HiveCore and set a valid key.

  • Extensible Logging

    Built-in InfluxDB logging for GPU usage (via NVML) and system metrics, enabled when the Influx environment variables are set.

3. Installation & Setup

  1. Prerequisites

    • Rust toolchain (for building HiveNode)
    • Either Docker access for the default managed mode, or an existing reachable Ollama instance for external mode.
    • A valid Worker key generated by HiveCore. (See HiveCore’s management endpoints for instructions.)
  2. Clone the repository

    git clone https://github.com/VakeDomen/HiveNode.git
    cd HiveNode
  3. Configure

    • Rename .env.sample to .env
    mv .env.sample .env
  4. Run

    cargo run --release

4. Configuration

A sample .env for the default Docker-managed mode might look like:

# The address and port of HiveCore’s node connection server (default 7777 in HiveCore).
HIVE_CORE_URL=hivecore.example.com:7777

# Worker key provided by HiveCore admin. Must have "Worker" role.
HIVE_KEY=my-secret-key

# docker (default) or external
OLLAMA_MODE=docker

# Docker-managed Ollama settings
OLLAMA_PORT=11434
HIVE_OLLAMA_MODELS=/usr/share/ollama/.ollama/
GPU_PASSTHROUGH=-1

# Number of parallel connections to open to HiveCore. Best to match Ollama configuration.
CONCURRENT_REQUESTS=4

# (Optional) InfluxDB settings for logging
INFLUX_HOST=http://localhost:8086
INFLUX_ORG=MY_ORG
INFLUX_TOKEN=my-token

A sample .env for bring-your-own Ollama mode:

HIVE_CORE_URL=hivecore.example.com:7777
HIVE_KEY=my-secret-key
OLLAMA_MODE=external
OLLAMA_URL=http://localhost:11434
CONCURRENT_REQUESTS=4

  • HIVE_CORE_URL: Where HiveNode connects to HiveCore (must match HiveCore’s NODE_CONNECTION_PORT, by default 7777).
  • HIVE_KEY: The Worker key from HiveCore’s admin interface. Required for authentication.
  • OLLAMA_MODE: docker by default. Set external to use an existing Ollama instance instead of Docker-managed Ollama.
  • OLLAMA_PORT: Host port for the Docker-managed Ollama container. Required in docker mode.
  • HIVE_OLLAMA_MODELS: Host directory mounted into the Docker-managed Ollama container for model storage. Required in docker mode.
  • GPU_PASSTHROUGH: Optional GPU selection for Docker mode. Use -1 for all GPUs, a comma-separated list such as 0,1 for specific GPUs, or leave unset for CPU mode.
  • OLLAMA_URL: Required only in external mode. The local or remote address of the Ollama service.
  • CONCURRENT_REQUESTS: Sets how many parallel connections (and thus concurrent tasks) this HiveNode should proxy. Adjust based on your hardware resources and Ollama configuration.
  • INFLUX_*: (Optional) If configured, HiveNode will record logs and GPU usage metrics to InfluxDB. If not provided, it simply won’t log to Influx.
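Before starting HiveNode it can be useful to sanity-check the values you put in .env. The snippet below is a sketch, not part of the repository; the hostnames are the placeholder examples from above, and the Ollama check applies only to external mode (docker mode starts its own container):

```shell
# Pre-flight checks for the .env values (hostnames are examples).
HIVE_CORE_URL=hivecore.example.com:7777
OLLAMA_URL=http://localhost:11434

# Split HIVE_CORE_URL into host and port.
HOST=${HIVE_CORE_URL%:*}
PORT=${HIVE_CORE_URL##*:}
echo "will connect to HiveCore at $HOST on port $PORT"

# In external mode, Ollama should already answer its version endpoint.
if command -v curl >/dev/null && curl -sf --max-time 5 "$OLLAMA_URL/api/version" >/dev/null; then
  echo "Ollama is up"
else
  echo "Ollama is not reachable at $OLLAMA_URL"
fi
```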

Ollama setup

Docker-managed mode is the primary path. In this mode HiveNode will pull or reuse ollama/ollama, bind it to OLLAMA_PORT, mount HIVE_OLLAMA_MODELS, and internally set OLLAMA_URL to that local container.
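Conceptually, the managed container corresponds to a plain docker run of ollama/ollama. The dry-run sketch below prints such a command; the exact flags HiveNode uses are not documented here, so the mapping from GPU_PASSTHROUGH to --gpus is an assumption based on the description above:

```shell
# Dry-run sketch of what docker mode roughly corresponds to
# (flag mapping from GPU_PASSTHROUGH is an assumption).
OLLAMA_PORT=11434
HIVE_OLLAMA_MODELS=/usr/share/ollama/.ollama/
GPU_PASSTHROUGH=-1

if [ "$GPU_PASSTHROUGH" = "-1" ]; then
  GPU_FLAG="--gpus all"                          # all GPUs
elif [ -n "$GPU_PASSTHROUGH" ]; then
  GPU_FLAG="--gpus \"device=$GPU_PASSTHROUGH\""  # e.g. 0,1
else
  GPU_FLAG=""                                    # CPU mode
fi

echo docker run -d --name hive-ollama \
  -p "$OLLAMA_PORT:11434" \
  -v "$HIVE_OLLAMA_MODELS:/root/.ollama" \
  $GPU_FLAG \
  ollama/ollama
```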

If you prefer to manage Ollama yourself, set OLLAMA_MODE=external and provide OLLAMA_URL.

If you want to prepare an Ollama host manually, or experiment with multiple instances and GPU layouts, you can use the provided setup_ollama.sh helper script.

This script:

  • Checks whether netstat and curl are installed (and installs them if missing).
  • Installs Ollama if it is not already present.
  • Lets you pick how many Ollama instances to run and how to assign GPUs to each instance.
  • Finds a free port for each Ollama instance, starts it, and logs its output.

chmod +x setup_ollama.sh
./setup_ollama.sh

5. Running

After configuring the .env file, run:

cargo run --release

Or compile and execute the binary directly:

cargo build --release
./target/release/hive_node
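To keep a worker running across reboots, one option is a systemd service. The unit below is a hypothetical sketch: the install path, user, and the assumption that HiveNode reads .env from its working directory are not specified by this repository, so adjust them to your setup.

```ini
# Hypothetical unit file: /etc/systemd/system/hive-node.service
[Unit]
Description=HiveNode worker
After=network-online.target

[Service]
# Assumed install path; .env is expected in the working directory.
WorkingDirectory=/opt/HiveNode
ExecStart=/opt/HiveNode/target/release/hive_node
Restart=on-failure

[Install]
WantedBy=multi-user.target
```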

6. How it Works

  1. Authentication
    • On startup, HiveNode initializes Ollama according to OLLAMA_MODE.
    • Each of the CONCURRENT_REQUESTS worker threads tries to authenticate to HiveCore using the key in HIVE_KEY.
    • Upon successful auth, HiveNode advertises its versions and its supported models to HiveCore.
  2. Polling & Proxying
    • HiveNode periodically polls HiveCore for incoming tasks. If HiveCore’s queue has work for a given model, it dispatches it to the node.
    • HiveNode forwards the request to OLLAMA_URL for local inference (via the Ollama HTTP API), then streams the response back to HiveCore.
  3. Reconnection & Control
    • If the connection drops or an error occurs, HiveNode waits briefly, then reconnects.
    • HiveCore can issue commands like REBOOT or SHUTDOWN, which HiveNode listens for in the incoming messages.
    • UPDATE is supported in Docker-managed mode and causes HiveNode to refresh the Docker image and reconnect.
  4. Scaling
    • To allow more capacity on the same machine, increase the CONCURRENT_REQUESTS count.
    • To add more workers across multiple machines, simply run additional HiveNode instances (each with its own .env and valid Worker key).
    • Hive supports having multiple workers on the same machine, but each worker should have its own token.
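One way to run several workers on one machine is to give each its own directory with its own .env and Worker key. The layout and key names below are illustrative assumptions, not part of the repository:

```shell
# Sketch: one directory per local worker, each with its own .env and key.
set -e
for name in worker-a worker-b; do
  mkdir -p "hive-workers/$name"
  cat > "hive-workers/$name/.env" <<EOF
HIVE_CORE_URL=hivecore.example.com:7777
HIVE_KEY=key-for-$name
OLLAMA_MODE=external
OLLAMA_URL=http://localhost:11434
CONCURRENT_REQUESTS=2
EOF
done
# Then start each instance from its own directory, e.g.:
#   (cd hive-workers/worker-a && /path/to/hive_node) &
```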

7. Logging & Monitoring

HiveNode can log system metrics like GPU usage, memory, and proxied requests to InfluxDB:

  • Enable Influx Logging: Provide INFLUX_HOST, INFLUX_ORG, and INFLUX_TOKEN in the .env.
  • GPU Metrics: HiveNode uses NVML to gather GPU info. This is only collected if an NVIDIA GPU is present and NVML is available on the system.
  • Request Streaming: All inference requests and responses can be logged with success/error tags.

These metrics are pushed to Influx in the background. If any of the Influx environment variables are missing or invalid, HiveNode just skips that monitoring.
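Once metrics are flowing, they can be inspected with a Flux query in the InfluxDB UI or CLI. The bucket and measurement names below are placeholders, since this README does not document what HiveNode actually writes; check your Influx organization for the real names:

```
// Hypothetical query: last hour of GPU metrics (names are placeholders).
from(bucket: "hive")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "gpu_usage")
```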

8. Contributing

We welcome pull requests! Before submitting, please open an issue to discuss your proposed changes. Make sure to:

  • Keep code style consistent.
  • Update documentation if adding or changing features.

9. License

HiveNode is distributed under the MIT License, just like HiveCore. See the LICENSE file in this repository for details.
