Ollama · HuggingFace · GitHub — Unified
Reach is a Rust-native universal LLM model manager. It connects to three model platforms, handles acquiring and running models from each, and provides a single consistent API — regardless of where the model lives or how it runs. Self-hosted. No cloud dependency. No Python required.
Reach treats each platform according to what it actually is — not a one-size-fits-all abstraction.
Models already running locally. Reach calls Ollama's existing API to submit prompts and stream responses. No downloading, no loading — the model is already live.
Download GGUF and SafeTensors files from HuggingFace repos. Local storage with integrity verification via hash check after every download.
Clone GitHub repos, parse model configs, and resolve weight locations. This is the most complex source, because Reach must understand the repo structure before running anything.
Caller → Unified API → Platform Router → Model Registry → Cleanup
What Reach actually does — no more, no less.
The Platform Router inspects the incoming request, determines which platform the model lives on, and routes accordingly. The caller never needs to know where a model comes from.
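The routing step can be sketched in a few lines of Rust. This is a minimal illustration, not Reach's actual implementation: the `<prefix>:<name>` identifier scheme, the prefixes `ollama`, `hf`, and `github`, and the names `Platform`, `Route`, and `route` are all assumptions made for the example.

```rust
/// The three platforms named in this document.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Platform {
    Ollama,
    HuggingFace,
    GitHub,
}

/// A routed request: the target platform plus the platform-specific model name.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Route {
    pub platform: Platform,
    pub model: String,
}

/// Hypothetical routing rule: model references look like `<prefix>:<name>`.
/// Only the first colon separates prefix from name, so a reference such as
/// `ollama:llama3:8b` keeps its tag intact.
pub fn route(model_ref: &str) -> Result<Route, String> {
    let (prefix, name) = model_ref
        .split_once(':')
        .ok_or_else(|| format!("missing platform prefix in '{}'", model_ref))?;
    if name.is_empty() {
        return Err(format!("missing model name in '{}'", model_ref));
    }
    let platform = match prefix {
        "ollama" => Platform::Ollama,
        "hf" => Platform::HuggingFace,
        "github" => Platform::GitHub,
        other => return Err(format!("unknown platform prefix '{}'", other)),
    };
    Ok(Route {
        platform,
        model: name.to_string(),
    })
}
```

With this scheme, `route("hf:org/repo")` yields a `Route` for `Platform::HuggingFace`, and a reference with no prefix or an unknown prefix returns an error rather than guessing a platform.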
The Model Registry tracks every model Reach has acquired: source, disk location, size, format, last access time, and usage count. It is persisted to ~/.reach/registry.json.
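As a rough sketch of what a registry entry and its bookkeeping might look like, the following uses only the standard library. The type and method names (`ModelEntry`, `Registry`, `register`, `record_access`) are hypothetical, and persistence to the JSON file is deliberately left out because the on-disk schema is not described here.

```rust
use std::collections::HashMap;
use std::path::PathBuf;
use std::time::SystemTime;

/// One tracked model, mirroring the fields listed above (names are assumed).
#[derive(Debug, Clone)]
pub struct ModelEntry {
    pub source: String,
    pub path: PathBuf,
    pub size_bytes: u64,
    pub format: String,
    pub last_access: SystemTime,
    pub usage_count: u64,
}

/// In-memory registry keyed by model name. Saving and loading are omitted.
#[derive(Debug, Default)]
pub struct Registry {
    entries: HashMap<String, ModelEntry>,
}

impl Registry {
    /// Adds or replaces an entry for a newly acquired model.
    pub fn register(
        &mut self,
        name: &str,
        source: &str,
        path: PathBuf,
        size_bytes: u64,
        format: &str,
        now: SystemTime,
    ) {
        let entry = ModelEntry {
            source: source.to_string(),
            path,
            size_bytes,
            format: format.to_string(),
            last_access: now,
            usage_count: 0,
        };
        self.entries.insert(name.to_string(), entry);
    }

    /// Updates last access time and usage count; returns false if unknown.
    pub fn record_access(&mut self, name: &str, now: SystemTime) -> bool {
        match self.entries.get_mut(name) {
            Some(entry) => {
                entry.last_access = now;
                entry.usage_count += 1;
                true
            }
            None => false,
        }
    }

    /// Looks up an entry by model name.
    pub fn get(&self, name: &str) -> Option<&ModelEntry> {
        self.entries.get(name)
    }
}
```

Passing `now` in explicitly, rather than calling `SystemTime::now()` inside the methods, keeps the bookkeeping deterministic and easy to test.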
Cleanup scans for stale models (default: 30 days) and reports the disk space that would be reclaimed. It always dry-runs first; deletion requires explicit confirmation or --force.
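The dry-run-first behavior can be illustrated with a small in-memory sketch. Everything named here (`StoredModel`, `CleanupReport`, `cleanup`, and the `confirm` flag standing in for explicit confirmation or `--force`) is an assumption for the example; a real cleanup would also remove files from disk, which this sketch does not attempt.

```rust
use std::time::{Duration, SystemTime};

/// Default staleness threshold from the text above: 30 days.
pub const DEFAULT_STALE_AFTER: Duration = Duration::from_secs(30 * 24 * 60 * 60);

/// Minimal view of a tracked model for cleanup purposes (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StoredModel {
    pub name: String,
    pub size_bytes: u64,
    pub last_access: SystemTime,
}

/// What a cleanup pass found, and whether anything was actually removed.
#[derive(Debug, PartialEq, Eq)]
pub struct CleanupReport {
    pub stale: Vec<String>,
    pub reclaimable_bytes: u64,
    pub deleted: bool,
}

/// Scans for models not accessed within `stale_after`.
/// With `confirm == false` this is a dry run: `models` is left untouched.
/// With `confirm == true` stale entries are dropped from the list.
pub fn cleanup(
    models: &mut Vec<StoredModel>,
    now: SystemTime,
    stale_after: Duration,
    confirm: bool,
) -> CleanupReport {
    // A model whose last access lies in the future is never treated as stale.
    let is_stale = |m: &StoredModel| {
        now.duration_since(m.last_access)
            .map(|age| age >= stale_after)
            .unwrap_or(false)
    };

    let stale: Vec<String> = models
        .iter()
        .filter(|m| is_stale(*m))
        .map(|m| m.name.clone())
        .collect();
    let reclaimable_bytes: u64 = models
        .iter()
        .filter(|m| is_stale(*m))
        .map(|m| m.size_bytes)
        .sum();

    if confirm {
        models.retain(|m| !is_stale(m));
    }

    CleanupReport {
        stale,
        reclaimable_bytes,
        deleted: confirm,
    }
}
```

A caller would typically run `cleanup` once with `confirm` set to `false`, show the report, and only call it again with `confirm` set to `true` after the user agrees.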
When Dash is present, model execution routes through Dash for CPU acceleration automatically. No configuration needed — Reach detects and uses it.
Optional REST API exposing Reach's full capability. OpenAI-compatible chat endpoint plus Reach-specific model management endpoints for non-Rust systems.
Import directly as a Rust crate. Full async support. No Python, no wrappers, no cloud dependency. Self-contained binary.
Clear boundaries. No feature creep.