Self-hosted, local-first browser app for operating local GGUF models through a configured llama-server executable.
Admins control model discovery, registry state, loading, stopping, runtime settings, Benchmarks, users, and Installation & Settings.
Normal signed-in users chat with the currently loaded main model and manage their own chat sessions, imports, exports, and account settings.
Streaming chat with Markdown, code, math, reasoning display, stop, regenerate, latest-turn variants, and prompt editing workflows.
Single-file and split GGUF support, CPU-only loading, GPU offload settings, tensor-split multi-GPU launch support where the local runtime allows it, Windows launcher support, and Linux path configurability.
Runtime Visibility with live llama-server Logs, GPU Monitor data through local NVIDIA/AMD tools where available, Analytics, and active process visibility.
Admin-only CE Benchmarks with an editable five-question prompt set, eligibility controls, live progress, best-run tracking, and result drill-downs.
First-run installer for bootstrap configuration, MySQL schema/seed import, initial admin creation, and runtime default review.
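As a rough illustration of the model-loading options listed above (single-file and split GGUF, CPU-only loading, GPU offload, tensor split), the sketch below assembles a llama-server command line. It is not LLM Controller's actual launch code. The helper names `first_shard` and `build_args` are hypothetical. It assumes the upstream llama.cpp flags `-m`, `--host`, `--port`, `--n-gpu-layers`, and `--tensor-split`, and the `-00001-of-0000N.gguf` shard naming produced by llama.cpp's split tool.

```python
import re

# Assumed shard naming from llama.cpp's split tool: "name-00001-of-00003.gguf".
SPLIT_RE = re.compile(r"-(\d{5})-of-(\d{5})\.gguf$")


def first_shard(path):
    """Return the path to pass to -m; for a split model, point at shard 1."""
    match = SPLIT_RE.search(path)
    if match is None:
        return path  # single-file GGUF, use as-is
    return path[: match.start()] + f"-00001-of-{match.group(2)}.gguf"


def build_args(server, model, gpu_layers=0, tensor_split=None, port=8080):
    """Build an argv list for llama-server (hypothetical helper).

    gpu_layers=0 requests CPU-only loading; tensor_split is a list of
    per-GPU proportions and is only passed when GPU offload is enabled.
    """
    args = [
        server,
        "-m", first_shard(model),
        "--host", "127.0.0.1",  # keep the server local-only
        "--port", str(port),
        "--n-gpu-layers", str(gpu_layers),
    ]
    if tensor_split and gpu_layers > 0:
        args += ["--tensor-split", ",".join(str(x) for x in tensor_split)]
    return args
```

The resulting list can be handed to `subprocess.Popen` without a shell, which avoids quoting problems with Windows and Linux model paths that contain spaces.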
LLM Controller CE Roadmap
Image understanding and image generation are on the CE roadmap, not current v1.0 capabilities.
API support is on the CE roadmap so local workflows can be reached programmatically without changing the local-first posture.
Benchmarks and Runtime Visibility are expected to expand beyond the current CE baseline.
No public timeline is implied here; CE roadmap items are direction, not release promises.
LLM Controller Pro Direction
The LLM Controller Pro direction covers multi-node orchestration, distributed inference across clusters, advanced GPU scheduling, user quota management, analytics dashboards, and high-throughput serving of GGUF models at scale.
It also covers a centralized model registry, role-based access control, automated benchmark pipelines, and integrated monitoring of CPU, GPU, and RAM utilization.
About LLM Controller
LLM Controller CE is a local-first dashboard for running and managing Large Language Models on your own hardware.
It lets you launch, switch, and monitor models like Llama and DeepSeek with zero cloud dependencies, full privacy, and real-time insight into performance and GPU usage.
Built on llama-server, it automatically detects supported GPU runtimes and handles multi-GPU and dual-model setups without manual configuration.
Key Features
Model Management: Scan, launch, stop, and switch models instantly.
Live Analytics: Logs, GPU telemetry, token throughput, and latency metrics.
Modern Chat UI: Streaming output, Markdown, code, math, and titles.
Local & Private: 100% self-hosted. No cloud, no data sharing.
Actively Developed: Built to evolve with new models and features.
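For readers curious how GPU telemetry of this kind can be collected locally, here is a minimal sketch that reads utilization and memory figures from `nvidia-smi`. It does not describe LLM Controller's own implementation. The function names are hypothetical, only the NVIDIA path is shown, and it assumes `nvidia-smi` is on the PATH and supports the `--query-gpu` and `--format=csv,noheader,nounits` options.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi, in this order.
QUERY = "index,utilization.gpu,memory.used,memory.total"


def parse_gpu_csv(text):
    """Parse 'index, util, used, total' CSV rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 4:
            continue  # skip malformed rows
        try:
            index, util, used, total = (int(p) for p in parts)
        except ValueError:
            continue  # skip rows with non-numeric fields
        rows.append({
            "index": index,
            "util_percent": util,
            "mem_used_mib": used,
            "mem_total_mib": total,
        })
    return rows


def read_gpus():
    """Return current GPU stats, or an empty list if nvidia-smi is unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    result = subprocess.run(
        [exe, f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, timeout=5,
    )
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []
```

Keeping the parser separate from the subprocess call means it can be exercised with sample text on machines that have no GPU, and a monitor can degrade to an empty reading where the tool is not installed.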