Admins scan GGUF folders, manage the model registry, load and stop local GGUF models, and save llama-server runtime defaults.
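Under the hood, loading a model comes down to spawning and supervising a llama-server process against a GGUF file. The sketch below is a minimal illustration of that idea, not the CE implementation; the executable path, model filename, port, and flags are assumptions chosen for the example.

```python
# Minimal sketch: spawn llama-server for one GGUF model and wait for readiness.
# Paths, port, and flags below are illustrative assumptions.
import subprocess
import time
import urllib.request

LLAMA_SERVER = "./llama-server"                 # configured executable (assumed path)
MODEL_PATH = "models/example-7b-q4_k_m.gguf"    # hypothetical GGUF file
PORT = 8080

proc = subprocess.Popen([
    LLAMA_SERVER,
    "-m", MODEL_PATH,      # GGUF model to serve
    "--port", str(PORT),   # local HTTP port
    "-ngl", "99",          # offload layers to the GPU when one is available
    "-c", "8192",          # context size
])

# Poll the health endpoint until the model is loaded (readiness-aware status).
for _ in range(120):
    try:
        with urllib.request.urlopen(f"http://127.0.0.1:{PORT}/health", timeout=2) as resp:
            if resp.status == 200:
                print("model loaded and ready")
                break
    except OSError:
        time.sleep(1)

# Stopping the model is just terminating the process:
# proc.terminate()
```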
💬 Chat-First Workflows
Signed-in users chat with the loaded main model, stream responses, stop generation, regenerate replies, edit prompts, and manage saved chats.
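Streaming like this maps onto llama-server's OpenAI-compatible HTTP API, which emits tokens as server-sent events. A minimal sketch follows, assuming the server is listening locally on port 8080; it is not the CE client code.

```python
# Minimal sketch: stream a chat reply from the loaded model via llama-server's
# OpenAI-compatible endpoint. URL, port, and payload are illustrative assumptions.
import json
import urllib.request

payload = {
    "messages": [{"role": "user", "content": "Explain GGUF in one sentence."}],
    "stream": True,                      # ask for server-sent events
}
req = urllib.request.Request(
    "http://127.0.0.1:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    for raw in resp:                     # one "data: {...}" line per token chunk
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":             # end-of-stream marker
            break
        delta = json.loads(data)["choices"][0]["delta"]
        print(delta.get("content", ""), end="", flush=True)
print()
```

Stopping generation from a client like this is essentially closing the connection before the end-of-stream marker arrives.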
📎 File-Aware Conversations
Attach supported text and code files directly to prompts, with server-side limits, chunking, context budgeting, and persistence in saved conversation turns.
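Chunking and context budgeting are easiest to picture with a toy example. The sketch below is a generic illustration only, with an assumed character-based chunk size and a rough 4-characters-per-token estimate; it is not how CE actually splits or budgets attachments.

```python
# Generic illustration of chunking an attached file under a context budget.
# The chunk size, budget, and chars-per-token estimate are assumptions.
def chunk_text(text: str, chunk_chars: int = 2000) -> list[str]:
    """Split an attachment into fixed-size character chunks."""
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]

def fit_to_budget(chunks: list[str], budget_tokens: int = 4000) -> list[str]:
    """Keep chunks until a rough token estimate would exceed the budget."""
    kept, used = [], 0
    for chunk in chunks:
        est = len(chunk) // 4          # crude chars-to-tokens heuristic
        if used + est > budget_tokens:
            break
        kept.append(chunk)
        used += est
    return kept

with open("notes.md", encoding="utf-8") as f:   # hypothetical attachment
    chunks = fit_to_budget(chunk_text(f.read()))
prompt_context = "\n\n".join(chunks)
```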
🖥️ Runtime Visibility
Watch live llama-server Logs, runtime status, and active processes, plus GPU Monitor data from local NVIDIA or AMD tools where available.
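Where GPU data is shown, it ultimately comes from local vendor tools. The sketch below illustrates the kind of query involved using NVIDIA's nvidia-smi; AMD systems expose similar data through rocm-smi. The field list and parsing are illustrative assumptions, not the CE implementation.

```python
# Minimal sketch of a local GPU telemetry query via nvidia-smi.
import shutil
import subprocess

def nvidia_gpu_stats() -> list[dict]:
    """Return per-GPU name, utilization %, and memory usage if nvidia-smi is present."""
    if shutil.which("nvidia-smi") is None:
        return []                       # no NVIDIA tooling available locally
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=name,utilization.gpu,memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    stats = []
    for line in out.strip().splitlines():
        name, util, mem_used, mem_total = [s.strip() for s in line.split(",")]
        stats.append({"name": name, "util_pct": int(util),
                      "mem_used_mib": int(mem_used), "mem_total_mib": int(mem_total)})
    return stats

print(nvidia_gpu_stats())
```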
📊 Analytics & Benchmarks
Review runtime Analytics and run admin-managed Benchmarks with editable prompts, live progress, best-run tracking, and result drill-downs.
⚙️ Installation & Settings
Bootstrap the Community Edition with first-run installation, runtime path settings, user administration, and local configuration.
What LLM Controller CE Includes
Self-hosted, local-first browser app for operating local GGUF models through a configured llama-server executable.
Admin-controlled GGUF discovery, registry state, llama-server loading/stopping, runtime settings, and readiness-aware model status.
Normal signed-in users chat with the currently loaded main model and manage their own chat history.
Streaming chat with Markdown, code blocks, math rendering, reasoning panels, attachments, regeneration, and prompt editing.
Runtime Visibility with live llama-server Logs, GPU Monitor data where local tools are available, active process visibility, Analytics, and Benchmarks.
Installation & Settings for first-run bootstrap, Windows launcher support, configurable Linux paths, user management, and administration.
No cloud service required.
About LLM Controller
LLM Controller CE is a local-first dashboard for running and managing Large Language Models on your own hardware.
It lets you launch, switch, and monitor models like Llama and DeepSeek with zero cloud dependencies, full privacy, and real-time insight into performance and GPU usage.
Built on llama-server, it automatically detects supported GPU runtimes and handles advanced multi-GPU and dual-model setups without manual configuration.
Key Features
Model Management: Scan, launch, stop, and switch models instantly.
Live Analytics: Logs, GPU telemetry, token throughput, and latency metrics.
Modern Chat UI: Streaming output, Markdown, code blocks, math rendering, and chat titles.
Local & Private: 100% self-hosted. No cloud, no data sharing.
Actively Developed: Built to evolve with new models and features.