AnythingLLM is the all-in-one private AI workspace — chat, document RAG, multi-user workspaces, and agents — with 54,000+ GitHub stars. Point it at ARK and every chunk, embedding, and retrieval call runs inside the EU with zero retention by default.
AnythingLLM, built by Mintplex Labs, is a self-hosted AI workspace that combines chat, document RAG, and agent capabilities in one app. Upload PDFs, Word docs, text files, or entire websites; AnythingLLM chunks, embeds, and indexes them automatically. Then any chat or agent can query those documents with proper citations.
It runs as a desktop app or a Docker container, with multi-user role-based access, workspace-isolated knowledge, and pluggable LLM and embedding backends. MIT-licensed, free to self-host, and backed by a 54,000+-star community as of March 2026.
On ARK, AnythingLLM's workspace-centric RAG design pairs cleanly with ARK's embedding and guardrail models served through the same OpenAI-compatible API — so retrieval, generation, and safety all run in one governed, EU-resident stack.
Document RAG surfaces your most sensitive content to an LLM. ARK makes sure that content never leaves the region, never gets retained, and never gets reused for training.
Every embedding request, every chat completion, every reranker call routes through an EU-hosted inference gateway. Zero retention is the default, not an add-on.
ARK serves chat models, embedding models (BGE, Qwen3-Embedding, E5-Mistral), and guardrail models (Llama-Guard-3) through the same API — clean architecture, single audit trail.
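Because everything speaks the same OpenAI-compatible dialect, the three model classes differ only in their request bodies, not in how you reach them. A minimal sketch of those bodies (model names taken from this page; the exact guardrail model id is an assumption — check your ARK model list):

```python
# One gateway, three model classes — only the payload changes.
ARK_BASE = "https://api.ark-labs.cloud/api/v1"

def chat_body(prompt: str, model: str = "llama-3.3-70b-instruct") -> dict:
    """Body for POST {ARK_BASE}/chat/completions."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def embed_body(texts: list[str], model: str = "bge-multilingual-gemma2") -> dict:
    """Body for POST {ARK_BASE}/embeddings."""
    return {"model": model, "input": texts}

def guard_body(text: str, model: str = "llama-guard-3") -> dict:
    """Guardrail models are prompted like any chat model (model id assumed)."""
    return {"model": model, "messages": [{"role": "user", "content": text}]}
```

Since chat, embeddings, and guardrails share one base URL and one API key, they also share one audit trail — the "clean architecture" point above in practice.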
ARK's runtime delivers sub-second TTFT under load and 100M+ tokens/min throughput — so a team-wide knowledge assistant stays snappy even when everyone queries at once.
In AnythingLLM's settings, pick Generic OpenAI as the LLM provider, set the base URL to ARK's endpoint, paste your API key, and choose a model. Same flow for the embeddings provider.
Your workspaces, users, and document libraries stay where they are — only the inference traffic changes addresses.
AnythingLLM setup docs →

```yaml
# Point AnythingLLM at the ARK gateway:
environment:
  LLM_PROVIDER: "generic-openai"
  GENERIC_OPENAI_BASE_PATH: "https://api.ark-labs.cloud/api/v1"
  GENERIC_OPENAI_API_KEY: "ark_sk_live_..."
  GENERIC_OPENAI_MODEL_PREF: "llama-3.3-70b-instruct"
  # Same gateway for embeddings:
  EMBEDDING_ENGINE: "generic-openai"
  EMBEDDING_BASE_PATH: "https://api.ark-labs.cloud/api/v1"
  EMBEDDING_MODEL_PREF: "bge-multilingual-gemma2"
# → Every chat and embedding runs EU, zero retention.
```
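Before restarting AnythingLLM, it can be worth confirming the gateway is reachable and your key is valid. A sketch using only the standard library, assuming ARK exposes the usual OpenAI-style `/models` listing:

```python
import json
import urllib.request

ARK_BASE = "https://api.ark-labs.cloud/api/v1"

def list_models(api_key: str) -> list[str]:
    """GET {ARK_BASE}/models and return the available model ids."""
    req = urllib.request.Request(
        f"{ARK_BASE}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    # OpenAI-style listings wrap models in a "data" array of {"id": ...}
    return [m["id"] for m in data.get("data", [])]
```

If the model you set in `GENERIC_OPENAI_MODEL_PREF` appears in this list, AnythingLLM's chats and embeddings will resolve against ARK as configured.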
One workspace per department. HR policies, engineering wiki, legal templates — every query returns citations, every byte stays in the EU.
Drop in a folder of contracts, then ask "what are our liability caps?" or "summarize every termination clause" — the answers link back to the exact paragraph.
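The same contract questions can be scripted: AnythingLLM exposes a developer API for workspace chat. A sketch, assuming the documented `POST /api/v1/workspace/{slug}/chat` endpoint — the host, workspace slug, and response fields here are illustrative, so verify them against your instance's API docs:

```python
import json
import urllib.request

ANYTHINGLLM_HOST = "http://localhost:3001"  # your AnythingLLM instance (assumed)
WORKSPACE_SLUG = "legal-contracts"          # hypothetical workspace slug

def ask_workspace(question: str, api_key: str) -> dict:
    """Send a query-mode chat to a workspace; the reply carries citations."""
    body = json.dumps({"message": question, "mode": "query"}).encode()
    req = urllib.request.Request(
        f"{ANYTHINGLLM_HOST}/api/v1/workspace/{WORKSPACE_SLUG}/chat",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        # Response includes the answer text plus the source chunks it cites.
        return json.load(resp)
```

With the gateway pointed at ARK, each such call embeds the query and generates the answer EU-side, while the citations resolve to the documents already stored in your workspace.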
Ingest your product docs, changelogs, and past tickets. Give support agents a private assistant that answers before they escalate.
Stand up a citation-grade knowledge assistant over your documents in an afternoon. Free credits on ARK Cloud — or self-hosted on ARK Core / Tailored.