Amente — Sebastjan Rijavec

Version 0.4.0 · Self-hosted AI knowledge assistant

What It Is

Amente is a self-hosted, privacy-first AI knowledge assistant that lets teams and individuals query their own documents using natural language. Users upload files, ask questions, and receive grounded answers with inline citations — all without sending sensitive data to a third-party cloud.

Where most AI tools require a subscription to a hosted service and trust that the provider won’t train on your data, Amente runs entirely on your own infrastructure. Every document, every query, every answer stays on your server.

The Problem It Solves

Knowledge workers spend hours hunting through PDFs, reports, and internal documents for answers that should take seconds to find. Existing AI search tools require sending that data to external services — a non-starter for confidential business documents, legal files, medical records, or any sensitive internal knowledge.

Amente removes that trade-off. It gives organizations access to AI-powered document search without the privacy compromise.

Key Capabilities

Document ingestion Upload PDF, Word, text, Markdown, and CSV files (up to 50 MB each). Documents are automatically chunked, embedded locally, and indexed. Drag-and-drop upload with real-time progress feedback.

Intelligent retrieval Hybrid search combines BM25 keyword matching with semantic vector similarity. This means queries work whether the user types exact terms from the document or paraphrases the concept in their own words. Results are filtered by relevance thresholds — no low-quality answers padded in to fill a response.

Grounded answers with citations Every answer references the source chunks it was built from. Users see inline citation numbers in the response text and can open a side panel to read the exact passages. No hallucinated facts without a paper trail.

Multi-turn conversation Amente maintains full conversation history. Users can pin messages to keep important context in scope across a long session. The system always includes pinned messages and the last 10 messages when generating each response.

Memory spaces Documents are organized into three scopes: Global (admin-managed, shared across all users), Permanent (per-user, persists indefinitely), and Temporary (per-user, bulk-clearable). This lets organizations share a common knowledge base while users maintain their own private context.

Multi-user with role separation JWT-based authentication with per-user data isolation. Admins manage users, reset passwords, and set default AI model configuration. Users can override model settings for their own sessions.

Personalization Three visual themes (Light, Dark, High Contrast) and six accent colors. Five chat background patterns. User dashboard for profile, password, and AI preferences.

Privacy and Deployment

Amente is designed to run on a single server behind Nginx. All computation — including document embeddings using the all-MiniLM-L6-v2 model — runs locally. The only external connection is to the user’s own LLM server (tested with LM Studio running on the local network).

There is no telemetry, no external API dependency for core functionality, and no data egress.

Deploy via Docker Compose in minutes using the included interactive install script.

Technology Foundation

Layer	Technology
Frontend	React 18, TypeScript, Tailwind CSS, Vite
Backend	FastAPI (Python), async, Server-Sent Events
Vector database	ChromaDB (embedded, persisted to disk)
Embeddings	`sentence-transformers/all-MiniLM-L6-v2` (local)
Keyword search	BM25 (rank-bm25)
LLM interface	OpenAI-compatible API (LM Studio)
Auth	JWT with access + refresh tokens, bcrypt
Infrastructure	Docker Compose, Nginx reverse proxy

Who It’s For

Teams with sensitive documents — legal, medical, financial, or proprietary — who cannot accept cloud data exposure
Self-hosted enthusiasts who want AI capabilities without recurring SaaS costs
Organizations with an existing local LLM server (LM Studio, Ollama, llama.cpp) who want a polished front-end for document Q&A
Knowledge workers who need a faster way to extract answers from a large document corpus

Quality and Testing

The retrieval pipeline is validated by an automated evaluation suite: 24 tests covering document ingestion, BM25 and semantic retrieval, end-to-end RAG pipeline correctness, and a golden dataset evaluated by both cosine similarity scoring and a local LLM judge. No external evaluation APIs are used — the same local model that powers user queries also grades its own retrieval quality.

Roadmap Signal

An analytics layer is planned: per-user LLM usage tracking, document retrieval frequency, and a metrics dashboard built on SQLite. The data model and KPIs are defined; implementation follows in a future version.

Summary

Amente is a production-ready RAG platform with a thoughtfully designed interface, robust hybrid search, and a commitment to full data sovereignty. It turns any collection of documents into a queryable knowledge base — without asking you to trust anyone else with the contents.