What is RAG-DocBot?
RAG-DocBot is a self-hosted, on-premise Retrieval-Augmented Generation (RAG) backend designed for companies that want to run an internal, privacy-preserving AI assistant on their own infrastructure.
RAG-DocBot provides the backend service and tooling; a separate UI project provides the frontend. Together they form a complete document-grounded chat assistant that runs entirely on your own servers.

Vision
RAG-DocBot is built around the principle that your data should never leave your infrastructure:
- Full data ownership — no SaaS dependency, no third-party data processing
- On-prem deployment — including air-gapped environments with no outbound internet
- Configurable for CPU and GPU — runs on commodity hardware as well as GPU-accelerated servers
- Transparent pipeline — every step from document ingestion to answer generation is visible and controllable
- Scales from small teams to enterprise — license-gated tiers to match your workload

Key Capabilities

| Capability | Description |
|---|---|
| On-premise by design | All data stays on your infrastructure. No telemetry, no external API calls. |
| Source connectors | Ingest documents via file upload, GitHub, Slack, or local directories. |
| Async job system | Document ingestion and indexing run as background jobs, so they do not block the API. |
| Persistent storage | PostgreSQL for application data, Redis for live job state, Qdrant for the vector index. |
| JWT auth & RBAC | Role-based access control with viewer, editor, and admin roles. |
| License-gated plan limits | FREE, PRO, and ENTERPRISE tiers with configurable document and storage limits. |
| CPU & GPU support | Works on standard CPU servers and CUDA-capable GPU servers. |
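
To make the role model in the table concrete, here is a minimal sketch of a role check. It is not RAG-DocBot's actual implementation: the role ordering (viewer < editor < admin), the action names, the `REQUIRED_ROLE` mapping, and the `is_allowed` helper are all assumptions made for illustration. Only the three role names come from the table above.

```python
from enum import IntEnum


class Role(IntEnum):
    """Role names from the capabilities table; the ordering is an assumption."""
    VIEWER = 1
    EDITOR = 2
    ADMIN = 3


# Hypothetical mapping from an action to the minimum role allowed to perform it.
REQUIRED_ROLE = {
    "read_document": Role.VIEWER,
    "upload_document": Role.EDITOR,
    "manage_users": Role.ADMIN,
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if `role` meets the minimum role for `action`.

    Actions missing from REQUIRED_ROLE are denied by default.
    """
    required = REQUIRED_ROLE.get(action)
    return required is not None and role >= required
```

In a sketch like this, the role would typically be read from a claim in the verified JWT before calling `is_allowed`. Denying unknown actions by default keeps a newly added endpoint from being open to every role by accident.
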

Current Status
RAG-DocBot is at v1.0.0 and under active development. See the Changelog for the full release history.