What is RAG-DocBot?

RAG-DocBot is a self-hosted, on-premise Retrieval-Augmented Generation (RAG) backend designed for companies that want to run an internal, privacy-preserving AI assistant on their own infrastructure.

RAG-DocBot provides the backend service and its tooling; a separate UI project provides the frontend. Together they form a complete document-grounded chat assistant that runs entirely on your own servers.

[Figure: RAG Architecture]

Vision

RAG-DocBot is built around the principle that your data should never leave your infrastructure:

  • Full data ownership — no SaaS dependency, no third-party data processing
  • On-prem deployment — including air-gapped environments with no outbound internet
  • Configurable for CPU and GPU — runs on commodity hardware as well as GPU-accelerated servers
  • Transparent pipeline — every step from document ingestion to answer generation is visible and controllable
  • Scales from small teams to enterprise — license-gated tiers to match your workload

Key Capabilities

| Capability | Description |
|---|---|
| On-premise by design | All data stays on your infrastructure. No telemetry, no external API calls. |
| Source connectors | Ingest documents via file upload, GitHub, Slack, or local directories. |
| Async job system | Document ingestion and indexing run as background jobs, without blocking the API. |
| Persistent storage | PostgreSQL for application data, Redis for live job state, Qdrant for the vector index. |
| JWT auth & RBAC | Role-based access control with viewer, editor, and admin roles. |
| License-gated plan limits | FREE, PRO, and ENTERPRISE tiers with configurable document and storage limits. |
| CPU & GPU support | Works on standard CPU servers and CUDA-capable GPU servers. |
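
To make the role-based access control row more concrete, here is a minimal Python sketch of how a three-role check could work. Only the role names (viewer, editor, admin) come from the table above. The hierarchical ordering of the roles, the action names, and the `is_allowed` helper are assumptions made for illustration, not RAG-DocBot's actual API.

```python
from enum import IntEnum


class Role(IntEnum):
    """Roles named in the capabilities table.

    Assumption: roles are hierarchical, so a higher value includes
    everything a lower value may do.
    """
    VIEWER = 1
    EDITOR = 2
    ADMIN = 3


# Hypothetical action names, each mapped to the minimum role assumed
# to be required. These are illustrative, not taken from RAG-DocBot.
REQUIRED_ROLE = {
    "read_document": Role.VIEWER,
    "upload_document": Role.EDITOR,
    "manage_users": Role.ADMIN,
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if `role` meets the minimum role for `action`.

    Unknown actions are denied by default.
    """
    required = REQUIRED_ROLE.get(action)
    return required is not None and role >= required
```

Under these assumptions, `is_allowed(Role.EDITOR, "read_document")` is true and `is_allowed(Role.VIEWER, "upload_document")` is false. Denying unknown actions by default keeps a misspelled or newly added action from being silently permitted.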

Current Status

RAG-DocBot is at v1.0.0 and under active development. See the Changelog for the full release history.