Private RAG Knowledge System
On-premise retrieval-augmented generation system for a professional services firm handling sensitive client documents.
The Challenge
A legal-adjacent services firm needed internal AI search over 10,000+ client documents but couldn't use cloud AI tools due to data sensitivity and compliance constraints.
The Approach
Assessed the firm's existing infrastructure and compliance constraints, then designed a fully air-gapped architecture using local models and vector storage: Ollama for model serving, PostgreSQL with pgvector for embedding storage.
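The chunking step can be sketched in a few lines. This is an illustrative toy, not the production code: the chunk size and overlap values are assumptions, and the deployed system used LangChain's splitters rather than this hand-rolled function.

```python
# Sketch of overlapping fixed-size chunking, the strategy used before
# embedding documents. Parameters (500 chars, 100 overlap) are
# illustrative assumptions, not the deployed system's settings.

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character chunks with overlap, so a
    sentence cut at one boundary still appears whole in a neighbor."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks

print(len(chunk_text("a" * 1200)))  # 3 chunks: 0-500, 400-900, 800-1200
```

The overlap trades some storage for recall: a fact that straddles a chunk boundary is still retrievable from the adjacent chunk.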
The Solution
Deployed a private RAG stack: Ollama running Mistral 7B, LangChain for document chunking and retrieval, PostgreSQL with pgvector for embeddings, and a clean React UI for staff queries. All running on their existing server infrastructure.
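The retrieval step reduces to nearest-neighbor search over chunk embeddings. A minimal in-memory sketch of that flow, under stated assumptions: in the deployed stack, embeddings come from a local model served by Ollama and the similarity search runs inside PostgreSQL via pgvector's distance operators; here, hand-written vectors and a plain Python list stand in for both.

```python
import math

# Toy stand-in for pgvector nearest-neighbor search: rank stored chunks
# by cosine similarity to a query embedding. Vectors here are invented
# 3-dim examples; real embeddings are model-generated and much larger.

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], index: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """index: list of (chunk_text, embedding) pairs; returns the k
    chunks most similar to the query embedding."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(reverse=True)
    return [text for _, text in scored[:k]]

index = [
    ("retention policy for client files", [0.9, 0.1, 0.0]),
    ("quarterly billing procedure",       [0.1, 0.9, 0.0]),
    ("data destruction schedule",         [0.8, 0.2, 0.1]),
]
print(top_k([1.0, 0.0, 0.0], index, k=2))
# → ['retention policy for client files', 'data destruction schedule']
```

The retrieved chunks are then passed to the local Mistral 7B model as context for the answer, which is the generation half of the RAG loop.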