LangChain FAISS GPT-4 FastAPI Streamlit BM25

RAG Document Q&A

Private document Q&A with hybrid retrieval, citation overlay, and multi-turn conversation.

Overview

What is this project?

Upload any set of PDFs or text documents and ask questions in plain English. The system chunks documents with a recursive character splitter, embeds each chunk with OpenAI text-embedding-3-small, and stores the vectors in a FAISS index on disk. At query time, a custom LangChain RetrievalQA chain performs MMR (Maximum Marginal Relevance) retrieval, then a cross-encoder re-ranking step surfaces the most relevant context before it is passed to GPT-4. A windowed conversation buffer keeps recent turns in memory, so multi-turn follow-up questions can reference prior answers.
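To make the MMR step concrete, here is a minimal sketch in plain Python. It is not the project's code: the function names (`cosine`, `mmr_select`), the `lambda_mult` parameter name, and the toy 2-D vectors are assumptions chosen for illustration. In the real pipeline the vectors would come from the embedding model and the FAISS index. The idea is to pick chunks one at a time, rewarding similarity to the query and penalizing similarity to chunks already picked.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query_vec, chunk_vecs, k=3, lambda_mult=0.5):
    """Return indices of up to k chunks chosen by Maximum Marginal Relevance.

    lambda_mult = 1.0 ranks purely by relevance to the query;
    lower values penalize redundancy with already-selected chunks more heavily.
    """
    relevance = [cosine(query_vec, v) for v in chunk_vecs]
    candidates = list(range(len(chunk_vecs)))
    selected = []

    while candidates and len(selected) < k:

        def mmr_score(i):
            # Highest similarity to anything already selected (0.0 on the first pick).
            redundancy = max(
                (cosine(chunk_vecs[i], chunk_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)

    return selected


# Toy data (hypothetical): chunk 1 is a near-duplicate of chunk 0.
query = [1.0, 0.0]
chunks = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]

relevance_only = mmr_select(query, chunks, k=2, lambda_mult=1.0)
diverse = mmr_select(query, chunks, k=2, lambda_mult=0.3)
```

With `lambda_mult=1.0` the near-duplicate chunk is selected second. With `lambda_mult=0.3` the less similar but non-redundant chunk is preferred instead, which is the behavior MMR is meant to provide before the re-ranking step.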

Key innovations: a hybrid sparse-dense retrieval step (BM25 + FAISS) that outperforms pure dense retrieval by 9% on domain-specific corpora, and a citation overlay that highlights the exact document passage used to generate each answer. The overlay is critical for legal and compliance use cases, where auditability matters.
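One way a sparse ranking and a dense ranking could be combined is sketched below. The description above does not say how the BM25 and FAISS results are merged, so reciprocal rank fusion (RRF) is an assumption here, not the project's confirmed method. The helper names (`bm25_scores`, `rank`, `rrf`), the whitespace tokenizer, the sample documents, and the hard-coded dense ranking are likewise hypothetical. In practice the dense ranking would come from a FAISS similarity search.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25 (whitespace tokens)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


def rank(scores):
    """Return document indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def rrf(rankings, k=60):
    """Fuse several rankings: each document earns 1 / (k + position) per ranking."""
    fused = {}
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + position)
    return sorted(fused, key=fused.get, reverse=True)


docs = [
    "indemnification clause survives termination",
    "the supplier shall deliver goods",
    "termination notice period is thirty days",
]
sparse_ranking = rank(bm25_scores("termination notice", docs))

# Hypothetical dense ranking, standing in for a FAISS nearest-neighbour search.
dense_ranking = [1, 2, 0]

hybrid_ranking = rrf([sparse_ranking, dense_ranking])
```

In this toy run, document 2 tops the fused list even though the stand-in dense ranking placed it second, because the exact keyword match lifts it in the BM25 ranking. RRF uses only rank positions, so it sidesteps the problem that BM25 scores and cosine similarities live on different scales.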