RAG Document Q&A

Private document Q&A with hybrid retrieval, citation overlay, and multi-turn conversation.

Overview

What is this project?

Upload a set of PDFs or text documents and ask questions in plain English. The system chunks each document with a recursive character splitter, embeds each chunk with OpenAI text-embedding-3-small, and stores the vectors in a FAISS index on disk. At query time, a custom LangChain RetrievalQA chain retrieves candidates with MMR (Maximum Marginal Relevance), re-ranks them with a cross-encoder, and passes the highest-ranked context to GPT-4. A windowed conversation buffer memory keeps recent turns so that follow-up questions can refer back to earlier answers.
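The MMR step described above can be illustrated with a minimal, dependency-free sketch. This is not the project's code: the function names cosine and mmr_select, the lambda_mult parameter, and the plain-list vectors are assumptions made for illustration, and in the actual stack the selection would presumably be delegated to the vector store's retriever.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    k: int = 3,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Return indices of up to k chunks chosen by Maximum Marginal Relevance.

    Hypothetical helper for illustration only. Each round picks the chunk
    that maximises
        lambda_mult * sim(query, chunk)
        - (1 - lambda_mult) * max sim(chunk, already selected chunk).
    """
    candidates = list(range(len(chunk_vecs)))
    relevance = [cosine(query_vec, vec) for vec in chunk_vecs]
    selected: List[int] = []

    while candidates and len(selected) < k:
        best_idx, best_score = candidates[0], -math.inf
        for i in candidates:
            # Redundancy is 0.0 while nothing has been selected yet.
            redundancy = max(
                (cosine(chunk_vecs[i], chunk_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)

    return selected
```

With lambda_mult=1.0 the selection reduces to plain similarity ranking. Lower values trade some relevance for diversity, so near-duplicate chunks are less likely to crowd the context passed to the model.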

Key innovations: a hybrid sparse-dense retrieval step (BM25 + FAISS) that outperforms pure dense retrieval by 9% on domain-specific corpora, and a citation overlay that highlights the exact document passage used to generate each answer. The overlay is critical for legal and compliance use cases where auditability matters.
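The description does not say how the BM25 and FAISS result lists are merged, or how cited passages are located. As one plausible sketch, the block below fuses two ranked lists with reciprocal rank fusion (a common choice, swapped in here as an assumption) and finds a cited passage's character offsets by exact string matching. The names reciprocal_rank_fusion, Citation, and locate_passage, the constant k=60, and the chunk IDs are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[str]:
    """Fuse ranked lists of chunk IDs (for example one from BM25, one from FAISS).

    Assumed fusion rule, not confirmed by the project: every list
    contributes 1 / (k + rank) to each chunk ID it contains.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


@dataclass(frozen=True)
class Citation:
    """Character span of a cited passage inside its source document."""
    source: str
    start: int
    end: int


def locate_passage(
    source: str, document_text: str, passage: str
) -> Optional[Citation]:
    """Return the span of the first exact match of passage, or None.

    Exact matching is an illustrative simplification; a real overlay
    would more likely carry offsets through from the chunking step.
    """
    start = document_text.find(passage)
    if start == -1:
        return None
    return Citation(source=source, start=start, end=start + len(passage))
```

A chunk that ranks well in both lists rises to the top of the fused ranking even if neither retriever placed it first, which is the usual motivation for combining sparse and dense signals. The returned span can then drive a highlight in the user interface.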

Details

Year: 2025
Category: Machine Learning
Role: AI Engineer
Duration: 6 Weeks
Team Size: Solo Project

Tech Stack

LangChain, FAISS, GPT-4, FastAPI, Streamlit, BM25
