DocBrain: Upload, Extract, Ask Anything
Upload any document, extract structured data, and ask questions about it.
By
Parth Pidadi
Semester
Spring 2026
Problem
Businesses handle large volumes of unstructured documents: invoices, receipts, bank statements, and contracts. Each type needs its own extraction logic. Enterprise tools (Azure, Google) are expensive and cloud-locked, and no free tool combines OCR, classification, extraction, and Q&A in one place.
Solution
A three-layer AI pipeline: document ingestion → structured extraction → retrieval-augmented LLM reasoning. The Vision Engine parses documents with Donut (OCR-free) and falls back to Tesseract OCR for low-quality scans. The classification + extraction layer detects the document type and outputs structured key-value JSON. The RAG + LLM layer answers natural-language questions with evidence-grounded answers.
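The fallback behaviour in the Vision Engine could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `parse_with_fallback`, `ParseResult`, and the `min_chars` threshold are hypothetical names, and the two parsers are passed in as plain callables so that a Donut wrapper and a Tesseract wrapper can be plugged in later.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ParseResult:
    text: str
    engine: str  # "primary" or "fallback", i.e. which parser produced the text


def parse_with_fallback(
    image_bytes: bytes,
    primary: Callable[[bytes], Optional[str]],   # assumed: a Donut wrapper
    fallback: Callable[[bytes], str],            # assumed: a Tesseract wrapper
    min_chars: int = 20,                         # hypothetical quality threshold
) -> ParseResult:
    """Try the primary parser; use the fallback if it fails or returns too little text."""
    try:
        text = primary(image_bytes)
    except Exception:
        text = None  # treat any primary failure as "no usable output"
    if text and len(text.strip()) >= min_chars:
        return ParseResult(text=text, engine="primary")
    return ParseResult(text=fallback(image_bytes), engine="fallback")
```

Using output length as the quality signal is only a stand-in; a real system might rely on a model confidence score or a schema check instead.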
User flow
- Upload invoices, receipts, or contracts (PDF, image, or scan)
- Donut + LLM auto-classify the document type and extract structured data (amounts, dates, vendors)
- Ask questions like 'What was my total spending in March?' or 'What are the payment terms?'
- Receive answers with source citations from the RAG layer
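The last step of the flow, answers with source citations, might look something like the sketch below. It is a toy version under stated assumptions: a bag-of-words cosine similarity stands in for BGE embeddings and ChromaDB, the chunk texts and file names are invented sample data, and `retrieve` and `build_prompt` are hypothetical helpers.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercased word counts; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[dict], k: int = 2) -> list[dict]:
    """Return the k chunks most similar to the question."""
    q = tokenize(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, tokenize(c["text"])), reverse=True)
    return ranked[:k]


def build_prompt(question: str, hits: list[dict]) -> str:
    """Label each chunk with its source so the model can cite it."""
    context = "\n".join(f"[{h['source']}] {h['text']}" for h in hits)
    return (
        "Answer using only the context below and cite sources in brackets.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Invented sample chunks, for illustration only.
CHUNKS = [
    {"source": "contract.pdf#p2", "text": "Payment terms are net 30 days from the invoice date."},
    {"source": "invoice_17.pdf#p1", "text": "Invoice total: 480.00 USD, issued 2026-03-04."},
    {"source": "receipt_03.jpg#p1", "text": "Coffee and bagel, paid in cash."},
]

hits = retrieve("What are the payment terms?", CHUNKS, k=1)
prompt = build_prompt("What are the payment terms?", hits)
```

Carrying the source label alongside each chunk is what lets the answer point back to a specific document and page.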
LLM components
- Structured extraction — turns unstructured documents into JSON
- Document classification — type detection via LLM prompting
- RAG with evidence-grounded Q&A — answers cite source chunks
- Cross-document reasoning — comparison queries across multiple uploads
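Classification and structured extraction via LLM prompting could be combined in a single call, sketched below. The prompt wording, the field names (`vendor`, `date`, `total`), and the helpers `build_extraction_prompt` and `parse_llm_reply` are assumptions for illustration; the call to the LLM itself is left out, and only prompt construction and reply validation are shown.

```python
import json

# Document types named in the problem statement.
DOC_TYPES = {"invoice", "receipt", "bank_statement", "contract"}

# Hypothetical prompt; doubled braces are literal braces after .format().
PROMPT_TEMPLATE = (
    "Classify the document as one of: {types}.\n"
    "Then extract key fields. Reply with JSON only, shaped as\n"
    '{{"doc_type": "...", "fields": {{"vendor": "...", "date": "...", "total": "..."}}}}\n\n'
    "Document text:\n{text}"
)


def build_extraction_prompt(text: str) -> str:
    return PROMPT_TEMPLATE.format(types=", ".join(sorted(DOC_TYPES)), text=text)


def parse_llm_reply(reply: str) -> dict:
    """Parse and validate the model's JSON reply; raise ValueError on bad output."""
    # Models often wrap JSON in extra prose, so slice from the first "{" to the last "}".
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object in reply")
    try:
        data = json.loads(reply[start:end + 1])
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    if data.get("doc_type") not in DOC_TYPES:
        raise ValueError(f"unknown doc_type: {data.get('doc_type')!r}")
    if not isinstance(data.get("fields"), dict):
        raise ValueError("missing 'fields' object")
    return data
```

Validating the reply before storing it keeps malformed or off-schema model output out of the database.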
Tools
- LLM: Groq API (LLaMA 3 70B)
- Vision: HuggingFace + Donut + Tesseract
- Embeddings & RAG: BGE + ChromaDB
- Stack: FastAPI + PostgreSQL + React
- Compute: Google Colab Pro
- Vibe coding: Windsurf