AMS 691.01
enterprise tools

DocBrain: Upload, Extract, Ask Anything

Upload any document, extract structured data, and ask questions about it.

RAG, OCR, document-extraction, Donut, ChromaDB

By Parth Pidadi

Semester: Spring 2026

Businesses deal with mountains of unstructured documents: invoices, receipts, bank statements, and contracts. Each type needs its own extraction logic. Enterprise tools (Azure, Google) are expensive and cloud-locked, and no free tool combines OCR, classification, extraction, and Q&A in one place.

DocBrain is a three-layer AI pipeline: document ingestion → structured extraction → retrieval-augmented LLM reasoning. A Vision Engine uses Donut for OCR-free parsing and falls back to Tesseract OCR for low-quality scans. A classification and extraction layer detects the document type and extracts key-value pairs as structured JSON. A RAG + LLM layer answers natural-language questions with evidence-grounded answers.
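The control flow of the first two layers can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `ParseResult` type, the `parse_with_fallback` and `run_pipeline` helpers, the confidence threshold used to trigger the fallback, and the `fake_*` stubs standing in for Donut, Tesseract, and the LLM are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ParseResult:
    text: str
    confidence: float  # assumed 0.0-1.0 score; the project does not define one
    engine: str


Parser = Callable[[bytes], ParseResult]


def parse_with_fallback(data: bytes, primary: Parser, fallback: Parser,
                        min_confidence: float = 0.5) -> ParseResult:
    """Try the primary parser; use the fallback if its output looks unreliable."""
    result = primary(data)
    if result.text.strip() and result.confidence >= min_confidence:
        return result
    return fallback(data)


def run_pipeline(data: bytes, parse: Parser,
                 classify: Callable[[str], str],
                 extract: Callable[[str, str], dict]) -> dict:
    """Ingestion -> classification -> structured extraction."""
    parsed = parse(data)
    doc_type = classify(parsed.text)
    return {
        "doc_type": doc_type,
        "engine": parsed.engine,
        "fields": extract(parsed.text, doc_type),
        "text": parsed.text,  # kept so a later RAG layer could chunk and index it
    }


# Hypothetical stubs standing in for the real models.
def fake_donut(data: bytes) -> ParseResult:
    return ParseResult(text="", confidence=0.1, engine="donut")  # simulates a bad scan


def fake_tesseract(data: bytes) -> ParseResult:
    return ParseResult(text="INVOICE Total: 42.00", confidence=0.9, engine="tesseract")


def fake_classify(text: str) -> str:
    return "invoice" if "invoice" in text.lower() else "unknown"


def fake_extract(text: str, doc_type: str) -> dict:
    return {"total": text.split("Total:")[1].strip()} if "Total:" in text else {}


record = run_pipeline(
    b"raw-bytes",
    lambda d: parse_with_fallback(d, fake_donut, fake_tesseract),
    fake_classify,
    fake_extract,
)
print(record)
```

Passing the parsers and the LLM steps in as plain callables keeps the fallback rule testable without loading any model; in a real deployment each stub would be replaced by a call into the corresponding library or API.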

  • Upload invoices, receipts, or contracts (PDF, image, or scan)
  • Donut + LLM auto-classify the document type and extract structured data (amounts, dates, vendors)
  • Ask questions like 'What was my total spending in March?' or 'What are the payment terms?'
  • Receive answers with source citations from the RAG layer
  • Structured extraction — turns unstructured documents into JSON
  • Document classification — type detection via LLM prompting
  • RAG with evidence-grounded Q&A — answers cite source chunks
  • Cross-document reasoning — comparison queries across multiple uploads
  • LLM: Groq API (LLaMA 3 70B)
  • Vision: HuggingFace + Donut + Tesseract
  • Embeddings & RAG: BGE + ChromaDB
  • Stack: FastAPI + PostgreSQL + React
  • Compute: Google Colab Pro
  • Vibe coding: Windsurf
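
The evidence-grounded Q&A described above depends on retrieving chunks that carry their source identity, so the answer can cite them. The sketch below shows that idea in plain Python under loud assumptions: a toy bag-of-words `embed` function stands in for BGE embeddings, an in-memory list stands in for ChromaDB, and the `Chunk` type, the `[doc#chunk]` citation format, and the prompt wording are hypothetical. No LLM call is made; the example stops at the prompt that would be sent.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str    # which uploaded document the text came from
    chunk_id: int  # position of the chunk within that document
    text: str


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[Chunk], k: int = 2) -> list[Chunk]:
    """Return the k chunks most similar to the question, across all documents."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c.text)), reverse=True)[:k]


def build_prompt(question: str, hits: list[Chunk]) -> str:
    """Label each chunk with its source so the model can cite it."""
    evidence = "\n".join(f"[{c.doc_id}#{c.chunk_id}] {c.text}" for c in hits)
    return ("Answer using only the evidence below and cite the bracketed sources.\n"
            f"{evidence}\n"
            f"Question: {question}")


# Chunks from two different uploads, so retrieval spans documents.
chunks = [
    Chunk("invoice_a.pdf", 0, "Payment terms are net 30 days"),
    Chunk("invoice_a.pdf", 1, "Total amount due is 1200 dollars"),
    Chunk("receipt_b.png", 0, "Coffee and bagel paid in cash"),
]

question = "What are the payment terms?"
hits = retrieve(question, chunks, k=1)
prompt = build_prompt(question, hits)
print(prompt)
```

Because every chunk keeps its `doc_id`, the same retrieval step serves comparison queries across multiple uploads: raising `k` lets evidence from several documents appear in one prompt.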