AMS 691.01
health

DiagRAG: AI-Powered Rare Disease Diagnostic

Probabilistic diagnosis for rare diseases with grounded clinical reasoning.

RAG, clinical, rare-disease, probabilistic, knowledge-graph

By: Devshree Hardiksinh Jadeja

Semester: Spring 2026

Rare disease diagnosis is uniquely difficult: each disease has very few documented cases, symptoms overlap heavily across diseases, and diagnosis is treated as a static classification task when in reality it is a sequential decision process under heavy uncertainty.

DiagRAG is a three-layer AI pipeline: phenotype ingestion → probabilistic inference → retrieval-augmented LLM reasoning. A Partial VAE produces a posterior over diseases with calibrated uncertainty, an information-gain module recommends the most informative next phenotype to check while accounting for its cost, and a RAG + LLM layer retrieves biomedical evidence to produce transparent clinical reasoning.
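To make the cost-aware information-gain idea concrete, here is a minimal sketch in plain Python. It is not the project's implementation. It assumes a small hand-written disease posterior (standing in for the Partial VAE output), a hypothetical table of P(phenotype present | disease), and hypothetical per-phenotype costs. The disease names, phenotype names, and the scoring rule "expected entropy reduction divided by cost" are all illustrative assumptions.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a discrete distribution stored as a dict."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)

def update(posterior, likelihood, observed):
    """Bayes update after a phenotype is observed as present (True) or absent (False).

    likelihood[d] is the assumed P(phenotype present | disease d).
    """
    unnorm = {
        d: p * (likelihood[d] if observed else 1.0 - likelihood[d])
        for d, p in posterior.items()
    }
    z = sum(unnorm.values())
    return {d: v / z for d, v in unnorm.items()}

def expected_information_gain(posterior, likelihood):
    """Expected reduction in posterior entropy from asking about one phenotype."""
    p_present = sum(posterior[d] * likelihood[d] for d in posterior)
    h_after = 0.0
    for observed, p_obs in ((True, p_present), (False, 1.0 - p_present)):
        if p_obs > 0:  # skip outcomes that cannot occur
            h_after += p_obs * entropy(update(posterior, likelihood, observed))
    return entropy(posterior) - h_after

def recommend_next(posterior, phenotypes, costs):
    """Score each candidate phenotype by information gain per unit cost."""
    scores = {
        name: expected_information_gain(posterior, lik) / costs[name]
        for name, lik in phenotypes.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical inputs: in the pipeline the posterior would come from the Partial VAE.
posterior = {"disease_A": 0.5, "disease_B": 0.5}
phenotypes = {
    "seizures":   {"disease_A": 0.9, "disease_B": 0.1},  # informative, cheap
    "fatigue":    {"disease_A": 0.5, "disease_B": 0.5},  # uninformative
    "gene_panel": {"disease_A": 1.0, "disease_B": 0.0},  # decisive but costly
}
costs = {"seizures": 1.0, "fatigue": 1.0, "gene_panel": 5.0}

best, scores = recommend_next(posterior, phenotypes, costs)
print(best, scores)
```

With these toy numbers the decisive but expensive test loses to the cheaper, slightly less informative question, which is the trade-off a cost-aware recommender is meant to capture. Dividing gain by cost is only one possible rule; subtracting a weighted cost would be another.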

  • Clinician enters observed HPO symptoms
  • Probabilistic engine generates calibrated disease probabilities with uncertainty
  • System suggests the most informative and cost-aware next phenotype or test
  • RAG + LLM produces a transparent, evidence-grounded clinical explanation
  • RAG-based clinical reasoning — evidence-grounded explanations for ranked diagnoses
  • Uncertainty-aware reasoning — surfaces calibrated probabilities, not just top picks
  • LLM: OpenAI GPT / Claude
  • Embeddings & vector store: Hugging Face + FAISS / Qdrant
  • ML: PyTorch
  • Domain knowledge: SHEPHERD knowledge graph embeddings
  • Stack: FastAPI + React
  • Vibe coding: Antigravity
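
The retrieval step behind the RAG + LLM layer can be sketched as follows. This is an illustrative stand-in, not the project's code: it uses brute-force cosine similarity over hand-made 3-dimensional vectors instead of Hugging Face embeddings with FAISS or Qdrant. The passage ids, placeholder texts, probabilities, and prompt wording are all hypothetical. The LLM call itself is omitted.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, corpus, k=2):
    """Return the k passages most similar to the query (brute-force search)."""
    ranked = sorted(corpus, key=lambda doc: cosine(query_vec, doc["vec"]), reverse=True)
    return ranked[:k]

def build_prompt(phenotypes, ranked_diseases, evidence):
    """Assemble a prompt that asks the LLM to justify the ranking from cited evidence."""
    lines = ["Observed phenotypes: " + ", ".join(phenotypes)]
    lines.append("Candidate diagnoses (probability):")
    lines += [f"- {name}: {prob:.2f}" for name, prob in ranked_diseases]
    lines.append("Evidence (cite by id):")
    lines += [f"[{doc['id']}] {doc['text']}" for doc in evidence]
    lines.append("Explain the ranking using only the evidence above.")
    return "\n".join(lines)

# Hypothetical corpus: real vectors would come from an embedding model.
corpus = [
    {"id": "d1", "vec": [0.9, 0.1, 0.0], "text": "placeholder passage about disease_A"},
    {"id": "d2", "vec": [0.0, 1.0, 0.0], "text": "placeholder passage about an unrelated topic"},
    {"id": "d3", "vec": [0.7, 0.7, 0.0], "text": "placeholder passage about disease_B"},
]
query_vec = [1.0, 0.0, 0.0]  # hypothetical embedding of the observed phenotypes

evidence = retrieve(query_vec, corpus, k=2)
prompt = build_prompt(
    ["seizures"],
    [("disease_A", 0.82), ("disease_B", 0.18)],  # illustrative probabilities
    evidence,
)
print(prompt)
```

Passing the calibrated probabilities into the prompt alongside the retrieved passages is one way to keep the explanation tied to both the probabilistic engine and the evidence. A vector store such as FAISS or Qdrant would replace the `sorted` call once the corpus grows beyond toy size.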