developer tools
DataLineage AI
A multi-agent debugger that traces broken pipelines to their true cause.
multi-agent, SQL, dbt, lineage, LangGraph
By
Bhanage Aishwarya
Semester
Spring 2026
Problem
Data engineers and analysts still play pipeline detective when SQL or dbt jobs fail. Existing tools surface errors but rarely identify the true upstream cause, forcing manual lineage tracing and trial-and-error fixes — resulting in slower debugging, broken dashboards, and unreliable pipelines.
Solution
A multi-agent AI debugger that ingests broken SQL/dbt code and error messages, reconstructs lineage from the dbt manifest, and outputs confidence-scored root causes with corrected queries — cutting debug time from 4 hours to 20 minutes. Specialized agents handle parsing, lineage tracing, hypothesis ranking, and SQL repair.
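As a rough illustration of the lineage step, the sketch below walks upstream from a failing model using the `depends_on.nodes` field that dbt writes for each node in manifest.json. The project name `shop`, the model names, and the helper `upstream_lineage` are hypothetical, and the manifest is trimmed to the one field used here; this is a minimal sketch, not the project's actual implementation.

```python
from collections import deque

# Hypothetical, heavily trimmed stand-in for a dbt manifest.json.
# Real manifests carry many more fields; only `depends_on.nodes` is used here.
manifest = {
    "nodes": {
        "model.shop.stg_orders": {"depends_on": {"nodes": ["source.shop.raw.orders"]}},
        "model.shop.stg_customers": {"depends_on": {"nodes": ["source.shop.raw.customers"]}},
        "model.shop.orders_enriched": {
            "depends_on": {"nodes": ["model.shop.stg_orders", "model.shop.stg_customers"]}
        },
        "model.shop.revenue_daily": {"depends_on": {"nodes": ["model.shop.orders_enriched"]}},
    },
    "sources": {
        "source.shop.raw.orders": {},
        "source.shop.raw.customers": {},
    },
}


def upstream_lineage(manifest: dict, failing_node: str) -> list[tuple[str, int]]:
    """Return (node_id, distance) for every ancestor of failing_node, nearest first."""
    nodes = manifest.get("nodes", {})
    seen = {failing_node}
    queue = deque([(failing_node, 0)])
    ancestors = []
    while queue:
        current, depth = queue.popleft()
        # Sources have no entry under `nodes`, so they resolve to no parents.
        parents = nodes.get(current, {}).get("depends_on", {}).get("nodes", [])
        for parent in parents:
            if parent not in seen:
                seen.add(parent)
                ancestors.append((parent, depth + 1))
                queue.append((parent, depth + 1))
    return ancestors


lineage = upstream_lineage(manifest, "model.shop.revenue_daily")
for node_id, distance in lineage:
    print(distance, node_id)
```

Breadth-first order lists the closest ancestors first, which is a plausible order in which to inspect candidates for the upstream cause.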
User flow
- Upload broken SQL/dbt model and error message
- Parser agent extracts query structure and key entities
- Lineage agent reconstructs the pipeline lineage graph
- Reasoner agent identifies likely root causes and ranks hypotheses
- Fix agent generates corrected SQL and validation steps
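The flow above can be sketched as four functions that pass a shared state dictionary along a linear chain. Everything here is a toy stand-in under stated assumptions: the regexes replace LLM or sqlglot extraction, the `schema` dictionary replaces a real manifest lookup, string similarity replaces the Reasoner's evidence, and a plain loop replaces LangGraph. It handles only one hypothetical error shape (a misspelled column).

```python
import re
from difflib import SequenceMatcher


def parser_agent(state: dict) -> dict:
    # Naive regexes standing in for LLM or sqlglot-based extraction.
    tables = re.findall(r"\b(?:from|join)\s+([\w.]+)", state["sql"], flags=re.IGNORECASE)
    match = re.search(r'column "(\w+)" does not exist', state["error"])
    return {**state, "tables": tables, "missing_column": match.group(1) if match else None}


def lineage_agent(state: dict) -> dict:
    # Stand-in for manifest traversal: list the columns each referenced table exposes.
    candidates = [(t, c) for t in state["tables"] for c in state["schema"].get(t, [])]
    return {**state, "candidates": candidates}


def reasoner_agent(state: dict) -> dict:
    # Toy scoring: string similarity between the missing column and each candidate.
    missing = state["missing_column"]
    if missing is None:
        return {**state, "hypotheses": []}
    scored = [
        (SequenceMatcher(None, missing, col).ratio(), col, table)
        for table, col in state["candidates"]
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return {**state, "hypotheses": scored}


def fix_agent(state: dict) -> dict:
    # Swap the missing column for the top-ranked candidate, if there is one.
    if not state["hypotheses"]:
        return {**state, "fixed_sql": state["sql"]}
    best_column = state["hypotheses"][0][1]
    pattern = rf"\b{re.escape(state['missing_column'])}\b"
    return {**state, "fixed_sql": re.sub(pattern, best_column, state["sql"])}


def run_pipeline(state: dict) -> dict:
    # Linear chain; an orchestrator could add branching or retries here.
    for agent in (parser_agent, lineage_agent, reasoner_agent, fix_agent):
        state = agent(state)
    return state


initial = {
    "sql": "select o.order_id, c.name from stg_orders o "
           "join stg_customers c on o.cust_id = c.customer_id",
    "error": 'column "cust_id" does not exist',
    "schema": {  # hypothetical column lists for the referenced tables
        "stg_orders": ["order_id", "customer_id"],
        "stg_customers": ["customer_id", "name"],
    },
}
result = run_pipeline(initial)
print(result["fixed_sql"])
```

Each agent returns a new state rather than mutating its input, so intermediate states can be logged or replayed when a stage misbehaves.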
LLM components
- Structured SQL extraction: LLM-driven extraction of tables, joins, filters, and data types
- Retrieval-augmented reasoning: retrieval over the lineage graph and dbt manifests
- Multi-agent orchestration: LangGraph coordinates the Parser, Lineage, Reasoner, and Fix agents
- Evidence-grounded root cause analysis: hypotheses ranked by supporting evidence
- Automated SQL fix generation: corrected queries with plain-English impact summaries
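One way to turn supporting evidence into a confidence score is sketched below. The noisy-OR combination, the `Hypothesis` class, and the example causes and evidence strengths are all assumptions made for illustration; the source does not say how the project computes its scores.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Hypothesis:
    cause: str
    # Evidence description -> strength in [0, 1]; the values below are made up.
    evidence: dict[str, float] = field(default_factory=dict)

    @property
    def confidence(self) -> float:
        # Noisy-OR: treats evidence items as independent, so each one can
        # only raise the score, and no evidence yields 0.0.
        return 1.0 - prod(1.0 - w for w in self.evidence.values())


def rank(hypotheses: list[Hypothesis]) -> list[Hypothesis]:
    """Order hypotheses from most to least supported."""
    return sorted(hypotheses, key=lambda h: h.confidence, reverse=True)


ranked = rank([
    Hypothesis("type mismatch in join key", {"error mentions a cast": 0.3}),
    Hypothesis(
        "column renamed in upstream model",
        {"column absent from parent schema": 0.6, "similar column name exists": 0.5},
    ),
    Hypothesis("source table dropped"),
])
for h in ranked:
    print(f"{h.confidence:.2f}  {h.cause}")
```

Keeping the evidence items next to each score lets a reader check why a hypothesis ranked where it did, in line with the evidence-grounded framing above.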
Tools
- Vibe coding & AI dev: Cursor, ChatGPT, Claude / Gemini
- Agent orchestration: Python, FastAPI, LangGraph
- LLM & parsing: OpenAI / Claude API, sqlglot, SentenceTransformers
- Data & indexing: PostgreSQL, FAISS or Chroma, dbt manifest.json
- Cloud & storage: AWS S3, EC2, RDS
- Frontend: Streamlit or Next.js
- Deployment: Docker, Vercel