developer tools
AI Lie Detector: Self-Verification Middleware for LLM Chatbots
Middleware that catches unreliable LLM answers before they reach users.
safety · verification · consistency · uncertainty · middleware
By Wang Guangying
Semester: Spring 2026
Problem
LLM-based chat systems often present wrong answers with high confidence, rarely admit uncertainty, and hallucinate facts. Most deployments lack a lightweight mechanism to detect unreliable reasoning and trigger revision or retraction of an answer.
Solution
A self-verification layer that sits between the user and the chatbot. It generates a draft answer, independently verifies it via re-sampling and consistency checks, then decides to Accept, Revise, or Retract, attaching an explanation when the answer is retracted. The output is the final answer plus a confidence level; a minimal sketch of the decision logic follows.
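Below is one minimal sketch of that decision layer, assuming exact string match across re-samples as the consistency signal; the `verify` helper, the thresholds, and the `Verdict` type are illustrative choices, not part of the project spec:

```python
# Sketch of the Accept / Revise / Retract decision layer. `generate` stands
# in for any LLM call (OpenAI or Claude); thresholds and the exact-match
# agreement metric are illustrative assumptions.
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Verdict:
    answer: str
    confidence: float           # agreement ratio across samples, in [0, 1]
    decision: str               # "accept" | "revise" | "retract"
    explanation: Optional[str] = None

def verify(question: str, generate: Callable[[str], str],
           n_samples: int = 5, accept_at: float = 0.8,
           retract_at: float = 0.4) -> Verdict:
    draft = generate(question)
    # Re-sample independently and measure agreement among the samples.
    samples = [generate(question) for _ in range(n_samples)]
    top, count = Counter(samples).most_common(1)[0]
    confidence = count / n_samples
    if confidence >= accept_at and top == draft:
        return Verdict(draft, confidence, "accept")
    if confidence >= retract_at:
        # Samples agree with each other but not with the draft:
        # revise toward the majority answer.
        return Verdict(top, confidence, "revise")
    return Verdict(draft, confidence, "retract",
                   explanation="Re-samples disagreed; answer withheld as unreliable.")
```

Exact string match is a deliberately crude agreement metric; for free-form answers, a semantic-similarity or entailment check would slot into the same place.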
User flow
- User submits a question to the chatbot through the middleware
- Middleware generates a draft answer
- Multi-sample consistency checks run independently
- System decides to Accept, Revise, or Retract
- User receives the final answer with a confidence level and, if retracted, an explanation (see the walk-through after this list)
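As a hypothetical end-to-end run of this flow, the snippet below feeds the `verify` sketch above (assumed saved as `middleware.py`) a deliberately inconsistent stub model, so the Retract path is visible without any API key:

```python
# Hypothetical walk-through: the stub answers inconsistently on purpose,
# which should drive agreement below the retract threshold.
import random

from middleware import verify  # the decision-layer sketch, saved as middleware.py

def flaky_model(question: str) -> str:
    # Stand-in for a real chatbot call; intentionally non-deterministic.
    return random.choice(["Paris", "Paris", "Lyon", "Marseille", "Nice"])

verdict = verify("What is the capital of France?", flaky_model)
print(f"{verdict.decision} (confidence {verdict.confidence:.0%})")
if verdict.explanation:
    print(verdict.explanation)
```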
LLM components
- Multi-sample consistency checking
- Chain-of-Verification prompting
- Structured output — confidence level + retraction flag
- Role-based prompting — separate Answerer and Verifier roles (prompt sketch after this list)
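One way the last three components could fit together: a Verifier role prompt that walks the Chain-of-Verification steps and returns structured output with a confidence level and retraction flag. The wording, JSON schema, and fail-safe parsing below are assumptions for the sketch, not final prompts:

```python
# Illustrative Verifier prompt and structured-output parsing. The JSON
# schema (confidence level + retraction flag) mirrors the component list;
# everything else is assumed for the sketch.
import json

VERIFIER_PROMPT = """You are a Verifier. You did not write the draft answer.
Question: {question}
Draft answer: {draft}

1. List 2-4 verification questions whose answers would confirm or refute the draft.
2. Answer each verification question independently.
3. Return ONLY JSON: {{"confidence": "high|medium|low", "retract": true|false, "reason": "<one sentence>"}}"""

def parse_verifier_output(raw: str) -> dict:
    """Parse the Verifier's JSON, treating malformed output as a retraction."""
    try:
        out = json.loads(raw)
        if out["confidence"] in {"high", "medium", "low"} and isinstance(out["retract"], bool):
            return out
    except (json.JSONDecodeError, TypeError, KeyError):
        pass
    # Fail safe: an unparseable verification is itself an unreliability signal.
    return {"confidence": "low", "retract": True,
            "reason": "Verifier output could not be parsed."}
```

Keeping the Answerer and Verifier in separate API calls with separate prompts prevents the Verifier from anchoring on the Answerer's reasoning.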
Tools
- LLM: OpenAI / Claude API
- Demo UI: Streamlit (minimal wiring sketched below)
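A minimal Streamlit wiring of the pieces above, assuming the `verify` sketch is saved as `middleware.py`, the `openai` Python SDK (v1) is installed, and `OPENAI_API_KEY` is set; the model name is a placeholder:

```python
# Minimal demo UI: one text box, one verified answer. Model name,
# temperature, and file layout are assumptions for the sketch.
import streamlit as st
from openai import OpenAI

from middleware import verify  # the decision-layer sketch above

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_llm(question: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": question}],
        temperature=0.7,  # nonzero so re-samples can actually disagree
    )
    return (resp.choices[0].message.content or "").strip()

st.title("AI Lie Detector demo")
question = st.text_input("Ask a question")
if question:
    verdict = verify(question, ask_llm)
    st.metric("Confidence", f"{verdict.confidence:.0%}")
    if verdict.decision == "retract":
        st.warning(verdict.explanation or "Answer retracted as unreliable.")
    else:
        st.write(verdict.answer)
        st.caption(f"Decision: {verdict.decision}")
```

Run with `streamlit run app.py`. Each question costs n_samples + 1 model calls, so latency and cost scale linearly with the sample count; that trade-off is the middleware's main tuning knob.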