developer tools
AI Lie Detector: Self-Verification Middleware for LLM Chatbots
Middleware that catches unreliable LLM answers before they reach users.
safety · verification · consistency · uncertainty · middleware
By Wang Guangying
Semester: Spring 2026
Problem
LLM-based chat systems often present wrong answers with high confidence, rarely admit uncertainty, and hallucinate facts. Most deployments lack a lightweight mechanism to detect unreliable reasoning and trigger revision or retraction of an answer.
Solution
A self-verification layer that sits between the user and the chatbot. It generates a draft answer, independently verifies it via re-sampling and consistency checks, then decides to Accept, Revise, or Retract, attaching an explanation when the answer is retracted. The output is the final answer plus a confidence level; a minimal sketch of the decision logic follows.
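Below is one minimal sketch of that decision layer, assuming exact string match across re-samples as the consistency signal; the `verify` helper, the thresholds, and the `Verdict` type are illustrative choices, not part of the project spec:

```python
# Sketch of the Accept / Revise / Retract decision layer. `generate` stands
# in for any LLM call (OpenAI or Claude); thresholds and the exact-match
# agreement metric are illustrative assumptions.
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Verdict:
    answer: str
    confidence: float           # agreement ratio across samples, in [0, 1]
    decision: str               # "accept" | "revise" | "retract"
    explanation: Optional[str] = None

def verify(question: str, generate: Callable[[str], str],
           n_samples: int = 5, accept_at: float = 0.8,
           retract_at: float = 0.4) -> Verdict:
    draft = generate(question)
    # Re-sample independently and measure agreement among the samples.
    samples = [generate(question) for _ in range(n_samples)]
    top, count = Counter(samples).most_common(1)[0]
    confidence = count / n_samples
    if confidence >= accept_at and top == draft:
        return Verdict(draft, confidence, "accept")
    if confidence >= retract_at:
        # Samples agree with each other but not with the draft:
        # revise toward the majority answer.
        return Verdict(top, confidence, "revise")
    return Verdict(draft, confidence, "retract",
                   explanation="Re-samples disagreed; answer withheld as unreliable.")
```

Exact string match is a deliberately crude agreement metric; for free-form answers, a semantic-similarity or entailment check would slot into the same place.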
User flow
- User submits a question to the chatbot through the middleware
- Middleware generates a draft answer
- Multi-sample consistency checks run independently
- System decides to Accept, Revise, or Retract
- User receives the final answer with a confidence level and, if retracted, an explanation (see the walk-through after this list)
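As a hypothetical end-to-end run of this flow, the snippet below feeds the `verify` sketch above (assumed saved as `middleware.py`) a deliberately inconsistent stub model, so the Retract path is visible without any API key:

```python
# Hypothetical walk-through: the stub answers inconsistently on purpose,
# which should drive agreement below the retract threshold.
import random

from middleware import verify  # the decision-layer sketch, saved as middleware.py

def flaky_model(question: str) -> str:
    # Stand-in for a real chatbot call; intentionally non-deterministic.
    return random.choice(["Paris", "Paris", "Lyon", "Marseille", "Nice"])

verdict = verify("What is the capital of France?", flaky_model)
print(f"{verdict.decision} (confidence {verdict.confidence:.0%})")
if verdict.explanation:
    print(verdict.explanation)
```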
LLM components
- Multi-sample consistency checking
- Chain-of-Verification prompting
- Structured output — confidence level + retraction flag
- Role-based prompting — separate Answerer and Verifier roles (prompt sketch after this list)
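One way the last three components could fit together: a Verifier role prompt that walks the Chain-of-Verification steps and returns structured output with a confidence level and retraction flag. The wording, JSON schema, and fail-safe parsing below are assumptions for the sketch, not final prompts:

```python
# Illustrative Verifier prompt and structured-output parsing. The JSON
# schema (confidence level + retraction flag) mirrors the component list;
# everything else is assumed for the sketch.
import json

VERIFIER_PROMPT = """You are a Verifier. You did not write the draft answer.
Question: {question}
Draft answer: {draft}

1. List 2-4 verification questions whose answers would confirm or refute the draft.
2. Answer each verification question independently.
3. Return ONLY JSON: {{"confidence": "high|medium|low", "retract": true|false, "reason": "<one sentence>"}}"""

def parse_verifier_output(raw: str) -> dict:
    """Parse the Verifier's JSON, treating malformed output as a retraction."""
    try:
        out = json.loads(raw)
        if out["confidence"] in {"high", "medium", "low"} and isinstance(out["retract"], bool):
            return out
    except (json.JSONDecodeError, TypeError, KeyError):
        pass
    # Fail safe: an unparseable verification is itself an unreliability signal.
    return {"confidence": "low", "retract": True,
            "reason": "Verifier output could not be parsed."}
```

Keeping the Answerer and Verifier in separate API calls with separate prompts prevents the Verifier from anchoring on the Answerer's reasoning.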
Tools
- LLM: OpenAI / Claude API
- Demo UI: Streamlit (minimal wiring sketched below)
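A minimal Streamlit wiring of the pieces above, assuming the `verify` sketch is saved as `middleware.py`, the `openai` Python SDK (v1) is installed, and `OPENAI_API_KEY` is set; the model name is a placeholder:

```python
# Minimal demo UI: one text box, one verified answer. Model name,
# temperature, and file layout are assumptions for the sketch.
import streamlit as st
from openai import OpenAI

from middleware import verify  # the decision-layer sketch above

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_llm(question: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": question}],
        temperature=0.7,  # nonzero so re-samples can actually disagree
    )
    return (resp.choices[0].message.content or "").strip()

st.title("AI Lie Detector demo")
question = st.text_input("Ask a question")
if question:
    verdict = verify(question, ask_llm)
    st.metric("Confidence", f"{verdict.confidence:.0%}")
    if verdict.decision == "retract":
        st.warning(verdict.explanation or "Answer retracted as unreliable.")
    else:
        st.write(verdict.answer)
        st.caption(f"Decision: {verdict.decision}")
```

Run with `streamlit run app.py`. Each question costs n_samples + 1 model calls, so latency and cost scale linearly with the sample count; that trade-off is the middleware's main tuning knob.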