Debug a Hallucinating AI
Learn to identify, diagnose, and fix AI hallucinations in a production-like scenario using grounding techniques.
1. Understand the Scenario
You've deployed a customer support bot, but users are reporting that it sometimes gives incorrect information about product features and pricing. Your task is to identify why this is happening and implement fixes.
Learning Objectives
- Identify common hallucination patterns
- Implement grounding with source documents
- Add citation requirements to prompts
- Build verification checks for factual claims
2. Follow the Instructions
The Problem
Your support bot is answering questions about a SaaS product. Users have reported these incorrect responses:
❌ "Our Enterprise plan includes unlimited API calls" (it's actually 100k/month)
❌ "You can export data in XML format" (only JSON and CSV are supported)
❌ "We offer a 60-day money-back guarantee" (it's 30 days)
These are hallucinations — the model is confidently generating false information.
Step 1: Understand Why This Happens
LLMs hallucinate because:
- They generate plausible-sounding text based on patterns, not facts
- They have no way to know what they don't know
- They try to be helpful, even when they should say "I don't know"
The fix: ground responses in source documents.
// The problematic prompt (causes hallucinations)
const badPrompt = `You are a helpful customer support agent for AcmeSaaS.
Answer the customer's question about our product.`;
// The customer asks: "What's included in Enterprise?"
// Model might hallucinate features that don't exist!
Step 2: Ground Responses in Source Documents
Provide the actual product documentation to the model and instruct it to answer only from that context, so it has a source of truth instead of guessing from training data.
// Product documentation (your source of truth)
const productDocs = `
# AcmeSaaS Pricing & Features
## Plans
- Starter: $29/month, 10k API calls, JSON export only
- Pro: $99/month, 50k API calls, JSON + CSV export
- Enterprise: $299/month, 100k API calls, JSON + CSV export, SSO
## Policies
- 30-day money-back guarantee on all plans
- Annual billing: 20% discount
- No XML export (feature request tracked)
`;
Step 3: Update the Prompt with Grounding
Force the model to only use information from the provided documents.
const groundedPrompt = `You are a customer support agent for AcmeSaaS.
<product_documentation>
${productDocs}
</product_documentation>
Rules:
1. ONLY answer based on the documentation above
2. If the answer is not in the docs, say "I don't have that information"
3. Always cite which section your answer comes from
4. Never make up features, prices, or policies
Customer question: {question}`;
⚠️ WARNING: Grounding Isn't Perfect
Even with grounding, models can misinterpret or combine information incorrectly. Consider adding verification for critical claims like pricing.
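For example, a minimal cross-check can flag any dollar amount in an answer that never appears in the docs. This is only a sketch, assuming the productDocs string from Step 2; the exercise below asks you to build a fuller version.
// Sketch: return dollar amounts mentioned in the answer that the docs never mention.
function findUnknownPrices(answer: string, docs: string): string[] {
  const mentioned = answer.match(/\$\d+/g) ?? [];
  return mentioned.filter(price => !docs.includes(price));
}
// findUnknownPrices('Enterprise is $599/month', productDocs) -> ['$599']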
Step 4: Add Citation Requirements
Require the model to cite its sources. This makes hallucinations easier to spot and builds user trust.
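One way to enforce this is a structured-output schema with a required citations array, so every answer must carry its supporting quotes. The sketch below is illustrative; the field names are assumptions, and you'll flesh out the full schema in the exercise.
// Sketch: a response schema that makes citations mandatory.
const citedResponseSchema = {
  type: 'object',
  properties: {
    answer: { type: 'string' },
    citations: {
      type: 'array',
      items: { type: 'string' },
      description: 'Exact quotes from the documentation that support the answer'
    }
  },
  required: ['answer', 'citations'],
  additionalProperties: false
};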
Your Task: Complete the implementation with grounding, citations, and a verification step for pricing claims.
3. Try It Yourself
import OpenAI from 'openai';
const openai = new OpenAI();
const productDocs = `
# AcmeSaaS Pricing & Features
## Plans
- Starter: $29/month, 10k API calls, JSON export only
- Pro: $99/month, 50k API calls, JSON + CSV export
- Enterprise: $299/month, 100k API calls, JSON + CSV export, SSO
## Policies
- 30-day money-back guarantee on all plans
- Annual billing: 20% discount
`;
// TODO: Create a response schema that requires citations
const responseSchema = {
// Define schema here
};
// TODO: Implement the support bot with grounding
async function answerQuestion(question: string): Promise<{
answer: string;
citations: string[];
confidence: 'high' | 'medium' | 'low';
}> {
// 1. Build a grounded prompt
// 2. Require citations in the response
// 3. Assess confidence based on source coverage
throw new Error('Not implemented');
}
// TODO: Implement verification for pricing claims
function verifyPricingClaims(answer: string): {
valid: boolean;
issues: string[];
} {
// Check if any prices mentioned match the docs
throw new Error('Not implemented');
}
// Test cases that previously caused hallucinations
const testQuestions = [
"What's included in the Enterprise plan?",
"Can I export my data to XML?",
"What's your refund policy?",
"Do you offer a lifetime deal?"
];
for (const q of testQuestions) {
answerQuestion(q).then(result => {
console.log(`Q: ${q}`);
console.log(`A: ${result.answer}`);
console.log(`Citations: ${result.citations.join(', ')}`);
console.log(`Confidence: ${result.confidence}\n`);
});
}
This TypeScript exercise requires local setup. Copy the code to your IDE to run.
4. Get Help (If Needed)
Reveal progressive hints
5. Check the Solution
/**
 * Key Points:
 * - VALID_PRICES keeps a verified list of values for cross-checking answers
 * - The found_in_docs flag helps identify when the model is uncertain
 * - The system prompt explicitly instructs the model to admit when info isn't in the docs
 * - Confidence is downgraded when verification fails
 * - Regex-based verification in verifyPricingClaims catches hallucinated prices
 */
import OpenAI from 'openai';
const openai = new OpenAI();
const productDocs = `
# AcmeSaaS Pricing & Features
## Plans
- Starter: $29/month, 10k API calls, JSON export only
- Pro: $99/month, 50k API calls, JSON + CSV export
- Enterprise: $299/month, 100k API calls, JSON + CSV export, SSO
## Policies
- 30-day money-back guarantee on all plans
- Annual billing: 20% discount
`;
// Known prices and policy values for verification
const VALID_PRICES = ['$29', '$99', '$299', '20%', '30-day'];
const responseSchema = {
type: 'object',
properties: {
answer: {
type: 'string',
description: 'The answer to the customer question'
},
citations: {
type: 'array',
items: { type: 'string' },
description: 'Exact quotes from documentation that support this answer'
},
found_in_docs: {
type: 'boolean',
description: 'Whether the answer was found in the documentation'
}
},
required: ['answer', 'citations', 'found_in_docs'],
additionalProperties: false
};
async function answerQuestion(question: string): Promise<{
answer: string;
citations: string[];
confidence: 'high' | 'medium' | 'low';
}> {
const response = await openai.chat.completions.create({
model: 'gpt-4o-2024-08-06',
messages: [
{
role: 'system',
content: `You are a customer support agent for AcmeSaaS.
<product_documentation>
${productDocs}
</product_documentation>
Rules:
1. ONLY answer based on the documentation above
2. If the answer is NOT in the docs, set found_in_docs to false and say "I don't have information about that in our documentation"
3. Include exact quotes from the docs as citations
4. Never make up features, prices, or policies`
},
{
role: 'user',
content: question
}
],
response_format: {
type: 'json_schema',
json_schema: {
name: 'support_response',
strict: true,
schema: responseSchema
}
}
});
const result = JSON.parse(response.choices[0].message.content!);
// Determine confidence based on grounding
let confidence: 'high' | 'medium' | 'low';
if (!result.found_in_docs) {
confidence = 'low';
} else if (result.citations.length >= 2) {
confidence = 'high';
} else if (result.citations.length === 1) {
confidence = 'medium';
} else {
confidence = 'low';
}
// Verify pricing claims
const verification = verifyPricingClaims(result.answer);
if (!verification.valid) {
console.warn('⚠️ Pricing verification failed:', verification.issues);
confidence = 'low';
}
return {
answer: result.answer,
citations: result.citations,
confidence
};
}
function verifyPricingClaims(answer: string): {
valid: boolean;
issues: string[];
} {
const issues: string[] = [];
// Extract any price-like patterns from the answer
const pricePattern = /\$\d+|\d+%|\d+-day/g;
const mentionedPrices = answer.match(pricePattern) || [];
for (const price of mentionedPrices) {
if (!VALID_PRICES.includes(price)) {
issues.push(`Unknown price/value: ${price}`);
}
}
return {
valid: issues.length === 0,
issues
};
}
// Test cases
const testQuestions = [
"What's included in the Enterprise plan?",
"Can I export my data to XML?",
"What's your refund policy?",
"Do you offer a lifetime deal?"
];
for (const q of testQuestions) {
answerQuestion(q).then(result => {
console.log(`Q: ${q}`);
console.log(`A: ${result.answer}`);
console.log(`Citations: ${result.citations.join(', ')}`);
console.log(`Confidence: ${result.confidence}\n`);
});
}
/* Expected outputs:
Q: What's included in the Enterprise plan?
A: The Enterprise plan costs $299/month and includes 100k API calls, JSON + CSV export, and SSO.
Citations: Enterprise: $299/month, 100k API calls, JSON + CSV export, SSO
Confidence: high
Q: Can I export my data to XML?
A: I don't have information about XML export. Based on our documentation, we support JSON and CSV export formats.
Citations: JSON + CSV export
Confidence: medium
Q: What's your refund policy?
A: We offer a 30-day money-back guarantee on all plans.
Citations: 30-day money-back guarantee on all plans
Confidence: high
Q: Do you offer a lifetime deal?
A: I don't have information about that in our documentation.
Citations:
Confidence: low
*/
Common Mistakes
Not including the documentation in the prompt context
Why it's wrong: Without the docs in context, the model has no source of truth and will use training data (which may be wrong).
How to fix: Always include source documents in the system or user message with clear delimiters like <product_documentation>.
Relying solely on the model to admit uncertainty
Why it's wrong: Models are trained to be helpful and may still confabulate. They don't reliably know what they don't know.
How to fix: Add explicit verification checks for critical information like prices, dates, and feature lists.
Not tracking confidence levels
Why it's wrong: Users need to know when to trust an answer vs when to escalate to a human.
How to fix: Calculate confidence based on citation count, found_in_docs flag, and verification results.
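The last two fixes can live in one helper that folds all three signals into a single rating. The sketch below mirrors the solution's thresholds; the helper name and signature are illustrative, not part of the exercise API.
// Sketch: combine found_in_docs, citation count, and verification issues into one rating.
function assessConfidence(
  foundInDocs: boolean,
  citationCount: number,
  verificationIssues: string[]
): 'high' | 'medium' | 'low' {
  if (!foundInDocs || verificationIssues.length > 0) return 'low';
  if (citationCount >= 2) return 'high';
  return citationCount === 1 ? 'medium' : 'low';
}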
Test Cases
Answers from docs with citations
Questions with clear answers should have citations.
Input: "What's included in the Enterprise plan?"
Expected: Answer mentions $299, 100k API calls, and SSO, with a citation from the docs.
Admits when not in docs
Questions without answers should be flagged.
Input: "Do you offer a lifetime deal?"
Expected: Says the info is not in the documentation; confidence: low.
Catches invalid prices
Verification should flag hallucinated prices.
Input: Inject "Enterprise is $599/month" into the answer.
Expected: Verification flags $599 as an unknown price.
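For the last test case, you can exercise the verification step directly with the verifyPricingClaims function from the solution; the injected answer string below is an illustrative example.
// Exercising the "Catches invalid prices" test case directly.
const check = verifyPricingClaims('The Enterprise plan is $599/month.');
console.log(check);
// -> { valid: false, issues: [ 'Unknown price/value: $599' ] }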