Build a Test-Driven AI Agent Workflow
Learn to configure Claude Code hooks and Git pre-commit hooks for automated test-driven AI development, ensuring agents produce verifiable code.
1. Understand the Scenario
You're setting up a new TypeScript project where AI agents will write most of the code. Without guardrails, agents produce code that looks correct but fails silently. You'll configure automated checks that catch errors during generation and before commits.
Learning Objectives
- Understand the Red-Green-Generate workflow for AI agents
- Configure Claude Code hooks for immediate feedback during code generation
- Set up Git pre-commit hooks as quality gates
- Implement Spec-First Prompting to anchor agent behavior
- Avoid the Tautology Trap by separating test creation from implementation
2. Follow the Instructions
The Problem: Vibe Coding
Without tests, AI agents iterate based on vibes - they generate code that 'looks right' but may:
- Have type errors the agent didn't catch
- Miss edge cases the agent didn't consider
- Contain subtle bugs that pass quick visual review
The Solution: Test-Driven Agent Development
Tests provide an objective exit condition. The agent iterates until tests pass, not until it 'feels done'.
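The idea can be sketched as a loop whose only exit condition is a passing test run. In this sketch, `runTests` and `generateFix` are hypothetical stand-ins for the real tooling, not part of any actual API:

```javascript
// Sketch: tests as an objective exit condition for an agent loop.
// runTests and generateFix are hypothetical stand-ins for real tooling.
function agentLoop(runTests, generateFix, maxIterations = 5) {
  for (let i = 1; i <= maxIterations; i++) {
    const result = runTests();
    if (result.passed) return { done: true, iterations: i };
    // The agent sees concrete failures, not vibes, and revises the code.
    generateFix(result.failures);
  }
  return { done: false, iterations: maxIterations };
}

// Stub: pretend the tests start passing on the third attempt.
let attempts = 0;
const runTests = () => {
  attempts += 1;
  return attempts >= 3
    ? { passed: true, failures: [] }
    : { passed: false, failures: ['type error in validateEmail'] };
};
const generateFix = (failures) => { /* the agent would edit code here */ };

console.log(agentLoop(runTests, generateFix)); // { done: true, iterations: 3 }
```

The loop terminates on a verifiable signal (the test result), never on the agent's own judgment that the code looks finished.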
The Two-Loop Architecture
Inner Loop (Claude Code Hooks): Fast feedback during generation
- Fires after every file edit
- Runs type-checker, linter, or quick tests
- Agent 'feels' errors immediately and self-corrects
Outer Loop (Git Hooks): Quality gate before commit
- Fires before every commit attempt
- Runs full test suite
- Blocks commits with failing tests
┌─────────────────────────────────────────┐
│ Inner Loop (Fast)                       │
│ ┌─────┐    ┌─────┐    ┌──────────┐      │
│ │Edit │───►│Hook │───►│TypeCheck │      │
│ │File │    │Fire │    │ Result   │      │
│ └─────┘    └─────┘    └────┬─────┘      │
│                            │            │
│                  ┌─────────▼────────┐   │
│                  │ Agent Sees Error │   │
│                  │ & Self-Fixes     │   │
│                  └──────────────────┘   │
└─────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│ Outer Loop (Thorough)                   │
│ ┌───────┐  ┌──────────┐  ┌──────────┐   │
│ │Commit │─►│Pre-Commit│─►│Full Test │   │
│ │Attempt│  │   Hook   │  │  Suite   │   │
│ └───────┘  └──────────┘  └────┬─────┘   │
│                               │         │
│                    ┌──────────▼─────┐   │
│                    │Pass: Commit OK │   │
│                    │Fail: Blocked   │   │
│                    └────────────────┘   │
└─────────────────────────────────────────┘
Step 1: Configure Claude Code Inner Loop Hook
Create a hook that runs TypeScript type-checking after every file edit.
// .claude/settings.json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "node -e \"const fs = require('fs'); const data = JSON.parse(fs.readFileSync('/dev/stdin', 'utf8')); const fp = data.tool_input?.file_path || ''; if (fp.endsWith('.ts') || fp.endsWith('.tsx')) { const { execSync } = require('child_process'); try { execSync('npx tsc --noEmit', { stdio: 'pipe' }); console.log('TypeScript: OK'); } catch (e) { console.error('TypeScript errors found:'); console.error(e.stdout?.toString() || e.message); process.exit(2); } }\"",
            "timeout": 30
          }
        ]
      }
    ]
  }
}
Step 2: Configure Git Pre-Commit Hook
Set up a pre-commit hook using Husky that runs the test suite.
# Install Husky
npm install -D husky
npx husky init
# Create pre-commit hook
cat > .husky/pre-commit << 'EOF'
#!/bin/bash
echo "Running pre-commit checks..."

# TypeScript type check
echo "1/3 TypeScript..."
npx tsc --noEmit || {
  echo "TypeScript errors - fix before committing"
  exit 1
}

# Linting
echo "2/3 Linting..."
npm run lint || {
  echo "Lint errors - fix before committing"
  exit 1
}

# Tests
echo "3/3 Tests..."
npm test || {
  echo "Tests failed - fix before committing"
  exit 1
}

echo "All checks passed!"
EOF
chmod +x .husky/pre-commit
Step 3: Spec-First Prompting
Before asking the agent to implement a feature, ask it to write a failing test first.
# Spec-First Prompt Template
## Request (BAD - Implementation First)
"Write a function that validates email addresses"
## Request (GOOD - Spec First)
"Create a Jest test file called `validateEmail.test.ts` that:
1. Tests valid emails: user@example.com, name.surname@domain.co.uk
2. Tests invalid emails: missing @, multiple @, no domain
3. Tests edge cases: empty string, null, undefined
Do NOT implement the function yet. Just write failing tests.
I will review the tests before you implement."
## Why This Works
- Agent must understand requirements to write tests
- You review test spec, not implementation (higher leverage)
- Tests become documentation of expected behavior
- Implementation is constrained to pass your approved tests
Step 4: Red-Green-Generate Workflow
Put it all together in a complete workflow.
// Example: Red-Green-Generate in action
// STEP 1: RED - Agent writes failing test (you review)
// validateEmail.test.ts
import { validateEmail } from './validateEmail';
describe('validateEmail', () => {
  it('returns true for valid email', () => {
    expect(validateEmail('user@example.com')).toBe(true);
  });

  it('returns false for missing @', () => {
    expect(validateEmail('userexample.com')).toBe(false);
  });

  it('returns false for empty string', () => {
    expect(validateEmail('')).toBe(false);
  });

  it('returns false for null/undefined', () => {
    expect(validateEmail(null as any)).toBe(false);
    expect(validateEmail(undefined as any)).toBe(false);
  });
});

// STEP 2: GREEN - Agent implements to pass tests
// validateEmail.ts
export function validateEmail(email: unknown): boolean {
  if (typeof email !== 'string' || email === '') {
    return false;
  }
  // Simple regex - agent chose this based on test requirements
  const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  return emailRegex.test(email);
}

// STEP 3: VERIFY - Inner loop hook catches type errors automatically
// Pre-commit hook runs full test suite before allowing commit
Step 5: Blocking AI Smell Patterns
Add a pre-commit check that blocks common AI code smells.
# Add to .husky/pre-commit
# Check for AI smell patterns
echo "Checking for AI code smells..."
# TODO comments left by AI
if grep -rn "TODO" --include="*.ts" --include="*.tsx" src/; then
  echo "WARNING: TODO comments found - review before committing"
  # Note: Warning only, not blocking. Adjust as needed.
fi

# Console.log debugging left in
if grep -rn "console\.log" --include="*.ts" --include="*.tsx" src/; then
  echo "ERROR: console.log found - remove before committing"
  exit 1
fi

# Unused imports (common AI mistake)
npx eslint src/ --rule 'no-unused-vars: error' --quiet || {
  echo "ERROR: Unused variables found"
  exit 1
}
Your Task
Set up a complete TDD workflow for AI agents:
- Configure Claude Code hook for TypeScript validation
- Set up Husky pre-commit with type check + lint + tests
- Practice Spec-First Prompting - write a test spec for a feature
- Add AI smell detection to your pre-commit hook
- Test the flow - have an agent write code and verify hooks catch errors
3. Try It Yourself
// .claude/settings.json - Complete this configuration
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "// TODO: Add TypeScript check command",
            "timeout": 30
          }
        ]
      }
    ]
  }
}
// TODO: Create .husky/pre-commit with:
// 1. TypeScript type check
// 2. Linting
// 3. Test suite
// 4. AI smell detection (console.log, TODO)
This TypeScript exercise requires local setup. Copy the code to your IDE to run.
4. Get Help (If Needed)
5. Check the Solution
// .claude/settings.json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "node -e \"const fs = require('fs'); const data = JSON.parse(fs.readFileSync('/dev/stdin', 'utf8')); const fp = data.tool_input?.file_path || ''; if (fp.endsWith('.ts') || fp.endsWith('.tsx')) { const { execSync } = require('child_process'); try { execSync('npx tsc --noEmit', { stdio: 'pipe' }); } catch (e) { console.error(e.stdout?.toString() || e.message); process.exit(2); } }\"",
            "timeout": 30
          }
        ]
      }
    ]
  }
}
// .husky/pre-commit
#!/bin/bash
set -e
echo "Pre-commit checks starting..."
# 1. TypeScript
echo "[1/4] TypeScript type check..."
npx tsc --noEmit
# 2. Lint
echo "[2/4] Linting..."
npm run lint --silent
# 3. Tests
echo "[3/4] Running tests..."
npm test --silent
# 4. AI smell patterns
echo "[4/4] Checking for AI code smells..."
if grep -rn "console\.log" --include="*.ts" --include="*.tsx" src/ 2>/dev/null; then
  echo "ERROR: console.log statements found"
  exit 1
fi
echo "All checks passed!"
Common Mistakes
Asking agent to write tests AND implementation together
Why it's wrong: Agent writes weak tests that its own buggy code passes (Tautology Trap).
How to fix: Always separate: (1) Agent writes tests → (2) Human reviews → (3) Agent implements.
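The trap is easier to see in a tiny sketch (the `add` function here is hypothetical, chosen only to illustrate):

```javascript
// Tautology Trap sketch: an expectation derived from the buggy
// implementation passes by construction and verifies nothing.
function add(a, b) {
  return a - b; // buggy implementation the agent wrote
}

// BAD: the agent generated the expected value by running its own code.
const tautological = add(2, 1) === add(2, 1); // always true, bug survives

// GOOD: the expected value comes from the spec, written before the code.
const specDriven = add(2, 1) === 3; // false - the bug is caught

console.log({ tautological, specDriven }); // { tautological: true, specDriven: false }
```

Reviewing the test spec before any implementation exists is what keeps the expected values anchored to requirements rather than to the code under test.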
Not checking file extension in hook
Why it's wrong: Running tsc on JSON or markdown files wastes time and may error.
How to fix: Filter by file extension: if (fp.endsWith('.ts')) { ... }
Using exit code 1 instead of 2 in Claude hooks
Why it's wrong: Exit code 1 is a non-blocking warning. Exit code 2 blocks the tool and sends feedback.
How to fix: Use process.exit(2) for errors that should stop the agent and prompt self-correction.
Blocking all TODO comments
Why it's wrong: Legitimate TODOs are useful for tracking work. Only block if it's a policy.
How to fix: Make TODO detection a warning, not a blocker, or allow specific TODO formats.
Test Cases
Hook blocks TypeScript errors
Writing invalid TypeScript should trigger the hook and block with an error message.
Input: Agent writes const x: string = 123;
Expected: Hook exits with code 2, error message visible to agent
Pre-commit blocks console.log
Attempting to commit a file with console.log should fail.
Input: git commit with console.log in a staged file
Expected: Pre-commit hook exits 1, commit blocked
Clean code commits successfully
Valid TypeScript with passing tests should commit.
Input: git commit with valid code and passing tests
Expected: All checks pass, commit succeeds