Build a Test-Driven AI Agent Workflow
Learn to configure Claude Code hooks and Git pre-commit hooks for automated test-driven AI development, ensuring agents produce verifiable code.
1. Understand the Scenario
You're setting up a new TypeScript project where AI agents will write most of the code. Without guardrails, agents produce code that looks correct but fails silently. You'll configure automated checks that catch errors during generation and before commits.
Learning Objectives
- Understand the Red-Green-Generate workflow for AI agents
- Configure Claude Code hooks for immediate feedback during code generation
- Set up Git pre-commit hooks as quality gates
- Implement Spec-First Prompting to anchor agent behavior
- Avoid the Tautology Trap by separating test creation from implementation
2. Follow the Instructions
The Problem: Vibe Coding
Without tests, AI agents iterate based on vibes - they generate code that 'looks right' but may:
- Have type errors the agent didn't catch
- Miss edge cases the agent didn't consider
- Contain subtle bugs that pass quick visual review
The Solution: Test-Driven Agent Development
Tests provide an objective exit condition. The agent iterates until tests pass, not until it 'feels done'.
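The idea can be sketched as a loop whose only exit condition is a passing test run. In this sketch, `runTests` and `generateFix` are hypothetical stand-ins for the real tooling, not part of any actual API:

```javascript
// Sketch: tests as an objective exit condition for an agent loop.
// runTests and generateFix are hypothetical stand-ins for real tooling.
function agentLoop(runTests, generateFix, maxIterations = 5) {
  for (let i = 1; i <= maxIterations; i++) {
    const result = runTests();
    if (result.passed) return { done: true, iterations: i };
    // The agent sees concrete failures, not vibes, and revises the code.
    generateFix(result.failures);
  }
  return { done: false, iterations: maxIterations };
}

// Stub: pretend the tests start passing on the third attempt.
let attempts = 0;
const runTests = () => {
  attempts += 1;
  return attempts >= 3
    ? { passed: true, failures: [] }
    : { passed: false, failures: ['type error in validateEmail'] };
};
const generateFix = (failures) => { /* the agent would edit code here */ };

console.log(agentLoop(runTests, generateFix)); // { done: true, iterations: 3 }
```

The loop terminates on a verifiable signal (the test result), never on the agent's own judgment that the code looks finished.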
The Two-Loop Architecture
Inner Loop (Claude Code Hooks): Fast feedback during generation
- Fires after every file edit
- Runs type-checker, linter, or quick tests
- Agent 'feels' errors immediately and self-corrects
Outer Loop (Git Hooks): Quality gate before commit
- Fires before every commit attempt
- Runs full test suite
- Blocks commits with failing tests
┌─────────────────────────────────────────┐
│ Inner Loop (Fast)                       │
│ ┌─────┐    ┌─────┐    ┌──────────┐      │
│ │Edit │───►│Hook │───►│TypeCheck │      │
│ │File │    │Fire │    │ Result   │      │
│ └─────┘    └─────┘    └────┬─────┘      │
│                            │            │
│                  ┌─────────▼────────┐   │
│                  │ Agent Sees Error │   │
│                  │ & Self-Fixes     │   │
│                  └──────────────────┘   │
└─────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│ Outer Loop (Thorough)                   │
│ ┌───────┐  ┌──────────┐  ┌──────────┐   │
│ │Commit │─►│Pre-Commit│─►│Full Test │   │
│ │Attempt│  │   Hook   │  │  Suite   │   │
│ └───────┘  └──────────┘  └────┬─────┘   │
│                               │         │
│                    ┌──────────▼─────┐   │
│                    │Pass: Commit OK │   │
│                    │Fail: Blocked   │   │
│                    └────────────────┘   │
└─────────────────────────────────────────┘
Step 1: Configure Claude Code Inner Loop Hook
Create a hook that runs TypeScript type-checking after every file edit.
// .claude/settings.json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "node -e \"const fs = require('fs'); const data = JSON.parse(fs.readFileSync('/dev/stdin', 'utf8')); const fp = data.tool_input?.file_path || ''; if (fp.endsWith('.ts') || fp.endsWith('.tsx')) { const { execSync } = require('child_process'); try { execSync('npx tsc --noEmit', { stdio: 'pipe' }); console.log('TypeScript: OK'); } catch (e) { console.error('TypeScript errors found:'); console.error(e.stdout?.toString() || e.message); process.exit(2); } }\"",
            "timeout": 30
          }
        ]
      }
    ]
  }
}
Step 2: Configure Git Pre-Commit Hook
Set up a pre-commit hook using Husky that runs the test suite.
# Install Husky
npm install -D husky
npx husky init
# Create pre-commit hook
cat > .husky/pre-commit << 'EOF'
#!/bin/bash
echo "Running pre-commit checks..."

# TypeScript type check
echo "1/3 TypeScript..."
npx tsc --noEmit || {
  echo "TypeScript errors - fix before committing"
  exit 1
}

# Linting
echo "2/3 Linting..."
npm run lint || {
  echo "Lint errors - fix before committing"
  exit 1
}

# Tests
echo "3/3 Tests..."
npm test || {
  echo "Tests failed - fix before committing"
  exit 1
}

echo "All checks passed!"
EOF
chmod +x .husky/pre-commit
Step 3: Spec-First Prompting
Before asking the agent to implement a feature, ask it to write a failing test first.
# Spec-First Prompt Template
## Request (BAD - Implementation First)
"Write a function that validates email addresses"
## Request (GOOD - Spec First)
"Create a Jest test file called `validateEmail.test.ts` that:
1. Tests valid emails: user@example.com, name.surname@domain.co.uk
2. Tests invalid emails: missing @, multiple @, no domain
3. Tests edge cases: empty string, null, undefined
Do NOT implement the function yet. Just write failing tests.
I will review the tests before you implement."
## Why This Works
- Agent must understand requirements to write tests
- You review test spec, not implementation (higher leverage)
- Tests become documentation of expected behavior
- Implementation is constrained to pass your approved tests
Step 4: Red-Green-Generate Workflow
Put it all together in a complete workflow.
// Example: Red-Green-Generate in action
// STEP 1: RED - Agent writes failing test (you review)
// validateEmail.test.ts
import { validateEmail } from './validateEmail';
describe('validateEmail', () => {
  it('returns true for valid email', () => {
    expect(validateEmail('user@example.com')).toBe(true);
  });

  it('returns false for missing @', () => {
    expect(validateEmail('userexample.com')).toBe(false);
  });

  it('returns false for empty string', () => {
    expect(validateEmail('')).toBe(false);
  });

  it('returns false for null/undefined', () => {
    expect(validateEmail(null as any)).toBe(false);
    expect(validateEmail(undefined as any)).toBe(false);
  });
});

// STEP 2: GREEN - Agent implements to pass tests
// validateEmail.ts
export function validateEmail(email: unknown): boolean {
  if (typeof email !== 'string' || email === '') {
    return false;
  }
  // Simple regex - agent chose this based on test requirements
  const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  return emailRegex.test(email);
}

// STEP 3: VERIFY - Inner loop hook catches type errors automatically
// Pre-commit hook runs full test suite before allowing commit
Step 5: Blocking AI Smell Patterns
Add a pre-commit check that blocks common AI code smells.
# Add to .husky/pre-commit
# Check for AI smell patterns
echo "Checking for AI code smells..."
# TODO comments left by AI
if grep -rn "TODO" --include="*.ts" --include="*.tsx" src/; then
  echo "WARNING: TODO comments found - review before committing"
  # Note: Warning only, not blocking. Adjust as needed.
fi

# Console.log debugging left in
if grep -rn "console\.log" --include="*.ts" --include="*.tsx" src/; then
  echo "ERROR: console.log found - remove before committing"
  exit 1
fi

# Unused imports (common AI mistake)
npx eslint src/ --rule 'no-unused-vars: error' --quiet || {
  echo "ERROR: Unused variables found"
  exit 1
}
Your Task
Set up a complete TDD workflow for AI agents:
- Configure Claude Code hook for TypeScript validation
- Set up Husky pre-commit with type check + lint + tests
- Practice Spec-First Prompting - write a test spec for a feature
- Add AI smell detection to your pre-commit hook
- Test the flow - have an agent write code and verify hooks catch errors
3. Try It Yourself
// .claude/settings.json - Complete this configuration
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "// TODO: Add TypeScript check command",
            "timeout": 30
          }
        ]
      }
    ]
  }
}
// TODO: Create .husky/pre-commit with:
// 1. TypeScript type check
// 2. Linting
// 3. Test suite
// 4. AI smell detection (console.log, TODO)
This TypeScript exercise requires local setup. Copy the code to your IDE to run.
4. Get Help (If Needed)
5. Check the Solution
// .claude/settings.json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "node -e \"const fs = require('fs'); const data = JSON.parse(fs.readFileSync('/dev/stdin', 'utf8')); const fp = data.tool_input?.file_path || ''; if (fp.endsWith('.ts') || fp.endsWith('.tsx')) { const { execSync } = require('child_process'); try { execSync('npx tsc --noEmit', { stdio: 'pipe' }); } catch (e) { console.error(e.stdout?.toString() || e.message); process.exit(2); } }\"",
            "timeout": 30
          }
        ]
      }
    ]
  }
}
// .husky/pre-commit
#!/bin/bash
set -e
echo "Pre-commit checks starting..."
# 1. TypeScript
echo "[1/4] TypeScript type check..."
npx tsc --noEmit
# 2. Lint
echo "[2/4] Linting..."
npm run lint --silent
# 3. Tests
echo "[3/4] Running tests..."
npm test --silent
# 4. AI smell patterns
echo "[4/4] Checking for AI code smells..."
if grep -rn "console\.log" --include="*.ts" --include="*.tsx" src/ 2>/dev/null; then
  echo "ERROR: console.log statements found"
  exit 1
fi
echo "All checks passed!"
Common Mistakes
Asking agent to write tests AND implementation together
Why it's wrong: Agent writes weak tests that its own buggy code passes (Tautology Trap).
How to fix: Always separate: (1) Agent writes tests → (2) Human reviews → (3) Agent implements.
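The trap is easier to see in a tiny sketch (the `add` function here is hypothetical, chosen only to illustrate):

```javascript
// Tautology Trap sketch: an expectation derived from the buggy
// implementation passes by construction and verifies nothing.
function add(a, b) {
  return a - b; // buggy implementation the agent wrote
}

// BAD: the agent generated the expected value by running its own code.
const tautological = add(2, 1) === add(2, 1); // always true, bug survives

// GOOD: the expected value comes from the spec, written before the code.
const specDriven = add(2, 1) === 3; // false - the bug is caught

console.log({ tautological, specDriven }); // { tautological: true, specDriven: false }
```

Reviewing the test spec before any implementation exists is what keeps the expected values anchored to requirements rather than to the code under test.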
Not checking file extension in hook
Why it's wrong: Running tsc on JSON or markdown files wastes time and may error.
How to fix: Filter by file extension: if (fp.endsWith('.ts')) { ... }
Using exit code 1 instead of 2 in Claude hooks
Why it's wrong: Exit code 1 is a non-blocking warning. Exit code 2 blocks the tool and sends feedback.
How to fix: Use process.exit(2) for errors that should stop the agent and prompt self-correction.
Blocking all TODO comments
Why it's wrong: Legitimate TODOs are useful for tracking work. Only block if it's a policy.
How to fix: Make TODO detection a warning, not a blocker, or allow specific TODO formats.
Test Cases
Hook blocks TypeScript errors
Writing invalid TypeScript should trigger the hook and block with an error message.
Input: Agent writes const x: string = 123;
Expected: Hook exits with code 2, error message visible to agent
Pre-commit blocks console.log
Attempting to commit a file with console.log should fail.
Input: git commit with console.log in a staged file
Expected: Pre-commit hook exits 1, commit blocked
Clean code commits successfully
Valid TypeScript with passing tests should commit.
Input: git commit with valid code and passing tests
Expected: All checks pass, commit succeeds