JSON Schema for Structured Outputs

Type Systems beginner 15 min

Sources verified Dec 22

JSON Schema defines the exact structure and constraints for LLM outputs, ensuring type-safe, validated responses without post-processing guesswork.

JSON Schema is a vocabulary for annotating and validating JSON documents. When applied to LLM outputs, it transforms unpredictable text generation into structured, validated data that your code can safely consume.

The key insight: Instead of parsing free-form text and hoping the LLM followed your instructions, you define the output schema upfront and the LLM is constrained to produce only valid JSON matching that schema.

Why JSON Schema for AI?

Without structured outputs, you get:

Unpredictable text formats that require complex parsing
Missing fields you expected
Type mismatches (strings when you need numbers)
Inconsistent naming (sometimes 'email', sometimes 'emailAddress')

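With JSON Schema, and a provider that enforces it during generation, the LLM's output is constrained to match your specification. Where the schema is only passed along as a hint, validate the response yourself.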

person_schema.json
 {
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["name", "email", "skills"],
  "properties": {
    "name": {
      "type": "string",
      "description": "Full name of the person"
    },
    "email": {
      "type": "string",
      "format": "email",
      "description": "Valid email address"
    },
    "age": {
      "type": "integer",
      "minimum": 0,
      "maximum": 150,
      "description": "Age in years (optional)"
    },
    "skills": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "minItems": 1,
      "description": "List of professional skills"
    },
    "experience_level": {
      "type": "string",
      "enum": ["junior", "mid", "senior", "staff"],
      "description": "Career level"
    }
  },
  "additionalProperties": false
} 
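  required: listed fields MUST be present in the output 
  format: "email" requests email validation (some validators and providers treat format as optional) 
  minimum / maximum: numeric constraints enforce business rules 
  enum: restricts a field to specific values 
  additionalProperties: false prevents the LLM from adding unexpected fields 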
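
To make the keywords in person_schema.json concrete, here is a minimal sketch of the checks they imply, written by hand. It is not a JSON Schema validator: `checkPerson` is a hypothetical helper, it covers only the keywords used above, and it skips the `format: "email"` check. In practice a library such as Ajv would do this work.

```typescript
// Hand-written checks mirroring a subset of person_schema.json.
// Illustrative only; `checkPerson` is a hypothetical helper name.
const EXPERIENCE_LEVELS: readonly string[] = ['junior', 'mid', 'senior', 'staff'];
const ALLOWED_KEYS: readonly string[] = ['name', 'email', 'age', 'skills', 'experience_level'];

function checkPerson(value: unknown): string[] {
  if (typeof value !== 'object' || value === null || Array.isArray(value)) {
    return ['must be an object'];
  }
  const obj = value as Record<string, unknown>;
  const errors: string[] = [];

  // required: every listed field must be present
  for (const key of ['name', 'email', 'skills']) {
    if (!(key in obj)) errors.push(`missing required field "${key}"`);
  }

  // additionalProperties: false rejects keys not declared under properties
  for (const key of Object.keys(obj)) {
    if (!ALLOWED_KEYS.includes(key)) errors.push(`unexpected field "${key}"`);
  }

  // type: string (only checked when the field is present)
  for (const key of ['name', 'email']) {
    if (key in obj && typeof obj[key] !== 'string') errors.push(`"${key}" must be a string`);
  }

  // type: integer with minimum 0 and maximum 150
  if ('age' in obj) {
    const age = obj['age'];
    if (typeof age !== 'number' || !Number.isInteger(age) || age < 0 || age > 150) {
      errors.push('"age" must be an integer between 0 and 150');
    }
  }

  // type: array, items: string, minItems: 1
  if ('skills' in obj) {
    const skills = obj['skills'];
    if (!Array.isArray(skills) || skills.length < 1 || !skills.every((s) => typeof s === 'string')) {
      errors.push('"skills" must be a non-empty array of strings');
    }
  }

  // enum: value must be one of the listed strings
  if ('experience_level' in obj) {
    const level = obj['experience_level'];
    if (typeof level !== 'string' || !EXPERIENCE_LEVELS.includes(level)) {
      errors.push('"experience_level" must be one of: ' + EXPERIENCE_LEVELS.join(', '));
    }
  }
  return errors;
}
```

Calling `checkPerson` on an object without `age` or `experience_level` returns no errors for those fields, because they are not listed in `required`.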

OpenAI Structured Outputs Example

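OpenAI's response_format parameter, with type 'json_schema' and strict: true, enforces JSON Schema compliance. Strict mode supports only a subset of JSON Schema: for example, every property must be listed in required, and additionalProperties must be false on each object.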

openai_structured_output.ts
 import OpenAI from 'openai';

const openai = new OpenAI();

const schema = {
  type: 'object',
  required: ['entities', 'sentiment', 'summary'],
  properties: {
    entities: {
      type: 'array',
      items: {
        type: 'object',
        required: ['text', 'type'],
        properties: {
          text: { type: 'string' },
          type: { 
            type: 'string', 
            enum: ['person', 'organization', 'location', 'date']
          }
        },
        additionalProperties: false
      }
    },
    sentiment: {
      type: 'string',
      enum: ['positive', 'negative', 'neutral']
    },
    summary: { type: 'string' }
  },
  additionalProperties: false
};

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'Analyze this email: ...' }
  ],
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'email_analysis',
      schema: schema,
      strict: true  // Enforces schema compliance
    }
  }
});

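// With strict: true the content should conform to the schema, but it can
// still be null (for example on a refusal) or cut off by a token limit.
const content = response.choices[0].message.content;
if (content === null) throw new Error('No content returned');
const result = JSON.parse(content); // typed as `any`: TypeScript does not infer the schema
console.log(result.entities);  // array of { text, type } per the schema
console.log(result.sentiment); // 'positive' | 'negative' | 'neutral' per the schema
  strict: true enables schema enforcement 
  With strict mode the structure is enforced by the API; still handle refusals and truncated output 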
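
Since `JSON.parse` returns an untyped value, one way to get real TypeScript types is a type guard applied after parsing. The sketch below assumes the `email_analysis` schema above; `EmailAnalysis`, `isEmailAnalysis`, and `parseEmailAnalysis` are hypothetical names introduced for illustration, not part of the OpenAI SDK.

```typescript
// Types written by hand to mirror the email_analysis schema.
type EntityType = 'person' | 'organization' | 'location' | 'date';
type Sentiment = 'positive' | 'negative' | 'neutral';

interface EmailAnalysis {
  entities: { text: string; type: EntityType }[];
  sentiment: Sentiment;
  summary: string;
}

const ENTITY_TYPES: readonly string[] = ['person', 'organization', 'location', 'date'];
const SENTIMENTS: readonly string[] = ['positive', 'negative', 'neutral'];

// Type guard: narrows `unknown` to EmailAnalysis when the shape matches.
function isEmailAnalysis(value: unknown): value is EmailAnalysis {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  const entities = v['entities'];
  const sentiment = v['sentiment'];
  const summary = v['summary'];
  if (!Array.isArray(entities)) return false;
  const entitiesOk = entities.every((e: unknown) => {
    if (typeof e !== 'object' || e === null) return false;
    const ent = e as Record<string, unknown>;
    const text = ent['text'];
    const type = ent['type'];
    return typeof text === 'string' && typeof type === 'string' && ENTITY_TYPES.includes(type);
  });
  return (
    entitiesOk &&
    typeof sentiment === 'string' &&
    SENTIMENTS.includes(sentiment) &&
    typeof summary === 'string'
  );
}

// Turns raw message content into a typed value, or throws with a reason.
function parseEmailAnalysis(content: string | null): EmailAnalysis {
  if (content === null) {
    throw new Error('Model returned no content (possibly a refusal)');
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(content);
  } catch {
    throw new Error('Model output was not valid JSON (possibly truncated)');
  }
  if (!isEmailAnalysis(parsed)) {
    throw new Error('Model output did not match the expected shape');
  }
  return parsed;
}
```

With this in place, `parseEmailAnalysis(response.choices[0].message.content)` would return a value whose `sentiment` field TypeScript actually knows to be one of the three literals.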

Common Schema Patterns

Nested Objects

{
  "type": "object",
  "properties": {
    "address": {
      "type": "object",
      "required": ["city", "country"],
      "properties": {
        "street": { "type": "string" },
        "city": { "type": "string" },
        "country": { "type": "string" }
      }
    }
  }
}
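
One detail worth noticing in the nested schema: `city` and `country` are required inside `address`, but `address` itself is not listed in any top-level `required`, so an empty object is also valid. A sketch of the corresponding TypeScript shape, with hypothetical names `Address`, `Contact`, and `formatAddress`:

```typescript
// Hand-written types mirroring the nested schema above.
interface Address {
  street?: string; // not in "required", so optional
  city: string;
  country: string;
}

interface Contact {
  address?: Address; // the outer object declares no required fields
}

// Formats whatever parts are present, skipping a missing street.
function formatAddress(contact: Contact): string {
  if (!contact.address) return 'no address';
  const { street, city, country } = contact.address;
  return [street, city, country].filter((part) => part !== undefined).join(', ');
}
```

If consumers always need an address, adding `"required": ["address"]` at the top level would make the schema say so.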

Arrays of Specific Types

{
  "type": "array",
  "items": {
    "type": "object",
    "required": ["id", "name"],
    "properties": {
      "id": { "type": "integer" },
      "name": { "type": "string" }
    }
  },
  "minItems": 1,
  "maxItems": 10
}

Discriminated Unions (oneOf)

{
  "oneOf": [
    {
      "type": "object",
      "required": ["type", "url"],
      "properties": {
        "type": { "const": "link" },
        "url": { "type": "string", "format": "uri" }
      }
    },
    {
      "type": "object",
      "required": ["type", "path"],
      "properties": {
        "type": { "const": "file" },
        "path": { "type": "string" }
      }
    }
  ]
}
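
On the consuming side, the `oneOf` schema above maps naturally onto a TypeScript discriminated union keyed on `type`. The following is a sketch with hypothetical names (`Attachment`, `toAttachment`, `describeAttachment`). It does not check `format: "uri"`. It is also worth checking whether your provider's strict mode accepts `oneOf`, since some accept only `anyOf`.

```typescript
// Union type mirroring the two oneOf branches.
type Attachment =
  | { type: 'link'; url: string }
  | { type: 'file'; path: string };

// Converts an untyped value into an Attachment, or null if neither branch fits.
function toAttachment(value: unknown): Attachment | null {
  if (typeof value !== 'object' || value === null) return null;
  const v = value as Record<string, unknown>;
  const url = v['url'];
  const path = v['path'];
  if (v['type'] === 'link' && typeof url === 'string') return { type: 'link', url };
  if (v['type'] === 'file' && typeof path === 'string') return { type: 'file', path };
  return null;
}

// Switching on the discriminator lets the compiler narrow each branch.
function describeAttachment(a: Attachment): string {
  switch (a.type) {
    case 'link':
      return `link to ${a.url}`;
    case 'file':
      return `file at ${a.path}`;
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a branch is added.
      const unreachable: never = a;
      throw new Error(`unhandled attachment: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

Because each branch pins `type` to a constant, at most one branch can match a given object, which is what `oneOf` requires.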

When to Use JSON Schema

Use JSON Schema When	Consider Alternatives When
You need validated, structured data	Free-form text is acceptable
Output feeds directly into your code	Output is for human reading
Type safety matters (production apps)	Prototyping/experimenting
You need arrays, nested objects, enums	Simple key-value extraction
Compliance/audit requires validation	Speed is more critical than validation

JSON Schema Best Practices

Start simple: Begin with required fields only, add constraints later
Use descriptions: They guide the LLM on what to extract
Set additionalProperties: false: Prevents unexpected fields
Use enums for categories: Better than free-form strings
Validate business rules: Use minimum, maxLength, pattern, etc.
Version your schemas: Track changes over time
Test with edge cases: Empty arrays, optional fields, null values
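
As an illustration of the enum and edge-case advice above, one possible pattern keeps the list of allowed values in a single constant and derives both the schema fragment and the TypeScript type from it. The names `CAREER_LEVELS`, `levelSchema`, and `isCareerLevel` are hypothetical.

```typescript
// Single source of truth for the allowed values.
const CAREER_LEVELS = ['junior', 'mid', 'senior', 'staff'] as const;

// Union type derived from the constant: 'junior' | 'mid' | 'senior' | 'staff'.
type CareerLevel = (typeof CAREER_LEVELS)[number];

// Schema fragment built from the same constant, so the two cannot drift apart.
const levelSchema = {
  type: 'string',
  enum: [...CAREER_LEVELS],
  description: 'Career level',
};

// Runtime check matching the enum constraint.
function isCareerLevel(value: unknown): value is CareerLevel {
  return typeof value === 'string' && (CAREER_LEVELS as readonly string[]).includes(value);
}

// Edge cases worth covering: wrong case, empty string, null, wrong type.
const edgeCases: unknown[] = ['Senior', '', null, 3];
const rejected = edgeCases.filter((value) => !isCareerLevel(value));
console.log(rejected.length === edgeCases.length); // every edge case is rejected
```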

Key Takeaways

JSON Schema defines structure, types, and constraints for LLM outputs
OpenAI's structured outputs with strict: true enforce schema compliance, but you still need to handle refusals and truncated responses
Use `required`, `enum`, `additionalProperties: false` for strict typing
Descriptions guide the LLM on what to extract
JSON Schema is language-agnostic but pairs with Pydantic (Python) and Zod (TypeScript)
Not all providers enforce schemas; check each provider's documentation for the guarantees it makes

In This Platform

This entire platform is built on JSON Schema. Every content type (dimension, question, source, concept, module) has a corresponding schema that validates structure at build time. The schema files serve as both documentation and enforcement.

Relevant Files:

schema/survey.schema.json
schema/concept.schema.json
schema/learning.schema.json
schema/exercise.schema.json
schema/comparison.schema.json

build.js (excerpt)
// build.js validates all content against schemas
import fs from 'node:fs';
import Ajv from 'ajv';
import addFormats from 'ajv-formats';

const ajv = new Ajv({ allErrors: true });
addFormats(ajv);

// Load schema (read as UTF-8 text before parsing)
const questionSchema = JSON.parse(fs.readFileSync('schema/survey.schema.json', 'utf8'));
const validate = ajv.compile(questionSchema);

// Validate each question (`questions` is loaded elsewhere in build.js, not shown in this excerpt)
for (const question of questions) {
  const valid = validate(question);
  if (!valid) {
    console.error(`Invalid question ${question.id}:`);
    console.error(validate.errors);
    process.exit(1); // fail the build on the first invalid item
  }
}

// Only valid content makes it to production
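
The raw `validate.errors` array can be hard to scan in build logs. Below is a sketch of a formatter, assuming error objects shaped like Ajv v8's (with `instancePath`, `keyword`, and an optional `message`). The `ValidationIssue` interface is declared locally so the example stands alone, `formatIssues` is a hypothetical helper rather than anything in the platform's build.js, and the sample errors are made up.

```typescript
// Local stand-in for the fields of an Ajv v8 error object used here.
interface ValidationIssue {
  instancePath: string; // JSON Pointer to the failing value; '' means the root
  keyword: string;      // schema keyword that failed, e.g. 'required'
  message?: string;
}

// Produces one readable line per validation error.
function formatIssues(id: string, issues: ValidationIssue[]): string[] {
  return issues.map((issue) => {
    const where = issue.instancePath === '' ? '(root)' : issue.instancePath;
    return `${id} ${where}: ${issue.message ?? issue.keyword}`;
  });
}

// Sample data for illustration only.
const sampleIssues: ValidationIssue[] = [
  { instancePath: '', keyword: 'required', message: "must have required property 'title'" },
  { instancePath: '/options/0', keyword: 'type' },
];

for (const line of formatIssues('q-001', sampleIssues)) {
  console.error(line);
}
```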
