
Choosing the Right Model: Matching Task to Capability

beginner 10 min 📋 Copy-paste ready
Sources not yet verified
Tags: models, selection, cost, beginner

Scenario

Context: You have access to multiple AI models and need to decide which to use for different tasks

Goal: Learn to match task complexity to model capability for optimal cost and quality

Anti-pattern: Always using the most powerful model, or always using the cheapest

Tools: Claude (Haiku/Sonnet/Opus), GPT-4/4o/4o-mini, various IDEs

Conversation

The Model Selection Problem

Different models have different:

  • Capabilities (reasoning, code generation, context length)
  • Speed (tokens per second)
  • Cost (per million tokens)
  • Latency (time to first token)

Using Opus/GPT-4 for everything is expensive. Using mini models for complex tasks produces poor results.

👤 You

I need to rename a variable from x to userCount across 5 files. Should I use Claude Opus for this?

A simple, mechanical task that doesn't need a powerful model.
🤖 AI

No - this is a perfect task for a fast, cheap model (Haiku, GPT-4o-mini) or even just your IDE's find-and-replace.

Task characteristics:

  • Purely mechanical (find X, replace with Y)
  • No reasoning required
  • No context needed beyond the file

Better options:

  1. IDE find-and-replace (free, instant)
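  2. CLI (GNU sed): sed -i 's/\bx\b/userCount/g' *.js (review the diff afterward: this also rewrites x inside strings, comments, and property names such as point.x)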
  3. Haiku/GPT-4o-mini if you want AI to check for naming conflicts

Rule: If a regex could do it, you probably don't need a frontier model.

Simple tasks don't need powerful models - match capability to need.
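The regex rule can be tried out directly. The following is a minimal Python sketch of the same whole-word rename that the sed command performs; the helper name `rename_identifier` and the sample snippet are illustrative assumptions, not part of any tool mentioned above.

```python
import re


def rename_identifier(source: str, old: str, new: str) -> str:
    """Replace whole-word occurrences of `old` with `new` in `source`."""
    # \b matches a word boundary, so "x" inside "max" or "index" is left alone.
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    # A function replacement avoids backslash interpretation in `new`.
    return pattern.sub(lambda _match: new, source)


sample = "let x = 1;\nlet max = x + index;\nconsole.log('x is', x);"
print(rename_identifier(sample, "x", "userCount"))
```

Note that the `x` inside the string literal `'x is'` is renamed too. A regex has no notion of scope or string boundaries, which is why an IDE rename (or a cheap model checking for conflicts) is the safer choice when the name is common.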

When to Upgrade

👤 You

What about refactoring a complex function with multiple responsibilities into smaller, well-designed functions?

This requires understanding, judgment, and design decisions.
🤖 AI

This warrants a more capable model. Here's why:

Task characteristics:

  • Requires understanding the function's purpose
  • Needs judgment about how to split responsibilities
  • Must preserve behavior while changing structure
  • Benefits from knowledge of design patterns

Model recommendation: Sonnet/GPT-4o or higher

Prompt structure for refactoring:

Refactor this function following the Single Responsibility Principle:

[paste function]

Requirements:
- Each new function should have one clear purpose
- Preserve existing behavior (I'll verify with tests)
- Use descriptive names that explain what each function does
- Show me the dependency graph between new functions

Why not Opus/GPT-4?
Sonnet-class models are usually sufficient for single-function refactoring. Save the top tier for:

  • System-wide architecture decisions
  • Complex debugging across multiple files
  • Nuanced tradeoff analysis

Mid-tier models handle most coding tasks well. Reserve top tier for the hardest problems.

Model Selection Framework

👤 You

Give me a quick decision framework for choosing models.

A practical heuristic for everyday use.
🤖 AI

Quick Decision Framework:

Task Type          Recommended        Examples
Mechanical         Mini/Free          Rename, format, simple regex
Standard Code      Mid-tier           Write functions, add features, fix bugs
Complex Reasoning  Top-tier           Architecture, debugging mysteries, security review
Long Context       Context-optimized  Codebase Q&A, document analysis

Decision Questions:

  1. Could a regex do this? -> Use IDE or mini model
  2. Is it a well-defined coding task? -> Mid-tier (Sonnet/GPT-4o)
  3. Does it need judgment or creativity? -> Mid-tier or higher
  4. Is it high-stakes (security, architecture)? -> Top-tier
  5. Do I need to reference many files? -> Check context window limits
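
The decision questions can be written down as a small function. This is only a sketch: the `Task` fields, the `MANY_FILES` threshold, and the choice to check high stakes before judgment are assumptions made for illustration, not a fixed rule.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a task; field names are illustrative."""
    regex_could_do_it: bool = False
    high_stakes: bool = False      # e.g. security, architecture
    needs_judgment: bool = False   # design decisions, creativity
    files_referenced: int = 1


# Assumed cutoff for "many files"; tune it to your model's context window.
MANY_FILES = 10


def recommend(task: Task) -> tuple[str, bool]:
    """Return (recommended tier, whether to check context window limits)."""
    check_context = task.files_referenced >= MANY_FILES
    if task.regex_could_do_it:
        return "IDE or mini model", check_context
    if task.high_stakes:
        return "top-tier", check_context
    if task.needs_judgment:
        return "mid-tier or higher", check_context
    return "mid-tier", check_context


print(recommend(Task(regex_could_do_it=True)))
print(recommend(Task(high_stakes=True, files_referenced=25)))
```

High stakes is checked before judgment here so that a risky task that also needs judgment lands on the top tier rather than stopping at "mid-tier or higher".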

Cost Reality Check:

Haiku/GPT-4o-mini: ~$0.25/M input, $1/M output
Sonnet/GPT-4o:     ~$3/M input, $15/M output
Opus/GPT-4:        ~$15/M input, $75/M output

At these rates, one top-tier call costs roughly as much as 60 to 75 mini calls with the same token counts.
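
To see what these rates mean per request, the arithmetic can be sketched in a few lines of Python. The prices are the approximate figures listed above, used as placeholder values; the tier keys and the 2,000-input / 500-output token counts are assumptions chosen for illustration.

```python
# Approximate USD prices per million tokens: (input, output).
# Placeholder values copied from the list above; real prices change.
PRICES = {
    "mini": (0.25, 1.00),
    "mid": (3.00, 15.00),
    "top": (15.00, 75.00),
}


def request_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD of one request at the given tier."""
    input_price, output_price = PRICES[tier]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


for tier in PRICES:
    cost = request_cost(tier, input_tokens=2_000, output_tokens=500)
    print(f"{tier}: ${cost:.4f}")
```

For this example request the estimates come out to about $0.0010 (mini), $0.0135 (mid), and $0.0675 (top), so the gap between tiers is far larger than the gap between neighboring price points might suggest.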

Pro tip: Many IDEs let you switch models per-request. Start with mid-tier, escalate if the output isn't good enough.

Have a default tier, escalate when needed, and be aware of costs.
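
The "start with a default, escalate when needed" habit can also be automated. The sketch below assumes you supply two callables of your own: `call_model(tier, prompt)`, a hypothetical wrapper around whatever client you use, and `is_acceptable(output)`, your own quality check (tests passing, a linter, a manual review flag). Neither is a real library API.

```python
from typing import Callable, Sequence


def generate_with_escalation(
    prompt: str,
    tiers: Sequence[str],
    call_model: Callable[[str, str], str],
    is_acceptable: Callable[[str], bool],
) -> tuple[str, str]:
    """Try each tier in order; return (tier used, output) for the first acceptable result.

    If no tier produces an acceptable output, the last tier's output is returned.
    """
    if not tiers:
        raise ValueError("tiers must contain at least one model tier")
    output = ""
    for tier in tiers:
        output = call_model(tier, prompt)
        if is_acceptable(output):
            break
    return tier, output


# Stand-in for a real model call, used only to demonstrate the control flow.
def fake_call_model(tier: str, prompt: str) -> str:
    return "TODO" if tier == "mid" else "def add(a, b): return a + b"


tier, output = generate_with_escalation(
    "Write add()", ["mid", "top"], fake_call_model, lambda out: "TODO" not in out
)
print(tier)
```

Ordering `tiers` from cheapest to most capable means you only pay top-tier prices when the cheaper attempt fails your check.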

Key Takeaways

  • Match model capability to task complexity
  • Simple mechanical tasks don't need frontier models
  • Most coding tasks work well with mid-tier models (Sonnet/GPT-4o)
  • Reserve top-tier for high-stakes or complex reasoning tasks
  • Consider cost: at typical list prices, one top-tier call can cost as much as dozens of mini calls

Try It Yourself

Prompt Template

[Before sending your prompt, ask yourself:]

1. Could find-and-replace do this? -> Use IDE
2. Is this straightforward coding? -> Mid-tier model
3. Does this need deep reasoning? -> Top-tier model
4. How many tokens will this use? -> Check if cost matters
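
For question 4, a rough token count is usually enough to decide whether cost matters. The sketch below uses the common rule of thumb of about four characters per token for English text. That ratio is an assumption, not an exact figure, and real tokenizers vary by model and by content (code and non-English text often use more tokens).

```python
import math

# Rule-of-thumb ratio for English prose; an assumption, not a tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


prompt = "Refactor this function following the Single Responsibility Principle:"
print(estimate_tokens(prompt))
```

For an exact count, use the tokenizer or token-counting facility your model provider documents.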

Variations to Try

  • Start with a cheaper model, see if output quality is acceptable
  • For experiments/drafts, use mini models first
  • For production code review, consider upgrading model tier

Tempered AI Forged Through Practice, Not Hype
