GitHub Copilot now supports Claude (Opus/Sonnet), GPT-4, and Gemini, and Cursor and Windsurf offer even more options. Experienced users know that different models excel at different tasks, and that the roughly 100x cost difference between models matters. IDC predicts 70% of top enterprises will use dynamic model routing by 2028.
GitHub Copilot Multi-Model Support
GitHub
GitHub Copilot's multi-model support enables developers to choose the best model for each task. Key for model_routing dimension.
Key Findings:
GitHub Copilot supports multiple AI models from Anthropic, Google, and OpenAI
GitHub CEO: 'The era of a single model is over'
Developers can toggle between models during conversation
Anthropic API Pricing
Anthropic
Claude's tiered pricing shows a 25x cost difference from Opus ($5/$25) to Haiku ($1/$5) per million tokens, with the Batch API offering 50% discounts and prompt caching up to 90% savings. Understanding these tiers is fundamental to cost-aware model routing: developers must evaluate whether a task requires Opus's advanced reasoning or whether Haiku's speed and efficiency suffice. Key for model_routing dimension.
Key Findings:
Claude Opus 4.5: $5/$25 per million tokens (66% price drop from Opus 4.1)
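The pricing arithmetic above can be sketched as a quick cost estimator. This is a minimal illustration: the prices are the per-million-token figures quoted in this section, and the Batch API discount and prompt-cache savings are applied as simple multipliers, which is a simplification of the real billing rules.

```python
# Rough per-request cost estimator using the per-million-token prices
# quoted above (Opus 4.5: $5 in / $25 out; Haiku: $1 in / $5 out).
PRICES = {
    "opus-4.5": {"input": 5.00, "output": 25.00},
    "haiku": {"input": 1.00, "output": 5.00},
}

def estimate_cost(model, input_tokens, output_tokens,
                  batch=False, cached_input_tokens=0):
    """Estimate USD cost for one request.

    batch: Batch API modeled as a flat 50% discount on the request.
    cached_input_tokens: cache reads modeled at 10% of the normal
    input rate (the 'up to 90% savings' mentioned above).
    """
    p = PRICES[model]
    fresh = input_tokens - cached_input_tokens
    cost = (fresh * p["input"]
            + cached_input_tokens * p["input"] * 0.10
            + output_tokens * p["output"]) / 1_000_000
    return cost * 0.5 if batch else cost

# Same workload (100k in / 10k out) on both tiers:
opus_cost = estimate_cost("opus-4.5", 100_000, 10_000)   # $0.75
haiku_cost = estimate_cost("haiku", 100_000, 10_000)     # $0.15
```

Running the same workload through both tiers makes the routing question concrete: a 5x gap per request, before batch or caching discounts, so the decision hinges on whether the task actually needs the frontier model.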
IDC predicts model routing will become standard in enterprise AI. Different models excel at different tasks, and using a single model for everything means suboptimal results. Heavy users have the most to gain from routing.
Key Findings:
By 2028, 70% of top AI-driven enterprises will use multi-tool architectures with dynamic model routing
AI models work best when somewhat specialized for targeted use cases
Even SOTA models are delivered as mixtures of experts with routing
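Dynamic model routing as described above can be illustrated with a minimal rule table mapping task profiles to models. The task labels and model names here are placeholders for whatever your stack exposes, not a recommended production policy.

```python
# Toy dynamic router: map a task profile to a model tier.
# Task labels and model names are illustrative placeholders.
ROUTES = {
    "complex_refactor": "claude-opus",   # deep multi-file reasoning
    "boilerplate": "gemini-flash",       # fast and cheap
    "debugging": "reasoning-model",      # extended thinking
    "long_context": "gemini-pro-1m",     # 1M-token window
}

def route(task_type, default="claude-sonnet"):
    """Pick a model for a task; fall back to a general-purpose default."""
    return ROUTES.get(task_type, default)

print(route("boilerplate"))    # gemini-flash
print(route("code_review"))    # claude-sonnet (fallback)
```

Production routers classify tasks automatically rather than by explicit label, but the core idea is the same: a policy layer in front of multiple models, with a sensible general-purpose default.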
Claude Opus 4.5 leads SWE-bench (80.9%), but GPT-5-Codex (74.5%) and Gemini Flash (faster, cheaper) each excel at different tasks. Augment Code's production data shows developers assembling 'model alloys'—matching Sonnet 4.5 to multi-file reasoning, Sonnet 4.0 to fast structured tasks, GPT-5 to explanatory contexts. The skill gap has moved from 'which is best?' to 'best for what?'
Assessment Questions (5)
Maximum possible score: 19 points
Q1 · single choice · 4 pts
Are you aware that GitHub Copilot supports multiple AI models (Claude, GPT, Gemini)?
[0]No, I didn't know this
[1]Yes, but I always use the default
[2]Yes, I've tried different models occasionally
[4]Yes, I regularly switch based on the task
Q2 · single choice · 4 pts
How do you select AI models for different coding tasks?
[0]I use whatever is default—I don't think about model selection
[1]I use the same model for everything (my favorite)
[2]I have rough preferences (e.g., Claude for refactoring, GPT for docs)
[3]I systematically match models to tasks, considering cost and speed tradeoffs
[4]I assemble 'model alloys'—matching cognitive styles (reasoning vs fast) to task profiles
Developers are choosing older AI models — and the data explain why
Augment Code
Data-driven analysis showing that in production environments, developers are diversifying model usage rather than consolidating around the newest option. Sonnet 4.5 excels at multi-file reasoning but introduces latency; Sonnet 4.0 is faster and more consistent for structured tasks; GPT-5 excels at explanatory contexts. This supports the need for model routing strategies rather than single-model approaches.
Key Findings:
Model adoption is diversifying, not consolidating around one 'best' model
Developers match models to specific task profiles rather than always using newest
Sonnet 4.5 share dropped from 66% to 52% while Sonnet 4.0 rose from 23% to 37%
Q3 · multiple choice · 5 pts
Which of the following model-task pairings do you use?
[1]Claude Opus/Sonnet for complex refactoring or architecture
[1]Gemini Flash or GPT-3.5 for simple tasks (tests, docs)
[1]Reasoning models (o3, Claude thinking) for debugging
[1]Gemini for very long context (1M+ tokens)
[1]Thinking triggers (think hard, ultrathink) for complex problems
[0]I don't think about model selection
Note: Thinking triggers (ultrathink) activate extended reasoning budgets in Claude Code. Gemini Deep Think uses parallel reasoning. Different cognitive styles for different tasks.
Claude Opus 4.5
Anthropic
Claude Opus 4.5 sets a new bar for AI coding with 80.9% SWE-bench Verified. Key for model_routing dimension - represents current state-of-the-art for complex coding tasks.
Key Findings:
80.9% on SWE-bench Verified - first AI model over 80%
Gemini's 1M token context window is among the largest available, enabling whole-codebase understanding. Key for context_curation and model_routing dimensions.
Key Findings:
Gemini models support up to 1M token context window (1,048,576 tokens)
Can process hours of video, audio, and 60,000+ lines of code in single context
Gemini 2.5 Pro, 2.5 Flash, 3.0 Pro, 3.0 Flash all support 1M tokens
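Whether a codebase fits a 1M-token window can be estimated with a rough heuristic. The ~4 characters per token figure below is a common rule of thumb, not an exact tokenizer count, which varies by model and content.

```python
# Rough check of whether a codebase fits in a 1M-token context window.
# Assumes ~4 characters per token (a heuristic; real tokenizers vary).
CONTEXT_LIMIT = 1_048_576  # Gemini's 1M-token window, per above

def fits_in_context(total_chars, chars_per_token=4):
    """Return (fits, estimated_tokens) for a body of text."""
    tokens = total_chars / chars_per_token
    return tokens <= CONTEXT_LIMIT, int(tokens)

# 60,000 lines at ~40 chars/line ≈ 2.4M chars ≈ 600k tokens
ok, tokens = fits_in_context(60_000 * 40)
```

This back-of-the-envelope check is consistent with the claim above that 60,000+ lines of code can fit in a single 1M-token context.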
Gemini 2.5 Pro (March 2025) introduced 'thinking models' with a 1M-token context window. Deep Think mode extends inference time for complex reasoning tasks, achieving Bronze-level IMO performance. Gemini 3 Pro, announced in November 2025, replaces 2.5 Pro as the flagship. Critical for understanding the 'reasoning model' paradigm shift and extended thinking capabilities.
Key Findings:
Gemini 2.5 Pro released March 2025 with 1M token context window
Deep Think mode uses extended inference time for complex reasoning
Q4 · single choice · 3 pts
Are you aware of the cost differences between AI models?
[0]No, I don't think about cost
[1]Vaguely—I know some are more expensive
[2]Yes, I know approximate cost ratios
[3]Yes, I factor cost into model selection decisions
Q5 · single choice · 3 pts
When selecting a model for a task, do you consider speed/latency tradeoffs?
[0]No, I don't think about latency
[1]Sometimes—I notice when a model is slow but don't switch
[2]Yes, I use faster models for simple tasks to avoid waiting
[3]Yes, I balance latency, quality, and cost based on task urgency
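Balancing latency, quality, and cost by task urgency, as in the top-scoring answer above, can be sketched as a simple weighted score. The per-model numbers and weights below are invented for illustration, not measured benchmarks.

```python
# Sketch of balancing latency, quality, and cost by task urgency.
# Model stats and weights are illustrative placeholders.
MODELS = {
    "fast-small": {"quality": 0.6, "latency_s": 1.0, "cost": 0.1},
    "frontier":   {"quality": 0.9, "latency_s": 8.0, "cost": 1.0},
}

def pick_model(urgency):
    """urgency in [0, 1]: 1 = interactive work, latency dominates;
    0 = background batch job, quality per dollar dominates."""
    def score(name):
        s = MODELS[name]
        return (s["quality"]
                - urgency * 0.05 * s["latency_s"]
                - (1 - urgency) * 0.2 * s["cost"])
    return max(MODELS, key=score)
```

With these numbers, an interactive task (urgency 1.0) routes to the fast model and a background task (urgency 0.0) routes to the frontier model, capturing the tradeoff the question is probing.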