With 1M+ token context windows (Gemini) and automatic chain-of-thought (Opus 4.5, o3), 'prompt engineering' has evolved into 'context curation'. The skill is no longer tricking the model—it's providing the right codebase context, creating rule files, and writing clear specifications. Context engineering is replacing prompt engineering as the control mechanism for production AI.
Long context | Gemini API Documentation
Google
Gemini's 1M token context window is among the largest available, enabling whole-codebase understanding. Key for context_curation and model_routing dimensions.
Key Findings:
Gemini models support up to 1M token context window (1,048,576 tokens)
Can process hours of video, audio, and 60,000+ lines of code in a single context
Gemini 2.5 Pro, 2.5 Flash, 3.0 Pro, 3.0 Flash all support 1M tokens
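A minimal sketch of whole-codebase context, assuming the google-genai Python SDK and a GEMINI_API_KEY in the environment; the file-gathering helper and model name are illustrative, and a real project should watch token counts rather than blindly concatenating:

```python
# Minimal sketch: whole-codebase context via the google-genai SDK
# (`pip install google-genai`). The gather helper is illustrative.
from pathlib import Path

from google import genai


def gather_codebase(root: str, suffixes=(".py", ".ts", ".md")) -> str:
    """Concatenate source files with path headers so answers can cite files."""
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            parts.append(f"--- {path} ---\n{path.read_text(errors='ignore')}")
    return "\n\n".join(parts)


client = genai.Client()  # reads GEMINI_API_KEY from the environment
response = client.models.generate_content(
    model="gemini-2.5-pro",  # any of the 1M-context Gemini models
    contents=gather_codebase("./my-project")
    + "\n\nQuestion: where is request authentication handled?",
)
print(response.text)
```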
Claude Opus 4.5 sets a new bar for AI coding with 80.9% on SWE-bench Verified. Key for model_routing dimension - represents the current state of the art for complex coding tasks.
Key Findings:
80.9% on SWE-bench Verified - the first model to score above 80%
Explains the shift from prompt engineering to context engineering. As organizations move from pilots to production, prompt engineering alone cannot deliver the accuracy, memory, or governance required. Context includes conversation history, retrieved documents, tool outputs, and agent state.
Key Findings:
Context engineering is replacing prompt engineering for production AI
Anthropic formalized the concept in September 2025
Prompt engineering is now a subset of context engineering
Models now do reasoning automatically. The bottleneck is what context you provide, not how you phrase requests. Power users employ 'Document & Clear' (save progress, clear context, continue), the 'Scout pattern' (throwaway attempts to discover complexity), and phased workflows (Research → Plan → Implement with clearing between phases).
Assessment Questions (10)
Maximum possible score: 53 points
Q1 (single choice, 4 pts)
What scope of codebase context do you typically provide to AI tools?
[1] I just ask questions without providing context
[2] I paste relevant code snippets manually
[3] I use @file/@folder references to include relevant files
[4] I use @codebase or let the AI analyze my full project
Note: Measures progression from no context to full codebase awareness. Rules files and visual context are separate questions.
GitHub Copilot Documentation - Supported AI Models
GitHub
GitHub Copilot now supports multiple AI models, including Claude 3.5 Sonnet, GPT-4o, and Gemini 1.5 Pro, allowing developers to switch models per conversation based on task requirements. This multi-model capability is foundational for model routing maturity: understanding that different models excel at different tasks enables cost-effective, capability-optimized AI development workflows.
Do you use visual context (screenshots, diagrams, images) when working with AI tools?
[0] No - I only use text-based context
[1] Rarely - only for specific visual bugs
[2] Sometimes - I paste error screenshots or UI mockups
[3] Regularly - I use screenshots, diagrams, and images as standard practice
Note: Visual context is a power user technique - models interpret images with surprising accuracy
Power User AI Coding Workflow Tips
Craft Better Software
Practical power user tips for AI coding workflows. Key insights: (1) paste screenshots instead of describing bugs in text, (2) create reusable slash commands for repeated workflows. Both patterns dramatically reduce friction in AI-assisted development.
Key Findings:
Paste screenshots liberally - models handle images extremely well
Create custom slash commands for repeated workflows
Note: Team-shared configuration indicates organizational maturity. Custom slash commands are reusable macros for AI workflows.
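In Claude Code, for example, a custom slash command is just a markdown file under .claude/commands/ whose filename becomes the command name. The /review command below is hypothetical; only the file location and the $ARGUMENTS placeholder are real mechanisms:

```markdown
<!-- Hypothetical command file: .claude/commands/review.md, invoked as /review <target> -->
Review $ARGUMENTS for:
1. Error handling gaps and unvalidated inputs
2. Deviations from the conventions in our rules file
3. Missing or stale tests
Report findings as a prioritized checklist before proposing any edits.
```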
Cursor Documentation - Rules for AI
Cursor
Cursor's Rules for AI feature enables project-level AI configuration via .cursorrules files, allowing teams to define coding standards, conventions, and context that the AI follows automatically. Similar to GitHub's copilot-instructions.md but for the Cursor IDE ecosystem. Key for context_curation dimension - demonstrates cross-tool pattern of configuration-driven AI behavior.
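A hypothetical .cursorrules file for illustration; the format is free-form text, and keeping the list short matters given the instruction-following limits covered next:

```text
# .cursorrules (illustrative example, not a required syntax)
- TypeScript strict mode; no `any` without a justifying comment.
- Every new API endpoint needs a matching integration test.
- Use the internal lib/http client instead of raw fetch.
- Never hand-edit files under /generated; run `npm run codegen`.
```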
Empirical research on LLM instruction-following limits. Key finding: even frontier models reliably follow only 150-200 instructions. Implications: keep instruction files focused, use hierarchical structure, include only genuinely useful guidance.
Key Findings:
Frontier LLMs can follow ~150-200 instructions with reasonable consistency
Smaller models degrade much more quickly with instruction count
[4] Structured formats (ADRs, user stories, acceptance criteria)
[5] Spec-driven workflows with executable specifications (Spec Kit or similar)
BMAD-METHOD: Breakthrough Method for Agile AI Driven Development
BMad Code
BMAD represents the multi-agent orchestration approach to AI development. Unlike simple chat-based AI assistance, BMAD uses specialized agents (Analyst, Architect, Developer, QA) coordinated by an orchestrator. Key innovation: zero context loss between tasks. Represents advanced maturity in agentic workflows.
Key Findings:
19+ specialized AI agents with distinct roles (Analyst, Architect, Developer, QA)
50+ workflows covering development scenarios
Scale-adaptive intelligence adjusts to task complexity
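BMAD itself ships as a packaged framework; the toy Python sketch below only illustrates the underlying orchestration idea, with every name hypothetical. Each role runs against persistent artifacts rather than one ever-growing chat, which is what makes "zero context loss" possible:

```python
# Toy sketch of the orchestration idea (not BMAD's actual implementation):
# each specialized role gets a fresh context and hands off via files,
# so no task depends on a single ever-growing conversation.
from pathlib import Path


def call_llm(system: str, prompt: str) -> str:
    """Placeholder for whatever model API your stack uses."""
    raise NotImplementedError


def run_agent(role: str, task: str, inputs: list[str]) -> str:
    context = "\n\n".join(
        Path(p).read_text() for p in inputs if Path(p).exists()
    )
    return call_llm(system=f"You are the {role}.", prompt=f"{task}\n\n{context}")


PIPELINE = [
    ("Analyst",   "Turn the request into requirements.",  ["request.md"],              "requirements.md"),
    ("Architect", "Design against the requirements.",     ["requirements.md"],         "design.md"),
    ("Developer", "Implement the design as a patch.",     ["design.md"],               "patch.diff"),
    ("QA",        "Review the patch against the design.", ["design.md", "patch.diff"], "review.md"),
]

for role, task, inputs, output in PIPELINE:
    Path(output).write_text(run_agent(role, task, inputs))
    # The artifact on disk, not the chat history, carries state forward.
```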
Spec-Driven Development with AI: Get Started with a New Open Source Toolkit
GitHub
GitHub Spec Kit formalizes the spec-driven development approach where detailed specifications precede AI code generation. The four-phase workflow (Specify → Plan → Tasks → Implement) ensures human oversight at each checkpoint. This is the antidote to 'vibe coding' - structured, auditable AI development. Key for assessing advanced workflow maturity.
When AI output doesn't meet your needs, what do you typically do?
[0] Accept it anyway or give up
[1] Ask again with the same or similar wording
[3] Provide more specific context or constraints
[4] Analyze why it failed and restructure my approach
[5] Switch to a different model better suited for the task
Note: Model switching indicates advanced multi-model awareness
The 'Trust, But Verify' Pattern For AI-Assisted Engineering
This article provides the conceptual framework for our trust_calibration dimension. The three principles (Blind Trust is Vulnerability, Copilot Not Autopilot, Human Accountability Remains) directly inform our survey questions. The emphasis on verification over speed aligns with METR findings. Practical guidance includes starting conservatively with AI on low-stakes tasks.
Key Findings:
Blind trust in AI-generated code is a vulnerability
AI tools function as 'Copilot, Not Autopilot'
Human verification is the new development bottleneck
This is the most comprehensive 2025 survey on AI code quality (609 developers). The key insight is the 'Confidence Flywheel' - context-rich suggestions reduce hallucinations, which improves quality, which builds trust. The finding that 80% of PRs don't receive human review when AI tools are enabled is critical for our agentic_supervision dimension. NOTE: The previously cited 1.7x issue rate and 41% commit stats were not found in the current report.
Key Findings:
82% of developers use AI coding tools daily or weekly
65% of developers say at least a quarter of each commit is AI-generated
Beads solves the 'context loss' problem in multi-session AI development. Rather than storing tasks in unstructured markdown, Beads uses Git-backed JSONL files that agents can query for 'ready' work. Key for long-horizon tasks spanning multiple days or sessions. Represents the frontier of AI workflow tooling for persistent memory.
Key Findings:
Git-backed issue tracker designed for AI coding agents
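A sketch of the Git-backed JSONL idea; the field names are assumptions for illustration, not Beads' actual schema. The point is the queryable "ready" semantics: an issue is workable once it is open and all of its dependencies are closed.

```python
# Sketch of the Git-backed JSONL idea. Field names ("id", "status",
# "deps", "title") are assumptions, not Beads' actual schema.
import json
from pathlib import Path


def load_issues(path: str = ".beads/issues.jsonl") -> list[dict]:
    return [
        json.loads(line)
        for line in Path(path).read_text().splitlines()
        if line.strip()
    ]


def ready_work(issues: list[dict]) -> list[dict]:
    """Open issues whose dependencies are all closed."""
    closed = {i["id"] for i in issues if i["status"] == "closed"}
    return [
        i for i in issues
        if i["status"] == "open" and set(i.get("deps", [])) <= closed
    ]


for issue in ready_work(load_issues()):
    print(issue["id"], issue["title"])
```

Because each issue is a single line of JSON, the file diffs and merges cleanly in Git, which is what lets agents in separate sessions share one backlog.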
How do you manage AI context during long or complex coding sessions?
[0] I don't actively manage context - I let it grow naturally
[1] I restart the session when things get confusing
[2] I use /clear when starting new tasks
[4] I use Document & Clear: save progress to a file, clear context, then continue
Document & Clear Method for AI Context Management
Introduces the Document & Clear pattern for managing AI context. Key insight: rather than trusting automatic context compaction, explicitly clear context and persist important state to external files. This produces more reliable outputs and gives you control over what the AI 'remembers'.
Key Findings:
Clear context aggressively with /clear when <50% of prior context is relevant
Document & Clear method: dump plan to .md file, clear, restart with file reference
Auto-compaction is unreliable - explicit clearing produces better outputs
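In chat tools the pattern is manual (dump the plan to a .md file, /clear, restart from the file reference). For a custom agent loop, a sketch of the same idea, with the summarize step left as a placeholder:

```python
# Sketch of Document & Clear for a custom agent loop. The manual chat
# version is: dump progress to a .md file, /clear, restart from the file.
from pathlib import Path


def summarize(messages: list[dict]) -> str:
    """Placeholder: in practice, ask the model itself for a handoff note."""
    raise NotImplementedError


def document_and_clear(messages: list[dict],
                       progress_file: str = "PROGRESS.md") -> list[dict]:
    """Persist state to disk, then restart with a near-empty context."""
    note = summarize(messages)
    Path(progress_file).write_text(note)
    # Only the handoff note survives; the full transcript is discarded,
    # the explicit alternative to unreliable auto-compaction.
    return [{"role": "user",
             "content": f"Continue the task. Prior progress:\n\n{note}"}]
```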
For complex, unfamiliar tasks, do you run exploratory AI attempts first?
[0] No - I ask AI to implement directly
[1] Sometimes - if I'm unsure about the approach
[3] Yes - I use throwaway branches to explore before committing
[4] Yes - scout attempts inform my plan, then I implement with fresh context
7 Prompting Habits of Highly Effective Engineers
Introduces the 'scout pattern' for AI-assisted development. Before committing to a complex task, run a throwaway attempt to discover where complexity lies, which files are involved, and what questions arise. This reconnaissance produces valuable context for the real implementation.
Key Findings:
Send out a scout before committing to learn where complexity lies
Use throwaway attempts to learn which files get modified
Failed attempts provide valuable context for the 'real' attempt
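A sketch of the mechanics with git as the safety net; the run_ai_attempt stub stands in for whatever agent you use:

```python
# Scout pattern sketch: throwaway attempt on a scratch branch, record the
# blast radius, discard everything but the knowledge.
import subprocess


def git(*args: str) -> str:
    return subprocess.run(["git", *args], check=True,
                          capture_output=True, text=True).stdout


def run_ai_attempt() -> None:
    """Placeholder: let your agent take a rough pass at the task."""
    raise NotImplementedError


git("switch", "-c", "scout/attempt")
run_ai_attempt()
touched = git("status", "--porcelain").splitlines()  # modified + untracked
print("Complexity lives in:", touched)   # feed this into the real plan
git("reset", "--hard")                   # discard the scout's code
git("clean", "-fd")                      # including untracked files
git("switch", "-")
git("branch", "-D", "scout/attempt")
```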
What workflow do you use for complex AI-assisted development tasks?
[1] I describe the task and let AI implement it
[2] I break it into steps but implement in one session
[3] I use Research → Plan → Implement phases
[4] I use phased workflow with context clearing between phases
[5] I review the plan before implementation for maximum leverage
Advanced Context Engineering for AI Agents
HumanLayer
Introduces the Research → Plan → Implement workflow for AI-assisted development. Key insight: reviewing plans is higher leverage than reviewing code. The workflow explicitly clears context between phases to maintain focus and quality.
Key Findings:
Research → Plan → Implement workflow produces better results than direct coding
Review the plan before implementation for maximum leverage
Keep context utilization at 40-60% - clear between phases
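A sketch of the phase boundaries, assuming a hypothetical fresh_session helper that starts a brand-new model session; only the written artifact crosses between phases, which is what keeps utilization low:

```python
# Research → Plan → Implement sketch: each phase starts a fresh session,
# and only the artifact on disk crosses the phase boundary.
from pathlib import Path


def fresh_session(prompt: str) -> str:
    """Placeholder: a brand-new model session with no carried-over history."""
    raise NotImplementedError


research = fresh_session("Research: map the code paths relevant to <task>.")
Path("research.md").write_text(research)

plan = fresh_session(f"Plan the change using these notes:\n\n{research}")
Path("plan.md").write_text(plan)

# Human checkpoint: reviewing plan.md here is higher leverage than
# reviewing the generated code later.
patch = fresh_session(f"Implement exactly this plan:\n\n{Path('plan.md').read_text()}")
Path("patch.diff").write_text(patch)
```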
How do you use AI research capabilities (web search, citations) during planning phases?
[0] I don't use AI for research / Not available to me
[1] I use chat-based AI for general questions (no citations)
[2] I use @web or research modes occasionally
[4] I regularly use research modes and verify the citations provided
[5] Research with citations is standard in my planning phase workflow
Note: AI research capabilities with web search and citations are critical for planning phases. Copilot has @web, Claude Code has built-in search. Verifying citations is essential - AI can hallucinate sources.