TL;DR
A recent Google whitepaper argues that what matters most in AI-assisted development is not the model itself but the harness and verification built around it. That changes where organizations should invest when they build and maintain AI systems.
The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, states that the AI model accounts for only 10% of system behavior. The paper argues that the harness, verification, and context engineering are far more influential, and that organizations should shift their approach to AI systems accordingly.
Titled The New SDLC With Vibe Coding, the whitepaper reports that 85% of professional developers use AI coding agents regularly, with over half using them daily. It also puts the share of new code that is AI-generated at 41%. The key insight, though, is that the model is only a small piece of the overall system. According to the authors, most failures in AI agents stem from configuration issues (missing tools, vague rules, or noisy context) rather than from limits in the model’s capabilities.
One of the paper’s core messages is that the ‘harness’ (the prompts, tools, policies, and observability surrounding the model) determines 90% of the behavior. The paper cites experiments in which changing only the harness, or only the prompts, significantly improved performance with the same underlying model.
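To make the harness idea concrete, here is a minimal sketch in Python. It is not taken from the whitepaper: `stub_model`, `Harness`, and `lookup_docs` are hypothetical names, and the stub stands in for a real model call. It only illustrates how the same model can behave differently when the rules and tools around it change.

```python
from dataclasses import dataclass, field
from typing import Callable


def stub_model(prompt: str) -> str:
    """Stand-in for a real model: it answers only from FACT lines in the prompt."""
    for line in prompt.splitlines():
        if line.startswith("FACT: default_port="):
            return line.removeprefix("FACT: default_port=")
    return "I don't know"


@dataclass
class Harness:
    """Hypothetical harness: rules, tools that add context, and a prompt log."""
    rules: str
    tools: list[Callable[[str], str]] = field(default_factory=list)
    log: list[str] = field(default_factory=list)

    def run(self, model: Callable[[str], str], task: str) -> str:
        # Each tool contributes one line of context for the task.
        context = [tool(task) for tool in self.tools]
        prompt = "\n".join([self.rules, *context, f"TASK: {task}"])
        self.log.append(prompt)  # crude observability: keep every prompt sent
        return model(prompt)


def lookup_docs(task: str) -> str:
    """Hypothetical retrieval tool returning a made-up fact for illustration."""
    return "FACT: default_port=8080"


bare = Harness(rules="Be concise.")
tooled = Harness(rules="Answer only from FACT lines.", tools=[lookup_docs])

question = "What is the default port?"
print(bare.run(stub_model, question))    # I don't know
print(tooled.run(stub_model, question))  # 8080
```

The model function is identical in both runs. Only the rules and the available tool differ, which is the kind of change the paper attributes most behavior to.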
The paper also stresses that cost and efficiency in AI development depend on how well teams engineer their context and harness, not just on adopting newer, larger models. It advocates a disciplined approach, agentic engineering, that incorporates verification, structured context, and modular skills, and that can be more cost-effective over time.
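One way to picture the verification part of that approach is a generate, verify, retry loop. The sketch below is an assumption for illustration, not the paper's method: `fake_agent`, `verify_add`, and `run_with_verification` are hypothetical names, and the agent is a stub that returns a buggy function first and a fixed one once it receives feedback.

```python
from typing import Callable, Optional

Verdict = tuple[bool, Optional[str]]  # (passed, feedback for the next attempt)


def fake_agent(feedback: Optional[str]) -> str:
    """Stub agent: buggy code on the first try, corrected code after feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def verify_add(source: str) -> Verdict:
    """Run the candidate and check it against one known case.

    exec() is used only because the input is trusted demo code; real agent
    output would need sandboxing.
    """
    namespace: dict = {}
    exec(source, namespace)
    result = namespace["add"](2, 3)
    if result == 5:
        return True, None
    return False, f"add(2, 3) returned {result}, expected 5"


def run_with_verification(
    generate: Callable[[Optional[str]], str],
    verify: Callable[[str], Verdict],
    max_attempts: int = 3,
) -> tuple[str, int]:
    """Return the first candidate that passes verification and the attempt count."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(feedback)
        passed, feedback = verify(candidate)
        if passed:
            return candidate, attempt
    raise RuntimeError(f"no candidate passed after {max_attempts} attempts")


code, attempts = run_with_verification(fake_agent, verify_add)
print(attempts)  # 2
```

The design point is that nothing is accepted on the agent's word: every candidate goes through a check the team owns, and the failure message becomes context for the next attempt.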
The model is only 10%
A Google whitepaper argues that software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system, and the scaffolding around it does the real work.
It is the clearest map yet of how serious AI development works, and it is mostly tool-agnostic. But it is also a Google funnel: the concepts are neutral, while the on-ramps point to Gemini, Jules, and the ADK. If the harness is 90% and it is yours, your moat and your costs both live there. So own your scaffolding, route across models, and remember that AI amplifies whatever engineering culture it lands in.
Implications for AI Development Strategies
This shift changes where organizations should invest in AI. Instead of chasing the latest models, companies are encouraged to focus on building robust harnesses and verification processes. This approach can lead to more reliable, cost-effective AI systems and a competitive advantage, because the durable, scalable parts of an AI system are the ones a company controls. The insight also suggests that poor configuration and context management cost more than model upgrades do.
Evolution of AI System Design and Best Practices
The paper builds on the ongoing evolution of AI engineering, which has moved from vibe coding (quick prompts with minimal oversight) to agentic engineering, characterized by formal specifications, testing, and structured context. Since early 2026, AI adoption has surged, with a majority of developers integrating AI tools into their workflows, but the core challenge remains how to reliably control and verify AI outputs. Earlier efforts focused on acquiring larger models, but recent experiments show that tuning the harness yields greater performance improvements.
This development aligns with broader trends in software engineering, emphasizing verification, modularity, and cost management over raw model size. The whitepaper’s findings reinforce the idea that the real skill lies in context engineering and system configuration.
“The model is only 10% of what determines behavior; the harness is 90%. The behavior you experience is dominated by scaffolding you can build, own, and improve.”
— Addy Osmani
Remaining Questions on Practical Implementation
While the paper provides compelling evidence that harness and verification are critical, it remains unclear how organizations will scale these practices across diverse AI applications. The precise methods for measuring and optimizing harness effectiveness are still evolving, and the impact of different types of tasks or models requires further study. Additionally, how quickly industries will adopt these insights and shift their investment priorities is uncertain.
Next Steps for AI Teams and Industry Adoption
Organizations should prioritize developing and testing their harnesses, including prompts, tools, and verification frameworks, to improve AI reliability and cost-efficiency. Future research will likely focus on standardized metrics for harness quality and best practices for context engineering. Industry leaders may begin to publish case studies demonstrating the tangible benefits of this disciplined approach, accelerating adoption.
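As a rough illustration of what testing a harness could look like, the sketch below scores two harness variants against the same small task suite using a simple pass rate. Everything here is an assumption for demonstration: `stub_model`, the two harness functions, the `CASES` suite, and the pass-rate metric are hypothetical and are not a standardized measure from the paper.

```python
from typing import Callable

Case = tuple[str, str]  # (task, substring the output must contain)


def stub_model(prompt: str) -> str:
    """Stand-in model: emits JSON only when the prompt asks for it explicitly."""
    if "Respond as JSON" in prompt:
        return '{"answer": "ok"}'
    return "ok"


def vague_harness(task: str) -> str:
    """Variant with a vague rule that does not pin down the output format."""
    return stub_model("Be helpful.\n" + task)


def specific_harness(task: str) -> str:
    """Variant with a specific rule that fixes the output format."""
    return stub_model("Respond as JSON.\n" + task)


def pass_rate(run: Callable[[str], str], cases: list[Case]) -> float:
    """Fraction of cases whose output contains the expected substring."""
    if not cases:
        raise ValueError("at least one test case is required")
    passed = sum(1 for task, expected in cases if expected in run(task))
    return passed / len(cases)


CASES: list[Case] = [
    ("Report status.", '"answer"'),
    ("Respond as JSON: report status.", '"answer"'),
]

print(pass_rate(vague_harness, CASES))     # 0.5
print(pass_rate(specific_harness, CASES))  # 1.0
```

Running the same suite after every change to prompts, tools, or rules turns harness edits into measurable regressions or improvements rather than impressions.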
Key Questions
Why is the model only 10% of the system’s behavior?
The whitepaper argues that prompts, tools, rules, and observability, collectively called the harness, determine most of an AI system’s output, which leaves the model itself a small part of the overall system.
How can organizations improve their AI systems based on this insight?
Focusing on building robust harnesses, verifying outputs, and managing context effectively can lead to more reliable, cost-efficient AI applications.
Does this mean larger models are less important?
Not necessarily, but the whitepaper suggests that beyond a certain point, investing in better harnesses and verification yields greater returns than simply acquiring bigger models.
What are the economic implications of this shift?
Higher upfront investment in system design and context management can reduce long-term operating costs, making AI development more sustainable and scalable.
Source: ThorstenMeyerAI.com