Why Your Agent Needs a Model Combo Optimizer, Not Just a Model

Wenyue Hua*, Qian Xie, Sripad Karne, Armaan Agrawal, Nikos Pagonas, Kostis Kaffes, Tianyi Peng*

* Equal contribution

Most teams pick a model, usually the latest frontier release, and run every step of their agent on it. Planner? GPT-5.4. Solver? GPT-5.4. Critic? GPT-5.4. It works, so nobody questions it.

But "it works" is not "it's optimal." What if a different combination of models reached the same accuracy at one-twentieth the cost? What if a weaker model actually performed better at one of those steps? These aren't hypotheticals. We ran the experiments.