Why Agent Roles Matter More Than Model Quality

Early on we assumed that upgrading models would fix most problems. Better reasoning, better code generation, better everything. It didn't.

Most failures came from agents stepping outside their responsibilities. When an agent tries to plan, code, and evaluate simultaneously, things break. It loses focus. It optimizes for the wrong objective. It contradicts itself across steps.

Once we defined strict roles—planner, researcher, coder, reviewer—reliability improved dramatically, even without changing the model. The same model, constrained to a single role, became far more dependable. A coder that only codes doesn't get distracted by whether the plan was right. A reviewer that only reviews doesn't try to "fix" things on the fly.

The lesson: structure beats raw intelligence.