February 2026
How we built a comprehensive benchmark for evaluating AI coding agents on real university assignments.