Exploring Baby Names

COMP 341: Practical Machine Learning · Assignment 1

These course materials are private.

The assignment content is withheld to prevent its reconstruction, but scoring and a redacted agent trace remain visible.

Rank	Model	Score	Code	Written	Review	Tests	Time	Cost
1	Claude Sonnet 4.0	100.0%	100.0%	100.0%	77.0%	8/8	7m 39s	$0.95
2	Claude Haiku 4.5	100.0%	100.0%	100.0%	77.0%	8/8	2m 14s	$0.22
3	Gemini 3 Flash	100.0%	100.0%	100.0%	87.0%	39/39	9m 24s	$0.00
4	GPT-5.5 (Low)	100.0%	100.0%	94.0%	88.5%	12/12	2m 58s	$0.59
5	Claude Opus 4.6	92.0%	92.0%	100.0%	80.0%	37/39	5m 08s	$1.53
6	Claude Sonnet 4.6	92.0%	92.0%	96.0%	88.0%	37/39	2m 02s	$0.33
7	GPT-5.4	92.0%	92.0%	100.0%	90.0%	37/39	4m 35s	$0.00
8	GPT-5.3 Codex	92.0%	92.0%	76.0%	91.0%	37/39	1m 58s	$0.00
9	Composer 2	92.0%	92.0%	92.0%	89.0%	37/39	3m 56s	$0.00
10	GPT-5.5 (Medium)	92.0%	92.0%	65.5%	85.5%	37/39	2m 35s	$0.65
11	GPT-5.5 (High)	92.0%	92.0%	68.5%	83.5%	37/39	4m 36s	$0.97
12	GPT-5.5 (X-High)	92.0%	92.0%	92.0%	85.5%	37/39	5m 10s	$0.97
13	Claude Opus 4.7	92.0%	92.0%	66.8%	82.0%	37/39	3m 12s	$1.48