Raw data

Every iteration, every transcript.

The full per-iteration journal and agent transcript for every rep are committed to the repository. No data is summarized away. Below is the index.

Index of all reps

17 reps
Model           Rep   Status  Iters  Best fit  Log        Transcript  Summary
gemini-3_1-pro  rep1  done    46     354.73    log.jsonl  agent.log   summary.json
gemini-3_1-pro  rep2  done    46     339.62    log.jsonl  agent.log   summary.json
gemini-3_1-pro  rep3  done    46     323.92    log.jsonl  agent.log   summary.json
gpt-5_4_xhigh   rep1  done    46     496.11    log.jsonl  agent.log   summary.json
gpt-5_4_xhigh   rep2  done    46     513.84    log.jsonl  agent.log   summary.json
gpt-5_5_high    rep1  done    46     461.87    log.jsonl  agent.log   summary.json
gpt-5_5_high    rep2  done    46     420.61    log.jsonl  agent.log   summary.json
gpt-5_5_high    rep3  done    46     408.01    log.jsonl  agent.log   summary.json
gpt-5_5_medium  rep1  done    46     431.58    log.jsonl  agent.log   summary.json
gpt-5_5_medium  rep2  done    46     407.55    log.jsonl  agent.log   summary.json
gpt-5_5_medium  rep3  done    46     431.24    log.jsonl  agent.log   summary.json
gpt-5_5_xhigh   rep1  done    46     397.83    log.jsonl  agent.log   summary.json
gpt-5_5_xhigh   rep2  done    46     525.04    log.jsonl  agent.log   summary.json
gpt-5_5_xhigh   rep3  done    46     482.03    log.jsonl  agent.log   summary.json
kimi-k2_6       rep1  done    46     347.76    log.jsonl  agent.log   summary.json
kimi-k2_6       rep2  done    46     331.22    log.jsonl  agent.log   summary.json
kimi-k2_6       rep3  failed  31     396.13    log.jsonl  agent.log   summary.json

Each log.jsonl holds one row per iteration, with these fields: hypothesis ID, title, outcome (improvement / regression / broken), fitness, delta vs. baseline, LUT4, FF, Fmax, IPC, cycles, error class (if broken), and timestamp. Each agent.log is the verbatim model transcript: every bash command, every file read, every write.
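A minimal sketch of how one of these per-iteration journals could be read, assuming each line of log.jsonl is a single JSON object. The key names "outcome" and "fitness" are guesses based on the field list above, not confirmed names from the repository, and the helper summarize_journal is hypothetical. The sketch reports both the minimum and the maximum fitness, because this page does not say which direction is better.

```python
import json
from collections import Counter
from typing import Iterable, Optional


def summarize_journal(lines: Iterable[str]) -> dict:
    """Tally outcomes and fitness range from JSONL lines (one object per line).

    ASSUMPTION: the keys "outcome" and "fitness" are illustrative guesses;
    adjust them to the actual keys used in log.jsonl.
    """
    outcomes: Counter = Counter()
    fitness: list = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines
        row = json.loads(raw)
        outcomes[row.get("outcome", "unknown")] += 1
        value = row.get("fitness")
        # Broken iterations may carry no numeric fitness; skip those values.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            fitness.append(float(value))
    fitness_min: Optional[float] = min(fitness) if fitness else None
    fitness_max: Optional[float] = max(fitness) if fitness else None
    return {
        "iterations": sum(outcomes.values()),
        "outcomes": dict(outcomes),
        "fitness_min": fitness_min,
        "fitness_max": fitness_max,
    }
```

Because the function accepts any iterable of strings, an open file handle for a downloaded log.jsonl can be passed to it directly.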