Predictive Judging

I replayed the judging 500 times to see who actually wins.

WinSim AI projects the likely winner among all active “Showcased projects” on Handshake. GPT-4o scores every entry on a 5-dimension rubric anchored to live-site evidence, then a panel of 3 weighted judges re-runs the field 500 times in a Monte Carlo simulation that models judge variance.
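To make the re-run step concrete, here is a minimal sketch of how a weighted panel could replay a field under judge variance. It is not WinSim AI's actual code. The project names, rubric scores, judge weights, noise levels, and the helper names `panel_score` and `simulate` are all assumptions chosen for illustration, and the GPT-4o scoring pass is replaced by hard-coded numbers.

```python
import random
from collections import Counter

# Hypothetical rubric scores (0-10) on five dimensions per project.
# In the real pipeline these would come from the model's scoring pass.
BASE_SCORES = {
    "project_a": [8.0, 7.5, 9.0, 6.5, 8.0],
    "project_b": [7.0, 8.0, 7.5, 8.5, 7.0],
    "project_c": [5.0, 6.0, 5.5, 6.0, 5.0],
}

# Hypothetical judge panel: (weight, standard deviation of that judge's noise).
JUDGES = [(0.5, 0.5), (0.3, 0.8), (0.2, 1.0)]


def panel_score(dim_scores, rng):
    """Weighted panel score for one project in one simulated run."""
    base = sum(dim_scores) / len(dim_scores)  # mean across the rubric dimensions
    total_weight = sum(weight for weight, _ in JUDGES)
    # Each judge sees the base score plus their own Gaussian noise.
    noisy = sum(weight * rng.gauss(base, sd) for weight, sd in JUDGES)
    return noisy / total_weight


def simulate(scores, runs=500, seed=0):
    """Replay the judging `runs` times and return each project's win rate."""
    rng = random.Random(seed)  # seeded so the replay is reproducible
    wins = Counter()
    for _ in range(runs):
        run_scores = {name: panel_score(dims, rng) for name, dims in scores.items()}
        wins[max(run_scores, key=run_scores.get)] += 1
    return {name: wins[name] / runs for name in scores}


if __name__ == "__main__":
    for name, rate in sorted(simulate(BASE_SCORES).items(), key=lambda kv: -kv[1]):
        print(f"{name}: {rate:.1%} of runs won")
```

The win rate over many runs, rather than a single ranking, is what separates a clear favorite from a project that only leads by a margin smaller than the judges' disagreement.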

Projects judged: 577

Model: GPT-4o

Judge panel: 3 weighted

Monte Carlo runs: 500

Compute cost: $2.50

Failures:


A note from the builder

I didn't grade my own project — judging myself in my own simulation would defeat the entire point.

WinSim AI is a Codex × Handshake submission too, but the whole idea is to fairly simulate the judging of the other 577 entries — not to put myself on a leaderboard I built. So I deliberately excluded it from the field. The rankings you see are about everyone else.