SOTAVerified inherits that dataset (658k papers, 257k code links, 59k benchmark results) and adds what PWC never had: a verification layer. Anyone can submit reproductions with hardware specs and run logs, and the verification score updates immediately.
I've been running reproductions myself on my RTX 3090: Fort et al. 2019 deep ensembles and Havasi et al. 2021 MIMO so far, with wandb logs linked. The goal is to make this the ground-truth registry that both researchers and autonomous research agents can query.
Stack: Next.js, PostgreSQL, Vercel, Railway. Open source: https://github.com/sotarepro/sotaverified
Built for:
- Authors who want to claim their papers and submit official metrics
- Researchers who want to understand the SOTA techniques for a task
- Autonomous research agents that need to check whether a result reproduces before investing GPU hours
Would love feedback from the HN community. What features would make this useful for your workflow?