As far as I know, we have no objective way to measure or compare "quality," "coding performance," or "best" even for code produced by human programmers.
You may find this useful:
https://www.gitclear.com/coding_on_copilot_data_shows_ais_do...
Or this analysis, if you don't want to sign up to download that white paper:
The two benchmarks I know of are SWE-bench and CodeElo. SWE-bench is oriented towards "real world" performance: resolving actual GitHub issues, scored by whether a generated patch makes the repo's previously failing tests pass. CodeElo is oriented towards competitive programming (Codeforces problems). There's a small sketch after the links for poking at the SWE-bench data.
https://www.swebench.com/
https://codeelo-bench.github.io/
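If you want to see what a SWE-bench task actually looks like, here's a minimal Python sketch. It assumes you have the Hugging Face `datasets` library installed and uses the published "princeton-nlp/SWE-bench" dataset ID; the field names come from the dataset card, so treat them as assumptions that may change:

    # Minimal sketch: load SWE-bench and inspect one task instance.
    # Assumes: pip install datasets; dataset ID and field names as
    # currently published on the Hugging Face dataset card.
    from datasets import load_dataset

    ds = load_dataset("princeton-nlp/SWE-bench", split="test")
    task = ds[0]
    print(task["repo"])                     # repository the issue came from
    print(task["problem_statement"][:300])  # the GitHub issue text
    print(task["patch"][:300])              # the gold reference patch

Note that the benchmark doesn't score similarity to the gold patch; it checks whether the model-generated patch makes the issue's failing tests pass.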