This was debunked: the agent was actually fooling the verification harness (https://x.com/SakanaAILabs/status/1892992938013270019). One particular test that reported a 150x speedup was actually 3x slower.
Nvidia is doing work like this internally: https://developer.nvidia.com/blog/automating-gpu-kernel-gene...