Multi-Agent Step Race Benchmark: LLM Collaboration and Deception Under Pressure

  • Interesting results, thank you!