Benchmark of 110 RAG implementations on Annual Reports Q&A task

  • Results from the Enterprise RAG Challenge, comparing 110 experiments from 43 teams building Retrieval-Augmented Generation (RAG) systems. Each solution had to automatically answer 100 complex queries across 100 large annual reports (one of them over 1000 pages), including questions requiring cross-document reasoning.

    Teams detailed their architectures, methods, and lessons learned - expand the rows in the table for insights into each approach.

    Feedback welcome!

    PS: There are even a few fully local solutions on the leaderboard. Hopefully in the third round we'll have even more.
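    For readers new to the pattern, here is a minimal sketch of the retrieve-then-generate loop that such systems implement. It is not any team's actual pipeline: a toy keyword-overlap ranker stands in for a real embedding index, and all names (`retrieve`, `build_prompt`, the sample chunks) are hypothetical.

    ```python
    # Toy RAG sketch: rank document chunks against a query, then assemble
    # the prompt an LLM would answer. Keyword overlap replaces a real
    # vector store; all names and data here are illustrative only.

    def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
        """Rank chunks by word overlap with the query and return the top k."""
        q_words = set(query.lower().split())
        ranked = sorted(chunks, key=lambda c: -len(q_words & set(c.lower().split())))
        return ranked[:k]

    def build_prompt(query: str, context: list[str]) -> str:
        """Join the retrieved context and append the question."""
        joined = "\n---\n".join(context)
        return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

    chunks = [
        "Acme Corp reported revenue of $1.2B in fiscal 2023.",
        "The board approved a dividend of $0.50 per share.",
        "Globex Inc reported revenue of $0.9B in fiscal 2023.",
    ]
    query = "What revenue did Acme Corp report?"
    prompt = build_prompt(query, retrieve(query, chunks))
    print(prompt)
    ```

    Real entries in the challenge replace the overlap ranker with chunked embeddings, hybrid search, rerankers, and so on; the cross-document questions are what make the retrieval step hard.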