And this is why I don't have a lot of faith in leaderboards to predict real-world performance. Every time I see something like this, I decide to give Bard another try, and every time it disappoints. Ok Google, I'm ready to be hurt again.
Why are there multiple "Gemini Pro" on the board? Is this based on popular vote?
How do they know these "human votes" aren't rigged or cast by AI bots?
Right after Google/HF announced their partnership...
Surpassing original gpt-4, still behind gpt-4-turbo
If the leaderboard does not load, there's a screenshot here: https://twitter.com/lmsysorg/status/1750921228012122526