Hacker News

New DeepSeek-V2.5 model tops OSS coding leaderboards

by geepytee on 9/10/2024, 8:45:18 PM with 1 comment

This week DeepSeek released their new DeepSeek-V2.5 model, which according to their release tweet [0] is a "combination" of their DeepSeek-V2-0628 and DeepSeek-Coder-V2-0724 models.
On their official website [1], they claim it surpasses GPT-4-Turbo, Claude 3 Opus, and the previous DeepSeek-Coder-V2 model in coding, scripting, and math tasks. It's fully open sourced [2], with a 128k context window.
It doesn't show up yet on the LMSYS Chatbot Arena coding leaderboard, which is common for new models, but livebench [3] ranks it 7th on their coding benchmark. That is the highest ranking for any open source model (I'm not counting mistral-large-2407 as fully open sourced, since its weights are not public), and it beats meta-llama-3.1-405b-instruct-turbo.
Since this model's strength is coding, I've also made it available for free to anyone who wants to try it as a coding copilot in VS Code [4] (Disclaimer: I'm a co-founder at Double and this is my extension).
I hope others find this as exciting as I do. It's great to see open source models continue to improve!
[0] - https://x.com/deepseek_ai/status/1832026579180163260
[1] - https://www.deepseek.com/
[2] - https://huggingface.co/deepseek-ai/DeepSeek-V2.5
[3] - https://livebench.ai/
[4] - https://double.bot/