helloericsf joined 5/9/2022, 8:24:04 PM and has 608 karma
Recent Posts
Better than DeepSeek R1? MiniMax-M1:open-weight hybrid-attention reasoning model
by helloericsf on 6/16/2025, 5:28:51 PM with 0 comments
kit - Code Intelligence Toolkit
by helloericsf on 5/8/2025, 11:16:19 PM with 0 comments
DeepSeek Open Source Optimized Parallelism Strategies, 3 repos
by helloericsf on 2/27/2025, 2:01:41 AM with 5 comments
DeepSeek Open Source DeepGEMM – FP8 GEMM Library(300 lines for 1350+ FP8 TFLOPS)
by helloericsf on 2/26/2025, 1:08:29 AM with 1 comment
Alibaba Open Source Large-Scale Video Generative Models: Wan2.1
by helloericsf on 2/25/2025, 3:03:22 PM with 4 comments
DeepSeek open source DeepEP – library for MoE training and Inference
by helloericsf on 2/25/2025, 2:27:29 AM with 15 comments
DeepSeek Open Source FlashMLA – MLA Decoding Kernel for Hopper GPUs
by helloericsf on 2/24/2025, 1:37:24 AM with 16 comments
New Qwen2.5-Max Outperforms DeepSeek V3 in Benchmarks
by helloericsf on 1/28/2025, 4:08:44 PM with 2 comments
Longest context up to 4M, MiniMax-01 hybrid 456B Open source model
by helloericsf on 1/14/2025, 7:32:05 PM with 1 comment
DeepSeek v3 beats Claude sonnet 3.5 and way cheaper
by helloericsf on 12/26/2024, 11:47:29 AM with 4 comments
NeurIPS and Dr. Picard released statement for singling out Chinese scholars
by helloericsf on 12/16/2024, 6:16:49 PM with 2 comments
Tencent Hunyuan-Large
by helloericsf on 11/5/2024, 6:52:09 PM with 13 comments
Chinese AI Community: open-source Heatmap
by helloericsf on 7/31/2024, 10:46:01 PM with 3 comments
Poolside is raising $400M+ at a $2B valuation to build a coding co-pilot
by helloericsf on 6/20/2024, 8:10:16 PM with 1 comment
Is LMDeploy the Ultimate Solution? Why It Outshines VLLM, TRT-LLM, TGI, and MLC
by helloericsf on 6/20/2024, 3:48:34 PM with 4 comments
21.2× faster than llama.cpp? plus 40% memory usage reduction
by helloericsf on 6/12/2024, 9:58:03 PM with 5 comments