DeepSeek: A 100-Person Team Built an AI Model Rivaling GPT-4

  • DeepSeek's approach to AI development stands out:

    - Team of ~100 engineers, mostly fresh graduates
    - Developed a novel MLA (Multi-head Latent Attention) architecture, reducing attention memory usage to 5-13% of standard multi-head attention (see the sketch after this list)
    - Achieved performance comparable to GPT-4 at roughly 1/70th the cost
    - Founder prioritizes fundamental research over commercialization
    - Unusual organizational structure: no KPIs, unlimited computing resources for engineers

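    The memory claim is easier to see with a toy example. Below is a minimal, illustrative sketch of the core idea usually described for MLA: instead of caching full per-head keys and values for every past token, the model caches one small latent vector per token and re-projects it into keys and values at attention time. The class name, dimensions, and layer names here are hypothetical choices for illustration, not DeepSeek's actual implementation (which also handles positional encoding differently).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentKVAttention(nn.Module):
    """Toy low-rank latent KV cache, in the spirit of MLA (hypothetical sizes)."""

    def __init__(self, d_model: int = 4096, n_heads: int = 32, d_latent: int = 512):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model, bias=False)
        # Down-project the hidden state into a small shared latent; during
        # decoding, only this latent is cached per token (not per-head K/V).
        self.kv_down = nn.Linear(d_model, d_latent, bias=False)
        # Up-project the cached latent back into per-head keys and values.
        self.k_up = nn.Linear(d_latent, d_model, bias=False)
        self.v_up = nn.Linear(d_latent, d_model, bias=False)
        self.out_proj = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x, latent_cache=None):
        b, t, d = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)

        latent = self.kv_down(x)                       # (b, t, d_latent)
        if latent_cache is not None:
            latent = torch.cat([latent_cache, latent], dim=1)

        k = self.k_up(latent).view(b, -1, self.n_heads, self.d_head).transpose(1, 2)
        v = self.v_up(latent).view(b, -1, self.n_heads, self.d_head).transpose(1, 2)

        # Standard attention math (causal masking omitted for brevity).
        out = F.scaled_dot_product_attention(q, k, v)
        out = out.transpose(1, 2).reshape(b, t, d)
        return self.out_proj(out), latent              # caller stores `latent` as the cache

# Per-token cache size: standard MHA stores K and V (2 * 4096 = 8192 values),
# while this sketch stores one 512-dim latent, i.e. ~6% of that -- consistent
# in spirit with the 5-13% figure cited above.
```

    The trade-off is a bit of extra up-projection compute at attention time in exchange for a much smaller decode-time cache, which is what dominates memory during long-context inference.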
    The most interesting part is their approach to innovation: unlike most Chinese tech companies, which focus on application-layer innovation, DeepSeek is doing fundamental architecture research, challenging the notion that only Western companies can lead in core technology.