DeepSeek's approach to AI development stands out:
- Team of ~100 engineers, mostly fresh graduates
- Developed novel MLA (Multi-head Latent Attention) architecture reducing memory usage to 5-13% of traditional architectures
- Achieved similar performance to GPT-4 at 1/70th the cost
- Founder focuses on fundamental research over commercialization
- Unusual organizational structure: no KPIs, unlimited computing resources for engineers
The most interesting part is their approach to innovation: unlike most Chinese tech companies that focus on application innovation, they're doing fundamental architecture research, challenging the notion that only Western companies can lead in core tech innovation.