Ah, this is awesome! I currently run k3s on a decently specced NixOS rig. I tried to get k3s to recognize my Nvidia GPU, even following the short guide for GPU support in k3s that lives in nixpkgs[0], but had no luck.
For now I’m just using Docker’s Nvidia container runtime for containers that need GPU acceleration.
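For anyone in the same boat, the Docker-side GPU passthrough can be enabled declaratively on NixOS. A minimal sketch, assuming a recent nixpkgs where the `hardware.nvidia-container-toolkit` module exists (older channels used `virtualisation.docker.enableNvidia` instead) — option names here are from memory, so check them against the NixOS options search:

```nix
# configuration.nix fragment (sketch; option names assume a recent nixpkgs)
{
  # Proprietary NVIDIA driver
  services.xserver.videoDrivers = [ "nvidia" ];

  virtualisation.docker.enable = true;

  # Wires the NVIDIA Container Toolkit into container runtimes (CDI-based);
  # on older channels the rough equivalent was virtualisation.docker.enableNvidia.
  hardware.nvidia-container-toolkit.enable = true;
}
```

After a rebuild, a GPU-enabled container run (e.g. `docker run --gpus all ... nvidia-smi`) should see the card. Getting k3s's bundled containerd to do the same is the harder part, which is what the nixpkgs guide[0] covers.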
Will likely spend more time digging into your findings — hoping it leads to a solution for my setup!
[0] https://github.com/NixOS/nixpkgs/blob/master/pkgs/applicatio...
All the links are styled in an unreadable form on that page.
This looks fun. The author mentions machine learning workloads. What are the typical machine learning use cases for a cluster of lower-end GPUs?
While on that topic, why must large-model inference be done on a single large GPU and/or a single bank of memory rather than a cluster of them? Is there any promise of eventually being able to run large models on clusters of weaker GPUs?