Self-Hosting GPU-Accelerated LLM (Mistral 7B) on Kubernetes (EKS)

  • A guide based on my experience running the Mistral 7B LLM on EKS. Warning: it makes some opinionated tech choices (AWS, EKS, Karpenter, NVIDIA GPUs, Hugging Face).

    If you try this, don't forget about GPU nodes left sitting idle: they are billed for as long as they run.