DeepSeek-R1 at 3,872 tokens/second on a single Nvidia HGX H200