How Trillion-Parameter Models Are Trained