Ask HN: How do AI companies collect data to train models?

  • They use datasets like Common Crawl, a freely available archive of crawled web pages.