Outrageously Large Neural Networks: Up to 137B Parameters

  • Looking at papers like this, I can't help but think about parallel distributed processing (PDP). Will we be able to confirm PDP theory in the near future?