Here's a video of the agent actually playing (linked in the paper): https://www.youtube.com/watch?v=Msy82sIfprI
This is really cool. A step in the right direction towards general learning through observation.
This is actually quite human. I also watch Let's Plays if I struggle with a quest (or a game in general).
Also, it's an interesting assumption to say "harder = fewer rewards". It probably doesn't always apply, but it's a good generalization.
Are audio cues also analyzed here? I.e., "We observe that use of the audio signal in CMC results in more emphasis being placed on key items and their location in the inventory."
This should probably say "ML" or "AI" or whatever. I was slightly disappointed to realize it was not a funny paper about… I don't know, to be fair.
Neat. I don't understand what they mean by having embedded a reward video into the set. Is that a video where copying the behaviour will deliver victory?