First AlphaGo and now this. Our world has truly come to an end!
Interesting implementation detail:
while "pigs" != "fly":
LSTM would converge even faster.
A K-level breadth-first search that mimics the optimal policy, combined with a simple learning-to-search algorithm using a cost-sensitive binary linear classifier, would work well too.
After training it would be a constant time evaluation of what to do next.
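To make that idea concrete, here is a rough sketch, not a tested Flappy Bird agent. Everything in it is an assumption made for illustration: the toy one-dimensional physics (no pipes, just a ceiling and a ground), the constants, the three features, and the cost-weighted perceptron standing in for the "cost sensitive binary linear classifier". The K-level breadth-first lookahead plays the expert, and the learner is only corrected on states where one action is fatal and the other is not.

```python
import random

GRAVITY, FLAP, TOP, GROUND = 1.0, -3.0, 0.0, 20.0  # assumed toy physics

def step(state, action):
    """Advance one tick. state = (y, velocity); y grows downward."""
    y, v = state
    v = FLAP if action == 1 else v + GRAVITY
    return (y + v, v)

def alive(state):
    return TOP < state[0] < GROUND

def survives(state, depth):
    """Breadth-first lookahead: can any action sequence stay alive for `depth` ticks?"""
    frontier = {state}
    for _ in range(depth):
        frontier = {s for f in frontier for s in (step(f, 0), step(f, 1)) if alive(s)}
        if not frontier:
            return False
    return True

def expert_costs(state, k):
    """Cost of [no flap, flap]: 0.0 if the K-level search finds a safe path, else 1.0."""
    costs = []
    for action in (0, 1):
        nxt = step(state, action)
        costs.append(0.0 if alive(nxt) and survives(nxt, k - 1) else 1.0)
    return costs

def features(state):
    y, v = state
    return [1.0, y / GROUND, v / 5.0]  # bias, scaled height, scaled velocity

def predict(w, state):
    """Constant-time policy: flap when the linear score is positive."""
    return 1 if sum(wi * xi for wi, xi in zip(w, features(state))) > 0 else 0

def train(episodes=200, k=6, horizon=50, seed=0):
    """Roll in with the learned policy; on a fatal prediction, update and follow the expert."""
    rng = random.Random(seed)
    w = [0.0, 0.0, 0.0]
    for _ in range(episodes):
        state = (rng.uniform(5.0, 15.0), 0.0)
        for _ in range(horizon):
            costs = expert_costs(state, k)
            if costs[0] == costs[1] == 1.0:
                break  # doomed whichever action is taken
            action = predict(w, state)
            if costs[action] > min(costs):
                best = costs.index(min(costs))
                # Perceptron update, scaled by how much worse the mistake was.
                scale = (1.0 if best == 1 else -1.0) * abs(costs[0] - costs[1])
                w = [wi + scale * xi for wi, xi in zip(w, features(state))]
                action = best  # override so the episode can continue
            state = step(state, action)
    return w
```

After `w = train()`, choosing an action is a single call such as `predict(w, (19.0, 1.0))`, i.e. one dot product per frame with no search. Whether a linear rule is enough once real pipes are involved is an open question this sketch does not answer.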
Interesting. Here's a similar project from a couple years ago:
I'd like to see this hooked up to a physical phone and actuator. Has anybody here seen anything done with a real-time physical loop in the learning process?
I was sort of hoping that the bird would hit a pipe at the end of the 6-minute video.
Finally we're trying to teach algorithms to feel anger! I love it!
EDIT: Just to be clear, this is a joke based on the game being Flappy Bird.
Seriously, though, this is awesome. I love this kind of stuff!