Reinforcement Learning for Reasoning in LLMs with One Training Example