electronblue
Really interesting paper out from Waymo today about using imitation learning to train a self-driving car to do path planning like a human driver.
Waymo blog post | Paper | Discussion on Gradient Descent (no trolling)
Waymo trained its network on ~50,000 miles of human driving. It makes me wonder what you could do with billions of miles of human driving.
Waymo suggests using imitation learning to make naturalistically behaving simulated vehicles, which can then be used for reinforcement learning. This is an exciting idea.
I also wonder if, once imitation learning and reinforcement learning in simulation have taken you to a certain point, it would then be productive to do reinforcement learning with the real-world fleet. Disengagements, aborts, and crashes would be logged, uploaded, and used as punishments. The reward function might be miles between punishments.
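To make the "miles between punishments" idea concrete, here is a rough sketch of how that number could be computed from uploaded drive logs. Everything in it is an assumption for illustration: the `DriveLog` structure, the event names, and the fallback when no punishments occur are hypothetical, not anything Waymo describes.

```python
from dataclasses import dataclass, field

# Hypothetical event labels that count as punishments (assumed names).
PUNISHMENT_EVENTS = {"disengagement", "abort", "crash"}


@dataclass
class DriveLog:
    """One uploaded drive from the fleet (hypothetical structure)."""
    miles: float                                  # distance covered in this drive
    events: list = field(default_factory=list)    # event labels logged during the drive


def miles_between_punishments(logs):
    """Total fleet miles divided by the number of punishment events."""
    total_miles = sum(log.miles for log in logs)
    punishments = sum(
        1 for log in logs for event in log.events if event in PUNISHMENT_EVENTS
    )
    # Assumption: with zero punishments, return total miles instead of dividing by zero.
    return total_miles / max(punishments, 1)


logs = [
    DriveLog(miles=120.0, events=["disengagement"]),
    DriveLog(miles=80.0),
    DriveLog(miles=100.0, events=["abort", "lane_change"]),  # lane_change is not a punishment
]
print(miles_between_punishments(logs))  # 300 miles / 2 punishments = 150.0
```

A single fleet-wide number like this is a very sparse signal, so in practice it would probably need to be broken down per drive or per policy version before it could guide learning.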
You are once again duplicate posting stuff @Bladerskb already posted in another thread.