diplomat33
Amnon's argument rests on a significant foundational assumption, which I believe to be false: he assumes that the "long tail" of failure cases is independently distributed, i.e. that solving one of them has no bearing on solving the others. I believe the opposite: there will be a lot of similarity and overlap between many of the rare failure cases, and properly learning to solve a few of them can implicitly solve many more. In that sense, I think the monolithic E2E approach is perfectly fine.
Some edge cases probably overlap and some may not. And some edge cases are perception issues, while others may be behavior prediction or planning issues. For example, a person in a mascot costume might be a perception edge case if your perception system has never seen anything like that, and maybe adding that edge case to your training would automatically carry over to similar perception cases like a person in a Halloween costume. Other edge cases might be prediction-based. For example, Waymo encountered an edge case with a tow truck that was pulling a pickup truck at an unusual angle, so the Waymo prediction stack misinterpreted how the tow-truck-and-pickup combination would move. You could also have a planning edge case where you encounter, say, a construction zone that requires the car to move in a way it was not trained to do.
Of course, this framing only applies to the modular approach with separate perception, prediction, and planning stacks. With end-to-end, everything is trained together. But the point still stands: I don't think fixing one edge case will always carry over to other cases. For example, training your end-to-end model to handle the person in the Halloween costume would not carry over to the edge case of the oddly towed pickup truck. So I think there would be some independence between edge cases, just not 100%.
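To make the distinction concrete, here is a toy sketch of the two architectures, showing why an edge case can be localized to one stage in a modular stack but not in an end-to-end one. All function names and the stubbed logic are hypothetical, for illustration only, and are not based on any real AV codebase:

```python
# Toy contrast of modular vs. end-to-end driving stacks.
# Everything here is a made-up stub for illustration.

def perceive(sensor_data):
    # Stub perception: detect objects from raw sensor input.
    # A mascot-costume failure would live in this stage.
    return [{"type": "car", "pos": 10.0}]

def predict(objects):
    # Stub prediction: forecast each object's future position.
    # The oddly towed pickup failure would live in this stage.
    return [{**obj, "future_pos": obj["pos"] + 1.0} for obj in objects]

def plan(objects, futures):
    # Stub planning: pick a control action given the scene.
    # A construction-zone failure would live in this stage.
    return "slow_down" if futures else "cruise"

def modular_drive(sensor_data):
    """Modular stack: perception, prediction, and planning are
    separate modules, so a failure can be traced to one stage."""
    objects = perceive(sensor_data)
    futures = predict(objects)
    return plan(objects, futures)

def end_to_end_drive(sensor_data):
    """End-to-end: one learned mapping from sensors to controls,
    trained together, so failures aren't attributable to a stage."""
    # Stub standing in for a single large neural network.
    return "slow_down"
```

The point of the sketch is just that in the modular version each of the three example edge cases has a natural home, while the end-to-end version is a single opaque function where that decomposition disappears.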
Ultimately, when discussing the long tail, I don't think anyone really knows the best way to solve it. In fact, I don't think we really know how long the tail is, or, more precisely, how much of the long tail needs to be solved before AVs can be scaled safely everywhere. Amnon even acknowledges as much when he says that we could be underestimating or overestimating the long tail.