You can actually make the same argument about lidar.
Lidar is being used only to get to some kind of geofenced L4 quickly to get VC money.
Vision-only FSD is not wild speculation. It is just "first principles": if humans can do it with vision only, then theoretically we should be able to do it with a NN as well.
Except that eventually the cameras will FAR exceed the natural abilities of any animal on Earth (e.g. extreme night vision, additional spectra, and a NN that can decide which is best in real time). Lidar was a stopgap. Progress in FSD is a software issue now.