Ah, that's possible. I haven't followed the specifics of Tesla's implementation closely, as I don't think they publish much of their work. Assuming they use some sort of RNN-like architecture, or some other means of distilling temporal information and feeding it back into the network, that can certainly help it do a better job, but the point in my original post still stands. The network is still not rooted in any physics at all. It is just an approximation machine, and it can still break down in completely unpredictable ways on samples that are not similar enough to the training-set distribution. It doesn't, for example, "learn" classical structure-from-motion algorithms, and it wouldn't generalize well to data that looks very different, even though classical algorithms rooted in physics would.
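To make the "approximation machine" point concrete, here's a toy sketch (plain Python, nothing to do with Tesla's actual stack, all names made up): fit a degree-5 polynomial to sin(x) on [0, π]. In-range it matches beautifully; the moment you leave the training distribution it diverges wildly, while the underlying "physics" (sin is bounded by 1) does not.

```python
import math

def solve(A, b):
    """Solve A x = b with Gaussian elimination and partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations."""
    n = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    return solve(A, b)

def polyval(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

# "Training distribution": sin sampled on [0, pi]
xs = [i * math.pi / 100 for i in range(101)]
coeffs = polyfit(xs, [math.sin(x) for x in xs], 5)

# In-distribution, the approximation is excellent...
in_err = max(abs(polyval(coeffs, x) - math.sin(x)) for x in xs)
print(in_err)                    # well under 0.01

# ...but out of distribution it breaks down completely, while the
# physics (sin never leaves [-1, 1]) obviously does not.
extrap = polyval(coeffs, 10.0)
print(extrap, math.sin(10.0))    # |extrap| is huge; |sin(10)| <= 1
```

The same flavor of failure applies to a deep net: outside the support of the training data, nothing anchors the output.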
Well I agree and disagree.
Everything you said tells me you are a smart, experienced ML expert. All your points stand for basically any ML algorithm: if some ML algorithm is trained on lidar data, you would have the same potential issues with training/test distributions. Obviously, though, the more complicated the algorithm (like deep learning) and the noisier the data, the worse it could become.
But I would still argue deep learning techniques can learn some physics. That is definitely true in the pure sense: feed in time series with labels that are some order of integral or time derivative, and convnets can figure out the true math (convolutions are, of course, set up for exactly this) in a way that generalizes just like the standard derivative operators do.
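A minimal sketch of that claim, assuming the simplest possible "network" (a single width-3 linear convolution kernel fit by least squares, standing in for what a conv layer would learn; the data and names are a made-up toy, no ML framework involved): train it on windows of random quadratics labeled with their exact derivatives, and it recovers the central-difference stencil [-1/(2·Δt), 0, 1/(2·Δt)] — i.e., the true math — which then generalizes to signal families it never saw.

```python
import math
import random

DT = 0.1            # sample spacing of the "time series"
random.seed(0)      # reproducible toy data

def solve(A, b):
    """Solve A x = b with Gaussian elimination and partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Training set: length-3 windows of random quadratics x(t) = a + b t + c t^2,
# labeled with the exact derivative x'(t) = b + 2 c t.
rows, targets = [], []
for _ in range(500):
    a, b_, c = (random.uniform(-1, 1) for _ in range(3))
    t = random.uniform(-1, 1)
    rows.append([a + b_ * (t - DT) + c * (t - DT) ** 2,
                 a + b_ * t + c * t ** 2,
                 a + b_ * (t + DT) + c * (t + DT) ** 2])
    targets.append(b_ + 2 * c * t)

# Least-squares fit of the kernel via the 3x3 normal equations
G = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
rhs = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(3)]
kernel = solve(G, rhs)
print(kernel)       # ~ [-1/(2*DT), 0, 1/(2*DT)]: the central-difference stencil

# Generalization: differentiate a signal family it never saw (a sine)
err = max(abs(kernel[0] * math.sin(t - DT)
              + kernel[1] * math.sin(t)
              + kernel[2] * math.sin(t + DT)
              - math.cos(t))
          for t in (i * 0.01 for i in range(629)))
print(err)          # small: it learned the O(DT^2) finite-difference rule
```

That said, this is the rosy case: the labels are exact and the signals noiseless. With realistic noise the learned kernel becomes a regularized, attenuated version of the true operator, which is exactly the training/test-distribution caveat above.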
If you fed in *perfect* image data with high resolution, no noise, etc., where structure-from-motion algos would work, I think the right architecture could also learn the same math to get it right and be simple enough to generalize.
The benefit of deep learning, of course, is that it is statistical and can achieve better performance than classical algorithms. The best-performing algorithms (on independent test data) will have learned the simplest representation of the physics that also handles the noise and edge cases best on average.
So I agree with all your points on the potential flaws, but that doesn't mean it's impossible. It means you need a lot of diverse data to ensure that your training and test set distributions match, and then you can better judge whether your algorithm is good enough. Even that doesn't guarantee it, though.