Elon and Karpathy have mentioned various fundamental changes to FSD over the past 18 months:
- HydraNet NN architecture
- multi-task learning
- 2D labelling to 3D labelling
- 2D labelling to 4D labelling: using video to incorporate the time dimension, rather than still images
- from per-camera imagery to 360-degree video
- moving more planning (behaviour generation) into the NNs
- more self-supervised learning
- probably some other things I'm forgetting (feel free to comment)
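For anyone unfamiliar with the first two items, here is a minimal sketch of the general idea behind a HydraNet-style multi-task network: one shared backbone computes features once, and several task-specific heads read from those same features. This is a toy in plain Python, not Tesla's code. The class name `ToyHydraNet`, the layer sizes, the random weights, and the head names (`lanes`, `objects`, `traffic_lights`) are all assumptions made up for illustration.

```python
import random


def make_layer(n_in, n_out, rng):
    """Build a random weight matrix (n_out rows, n_in columns) for a toy linear layer."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def linear(weights, x):
    """Multiply the weight matrix by the input vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    """Clamp negative values to zero."""
    return [max(0.0, v) for v in x]


class ToyHydraNet:
    """Hypothetical shared-backbone network with one output head per task."""

    def __init__(self, n_in, n_features, head_sizes, seed=0):
        rng = random.Random(seed)
        # Shared backbone: every task reuses these features.
        self.backbone = make_layer(n_in, n_features, rng)
        # One small head per task, each reading the shared features.
        self.heads = {
            name: make_layer(n_features, size, rng)
            for name, size in head_sizes.items()
        }

    def forward(self, x):
        # The backbone runs once per input, however many heads there are.
        features = relu(linear(self.backbone, x))
        return {name: linear(w, features) for name, w in self.heads.items()}


# Illustrative task names and sizes only.
net = ToyHydraNet(
    n_in=4,
    n_features=8,
    head_sizes={"lanes": 3, "objects": 5, "traffic_lights": 2},
)
outputs = net.forward([0.1, 0.2, 0.3, 0.4])
```

The point of the design is that the expensive feature computation is shared, so adding another task costs one more small head rather than a whole new network.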
The fundamental rewrite of FSD is turning out to be a slower and more incremental process than it first seemed to me. Still, I hope that with v9.0 we start to see the performance impact of these fundamental changes, which have been discussed since November 2019.
Software projects are always behind schedule, and completion always seems to be just around the corner. Then you turn the corner and see another corner. But behind THAT corner, the finish line is there for sure... Of course, former employees have (perhaps tongue-in-cheek) described Elon's estimate of a software development timeline as the amount of time required to physically type the lines of code, so his perpetual overoptimism doesn't help a situation already beset by mirages of completion.
Yet delays are temporary and success is forever. The long-term global economic and humanitarian effects of pushing NNs to superhuman performance on driving-relevant perception and planning tasks are intimidating to even attempt to calculate. The ideas and provisional results that Elon, Karpathy, et al. have discussed, especially when cross-referenced with academic research and with industry research and engineering, seem incredibly promising. To me, they carry a genuine hope of escaping indefinite stagnation in autonomous driving progress and unlocking step-change or exponential improvements in key performance metrics.
I think the reaction from the public, and even from fans, to Elon's comments on FSD and to Tesla's whole FSD enterprise is dishearteningly cynical and ungenerous. I'm waiting for v9.0 with hope and excitement. And after v9.0, I'll be eagerly waiting for v10.0, 11.0, and 12.0.