FSD improvement rate with neural nets for planning and control?

I'm not an AI expert, but I'm interested in thinking about how the use of neural nets for planning and control in FSD (Elon tweeted in January suggesting they were moving to this in FSD Beta v11.3) improves performance. Intuitively, moving from a system of traditional programming for control (i.e., a load of "if... then" statements) to neural nets should allow for faster improvement, as the neural net can learn from everything it sees, but this is difficult to quantify. I guess it would be a pretty material improvement. Are there any examples through history where deep learning has been applied to a new task (e.g., image recognition) in place of traditional programming for control and we've been able to quantify the improvement we've seen? There are lots of stylized examples showing how deep learning performs better than other ML algorithms, but I haven't been able to find anything quantifiable.
 
Intuitively, moving from a system of traditional programming for control (i.e., a load of "if... then" statements) to neural nets should allow for faster improvement, as the neural net can learn from everything it sees, but this is difficult to quantify. I guess it would be a pretty material improvement.

Using neural nets instead of traditional programming does bring many improvements. That is why everyone, not just Tesla, is using neural nets for all parts of the autonomous driving stack.

I think the main advantage is efficiency. Doing autonomous driving with just traditional programming would require writing billions of lines of code. Doing that manually would be incredibly inefficient, take too long, and be prone to error. With machine learning, you can essentially have the computer "write the code" for you more quickly. The other advantage is that it is easier to fix an issue. With traditional programming, you would need to go into those billions of lines of code and figure out what needs to be changed. With machine learning, you can give it better data and retrain the neural net.
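To make that concrete, here is a toy sketch (not Tesla's actual code, just an illustration in Python/PyTorch): the same "follow the lead car" behavior written once as hand-coded rules and once as a small neural net fitted to example data. Fixing the rule-based version means adding more branches; fixing the learned version means adding more (or better) data and re-running the training loop.

```python
# Illustrative only: a toy "follow the lead car" controller, first as hand-written
# rules, then as a small neural net fitted to examples. Real planners are far more
# complex; this just shows why retraining on data scales better than editing rules.
import torch
import torch.nn as nn

def rule_based_accel(gap_m: float, closing_speed_mps: float) -> float:
    """Hand-coded if/then logic: every new scenario means another branch."""
    if gap_m < 10.0:
        return -3.0                      # brake hard when very close
    if gap_m < 30.0 and closing_speed_mps > 2.0:
        return -1.5                      # ease off when closing quickly
    if gap_m > 60.0:
        return 1.0                       # speed up when the road is clear
    return 0.0                           # otherwise hold speed

# A tiny learned policy: same inputs, but the behavior comes from data, not branches.
policy = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))

# Pretend these (gap, closing speed) -> acceleration pairs came from logged driving.
states  = torch.tensor([[8.0, 0.5], [25.0, 3.0], [70.0, -1.0], [40.0, 0.0]])
actions = torch.tensor([[-3.0], [-1.5], [1.0], [0.0]])

opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
for _ in range(500):                     # "fixing a bug" = add data here and retrain
    loss = nn.functional.mse_loss(policy(states), actions)
    opt.zero_grad(); loss.backward(); opt.step()

print(rule_based_accel(25.0, 3.0))           # rule-based answer
print(policy(torch.tensor([[25.0, 3.0]])))   # learned answer (approximate)
```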

But there is nothing magical about using neural nets. They don't automatically learn on their own from everything they see; they only "learn" when you train them with specific data. You still need to do a lot of work collecting data and training the system. And driving safely everywhere involves many tasks and millions of edge cases that need to be trained, validated, retrained and so on. That's why it has taken this long to get to the autonomous driving we have. Machine learning is a tool that has made it easier to make progress in autonomous driving, but it is not a magic bullet that will "solve FSD" overnight.
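Roughly, that workflow looks like the loop below. This is a stand-in sketch: the function names, metric and numbers are invented for illustration and are not anyone's real data engine.

```python
# Hypothetical sketch of the collect -> train -> validate -> retrain loop described
# above. All functions are stand-ins; only the shape of the loop is the point.
import random

def collect_fleet_clips(trigger: str, n: int) -> list[dict]:
    """Stand-in for fleet data collection triggered by a failure category."""
    return [{"trigger": trigger, "clip_id": random.randint(0, 10**6)} for _ in range(n)]

def train(dataset: list[dict]) -> dict:
    """Stand-in for training; the 'model' just remembers how much data it saw."""
    return {"trained_on": len(dataset)}

def evaluate(model: dict, edge_case: str) -> float:
    """Stand-in metric: pretend more data on an edge case means better recall."""
    return min(0.99, 0.5 + model["trained_on"] / 200_000)

dataset = collect_fleet_clips("baseline", 50_000)
model = train(dataset)

# Each pass: find a failing edge case, mine more examples of it, retrain, re-validate.
for edge_case in ["unprotected_left_turn", "cut_in_high_yaw_rate", "debris_on_road"]:
    while evaluate(model, edge_case) < 0.95:
        dataset += collect_fleet_clips(edge_case, 25_000)
        model = train(dataset)
    print(f"{edge_case}: recall {evaluate(model, edge_case):.2f}")
```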

Are there any examples through history where deep learning has been applied to a new task (e.g., image recognition) in place of traditional programming for control and we've been able to quantify the improvement we've seen? There are lots of stylized examples showing how deep learning performs better than other ML algorithms, but I haven't been able to find anything quantifiable.

Yes. Nearly every company now uses deep learning in autonomous driving. In fact, deep learning is applied to all parts of the stack, from perception and prediction to planning and control. Image recognition was one of the very first tasks where deep learning replaced traditional approaches: the breakthrough came in 2012, when a deep convolutional network won the ImageNet challenge by a wide margin over non-deep-learning methods. ImageNet - Wikipedia.

You can check out academic papers on machine learning. They will often describe quantifiable improvements over past models. For example, here is a table that compares Waymo's latest sparse window transformer to other models.

[Image: benchmark table comparing Waymo's sparse window transformer to prior models]


Source:

You can also look at release notes. Cruise release notes often describe quantifiable improvements from using better neural nets. For example:
  • Shipped DLA v3 & KSE VRU v13 which improves tracking of bikes by up to 30%, allowing the AV to behave and respond more safely around these vehicles on higher speed roads
  • Shipped PSeg v5.1.5 that improves tracking by up to 25% for several classes of less common objects, such as animals and small debris in the road.
  • Shipped STA-V v24 that improves prediction of incoming cross traffic vehicles in the right lane when the AV is taking a right turn on major roads with splitting lanes by 15%
  • Shipped Vehicle MTL v3 which improves vehicle open door detections by 49%.

Source: April 2023 Software Release

Or Tesla release notes:

- Improved control through turns, and smoothness in general, by improving geometry, curvature, position, type and topology of lanes, lines, road edges, and restricted space. Among other improvements, the perception of lanes in city streets improved by 36%, forks improved by 44%, merges improved by 27% and turns improved by 16%, due to a bigger and cleaner training set and updated lane-guidance module.

- Added lane-guidance inputs to the Occupancy Network to improve detections of long-range roadway features, resulting in a 16% reduction in false negative median detections.

- Improved motorbike recall by 8% and increased vehicle detection precision to reduce false positive detections. These models also add more robustness to variance in vision frame-rate.

- Reduced interventions caused by other vehicles cutting into ego's lane by 43%. This was accomplished by creating a framework to probabilistically anticipate objects that may cut into ego's lane and proactively offset and/or adjust speed to position ego optimally for these futures.

- Improved cut-in control by reducing lane-centric velocity error by 40-50% for close-by vehicles.

- Improved recall for object partial lane encroachment by 20%, high yaw-rate cut-in by 40%, and cut-out by 26% by using additional features of the lane-change trajectory to improve supervision.

Source: FSD Beta 11.4.3 (2023.7.15) Official Tesla Release Notes - Software Updates
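For anyone unfamiliar with the recall/precision figures quoted in those notes, here is a quick illustration. Tesla doesn't publish how its percentages are computed, so reading "+8% recall" as a relative gain below is just an assumption, and the counts are made up.

```python
# Quick reference for the detection metrics quoted in release notes. The counts and
# the relative-improvement interpretation are assumptions for illustration only.
def precision(tp: int, fp: int) -> float:
    """Of everything the net detected, what fraction was real?"""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Of everything real, what fraction did the net detect?"""
    return tp / (tp + fn)

# Hypothetical motorbike-detection counts before a retrain.
old_recall = recall(tp=800, fn=200)          # 0.80
new_recall = old_recall * 1.08               # "+8% recall" read as a relative gain -> 0.864
print(f"missed motorbikes drop from {1 - old_recall:.0%} to {1 - new_recall:.1%}")
```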
 
Tesla has been gradually transitioning its stack to more neural nets (reportedly around 90% now) over the last three years. The industry as a whole has made the same journey. No one believes any more that robotics problems will be solved with traditional programming.

Neural nets use probabilities for "learning", which makes training for the uncommon or unknown hard. NNs are typically a black box and lack explainability, which makes validation very hard. To make these systems viable in unsupervised safety-critical applications, you limit the bounds in which they operate, e.g. maximum speed, geographic area, weather and so on. This limitation is referred to as an ODD (operational design domain).
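As a rough illustration of what an ODD gate looks like in practice (the limits below are invented, not any company's actual ODD):

```python
# Illustrative sketch of an ODD (operational design domain) gate: the learned driving
# policy is only allowed to run inside explicit, hand-specified bounds.
from dataclasses import dataclass

@dataclass
class ODD:
    max_speed_kph: float = 65.0
    allowed_weather: tuple = ("clear", "cloudy", "light_rain")
    geofence: tuple = ((37.70, -122.52), (37.82, -122.35))  # (lat, lon) bounding box

def inside_odd(odd: ODD, speed_kph: float, weather: str, lat: float, lon: float) -> bool:
    (lat_min, lon_min), (lat_max, lon_max) = odd.geofence
    return (speed_kph <= odd.max_speed_kph
            and weather in odd.allowed_weather
            and lat_min <= lat <= lat_max
            and lon_min <= lon <= lon_max)

odd = ODD()
if inside_odd(odd, speed_kph=45.0, weather="light_rain", lat=37.76, lon=-122.45):
    pass  # hand control to the neural planner
else:
    pass  # fall back: slow down, pull over, or hand back to a human/remote operator
```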

We don't yet trust unsupervised computer vision in any safety-critical application, including radiology (and driving is time-critical too), which should tell you something about the short-term viability of vision-only. In order to drive more safely, robotaxi companies have at least three more sensing modalities (lidar, radar and microphones) and detailed maps, so they know what's up ahead and/or occluded from view. Having more modalities makes the system more robust and less susceptible to adversarial attacks. Tesla has started to use maps extensively by crowdsourcing them from the fleet, a concept pioneered by Mobileye roughly 8 years ago. But you can't rely on the maps being correct, as things change with roadwork etc. Tesla seems to be over-reliant on map data at the moment (incorrect speed limits, phantom braking, running stop signs).
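Here is a toy example of the map-as-prior point: the stored map value is only a fallback, and a confidently perceived sign overrides it. The fusion rule is made up for illustration, not how Tesla or anyone else actually does it.

```python
# Hypothetical sketch: treat the crowdsourced/HD map as a prior and let live perception
# override it when they disagree (roadwork, changed speed limits). Invented logic.
from typing import Optional

def fused_speed_limit(map_limit_kph: Optional[float],
                      sign_detection_kph: Optional[float],
                      sign_confidence: float) -> float:
    """Prefer a confidently perceived sign over the stored map value."""
    if sign_detection_kph is not None and sign_confidence >= 0.8:
        return sign_detection_kph          # the world as seen right now wins
    if map_limit_kph is not None:
        return map_limit_kph               # fall back to the (possibly stale) map
    return 30.0                            # conservative default when both are missing

# Map says 80 km/h, but a clearly seen temporary roadwork sign says 50 km/h.
print(fused_speed_limit(map_limit_kph=80.0, sign_detection_kph=50.0, sign_confidence=0.95))
```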
 