
Google AI: Revisiting the Unreasonable Effectiveness of Data

The issue here is ground truth. There is no ground truth when a Tesla customer car is out driving. You have no idea if the customer disconnected because of danger or because they just wanted to go get a donut, and you have no idea if you missed that speed limit sign. Tesla drivers are not training or even testing the system every time they drive, because the feedback the system can get is minimal, and all learning systems need a feedback signal.

There are two distinct concepts to understand here:

1) Automatic labelling, e.g. when drivers automatically provide the training signal for semantic free space segmentation; a rough sketch of this follows below the list. (Another example of automatic labelling is self-supervised learning.)

2) Automatic data curation, when a sophisticated technique is used to trigger uploads of data and then the ground truth label is provided by a human annotator.
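To make concept (1) concrete, here is a minimal, purely illustrative sketch of my own (not Tesla's code, and the homography-based projection is an assumption for the example): every pixel the car subsequently drove over is marked as drivable ground, so the recorded trajectory itself becomes the segmentation label.

import numpy as np

def auto_label_free_space(image_hw, future_xy, ground_to_image):
    """Label pixels as drivable wherever the car subsequently drove."""
    h, w = image_hw
    mask = np.zeros((h, w), dtype=np.uint8)
    # future_xy: Nx2 ground-plane points along the path the car actually drove (metres)
    pts = np.concatenate([future_xy, np.ones((len(future_xy), 1))], axis=1)
    px = (ground_to_image @ pts.T).T          # 3x3 homography: ground plane -> image pixels
    px = px[:, :2] / px[:, 2:3]               # perspective divide
    for u, v in px.astype(int):
        if 0 <= v < h and 0 <= u < w:
            mask[v, u] = 1                    # drivable, with no human annotator involved
    return mask

Each logged clip then yields (frame, mask) pairs for ordinary supervised training of a free-space segmentation network.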

You can imagine them, but Karpathy specifically says he hand-codes for specific detections to collect data. They teach it a few stop signs, then hope it can pick up more and more stop sign variations. They are nowhere near having the system learn by itself and simply record and upload "odd" situations.

Check out this video of him from a year ago. He's talking about how they collect all the variations of stop signs so they can train the model on them. These are single-frame images of single objects, which humans then have to classify. They are nowhere near complex conditional cases like "all lanes are stopped for no reason".

This is an incorrect interpretation of Karpathy because you are making the leap from “this is one thing that we do” to “this is all that we do”. Karpathy and others at the company such as Stuart Bowers and Elon have discussed other forms of automatic data curation. For example, Karpathy has explicitly mentioned Tesla’s use of active learning. One way to do active learning is to run an ensemble of diverse neural networks and then upload training examples when their predictions diverge. These training examples get labelled by human annotators and then added to the training datasets that are used to train the neural networks.
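As a rough illustration of that ensemble-disagreement trigger (my own sketch, not a description of Tesla's actual logic; the threshold and divergence measure are arbitrary), an upload fires only when the ensemble members disagree strongly about a frame:

import numpy as np

def disagreement(logits_per_model):
    """Average KL divergence of each member from the ensemble mean; high means the members disagree."""
    probs = [np.exp(l - l.max()) / np.exp(l - l.max()).sum() for l in logits_per_model]
    mean_p = np.mean(probs, axis=0)
    return float(np.mean([np.sum(p * np.log((p + 1e-9) / (mean_p + 1e-9))) for p in probs]))

def should_upload(logits_per_model, threshold=0.5):
    """Queue the frame for human labelling only if the ensemble diverges."""
    return disagreement(logits_per_model) > threshold

The point is that disagreement, rather than a hand-coded detector, decides which frames are worth a human annotator's time.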
 
@Bladerskb

I think you're confusing the IPU-03 sold by Desay SV with other products sold by Desay SV or other companies. Or you're simply misunderstanding the functionality offered by the IPU-03.

The IPU-03 is an autonomous driving domain controller. This is what the IPU-03 does in the Xpeng P7, according to a press release:

"Available in China, the Xpeng P7 is one of the world’s leading autonomous EVs and carries the Desay SV automatic driving domain control unit – the IPU-03. Through multi-sensor data collection, the IPU-03 calculates the vehicle’s driving status and provides 360-degreee omnidirectional perception with real time monitoring of the surrounding environment to make safe driving decisions."​

Like Mobileye, Desay appears to have developed an end-to-end autonomous driving system, encompassing perception, localization, path planning, decision-making, and control.

Xpeng doesn't seem competent to develop such a system in-house, at least not at their Silicon Valley subsidiary. I would venture further that the dysfunction at their Silicon Valley office probably extends to their offices in China as well.



A lot of SDC companies use Nvidia Drive OS, including Zoox.

Source? If you Google any of the following search terms:

"zoox" "nvidia drive os"

"zoox" "nvidia drive"

"zoox" "nvidia os"

"zoox" "nvidia operating system"

"zoox" "nvidia" "operating system"


You get no relevant results except for a blog post that doesn't say Zoox is using (or has ever used) Nvidia Drive OS.



I did a little searching to see if I could find any info on what operating systems Zoox or other AV companies use.

For Zoox, I found a job posting that hints that Zoox may use some form of "real-time Linux" (archive.org, archive.is). Maybe Automotive Grade Linux?

If I were an AV company, I would want to use an OS that is either a) my own proprietary OS or b) a free and open source OS. Or a mix of (a) and (b). I would feel uneasy about using a proprietary, closed source, licensed OS owned and controlled by another company.
 
Yes, everything you said makes sense up to here. The issue here is ground truth. There is no ground truth when a Tesla customer car is out driving. You have no idea if the customer disconnected because of danger or because they just wanted to go get a donut, and you have no idea if you missed that speed limit sign. Tesla drivers are not training or even testing the system every time they drive, because the feedback the system can get is minimal, and all learning systems need a feedback signal.


Only for a very narrow set of actions. Without FSD active, there is no planned route, so FSD has no idea if the user turned at that intersection because they meant to or because there was construction. If you watched a video of someone driving a car in the city and had no idea of their destination, how would you determine when they did something unexpected? Even if they have NAV on, how often does it get ignored because of traffic, missed turns, or new information?


You can imagine them, but Karpathy specifically says he hand-codes for specific detections to collect data. They teach it a few stop signs, then hope it can pick up more and more stop sign variations. They are nowhere near having the system learn by itself and simply record and upload "odd" situations.

Check out this video of him from a year ago. He's talking about how they collect all the variations of stop signs so they can train the model on them. These are single-frame images of single objects, which humans then have to classify. They are nowhere near complex conditional cases like "all lanes are stopped for no reason".

None of this is to say that Tesla gets no value from the cars on the road; clearly they do. But this data is much less valuable per mile than a specific test fleet's, especially when the functionality is so basic that it is fully expected that users will be taking over constantly for the system.

This approach does not assume "automatic labeling". It is good old-fashioned manual labeling for supervised learning. This can get Tesla pretty far; I mean, look how far Waymo has gone with less available data.

FSD predictions are running. The driver intervenes, which tells the algorithm there was a difference between prediction and reality. A data snippet is uploaded to headquarters. Labelers look through the scenario, decide whether the algorithm messed up or the driver just felt like doing something different, and label appropriately.

Self-supervised learning is a thing and eventually Tesla may use that. Keep in mind the labeling doesn't have to be perfectly accurate for this to work; if the driver "ground truth" is correct only 90% of the time, that would still work.
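A minimal sketch of that intervention-triggered loop, with made-up names and a deliberately naive override check (nothing here is Tesla's actual pipeline):

from collections import deque
from dataclasses import dataclass
from typing import Any

@dataclass
class Frame:
    image: Any             # camera frame
    predicted_action: Any  # what the running FSD model would have done
    driver_action: Any     # what the driver actually did

class InterventionLogger:
    """Keep a rolling buffer of recent frames; flush it when the driver overrides."""

    def __init__(self, seconds=10, hz=30):
        self.buffer = deque(maxlen=seconds * hz)

    def step(self, frame):
        self.buffer.append(frame)
        if frame.driver_action != frame.predicted_action:  # naive stand-in for a real trigger
            self.upload(list(self.buffer))

    def upload(self, clip):
        # Placeholder: send the clip to human labelers, who decide whether the
        # model messed up or the driver just felt like doing something different.
        pass

Because some uploads turn out to be driver whims rather than model errors, the resulting labels are noisy, which is exactly why roughly-90%-correct driver ground truth can still be good enough.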
 
I did a little searching to see if I could find any info on what operating systems Zoox or other AV companies use.

For Zoox, I found a job posting that hints that Zoox may use some form of "real-time Linux" (archive.org, archive.is). Maybe Automotive Grade Linux?

If I were an AV company, I would want to use an OS that is either a) my own proprietary OS or b) a free and open source OS. Or a mix of (a) and (b). I would feel uneasy about using a proprietary, closed source, licensed OS owned and controlled by another company.
Source? If you Google any of the following search terms:

"zoox" "nvidia drive os"

"zoox" "nvidia drive"

"zoox" "nvidia os"

"zoox" "nvidia operating system"

"zoox" "nvidia" "operating system"


You get no relevant results except for a blog post that doesn't say Zoox is using (or has ever used) Nvidia Drive OS.

You CAN'T use Xavier, etc. without DRIVE OS, because it is what gives you access to the hardware.

DRIVE OS is not an actual OS. It's a brand name covering various modules, for example TensorRT, cuDNN, etc.

You can't use Xavier, Pegasus, or Orin without DRIVE OS, the same way you can't use a GTX/RTX Nvidia graphics card without Nvidia drivers.

Only once you understand this can we move forward and talk about things like Zoox modifying TensorRT to optimize for their particular workflow, which is the same as modifying drivers (Nvidia Game Ready Drivers) for GTX/RTX cards to optimize for a particular game.
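For anyone unfamiliar with what "using TensorRT" looks like in practice, here is a generic TensorRT 8.x-style sketch of converting a trained network (exported to ONNX) into an inference-optimized engine. It illustrates the kind of deployment tooling the job postings below describe; it says nothing about Zoox's or Cruise's actual pipelines.

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path, fp16=True):
    """Parse an ONNX model and build a serialized, inference-optimized TensorRT engine."""
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))
    config = builder.create_builder_config()
    if fp16 and builder.platform_has_fast_fp16:
        config.set_flag(trt.BuilderFlag.FP16)  # lower precision for faster inference
    return builder.build_serialized_network(network, config)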





Here's the job posting for Cruise, who uses Nvidia...

Senior Deep Learning Inference Engineer​

What you must have:

  • Extensive experience with ML deployment software (e.g. TensorRT, TF Lite, etc)

Bonus points!

  • GPU programming (e.g. CUDA, OpenCL)

Here is an Engineering Manager / Technical Lead / Software Engineer job description on LinkedIn:

Work across teams to improve platform performance by (1) analyzing modules for opportunities for performance improvements, (2) leverage accelerator platform by leveraging libraries and developing custom GPU code, (3) developing and monitoring platform performance metrics.​
Develop, deploy and maintain tooling to convert deep learning models/networks to inference-optimized modules; currently targeting production inferencing framework (TensorRT) and investigating open-source compilers.​
Participate in defining future computing platforms, with focus on algorithmic needs for neural networks and sensor ingestions flows.​
Architected and developed highly efficient, low-latency, in-loop image compression for computer vision pipeline; worked with and coordinated multiple teams to test, evaluate, and deploy the platform.​
 
I think you're confusing the IPU-03 sold by Desay SV with other products sold by Desay SV or other companies. Or you're simply misunderstanding the functionality offered by the IPU-03.

The IPU-03 is an autonomous driving domain controller. This is what the IPU-03 does in the Xpeng P7, according to a press release:

"Available in China, the Xpeng P7 is one of the world’s leading autonomous EVs and carries the Desay SV automatic driving domain control unit – the IPU-03. Through multi-sensor data collection, the IPU-03 calculates the vehicle’s driving status and provides 360-degreee omnidirectional perception with real time monitoring of the surrounding environment to make safe driving decisions."​

Like Mobileye, Desay appears to have developed an end-to-end autonomous driving system, encompassing perception, localization, path planning, decision-making, and control.

Xpeng doesn't seem competent to develop such a system in-house, at least not at their Silicon Valley subsidiary. I would venture further that the dysfunction at their Silicon Valley office probably extends to their offices in China as well.

Again, this showcases your limited knowledge of the auto and Tier 1 industry.
Tier 1s like Desay, Aptiv, ZF, Veoneer, etc. provide a variety of options to automakers.
  • Option 1 is just the hardware (SOC, domain controller) and sensor package (radar, camera, ultrasonic).
  • Option 2 is hardware and sensor package with sensor fusion of the sensors (this is what Desay SV is doing for Xpeng).
  • Option 3 is the hardware/sensor package, plus the sensor fusion, plus the perception software using either the Tier 1's algorithm or another Tier 2 (Mobileye, etc).
  • Option 4 is the hardware/sensor package, plus the sensor fusion, plus the perception software, plus the driving policy.

Desay SV does Option 2 for Xpeng, and this is the case for almost all automakers. They rarely do their own sensor fusion.

For example, here is one of the platforms Aptiv offers:

APTIV’S APPROACH TO SENSOR FUSION
Aptiv’s sensor fusion software centrally fuses input from radars, cameras and other sensors to intelligently deliver 360° perception.

https://www.aptiv.com/docs/default-source/white-papers/2021_aptiv_whitepaper_nextgenadasplatform.pdf

Desay SV does NOT provide the perception algorithm or the driving policy algorithms for Xpeng.
The same is the case with the BMW iX. They are using Option #3 from Aptiv: Aptiv provides the domain controller and the sensor fusion, which is packaged with the perception software running on the EyeQ5 vision SoC, plus an open EyeQ5 compute unit that BMW can write their own algorithms to.
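To make "sensor fusion" as a deliverable a bit more concrete, here is a toy late-fusion sketch of my own (nothing to do with Aptiv's or Desay SV's actual software): camera detections are matched to nearby radar returns so each fused object gets its class from the camera and its velocity from the radar.

import numpy as np

def fuse(camera_dets, radar_dets, max_dist=2.0):
    """camera_dets: list of {'xy': (x, y) in metres, 'label': str}
       radar_dets:  list of {'xy': (x, y) in metres, 'velocity': m/s}"""
    fused, used = [], set()
    for cam in camera_dets:
        best_i, best_d = None, max_dist
        for i, rad in enumerate(radar_dets):
            d = float(np.linalg.norm(np.subtract(cam["xy"], rad["xy"])))
            if i not in used and d < best_d:
                best_i, best_d = i, d
        if best_i is not None:
            used.add(best_i)  # radar return consumed
            fused.append({"xy": cam["xy"], "label": cam["label"],
                          "velocity": radar_dets[best_i]["velocity"]})
        else:
            fused.append({"xy": cam["xy"], "label": cam["label"], "velocity": None})
    return fused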
