The issue here is ground truth. There is no ground truth when a Tesla customer car is out driving. You have no idea if the customer disengaged because of danger or because they just wanted to go get a donut, and you have no idea if you missed that speed limit sign. Tesla drivers are not training or even testing the system every time they drive, because the feedback the system can get is minimal, and all learning systems need a feedback signal.
There are two distinct concepts to understand here:
1) Automatic labelling, e.g. when drivers automatically provide the training signal for semantic free space segmentation. (Another example of automatic labelling is self-supervised learning.)
2) Automatic data curation, when a sophisticated technique is used to trigger uploads of data and then the ground truth label is provided by a human annotator.
You can imagine them, but Karpathy specifically says he hand-codes triggers for specific detections to collect data. They teach it a few stop signs, then hope it can pick up more and more stop sign variations. They are nowhere near just having the system learn by itself and just record and upload "odd" situations.
Check out this video of him from a year ago. He's talking about how they collect all the variations of stop signs so they can train the model on them. These are single-frame images of single objects, which humans then have to classify. They are nowhere near complex conditional cases like "all lanes are stopped for no reason".
This is an incorrect interpretation of Karpathy because you are making the leap from “this is one thing that we do” to “this is all that we do”. Karpathy and others at the company such as Stuart Bowers and Elon have discussed other forms of automatic data curation. For example, Karpathy has explicitly mentioned Tesla’s use of active learning. One way to do active learning is to run an ensemble of diverse neural networks and then upload training examples when their predictions diverge. These training examples get labelled by human annotators and then added to the training datasets that are used to train the neural networks.
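To make the ensemble idea concrete, here is a minimal sketch of disagreement-based selection. This is not Tesla's actual pipeline: the function names, the label strings, the 0.5 threshold, and the choice of vote entropy as the disagreement measure are all assumptions made for illustration. Each frame gets one predicted label per model, and a frame is flagged for upload and human labelling when the votes diverge enough.

```python
from collections import Counter
from math import log


def vote_entropy(predictions):
    """Entropy (in nats) of the label votes cast by the ensemble members.

    Zero when every model agrees; larger when the votes are more spread out.
    """
    counts = Counter(predictions)
    total = len(predictions)
    return -sum((c / total) * log(c / total) for c in counts.values())


def select_for_upload(ensemble_predictions, threshold=0.5):
    """Return indices of examples whose ensemble votes diverge past the threshold.

    ensemble_predictions[i] is the list of labels each model predicted for
    example i. The threshold is a hypothetical tuning knob.
    """
    return [
        i
        for i, preds in enumerate(ensemble_predictions)
        if vote_entropy(preds) > threshold
    ]


# Hypothetical predictions from three models on four camera frames.
frames = [
    ["stop_sign", "stop_sign", "stop_sign"],   # unanimous
    ["stop_sign", "yield_sign", "stop_sign"],  # one dissenter
    ["stop_sign", "yield_sign", "no_sign"],    # full disagreement
    ["no_sign", "no_sign", "no_sign"],         # unanimous
]

# Frames the ensemble disagrees on would be sent to human annotators.
to_label = select_for_upload(frames)
print(to_label)
```

The point is that no ground truth is needed to decide what to upload. The disagreement itself is the trigger, and the human annotator supplies the label afterwards.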