
Tesla's large-scale fleet learning

strangecosmos2
Tesla has approximately 650,000 Hardware 2 and Hardware 3 cars on the road. Here are the five most important ways that I believe Tesla can leverage its fleet for machine learning:

1. Automatic flagging of video clips that are rare, diverse, and high-entropy. The clips are manually labelled for use in fully supervised learning for computer vision tasks like object detection. Flagging occurs as a result of Autopilot disengagements, disagreements between human driving and the Autopilot planner when the car is fully manually driven (i.e. shadow mode), novelty detection, uncertainty estimation, manually designed triggers, and deep-learning based queries for specific objects (e.g. bears) or specific situations (e.g. construction zones, driving into the Sun).

2. Weakly supervised learning for computer vision tasks. Human driving behaviour is used as a source of automatic labels for video clips. For example, with semantic segmentation of free space.

3. Self-supervised learning for computer vision tasks. For example, with depth mapping.

4. Self-supervised learning for prediction. The future automatically labels the past. Uploads can be triggered when a HW2/HW3 Tesla’s prediction is wrong. (A rough sketch of this kind of trigger follows the list below.)

5. Imitation learning (and possibly reinforcement learning) for planning. Uploads can be triggered by the same conditions as video clip uploads for (1). With imitation learning, human driving behaviour automatically labels either a video clip or the computer vision system's representation of the driving scene with the correct driving behaviour. (DeepMind recently reported that imitation learning alone produced a StarCraft agent superior to over 80% of human players. This is a powerful proof of concept for imitation learning.)​
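To make (4) a bit more concrete, here is a rough sketch of what a "prediction was wrong" upload trigger could look like. This is my guess at the general shape, not Tesla's actual implementation; every name and threshold below is made up for illustration. The idea: the car predicts where a tracked vehicle will be a couple of seconds from now, waits, compares against what actually happened, and flags the clip if the error is large.

import numpy as np

PREDICTION_HORIZON_S = 2.0   # how far ahead the predictor looks (illustrative)
ERROR_THRESHOLD_M = 1.5      # flag the clip if actual position differs by more than this

def should_upload_clip(predicted_track, actual_track):
    # Both arguments are (N, 2) arrays of x/y positions in metres for another
    # road user, sampled at the same timestamps over PREDICTION_HORIZON_S.
    # The "label" is simply what actually happened -- no human annotation needed.
    errors = np.linalg.norm(predicted_track - actual_track, axis=1)
    return errors.max() > ERROR_THRESHOLD_M

# Toy example: the other car was predicted to keep going straight,
# but it actually cut into our lane.
predicted = np.array([[0.0, 10.0], [0.0, 20.0], [0.0, 30.0]])
actual    = np.array([[0.0, 10.0], [0.8, 19.5], [2.5, 28.0]])

if should_upload_clip(predicted, actual):
    print("Prediction error exceeded threshold -- queue this clip for upload.")

The appeal of this kind of trigger is that the correction signal is free: the future supplies the label, so it scales with fleet miles rather than with human labelling hours.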

(1) makes more efficient/effective use of limited human labour. (2), (3), (4), and (5) don’t require any human labour for labelling and scale with fleet data. Andrej Karpathy is also trying to automate machine learning at Tesla as much as possible to minimize the engineering labour required.

These five forms of large-scale fleet learning are why I believe that, over the next few years, Tesla will make faster progress on autonomous driving than any other company.

Whether lidar is needed is an ongoing debate. No matter what, robust and accurate computer vision is a must, not only for redundancy but also because there are certain tasks lidar can’t help with: for example, determining whether a traffic light is green, yellow, or red. Moreover, at any point Tesla can deploy a small fleet of test vehicles equipped with high-grade lidar. That would combine the benefits of lidar with Tesla’s large-scale fleet learning approach.

I tentatively predict that, by mid-2022, it will no longer be as controversial to argue that Tesla is the frontrunner in autonomous driving as it is today. I predict that, by then, the benefits of the scale of Tesla’s fleet data will be borne out enough to convince many people that they exist and that they are significant.

Did I miss anything important?
 

strangecosmos2
On the topic of (1), here's a cool paper on discovering new object categories in raw, unlabelled video.

Large-Scale Object Mining for Object Discovery from Unlabeled Video

“Abstract—This paper addresses the problem of object discovery from unlabeled driving videos captured in a realistic automotive setting. Identifying recurring object categories in such raw video streams is a very challenging problem. Not only do object candidates first have to be localized in the input images, but many interesting object categories occur relatively infrequently. Object discovery will therefore have to deal with the difficulties of operating in the long tail of the object distribution. We demonstrate the feasibility of performing fully automatic object discovery in such a setting by mining object tracks using a generic object tracker. In order to facilitate further research in object discovery, we release a collection of more than 360,000 automatically mined object tracks from 10+ hours of video data (560,000 frames). We use this dataset to evaluate the suitability of different feature representations and clustering strategies for object discovery.”​
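The clustering step they describe is conceptually simple. Here's a toy sketch of the general idea, assuming you've already run a tracker and boiled each mined track down to a single feature embedding; the paper evaluates several feature representations and clustering strategies, and none of the numbers below are theirs.

import numpy as np
from sklearn.cluster import DBSCAN

# Toy stand-in: pretend each mined object track has been reduced to one
# feature vector (e.g. CNN embeddings averaged over the track). The paper
# mines 360k+ tracks; 5,000 random vectors keep this toy example fast.
rng = np.random.default_rng(0)
track_embeddings = rng.random((5_000, 128)).astype(np.float32)

# Density-based clustering: tracks with similar features land in the same
# cluster, and each sizeable cluster is a candidate "discovered" category.
labels = DBSCAN(eps=0.5, min_samples=20).fit(track_embeddings).labels_

num_categories = len(set(labels)) - (1 if -1 in labels else 0)  # -1 = noise
print(f"Found {num_categories} candidate object categories")

The interesting part for Tesla is that nothing in this loop requires a human until the very end, when someone looks at the clusters and decides which ones are worth adding as new semantic classes.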

 

strangecosmos2
On the topic of (5), George Hotz (the President and founder of Comma AI) has an interesting discussion of reinforcement learning in this interview (starting at 1:27:30):


I’m not sure whether or not what Hotz is describing would count as off-policy reinforcement learning. I don’t know enough about reinforcement learning to say. In any case, off-policy reinforcement learning is at least a related idea that’s relevant for training autonomous vehicles on planning.

I’m curious/befuddled about why Hotz would want to use reinforcement learning rather than supervised imitation learning (a.k.a. behaviour cloning). There are also approaches that combine reinforcement learning and imitation learning, such as inverse reinforcement learning and reinforcement learning from demonstrations. That makes everything more confusing.
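For anyone unfamiliar with the term, behaviour cloning really is just supervised learning on logged human driving: the input is whatever the perception stack sees, the label is what the human did next. A minimal PyTorch-style sketch of the idea (my illustration, not comma.ai's or Tesla's code; the dimensions and data are placeholders):

import torch
import torch.nn as nn

# A tiny policy network: scene features in, driving actions out.
class PolicyNet(nn.Module):
    def __init__(self, scene_dim=256, action_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(scene_dim, 128), nn.ReLU(),
            nn.Linear(128, action_dim),  # e.g. [steering, acceleration]
        )

    def forward(self, scene):
        return self.net(scene)

policy = PolicyNet()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# One training step on a fake batch of logged human driving.
scene_batch = torch.randn(32, 256)   # stand-in for the vision net's output
human_actions = torch.randn(32, 2)   # what the human driver actually did
loss = loss_fn(policy(scene_batch), human_actions)
optimizer.zero_grad()
loss.backward()
optimizer.step()

No reward function, no exploration, no simulator: just the logged data. That simplicity is exactly why I'm puzzled that Hotz would reach for reinforcement learning first.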

@jimmy_d is someone who could probably shed some light on this.
 

strangecosmos2
A few clarifications since the topic was brought up elsewhere:
  • I'm not certain full autonomy is even possible with current technology. I hope it is, but who can predict the future? However, I am much more confident we can get to “cyborg driving” (i.e. highly advanced Level 2 autonomy in all driving environments) within 5 years. I'm confident because we already have a rudimentary version of cyborg driving in the form of Navigate on Autopilot.
  • One way or another, I think there's a decent chance Tesla will use lidar eventually, even if superhuman full autonomy can be achieved without it in the near term.
  • Disengagements are probably not the best measure of how close a system is to full autonomy. I discussed this at length in another thread in the context of Waymo and Cruise. A year ago, Waymo's disengagement rate seemed to be once per ~50 miles, but now Waymo is deploying driverless rides and hopefully it has good internal safety metrics to support that decision. 1 disengagement ≠ 1 crash.
  • Partly for the same reason, Tesla's past or current Autopilot disengagement rate is not super germane to the hypothesis that Tesla has a competitive advantage in the form of large-scale fleet learning. Also, we are still waiting to see the next generation of Tesla's software and neural nets designed for the next generation of its computer hardware (i.e. Hardware 3 a.k.a. the FSD Computer). Hardware 2 cars — or Hardware 3 cars running the same software and neural nets as Hardware 2 cars — don't tell us what the next gen performance will be. Even when the next gen software and networks come out, it might take a few iteration cycles (i.e. collect fleet data, label, train, tweak) for Tesla to really hit its stride. Only then can we do an apples-to-apples comparison with Waymo One. (We've got to remember to compare consumer software with consumer software and development software with development software.)
  • Per (1), Tesla's advantage with fully supervised learning for computer vision tasks is largely in its ability to collect many more training examples than its competitors (e.g. Waymo) of rare semantic classes. Tesla won't get 1,000x more hand-labelled images of sedans than Waymo, but it might get 1,000x more hand-labelled images of bears, without spending anything like 1,000x more money on labelling.
  • With regard to the Baidu paper (which I discuss in a blog post that echoes this thread), my claim is not that Tesla will increase its own fleet size 1,000x and, therefore, increase its machine learning performance 10x. Rather, my claim is that Tesla has ~1,000x more training vehicles than Waymo and something like 300x as many training vehicles as the rest of the world combined. The Baidu paper doesn't tell us the increase in machine learning performance this implies; it just gives us a rough, ballpark point of reference.
  • The Baidu paper looks at four tasks. Three are natural language processing (NLP) tasks. One is image classification on the ImageNet dataset. Admittedly, ImageNet is only weakly analogous to the computer vision tasks used in autonomous driving. But at least it's a computer vision task. The NLP tasks seem even less analogous. Maybe NLP tasks involving sequences would be a more relevant comparison for (5): imitation learning for planning (but I'm not sure about that).
  • I don't know what Tesla's performance scaling rate is or will be. I don't think anyone outside Tesla knows. (Same goes for Waymo, Cruise, et al.) Even if the scaling rate is much less than ImageNet — say, 2x performance for 1,000x training data — that's still pretty good. 2x performance is nothing to sneeze at. (There's a little scaling-law calculator after this list.)
  • The topic of the “irreducible error region” was brought up. This concept is described in the Baidu paper (page 11):
“Finally, for most real world applications, there is likely to be a non-zero lower-bound error past which models will be unable to improve. This lower bound includes Bayes error—the information theoretic lower bound based on the data generating function—and a combination of other factors that cause imperfect generalization. For instance, mislabeled samples in the training or validation data sets are likely to cause irreducible error. We call this the irreducible error region. Although we have yet to reach the irreducible error region for real applications in this study, we have tested that this lower bound exists for toy problems.”
I haven't seen any evidence yet about what the irreducible error region might be for the computer vision, prediction, and planning tasks required for autonomous driving.​
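For anyone who wants to play with the scaling-law arithmetic: the Baidu paper's headline finding is that generalization error falls roughly as a power law in training set size, error(N) ≈ α·N^β. Here's a tiny calculator. The exponents are illustrative (roughly the ballpark the paper reports across tasks, if I'm remembering it right), and the 1,000x data ratio is my assumption about Tesla vs. Waymo, not a measurement.

def error_reduction_factor(data_ratio, beta):
    # How much the error shrinks when training data grows by data_ratio,
    # assuming error(N) ~ alpha * N**beta with beta < 0.
    return data_ratio ** (-beta)

for beta in (-0.1, -0.2, -0.3):
    factor = error_reduction_factor(1_000, beta)
    print(f"beta = {beta}: 1,000x more data -> error divided by ~{factor:.1f}")

With these exponents you get error divided by roughly 2x, 4x, and 8x respectively. Even the shallow end of that range is consistent with the "2x performance is nothing to sneeze at" point above.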
 

strangecosmos2
A thought I had about ImageNet vs. other tasks: on ImageNet, a neural network is making a prediction about which of 1,000 categories an image belongs to. That means it has a 0.1% chance of getting the answer right if it guesses randomly. By contrast, if a neural network is tasked with, say, classifying images of traffic lights as either green or red/yellow, it has a 50% chance of getting it right by randomly guessing.

As far as I'm aware, if you do pre-training on ImageNet, a neural network can often do quite well on binary classification tasks (i.e. 90%+ accuracy) like this with fine-tuning from a relatively small number of labelled training images (e.g. just a few dozen).
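For the curious, that's the standard transfer-learning recipe: take an ImageNet-pretrained network, swap the 1,000-way head for a 2-way head, and fine-tune on the small labelled set. A minimal torchvision-style sketch (the data here is a random placeholder standing in for a few dozen labelled traffic-light crops):

import torch
import torch.nn as nn
from torchvision import models

# Start from a network pretrained on ImageNet's 1,000 classes...
model = models.resnet18(pretrained=True)

# ...and replace the final 1,000-way classifier with a 2-way one
# (say, green vs. red/yellow).
model.fc = nn.Linear(model.fc.in_features, 2)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# One fine-tuning step on a placeholder batch.
images = torch.randn(16, 3, 224, 224)    # stand-in for labelled crops
labels = torch.randint(0, 2, (16,))      # 0 = green, 1 = red/yellow
loss = loss_fn(model(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

The pretrained features do most of the work, which is why only a small amount of labelled data is needed for a task this narrow.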

Object detection for autonomous vehicles is a different problem than image classification, but for better and for worse ImageNet classification is the most well-studied problem in deep learning-based computer vision.
 

diplomat33
I'm not certain full autonomy is even possible with current technology. I hope it is, but who can predict the future? However, I am much more confident we can get to “cyborg driving” (i.e. highly advanced Level 2 autonomy in all driving environments) within 5 years. I'm confident because we already have a rudimentary version of cyborg driving in the form of Navigate on Autopilot.

I get the idea of simply enhancing human drivers rather than replacing them. But I think "cyborg driving" could prove as difficult as, or more difficult than, autonomous driving. With "cyborg driving", you need to determine how you are going to split the job of driving between the human driver and the machine. And you need a sophisticated driver attention system to monitor the driver in case they get distracted, sleepy or experience a medical emergency. And you need the machine to be the fallback in case the driver is incapable of driving (sleepy or ill), so you still need some amount of autonomous driving. If you are going to do all this extra stuff, why not just cut the human driver out of the loop completely and go fully autonomous? Plus, "cyborg driving" gives up all the advantages that full autonomy would bring to society.

And of course, "cyborg driving" is born out of the premise you state: that autonomous driving is not possible and, ergo, the best we can hope for is to enhance the human driver instead. But if that premise is false and autonomous driving really is possible, then the whole need for "cyborg driving" becomes a moot point.

Disengagements are probably not the best measure of how close a system is to full autonomy. I discussed this at length in another thread in the context of Waymo and Cruise. A year ago, Waymo's disengagement rate seemed to be once per ~50 miles, but now Waymo is deploying driverless rides and hopefully it has good internal safety metrics to support that decision. 1 disengagement ≠ 1 crash.

Not sure where you are getting the 50 miles number from. According to the California DMV report from last year, Waymo's disengagement rate is around 10,000 miles per disengagement. Even if we say that the DMV number is only critical safety disengagements and the 50 miles number is all disengagements, even minor non safety ones, that is still a huge discrepancy that is hard to reconcile IMO. I don't think the 50 miles number can be accurate. If Waymo's disengagement rate really was that bad, there is no way that they could be doing driverless rides like they are doing.

I think disengagements and interventions are very logical ways to measure the quality of autonomous driving. After all, if the human is not touching the steering wheel or pedals, then logically, it can only mean that the car is driving, which is the very definition of autonomous driving.

But first, I think we need to distinguish between disengagements and interventions.

When an autonomous system is engaged, it means that the autonomous car is responsible for the driving. Any disengagement means that the autonomous car was forced to stop being responsible for the driving. Obviously, disengagements are bad since you want a good autonomous car to be able to be responsible for driving for long periods of time. We don't want it to quit in the middle of a trip! That is why the SAE specifically defines autonomous driving as the sustained performance of all dynamic driving tasks.

Interventions are when the human interacts with the steering wheel or pedals for the purpose of correcting behavior (side note: AP nags don't count as interventions since the purpose is not to correct driving behavior). In theory, I guess you could have an intervention without a disengagement if the system allows the human to make a quick correction without disengaging the system completely. For example, the human jerks the wheel or taps the accelerator pedal but without turning off the autonomous system. Again, any intervention by the driver is a sign of failure because it indicates that the autonomous car was unable to perform the driving tasks correctly. So the fewer the interventions, the better your autonomous car is at performing the driving tasks correctly.

Bottom line: if you have a trip where the autonomous system was engaged the whole time with no disengagements and no driver interventions on the controls, then it can only mean that the car was responsible for driving and successful at performing all the driving tasks for the entire trip, which is without a doubt an indicator of how good your autonomous driving is. Logically, the goal would be to have your autonomous system stay engaged as long as possible and with zero interventions for as long as possible. An autonomous car that can stay engaged for 1000 miles with no interventions is obviously better than an autonomous car that can only stay engaged for 10 miles with no interventions.
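Putting a number on that is just bookkeeping. A toy illustration of the "miles per intervention" metric (the trip log below is invented):

# Toy example of the metric described above: miles per intervention,
# computed from a made-up log of autonomous trips.
trips = [
    {"miles": 120.0, "interventions": 0},
    {"miles":  45.0, "interventions": 1},
    {"miles": 300.0, "interventions": 2},
]

total_miles = sum(t["miles"] for t in trips)
total_interventions = sum(t["interventions"] for t in trips)
print(f"{total_miles / total_interventions:.0f} miles per intervention")

The same arithmetic works for disengagements; the hard part is agreeing on what counts as one, not computing the ratio.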
 

strangecosmos2
With "cyborg driving", you need to determine how you are going to split the job of driving between the human driver and the machine. And, you need a sophisticated driver attention system to monitor the driver in case they get distracted, sleepy or experience a medical emergency. And you need the machine to be the fallback in case the driver is incapable of driving (sleepy or ill) so you still need to have some amount of autonomous driving. If you are going to do all this extra stuff, why not just cut the human driver out of the loop completely and go full autonomous?

The problems of the human-machine interface and driver monitoring may prove much easier to solve than building a fully autonomous car that crashes less often than once every 500,000 miles. Navigate on Autopilot is the first consumer product that could arguably be described as “cyborg driving”.

Elon seems uninterested in driver monitoring cameras. He once tweeted that they are ineffective, but I don't know what his basis for that is. In his first interview with Lex Fridman, he essentially said driver monitoring wasn't worth worrying about because by the end of 2020 it would be a moot point: Teslas would already be fully autonomous and superhuman.

But what if that doesn't end up happening? What if the autonomous driving system is only 25% as safe as a human on its own, but 125% as safe with a human in the loop? “Cyborg driving” is the alternate path for these technologies if full autonomy is out of reach but partial autonomy continues to progress.

Not sure where you are getting the 50 miles number from. According to the California DMV report from last year, Waymo's disengagement rate is around 10,000 miles per disengagement. Even if we say that the DMV number is only critical safety disengagements and the 50 miles number is all disengagements, even minor non safety ones, that is still a huge discrepancy that is hard to reconcile IMO.

The Waymo disengagements thing is explained here. Sometimes a safety driver will take over simply because a Waymo van is staying still for too long. That's a disengagement but it's also not an unsafe vehicle behaviour (at least not obviously unsafe). Examples like these help explain the discrepancy between the anecdotal reports from Waymo riders in Arizona and the California DMV figure. The California DMV is focused on disengagements where failure to disengage would have caused a risk of a collision. This thread has more detail.

90% seems fine, I bet I mess them up more often than that at first glance!

I tried to figure this out once. I think humans stop for red lights roughly 99.9% of the time.
 

diplomat33
The Waymo disengagements thing is explained here. Sometimes a safety driver will take over simply because a Waymo van is staying still for too long. That's a disengagement but it's also not an unsafe vehicle behaviour (at least not obviously unsafe). Examples like these help explain the discrepancy between the anecdotal reports from Waymo riders in Arizona and the California DMV figure. The California DMV is focused on disengagements where failure to disengage would have caused a risk of a collision. This thread has more detail.

Thanks. I know about the different types of disengagements. I am still skeptical of the 50 miles number because, from what I can tell from the other thread, it seems rather approximate and anecdotal. I can't be sure how accurate it is. The DMV disengagement rate is more scientific.

But be that as it may, safety and non-safety disengagements do not carry the same weight in my opinion because safety disengagements are deal breakers whereas non-safety disengagements don't have to be. I am more willing to give an autonomous car a pass on non-safety disengagements than I am with safety disengagements.
 

strangecosmos2
Unfortunately, I think we (the general public) just don't have good data on autonomous vehicle safety or performance. I think the best data we have is what's been leaked out.

The California DMV's metric is a bit unscientific in the sense that different companies are free to count and report disengagements differently. It also gives companies leeway to exclude disengagements based on their own judgment.

Anyway, if we know that the California DMV reports don't count total disengagements and we know that Lex Fridman et al.'s MIT-AVT study does count total disengagements, we shouldn't treat these two numbers as if they're measuring the same thing. They aren't. That was my original point above.

Even if Waymo's total disengagement rate were once per ~11,000 miles, humans crash on average once per ~500,000 miles. So, there is a 45x discrepancy if we just use disengagements as the metric for autonomous vehicle safety. Unless Waymo is being completely subjective or reckless in deploying driverless rides, presumably they have different internal metrics of safety, perhaps using a similar methodology to what leaked out from Cruise. The assumption has to be 1 disengagement ≠ 1 crash and, therefore, there has to be some alternative way of measuring safety. Or else Waymo is extremely premature in doing driverless rides.
 

strangecosmos2
Didn't he call them "tricky disengagements" or something? I thought that was a fun and low-key way of saying disengagements that would have been crashes.

From the pre-print:

“The term “tricky situations” is used in the definitions and throughout this work to describe challenging driving scenarios that require a response or anticipatory action by the driver in order to maintain safe operation of the vehicle.”
Table I on page 9 has more information:

[Table I from the pre-print]


I think tricky disengagements are when either a) there was a machine failure in one of Autopilot's intended use cases within its operational design domain (ODD) or b) the human driver worried there would soon be such a failure or a risk of such a failure.

Non-tricky disengagements are when Autopilot is working as intended and there is no machine failure (or anticipated failure). That would include, for example, disengaging Autopilot when you exit the highway.
 

strangecosmos2
Thx, so is it fair to say that “tricky disengagements” are when the car actually would have crashed, or at least the driver was pretty sure it would have crashed?

As I understand it, the way companies like Waymo and Cruise assess whether disengagements prevented a crash is they re-run the scenario in simulation. I think we would need that kind of analysis with the tricky disengagements to determine what would have happened without the human's intervention. I don't know if a rigorous apples-to-apples comparison is possible with the data that's publicly available right now.

Trent how do you think Voyage made 100x improvements with such a small fleet? Is Oliver Cameron just lying? Every company has been claiming they’re nearly done for like five years now lol.

Oliver seems like an honourable and conscientious person and I don't think he would just lie about such a thing. The 100x figure (for improvement in prediction) is eye-popping and I'm curious how it was achieved. I hope Voyage goes into more detail on this at some point.

Waymo claimed a 100x improvement in pedestrian detection after switching to a deep learning approach. Maybe Voyage switched to a deep learning approach. I dunno. ¯\_(ツ)_/¯
 

strangecosmos2
What would hold them back?

Mainly just the logistics of scaling production of Waymo's custom vehicles, which include Waymo's in-house custom lidar. And negotiating the right business partnerships to get that done. Could take multiple years to get from 1,000 Waymo robotaxis to 1 million. Especially if Waymo wants to go with electric robotaxis, for which the per-mile economics should be much better.
 

strangecosmos2
Tesla already has manufacturing capacity of ~350k units/year of electric vehicles with autonomy hardware ex-lidar. It also has ~200k electric vehicles with autonomy hardware ex-lidar in the field today (plus an additional ~500k that just need a computer upgrade).

Gigafactory China is coming online soon. Model Y production in the U.S. is starting next year. Model 3 production in the U.S. may continue to grow, depending on demand and cap ex. That ~350k/year run rate seems like it will continue to increase.

Except for lidar, Tesla has already ramped production and is continuing to ramp. Waymo and its partners would just be starting.
 

strangecosmos2
Waymo’s partners are FCA, which made 4.4 million vehicles in 2017, and Magna, which makes about 200k vehicles per year.

But the numbers for all-electric vehicles are much smaller. For instance, the Jaguar I-Pace is on track to sell ~3,000 units in the U.S. in 2019.

The per-mile economics of electric robotaxis are expected to be much better than those of gasoline robotaxis due to differences in depreciation, energy costs, and maintenance costs. So, there would be a strong incentive for Waymo to opt for electric robotaxis.

If it didn't, there would be a risk that another company would come in with electric robotaxis and undercut Waymo on price. That means the $50 billion+ investment of building 1 million robotaxis would be at risk.

A limiting factor in building electric vehicles is battery cell and battery pack production. That takes time to ramp. New factories need to be built.

If Waymo has a 10-year technology lead, then of course none of this matters. But if it only has a 1-year or 2-year technology lead, then these considerations matter a lot.
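To illustrate the per-mile point with a back-of-the-envelope calculation: every number below is an assumption of mine chosen purely for illustration, not sourced from anywhere, but it shows how depreciation, energy, and maintenance combine.

# Rough per-mile operating cost for a robotaxi. All inputs are illustrative
# assumptions, not sourced data.
def cost_per_mile(vehicle_cost, lifetime_miles, energy_per_mile, maintenance_per_mile):
    depreciation = vehicle_cost / lifetime_miles
    return depreciation + energy_per_mile + maintenance_per_mile

gasoline = cost_per_mile(vehicle_cost=35_000, lifetime_miles=200_000,
                         energy_per_mile=0.10, maintenance_per_mile=0.07)
electric = cost_per_mile(vehicle_cost=45_000, lifetime_miles=400_000,
                         energy_per_mile=0.04, maintenance_per_mile=0.03)

print(f"gasoline: ~${gasoline:.2f}/mile vs. electric: ~${electric:.2f}/mile")

Swap in whatever numbers you find credible; the structural point is that longer vehicle life plus cheaper energy and maintenance compound in electric's favour on a per-mile basis.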
 

strangecosmos2
Sure, but we don’t know how many the factory could actually build. Either way, it’s not credible to claim that either FCA or Magna are “just starting” in the car business.

As I said, an important limiting factor would be battery cells and battery packs.

Obviously, FCA or Magna aren't just starting in the car business. But Waymo and its partners would only just be starting to ramp production of Waymo's specific vehicle model or models.

Ramping to 350k units/year of custom Waymo I-Paces would probably take several years, for example.

Agreed, electric robotaxis would be best. But gasoline robotaxis would still be insanely profitable. Any company who can make any kind of robotaxi should get off their butts and do it pronto.

The profit off the gas robotaxis could, of course, be immediately rolled into increasing production of (even more profitable) electric robotaxis.

Yeah, you have a good point. Supply of robotaxis might end up being so constrained relative to demand that gasoline robotaxis can co-exist with electric robotaxis for a time, even if they're a worse deal for customers.

But whether this means the first mover has a hard takeoff and corners the market would depend primarily, I think, on how long its technology lead is. 10 years, sure. 1 year, probably not.

Just scaling up the production of high-resolution, long-range, automotive-grade lidar to 100k+ units per year could take several years.

In a scenario where Tesla's robotaxis don't need lidar but Waymo's do, well, then of course Tesla would scale faster even with a technology lag of 1 year+. But that's somewhat trivial to say.

In a scenario where both companies need lidar, the manufacturing lag for lidar could be as long as the technology lag, meaning Tesla would have time to catch up on the technology (mainly, I have in mind machine learning R&D and software development for lidar perception and sensor fusion) while Waymo waits to produce lidars at scale.
 
