
Elon: "Feature complete for full self driving this year"

Awesome! This is really important info! This would seem to indicate that Tesla collected (in some unknown number of snapshots) the requisite data to do imitation learning.

It indicates that they could have collected some unknown amount of this data. But as I've said before, if you want to avoid biasing your model, you need representative data -- this means you need data when nothing "interesting" is happening, as well as when something interesting is happening. If you only have data from triggers your model will be biased to believe that the world is predominantly the conditions that the triggers are based on, whatever that is. Maybe it's a particular geographic area, maybe it's a hard braking event, whatever. This will lead to the model performing poorly in whatever conditions you didn't trigger on.

For example, there may be a trigger that says "driver took over while pedestrian detected". You're going to capture "long tail" events with that one where distracted pedestrians step into the road in front of you. And then you train your model on that, and the model learns that pedestrians typically jump out into the road in front of cars, and so it slams on the brakes whenever it sees a pedestrian.
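To put rough numbers on that, here is a toy sketch of my own (the 0.5% event rate and the trigger probabilities are made up for illustration and have nothing to do with Tesla's actual triggers). It just shows how trigger-only sampling skews the apparent frequency of the rare event:

```python
import random

random.seed(0)

# Toy world: a pedestrian steps into the road in 0.5% of real driving frames.
P_EVENT = 0.005
N_FRAMES = 1_000_000

frames = [random.random() < P_EVENT for _ in range(N_FRAMES)]  # True = pedestrian event

# Representative sampling: keep a uniform 1% of all frames.
representative = [f for f in frames if random.random() < 0.01]

# Trigger-based sampling: keep 50% of pedestrian frames (the trigger fires)
# but only 0.1% of ordinary frames.
triggered = [f for f in frames
             if (f and random.random() < 0.50) or (not f and random.random() < 0.001)]

def event_rate(sample):
    return sum(sample) / len(sample)

print(f"true event rate:       {event_rate(frames):.4f}")          # ~0.005
print(f"representative sample: {event_rate(representative):.4f}")  # ~0.005
print(f"trigger-based sample:  {event_rate(triggered):.4f}")       # ~0.7, wildly skewed
```

A model fit to the trigger-based sample "believes" pedestrians step out the majority of the time rather than 0.5% of the time, which is exactly the slam-on-the-brakes behaviour described above.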

There's also the fact that the triggers they have deployed so far have largely been to support development of current features, mostly highway driving for EAP features. They have already discarded the vast majority of their billions of fleet miles. Yes, they have a resource they can tap for training FSD when they're ready to do that, but they mostly haven't tapped it yet, as far as we know. And when they do tap it, it will be expensive. And you may have noticed that they don't have a lot of cash right now.
 
Yes, they have a resource they can tap for training FSD when they're ready to do that

Thank you! That’s the main point!

if you want to avoid biasing your model, you need representative data -- this means you need data when nothing "interesting" is happening

I believe verygreen has said there were completely random triggers at one time. These would collect uninteresting data.

Maybe jimmy_d could comment on this, but AFAIK you don’t want the training dataset to be completely statistically representative of real driving because then the neural network will always just drive straight. It will regress to the mean.
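A toy illustration of that regress-to-the-mean worry (my own sketch, not anyone's actual training setup; the 98%/2% split and the 30-degree angle are invented for illustration): if the labels are overwhelmingly "steer straight", the constant prediction that minimizes mean-squared error is essentially zero steering, i.e. drive straight.

```python
# 98,000 frames of driving straight (steering angle ~0 degrees)
# and 2,000 frames of a 30-degree turn.
labels = [0.0] * 98_000 + [30.0] * 2_000

# The constant prediction that minimizes mean-squared error is the mean label.
best_constant = sum(labels) / len(labels)
print(best_constant)  # 0.6 degrees, i.e. "always drive (almost) straight"
```

In practice the usual remedy is to re-weight or over-sample the interesting frames, which is part of why a purely representative dump of driving data isn't what you'd feed the network as-is either.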

And when they do tap it, it will be expensive.

Source?

In one of my previous posts in this thread, I calculated that if you wanted to do the most computationally intensive part of AlphaStar’s training once per week, every week, with Google’s Cloud TPUs at regular retail prices, it would cost ~$50 million/quarter. Tesla’s R&D budget is $350+ million/quarter, and Amir Efrati reported that Tesla owns its own GPU clusters that it uses for neural network training, the cash for which has already been spent.

Data is transmitted over wifi, so there are no cellular data charges.

I’m not too familiar with server costs, but it looks like it costs at most about $0.10 per GB to transfer data in and out of Microsoft Azure. If this is accurate, bandwidth costs for uploading 1 GB per day from 1 million vehicles would be $36.5 million/year.

Verygreen, if you were able to provide an estimate of the size of the mid-level representation data, that could be helpful here. IIRC you previously guessed ~10 MB/minute? Average driving time is under 60 minutes per day, so 1 GB per car per day would be an overestimate if ~10 MB/minute is the right figure.

Say you want to store 730 million GB on AWS S3 — the equivalent of 1 GB per day from 1 million cars over 2 years. (Over 20 billion miles of driving.) The price is $0.021 per GB per month. That’s $184 million per year, or $46 million per quarter.

$50 million/quarter for training + $9 million/quarter ($36.5 million / 4 ) for bandwidth + $46 million/quarter for storage = $105 million/quarter

An additional $105 million/quarter in R&D costs could be offset by an additional 23,500 sales of Full Self-Driving per quarter (assuming a 90% gross margin on the $5,000 sale price). This might just happen automatically as Tesla produces more cars. Tesla will also sell more units of FSD if its R&D results in new features. So even a large increase in training, bandwidth, and storage costs seems manageable.
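For anyone who wants to check the arithmetic, here is the whole back-of-envelope in one place. The cloud prices, the 1 GB/car/day volume, and the 90% margin are the assumptions stated above, not confirmed Tesla figures:

```python
# Back-of-envelope using the assumptions discussed above.
cars = 1_000_000
gb_per_car_per_day = 1.0

# Bandwidth: ~$0.10/GB to move data in/out (the Azure retail figure quoted above).
bandwidth_per_year = cars * gb_per_car_per_day * 365 * 0.10          # $36.5M/year
bandwidth_per_quarter = bandwidth_per_year / 4                        # ~$9M/quarter

# Storage: 2 years of uploads kept on S3 at $0.021/GB/month.
stored_gb = cars * gb_per_car_per_day * 365 * 2                       # 730 million GB
storage_per_quarter = stored_gb * 0.021 * 3                           # ~$46M/quarter

# Training: ~$50M/quarter (the AlphaStar-scale Cloud TPU estimate from earlier).
training_per_quarter = 50_000_000

total_per_quarter = bandwidth_per_quarter + storage_per_quarter + training_per_quarter
print(f"total: ${total_per_quarter / 1e6:.0f}M/quarter")              # ~$105M/quarter

# FSD sales needed to offset, at $5,000 with an assumed 90% gross margin.
fsd_units = total_per_quarter / (5_000 * 0.90)
print(f"FSD sales to offset: {fsd_units:,.0f}/quarter")               # ~23,400/quarter
```

That lands within rounding distance of the ~23,500 FSD sales figure above.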

And you may have noticed that they don't have a lot of cash right now.

Q1 2019 will be an aberrant quarter due to the large debt repayment and possibly due to restructuring costs related to closing stores down, but in Q3 and Q4 2018 Tesla generated $800+ million in free cash flow.

P.S.

So Tesla was hiring for this position by at least November 2017. That puts the start of work on a full self-driving simulator at roughly a year before Amir's article, at least.

lunitiks made a thread in September 2017 about job postings at Tesla that seem related to development of a full self-driving simulator: Autopilot simulation!

That would put the start of work on the simulator at 1.5+ years ago.
 
Awesome! This is really important info! This would seem to indicate that Tesla collected (in some unknown number of snapshots) the requisite data to do imitation learning.

This is why I struggle with responding to you in a meaningful manner.

Just a couple of posts up you sound very sceptical of every other autonomous manufacturer — ones with far better known merits than Tesla in this sphere — and then with crumbs of theoretically positive Tesla news (for your thesis, that is) you go, in my books, completely overboard with the excitement and importance...

I don’t know why I’m spelling this out. Maybe in the hopes that you might at least think about it at some point.

Here we have Tesla, who has shown no tangible progress — or NN training in their consumer fleet — in autonomous driving beyond driver’s aids. That is what you have exclamations of awesome/important for... and then every message of yours about those making actually visible progress in this sphere (car-responsible driving, i.e. actual autonomy) is laced with seriously excessive doubt at worst and questionable parallels with Tesla at best.

I do not see a fact-based balance in this approach.
 
(Source: BMW technology: How the carmaker is developing its autonomous driving system)

Sure, BMW says it's doing imitation learning on an engineering fleet of 80 cars, but not on hundreds of thousands of production cars. Similarly, Waymo is doing imitation learning on its hundreds of engineering cars, but not on any production cars.

First, they have more than 80 cars now, and we don't know what BMW is doing with their customer cars; they don't even officially acknowledge gathering and uploading HD map data. If you bought a 2019 BMW, you wouldn't know your car was sending any kind of data.

Second, we don't know the amount of end-to-end imitation learning data needed for driving. We do know that AlphaStar wasn't trained on billions of miles of equivalent continuous driving data.

AlphaStar was trained on 500,000 games, and a typical StarCraft game takes an average of 15 minutes. That translates to 125,000 hours of gameplay. If you translate this to driving, 125,000 hours at an average of 35 mph would generate 4,375,000 miles. So 500k StarCraft games are equivalent to 4,375,000 miles of driving at an average of 35 mph. Case in point: you don't need 'billions of miles'.

A car driving 24 hours a day will cover 840 miles (at an average of 35 mph).
100 cars driving 24 hours would translate to 84,000 miles driven in a day.
100 cars driving 24/7 for a full month (30 days) would generate 2,520,000 miles.
You would need less than 2 months of driving to collect the equivalent of AlphaStar's StarCraft data.
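The same arithmetic as a short script, just so the assumptions (15-minute games, a 35 mph average, a 100-car fleet) are explicit:

```python
games = 500_000
minutes_per_game = 15
avg_mph = 35

gameplay_hours = games * minutes_per_game / 60            # 125,000 hours
equivalent_miles = gameplay_hours * avg_mph               # 4,375,000 miles

fleet_cars = 100
miles_per_car_per_day = 24 * avg_mph                      # 840 miles/day
fleet_miles_per_day = fleet_cars * miles_per_car_per_day  # 84,000 miles/day

days_needed = equivalent_miles / fleet_miles_per_day      # ~52 days, i.e. under 2 months
print(gameplay_hours, equivalent_miles, round(days_needed))
```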

In Mobileye's patent, they outline that:

First, using imitation, an initial policy can be constructed using the “behavior cloning” paradigm, using large real world data sets. In some cases, the resulting agents may be suitable. In other cases, the resulting agents at least form very good initial policies for the other agents on the roads. Second, using self-play, our own policy may be used to augment the training. For example, given an initial implementation of the other agents (cars/pedestrians) that may be experienced, a policy may be trained based on a simulator. Some of the other agents may be replaced with the new policy, and the process may be repeated. As a result, the policy can continue to improve as it should respond to a larger variety of other agents that have differing levels of sophistication.
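In rough, runnable pseudocode, the two-stage loop the patent describes looks something like this. Every function name here is a placeholder of mine, not a Mobileye API, and the "training" steps are trivial stubs standing in for real learning:

```python
# Minimal skeleton of the two-stage process described in the quoted patent text.
import random

def behavior_clone(dataset):
    """Stage 1: fit an initial policy to logged human (state, action) pairs."""
    # Placeholder 'policy': just remembers the average logged action.
    avg_action = sum(a for _, a in dataset) / len(dataset)
    return {"avg_action": avg_action, "generation": 0}

def train_in_simulator(policy, other_agents):
    """Stage 2: improve the policy in simulation against the other agents."""
    # Placeholder update: bump a counter to represent another round of training.
    return {**policy, "generation": policy["generation"] + 1}

def replace_some_agents(other_agents, policy, fraction=0.5):
    """Swap a fraction of the simulated agents for copies of the latest policy."""
    n_replace = int(len(other_agents) * fraction)
    return [dict(policy) for _ in range(n_replace)] + other_agents[n_replace:]

# Logged (state, action) pairs would come from real driving; dummy numbers here.
dataset = [(s, random.gauss(0.0, 1.0)) for s in range(1000)]

policy = behavior_clone(dataset)                  # behaviour cloning on real data
other_agents = [dict(policy) for _ in range(10)]  # initial other agents

for _ in range(5):                                # self-play rounds
    policy = train_in_simulator(policy, other_agents)
    other_agents = replace_some_agents(other_agents, policy)

print(policy)
```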
Tesla also collects some raw sensor data from production cars, as Karpathy has discussed at length and as verygreen's hacking has confirmed. The true scale of collection is hard to know.

We don't have insight into the scale of Tesla's data collecting ...

We do: we already estimated, based on @verygreen's research, that only about ~0.1% of driving data is actually uploaded.

I think navigation and driving are different problems. I believe navigation and routing are handled by a traditional GPS navigation system like Google Maps or TomTom or whatever. I believe supervised imitation learning only comes in at the level of discrete driving tasks, like taking a right turn at an intersection or taking an exit off a highway.

Awesome! This is really important info! This would seem to indicate that Tesla collected (in some unknown number of snapshots) the requisite data to do imitation learning.

So you are changing your thesis then? Because this is not an AlphaStar approach. Uploading ~0.1% of steering/pedal actions isn't an AlphaStar approach. An AlphaStar approach would be uploading every single mile driven, start to end. That is the only way to create an agent that drives exactly like a human.

You said it yourself:

"Tesla will use billions of steering/pedal inputs to create a driving agent that's just as good as a human driver". (Paraphrasing)​

The reason Waymo hasn't launched isn't some long tail that occurs every "1 million miles". If that were the case, their safety and comfort disengagement rate would be 1 in 1 million miles. No, they haven't launched because their brittle system can't handle situations like merging and lane changing in dense traffic, unprotected left turns, or the crazy ways people drive. Most people don't follow the rules, not even in their own lane; they weave about, hover over both sides of the lane, and drive like they are drunk.

But what's stopping them from launching isn't a kid riding a bike with a stop sign. Take, for example, this clip. Notice how the car freaks out when the truck driver weaves about in its lane. This is what's stopping them from launching. This is why they need to train against an adversarial agent that drives exactly like a human.

Here's a recent comment from a Waymo One rider.

Handling other humans. They work great when people follow rules, but one idiot doing illegal things or completely unexplained decisions requires human intervention.

So no, Tesla siphoning ~0.1% of data isn't an AlphaStar approach, and is nothing like AlphaStar at all.
 
We do, we already estimated based on @verygreen research that only about ~0.1% of data driven are actually uploaded.
this is an oversimplification, but 0.1% is also a great exaggeration.

Tesla has greatly curtailed data collection, so that many triggers have only a 0.1-0.2% chance of being collected even when their conditions are met.

Granted, those would still be mostly events that are at least somewhat interesting. It's probably not super sensible to collect regular driving data from customer cars - you can just pay some money to a dedicated driver and get a better product (additional inline annotations).
 
The problem with this is, if this is true then @strangecosmos entire thesis falls apart.
Well, the outliers are still valuable, if your perception system is good enough to recognize them and you then collect all of them (something that does not happen).

What good is your 0.01% probability trigger for something that happens once a year to a single car in your fleet? So the argument would just shift to "in the future it's still a valuable capability", I suspect.
 
@strangecosmos entire thesis falls apart.

That thesis fell apart from the start because it is based almost entirely on conjecture, cherry-picking, and wishful thinking. @strangecosmos has only ever succeeded in demonstrating that there is some possibility of success in the future. But Occam's Razor says that you should really look at the reliable information we have -- i.e., past and present performance of the vehicles themselves, and the changes to the official product descriptions -- and draw the simplest conclusion which is supported by the evidence, rather than clinging to whichever possibility most appeals to you, no matter how far-fetched it is in light of hard evidence.

Let me just go on record as saying that yes, there is a possible future in which Tesla delivers an L4-capable autonomy system with the current sensor suite, setting aside regulatory and legal concerns. (The lack of redundancy in various systems likely means this will never actually pass the liability hurdles and so Tesla will never allow it, but that's a separate issue.) Just because there's a possible future does not make it likely or worth betting on with your money.
 
hw2.5 adds a bunch of redundancy of course.

And hw3 additionally adds redundancy at the internal level.

You still have only one forward-facing radar, and many key areas are only within view of a single camera. I don't know how much redundancy they have in power delivery, transmission, and actuation either, nor how much fault tolerance in those systems. If the compute system goes down or haywire, do the actuators enter a fail-safe state, or do they go haywire? Is there any kind of monitoring system to ensure correct operation of every component? These are the things you need to consider if you're going to take liability away from the human. Do they truly have enough compute power available to bring the car to a safe stop if the HW3 chip stops working? Do they even have any way to know that it's not working anymore? What kind of "internal redundancy" does the HW3 chip itself have?
 
Well, the outliers are still valuable, if your perception system is good enough to recognize them and you then collect all of them (something that does not happen).

What good is your 0.01% probability trigger for something that happens once a year to a single car in your fleet? So the argument would just shift to "in the future it's still a valuable capability", I suspect.

But this is not the @strangecosmos thesis. The thesis is that autonomous driving is enabled by training NNs on millions or billions of miles of state-action-pairs that only Tesla is currently positioned to collect — so the thesis goes.

The problem is: Tesla is not. And even if they were positioned to in the future, it is unlikely they will. Training NNs will be based on, and solved, elsewhere than in the consumer car.
 
There is very little in terms of sensor redundancy, as well as known blind spots for the cameras at low speeds.

There is neither sensor redundancy nor sensor diversity. Basically if any single sensor fails it is game over -- and this includes the sensor working nominally (sending camera frames) but the software fails to detect objects in the frames due to a flaw in the ML model (a problem which could be fixed by sensor diversity -- i.e., multiple sensing modalities, which they only have in front of the vehicle via radar and camera.)
 
You still have only one forward-facing radar, and many key areas are only within view of a single camera. I don't know how much redundancy they have in power delivery, transmission, and actuation either, nor how much fault tolerance in those systems. If the compute system goes down or haywire, do the actuators enter a fail-safe state, or do they go haywire? Is there any kind of monitoring system to ensure correct operation of every component? These are the things you need to consider if you're going to take liability away from the human. Do they truly have enough compute power available to bring the car to a safe stop if the HW3 chip stops working? Do they even have any way to know that it's not working anymore? What kind of "internal redundancy" does the HW3 chip itself have?

Tesla seems to be relegating radar to a backup role. The forward view is the main concern in a fault mode, and there are 3 cameras looking there. HW3 has fully parallel processing. Current cars have multiple single points of failure in which the only course of action is stopping.

There is neither sensor redundancy nor sensor diversity. Basically if any single sensor fails it is game over -- and this includes the sensor working nominally (sending camera frames) but the software fails to detect objects in the frames due to a flaw in the ML model (a problem which could be fixed by sensor diversity -- i.e., multiple sensing modalities, which they only have in front of the vehicle via radar and camera.)
In almost all cases, the car can move to the side of the road with a sensor missing. If redundancy is required, then that is what it would need to do anyway (redundancy lost).

Of course, if the NN fails to detect an object in the path of the car, that is a failure, but if it can't detect the object during the time it is in view with the car moving, having extra cameras likely wouldn't help.
 
many key areas are only within view of a single camera
that's ok, when a camera failure is detected, the car stops and needs to be fixed before the functionality is allowed again. They already do it.

if you have a failure of a pillar/side cam, they display an "autopilot functionality reduced" message and only use the three forward cams. If any of those three is dead, you cannot use AP at all.

Do they truly have enough compute power available to bring the car to a safe stop if the HW3 chip stops working
on hw2.5 the steering actuator has two paths. hw3 consists of two compute nodes, one monitoring the other. if the primary fails, the secondary is identical and is able to take over.
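For illustration, that primary/secondary monitoring is the classic heartbeat/watchdog failover pattern. This sketch is generic and assumes nothing about Tesla's actual firmware, timings, or interfaces:

```python
# Generic heartbeat/failover pattern -- an illustration only, not Tesla's design.
import time

class ComputeNode:
    def __init__(self, name):
        self.name = name
        self.last_heartbeat = time.monotonic()

    def heartbeat(self):
        self.last_heartbeat = time.monotonic()

def active_node(primary, secondary, timeout=0.1):
    """The secondary takes over if the primary misses its heartbeat deadline."""
    if time.monotonic() - primary.last_heartbeat > timeout:
        return secondary  # failover: an identical node takes over
    return primary

primary, secondary = ComputeNode("A"), ComputeNode("B")
primary.heartbeat()
time.sleep(0.2)                               # primary goes silent past the timeout
print(active_node(primary, secondary).name)   # "B": the secondary has taken over
```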

What kind of "internal redundancy" does the HW3 chip itself have?
what do you mean by "hw3 chip"? they have four TRIP chips, two per compute node.

The thesis is that autonomous driving is enabled by training NNs on millions or billions of miles of state-action-pairs that only Tesla is currently positioned to collect — so the thesis goes.
the bolded part (I bolded it) is not wrong. Tesla is positioned very well to collect this data. Not so sure about the actual usefulness of it, but collection could be done just fine.