The next big milestone for FSD is version 11. It is a significant upgrade, with fundamental changes to several parts of the FSD stack, including a totally new way to train the perception NN.

From AI Day and the Lex Fridman interview we have a good sense of what might be included.

- Object permanence both temporal and spatial
- Moving from “bag of points” to objects in NN
- Creating a 3D vector representation of the environment all in NN
- Planner optimization using NN / Monte Carlo Tree Search (MCTS) (see the sketch after this list)
- Change from processed images to “photon count” / raw image
- Change from single image perception to surround video
- Merging of city, highway and parking lot stacks a.k.a. Single Stack
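
On the planner / MCTS bullet above, here is a minimal, generic sketch of how a Monte Carlo Tree Search loop picks a maneuver. To be clear, this is not Tesla's planner - the toy LaneState, step() transition and rewards are invented purely to illustrate the select / expand / rollout / backpropagate cycle the term refers to.

```python
# Minimal, generic MCTS sketch over a toy lane-change decision.
# NOT Tesla's planner: LaneState, step() and the rewards are invented
# purely to illustrate the select / expand / rollout / backpropagate loop.
import math
import random
from dataclasses import dataclass, field

ACTIONS = ["keep_lane", "change_left", "change_right"]

@dataclass(frozen=True)
class LaneState:
    lane: int            # 0..2 on a toy three-lane road
    blocked_ahead: bool  # is our current lane blocked?

def step(state, action):
    """Toy transition + reward: staying behind a blockage is penalized."""
    lane = state.lane
    if action == "change_left":
        lane = max(0, lane - 1)
    elif action == "change_right":
        lane = min(2, lane + 1)
    still_blocked = state.blocked_ahead and lane == state.lane
    reward = -1.0 if still_blocked else 1.0
    return LaneState(lane, still_blocked), reward

@dataclass
class Node:
    state: LaneState
    visits: int = 0
    value: float = 0.0
    children: dict = field(default_factory=dict)  # action -> Node

def ucb(parent, child, c=1.4):
    # Upper-confidence bound trading off exploration vs. exploitation.
    if child.visits == 0:
        return float("inf")
    return child.value / child.visits + c * math.sqrt(math.log(parent.visits) / child.visits)

def rollout(state, depth=5):
    # Random playout to roughly value a leaf state.
    total = 0.0
    for _ in range(depth):
        state, r = step(state, random.choice(ACTIONS))
        total += r
    return total

def mcts(root_state, iterations=500):
    root = Node(root_state)
    for _ in range(iterations):
        node, path = root, [root]
        # Selection: descend while every action has already been tried.
        while len(node.children) == len(ACTIONS):
            action = max(ACTIONS, key=lambda a: ucb(node, node.children[a]))
            node = node.children[action]
            path.append(node)
        # Expansion: try one untried action.
        action = random.choice([a for a in ACTIONS if a not in node.children])
        next_state, r = step(node.state, action)
        child = Node(next_state)
        node.children[action] = child
        path.append(child)
        # Simulation + backpropagation.
        value = r + rollout(next_state)
        for n in path:
            n.visits += 1
            n.value += value
    # Pick the most-visited action at the root.
    return max(root.children, key=lambda a: root.children[a].visits)

print(mcts(LaneState(lane=1, blocked_ahead=True)))  # usually a lane change
```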

Lex Fridman's interview of Elon, starting with the FSD-related topics.


Here is a detailed explanation of Beta 11 in "layman's language" by James Douma, in an interview done after the Lex podcast.


Here is the AI Day explanation, in 4 parts.




Here is a useful blog post asking Tesla a few questions about AI Day. The useful part is the comparison of Tesla's methods with Waymo's and others' (detailed papers linked).

 
Overfit problem. A fleet above 10k-50k vehicles is pretty useless, as 99.99...% of the data is not used.
Overfitting has nothing to do with a large fleet - just the opposite. With a larger fleet you start getting much more varied data.

After all, there are more than 1,000 cities the world over - you don't want just 50 samples on average from each city.

For Tesla, I guess the large fleet lets them collect the exact type of data they want more quickly, for now - even though they are not really chasing the long tail yet.

Fascinating - what data would they NOT want to include? If it came from a Tesla vehicle with undamaged cameras you would think it would be valid, perhaps not pertinent (say, just driving down an empty highway), but there is no reason not to include it now that you have Dojo to devour all available input.
First of all, they don't have Dojo. It is still in development. Maybe they have a machine or two - not the data center full of Dojo hardware that is needed.

Second, it's not just about training compute. There is no point in having 1 million samples of straight lanes; you want hundreds of thousands of samples of all types of lanes. Moreover, you want to oversample edge cases. You also have to label the data - there is no point labeling a million videos of driving straight on an empty road, for example.
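
To make the oversampling point concrete, here's a minimal sketch of what down-weighting common clips and keeping rare ones could look like. The categories and keep-probabilities are made up - this is not Tesla's actual curation pipeline, just an illustration of the idea.

```python
# Hedged sketch: down-sampling common clips while keeping rare ones.
# The categories and keep-probabilities are made up; this is not Tesla's
# actual data-curation pipeline.
import random

KEEP_PROB = {
    "straight_empty_highway": 0.01,  # extremely common, keep ~1%
    "ordinary_intersection":  0.20,
    "construction_zone":      1.00,  # rarer, keep everything
    "pedestrian_on_highway":  1.00,  # edge case, keep everything
}

def curate(clips):
    """clips: iterable of (clip_id, category) -> curated subset worth labeling."""
    return [(cid, cat) for cid, cat in clips
            if random.random() < KEEP_PROB.get(cat, 1.0)]

# Toy fleet upload: overwhelmingly boring highway driving plus 50 rare clips.
clips = [(i, "straight_empty_highway") for i in range(100_000)]
clips += [(100_000 + i, "pedestrian_on_highway") for i in range(50)]
print(len(curate(clips)))  # roughly 1,050: ~1,000 highway clips + all 50 rare ones
```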
 
With a larger fleet you start getting much more varied data
Just yesterday, I pressed the FSD Beta video snapshot button because I've never seen somebody suddenly drive off an interstate interchange into the dirt, step out, and take a smoke while waiting for traffic to clear. I'm not sure if there's a more formal definition for the frequency of these varied "findings," but if we say a situation like this happens once every 600k miles of travel, would that be something like a 600k Miles-Traveled-Between-Findings figure? If we say the FSD Beta fleet is 20k vehicles traveling an average of 30 miles each day, then on average one finding of this type could be detected every day. This particular scenario might be relevant for FSD Beta 11 behaving correctly around vulnerable road users on highways/interstates.

Are there standard thresholds of these MTBF, and if we don't think Tesla is chasing the long tail, are they then looking for examples of things under say 6k MTBF for ~100 examples/day from FSD Beta fleet? Of course increasing the "finding" fleet from just FSD Beta testers to say most Tesla vehicles in the US probably over 1 million, then even that original 600k MTBF example would find ~50/day.

With all the training needed for FSD Beta 11, if they are looking for <6k MTBF type videos, then presumably yes they likely have more than enough data to process, but if it's more of the >600k MTBF situations, then there could be a data collection bottleneck.
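
Writing that back-of-envelope math out explicitly (the fleet size and per-vehicle mileage are just my assumptions above, not Tesla figures):

```python
# The back-of-envelope math above, written out. All inputs are my
# assumptions (fleet size, miles/day), not Tesla figures.
def findings_per_day(fleet_size, miles_per_vehicle_per_day, miles_between_findings):
    fleet_miles_per_day = fleet_size * miles_per_vehicle_per_day
    return fleet_miles_per_day / miles_between_findings

print(findings_per_day(20_000, 30, 600_000))     # 1.0/day for a 1-in-600k-mile event
print(findings_per_day(20_000, 30, 6_000))       # 100/day for a 1-in-6k-mile event
print(findings_per_day(1_000_000, 30, 600_000))  # 50/day with ~1M US vehicles reporting
```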
 
Are there standard thresholds of these MTBF, and if we don't think Tesla is chasing the long tail, are they then looking for examples of things under say 6k MTBF for ~100 examples/day from FSD Beta fleet? Of course increasing the "finding" fleet from just FSD Beta testers to say most Tesla vehicles in the US probably over 1 million, then even that original 600k MTBF example would find ~50/day.

With all the training needed for FSD Beta 11, if they are looking for <6k MTBF type videos, then presumably yes they likely have more than enough data to process, but if it's more of the >600k MTBF situations, then there could be a data collection bottleneck.
There are a couple of ways they might be attacking the problem.

One is to take a case - say traffic lights - and throw every possible edge case at it to make it very good. Apparently this is what they tried (according to one Karpathy talk) - they even found a "blue" light, for example.

The other option is to take common cases and train. Then as you expand, keep adding "edge" (and not so edge) cases. This is possibly what they are doing with a lot of FSD Beta stuff.

Regarding MTBF, I've posted this earlier. But depending on your definition of failure, the human average is about 10k miles for the mildest failures (like scraping a tire) up to 300k miles for serious accidents. Then you have to decide how much better you want AVs to be ... 2x or 10x?

So we are talking about an MTBF of 20k miles for the simplest failures (2x better than humans) up to 3 million miles for serious accidents (10x better than humans).

I'd estimate the MTBF of FSD Beta now to be between 10 and 100 miles. So they have a long way to go.
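
Working those targets through explicitly (again, the human baselines and multipliers are my rough estimates, not measured data):

```python
# Working the targets above through. The human baselines and multipliers are
# my rough estimates, not measured data.
human_mtbf_miles = {
    "mild failure (e.g. scraping a tire)": 10_000,
    "serious accident": 300_000,
}
target_multiplier = {
    "mild failure (e.g. scraping a tire)": 2,   # 2x better than humans
    "serious accident": 10,                     # 10x better than humans
}
for failure, human_mtbf in human_mtbf_miles.items():
    print(f"{failure}: target AV MTBF = {human_mtbf * target_multiplier[failure]:,} miles")
# mild failure (e.g. scraping a tire): target AV MTBF = 20,000 miles
# serious accident: target AV MTBF = 3,000,000 miles
# ...versus an estimated 10-100 miles for FSD Beta today.
```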
 
Elon on the earnings call now
He also clarified (reiterating what he suggested at Autonomy Day 2019):

It's going to be way cheaper to go point to point with a robotaxi or an autonomous Tesla, which is every car we've made in the past 3-4 years will be capable of that, than a bus or subway.

3 years does match up with when FSD Computer was installed by default.
 
Elon on the earnings call now -

- Will be shocked if FSD Beta doesn't match human driving (failure rate) this year
- Lot of important changes in the stack coming next few months

He also said in April 2019 "Towards the end of this year, I'd be SHOCKED if not next year, at the latest.. that having the driver intervene will decrease safety! DECREASE!!"
 
He also said in April 2019 "Towards the end of this year, I'd be SHOCKED if not next year, at the latest.. that having the driver intervene will decrease safety! DECREASE!!"
Not sure why journalists don't ask him about his earlier comments.

I mean, what is the point of asking "when will FSD happen"? We all know his answer (depending on the month, either "end of this year" or "next year"). I think they are all just looking for a click-bait comment.

What I would like them to ask is: "You have been saying you will achieve FSD soon for several years now - how come you get it wrong every year?"
 
Not sure why journalists don't ask him about his earlier comments.

I mean, what is the point of asking "when will FSD happen"? We all know his answer (depending on the month, either "end of this year" or "next year"). I think they are all just looking for a click-bait comment.

What I would like them to ask is: "You have been saying you will achieve FSD soon for several years now - how come you get it wrong every year?"
He has kind of already answered this question though by admitting that he didn't realize how hard the FSD problem was to solve. The better question is: "Why do you keep setting and communicating such aggressive, possibly unachievable targets for yourself?" And I think we all know the answer to that.

The reality is I have yet to see an FSD feature that is better than Tesla's in REAL WORLD SCENARIOS. So I will just have faith that he will keep setting aggressive timelines for himself and his team, because by nature he is an optimist and a visionary. He will be wrong 90% of the time, but it only takes one time to be right. And when he is right, and right first, no one will remember all the times he was wrong.
 
Elon on the earnings call now -

- Will be shocked if FSD Beta doesn't match human driving (failure rate) this year
- Lot of important changes in the stack coming next few months
One question I have about FSD ‘failure rates:’ how are they calculating them?

Currently FSD is very much in beta: you have to constantly pay attention and take over on a regular basis. What's more, most people I know will opt not to use it in difficult situations because they simply don't trust it. Given that, what's the true 'failure rate'? Sure, it's next to perfect just driving on the interstate, but so are humans. If you're comparing FSD failures in easy driving with human failures across all situations, then you're using faulty techniques and the numbers are meaningless. Beyond that, I'd argue that touting the numbers is deception at best, potentially fraud at worst.
 
One question I have about FSD ‘failure rates:’ how are they calculating them?

Currently FSD is very much in beta: you have to constantly pay attention and take over on a regular basis. What's more, most people I know will opt not to use it in difficult situations because they simply don't trust it. Given that, what's the true 'failure rate'? Sure, it's next to perfect just driving on the interstate, but so are humans. If you're comparing FSD failures in easy driving with human failures across all situations, then you're using faulty techniques and the numbers are meaningless. Beyond that, I'd argue that touting the numbers is deception at best, potentially fraud at worst.
Calculating FSD failure rate from disengagements would be the most genuine way to do it, but knowing how Elon does it for the Autopilot safety report, they just look at crashes while AP was engaged.

So I’m sure, they’ll use some metric like that to say, “see, FSD Beta hasn’t caused an accident in over a million driven miles…oh but don’t worry about those disengagements every other mile where the driver saved the car from swerving and hitting an oncoming car…it’s a driver’s assist feature”
 
Calculating FSD failure rate from disengagements would be the most genuine way to do it, but knowing how Elon does it for the Autopilot safety report, they just look at crashes while AP was engaged.
Agreed, and even that would only tell you how often the driver felt the need to take over in situations where they felt secure enough to engage FSD in the first place.

I’d be willing to bet that that metric will actually remain somewhat constant as FSD improves simply because as it improves, drivers will use it in more difficult situations where they previously wouldn’t have used it.
So I’m sure, they’ll use some metric like that to say, “see, FSD Beta hasn’t caused an accident in over a million driven miles…oh but don’t worry about those disengagements every other mile where the driver saved the car from swerving and hitting an oncoming car…it’s a driver’s assist feature
Yeah, that last part is in a footnote at the end of a 100-page report in 3-point font!
 
Calculating FSD failure rate from disengagements would be the most genuine way to do it, but knowing how Elon does it for the Autopilot safety report, they just look at crashes while AP was engaged.

So I’m sure, they’ll use some metric like that to say, “see, FSD Beta hasn’t caused an accident in over a million driven miles…oh but don’t worry about those disengagements every other mile where the driver saved the car from swerving and hitting an oncoming car…it’s a driver’s assist feature”
He said they'll do it by analyzing the disengagements.
  1. Take a billion miles of disengagement and collision data
  2. simulate disengagement counterfactuals to see if there would have been a collision and how severe
  3. compare to human performance
  4. robotaxis!

"We've also got to make it work and then demonstrate that if the reliability is significantly in excess of the average human driver or to be allowed... um... you know for before people to be able to use it without... uh... paying attention to the road... um... but i think we have a massive fleet so it will be I think... uh... straightforward to make the argument on statistical grounds just based on the number of interventions, you know, or especially in events that would result in a crash. At scale we think we'll have billions of miles of travel to be able to show that it is, you know, the safety of the car with the autopilot on is a 100 percent or 200 percent or more safer than the average human driver"
 
Calculating FSD failure rate from disengagements would be the most genuine way to do it, but knowing how Elon does it for the Autopilot safety report, they just look at crashes while AP was engaged.

So I’m sure, they’ll use some metric like that to say, “see, FSD Beta hasn’t caused an accident in over a million driven miles…oh but don’t worry about those disengagements every other mile where the driver saved the car from swerving and hitting an oncoming car…it’s a driver’s assist feature”
They are using disengagements. That is what they are tracking - Elon has said that multiple times.

A more conservative approach would be to use all interventions. A more optimistic approach would be to take only disengagements that would have resulted in accidents - but you can't easily do that when you have 60,000 testers. Only companies with tiny fleets, like the AV companies, can analyze every disengagement.
 
He said they'll do it by analyzing the disengagements.
  1. Take a billion miles of disengagement and collision data
  2. simulate disengagement counterfactuals to see if there would have been a collision and how severe
  3. compare to human performance
  4. robotaxis!

"We've also got to make it work and then demonstrate that if the reliability is significantly in excess of the average human driver or to be allowed... um... you know for before people to be able to use it without... uh... paying attention to the road... um... but i think we have a massive fleet so it will be I think... uh... straightforward to make the argument on statistical grounds just based on the number of interventions, you know, or especially in events that would result in a crash. At scale we think we'll have billions of miles of travel to be able to show that it is, you know, the safety of the car with the autopilot on is a 100 percent or 200 percent or more safer than the average human driver"
With all those ums and uhs, you really think that’s a finalized plan? 😂

But yes, I'm sure they will be tracking interventions and disengagements internally. That doesn't mean they'll ever make it public. Look at the Autopilot safety report - it obfuscates any useful info like interventions or disengagements.
 
With all those ums and uhs, you really think that’s a finalized plan? 😂

But yes, I'm sure they will be tracking interventions and disengagements internally. That doesn't mean they'll ever make it public. Look at the Autopilot safety report - it obfuscates any useful info like interventions or disengagements.
He's said it a few other times too. I agree it doesn't seem like he's given it that much thought, but he's got the basic idea.
Disengagements are irrelevant to Autopilot safety. If the error rate determined the safety of a system that requires human monitoring, FSD Beta testing would be a blood bath.
 
They are using disengagements. That is what they are tracking - Elon has said that multiple times.

A more conservative approach would be to use all interventions. A more optimistic approach would be to take only disengagements that would have resulted in accidents - but you can't easily do that when you have 60,000 testers. Only companies with tiny fleets, like the AV companies, can analyze every disengagement.
If you don't analyze every disengagement, how will you know whether or not the disengagement prevented a collision?

EDIT: Actually, I suppose Tesla could analyze a random sample of disengagements, but the overall number of disengagements analyzed to prove safety seems like it would be the same.
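
For what it's worth, a minimal sketch of what "analyze a random sample" could mean statistically - review a sample by hand or in simulation, estimate the fraction of disengagements that prevented a crash, and put a simple binomial confidence interval on it (all numbers made up):

```python
# Hedged sketch of the random-sample idea: hand- or sim-review a sample of
# disengagements, estimate the fraction that prevented a crash, and attach a
# simple binomial confidence interval. All numbers are made up.
import math
import random

def sample_and_estimate(all_ids, reviewed_label, sample_size=1_000, z=1.96):
    """reviewed_label: id -> True if review says the takeover prevented a crash."""
    sample = random.sample(all_ids, sample_size)
    p_hat = sum(reviewed_label(i) for i in sample) / sample_size
    margin = z * math.sqrt(p_hat * (1 - p_hat) / sample_size)
    return p_hat, (max(0.0, p_hat - margin), min(1.0, p_hat + margin))

# Toy data: a million disengagements, of which 2% actually prevented a crash.
ids = list(range(1_000_000))
p, ci = sample_and_estimate(ids, lambda i: i % 50 == 0)
print(p, ci)  # ~0.02, with a fairly tight interval from only 1,000 reviews
```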
 
If you don't analyze every disengagement, how will you know whether or not the disengagement prevented a collision?
You could come up with some automation to estimate prevented accidents - but that would just be an estimate with no easy way to establish confidence intervals.

Easy thing to do is to just use overall disengagements. That is what I'd expect them to use and that is what Elon says they use. Everything else is speculation.
 
You could come up with some automation to estimate prevented accidents - but that would just be an estimate with no easy way to establish confidence intervals.

Easy thing to do is to just use overall disengagements. That is what I'd expect them to use and that is what Elon says they use. Everything else is speculation.
How do you prove the relationship between overall disengagements and safety? I think there's a limit on how low the disengagement rate can go if you tell the safety driver that they are responsible for avoiding a collision.