even if FSD beta were better than Waymo in this one case (highly doubtful), it would still be behind Waymo in 99.9999% of other cases… I've argued that Waymo should license a driver assist system based on the Waymo Driver.
Yeah, seems like it could be a pretty good driver assist system, with other benefits for Waymo too. Glad we can agree that different levels can have better/worse performance in various aspects and that the same underlying technology can be used for multiple systems with different design intent. Hopefully we'll all get 12.x soon so that we'll all have a better sense of whether Tesla is really behind in 99.9999% of cases, as I have a feeling end-to-end avoids all sorts of potential heuristic errors across perception, prediction, planning, and control (although it introduces a lot of other potential errors too).
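To make the modular-vs-end-to-end point concrete, here's a toy sketch (my own illustration, not either company's actual code; every function here is made up). A modular stack has hand-coded seams between stages where heuristic bugs can hide, while an end-to-end net is one learned mapping with no seams, but also no per-stage outputs to inspect when it misbehaves:

```python
# Illustrative only -- all functions hypothetical.

def perceive(frames):
    # detect objects: position + velocity per object
    return [{"pos": (0.0, 0.0), "vel": (5.0, 0.0)} for _ in frames]

def predict(objects):
    # heuristic rollout -- a sign error here corrupts everything downstream
    return [(o["pos"][0] + o["vel"][0], o["pos"][1] + o["vel"][1]) for o in objects]

def plan_and_control(futures):
    return {"steer": 0.0, "accel": 0.1}

def modular_drive(frames):
    return plan_and_control(predict(perceive(frames)))  # fixed hand-off points

def end_to_end_drive(frames, net):
    return net(frames)  # one learned function: pixels -> controls

print(modular_drive(["frame0"]))
```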
 
The prediction stack could have incorrectly determined the trajectory due to a sign error.
Sounds like Waymo heuristics predicted the angled pickup truck to move "forwards" in the direction it was facing, into the center turn lane, instead of forwards in the direction of the tow truck. I would expect 11.x perception to have plenty of training examples of vehicles being towed to correctly predict their motion, and hopefully end-to-end can carry over that understanding to safely and smoothly drive around all sorts of vehicles even when "improperly" towed.
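To show what a single sign error can do here, a toy constant-velocity rollout (my own sketch, not Waymo's code; all numbers invented): the angled pickup gets forecast toward its nose, into the center turn lane, instead of tail-first in the tow truck's direction of travel.

```python
import math

def rollout(x, y, heading_deg, speed, t, sign):
    """Constant-velocity forecast along the vehicle's heading.
    sign=-1 models a vehicle dragged tail-first by a tow truck."""
    h = math.radians(heading_deg)
    return (x + sign * speed * t * math.cos(h),
            y + sign * speed * t * math.sin(h))

# Pickup angled ~30 degrees toward the center turn lane, towed backwards at 5 m/s:
actual = rollout(0, 0, 30, 5.0, 2.0, sign=-1)  # moves with the tow truck
buggy  = rollout(0, 0, 30, 5.0, 2.0, sign=+1)  # forecast drifts into the turn lane
print(actual, buggy)
```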
 
This is why I say that Tesla fans need to pick a lane. Either say that the FSD beta cannot do the entire DDT and therefore is L2. Or say that it can do the entire DDT but still needs driver supervision and therefore is supervised L4. But it can't be both. It can't do the entire DDT (L4) but still be L2 because it requires driver supervision. That is not correct.
Again I think it's important to distinguish between capability and certification/declaration. Given the complex nature of NNs in particular, you cannot suddenly determine that the car is magically L4 simply because someone in the software or marketing department decides that (yeah, yeah, Elon). You might have a design which you claim is L4 on paper, but there must be a period during which this claim undergoes rigorous testing. And during that time, what level is the car? It's not L4, because it's not been declared as such (or certified depending on the legal framework in a jurisdiction). So during this period the car needs supervision by a human. Which, in a certain sense, makes it quasi-L2 (actually, real L2 if I read the SAE stuff right). This is where a lot of these discussions get muddled imho.

The point is, at some point such a system, if all goes well, is declared L4 .. but that declaration is not accompanied by any actual change in the software or hardware stack. What changes isn't the technology, but the certification that the testing was satisfactory. To put it colloquially, the car has passed its driving test (and yes, "driving" is the correct word here).
 
So it seems the real issue was people screaming SEMANTICS while not being interested in the very thing you define semantics as :)
With that I agree .. and yes, some of the posts were using "it's just semantics" in the colloquially dismissive sense (which bugs me as much as people who say "its just a theory"). I wasn't using it like that .. I use it to mean exactly what the dictionary says.
 
Sounds like Waymo heuristics predicted the angled pickup truck to move "forwards" in the direction it was facing, into the center turn lane, instead of forwards in the direction of the tow truck. I would expect 11.x perception to have plenty of training examples of vehicles being towed to correctly predict their motion, and hopefully end-to-end can carry over that understanding to safely and smoothly drive around all sorts of vehicles even when "improperly" towed.

Just to be clear, Waymo uses deep neural networks for prediction, not heuristics. Waymo uses zero heuristics for prediction. Specifically, they use transformers to encode scene information (traffic lights, road geometry and objects) and then do multi-agent forecasting.
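In rough sketch form, that kind of architecture looks something like this (a toy PyTorch example of my own, not Waymo's actual code; all dimensions and names are invented):

```python
import torch
import torch.nn as nn

class MultiAgentForecaster(nn.Module):
    """Toy model: encode per-agent scene features with a transformer so agents
    attend to each other, then regress future (x, y) waypoints per agent."""
    def __init__(self, feat_dim=16, d_model=64, n_heads=4, n_layers=2, horizon=8):
        super().__init__()
        self.embed = nn.Linear(feat_dim, d_model)      # agent features -> tokens
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, horizon * 2)    # horizon (x, y) offsets
        self.horizon = horizon

    def forward(self, agent_feats):
        # agent_feats: (batch, num_agents, feat_dim); features would encode pose
        # and speed, plus context like lane geometry and traffic-light state
        ctx = self.encoder(self.embed(agent_feats))    # joint scene reasoning
        out = self.head(ctx)                           # (batch, num_agents, horizon*2)
        return out.view(out.shape[0], out.shape[1], self.horizon, 2)

model = MultiAgentForecaster()
scene = torch.randn(1, 5, 16)    # one scene, five agents, 16 features each
print(model(scene).shape)        # torch.Size([1, 5, 8, 2]) predicted waypoints
```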



They are also using distillation to make their forecasting more efficient:
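In sketch form (again entirely hypothetical, not Waymo's code), distillation means a big, slow "teacher" forecaster produces targets that a cheap "student" learns to mimic, so the student can run in the latency budget the teacher can't:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-ins: a large, slow teacher and a small, fast student.
teacher = nn.Sequential(nn.Linear(16, 256), nn.ReLU(), nn.Linear(256, 16))
student = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

scene = torch.randn(8, 5, 16)               # batch of scenes: 5 agents, 16 features
with torch.no_grad():
    target = teacher(scene)                 # expensive forecast used as the label
loss = F.mse_loss(student(scene), target)   # student learns to mimic the teacher
opt.zero_grad()
loss.backward()
opt.step()
```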



But to your actual point, Waymo's prediction clearly made a mistake which required further ML training. Hence the software update Waymo issued.

Hopefully, Tesla does have data on this scenario and trains V12 on it; otherwise, V12 could have the same issue Waymo had.

If anything, these cases illustrate how difficult solving autonomous driving is. Both Tesla and Waymo will encounter edge cases as they work to improve their autonomous driving systems.
 
You might have a design which you claim is L4 on paper, but there must be a period during which this claim undergoes rigorous testing. And during that time, what level is the car? It's not L4, because it's not been declared as such (or certified depending on the legal framework in a jurisdiction). So during this period the car needs supervision by a human. Which, in a certain sense, makes it quasi-L2 (actually, real L2 if I read the SAE stuff right). This is where a lot of these discussions get muddled imho.

I get your point about design versus deployed. But the vehicle is still L4 during testing, before certification. It is not quasi-L2 or L2 during testing with human supervision. It is supervised L4. SAE levels are very clear that L4 is not L2 when there is a safety driver. In fact, J3016 specifically says that it does not matter whether the vehicle is under testing or deployed commercially: the level is whatever the design intent is.

Here is what J3016 says on this:

"The level of a driving automation system feature corresponds to the feature’s production design intent. This applies regardless of whether the vehicle on which it is equipped is a production vehicle already deployed in commerce, or a test vehicle that has yet to be deployed. As such, it is incorrect to classify a Level 4 design-intended ADS feature equipped on a test vehicle as Level 2 simply because on-road testing requires a test driver to supervise the feature while engaged, and to intervene if necessary to maintain operation." (p. 36)
 
I get your point about design versus deployed. But the vehicle is still L4 during testing, before certification. It is not quasi-L2 or L2 during testing with human supervision. It is supervised L4. SAE levels are very clear that L4 is not L2 when there is a safety driver. In fact, J3016 specifically says that it does not matter whether the vehicle is under testing or deployed commercially: the level is whatever the design intent is.

Here is what J3016 says on this:

"The level of a driving automation system feature corresponds to the feature’s production design intent. This applies regardless of whether the vehicle on which it is equipped is a production vehicle already deployed in commerce, or a test vehicle that has yet to be deployed. As such, it is incorrect to classify a Level 4 design-intended ADS feature equipped on a test vehicle as Level 2 simply because on-road testing requires a test driver to supervise the feature while engaged, and to intervene if necessary to maintain operation." (p. 36)

It blows my mind that people either still don't understand design intent or are trying to spread the wrong information about the DMV email:

Design intent has nothing to do with a company's future changes to the system.

Design intent represents the current design and functionality of the system.

It's common sense: before you can deploy an L4 feature, you need to test it first. However, during this testing, the car must still fulfill all the taxonomic requirements of an L4 feature, i.e. the system/software itself doesn't require a safety driver; you as the developer are requiring one (like in an email or something) because you're testing it out! The system itself cannot require a safety driver in order to engage.

That's why the current design intent of FSDb is L2, not L5. However, this doesn't mean it'll never get to higher levels of autonomy; Daniel in SD posted that passage from the email. But you don't need to read it, because everything I'm saying follows from J3016 itself.
 
Just to put something more concrete into this neverending debate, in the hopes of fostering some understanding:

Even if Tesla hypothetically disabled all driver monitoring on our current FSDb 11.x (or a driver perfectly defeated all the monitoring), there are definitely situations that occur where FSDb is very suddenly no longer confident it has any idea how to continue driving safely and throws up an immediate Red Hands Of Death for human takeover.

This behavior would have to be replaced by something much safer that does not require immediate human intervention (some kind of pull-over-onto-the-shoulder maneuver, but the corner cases are quite complex!) to meet any definition of L3+. An L3 system cannot just throw its hands in the air at random times and scream for human takeover at speed.

The software doesn't currently even seem able to predict when failure due to external conditions is impending and then handle it gracefully and safely. They've improved on this recently in many weather-related cases, but many others remain, AFAICS. Regardless, I've never observed, or heard a report of, it ever doing an automated safety stop on its own.
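To make the difference concrete, here's a toy sketch (entirely hypothetical, not Tesla's or anyone's actual logic; the states, thresholds, and timings are all made up): instead of an instant red-hands handoff, an L3-style system issues a timed takeover request and, if nobody responds, performs a minimal-risk maneuver like pulling onto the shoulder.

```python
from enum import Enum, auto

class Mode(Enum):
    NOMINAL = auto()
    TAKEOVER_REQUESTED = auto()   # timed warning, not an instant handoff
    MINIMAL_RISK = auto()         # e.g. slow down and pull to the shoulder
    DISENGAGED = auto()           # the human has taken over

def step(mode, confident, driver_took_over, warn_timer, tick=0.1):
    """One control tick of a toy L3-style fallback policy."""
    if mode is Mode.NOMINAL and not confident:
        return Mode.TAKEOVER_REQUESTED, 10.0      # give the human ~10 s notice
    if mode is Mode.TAKEOVER_REQUESTED:
        if driver_took_over:
            return Mode.DISENGAGED, 0.0
        if warn_timer <= 0.0:
            return Mode.MINIMAL_RISK, 0.0         # no response: safe stop, not a scream
        return mode, warn_timer - tick
    return mode, warn_timer

mode, timer = step(Mode.NOMINAL, confident=False, driver_took_over=False, warn_timer=0.0)
print(mode, timer)   # Mode.TAKEOVER_REQUESTED 10.0
```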
 
So during this period the car needs supervision by a human. Which, in a certain sense, makes it quasi-L2 (actually, real L2 if I read the SAE stuff right).
Nope.
"The level of a driving automation system feature corresponds to the feature’s production design intent. This applies regardless of whether the vehicle on which it is equipped is a production vehicle already deployed in commerce, or a test vehicle that has yet to be deployed. As such, it is incorrect to classify a Level 4 design-intended ADS feature equipped on a test vehicle as Level 2 simply because on-road testing requires a test driver to supervise the feature while engaged, and to intervene if necessary to maintain operation."
 
I've been eating popcorn, reading this hilarious debate for several days now.

To your point above - you just answered your own question about Tesla, as Knight and others have repeatedly stated - Tesla is L2. They stated their design intent legally to California's governing body. Case closed.
Tesla does not officially recognize the SAE Levels. It is up to others to apply whatever they feel is the right level based upon their understanding of the description of the service offered and experience of said service.
 
Just to put something more concrete into this neverending debate, in the hopes of fostering some understanding:

Even if Tesla hypothetically disabled all driver monitoring on our current FSDb 11.x (or a driver perfectly defeated all the monitoring), there are definitely situations that occur where FSDb is very suddenly no longer confident it has any idea how to continue driving safely and throws up an immediate Red Hands Of Death for human takeover.

This behavior would have to be replaced by something much safer that does not require immediate human intervention (some kind of pull-over-onto-the-shoulder maneuver, but the corner cases are quite complex!) to meet any definition of L3+. An L3 system cannot just throw its hands in the air at random times and scream for human takeover at speed.

The software doesn't currently even seem able to predict when failure due to external conditions is impending and then handle it gracefully and safely. They've improved on this recently in many weather-related cases, but many others remain, AFAICS. Regardless, I've never observed, or heard a report of, it ever doing an automated safety stop on its own.
Because Tesla is not developing an L3 system, they're developing an L5 system. The human interface is simply to ensure safety while testing.
 
Nope.
"The level of a driving automation system feature corresponds to the feature’s production design intent. This applies regardless of whether the vehicle on which it is equipped is a production vehicle already deployed in commerce, or a test vehicle that has yet to be deployed. As such, it is incorrect to classify a Level 4 design-intended ADS feature equipped on a test vehicle as Level 2 simply because on-road testing requires a test driver to supervise the feature while engaged, and to intervene if necessary to maintain operation."
Ah I missed that in my reading .. good catch.
 
I get your point about design versus deployed. But the vehicle is still L4 during testing, before certification. It is not quasi-L2 or L2 during testing with human supervision. It is supervised L4. SAE levels are very clear that L4 is not L2 when there is a safety driver. In fact, J3016 specifically says that it does not matter whether the vehicle is under testing or deployed commercially: the level is whatever the design intent is.

Here is what J3016 says on this:

"The level of a driving automation system feature corresponds to the feature’s production design intent. This applies regardless of whether the vehicle on which it is equipped is a production vehicle already deployed in commerce, or a test vehicle that has yet to be deployed. As such, it is incorrect to classify a Level 4 design-intended ADS feature equipped on a test vehicle as Level 2 simply because on-road testing requires a test driver to supervise the feature while engaged, and to intervene if necessary to maintain operation." (p. 36)
Yeah I missed that bit .. at least they got that bit right. Though it seems something of a loophole, since anyone could arbitrarily declare that they have a system intended to be L4 even though it is wildly far from that technically.
 
Just to be clear, Waymo uses deep neural networks for prediction, not heuristics. Waymo uses zero heuristics for prediction. Specifically, they use transformers to encode scene information (traffic lights, road geometry and objects) and then do multi-agent forecasting.
My concern with Waymo right now is the glacial rollout .. they seem to do at most one city a year. Sure, you could attribute this to caution, or perhaps finances (Alphabet are not as generous as they once were). But I wonder if the entire infrastructure setup is rather more onerous than they realized (HD maps, bureaucracy, setup of local manual take-over support staff etc). I'm not clear what Waymo's business model is atm (or even if they have one).
 
Yeah I missed that bit .. at least they got that bit right. Though it seems something of a loophole, since anyone could arbitrarily declare that they have a system intended to be L4 even though it is wildly far from that technically.

I think the idea is that you trust that if the manufacturer says they are trying to do L4, they are really trying to do L4. But of course, systems can be at very different stages of development. So the L4 could be at a very early prototype stage, at an early testing with safety drivers stage, at a late testing with safety drivers stage, or at a commercial deployment stage. And that is why we have regulators like the CA DMV or the CPUC that can monitor that and issue testing permits or commercial permits. But regardless of the stage, SAE is saying the level of the design does not change. In other words, just because a system is at an early testing stage does not make it less L4 than if it is deployed commercially. It is still L4, just less reliable.
 
I think the idea is that you trust that if the manufacturer says they are trying to do L4, they are really trying to do L4. But of course, systems can be at very different stages of development. So the L4 could be at a very early prototype stage, at an early testing with safety drivers stage, at a late testing with safety drivers stage, or at a commercial deployment stage. And that is why we have regulators like the CA DMV or the CPUC that can monitor that and issue testing permits or commercial permits. But regardless of the stage, SAE is saying the level of the design does not change. In other words, just because a system is at an early testing stage does not make it less L4 than if it is deployed commercially. It is still L4, just less reliable.

This is the reason why the SAE uses the idea of design intent and also says that the levels don't convey system performance.

For example, in the past, people were able to game Tesla's AP nag / driver monitoring system and use it without anyone in the driver's seat. Just because Tesla's driver monitoring sucked at the time doesn't mean AP became L3 or L4, etc. The system was designed to require an attentive driver at all times, so it's L2, even if it was poorly designed to detect one...
 
My concern with Waymo right now is the glacial rollout .. they seem to do at most one city a year. Sure, you could attribute this to caution, or perhaps finances (Alphabet are not as generous as they once were). But I wonder if the entire infrastructure setup is rather more onerous than they realized (HD maps, bureaucracy, setup of local manual take-over support staff etc). I'm not clear what Waymo's business model is atm (or even if they have one).
Their business model is to operate a robotaxi service. I suspect that one reason they’re not scaling fast is that they haven’t gotten the cost low enough. What they’re counting on is that the cost of technology (compute, sensors) will come down, the remote assistance rate will go down, and the price of ride hailing will go up. As long as those trends continue they will eventually be profitable. Unless of course Tesla FSD becomes robotaxi capable.
Waymo really needs a dedicated robotaxi vehicle. Since it looks like Cruise is done, maybe they can buy Origins from GM.