Welcome to Tesla Motors Club
Discuss Tesla's Model S, Model 3, Model X, Model Y, Cybertruck, Roadster and More.
Yeah, the timing of the transition is everything. I could imagine someone coming up with an ODD that worked along these lines: when the rain/construction/whatever is imminent, the car beeps/warns and gives maybe ~5-10 seconds to take over gracefully while it's already beginning to slow and prepare to pull off the road, but probably within 15 seconds at most it's going to have to really stop if the driver hasn't taken over. This is maybe enough for L3+ (no steering nags, no "pay attention to the road"), which still requires that a capable driver be "ready to take over". You could still be reading a book or doomscrolling your phone, but be able to take over in those conditions and in that short time window. That's very different from a robotaxi, or the ability to crawl in the back and go to sleep. Basically the ODD's definition would have to include driver readiness (a sleeping/absent driver is not OK; a driver reading a book is OK).
I think that is what L3 is supposed to be.
 
The rules are somewhat vague, and I'm starting to better appreciate the value of a clearer standard, especially with Elon, who plays the rules very loosely with his "release now, fix later" approach. I don't believe the existing rules say much about how an L4 vehicle reacts when leaving an ODD. Some articles say the vehicle needs to "stop" - obviously not practical in the middle of a freeway construction zone. I do not see "graceful" or a timeframe in the standard, but perhaps I am missing it. And "no construction zones" can be a definable ODD, as can weather.

The SAE levels standard says that the L4 should notify the human before leaving the ODD and if the human does not take over, the L4 should reach a minimum risk condition before exiting the ODD. Here is the diagram that they give to illustrate how it would work:

[Image: SAE diagram illustrating the ODD exit procedure - driver notification, driver takeover, or minimum risk condition]


Note that everything (driver notification, driver takeover, or minimum risk condition) has to happen before the vehicle exits the ODD. Additionally, the minimum risk condition can be either stopping in lane or pulling over to the side.
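The ODD-exit flow the standard describes can be sketched as a small state machine. Everything here other than the general flow is an assumption for illustration: the state names, the `tick` function, and the 15 s / 5 s timings are invented, not taken from SAE J3016.

```python
from enum import Enum, auto

class State(Enum):
    AUTOMATED = auto()           # operating normally inside the ODD
    TAKEOVER_REQUESTED = auto()  # ODD exit imminent, human notified
    HUMAN_DRIVING = auto()       # human took over in time
    MINIMAL_RISK = auto()        # fallback: stop in lane or pull over

def tick(state, seconds_to_odd_exit, human_has_taken_over):
    """One control-loop step of a hypothetical ODD-exit procedure.

    Everything must resolve *before* the vehicle leaves the ODD:
    notify early, and if the human never takes over, reach a
    minimum risk condition (stop in lane or pull to the shoulder).
    """
    WARN_AT = 15.0   # illustrative: start notifying 15 s before ODD exit
    ABORT_AT = 5.0   # illustrative: last moment to begin the fallback

    if state is State.AUTOMATED and seconds_to_odd_exit <= WARN_AT:
        return State.TAKEOVER_REQUESTED
    if state is State.TAKEOVER_REQUESTED:
        if human_has_taken_over:
            return State.HUMAN_DRIVING
        if seconds_to_odd_exit <= ABORT_AT:
            return State.MINIMAL_RISK  # pull over or stop in lane
    return state
```

The key property the standard seems to require is that there is no path from `AUTOMATED` to "outside the ODD" that skips either a successful handover or the minimum risk condition.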

In the case of L4 highway, if the L4 can't handle construction zones or weather, and it is not safe or practical to ask the driver to take over or for the L4 to stop or pull over, then maybe the L4 is not ready for deployment yet. Perhaps the developer needs to make the L4 reliable enough at handling construction zones or weather that it does not need to ask the human to take over, or define an ODD (geofence?) where the L4 can safely pull over if need be.

I think that is one reason why we don't see L4 highway yet on consumer cars. There are definitely some safety challenges that still need to be resolved.
 
In the case of L4 highway, if the L4 can't handle construction zones or weather and it is not safe or practical to ask the driver to take over or for the L4 to stop or pull over, then maybe the L4 is not ready for deployment yet.
So an ODD of "freeway only, no precipitation, no construction" will never be a real thing. What you say makes sense. It would be nice for these standards to say that explicitly.

By the letter of the law, an L4 does not even need a human in the vehicle as long as it's in the ODD. But obviously just stopping in lane if it starts to drizzle or it encounters construction won't work.

Makes me want an L3 solution even more, but apparently Tesla isn't interested in that. Maybe they should bite the bullet, call L4/L5 a failure on current hardware, and make it L3 without nags every 30 seconds, perhaps still with the ODD restrictions. But then we're back to the liability question - I'm sure that's why Tesla isn't interested in L3.
 
I'm sure that's why Tesla isn't interested in L3.
It’s also likely not capable of it (with current hardware). L3 is super difficult to do with adequate safety.

We have no idea how safe the current implementation is - we have no data. I assume Tesla has simulations though.
 
In the case of L4 highway, if the L4 can't handle construction zones or weather and it is not safe or practical to ask the driver to take over or for the L4 to stop or pull over, then maybe the L4 is not ready for deployment yet
The qualifier of "ready for deployment" is more of a business decision, and something regulators can push back on. Technically, a system that tries but fails to reach even the minimum risk condition of "stop within its current travel path" has met the requirement, but the public and governments might not accept that, especially if it ends up blocking emergency vehicles and causing more traffic and accidents.

People are probably familiar with large language models hallucinating very confidently about wrong facts, and end-to-end control could fail similarly by being quite confident in the wrong behavior. For example, it probably can handle many types of construction, based on broad learning, rather than just giving up when entering a construction zone, but it could very well misinterpret some arrangement of cones and drive into wet concrete.
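The "confidently wrong" failure mode is easy to picture with any softmax-style output head: the network always emits a valid probability distribution, so an unfamiliar scene can still yield a confident-looking answer. A toy illustration in plain Python (the class labels and logit values are entirely made up):

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for classes ["drivable", "wet_concrete", "cone_wall"].
# Normal road: the net is confident and correct.
p = softmax([4.0, 0.5, 0.2])
# Unusual cone arrangement: the net has never seen this, yet it still
# outputs a distribution that leans toward "drivable".
q = softmax([3.5, 3.0, 0.1])

# Both are valid probability distributions; nothing in the math itself
# flags the second scene as "out of distribution".
```

The point is that the arithmetic offers no built-in "I don't know" signal; that has to be engineered (or trained) in separately.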
 
I expect many of the problems with FSD now will remain in V12. I believe that V12 will have new AI software for the decision-making part of the system, but the perception software, which defines the things it sees around the car, will remain. Phantom braking incidents are apparently because the perception software thinks it sees an obstacle that isn't really there. This false info will still be passed to the new decision software, so it would still result in the car slowing for something that isn't there.
 
Phantom braking incidents are apparently because the perception software thinks it sees an obstacle that isn't really there. This false info will still be passed to the new decision software, so it would still result in the car slowing for something that isn't there.
Even if a modular end-to-end control approach only reused existing predictions as-is, it could still be much more flexible than the existing control heuristics. Right now 11.x might try to always stop or leave 3 feet of clearance from any detected pedestrian, even if it's a last-second prediction, while 12.x could have enough training to realize that human control tends to continue normally when certain types of these false predictions show up. For example, end-to-end could learn that a late mis-prediction with a clear view of the road (say, when approaching mailboxes) and a late pedestrian appearing from behind parked vehicles should get different behavior.
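One way to picture that difference: a heuristic planner reacts to every pedestrian detection identically, while a learned policy can effectively condition its reaction on context. A hypothetical sketch (the function, thresholds, and feature names are all invented for illustration, not anything Tesla has described):

```python
def should_brake(detection_age_s, detection_confidence,
                 near_parked_vehicles, road_clearly_visible):
    """Hypothetical context-aware reaction to a pedestrian detection.

    A heuristic planner might brake for any detection. A learned policy
    can in effect weigh a late, low-confidence blip on a wide-open road
    (likely a mailbox) differently from one emerging from behind parked
    cars (plausibly a real pedestrian stepping out).
    """
    LATE = 0.3  # seconds since first detection; below this it "just appeared"

    if detection_age_s > LATE:
        return True   # persistent detection: always react
    if near_parked_vehicles:
        return True   # late but plausibly occluded until now: react
    # Late, nothing to occlude it, clear view ahead: likely a mis-prediction,
    # unless the detector is very sure.
    return (not road_clearly_visible) or detection_confidence > 0.9
```

Nothing like an explicit rule table exists inside an end-to-end network, of course; the sketch just shows the kind of conditional behavior the training data could implicitly teach.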
 
This false info will still be passed to the new decision software, so it would still result in the car slowing for something that isn't there.

As a point of clarification, V12 probably isn't reusing the existing perception stack and feeding it into a new decision layer.

Instead, V12 is fully replacing both perception and decision making with one "end to end" AI. That could be good or bad and could introduce all sorts of regressions. On the flip side, it might also resolve a huge portion of FSD's bad behaviors all at once.

(This is my understanding but like all things Tesla we don't really know for sure what will actually end up getting released as V12).

Side note: do you still get much phantom braking? I actually can't remember the last phantom braking incident I've had (sometimes I get mild slowdowns, but these are a far cry from posing a safety risk).
 
As a point of clarification, V12 probably isn't reusing the existing perception stack and feeding it into a new decision layer.

Instead, V12 is fully replacing both perception and decision making with one "end to end" AI. That could be good or bad and could introduce all sorts of regressions. On the flip side, it might also resolve a huge portion of FSD's bad behaviors all at once.

(This is my understanding but like all things Tesla we don't really know for sure what will actually end up getting released as V12).

Side note: do you still get much phantom braking? I actually can't remember the last phantom braking incident I've had (sometimes I get mild slowdowns, but these are a far cry from posing a safety risk).
I do occasionally get scary, quick slowdowns where I have to accelerate so the car behind doesn’t get too close. Fairly often I get annoying slowdowns on perfectly clear roads. It startles passengers more than me.
True, they might completely redo perception, but I don't think they will. Humans need some control over the system to ensure it is reacting to the correct input. Perception will be modified; Musk has already said they don't need to read signs anymore, but I'll bet they still have something that locates signs and tags them as something the system must be aware of.
 
Regarding the worries about construction zones, that seems trainable for a NN in my opinion.

Thing is, the rules of the road are partly known to drivers without being visualized by signage/lights/etc. (for example, that one should generally drive on the right side of the road (yes, LHD example here)) and partly shown to drivers (by signage).

In other words, there is a set of general ground rules that can be overridden and/or clarified by lane markings, traffic lights, signs, etcetera.

All these markings have to be clearly visible and up to code. If a sign is not clearly visible or a lane marking is not painted as the traffic code states, it can/must be disregarded by the driver.

When talking construction zones these may pop up or change significantly from one day to the next, but these changes have to be indicated in the proper manner so drivers can adhere to the rules.

There are only a limited number of traffic signs/cones/lights allowed to mark construction zones. FSD v12 should in theory be able to learn these given enough data.

Of course we'll see mishaps, but that'll be the march of nines (or however Elon put it): new mishap -> identify what was misidentified by the car and why -> train for that -> new mishap -> ...

The same goes for police officers doing hand signals to drivers. These hand signals are not random but set in stone (i.e. in the law). Therefore these are also trainable.

The greatest challenges arise when construction workers "wing it" when placing their signs/cones and think "people will get it" without placing proper signage, or when a police officer at a crossroads freaks out when a driverless vehicle comes up to him (and therefore doesn't give the correct, legally defined, hand signals).
---------------------------------

Note: police officer example is based on Belgian traffic laws, only three signs allowed:

(Article 4.2 of the Belgian traffic code, translated by DeepL:

4.2. In particular, orders are considered to be:

1° the arm raised straight up, which means stopping for all road users, except for those who are at an intersection and who must clear it;

2° the arm or arms outstretched horizontally, which means stopping for road users approaching from directions indicated by the arm or arms;

3° waving a red light transversely, which means stopping for the drivers to whom the light is directed.)

-----------------

So yeah, in theory road rules are trainable IMO.

In practice we will see challenges, I agree, but I think they can be overcome. (but it will take many years)
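Because the legal signal set is that small, mapping a recognized gesture to a driving command is nearly a lookup table; the hard part is the perception, not the policy. An illustrative sketch, where the gesture labels are assumed outputs of some hypothetical gesture-recognition network and only the legal meanings come from article 4.2:

```python
# Meanings of the three officer signals in Belgian traffic code art. 4.2.
# The string keys are hypothetical classifier outputs, not a real API.
SIGNAL_MEANING = {
    "arm_raised_straight_up": "stop (unless already in the intersection)",
    "arms_outstretched_horizontally": "stop if approaching from an indicated direction",
    "red_light_waved_transversely": "stop if the light is directed at you",
}

def command_for(gesture, in_intersection=False):
    """Return the required action for a recognized officer gesture."""
    if gesture == "arm_raised_straight_up" and in_intersection:
        return "clear_intersection"  # the one legal exception to "stop"
    if gesture in SIGNAL_MEANING:
        return "stop"
    # An unrecognized gesture (e.g. a construction worker waving) has no
    # legal force: fall back to the ordinary rules of the road.
    return "follow_normal_road_rules"
```

The fallback branch is exactly the point made later in the thread: signals from people who are not authorised to regulate traffic carry no legal weight.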
 
Note: police officer example is based on Belgian traffic laws, only three signs allowed:

(Article 4.2 of the Belgian traffic code, translated by DeepL:

4.2. In particular, orders are considered to be:

1° the arm raised straight up, which means stopping for all road users, except for those who are at an intersection and who must clear it;

2° the arm or arms outstretched horizontally, which means stopping for road users approaching from directions indicated by the arm or arms;

3° waving a red light transversely, which means stopping for the drivers to whom the light is directed.)

-----------------

So yeah, in theory road rules are trainable IMO.

In practice we will see challenges, I agree, but I think they can be overcome. (but it will take many years)
In the US, at least where I live, there is no consistency with hand gestures, and if there are traffic codes/hand signals for police officers and construction workers to follow, that's news to me. I'm impressed Belgium has traffic codes to follow.
 
In the US, at least where I live, there is no consistency with hand gestures, and if there are traffic codes/hand signals for police officers and construction workers to follow, that's news to me. I'm impressed Belgium has traffic codes to follow.
For construction workers there are no rules, even in Belgium. But then again, they are not authorised to regulate traffic. If they wave at you or put out their hand (as if to say "stop/wait"), that has zero legal implications. The driver must follow the rules of the road, not what some random person instructs. (It can help of course, but those situations shouldn't exist in a perfect world.)
 
For construction workers there are no rules, even in Belgium. But then again, they are not authorised to regulate traffic. If they wave at you or put out their hand (as if to say "stop/wait"), that has zero legal implications. The driver must follow the rules of the road, not what some random person instructs. (It can help of course, but those situations shouldn't exist in a perfect world.)
Thanks. Where I live (Greater Boston) the police have no hand gesture rules and road workers are sometimes directing traffic. Perhaps there are formal guidelines but if there are I've never noticed. I almost always disengage FSD when I come across road work which is usually at least once a day.
 
Yes. I would expect that the new AI might be able to respond in a more measured way due to having "knowledge" of similar situations but still the erroneous information would likely cause some unexpected behavior at times.

Agreed! The results of untrained, unknown situations will be unknown (intervention needed). Of course Tesla/Elon always get the benefit of the doubt. But on the road, Murphy's law has been the norm for FSD. And the latest goal is to train mostly on ideal scenarios. There were many reasons the team needed to add heuristics, including crutches for a design that doesn't work.

Unless the team can pull a magic rabbit out of the hat, E2E's quicker response time will come with increased false alarms. Add to that more untrained and unknown situations, and we could end up two steps back from where we started.
 
Thanks. Where I live (Greater Boston) the police have no hand gesture rules and road workers are sometimes directing traffic. Perhaps there are formal guidelines but if there are I've never noticed. I almost always disengage FSD when I come across road work which is usually at least once a day.
Here in California there are no official hand signals (other than those drivers use to indicate turns when their car's lights are broken, or that cyclists use), but the police guidebook does have a set that drivers seem to intuitively understand (see page 39):

Construction workers generally just use a STOP/SLOW sign to direct traffic and they typically do not do more complex directing (like intersection directing). I think in the US we are aided by our usage of stop signs so a broken intersection automatically turns into a stop sign intersection, which reduces the need to have an officer use hand signals other than in very complex intersections.
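The "broken intersection automatically turns into a stop-sign intersection" convention is itself a simple, codifiable rule: in most US states a dark or flashing-red signal is treated as a stop sign. A minimal sketch with invented state names (local law varies, so this is illustrative only):

```python
def effective_control(signal_state):
    """Map an observed US traffic-signal state to the control a driver
    should treat it as. Conventions follow most states' rules; details
    vary locally, and the state names here are invented labels.
    """
    if signal_state in ("dark", "flashing_red"):
        return "stop_sign"            # broken or flashing red: all-way stop
    if signal_state == "flashing_yellow":
        return "proceed_with_caution"
    return "obey_signal"              # normal red/yellow/green operation
```

The point in the post holds: a rule this simple reduces how often an officer has to direct traffic by hand at a failed signal.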
 
Here in California there are no official hand signals (other than those drivers use to indicate turns when their car's lights are broken, or that cyclists use), but the police guidebook does have a set that drivers seem to intuitively understand (see page 39):

Construction workers generally just use a STOP/SLOW sign to direct traffic and they typically do not do more complex directing (like intersection directing). I think in the US we are aided by our usage of stop signs so a broken intersection automatically turns into a stop sign intersection, which reduces the need to have an officer use hand signals other than in very complex intersections.
In Massachusetts it's very rare to see anyone using STOP/SLOW signs. This year I believe I've seen them twice; other years, never.
That shows you the extreme control local police chiefs have over construction sites to make sure officers get full detail pay, which of course is paid by the taxpayer. Saving money is not in their best interest. New Hampshire is much better at this.
 
Here's some excerpts from the Foundation Models positions:
  • Train generative models and multi-task networks at scale, with millions of video clips and thousands of GPUs, with the objective to significantly enhance the capabilities of autonomous perception and planning.
  • Spearhead the development and training of cutting-edge large generative models, including diffusion models, VAEs, autoregressive models, and GANs.
  • Lead applied research to advance foundation models for both autonomous driving and humanoid robots.
  • Work on cutting-edge techniques in multi-task learning, video networks, generative models, imitation learning, semi-supervised learning, and self-supervised learning.

Expanding to cover things already being done in end-to-end V12 or for something after?
 
Here's some excerpts from the Foundation Models positions:
  • Train generative models and multi-task networks at scale, with millions of video clips and thousands of GPUs, with the objective to significantly enhance the capabilities of autonomous perception and planning.
  • Spearhead the development and training of cutting-edge large generative models, including diffusion models, VAEs, autoregressive models, and GANs.
  • Lead applied research to advance foundation models for both autonomous driving and humanoid robots.
  • Work on cutting-edge techniques in multi-task learning, video networks, generative models, imitation learning, semi-supervised learning, and self-supervised learning.

Expanding to cover things already being done in end-to-end V12 or for something after?
V11 and V12 both will continue work in parallel, I am certain.