What is v12's scenario recall capacity? Hundreds of milliseconds? What's really needed for safe driving?
Tesla designed FSD Beta neural network architectures including a "Spatial Recurrent Neural Network Video Module" that conditionally updated internal knowledge based on visibility, positioning and timing. Presumably some of that architecture fed into the decisions for the Occupancy network and now end-to-end, so there should be some capability of tracking occluded objects over time, though it's unclear how long it can remember and how much memory is necessary.

Are there examples of 12.x doing things based on a previous memory of something no longer visible? Such examples might give some estimate of how long end-to-end can remember, and this particular failure to let oncoming stop-sign traffic go first could be the 12.2.1 control network not paying enough attention to those memory signals.
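
As a thought experiment, here's a minimal sketch of what a visibility-gated recurrent memory could look like. To be clear, everything below (class name, gating scheme) is hypothetical illustration; Tesla's actual Spatial RNN design hasn't been published:

Code:
import torch
import torch.nn as nn

class VisibilityGatedMemory(nn.Module):
    """Toy spatial recurrent memory: each BEV grid cell keeps a hidden
    state that is only overwritten where the current frame has visibility."""

    def __init__(self, channels: int):
        super().__init__()
        # Proposes a new cell state from the current observation plus old memory.
        self.update = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, memory, observation, visibility):
        # memory, observation: (B, C, H, W) bird's-eye-view feature grids
        # visibility: (B, 1, H, W) in [0, 1]; 0 = fully occluded cell
        candidate = torch.tanh(self.update(torch.cat([memory, observation], dim=1)))
        # Occluded cells keep their old state; visible cells take the new one,
        # so a car seen two seconds ago persists until it is re-observed.
        return visibility * candidate + (1.0 - visibility) * memory

How long such a memory usefully persists, whole seconds or just hundreds of milliseconds, would depend on how much the downstream control actually attends to it, which is exactly the open question.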
 
12.2.1 is a beautiful thing, I've been asking for this ever since I got FSDb

This squeezing through a narrow shoulder past vehicles for a right turn reminded me of 11.x's behavior of folding the mirrors in narrow situations. Have people noticed whether 12.x still has that ability? I believe AI DRIVR commented on it a bit in older 11.x videos. When taking 12.x through similar situations, did it just go for it with mirrors unfolded, or did it wait for wider space?

I wonder how end-to-end manages the difference in size between, say, a Model X and a Model 3, where I think there's roughly a 6" width difference. I guess to be safer it could assume all vehicles are the size of an X, but then the training data is coming from smaller vehicles too…
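
A back-of-the-envelope clearance check shows why those inches matter. The widths and margins below are rough illustrative numbers, not spec-sheet values:

Code:
# Rough gap check: can this ego vehicle squeeze through?
# All widths/margins are approximate, for illustration only.
BODY_WIDTH_IN = {"Model 3": 73.0, "Model X": 79.0}  # mirrors folded, approx.
MIRROR_EXTRA_IN = 10.0                              # both mirrors out, approx.
MARGIN_IN = 6.0                                     # desired clearance per side

def fits(gap_in: float, model: str, mirrors_folded: bool) -> bool:
    width = BODY_WIDTH_IN[model] + (0.0 if mirrors_folded else MIRROR_EXTRA_IN)
    return gap_in >= width + 2 * MARGIN_IN

for model in BODY_WIDTH_IN:
    print(model, fits(96.0, model, mirrors_folded=False))
# Model 3 True, Model X False: the same gap is a go for one and a no-go
# for the other, so a policy trained mostly on narrower cars could
# overcommit in an X unless it's conditioned on ego width.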
 
You are definitely starting to sound less confident about these robotaxis. I guess we will see about these specialized locale flavors.

Oct. 2021:
That's the benefit of 2½ years of experience and hindsight. In 2021 FSD was quite rough, but improvements were coming quickly. Now we've seen Tesla struggle to get things right and fix FSD bugs, which gives a very different impression.
 
That's the benefit of 2½ years of experience and hindsight. In 2021 FSD was quite rough, but improvements were coming quickly. Now we've seen Tesla struggle to get things right and fix FSD bugs, which gives a very different impression.
Seems about the same impression to me. 2.5 years ago this kind of statement was pretty questionable. I think things are going roughly as I would have expected three years ago. Things only improve at a certain rate, and at some point it stops.

v12 is not really that different at a high level (rate of progress). I am hoping the reduced manual tuning slightly improves the rate of progress, at least temporarily.

But a thrilling and excellent 1-2 orders of magnitude improvement is nowhere near robotaxis, even tightly limited ones. That’s still quite clear.
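
To make "orders of magnitude" concrete, a toy calculation, with every number invented purely for illustration:

Code:
import math

# Hypothetical numbers, not measurements.
miles_per_safety_intervention_now = 100.0   # assumed current rate
miles_needed_for_robotaxi = 100_000.0       # assumed target
gap = miles_needed_for_robotaxi / miles_per_safety_intervention_now
print(f"{gap:.0f}x needed = {math.log10(gap):.0f} orders of magnitude")
# Under these made-up numbers, even a thrilling 10-100x gain closes
# only one or two of the three orders required.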
 
12.2.1 is a beautiful thing, I've been asking for this ever since I got FSDb


AI DRIVR had an example with a closed narrow residential street under construction where v12 ignored the road-closed sign and went through with, no kidding, inches to spare from a construction vehicle. That seemed way out over its skis for a system intended to be 10x safer than a human driver.
 
The big difference is that in chess ♟️ everyone is trying to harm everyone whereas in our case the core assumption is that everyone is trying not to hurt anyone.
For both chess and driving, there's a lot of uncertainty in what others might do that affects what you should do. As you suggest, the common case might follow an overarching core assumption, but people might not always behave that way, e.g., when distracted or reacting to something else. We've seen various 12.x disengagements that seem to be because end-to-end assumed the lead vehicle was going to keep turning instead of slowing down, and other 12.x behaviors that got really close to cross/oncoming traffic and happened to be fine only because the other traffic kept moving.

Neural networks with probabilities in internal connections and outputs seem to represent some of that uncertainty and adjust for which scenarios are more likely, but it's unclear whether the system has, or needs, a core underlying understanding of what might happen if, say, the lead vehicle suddenly slams on the brakes while it's following too closely. Here's an example of 12.2.1's blue path (both more opaque and transparent) perhaps reflecting uncertainty about how quickly cross traffic is approaching, to decide whether to go first or wait:


There's also a separate question: if it did cut in front of cross traffic, is there some understanding that the other driver will likely slam on the brakes trying not to hurt anyone?
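
One common way planners represent this kind of uncertainty is as a probability distribution over candidate trajectories rather than a single committed path. A toy sketch follows; none of this is Tesla's actual architecture, just a generic multi-modal head:

Code:
import torch
import torch.nn as nn

class TrajectoryHead(nn.Module):
    """Toy multi-modal planner head: proposes K candidate trajectories
    and a probability for each, instead of one deterministic path."""

    def __init__(self, feature_dim: int, num_modes: int, horizon: int):
        super().__init__()
        self.num_modes, self.horizon = num_modes, horizon
        self.paths = nn.Linear(feature_dim, num_modes * horizon * 2)  # (x, y) per step
        self.mode_logits = nn.Linear(feature_dim, num_modes)

    def forward(self, features):
        b = features.shape[0]
        trajs = self.paths(features).view(b, self.num_modes, self.horizon, 2)
        probs = self.mode_logits(features).softmax(dim=-1)
        return trajs, probs  # e.g. a "go now" path and a "yield" path with weights

If the "go" and "yield" modes end up with similar probability, rendering both could plausibly produce exactly the opaque-plus-transparent blue paths in that clip.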
 
I'm talking/thinking about just safety interventions, so things like intervening to avoid slowing down too quickly, not allowing sufficient following distance on a lane change, not signaling, incorrect lane choice, avoiding honks from other drivers, etc. Also on things like unprotected lefts (the cross-traffic variety, not the oncoming traffic variety), the intervention rate is very high, and virtually all of those are safety interventions.
Yeah, unprotected lefts with current 12.x seem like they've actually increased interventions in some respects, and more broadly safety interventions might have slightly regressed with 12.2.1 while comfort interventions significantly improved. Just focusing on your safety interventions in aggregate: you don't think 12.x will eventually get to 100x, but do you think some, or even one, type of intervention will reach a 10x improvement sooner while others will require lots of focused training to get to 10x at all? Presumably the interventions that reach 10x first are also more likely to make progress toward 100x.
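
Why the first category to hit 10x matters: the aggregate rate is bottlenecked by whatever hasn't improved. A quick illustration with invented numbers:

Code:
# Safety interventions per 1,000 miles by category -- invented numbers.
before = {"unprotected_left": 5.0, "lane_change": 2.0, "other": 1.0}
after  = {"unprotected_left": 5.0, "lane_change": 0.2, "other": 0.1}

total = lambda d: sum(d.values())
print(f"{total(before) / total(after):.1f}x overall")  # ~1.5x
# Two categories improved 10x, but the untouched unprotected-left rate
# dominates, so the aggregate barely moves until it's addressed.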
 
For both chess and driving, there's a lot of uncertainty in what others might do that affect what you should do.
But in any case, you cannot come in brain-dead, i.e., with a blank page, and expect to account for all uncertainties heuristically. Assumptions such as the one I described are core to the program. Without such assumptions, we would never have landed on the moon either.

What I am ultimately stating is that accidents WILL happen until all cars are driven by some form of FSD. Then it might lessen, but accidents will still happen. The cameras and other sensors will play a huge role in determining which vehicle was at fault. There is no way around that.

In the scenario where every FSD wants to play it safe, expect a huge number of standoffs: every vehicle stalled in a draw situation, as in chess, hoping and waiting for the other vehicle to make the next move at every spot where there is a possibility of an accident.
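
That standoff problem has a classic remedy: break the symmetry with a deterministic rule both vehicles can compute, rather than both waiting. A toy version (the rule itself is invented for illustration):

Code:
# Toy tie-break for two overly cautious AVs at a symmetric conflict:
# both compute the same deterministic winner, so neither stalls forever.
def who_goes_first(a: dict, b: dict) -> dict:
    # Invented rule: closer to the conflict point goes first; tie -> lower id.
    key = lambda v: (v["dist_to_conflict_m"], v["id"])
    return min(a, b, key=key)

a = {"id": 1, "dist_to_conflict_m": 12.0}
b = {"id": 2, "dist_to_conflict_m": 12.0}
print(who_goes_first(a, b))  # both cars agree car 1 proceeds; no chess draw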
 
This squeezing through a narrow shoulder past vehicles for a right turn reminded me of 11.x's behavior of folding the mirrors in narrow situations. Have people noticed whether 12.x still has that ability? I believe AI DRIVR commented on it a bit in older 11.x videos. When taking 12.x through similar situations, did it just go for it with mirrors unfolded, or did it wait for wider space?

I wonder how end-to-end manages the difference in size between, say, a Model X and a Model 3, where I think there's roughly a 6" width difference. I guess to be safer it could assume all vehicles are the size of an X, but then the training data is coming from smaller vehicles too…
That's an interesting question, actually, since mirror folding is something basically zero humans will do, especially in a Tesla, where you have to go hunting through menus to find the option. I've always thought it was a pointless behaviour (the car on V11 seems to have nowhere near enough confidence to go for any gaps where the difference between folded and unfolded mirrors would matter), but it's also a very easy behaviour to add if you're programmatically shaping the driving, so I can see why they did it.

With v12 supposedly trained on real behaviour, there would presumably be hardly any training material of this for the NN to imitate, so examples of it still doing it would be interesting.
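
That "easy to do programmatically" point is worth spelling out: in a hand-written stack, mirror folding is one explicit rule, roughly like the sketch below (thresholds invented), whereas an imitation-trained policy would need human demonstrations that barely exist:

Code:
# One explicit heuristic of the kind a hand-coded planner can contain.
# Thresholds are invented for illustration.
def narrow_gap_plan(gap_m: float, body_m: float = 2.0,
                    mirrors_m: float = 0.25, margin_m: float = 0.15) -> str:
    if gap_m >= body_m + mirrors_m + margin_m:
        return "proceed, mirrors out"
    if gap_m >= body_m + margin_m:
        return "fold mirrors, then proceed"
    return "wait for a wider gap"

print(narrow_gap_plan(2.3))  # -> "fold mirrors, then proceed"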
 
v12 is not really that different at a high level (rate of progress). I am hoping the reduced manual tuning slightly improves the rate of progress, at least temporarily.
This statement isn't consistent with the driver/passenger feedback I've heard in the majority of videos.
Comments are much more positive than any release we've seen before.
 
Tesla designed FSD Beta neural network architectures including a "Spatial Recurrent Neural Network Video Module" that conditionally updated internal knowledge based on visibility, positioning and timing. Presumably some of that architecture fed into the decisions for the Occupancy network and now end-to-end, so there should be some capability of tracking occluded objects over time, though it's unclear how long it can remember and how much memory is necessary.

Are there examples of 12.x doing things based on a previous memory of something no longer visible? Such examples might give some estimate of how long end-to-end can remember, and this particular failure to let oncoming stop-sign traffic go first could be the 12.2.1 control network not paying enough attention to those memory signals.
The car shouldn't need memory to anticipate an unseen object's trajectory, because that trajectory could change while hidden. It should always wait until it can see enough to assess the current situation, not a previously anticipated one. I've found that even when it can see clearly, I have several times had to disengage a left turn before it drove into another car's path. This is a dangerous recurring issue that I hope is addressed in the next version.
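
That "wait until you can see" rule is essentially a visibility gate on the maneuver, which would be trivial to state explicitly; whether an end-to-end network learns it is another matter. A minimal sketch with invented thresholds:

Code:
# Gate an unprotected left on *current* visibility of the conflict zone,
# not on remembered or anticipated traffic. Thresholds invented.
def may_commit_to_left(visible_fraction: float, clear_seconds: float) -> bool:
    return visible_fraction >= 0.9 and clear_seconds >= 0.5

print(may_commit_to_left(0.95, 0.8))  # True: zone currently well observed
print(may_commit_to_left(0.60, 2.0))  # False: too much still occluded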
 
All,

Before v12 came out, it was clear that the first release of v12 wouldn't be perfect. Tesla still has a lot of number crunching to do. Issues with the first iteration have no bearing on the success of the approach longer term.

In the beginning, champion Go players could beat DeepMind's Go AI too. Then with further training it became unbeatable. Then it was doing things no expert Go player had ever seen...
 
This happens frequently to me with V11 (a few times a week).
Usually, if nobody is near me and I'm familiar with the corner/posts, I let FSD correct itself. I used to disengage and let Tesla know, but I figure nobody at Tesla pays attention to anything V11.
Same here. Our v12 car hasn't done that yet, but the v11 car does it a third of the time on a particular curve. It corrects itself if you have nerves of steel and don't take over.
 
Anyone else tired of the onslaught of lower-quality v12 videos? So many have just one "iPhone" mounted to the sunroof, and that is the only view you get.
Here's why: I unexpectedly got V12, and when I realized how rare that was, I thought, "Hey I could get tons of views on YouTube for a while at least. I wouldn't even need to invest in fancy equipment."

I decided against it, but it was tempting.
 
All,

Before v12 came out, it was clear that the first release of v12 wouldn't be perfect. Tesla still has a lot of number crunching to do. Issues with the first iteration have no bearing on the success of the approach longer term.

In the beginning, champion Go players could beat DeepMind's Go AI too. Then with further training it became unbeatable. Then it was doing things no expert Go player had ever seen...

There are more than a few reasons no one knew it would be this easy.
 
In the beginning, champion Go players could beat DeepMind's Go AI too. Then with further training it became unbeatable. Then it was doing things no expert Go player had ever seen...
Similarly, the success of other applications of AI has no bearing on the application of AI to driving automation. Should we take ChatGPT's hallucinations as a metric of the eventual capabilities of FSD?

I don't condemn V12 for its failures because it's still relatively early going. At the same time, I don't assume that simply because the FSD team is throwing neural networks at the problem that it'll all work out. I'll be encouraged after a few iterations with V12 where it is clear that the Tesla team can spot a problem and fix it. They couldn't do that with V11.
 
Now that's some validation (advertising?)...
Chuck Cook (chazman):
FSD Beta v12 is being tested so hard on the UPLs in my neighborhood that I just learned of a text thread of neighbors who were starting to think there was some sort of cult thing going on. They have a text thread talking about what is going on with all these Teslas in our neighborhood, what they are doing, and who they are. They discussed calling 911 to report the suspicious behavior. Fortunately, before that happened, one of my neighbors stopped one of the Tesla ADAS drivers and asked what they were doing and why they were doing circles in our quiet neighborhood. The explanation was simple and avoided the 911 call. Make no mistake, Tesla_AI is serious about getting FSD Beta v12 right. They are dedicating resources and time to validate the software. I want to thank everyone involved for your dedication.

 
I am so thirsty.

The focus here is incredible. It's good that they are not bothering with simulations. Would not want to pollute the training data, I suppose.
You’d think they could check the results with simulations though.

I guess they are probably just generating training data.

I admit that if it does this with high reliability in significant traffic, I am going to be amazed. I especially don't think it will be able to deal with turning traffic.

Overall I am bullish on my beer.