Welcome to Tesla Motors Club
I would imagine we'll see FSD version updates much faster compared to v11. Like maybe an update every 2 months. Once they get into a rhythm of training and testing a new iteration as the previous iteration is downloaded to our cars, I feel like we'll see steady progress now.
Do you think Tesla has a backlog of training data waiting to be used, so they'll take a trained network checkpoint to evaluate and release on a regular cycle while training the next checkpoint with the remaining data? If training capacity is the bottleneck, it would be important to prioritize what data goes in next: prefer newer data that addresses situations still problematic in the latest version, rather than more training on problems that have already been fixed.
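One way to picture that prioritization is weighted sampling that favors recent clips tagged with still-problematic scenarios. This is a toy sketch; the clip records, scenario tags, and weighting formula are all invented for illustration:

```python
import random

# Hypothetical clip records: (scenario_tag, age_in_days)
clips = [
    ("stop_sign_hesitation", 5),
    ("stop_sign_hesitation", 40),
    ("lane_keeping", 200),
    ("school_zone", 10),
    ("lane_keeping", 300),
]

# Scenario tags the latest build still struggles with (assumed, for illustration)
problem_tags = {"stop_sign_hesitation", "school_zone"}

def priority(tag, age_days):
    """Weight clips higher if they target unresolved problems and are recent."""
    w = 10.0 if tag in problem_tags else 1.0
    return w / (1 + age_days / 30)  # gently decay weight with clip age

weights = [priority(tag, age) for tag, age in clips]
random.seed(0)  # deterministic for the example
batch = random.choices(clips, weights=weights, k=3)
```

Under this kind of scheme, a fresh stop-sign-hesitation clip vastly outweighs an old lane-keeping clip, which matches the intuition of not retraining on already-solved problems.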

Overall, it does seem like 12.x should allow for more ongoing incremental improvements without requiring engineering to focus on specific issues and figure out the appropriate code changes, so these improvements probably won't / can't be enumerated in release notes. There can still be targeted data collection and training efforts to significantly improve certain functionality, e.g., handling school zones or certain complex construction, but even those might "automatically" improve through the usual training process that 12.x so far might already somewhat handle.
 
I am sure the autopilot team is watching Omar's videos. I am certain they are aware of the excess hesitancy at 4-way stops and are working on addressing it.
Given that many employees have been driving V12 for a month, the AP team likely doesn't need Whole Mars to point out the problem areas. Tesla should have plenty of examples of driving in lots of different scenarios. Unfortunately, we only see videos from Whole Mars, who tends to post a lot of drives that are virtually identical situations.

It was good to see a drive today of V12 on a limited access highway. I wonder if he knows anyone in the suburbs to visit. The 25 mph drives through downtown are getting tedious.
 
  • Like
Reactions: AlanSubie4Life
I’ve always recognized it will take years, and I recognize it will take years with v12 as well, to get to a useful L2 City Streets feature that is more relaxing and safer than just driving oneself.

What makes me negative is people who underestimate the task ahead and claim we’re near autonomy. That just leads to disappointment.
What, no L5 robotaxis this year? I have already spent the millions Elon told us we would make with those…
 
  • Funny
Reactions: Matias
It's hard to say for sure whether there's a plateau or not. There may be, but it's also important to remember that as the accident rate gets lower and lower, it gets harder to achieve gains.
What would be a reasonable miles per accident? Tesla's highest reported number was 6.57M in 22Q1, but I would think 12.x can get much higher, especially given the current limited deployment/availability/usage of FSD Beta technology. I was highlighting the flattening because the Autopilot team was busy getting end-to-end ready for almost all of 2023, so no form of Autopilot saw much improvement, but hopefully we'll see meaningful increases to safety across the fleet this year.
 
1) 7-8 months ago, Elon's livestream showed freezing behavior at stop signs
2) Today, that behavior is still present, doesn't seem improved
It's only been 5 months (since the end of August), but the problem could have been made worse by training overweighted toward achieving complete stops. This is potentially a less urgent issue, as the driver has plenty of time to push the accelerator, unlike safety decisions where the driver might not have time to react and disengage. Specifically for the 12.1.2 hesitancy at stop signs in San Francisco, a lot of those seem to be on hills, where looking for cross traffic is quite different: it requires "looking" at different parts of the camera view, e.g., the bottom corners of the fisheye, because the main camera might be staring into the sky.
 
Tesla's highest reported number was 6.57M in 22Q1, but I would think 12.x can get much higher especially given the current limited deployment/availability/usage of FSD Beta technology.
Yes, but that number is for Autopilot, which is going to be dominated by highway miles. The V12 work is focused on addressing driving on secondary streets, where accident rates roughly triple. I imagine the miles per accident figure will drop significantly unless V12 has solid accident avoidance capabilities, whether proactive or reactive.
 
Reaction time still seems to be one second
Theoretically, if end-to-end control is making a decision at 36 frames per second, that could be 28ms from input to output. However, training on average human examples could bake human reaction times into initial 12.x behaviors. Then again, how many driving situations are truly reactionary versus anticipatory? The earlier example of potential defensive driving had context cues of adjacent lanes slowing down, so there is a lot more time to prepare, whereas your example of a traffic light changing offers less context, except maybe for things like a crosswalk countdown.
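The arithmetic behind those two figures, assuming one control decision per camera frame (the 36 fps figure is from the post; actual pipeline latency is unknown):

```python
# Assumed frame rate from the post; actual end-to-end latency is unknown.
fps = 36
frame_interval_ms = 1000 / fps  # time between decisions, ~27.8 ms

# Roughly one second of human-like reaction time, as observed in the thread
human_reaction_ms = 1000

# If learned behavior mirrors human demonstrations, the network may
# reproduce human-scale delays even though its compute loop is ~28 ms:
frames_of_hesitation = human_reaction_ms / frame_interval_ms  # ~36 frames
```

In other words, a one-second human-like pause would span roughly 36 decision cycles, so the observed latency would be a learned behavior rather than a compute limit.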
 
  • Like
Reactions: FSDtester#1
Impressive! The subsequent action in response to the green light was sad though. Win some, lose some.
You're referring to the "left" on green from Portola to Twin Peaks (where Google Maps can't decide what's the correct left turn arrow to place on the map…)? Yeah it was quite awkward but also somewhat impressive for 12.1.2 to handle such odd signals on the fly:

12.1.2 odd left green.jpg


The road markings indicate it's a left turn, but the 2 traffic lights only come into view after the car has already angled left; they are not directly in front or on the median (right edge of the screenshot). The signals are the usual circles, not turn arrows, accompanied by a special black/white sign: "WAIT FOR GREEN LIGHT." I would guess 12.x thought it was an unprotected turn, where a green circle typically means oncoming traffic has right of way. The true oncoming traffic in this case is the vehicles coming down the hill, though, so it'll be interesting to see whether 12.x correctly understands that; I was definitely confused at first.
 
  • Informative
Reactions: FSDtester#1
Do you think Tesla has a backlog of training data waiting to be used, so they'll take a trained network checkpoint to evaluate and release on a regular cycle while training the next checkpoint with the remaining data? If training capacity is the bottleneck, it would be important to prioritize what data goes in next: prefer newer data that addresses situations still problematic in the latest version, rather than more training on problems that have already been fixed.
Since it costs a lot of money (and an enormous amount of energy) to train the network even once, I would expect them to use whatever good training data they have and not hold any back. But that's just a guess.

The key, though, is the data engine that Karpathy described--the cycle. "Operation Vacation", they called it. Set up the infrastructure so that it can automatically train the network, deploy the network for testing, automatically identify trouble spots via disengagements/interventions, source video of those scenarios, review that video and pick out the good examples, autolabel them, and feed them into the training set for the next training run. It's probably difficult to "review the video and pick out good examples" without human intervention, but most of the rest can theoretically be automated.
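That cycle can be sketched as a simple loop. Everything here is a hypothetical stand-in for illustration, not Tesla's actual infrastructure; the human-review step is modeled as the one stage that filters clips:

```python
# Toy sketch of the "data engine" cycle. All function bodies are stand-ins.

def train(dataset):
    # Stand-in: pretend training just records the dataset size.
    return {"trained_on": len(dataset)}

def deploy_and_collect(model):
    # Stand-in: the fleet reports clips around disengagements/interventions.
    return [{"clip": f"clip_{i}", "disengaged": i % 2 == 0} for i in range(6)]

def select_good_examples(clips):
    # The step that's hard to fully automate: review picks useful clips.
    return [c for c in clips if c["disengaged"]]

def autolabel(clips):
    return [{**c, "label": "auto"} for c in clips]

def data_engine(dataset, iterations=2):
    for _ in range(iterations):
        model = train(dataset)                  # train on current data
        clips = deploy_and_collect(model)       # deploy, gather trouble spots
        good = select_good_examples(clips)      # review/filter
        dataset = dataset + autolabel(good)     # grow the training set
    return dataset

final = data_engine([{"clip": "seed", "label": "auto"}])
```

The point of the sketch is the closed loop: each deployment's trouble spots become the next run's training data, with human review as the main manual bottleneck.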

And yes, since you have a potential monsoon of data to work with, it makes sense to feed in video of problem situations vs just feeding in more data for scenarios that are already fine.

For example, they don't really need more video for lane-keeping purposes. FSD can pretty much hold lanes without issue nowadays.
 
You're referring to the "left" on green from Portola to Twin Peaks (where Google Maps can't decide what's the correct left turn arrow to place on the map…)? Yeah it was quite awkward but also somewhat impressive for 12.1.2 to handle such odd signals on the fly:

In isolation all of what FSD does is really impressive. It’s remarkable what it can do.

But as a functional driver assist in this context, where you constantly have to anticipate the variety of scenarios it will fail to negotiate perfectly (and it might be used where the driver is unfamiliar with the area, which makes failures even worse), it’s not very close, unfortunately.

It isn’t like there are just a few tweaks and touch-ups, and 50 iterations of training, and then all will be well. It’s way more complex than that to get to where it needs to be.

A long way from (useful, relaxing) L2 in arbitrary scenarios.
 
Last edited:
  • Like
Reactions: flutas
Don't use Darwin award winners for training. It's not that difficult if you know the rule that the person on the left has to yield. If there's a four-way tie, one can inch forward to judge the others' reactions and act safely accordingly. There are rules; some don't know them and others ignore them. So always yield to the unsafe driver if the situation calls for it.
The rule isn’t difficult; the practical application is. I haven’t won a Darwin award (yet), and I still have regular instances where I stop at a 4-way stop sign and I, along with the other drivers, am unsure of who should go first. Who stopped first, especially when no one comes to a complete stop? What if one driver arrives first but doesn’t actually stop, while another driver arrives slightly later and does come to a complete stop - who goes first, the one following the law or the one skirting it? Now throw pedestrians into the mix.

If you’re applying ‘always yield to unsafe drivers’ to the rules then you’ve decided you are using Darwin award winners for training. So now we need to figure out not just the rules but the various ways people break them.
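The textbook rule is easy to code; the messy part described above, drivers who never fully stop or who skirt the rules, is exactly what a naive encoding ignores. A minimal sketch of just the textbook part (first to stop goes first, ties yield to the right), with made-up arrival data:

```python
# Textbook four-way-stop logic only. Compass headings mark which approach
# each vehicle comes from; stop times are invented for the example.

def right_of_way(vehicles):
    """vehicles: dict of approach ('N','E','S','W') -> stop time in seconds.
    Returns the approach that goes first, or None if the rule can't decide."""
    clockwise = ["N", "E", "S", "W"]
    earliest = min(vehicles.values())
    tied = [h for h, t in vehicles.items() if t == earliest]
    if len(tied) == 1:
        return tied[0]  # clear first-to-stop

    def on_right(h):
        # A car approaching from N faces S, so its right side is the W approach:
        # the previous entry in clockwise order.
        return clockwise[(clockwise.index(h) - 1) % 4]

    # Tie-break: you yield to the tied vehicle on your right, so the vehicle
    # with no tied vehicle on its right goes first.
    for h in tied:
        if on_right(h) not in tied:
            return h
    return None  # full four-way tie: the rule gives no answer, drivers negotiate

print(right_of_way({"N": 1.0, "E": 2.0}))  # N stopped first -> "N"
print(right_of_way({"N": 1.0, "W": 1.0}))  # tie; N yields to W on its right -> "W"
```

Note the `None` case: even the clean textbook rule is undecidable for a simultaneous four-way tie, which is before you add rolling stops, ambiguous arrival order, or pedestrians.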
 
  • Like
Reactions: FSDtester#1
Drove through this intersection last week - not surprisingly, FSD had some problems. The 2nd picture is taken from the approach indicated by the green arrow on the map; when waiting at the lights, the two roads are almost parallel. Some of the lights have louvers on them, but not all, and even with the louvers you can see both lights at the same time. If you look at the picture, the top center lights show that the green-arrow traffic has a green light, but the lights on the right make it look like they have a red light. When I drove it, the louvers were barely visible, and I had to infer that the light on the right was for my lane and the light on the left was for the road to my left, because the lights were about equally visible.
IMG_0286.jpeg
IMG_0284.png
 
The cars got separated, so it was hard to do a true side-by-side comparison, but in the parts I watched, v12 seemed to do noticeably better.
Yeah, realistically you can't get much of a better comparison. Both v11 and v12 had some tricky scenarios to deal with and both in roughly the same conditions.

The things that stuck out for me were:
1. v11 did not properly slow for the speed bump, v12 did.
2. v11 had some jerky steering inputs and speed controls. v12 seemed smooth throughout.
3. v11 hesitated much more at stop signs, frequently getting stuck until there was accelerator input from AI DRIVR. v12 required (I think?) no interventions the whole way. At one stop-sign intersection in particular, a vehicle to the right started to roll into the intersection, but v12 assertively (and correctly) continued through. v11 didn't encounter the same scenario, but I'm pretty sure it would have just stopped until the other car passed through.

v12 was a clear winner. I don't think any rational person would argue otherwise.