Whether the end-to-end system is explicitly one network, or multiple networks chained together, does not matter as much, as long as it allows backprop to flow gradients from one part to the next.
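To make that concrete, here's a toy PyTorch sketch (all shapes and names made up, nothing to do with Tesla's actual stack): two separate nets chained in one graph still train end to end, because backward() pushes gradients through both.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins: a "perception" net feeding a "planner" net.
perception = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten())            # photons -> features
planner = nn.Sequential(
    nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))    # features -> steer/accel

frames = torch.randn(4, 3, 64, 64)    # stand-in camera batch
target = torch.randn(4, 2)            # stand-in human control labels

# Chaining the modules puts them in one autograd graph...
loss = nn.functional.mse_loss(planner(perception(frames)), target)
loss.backward()                       # ...so gradients reach BOTH nets
print(perception[0].weight.grad is not None)   # True
```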

I mean, that is kind of the question people want answered, and it’s not clear which we will get - or how much “glue” the “nothing but nets” approach has. There’s a big difference between mapping just photons and map info (and other input modes) directly to outputs, and something much more observable, with nets that perform specific tasks and then interface to other nets. One is “actual end-to-end” (by a reasonable definition) and the other is something else (which could be called “end-to-end, but in pieces”).

In the end maybe it doesn’t matter as long as it gets the job done, but people are still allowed to be curious about implementation details. There seem to be a lot of people who think it is fully end to end…but that seems extremely unlikely to me!

Wow I can't believe there is a conspiracy that V12 isn't really end to end!
There’s no conspiracy, just uncertainty about what we are going to get, and an attempt to get clarity on terminology. It seems fairly clear there is some sort of “end-to-end” vehicle planning/control module that they are working on. (But this is not “end-to-end” driving as discussed.) This could be an incremental change. Does it use the outputs of the occupancy network, perception, and lanes? Or is it all combined together (true end-to-end style)? Are the pieces trained separately? (The discussion above provides some info.)

To me it seems like the incremental approach makes sense, and making this incremental change to the system is a good way to approach development while minimizing regressions.
Question is - how does end-to-end work in terms of routing?

Because - lane selection is probably the major cause of disengagements.

How will, or even can, e2e make it better? That's not something it can learn from videos ...
It’s multi-modal, so it will presumably take input from navigation, the lanes network, the occupancy network, and the perception network.

End to end, baby.
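Purely to illustrate (all names and dimensions made up), multi-modal conditioning can be as simple as concatenating feature vectors from the upstream nets before a planner head:

```python
import torch
import torch.nn as nn

class FusionPlanner(nn.Module):
    """Hypothetical planner fed by several upstream feature sources."""
    def __init__(self, vision_dim=256, occ_dim=128, lanes_dim=64, nav_dim=16):
        super().__init__()
        fused = vision_dim + occ_dim + lanes_dim + nav_dim
        self.head = nn.Sequential(
            nn.Linear(fused, 256), nn.ReLU(), nn.Linear(256, 2))  # steer, accel

    def forward(self, vision, occupancy, lanes, nav):
        # Fusion here is just concatenation of the per-modality features.
        return self.head(torch.cat([vision, occupancy, lanes, nav], dim=-1))

planner = FusionPlanner()
out = planner(torch.randn(1, 256), torch.randn(1, 128),
              torch.randn(1, 64), torch.randn(1, 16))
print(out.shape)   # torch.Size([1, 2])
```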
 
TBH, I'd be happy with an update that fixed auto wipers. A few releases ago, mine worked nearly perfectly: a dry wipe once a month or so, and near-perfect speed selection for actual rain. 20% of the time I'd need to hit the left stalk button to kick-start the wipers in extremely light sprinkle conditions.

Then we got FSD 11.4.x, and ever since, my auto wipers are worse than worthless. Multiple dry wipes every drive, and when it does rain they almost never start on their own. Today a torrential downpour started: no wipers. I thought I must have left them in manual. Nope, they were in auto. Even after kick-starting them, they'd stop, even though it was raining pretty hard. It was almost funny; usually I get the "poor weather detected, FSD may be degraded" message, but not today. I know the front camera couldn't see a thing, I saved the video clip. Still, no wipers and no FSD degraded. Finally it stopped raining, and, yep, you guessed it, the wipers then came on.

All I care about in 11.4.8 is hoping the damn auto wipers work. I'll worry about FSD again when V12 comes out.
 
Taking inputs isn't the issue... how can end to end learn which lane to take, when that depends on the route ...

I'd love to read a white paper on how they are doing the end to end.
My completely non-technical layman outsider interpretation of it:

The direction the planner needs to go is just an input, like a sign or anything else. It's not static, of course, but plenty of inputs aren't static, so that seems fine. "Be in this lane because this is the way you need to go." I think it needs to learn to be in the correct lane for the requested route; I don't think it has to "learn which lane to take," since that depends on the route, as you say.
It might not even need to know anything about lanes - all that info could come from the lanes network.
There was discussion earlier here about how that might all work and whether these things would need to be trained together, or whatever.

It definitely seems very complicated but it might well be simpler than writing code for everything. I can't see how it will ever work to the required reliability level, but here we are. I don't think anyone knows whether that is possible.
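A toy sketch of that idea, in the style of command-conditioned imitation learning (names made up, not Tesla's actual design): the route command is embedded and concatenated with the visual features, so the direction really is "just an input":

```python
import torch
import torch.nn as nn

COMMANDS = ["follow_lane", "turn_left", "turn_right", "straight"]

class CommandConditionedPolicy(nn.Module):
    def __init__(self, feat_dim=512, cmd_dim=32):
        super().__init__()
        self.embed = nn.Embedding(len(COMMANDS), cmd_dim)  # route command -> vector
        self.head = nn.Sequential(
            nn.Linear(feat_dim + cmd_dim, 256), nn.ReLU(), nn.Linear(256, 2))

    def forward(self, features, command_idx):
        cmd = self.embed(command_idx)                  # the route, as an input
        return self.head(torch.cat([features, cmd], dim=-1))

policy = CommandConditionedPolicy()
feats = torch.randn(1, 512)                            # stand-in visual features
cmd = torch.tensor([COMMANDS.index("turn_left")])
print(policy(feats, cmd).shape)   # torch.Size([1, 2])
```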
 
“Next generation” may be V13 for all we know, but it sure sounds like an e2e effort. Will be interesting to see the results.

Well, I think V12 is the beginning of this end-to-end approach. Whether V12 will be enough or whether it will take V13 to finish remains to be seen.

The single-model end-to-end approach is new and seems to offer a lot of promise; the execution will be the hard part. There is no doubt that it can do autonomous driving - the question is how reliable it can get. Can it reach 99.999999% reliability, where we would be able to remove driver supervision? I am also interested to see the results and how good/reliable Tesla is able to make it.

Question is - how does end-to-end work in terms of routing?

Because - lane selection is probably the major cause of disengagements.

How will, or even can, e2e make it better? That's not something it can learn from videos ...

My guess is there needs to be a router, separate from the end-to-end stack, that determines the route and then tells the end-to-end stack which way the car needs to go. The end-to-end stack then does the minute-by-minute driving to follow the route given to it by the router, so all lane changes needed to follow the route would be handed to the end-to-end stack by the router.
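Something like this division of labor, to be clear entirely speculative (all names made up):

```python
from dataclasses import dataclass

@dataclass
class RouteInstruction:
    maneuver: str        # e.g. "turn_left", "keep_right", "continue"
    distance_m: float    # distance remaining until the maneuver

def plan_route(position, destination, road_graph) -> list[RouteInstruction]:
    """Classical (non-learned) router: graph search over map data, e.g. A*."""
    ...

def e2e_drive(camera_frames, next_instruction: RouteInstruction):
    """Learned stack: sensors + current route instruction in, controls out.
    Lane selection to satisfy the instruction happens inside the net."""
    ...
```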
 
This doesn’t seem right, since the router would need to know what lanes were present - which is not knowledge that a map can reliably contain.
I'd characterize this as an exception. Even the Google router tells you which lanes to use (esp. when turning).

So, FSD can give the NN routes with lane info - and the NN has to figure out what to do if the lane info is incorrect...
 
I'd characterize this as an exception. Even the Google router tells you which lanes to use (esp. when turning).

So, FSD can give the NN routes with lane info - and the NN has to figure out what to do if the lane info is incorrect...
Sure, but incorrect lane info is extremely common.

Yes. It needs to be able to drive with lane info that is completely wrong, which is (usually) easy for humans.
 
So, FSD can give the NN routes with lane info - and the NN has to figure out what to do if the lane info is incorrect...
Right, Tesla has tried to improve on incorrect/missing map data with their Lanes network / "deep lane guidance" predicting lane counts and connectivity, but that too has inaccuracies that result in wrong lane selection with 11.x. Part of the problem is that the existing control heuristics trust the lane data too much even when reality is visually obvious, e.g., there are 2 main straight lanes and an upcoming temporary turn lane forks off from them (but the map data only knows there's an upcoming left-turn-only lane, so the heuristics conservatively move out of the current left-most lane).

Neural network control avoids the strict heuristics by learning when not to rely on the lanes data -- potentially not even using that data ("best part is no part"). The inputs still need to include the destination and presumably traditional navigation routes, e.g., turn left in 1 mile, and given many videos the network can learn that it should generally be in the left lane for an upcoming left turn, with enough counter-examples to know not to turn left at earlier intersections, etc.

Even navigation that strictly requires map data can be wrong, especially for unmapped parking lots, so if we see 12.x driving to a destination pin even when navigation doesn't know the turns, that is probably an even stronger signal that end-to-end has learned something more general.
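One hypothetical way (not Tesla's actual design) a trained net can end up "learning when not to rely on the lanes data": a learned gate where vision features produce a trust weight applied to the lane/map features, so training on examples where the map was wrong pushes the gate toward zero in those situations.

```python
import torch
import torch.nn as nn

class GatedLaneFusion(nn.Module):
    """Vision decides how much to trust the (possibly wrong) lane/map data."""
    def __init__(self, vision_dim=256, lane_dim=64):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(vision_dim, lane_dim), nn.Sigmoid())
        self.head = nn.Linear(vision_dim + lane_dim, 2)   # steer, accel

    def forward(self, vision, lane_feats):
        trust = self.gate(vision)              # per-feature weight in [0, 1]
        return self.head(torch.cat([vision, trust * lane_feats], dim=-1))

model = GatedLaneFusion()
print(model(torch.randn(1, 256), torch.randn(1, 64)).shape)  # torch.Size([1, 2])
```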
 
Taking inputs isn't the issue... how can end to end learn which lane to take, when that depends on the route ...

I'd love to read a white paper on how they are doing the end to end.

FSD E2E simplicity has to be offset with complexity somewhere; I would guess training and debugging are the biggies. Intermediate nets will likely need to be trained separately, and of course the E2E system still needs to be trained, so training resources and time will jump. In theory modular designs make debugging easier, but I bet it turns into a can of worms. If they didn't have a good handle on V11 debugging ...

I'm still a bit apprehensive about risks of phantom steering, braking, acceleration, path, ...
 
The inputs still need to include the destination and presumably traditional navigation routes, e.g., turn left in 1 mile, and given many videos the network can learn that it should generally be in the left lane for an upcoming left turn, with enough counter-examples to know not to turn left at earlier intersections, etc.
I’m going a step further.

FSD needs to learn which turn lane to take when there are multiple turn lanes and a quick turn after that. Frequently FSD uses the wrong turn lane - e.g., if there is a left turn with two turn lanes and then a right turn in 1/2 block to the freeway entrance, FSD may select the left turn lane instead of the right one. Then it struggles to get onto the freeway and gets stuck in an awkward position, requiring you to rescue it.

I guess they can feed enough such videos to make the NN connect the correct left-turn lane to the upcoming right turn. They would have to do this kind of special training for a number of situations - which again comes back to the limitation of what situations programmers can think of and gather enough examples of for training…

In other words, the long-tail problem has been pushed from heuristics to training.

I see the value of an end-to-end NN for second-by-second driving - but longer-term planning, I think, still needs heuristics … or rather, end to end would be more difficult to get right than heuristics. So I expect better second-to-second driving with end to end, but no better (or worse) lane selection based on the route.
 
The whole video-in, controls-out paradigm doesn't make sense to me from an implementation POV. Like evnow pointed out, there are situations where it's not clear how this approach will improve or solve the problem.

It's unclear how an all-nets approach will understand implicit human decision making. How will it understand that I made a lane change because I'm avoiding an arbitrary obstruction or situation, vs. navigating to my destination, vs. fixing a mistake I made earlier?
 
Any reports of the V12 employee release expanding beyond the initial 100 cars? It's been pretty quiet. I suspect this release has already gone back for retraining. If so, it will be interesting to see how long the recycle time is.

Teslascope is starting to walk back their optimism. Now saying that V12 will be end of the month, at the earliest, but could easily slip to Jan or Feb. They aren't saying anything about the current version expanding.
What is the typical date for the main code holiday release each year?
 
I guess they can feed enough such videos to make the NN connect the correct left-turn lane to the upcoming right turn
Both the videos and the navigation inputs are needed to train these differences, and even then it depends very much on how navigation is encoded for the neural networks. Ideally a left turn followed by a right turn, whether in 500 feet into a parking lot or in 1000 feet onto a highway slip lane, should look similar enough that the car gets into the outer left-turn lane. Similarly, for routes that just continue after the left turn, examples from the fleet could result in the network learning that people generally choose whichever lane has fewer vehicles, and this is probably reasonable behavior except for cases where the turn's destination lane is forced to turn.
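A toy example of why the encoding choice matters (features entirely made up): if upcoming maneuvers are encoded as (type, log-distance) pairs, the 500-foot and 1000-foot right turns land close together in feature space, so the net can generalize to the same outer-lane choice.

```python
import math

MANEUVERS = {"continue": 0, "turn_left": 1, "turn_right": 2}

def encode_nav(next_two):
    """next_two: list of (maneuver, distance_ft) for the next two maneuvers."""
    feats = []
    for maneuver, dist_ft in next_two:
        feats.append(MANEUVERS[maneuver])
        feats.append(math.log1p(dist_ft))   # compress the distance scale
    return feats

a = encode_nav([("turn_left", 200), ("turn_right", 500)])    # into parking lot
b = encode_nav([("turn_left", 200), ("turn_right", 1000)])   # onto slip lane
# a and b differ only slightly in the last feature, so the network sees two
# similar situations and can learn the same "outer left-turn lane" behavior.
```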

I do agree that this seems tricky for end-to-end, as it can also be tricky for humans. Around here, there are various freeway offramps with a double left turn to cross under the freeway that then immediately forks to a left turn back onto the freeway. 11.x unnecessarily avoids the inner left-turn lane, thinking it'll be forced back onto the freeway, and 10.x was often bad at completing the turn from the outer left-turn lane. The department of transportation tries to repaint guiding stripes to help people turn into the correct lanes, but clearly people keep driving over them, causing them to fade.

At least in the livestream, it got into a left-turn lane and kept right when it forked into a double left turn. That particular turn merges immediately after, so the lane selection didn't matter as much. Obviously this has been an ongoing issue with 11.x, so maybe that not only means 12.x's initial release quality bar will be no better here, but also hopefully means we can try it out sooner. There's potential for a general end-to-end data engine to "naturally" resolve this type of disengagement, assuming a generic feedback loop, but Tesla could also dedicate resources to explicitly finding examples and collecting dedicated training data to address this family of issues.
 