
Autonomous Car Progress

I don't understand. The very purpose of this effort is to give people an idea of what the model is doing. The fact that it said it was doing something it wasn't tells us there's a bug, either in the text-generation system or in the driving system. If FSD explained what it was doing, we'd know right away why it was driving at a particular speed, why it refused to change lanes, or why it does all the other odd things it does.
My point is that an E2E system doesn't "know" anything. It doesn't know there's a bike or a car or a lane to the right or whatever. It's just a huge equation with a billion fixed coefficients. This language engine, for lack of a better word, simply makes stuff up.
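To make that concrete, here's a toy sketch (my own illustration, nothing to do with Tesla's actual stack) of what "a huge equation with fixed coefficients" means. The weights are random placeholders; a real network would have billions of them, but the point stands either way: it's a fixed function from pixels to controls, with no internal symbol for "bike" or "lane" anywhere.

```python
import numpy as np

# Toy stand-in for an end-to-end driving network. Once trained, it's just
# a fixed function: same frame in, same numbers out, every time.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((64, 4096))   # fixed coefficients, layer 1
W2 = rng.standard_normal((2, 64))      # fixed coefficients, layer 2

def drive(pixels):
    """Map a flattened camera frame to (steering, throttle)."""
    hidden = np.maximum(0.0, W1 @ pixels)   # ReLU nonlinearity
    steering, throttle = W2 @ hidden
    return steering, throttle

frame = rng.standard_normal(4096)  # fake 64x64 camera frame, flattened
print(drive(frame))
```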

Here's a question: how do you train the language engine? You train NNs by feeding in a bunch of training data and tweaking the coefficients until you get close to the "right" output. With image recognition, the right output is what a human curator says each training image represents. With a braking NN, the right output could be the lowest constant g-force that stops the car at the stop sign. Roughly, the loop looks like the sketch below.
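This is just a minimal toy version of supervised training, not how any production system works: start with some coefficients, compare the output to the curator-provided "right" output, and nudge the coefficients to shrink the error.

```python
import numpy as np

# Minimal supervised-training sketch: tweak the coefficients until the
# output is close to the "right" output in the training data.
rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 8))       # training inputs
y = X @ rng.standard_normal(8)           # curator-provided "right" outputs
w = np.zeros(8)                          # coefficients to be tweaked

for step in range(500):
    pred = X @ w
    grad = 2 * X.T @ (pred - y) / len(X)  # gradient of mean squared error
    w -= 0.05 * grad                      # nudge coefficients toward "right"

print(np.mean((X @ w - y) ** 2))          # error shrinks toward zero
```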

What's the right output for this language NN? What a human curator says, as with image recognition? Probably. But the human curator is just guessing! So they trained the NN to guess like a human. It might make the rider feel better, but it doesn't represent the actual driving decisions.
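In other words, the training pairs would look something like this (a hypothetical example of my own; the clip names and labels are made up to show the shape of the data):

```python
# Hypothetical training pairs for the explanation model. The "right"
# answer for each clip is whatever a human labeler *guessed* the car was
# doing; there is no ground-truth channel from the planner to check against.
training_pairs = [
    {"scene": "clip_0001", "label": "Slowing for pedestrian entering crosswalk"},
    {"scene": "clip_0002", "label": "Yielding to cyclist on the right"},
    {"scene": "clip_0003", "label": "Holding lane; adjacent lane blocked"},
]

for pair in training_pairs:
    # The text head is trained to reproduce the curator's guess per scene.
    print(pair["scene"], "->", pair["label"])
```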
 
It's just a huge equation with a billion fixed coefficients.

Relevant XKCD:

[attached image: xkcd comic]
 
What's the right output for this language NN? What a human curator says, as with image recognition? Probably. But the human curator is just guessing! So they trained the NN to guess like a human. It might make the rider feel better, but it doesn't represent the actual driving decisions.
Have you watched that video? It describes what the car is doing, and why. Folks picked up on the fact that a pedestrian had to hustle to get across the road before the car arrived, while the car happily pronounced the road clear (7:00 mark). Is that a flaw in the description or in the control system? I'd claim the latter: the description system correctly reported what the car thought, and that's invaluable for understanding how the control system works. The car considered the road clear because the pedestrian was hustling to move off the driving surface. FSD should have slowed slightly instead, and that would certainly have been more reassuring than assuming the pedestrian would be clear by the time the car arrived. But that's a control problem, not a description problem.