I suspect the release of FSD to everyone who bought or subscribed (in November, I think), without requiring a Safety Score, had more to do with the lawsuits, O'Dowd, and the DMV and NHTSA actions.
Not sure why it would be related to any of that. I suspect it was about realizing revenue at a convenient time, and the system being safe enough as implemented, based on data. Of course we have no idea how safe it actually is (other than an upper limit), since there is no public data on it, but internally they know.

The challenge will be keeping it safe enough and handling performance on the edge cases, which of course there are not nearly enough miles driven to find yet.
 
Starting to roll?

[attached screenshot, 2023-03-19]
 
I don't know which piece of this equation is still using traditional code over neural networks, but I guess this tweet means there's more yet to switch over:

Seems like getting rid of the human-engineered vector space is gonna be the next big barrier that has to come down. If I was forced to drive using only vector space it would be terrifying and bad. It should be vision -> control with no human understandable features in the middle.
 
Seems like getting rid of the human-engineered vector space is gonna be the next big barrier that has to come down. If I was forced to drive using only vector space it would be terrifying and bad. It should be vision -> control with no human understandable features in the middle.

I was actually thinking more about this, and I really appreciate Tesla's approach.

If you've got one network for perception, and one network for planning, and one network for control, you can pinpoint where a problem stems from and debug a part of the system without breaking other parts.

How would you debug an end-to-end vision to control network? Throw more training samples at it and pray?
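
For concreteness, here's a toy sketch in Python of what I mean by a modular stack (the names, types, and shapes are invented for illustration, nothing to do with Tesla's actual code): each stage hands off a human-readable intermediate that you can log and inspect when something goes wrong.

```python
# Toy modular stack: perception -> planning -> control, with inspectable hand-offs.
# All names, types, and numbers are invented for illustration.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class DetectedObject:
    kind: str                          # e.g. "car", "pedestrian"
    position_m: Tuple[float, float]    # (x, y) in the ego frame, meters
    velocity_mps: Tuple[float, float]

@dataclass
class Trajectory:
    waypoints_m: List[Tuple[float, float]]   # path the planner wants to follow

def perception(camera_frames) -> List[DetectedObject]:
    # NN #1: images -> object list. Debuggable by comparing against labeled frames.
    return [DetectedObject("car", (12.0, 1.5), (-2.0, 0.0))]   # stub output

def planning(objects: List[DetectedObject]) -> Trajectory:
    # NN #2: object list -> trajectory. Debuggable by replaying logged object lists.
    return Trajectory([(0.0, 0.0), (5.0, 0.2), (10.0, 0.4)])   # stub output

def control(trajectory: Trajectory) -> dict:
    # NN #3 (or a classical controller): trajectory -> actuation commands.
    return {"steer_rad": 0.01, "accel_mps2": 0.3}              # stub output

def drive_step(camera_frames):
    objects = perception(camera_frames)    # inspectable: what did the car "see"?
    trajectory = planning(objects)         # inspectable: what path did it choose?
    commands = control(trajectory)         # final actuation
    return objects, trajectory, commands   # log all three for triage

print(drive_step(camera_frames=None))
```

If the car does something weird, you can diff the logged object list against the video, or replay it through the planner, without touching the other stages.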
 
I was actually thinking more about this, and I really appreciate Tesla's approach.

If you've got one network for perception, and one network for planning, and one network for control, you can pinpoint where a problem stems from and debug a part of the system without breaking other parts.

How would you debug an end-to-end vision to control network? Throw more training samples at it and pray?
You can only look at the system in a holistic way. You can look at the dashcam footage and reason about why it did what it did, but you can't examine the intermediate feature space, because it won't be human-understandable. That's why deep learning systems are so powerful: they don't have hand-chosen intermediate layers that engineers picked out.

It's like how people used to build vision-based car detectors a decade ago, before deep learning: you'd make a detector for wheels, doors, hoods, and other car-like features that humans picked out, and then decide where a car is based on those detected features. That's crazy talk today; no one would do it. And yes, we've given up the ability to debug why a car wasn't detected at the feature level. We can only look at the dashcam recording and the final output.

Also, it's no different from human drivers: you can ask a human after a crash what happened, and they will just make up some story after the fact. They can't access their intermediate mental states and tell you. But the dashcam footage tells you all you need to know.
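
For contrast, here's a toy end-to-end sketch in PyTorch (the architecture and sizes are invented, purely illustrative): pixels go in, steering and throttle come out, and the stuff in the middle is just anonymous activations nobody can name.

```python
# Toy end-to-end network, pixels -> controls. Invented architecture, for illustration only.
# There is no object list or trajectory inside to inspect, just anonymous activations.
import torch
import torch.nn as nn

class EndToEndDriver(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(            # vision backbone
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(               # straight to actuation
            nn.Linear(32, 64), nn.ReLU(),
            nn.Linear(64, 2),                    # [steering, throttle]
        )

    def forward(self, frames):                   # frames: (batch, 3, H, W)
        latent = self.encoder(frames)            # not human-interpretable
        return self.head(latent)

model = EndToEndDriver()
controls = model(torch.randn(1, 3, 224, 224))    # fake camera frame
print(controls)   # why these exact values? the only evidence is the input video itself
```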
 
It should be vision -> control with no human understandable features in the middle.

That's called the "end-to-end" approach. Wayve is trying it. They have some nice demos where the car goes around a parked car or navigates a roundabout. But there is a big difference between training the car to perform certain driving tasks and achieving the reliability where you could drop the car anywhere and let it drive fully driverless. So far, nobody has made it work reliably at scale. It is a very difficult approach for several reasons. For one, it is very hard to train a single NN to figure out all of planning and control directly from vision alone. Second, it is very hard to troubleshoot, since there are no human-understandable features in the middle, so you can't pinpoint where the failure happened. It is a black box. If there is a failure, you have to retrain the entire NN until it fixes the problem and hope you don't introduce a new failure.

Having said that, the trend has been towards consolidating NNs into fewer but bigger NNs. So we may eventually get to "end-to-end" some day, but doing "end-to-end" from the start has not been solved yet.
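
Just to illustrate what "consolidating into fewer but bigger NNs" can look like, here is a toy multi-head sketch (invented architecture and sizes, not Tesla's actual network): one shared vision backbone feeding several task heads that still emit nameable, checkable outputs.

```python
# Toy "consolidated" network: one shared backbone, several task heads. Invented sizes.
# Fewer separate NNs than a fully modular stack, but each head still emits a nameable output.
import torch
import torch.nn as nn

class SharedBackboneNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(                  # shared vision features
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.object_head = nn.Linear(32, 10)            # e.g. object-class logits
        self.lane_head = nn.Linear(32, 4)               # e.g. lane-geometry parameters
        self.occupancy_head = nn.Linear(32, 64)         # e.g. coarse occupancy grid

    def forward(self, frames):
        feats = self.backbone(frames)
        return {
            "objects": self.object_head(feats),
            "lanes": self.lane_head(feats),
            "occupancy": self.occupancy_head(feats),
        }

net = SharedBackboneNet()
outputs = net(torch.randn(1, 3, 224, 224))
print({name: tensor.shape for name, tensor in outputs.items()})
```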

I was actually thinking more about this, and I really appreciate Tesla's approach.

If you've got one network for perception, and one network for planning, and one network for control, you can pinpoint where a problem stems from and debug a part of the system without breaking other parts.

How would you debug an end-to-end vision to control network? Throw more training samples at it and pray?

That's one of the big challenges with the "end-to-end" approach. It is very hard to troubleshoot, since it is one big black box; there is no way to pinpoint what caused a failure. So yes, the only thing you can do is retrain the entire NN until it fixes the problem and hope you don't introduce new bugs or failures. That is one reason the AV companies keep their stacks modular: it lets them troubleshoot better. You can build a NN for one task like perception and test and validate it until it is reliable, build another NN for a task like behavior prediction and train and validate it, and then put them together once you know each separate NN is reliable enough.
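
A toy sketch of that per-module validation workflow (the metric, threshold, and data are all made up, just to show the flow): each module has to clear its own bar on its own test set before it gets integrated.

```python
# Toy per-module validation: score a perception model against labeled frames on its own,
# before it is ever wired into planning. Dataset, metric, and threshold are all invented.
def evaluate_perception(model, labeled_frames, required_recall=0.99):
    hits, total = 0, 0
    for frame, truth_objects in labeled_frames:
        detected = set(model(frame))
        total += len(truth_objects)
        hits += sum(1 for obj in truth_objects if obj in detected)
    recall = hits / max(total, 1)
    return recall >= required_recall, recall

# Dummy stand-ins just to show the flow: a fake "model" and two labeled frames.
dummy_model = lambda frame: ["car", "pedestrian"]
dataset = [("frame_001", ["car"]), ("frame_002", ["car", "pedestrian"])]
passed, recall = evaluate_perception(dummy_model, dataset)
print(passed, recall)   # integrate the module only after it clears its own bar
```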
 