clydeiii
Member
Here’s hoping they do tons of A/B testing and make good choices
> I can see why you got confused by what Elon said. What he said is very confusing and easy to misinterpret. They are not throwing out all of their existing work, and they are not doing something that is utterly impossible going from V11 to V12. One needs to be really careful in interpreting what Elon says. In this case the full quote is: "v12 is reserved for when FSD is end-to-end AI, from images in to steering, brakes & acceleration out." I highlighted the crucial part: end-to-end AI. The only possible thing this can mean is that it will be all NNs, because they have finally converted the last pieces of heuristic code over to a NN.

What he said is very clear. The misinterpretation is on you. The crucial part is:

> v12 is reserved for when FSD is end-to-end AI, from images in to steering, brakes & acceleration out.

That is what e2e means. Images in, control out.
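To make the "images in, control out" reading concrete, here is a toy pure-Python sketch (nothing like Tesla's real stack; every name, size, and weight here is invented): a single learned function maps pixels straight to controls, with no hand-written rules anywhere in between.

```python
import random

# Toy sketch, NOT Tesla's architecture: "end-to-end" means one learned
# function from pixels to controls, with no heuristic code in between.
random.seed(0)

IMG_PIXELS = 16 * 16   # stand-in for camera input
HIDDEN = 8             # arbitrary hidden width for the toy network
CONTROLS = 3           # steering, brake, acceleration

# Random weights stand in for what training on human driving video would learn.
w1 = [[random.uniform(-0.1, 0.1) for _ in range(IMG_PIXELS)] for _ in range(HIDDEN)]
w2 = [[random.uniform(-0.1, 0.1) for _ in range(HIDDEN)] for _ in range(CONTROLS)]

def relu(x):
    return x if x > 0 else 0.0

def end_to_end_policy(image):
    """Images in, control out: no planner or rule engine between input and output."""
    h = [relu(sum(w * p for w, p in zip(row, image))) for row in w1]
    return [sum(w * a for w, a in zip(row, h)) for row in w2]  # [steer, brake, accel]

frame = [random.random() for _ in range(IMG_PIXELS)]
steer, brake, accel = end_to_end_policy(frame)
```

The point of the sketch is only the shape of the claim: if v12 is literally this, there is no separate module computing an occupancy grid or a visualization.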
> What he said is very clear. The misinterpretation is on you. The crucial part is: "v12 is reserved for when FSD is end-to-end AI, from images in to steering, brakes & acceleration out." That is what e2e means. Images in, control out. You can speculate on what he meant or that he misspoke, but the language he used has a specific meaning and is not easy to misinterpret.

We will see. When V12 is released and we lose on-screen visualization, the occupancy network, and all the other great stuff they've been working on for years, then I will humbly apologize. If those things remain, then clearly V12 will be using multiple NNs tied together, not one big NN as per your image.
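The difference between the two readings can be sketched in a few lines of toy Python (hypothetical stand-ins, not Tesla's code): in a "multiple NNs tied together" design, intermediate outputs such as an occupancy grid still exist, which is exactly what an on-screen visualization would need.

```python
# Hypothetical sketch of "multiple NNs tied together" (all names made up).
# Each function stands in for a trained network; the point is only that a
# modular design keeps inspectable intermediate outputs, unlike a single
# monolithic pixels-to-controls network.

def perception_nn(images):
    # Stand-in for a perception network producing an occupancy grid.
    return {"occupancy_grid": [[0, 1], [0, 0]]}

def planning_nn(world_state):
    # Stand-in for a planning/control network consuming that grid.
    return {"steer": 0.0, "brake": 0.1, "accel": 0.0}

def modular_fsd(images):
    world = perception_nn(images)            # intermediate output survives...
    controls = planning_nn(world)
    return controls, world["occupancy_grid"] # ...so a visualization can use it

controls, grid = modular_fsd(images=[0.0] * 4)
```

Under this reading, everything is still a NN end to end, yet the visualization and occupancy network don't have to disappear.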
> We will see. When V12 is released and we lose on-screen visualization, the occupancy network, and all the other great stuff they've been working on for years, then I will humbly apologize. If those things remain, then clearly V12 will be using multiple NNs tied together, not one big NN as per your image.

That Elon Musk is often wrong, exaggerating, or misrepresenting things is not up for debate. But what he said is not open to misinterpretation. It has a specific meaning. He said v12 is reserved for when FSD is end-to-end AI, from images in to steering, brakes & acceleration out. Could he be entirely wrong again, lying, misrepresenting things? Yes. But in that case, what that tells us is that Elon Musk is wrong again.
That said, here’s a thought experiment: if the sensor suite is now limited to vision only - “Tesla Vision” - because that is how humans drive, but the system is now end-to-end NNs trained from video of humans driving, with no sign reading or rules programmed into the system - absolutely NOT the way humans learn to drive or actually drive - then how is it that we think the system can be an order of magnitude safer than humans? Isn’t it the case that the most we can hope for here is that the system is 100% as safe as the human drivers in the selected videos?
> Isn’t it the case that the most we can hope for here is that the system is 100% as safe as the human drivers in the selected videos?

Watch the YouTube dash cam videos available. The vast majority of them are the result of a lack of situational awareness (texting, not checking blind spots), ignorance of traffic rules (assuming right of way), or physical impairment (drunk, fatigued, medical event). So the driver who is paying attention, knows the traffic rules, and isn't drunk or tired is very rarely going to have problems. A vision-based autonomy system can do that at all times, making it considerably safer than human drivers taken as a whole.
> That said, here’s a thought experiment: if the sensor suite is now limited to vision only - “Tesla Vision” - because that is how humans drive, but the system is now end-to-end NNs trained from video of humans driving with no sign reading or rules programmed into the system - absolutely NOT the way humans learn to drive or actually drive - then how is it that we think the system can be an order of magnitude safer than humans? Isn’t it the case that the most we can hope for here is that the system is 100% as safe as the human drivers in the selected videos?

Perhaps the AV system won't make the 'human errors' related to being distracted by non-driving things, driving while intoxicated, or just ignoring traffic regulations. That would go a long way toward being safer than the average human.
So I for one am not excited about the pending v12 release with full-stack NNs. There's going to be a ton of regression here and lots of opportunities for the system to veer away (no pun intended) from being a polished L3 autonomous driving system. Hopefully, this isn't more of Elon's "goal" of an L5 robotaxi, which I think everybody who drives these knows (if only deep down inside) is never going to happen. I think it's time for Tesla to start thinking about picking a realistic goal and then making moves using everything available to take this product over the finish line and call it done. It can't be a work-in-progress forever, right?
And maybe just being 2-3x better than the average will be good enough for regulators. But I was always under the impression that we wanted AVs to be better than the top human drivers, not just better than the average. If you train on the best human drivers, then I don't see how the system will be better than the best driver. It can't be better than its training.
> It could work. Assume there are no perfect human drivers. And assume the top human drivers don’t all have the same failings. Then the specific failing of one subset of top drivers would be down-voted by the more numerous remaining top drivers’ examples without that failing. The same thing would happen for each specific failing. The result could be a learned driving pattern with no failings. Basically, you sand off the bad spots.

It's interesting, right, because DeepMind created AlphaGo by giving it knowledge of many human games, and AlphaGo ended up able to defeat the world champion. Since then, AlphaZero has thrown out the human games altogether and simply learns from itself.
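The "sand off the bad spots" argument can be illustrated with a toy majority vote (made-up drivers and situations; real training would average behavior in a far messier way): each top driver has one idiosyncratic failing, but no failing is shared by a majority, so the aggregated behavior per situation comes out clean.

```python
from collections import Counter

# Toy illustration of "sanding off the bad spots" (invented data):
# each driver has exactly one bad habit, and no bad habit is shared
# by a majority, so the most common behavior per situation is correct.
drivers = {
    "A": {"stop_sign": "stop", "yellow_light": "run_it", "merge": "signal"},
    "B": {"stop_sign": "roll_through", "yellow_light": "slow", "merge": "signal"},
    "C": {"stop_sign": "stop", "yellow_light": "slow", "merge": "no_signal"},
}

learned = {
    situation: Counter(d[situation] for d in drivers.values()).most_common(1)[0][0]
    for situation in ["stop_sign", "yellow_light", "merge"]
}
# learned == {"stop_sign": "stop", "yellow_light": "slow", "merge": "signal"}
```

Each individual driver here has a failing, yet the "learned" behavior has none — which is the whole claim, in miniature.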
> So the driver who is paying attention, knows the traffic rules and isn't drunk or tired is very rarely going to have problems. A vision-based autonomy system can do that.

Except the vision-based autonomy doesn't know the rules - that's my point. There are no rules; it's just trained to drive like humans have driven in similar situations. This seems like an approach problem. The autonomous driving system has the opportunity to know A LOT MORE than the human driver. It can be taught all the rules, e.g., the difference between a solid white line and a dashed white line - something I would be willing to bet 80% of the drivers on this forum don't know. It can have sensor input way beyond that of humans - radar, ultrasonic, integrated mapping data, real-time traffic updates and construction info, vision in spectrums far wider than human vision, etc. We can never expect all of that from human drivers. Why give all that up?
> Except the vision-based autonomy doesn't know the rules - that's my point.

It doesn't need to know the rules. It only needs to follow them. That's what they'll get by using the right training data. The advantage of a neural net system over a heuristic system is that the neural net system handles ambiguities better - that is, places where there are no written rules. The neural net system will just have examples of how people deal with those situations and will handle them. So neural networks cover both the written and the unwritten rules of the road.
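As a toy illustration of following rules without knowing them, here is a nearest-neighbor stand-in for a trained net (the features, examples, and labels are all invented): the system stores no "stop at red" rule, only examples of what people did, yet it reproduces the rule anyway.

```python
# Toy sketch (invented data): the system is never told "stop at red";
# it just imitates the nearest training example, so it follows the
# written rule - and would follow unwritten ones the same way.
examples = [
    ((1.0, 0.0), "stop"),     # features: (red_light, clear_intersection)
    ((0.0, 1.0), "proceed"),
    ((0.0, 0.0), "slow"),
]

def act(features):
    # Pick the action taken in the most similar recorded situation.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(examples, key=lambda ex: dist(ex[0], features))[1]

act((0.9, 0.1))  # -> "stop": the rule is followed without being encoded
```

There is no `if red_light:` branch anywhere; the "rule" lives entirely in the training examples.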
> We can never expect all of that from human drivers. Why give all that up?

They're not giving up on anything. They're starting from scratch, so they're using the data that they have, which is vision data. There's no reason the system cannot be trained on LiDAR, RADAR, and/or ultrasonic data in combination with vision data. Where will they get that data? It may be simulated based on the vision data, or it may come from new cars equipped with the new sensors. The latter is slower because they'll have to collect data for a while, but they're certainly not giving up on anything.
> If some locality decided to make right turn on red illegal on Jan 1, after 100 years of it being legal, someone would have to remove all training video of right turns on red, replace it with no-right-turn-on-red training data, test, and ship it. Edit: even worse, the car would need to know when to reference different sets of training data in different localities.

You make many excellent points. I am very skeptical of some of the things Elon said during the V12 video. I am also skeptical of the extrapolations many of the usual suspects on YouTube have made based on what Elon said. After 7 years of horribly wrong predictions about FSD, it feels like I'm watching Elon (Lucy) hold the football to be kicked by the usual suspects (Charlie Brown). Year after year they try to kick the football by making predictions and extrapolations based on a known faulty source of FSD information, and year after year they fall flat on their backs as the football is pulled away.
> They're not giving up on anything. They're starting from scratch, so they're using the data that they have, which is vision data.

That begs a somewhat unrelated question: if they're starting from scratch, then what's the point of 11.4.7?
> That begs a somewhat unrelated question: if they're starting from scratch, then what's the point of 11.4.7?

I get lots of software updates even though companies might be working on major updates. There are about a half million users of 11.x, with 12.x likely a long way off yet. Probably a good idea not to abandon them yet.