
How does the new end-to-end FSD work? Need a block diagram of data flow from the fleet to Dojo to an individual's car.

I have about 100 miles of free trial 12.3.x, and I had to slam on the brakes about 5 times to avoid actually crashing. One time, with cars coming down a hill from the right at 45 mph, the car started to cross right into their path. It would have been a major T-bone 1 second later.

Another time it almost crashed into a tree at a fork in the road.

5 critical failures in 100 miles. It needs to go 600k to a million miles between critical failures. How close is it?

Spokane, WA.
Believe as you may, but I'm pretty sure that the car would have correctly handled both cases.

It takes time to learn to trust FSD; 100 miles is far from enough.
 
I have about 100 miles of free trial 12.3.x, and I had to slam on the brakes about 5 times to avoid actually crashing. One time, with cars coming down a hill from the right at 45 mph, the car started to cross right into their path. It would have been a major T-bone 1 second later.

Another time it almost crashed into a tree at a fork in the road.

5 critical failures in 100 miles. It needs to go 600k to a million miles between critical failures. How close is it?

Spokane, WA.
I am glad you are not a driving instructor teaching 16-year-old teens how to drive. They would be traumatized by your responses and feedback.
 
I have about 100 miles of free trial 12.3.x, and I had to slam on the brakes about 5 times to avoid actually crashing. One time, with cars coming down a hill from the right at 45 mph, the car started to cross right into their path. It would have been a major T-bone 1 second later.

Another time it almost crashed into a tree at a fork in the road.

5 critical failures in 100 miles. It needs to go 600k to a million miles between critical failures. How close is it?

Spokane, WA.
I have had FSD Beta since 10.2 (the original release; >12,000 miles, mostly dense urban driving), and to the best of my memory I have never had to use the brakes to keep from hitting another car. I have had to for other reasons, though not since v12 (I think). Sounds like you may have been pre-doubting and panicking too soon.
 
Believe as you may, but I'm pretty sure that the car would have correctly handled both cases.

It takes time to learn to trust FSD; 100 miles is far from enough.
The car started near-full-throttle acceleration into the intersection with the approaching cars about 3 car lengths to the right. I rolled about 10 feet; literally 3 more feet and I would have been in front of the first car. As I hit the brake and stopped, the cars passed in front of me, about 2-3 feet away, doing about 45 mph.

The tree in the fork was repeatable. To be fair, it did navigate it OK several times. Another few times it did the wiggle and the red hands of death. One time it kept driving and I slammed on the brakes, stopping a few feet from the road edge, pointed straight at the 3-ft-wide pine tree trunk 10 ft away.

Don't tell me it would be fine. It's a complete failure.
 
When two data sources disagree, how do you determine the correct one?
I've never quite understood this. If the car is using neural networks, then it gets a bunch of numbers from a bunch of sensors and it tries to make correlations between all the numbers. What difference does it make if the numbers are radar, lidar, ultrasound, camera, thermometer, or anemometer? Where numbers strongly correlate, a conclusion can be drawn. Where they don't, no conclusion. The network should be agnostic to the source of the numbers.
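To make that concrete, here's a minimal sketch of the idea (all names, shapes, and values are hypothetical, not Tesla's actual architecture): to the network, every sensor is just more numbers in one input vector.

```python
# Minimal sketch: a network never "knows" which sensor a number came from.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sensor feature vectors (values invented for illustration).
camera = rng.normal(size=256)  # e.g. embedded image features
radar = rng.normal(size=32)    # e.g. range/velocity returns
lidar = rng.normal(size=64)    # e.g. point-cloud summary

x = np.concatenate([camera, radar, lidar])  # one flat vector of numbers

# A randomly initialized two-layer net as a stand-in for the learned model.
W1 = rng.normal(size=(128, x.size)) * 0.01
W2 = rng.normal(size=(3, 128)) * 0.01

h = np.maximum(0.0, W1 @ x)  # ReLU hidden layer
logits = W2 @ h              # e.g. {brake, coast, accelerate} scores

# Training adjusts W1/W2 using whatever correlations exist in the numbers;
# nothing in the math distinguishes camera numbers from radar numbers.
print(logits)
```

Whether the extra numbers help comes down to whether they carry information the camera numbers don't, which is exactly the disagreement question above.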
 
Define close? They're nowhere near the MTBF required to remove supervision and take on liability. I think Tesla is 3-4 orders of magnitude away from removing supervision. The main reason to add multiple sensing modalities and HD maps is to add nines of reliability. They are not required "to drive".

People here seem to actually think that Waymo couldn't drive without maps and Lidar. In reality, without them they couldn't meet the safety metrics required for unsupervised operation in a safety-critical context.

It's easy to remove maps and extra sensing modalities if camera-only is deemed safe enough at some point in the future. Right now, it's a lot safer to be in a car with Lidar. It's not very expensive, so why risk it?
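To put rough numbers on "orders of magnitude", here's the back-of-envelope arithmetic using the figures from this thread; it's only a sketch, since one driver's 100-mile trial is a very noisy failure-rate estimate.

```python
# Back-of-envelope: how far is ~20 miles per critical intervention from a
# 600k-1M mile MTBF target? (Figures taken from this thread.)
import math

observed_miles_per_failure = 100 / 5  # 5 critical events in 100 miles

for target in (600_000, 1_000_000):
    gap = math.log10(target / observed_miles_per_failure)
    print(f"target {target:>9,} mi -> {gap:.1f} orders of magnitude short")

# Prints roughly 4.5 and 4.7: in the same ballpark as the 3-4 orders of
# magnitude estimated above, with the caveat that the sample is tiny.
```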
Except it's not safer, because almost all of the mistakes FSD makes are completely unrelated to perception. This has been the case going back to even much older versions. People seem to think that throwing in lidar or radar will magically solve FSD's problems, but it doesn't.

A case in point is the Waymo cars driving into tow trucks and telephone poles. Did they not "see" them? No, the issue is that the logic went wrong, unrelated to perception. FSD has many more problems like that which need to be solved.
 
Except it's not safer, because almost all of the mistakes FSD makes are completely unrelated to perception. This has been the case going back to even much older versions. People seem to think that throwing in lidar or radar will magically solve FSD's problems, but it doesn't.

A case in point is the Waymo cars driving into tow trucks and telephone poles. Did they not "see" them? No, the issue is that the logic went wrong, unrelated to perception. FSD has many more problems like that which need to be solved.
Hard to know, but perhaps so in ideal weather. A question for you: if perception is not an issue, why does it still miss stop signs and speed limits?
 
Hard to know, but perhaps so in ideal weather. A question for you: if perception is not an issue, why does it still miss stop signs and speed limits?
If you look at the reports, it sees the signs; it just chooses to ignore them. Also, lidar/radar doesn't help in that case either, given they aren't used to read signs. So that is yet another example of a problem that is unrelated to using cameras.
 
The easiest way to explain this: could you drive better with additional data sources?

Tesla got rid of radar because it added very little value and combining the two data sources is problematic.
When two data sources disagree, how do you determine the correct one? Does it even make a difference?
With hand-coded v11-style C++, you're right, it's difficult. One of the strengths of E2E neural networks (v12+) is that they can handle conflicting or fuzzy scenarios MUCH better than hand-engineered "if-then-else"-style coding can. But in a nutshell, the radar signal should be leaned on when the pure-vision signal is ambiguous. (I'm not sure whether the feature-engineered occupancy network incorporated any notion of ambiguity.) 99% of the time, the pure vision signal is just fine. The value of radar/Lidar is in the remaining 1%.
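As a toy illustration of "lean on radar when vision is ambiguous" (written as an explicit rule here for clarity; an E2E network would learn the equivalent weighting implicitly, and every number below is invented):

```python
# Toy confidence-weighted fusion of two distance estimates.
def fuse_distance(vision_m, vision_conf, radar_m, radar_conf):
    """Blend two estimates, leaning on whichever source is more confident."""
    total = vision_conf + radar_conf
    return (vision_m * vision_conf + radar_m * radar_conf) / total

# Clear day: vision is confident, so it dominates the fused estimate.
print(fuse_distance(vision_m=42.0, vision_conf=0.95,
                    radar_m=40.0, radar_conf=0.80))   # ~41.1 m

# Heavy fog: vision confidence collapses, so radar dominates instead.
print(fuse_distance(vision_m=15.0, vision_conf=0.05,
                    radar_m=40.0, radar_conf=0.80))   # ~38.5 m
```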
If you look at some of the internal visualizations that Tesla lets out periodically, detection and classification don't seem to be a problem. It is able to construct a scene with dramatically more detail than a human ever does. Think about it: can you tell me the number of cars surrounding you at any time?

The problem, AFAIK, is not in the detection and classification; it's in the "what the hell do I do with it?"

Is great precision required to drive? Are you able to determine the distance to the car in front of you in inches? Do you need to?
It's not about precision; it's about resolving larger-scale ambiguity, or overcoming camera limitations in reduced-visibility scenarios. (E.g. fog, rain, or obstructed camera lenses.) Or even about superhuman driving ability, which should be an eventual goal. Elon used to brag about how Autopilot radar could see two cars in front, by bouncing the signal underneath the lead car, and how this could prevent accidents in a way that pure vision (or even human driving) couldn't. There were examples posted on YouTube of real-life accident-avoidance scenarios like this, where the collision warning would correctly sound before the problem was visible to the cameras or to the driver.
The only place this would be useful is where there are no road rules, i.e. no stop signs, no traffic lights, no lane markings, no school zones, no speed limits. Basically off-road situations.
Not at all. It's nothing to do with mapping, but much more about situations where the cameras' physical limitations become problematic. (Fog/rain/dust/dirt, visual ambiguity / optical illusions, or low light.) I've had plenty of forward-collision warnings and phantom-braking events where the car sees e.g. the shadow of a tree on the road in front of me, and thinks it's an obstacle. This is less prevalent now, but it still happens from time to time. Radar+Lidar would add enough information to the network to allow it to disambiguate these situations far more reliably. Likewise for e.g. the 2016 fatality involving a tractor-trailer that was the same color as the sky. The cameras couldn't see it, and radar interpreted it as an overhead sign, but Lidar would have accurately identified it as an obstacle. An E2E neural network would be able to properly synthesize all this information together in a coherent and accurate way, and do the right thing in these cases.
 
Have you been driving in the US? How many hours have you experienced FSD, or are you watching YouTube videos and coming to a conclusion?
I can't speak for spacecoin, but for myself on v12.3.6 I still have to disengage every 2-3 miles in city driving. Most of these are not safety-critical: usually it's incorrect lane selection, or driving on the shoulder, or missing a turn, or aiming for a pothole, or being rude to other drivers, or turning right on a No-Right-On-Red. But all of these will have to be solved with many more 9's before Robotaxi will be accepted as adequate by all the other drivers sharing the road, and by its passengers.
 
Not at all. It's nothing to do with mapping, but much more about situations where the cameras' physical limitations become problematic.
You are missing the point. Maybe I am missing your point too. 🤷🏽‍♂️

The weakest link in AV is vision. Why? Simply because the whole DOT mechanism is based upon human vision. Not sound, or ultrasound, or subsonic sound. There are no sound signals to send messages to human drivers. Humans need to see the sign, comprehend the sign, and react.

Your AV needs to be able to do the same. Unless the signs are modified to enable other means of communication that LiDAR or RADAR can comprehend, LiDAR and RADAR are best suited to off-road use, where there is no such system designed for humans in place.
 
I can't speak for spacecoin, but for myself on v12.3.6 I still have to disengage every 2-3 miles in city driving. Most of these are not safety-critical: usually it's incorrect lane selection, or driving on the shoulder, or missing a turn, or aiming for a pothole, or being rude to other drivers, or turning right on a No-Right-On-Red. But all of these will have to be solved with many more 9's before Robotaxi will be accepted as adequate by all the other drivers sharing the road, and by its passengers.
Either something is wrong with your car, or with you.
 
Likewise for e.g. the 2016 fatality involving a tractor-trailer that was the same color as the sky. The cameras couldn't see it, and radar interpreted it as an overhead sign, but Lidar would have accurately identified it as an obstacle.
Agreed, to an extent. However, the visual data was there; it was just not interpreted correctly, and hence resulted in an incorrect decision.
 
I've never quite understood this. If the car is using neural networks, then it gets a bunch of numbers from a bunch of sensors and it tries to make correlations between all the numbers. What difference does it make if the numbers are radar, lidar, ultrasound, camera, thermometer, or anemometer? Where numbers strongly correlate, a conclusion can be drawn. Where they don't, no conclusion. The network should be agnostic to the source of the numbers.
The network is agnostic to the type of data. What it is exceptionally good at is resolving ambiguity and _apparently_ conflicting information. (There is only one ground-truth world outside the car, so there will always be a consistent explanation for any input; the NN just has to find it.) When the numbers "don't correlate", the network will try to parse out what's actually happening to give those apparently discordant inputs, so that the discrepancy can be resolved. (E.g. is it an object the same color as the sky, or a shadow on the roadway that looks like an obstacle, or a water drop on the camera lens, or ...?) With more [and more types of] input information, the discrepancy can be resolved sooner and with more confidence.
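A toy Bayesian sketch of that last point (all probabilities invented): each additional sensor multiplies in its own likelihood ratio, so the posterior over the competing explanations sharpens sooner and with more confidence.

```python
# Toy sketch: fusing sensors as likelihood ratios over two hypotheses,
# "obstacle" vs. "shadow on the roadway".
def posterior_obstacle(prior, likelihoods):
    """Return P(obstacle | evidence) given (P(e|obstacle), P(e|shadow)) pairs."""
    p_obstacle, p_shadow = prior, 1.0 - prior
    for l_obstacle, l_shadow in likelihoods:
        p_obstacle *= l_obstacle
        p_shadow *= l_shadow
    return p_obstacle / (p_obstacle + p_shadow)

# Camera alone: a dark patch is nearly as consistent with a shadow as with
# an obstacle, so the posterior barely moves off the 50/50 prior.
print(posterior_obstacle(0.5, [(0.6, 0.5)]))              # ~0.55, ambiguous

# Camera + radar: a radar return strongly favors a physical object.
print(posterior_obstacle(0.5, [(0.6, 0.5), (0.9, 0.1)]))  # ~0.92, resolved
```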
 
You are missing the point. Maybe I am missing your point too. 🤷🏽‍♂️

The weakest link in AV is vision. Why? Simply because the whole DOT mechanism is based upon human vision. Not sound, or ultrasound, or subsonic sound. There are no sound signals to send messages to human drivers. Humans need to see the sign, comprehend the sign, and react.
Do you mean the strongest link? Neither radar nor Lidar has anything to do with sound; they are both photon-based. In any case, the "DOT mechanism" has much more to do with road rules, not with perception limitations. Obviously vision is necessary for any AV system. The point is that FSD-style Pure Vision has fundamental weaknesses that human vision doesn't: reduced resolution, reduced dynamic range, prone to sun glare, prone to obstruction by e.g. raindrops or dirt (due to the lens' fixed position and proximity to the glass). Radar and Lidar are ways to compensate for those limitations that humans don't have.
Your AV needs to be able to do the same. Unless the signs are modified to enable other means of communication that LiDAR or RADAR can comprehend, LiDAR and RADAR are best suited to off-road use, where there is no such system designed for humans in place.
In situations with good visibility, Lidar and radar add practically no value, even off-road. Their value is in compensating for poor visibility of e.g. obstacles, not for lack of signage. And when you're on-road, the best HD maps in the world won't help the camera spot an unexpected obstacle through thick fog, but radar can. Or in the case of an ambiguous car-shaped shadow up ahead in the road, even an HD-mapped road, Lidar can resolve whether it's an actual object much better (and faster) than pure vision can.
 
Do you mean the strongest link?
I do mean it is the weakest link because in bad weather, it will not be able to see properly.

And when you're on-road, the best HD maps in the world won't help the camera spot an unexpected obstacle through thick fog, but radar can.
Then again, radar cannot help you see the road, the lane markings, the signals. All it can do is make sure you don't hit any objects in the rare event that your AV chooses to continue driving even when it is unable to see where it is going.
 
Either something is wrong with your car, or with you.
That's very kind. I have two cars with FSD: a 2017 M3 and a 2022 MY. Both make mistakes with similar frequency on v12.3.6. So it is not the car's hardware. (Or if it is, it's a very common flaw, given that both of my cars exhibit it.)

Likewise, unless I am actually hallucinating, the car regularly attempts to drive straight from a left-turn-only lane in a particular spot near my house. (Just to give one example of many.) You could argue that this is a mapping problem, and I don't doubt that the maps there may be incorrect, but FSD should still be able to read the road markings and override the maps in that case. So again, unless I am literally hallucinating, it is not a problem with me.

This leaves the car's software. In my experience, it still makes significant mistakes every few miles in city-streets driving, at least in the areas I drive (West LA, mostly). Maybe your neighborhood is less tricky. I'm looking forward to seeing whether 12.4.x is as much of an improvement as Elon says it is.
 
That's very kind. I have two cars with FSD: a 2017 M3 and a 2022 MY. Both make mistakes with similar frequency on v12.3.6. So it is not the car's hardware. (Or if it is, it's a very common flaw, given that both of my cars exhibit it.)
:) Thank you for handling this jab gracefully.

If two of your vehicles are making the same mistakes, there is a good chance that the roads you drive on were/are not part of the training dataset.