
How does the new end-to-end FSD work? I need a block diagram of the data flow from the fleet to Dojo to an individual's car.

I do mean it is the weakest link because in bad weather, it will not be able to see properly.
Yes, hence the value of radar and lidar to improve the safety in such situations.
Then again, radar cannot help you see the road, the lane markings, the signals. All it can do is make sure you don't hit any objects in the rare event that your AV chooses to continue driving even when it is unable to see where it is going.
The roads and lane markings and signals are pre-mapped, so GPS plus local positioning (being able to see lane lines within a few meters to gauge lane position) is sufficient for navigating those. That sort of close-up vision is doable even in very bad weather or with compromised cameras. I'm thinking more of situations where the flow of traffic is too fast for conditions, the sorts of situations that (in the worst case) result in 100-car pile-ups. Radar could add a huge amount of safety in cases like that. And of course there are intermediate cases where conditions are not extremely bad, but the extra safety margin and redundancy provided by radar/lidar could still be very helpful. And again, radar/lidar could also reduce false positives, of the type that cause phantom braking or phantom FCW's. Robotaxi will be no fun if the FCW alarm goes off every five minutes for no reason.
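A minimal sketch of that split, with hypothetical names and a grossly simplified map lookup, just to make the idea concrete: coarse GPS picks the mapped lane, which already carries the signs and signals, and close-range vision only has to supply the offset within the lane.

```python
from dataclasses import dataclass

@dataclass
class MappedLane:
    lane_id: str
    speed_limit_mph: float
    has_stop_line: bool

def locate(gps_xy: tuple[float, float],
           lateral_offset_m: float,
           hd_map: dict[tuple[int, int], MappedLane]) -> tuple[MappedLane, float]:
    """Coarse GPS picks the mapped lane (and with it the signs/signals the map
    already knows about); the cameras' short-range view of the lane lines only
    needs to supply the offset from lane center, which still works in poor
    visibility."""
    tile = (round(gps_xy[0]), round(gps_xy[1]))   # crude grid lookup, illustration only
    return hd_map[tile], lateral_offset_m

# Example: GPS puts us in the tile for a mapped 55 mph lane; the cameras say
# we're 0.4 m left of the lane center.
demo_map = {(12, 34): MappedLane("US-1_nb_lane2", 55.0, False)}
lane, offset = locate((12.3, 33.8), lateral_offset_m=-0.4, hd_map=demo_map)
```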
 
Well, right; that's my expectation, but it flies in the face of everyone going on about how difficult "sensor fusion" is, which is why I replied to @ewoodrick's post. Has sensor fusion been difficult because engineers haven't been using neural networks?
It has to be able to work cohesively.

It might make sense to think of it like a (dysfunctional) family - husband, wife, daughter and son - each providing input from the window position of the car where they are seated. How usable would that be for the one in the driver's seat?
 
I've never quite understood this. If the car is using neural networks, then it gets a bunch of numbers from a bunch of sensors and it tries to make correlations between all the numbers. What difference does it make if the numbers are radar, lidar, ultrasound, camera, thermometer, or anemometer? Where numbers strongly correlate, a conclusion can be drawn. Where they don't, no conclusion. The network should be agnostic to the source of the numbers.
For one, processing all of those sources takes a lot of compute power.
And when the output of the "cameras" gives at least the same information as lidar or radar, why use them?
But the "cameras" give even more output, color, which tends to be very helpful for driving.

The other sources just don't add information that the car doesn't already know.

Does it make a difference that a car that's about 100 yards from you is 300.02 feet vs 300? Too much information is dangerous.
I'm pretty sure that if I made you look at all the data sources constantly, you wouldn't be able to drive.
 
Well, right; that's my expectation, but it flies in the face of everyone going on about how difficult "sensor fusion" is, which is why I replied to @ewoodrick's post. Has sensor fusion been difficult because engineers haven't been using neural networks?
Yes. Hand-coding for sensors like radar/lidar is difficult because engineers have very little intuition for how to interpret the signals. But for an NN, it's just pure information. It will learn it like a bat learns to use ultrasound, or like a whale learns whale language (which humans have never been able to figure out, but which NNs are making progress on).

A few years ago, Lidar was very expensive, and the signal had to be processed with hand-written C++ code, both of which made the decision to avoid Lidar make total sense at the time. But with the v12 E2E NN approach, and with Lidar sensors now costing $1k instead of $20k, the situation is completely different. Tesla may be "locked in" to Pure Vision because they have millions of cars on the road without Lidar/radar and don't want to abandon them (or admit that they will have to abandon them), but I still think at some point they may have to bite the bullet and add Lidar/radar back to the sensor suite, especially for L4 Robotaxi. Limited-ODD L3 FSD in good conditions may be achievable with HW4 pure vision, but I'm skeptical that they can achieve wide-ODD L4 with the current fleet and sensor suite.
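To make the "pure information" point concrete, here is a minimal sketch (PyTorch, with made-up layer sizes; not Tesla's code) of how a lidar sweep can be fed to a network exactly like any other sensor: the raw returns are just a tensor of numbers, and the mapping to features is learned end to end rather than hand-written in C++.

```python
import torch
import torch.nn as nn

class PointCloudEncoder(nn.Module):
    """Toy PointNet-style encoder: a lidar sweep is just an (N, 3) tensor of
    numbers (x, y, z per return). No hand-written geometry code; the mapping
    from raw returns to a feature vector is learned like any other layer."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.per_point = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (batch, num_returns, 3) -> per-point features, then max-pool
        # over returns gives one fixed-size descriptor per sweep.
        return self.per_point(points).max(dim=1).values

# A radar frame or a camera image would enter the same way: some tensor in,
# some feature vector out, with the weights learned from data.
encoder = PointCloudEncoder()
fake_sweep = torch.randn(1, 10_000, 3)   # 10k returns, xyz only
features = encoder(fake_sweep)           # shape: (1, 256)
```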
 
It has to be able to work cohesively.

It might make sense to think of it like a (dysfunctional) family - husband, wife, daughter and son - each providing input from the window position of the car where they are seated. How usable would that be for the one in the driver's seat?
But this is exactly how Tesla's eight-camera system currently works; it's like having eight separate viewers at once. The thing is that they all share the same "brain" (neural network), so the information flow between them is far more efficient and cohesive than a bunch of passengers trying to yell at each other. Adding additional sensors (radar/lidar/USS) would still be part of the same single-brain system.
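As a rough illustration of the "single brain" idea (a sketch only; the layer sizes and outputs are invented, and this is not Tesla's actual architecture), eight camera streams can be encoded separately and then fused in one shared network, and any extra sensor's features would simply be concatenated at the same point.

```python
import torch
import torch.nn as nn

class SingleBrain(nn.Module):
    """Eight camera streams -> one shared trunk. Hypothetical sizes throughout."""
    def __init__(self, num_cameras: int = 8, feat_dim: int = 128):
        super().__init__()
        # One small conv encoder per camera (weights could also be shared).
        self.cam_encoders = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, 16, 5, stride=4), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(16, feat_dim))
            for _ in range(num_cameras)
        ])
        # The shared trunk sees all views at once -- the "single brain".
        self.trunk = nn.Sequential(
            nn.Linear(num_cameras * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 3),   # toy output: steer, accelerate, brake
        )

    def forward(self, cameras: list[torch.Tensor]) -> torch.Tensor:
        feats = [enc(img) for enc, img in zip(self.cam_encoders, cameras)]
        # Radar/lidar/USS features, if present, would simply be appended here.
        return self.trunk(torch.cat(feats, dim=1))

brain = SingleBrain()
frames = [torch.randn(1, 3, 96, 128) for _ in range(8)]   # eight toy frames
controls = brain(frames)                                  # shape: (1, 3)
```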
 
The roads and lane markings and signals are pre-mapped, so GPS plus local positioning (being able to see lane lines within a few meters to gauge lane position) is sufficient for navigating those.
Correct. That was my original assumption too when Waymo began testing with their HD Maps + LiDAR + RADAR + Vision. That means that if you block the cameras, it should still be able to drive, but I have not seen any evidence of that being the case yet.
 
Not at all. It's nothing to do with mapping, but much more about situations where the cameras' physical limitations become problematic. (Fog/rain/dust/dirt, visual ambiguity / optical illusions, or low light.) I've had plenty of forward-collision warnings and phantom-braking events where the car sees e.g. the shadow of a tree on the road in front of me, and thinks it's an obstacle. This is less prevalent now, but it still happens from time to time. Radar+Lidar would add enough information to the network to allow it to disambiguate these situations far more reliably. Likewise for e.g. the 2016 fatality involving a tractor-trailer that was the same color as the sky. The cameras couldn't see it, and radar interpreted it as an overhead sign, but Lidar would have accurately identified it as an obstacle. An E2E neural network would be able to properly synthesize all this information together in a coherent and accurate way, and do the right thing in these cases.

Yes, let's throw away that data that could indicate a pedestrian on the road. It's probably just shadow braking.

Really? Comparing a 2016 incident with today's FSD. There is absolutely no guarantee that it would have detected it.

But as you just got finished saying, 2 of the 3 sources showed no issues, so the car would decide that it was a phantom event.
 
But this is exactly how Tesla's eight-camera system currently works; it's like having eight separate viewers at once. The thing is that they all share the same "brain" (neural network), so the information flow between them is far more efficient and cohesive than a bunch of passengers trying to yell at each other. Adding additional sensors (radar/lidar/USS) would still be part of the same single-brain system.
But they are cohesive, i.e., they are all visual guidance.

In my example of the dysfunctional family, it is impossible to drive because the wife thinks you are too close and not braking, the son thinks you are too slow and should speed up, and the daughter is happily filing her nails and giving no particularly useful guidance. On top of that, the husband speaks English, the wife speaks Spanish, the son speaks German and the daughter speaks French.
 
Hard to know, but perhaps so in ideal weather. Question for you: if perception is not an issue, why does it still miss stop signs and speed limits?
Part of this is limitations of the training set; clearly in good conditions the cameras can resolve the signs well enough to read them if the NN was trained to. And agreed that in good weather, perception is not the limiting factor.

But when there are "Autopilot degraded due to poor conditions" messages, or when you get Red-Hands-Of-Death due to sun glare, or you hit a puddle which splashes the windshield and FSD panics, or you get phantom braking because a shadow on the road ahead looks like an obstacle, these are perception issues.

Obviously the system will have to make significant progress on both perception issues (hardware) and training issues (software) before it is Robotaxi-ready. I'm not expecting actual Robotaxi availability until at least 2030, probably 2032-2034, based on the current rate of progress.
 
The other sources just don't add information that the car doesn't already know.
Where that's true, I agree. The point is that other sensors do add information beyond vision. LiDAR is more accurate on ranging, and various materials are RADAR transparent (including smoke, dust, and fog). How useful each aspect of data is to the overall system, I have no idea - though seeing through particulates would certainly come in handy. Snow as well, apparently.

My interest here is technological. Whether the investment in those devices provides enough financial return to justify their cost is a separate question.
 
The car started accelerating at near full throttle into the intersection, with the approaching cars about 3 car lengths to the right. I rolled about 10 feet. Literally 3 more feet and I would have been in front of the first car. As I hit the brake and stopped, the cars passed in front of me, about 2-3 feet away, doing about 45 mph.

The tree in the fork was repeatable. To be fair, it did navigate it ok several times. Another few times it did the wiggle and red hands of death. One time it kept driving and I slammed on the brakes, stopping feet from the road edge, pointed straight at the 3ft wide pine tree trunk 10ft away.

Don't tell me it would be fine. It's a complete failure.
So how do you explain that my cars have been able to drive so well on FSD, to the point where we use it almost all of the time, vs. your experience, where it doesn't appear that it can get out of the driveway?

I must admit that my comparison may be a little unfair; FSD has literally driven me from the Great Lakes to the Florida Keys. AFAIK, it hasn't driven near your house.

I've got a 3-way intersection near me that is a T into a US highway at 55 mph. It's at the peak of a hill and has a sight distance of maybe 4 car lengths in either direction. While the car may scare me at times, it is repeatably able to make the turn.
 
I still think at some point they may have to bite the bullet and add Lidar/radar back to the sensor suite, especially for L4 Robotaxi. Limited-ODD L3 FSD in good conditions may be achievable with HW4 pure vision, but I'm skeptical that they can achieve wide-ODD L4 with the current fleet and sensor suite.
Certainly there's no point in installing sensors beyond human abilities for an L2 system. We can't supervise what we can't see.
 
Yes, let's throw away that data that could indicate a pedestrian on the road. It's probably just shadow braking.

Really? Comparing a 2016 incident with today's FSD. There is absolutely no guarantee that it would have detected it.

But as you just got finished saying, 2 of the 3 sources showed no issues, so the car would decide that it was a phantom event.
I'm glad you're not programming the car!

The job of the neural network is to provide a cohesive and consistent explanation for everything the sensors are "seeing". It will understand that if the camera sees something that might be a shadow but might be a pedestrian, that's an ambiguity, not a "vote for pedestrian". Right now in this situation the system will phantom-brake in an abundance of caution because it has no way to resolve the ambiguity. (Even if the system thinks it's 99% likely to be a shadow, it will still brake.) But a lidar signal would provide a way to conclusively resolve this ambiguity, so the car would not have to phantom-brake.

The 2016 incident was just an example of a pure-vision limitation. Very likely today's FSD would solve that particular case better, perhaps by observing the cab of the semi and inferring there's probably a trailer behind it, even if the trailer were visually camouflaged against the sky. And such accidents can also be avoided by driving hyper-conservatively and braking even if there's a 0.0001% chance of an obstacle, though this would not be an enjoyable driving experience! Faster and better resolution of ambiguities through improved sensors and software will help decrease the rate of both false-negatives (crashes) and false-positives (phantom braking).
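A toy Bayes calculation (all numbers invented for illustration) shows why one independent range measurement changes the shadow-vs-pedestrian picture so much: vision alone is stuck braking on a small residual probability, while the fused estimate can either collapse that probability or sharply confirm it.

```python
def posterior_obstacle(prior: float, sensor_hit: bool,
                       p_hit_if_obstacle: float, p_hit_if_clear: float) -> float:
    """Bayes update: probability an obstacle is really there, given one extra
    sensor reading. All numbers below are invented for illustration."""
    p_hit = p_hit_if_obstacle * prior + p_hit_if_clear * (1 - prior)
    likelihood = p_hit_if_obstacle if sensor_hit else (1 - p_hit_if_obstacle)
    evidence = p_hit if sensor_hit else (1 - p_hit)
    return likelihood * prior / evidence

# Vision alone: 1% chance the "shadow" is actually a pedestrian.
vision_only = 0.01

# Lidar returns nothing at that spot (sensor_hit=False). Assume lidar detects a
# real pedestrian 99% of the time and false-alarms 2% of the time.
fused = posterior_obstacle(prior=vision_only, sensor_hit=False,
                           p_hit_if_obstacle=0.99, p_hit_if_clear=0.02)

print(f"vision only: {vision_only:.3%}  ->  vision + lidar: {fused:.5%}")
# Prints roughly: vision only 1%, vision + lidar about 0.01%.
# With a brake-if-above-0.1% policy (threshold made up), vision alone must
# brake on the shadow (phantom braking) while the fused estimate does not;
# a lidar *hit* would instead push the estimate up to about 33%, catching
# the real pedestrian.
```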
 
Certainly there's no point in installing sensors beyond human abilities for an L2 system. We can't supervise what we can't see.
It's not all beyond human abilities. As detailed in an earlier post, the camera sensors still have significant limitations relative to human vision. (Lower resolution, lower dynamic range, more vulnerable to glare/dirt/dust/raindrops, etc.) And L2 still includes emergency braking and collision avoidance, which by definition are beyond what the human is sensing, yet still extremely valuable.
 
The job of the neural network is to provide a cohesive and consistent explanation for everything the sensors are "seeing". It will understand that if the camera sees something that might be a shadow but might be a pedestrian, that's an ambiguity, not a "vote for pedestrian". Right now in this situation the system will phantom-brake in an abundance of caution because it has no way to resolve the ambiguity. (Even if the system thinks it's 99% likely to be a shadow, it will still brake.) But a lidar signal would provide a way to conclusively resolve this ambiguity, so the car would not have to phantom-brake.
This is a very good example. That being said, I recall the reason to move away was that even radar or lidar is not perfect; their readings can be ambiguous too. So now you have to figure out which of the two you will depend on, and when.
 
I'm glad you're not programming the car!

The job of the neural network is to provide a cohesive and consistent explanation for everything the sensors are "seeing". It will understand that if the camera sees something that might be a shadow but might be a pedestrian, that's an ambiguity, not a "vote for pedestrian". Right now in this situation the system will phantom-brake in an abundance of caution because it has no way to resolve the ambiguity. (Even if the system thinks it's 99% likely to be a shadow, it will still brake.) But a lidar signal would provide a way to conclusively resolve this ambiguity, so the car would not have to phantom-brake.

The 2016 incident was just an example of a pure-vision limitation. Very likely today's FSD would solve that particular case better, perhaps by observing the cab of the semi and inferring there's probably a trailer behind it, even if the trailer were visually camouflaged against the sky. And such accidents can also be avoided by driving hyper-conservatively and braking even if there's a 0.0001% chance of an obstacle, though this would not be an enjoyable driving experience! Faster and better resolution of ambiguities through improved sensors and software will help decrease the rate of both false-negatives (crashes) and false-positives (phantom braking).

Okay, so how does the system determine which is correct? Lidar has absolutely no way to resolve this.
And let me add: which is correct "even if there's a 0.0001% chance of an obstacle"?


And you already said that the 2016 incident included radar.
 
So how do you explain that my cars have been able to drive so well on FSD, to the point where we use it almost all of the time, vs. your experience, where it doesn't appear that it can get out of the driveway?

I must admit that my comparison may be a little unfair; FSD has literally driven me from the Great Lakes to the Florida Keys. AFAIK, it hasn't driven near your house.

I've got a 3-way intersection near me that is a T into a US highway at 55 mph. It's at the peak of a hill and has a sight distance of maybe 4 car lengths in either direction. While the car may scare me at times, it is repeatably able to make the turn.
I could find normal routes where it did fine. That's irrelevant if I can find 5 places where it fails in 100 miles of use.
 
Another way of putting it is that it is over-reliant on semantic map data, or has perception issues. Why?
It's not a perception issue if it "sees" the sign and even visualizes it! It's a logic issue. For speed, it's also observing the flow of traffic, which is why it may ignore speed signs. Your speed setting will also change behavior.

Basically there are a ton of issues to address that are unrelated to perception (lane selection and stopping behavior are mentioned in other threads). If you solve all those, will lidar and/or radar still matter? That is basically the logic Tesla is using.

You can talk about driving in inclement weather, but having a system that works in good weather is already very useful. Plus, if the visibility is so bad that you can't drive visually, it's already not safe to drive in the first place.
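Purely to illustrate the kind of arbitration being described (this is not Tesla's actual logic; the weights and rules are invented), "ignoring" a posted sign can fall straight out of a planner that blends the sign with observed traffic flow and the driver's setting:

```python
def target_speed(posted_limit: float, traffic_flow: float,
                 driver_max: float, flow_weight: float = 0.6) -> float:
    """Toy speed arbitration: blend the mapped/seen speed limit with the
    observed flow of traffic, then clamp to the driver's setting.
    Weights and rules are invented for illustration only."""
    blended = (1 - flow_weight) * posted_limit + flow_weight * traffic_flow
    return min(blended, driver_max)

# Posted 45 mph sign, traffic doing 55, driver cap at 60:
print(target_speed(posted_limit=45, traffic_flow=55, driver_max=60))  # 51.0
# The sign was "seen" and even visualized, but the chosen speed is not the
# posted number -- a planning/logic choice, not a perception failure.
```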
 