Tesla has already shown that cameras alone are enough for mediocre-quality L2. (The extremely wide ODD is impressive, and the quality on highways is foreseeably approaching L3 level, but the quality within the trickier parts of the ODD is still mediocre.) Tesla has emphatically not yet shown that pure vision will be enough for L4/Robotaxi.

So true. Issues relating to weather, pitch-black conditions, etc. need to be addressed before they get anywhere near L4/Robotaxi. The human eye (yes, even my aging eyes) can still do much better than Tesla Vision when it comes to distance and low-light performance.
 
  • Like
  • Disagree
Reactions: STUtoday and Ben W
So true. Issues relating to weather, pitch-black conditions, etc. need to be addressed before they get anywhere near L4/Robotaxi. The human eye (yes, even my aging eyes) can still do much better than Tesla Vision when it comes to distance and low-light performance.
Correct, and it is exactly in these situations where radar and lidar add the most value. The human eye is still vastly superior to Tesla's cameras, in terms of effective image quality and dynamic range. And the fact that we can move our head around as needed, and sit far back from the not-perfectly-clean windscreen glass (rather than pressed right up against it, where a single dirt smudge or water drop can potentially obscure the entire view), is also a tremendous advantage.
 
Last edited:
  • Like
Reactions: OldManCan
Ben W said:
Lucid is taking this approach. Their cars have 32 external sensors (14 cameras, 5 radars, lidar, 12 ultrasonics), which sets them up extremely well for autonomy once the software and compute hardware are ready. I wish Tesla were more committed to computer upgradeability within their fleet; I would love to know that my 2022 Model Y (with HW3) might be upgradeable to the HW5 computer when it becomes available in a couple of years. (Elon said late 2025.) Likewise, I would be far more likely to upgrade to a HW5 Model S in a couple of years if I had assurances that it would be forward-compatible with, say, HW7. (Just the computer; not necessarily the entire sensor suite.) Cars don't wear out in 2-3 years like cell phones, and it would add tremendous value to be able to keep them current (compute-wise) for substantially longer. It does potentially add a lot more configurations for Tesla to support, but even this is bounded and manageable if Tesla were to restrict the upgrades to, say, 2 generations. That would still be enough to mostly cover the 8-10 year life of the typical car.

That's interesting, I wasn't aware of what Lucid was doing. Though I was referring to the ability of cars to communicate with each other in an autonomous vehicle only setting, not the more complex current setting which is a mix of human and robot drivers. The autonomous driving problem becomes a lot simpler when you remove humans from the equation.

On the topic of upgradability, I'm also concerned about the scalability of the current approach. When HW4 was announced, Elon mentioned that FSD would not be optimized for HW4 because the neural nets would need to be retrained for it. AI DRIVR speculated that the way they were able to get FSD working on HW4 so quickly was by processing the camera feeds to be normalized across HW3 and HW4. So if Tesla were to add more cameras or sensors to the cars, does that mean the neural nets must be rebuilt from scratch every time there is a hardware upgrade? Furthermore, would they need to maintain a separate NN for each hardware configuration? Even now, with one NN and the fairly homogeneous set of Teslas in production, there have been reports of FSD performing differently on different cars despite being on the same FSD version. This would also explain why FSD is delayed for the Cybertruck: its dimensions are wildly different from those of any other Tesla, which would suggest the need for its own NN.
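To make the normalization idea concrete--and to be clear, this is just my guess at what AI DRIVR was describing, not Tesla's actual pipeline, and the resolutions are invented--something like this sketch would let one network consume frames from either camera generation:
Code:
# Hypothetical sketch of normalizing camera frames across hardware generations
# so a single neural net sees a consistent input. Resolutions and processing
# steps are illustrative guesses, not Tesla's actual pipeline.
import numpy as np

HW3_SHAPE = (960, 1280)   # assumed legacy frame size (rows, cols)
HW4_SHAPE = (1920, 2560)  # assumed higher-resolution frame size

def normalize_frame(frame: np.ndarray) -> np.ndarray:
    """Downsample an HW4-style frame to the HW3-style shape (2x2 averaging),
    then scale intensities to [0, 1] so the network input statistics match."""
    if frame.shape == HW4_SHAPE:
        frame = frame.reshape(HW3_SHAPE[0], 2, HW3_SHAPE[1], 2).mean(axis=(1, 3))
    return frame.astype(np.float32) / 255.0

# Either generation of frame now feeds the same (hypothetical) network.
hw3 = np.random.randint(0, 256, HW3_SHAPE, dtype=np.uint8)
hw4 = np.random.randint(0, 256, HW4_SHAPE, dtype=np.uint8)
assert normalize_frame(hw3).shape == normalize_frame(hw4).shape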

I think the challenge before Tesla now is curation of the training data. If they can figure out how to collect the exact training data that they need to produce the control behavior(s) that they want, then they should be able to take this system to the limits of the hardware, wherever that is.
A problem with neural nets is that you are dealing with one giant network. You can add new training data in "hopes" that it will improve a certain behavior, but you risk affecting another behavior. This is why E2E feels less solid to me: you are no longer adjusting explicit logic, where the result is more predictable. There is also the fact that communication between the software and the human is lost. For example, the car can no longer explain its actions with status messages (e.g. "changing lanes to avoid rightmost lane"), and the human can no longer request that the car drive at a specific speed (we can only set the maximum speed).
 
Last edited:
  • Like
Reactions: Ben W
If Lucid used Tesla's sensor suite instead, they would still be losing $249k on every car they sell (although the naive calculation that yields the $250k loss figure is highly misleading for a company in Lucid's phase of growth, and does not accurately reflect marginal cost). Tesla "lost" a similar amount on their early Roadsters, and their early Model S's in 2012, even with no sensors at all. The cost of the off-the-shelf sensors is a drop in the bucket compared to other factors.

Tesla has already shown that cameras alone are enough for mediocre-quality L2. (The extremely wide ODD is impressive, and the quality on highways is foreseeably approaching L3 level, but the quality within the trickier parts of the ODD is still mediocre, compared to a skilled human driver.) Tesla has emphatically not yet shown that pure vision will be enough for L4/Robotaxi.

Counterintuitively, adding lidar dramatically reduces the compute requirements; it doesn't increase it. That's because a tremendous amount of value (constructing ground-truth 3D maps) is obtained instantly and for free by lidar, whereas it requires a huge amount of computation (and lag) when done by pure vision. The same is true to a lesser degree for radar and ultrasonics.
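To make the "for free" part concrete: a lidar return is already a range plus two beam angles, so recovering the 3D point is just trigonometry, with no learned depth model in the loop. (Generic spherical-to-Cartesian math below, not any particular sensor's interface.)
Code:
# A lidar return (range, azimuth, elevation) converts directly to a 3D point
# with basic trigonometry; no neural network required. Generic math, not any
# specific sensor's API.
import math

def lidar_return_to_xyz(r_m: float, azimuth_deg: float, elevation_deg: float):
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = r_m * math.cos(el) * math.cos(az)  # forward
    y = r_m * math.cos(el) * math.sin(az)  # left
    z = r_m * math.sin(el)                 # up
    return x, y, z

# A return at 42.0 m, 10 degrees to the left, 1 degree up:
print(lidar_return_to_xyz(42.0, 10.0, 1.0))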

Lucid has a daunting uphill climb ahead of them, but there are many aspects of their product that are extremely impressive, and I do hope they survive.
I'm just going to have to disagree with you there on multiple counts for multiple reasons. Adding lidar does NOT reduce the compute requirements, because you STILL need camera vision and the corresponding neural networks to do things that lidar is completely and utterly useless for--reading signs, identifying and determining the color of stoplights, lane markings, etc. So with Tesla's approach, you get all the info you need from the single set of cameras. With Lucid's approach, you need those same neural networks to read signs, identify stoplight colors, etc.--but then you also have to do all the lidar processing and the radar processing, and you need to localize and fuse that information into one coherent picture of the world around you, etc.

(I do point cloud processing of billions of 3D points from precision measurement hardware for a living, so have a little bit of experience in this area).

I'm willing to bet the incremental cost of 5 radars, N lidars, and the processing sufficient to do the above for an autonomous driving application is going to be significantly north of $1000.

Yes, of course Lucid is losing money on each car for a whole host of reasons--that's why I said part of. One of Rawlinson's biggest mistakes was that he wanted to get back at Musk, so he tried to out-Tesla Tesla. Fine, if you disregard the cost of production you can make a car that outperforms a Tesla. But ultimately you have to make money on the thing, and Rawlinson suffered from the same flaw that most CEOs who try to start a car company suffer from: the inability to recognize that making a profit on the car is key to survival.

But the proof is in the pudding. Can't wait to see Lucid's autonomous driving solution in a few years and compare it to Tesla's. That is, if the Saudis haven't decided to stop funding the black hole they're throwing their money into by that point.
 
Last edited:
If you still believe that lidar and radar are required to achieve autonomy (yet, for some reason you are able to drive without such sensors), you will not be convinced by any argument.
Actually, if Tesla or someone else releases an L4 system based on vision only, that will be the proof. But as of now, the only L4 systems have lidar and radar augmentation.
 
Totally agree that this is the way to go. Our vision-only cars can only see as far as the camera hardware allows, which is still far less than what the human eye can do. Example: we can see a red light coming and start slowing for it gently, far earlier than FSD can. Another one: on a fast-rolling freeway you can see cars up to half a mile ahead starting to slow down, the tail lights building up, and prepare for a hard stop. Vision only with today's cameras can't do that. However, if the lead cars were communicating their sensor and status updates, then followers could take a proactive stance. This is the way to solve late-response issues.
This is a misconception that has been perpetuated by people like Chuck Cook, who--while well-intentioned, a good guy, and a great tester--misunderstands the science of optics for some reason. Do you think that suddenly cameras just can't see past 200m?

Go get in your car. Point it at the sun. Pull up the camera view. Can you see the sun in the camera? If so, then the range of the camera in your Tesla is over 93 million miles.

There is no hard cutoff for the range of a camera. It's based on the size of the target, focal distance, focal range, etc.
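A quick back-of-the-envelope calculation shows what actually happens with distance. (The sensor width and field of view below are illustrative assumptions, not Tesla's published specs.)
Code:
# How many pixels wide is a car at various distances? Sensor resolution and
# field of view are assumed, illustrative values -- not Tesla's specs.
import math

SENSOR_WIDTH_PX = 1280    # assumed horizontal resolution
HFOV_DEG = 50.0           # assumed horizontal field of view
CAR_WIDTH_M = 1.8

# focal length expressed in pixels
f_px = (SENSOR_WIDTH_PX / 2) / math.tan(math.radians(HFOV_DEG / 2))

for dist_m in (100, 200, 400, 800):
    px = f_px * CAR_WIDTH_M / dist_m
    print(f"{dist_m:4d} m -> ~{px:.1f} px wide")

# The car never disappears at some hard "range"; it just spans fewer pixels.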

Now try another experiment. Find an intersection with a stoplight. Without looking at the light at all, try to determine whether you have a green or a red light based on clues, none of which have to do with the light itself. It's pretty easy--we do it all the time. If there are cars stopped at the intersection in the same direction of travel as the road you're on, you've got a red. If you see crossing traffic going through the intersection, you've got a red.

AI can easily pick up on context clues like this and fill in missing information, just like humans can.
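Written as a toy rule, the inference is trivial (and obviously nothing like how the real networks represent it):
Code:
# Toy illustration of inferring the light's state from context alone, without
# ever seeing the bulb. Nothing like the real networks; just the clues above.
def context_implies_red(cars_stopped_ahead_same_direction: bool,
                        cross_traffic_flowing: bool) -> bool:
    """True if context strongly implies our light is red."""
    return cars_stopped_ahead_same_direction or cross_traffic_flowing

print(context_implies_red(cars_stopped_ahead_same_direction=True,
                          cross_traffic_flowing=False))  # True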

With all that's happened in the world of AI in the last year or so, it's shocking to me how much people still underestimate its capabilities.
 
This is a terrible example fwiw. The sun's light is traveling to you.
I'll give another example.

For years, Chuck Cook thought the car couldn't handle his cross traffic because the cameras couldn't see it. That was not the issue. Watch some dashcam recordings. Assuming you're viewing the video at 100% scale, you can see traffic a long way away. And you're already looking at a video stream that has been compressed for storage purposes. The issues at Chuck's intersection are 100% related to planning and control, which the v12 versions are addressing.

Not sugar coating it--the car can't handle that intersection reliably at all yet. No doubt it would be a death sentence for anyone who went through it many times and just let the car do its thing. But the reason isn't that the camera can't see the cars.
 
  • Like
Reactions: FSDtester#1
Anything that you can see is a result of light photons traveling to you and landing on your retina. I'm not sure what that has to do with anything.
Think about this: you are not resolving any detail on the sun from 93 million miles away, so it's not comparable to deciphering a car 200m away. It was just a dumb thing to say and it ruined whatever point you were trying to make.

Camera quality in megapixels, focal length, aperture, etc. absolutely matters with FSD. Signs can be read from further away with higher megapixels, scenes can be deciphered in more detail, etc. AI cannot correct bad images or piece together missing detail from blur without limit; that's not reality.
 
  • Disagree
Reactions: dhanson865
That's interesting, I wasn't aware of what Lucid was doing. Though I was referring to the ability of cars to communicate with each other in an autonomous vehicle only setting, not the more complex current setting which is a mix of human and robot drivers. The autonomous driving problem becomes a lot simpler when you remove humans from the equation.
Removing humans from the equation won't be a reality for another 50 years, except perhaps for limited "autonomous-only" freeway lanes. There's no practical way to skip the intermediate step, unfortunately.
On the topic of upgradability, I'm also concerned about the scalability of the current approach. When HW4 was announced, Elon mentioned that FSD would not be optimized for HW4 because the neural nets would need to be retrained for it. AI DRIVR speculated that the way they were able to get FSD working on HW4 so quickly was by processing the camera feeds to be normalized across HW3 and HW4. So if Tesla were to add more cameras or sensors to the cars, does that mean the neural nets must be rebuilt from scratch every time there is a hardware upgrade? Furthermore, would they need to maintain a separate NN for each hardware configuration? Even now, with one NN and the fairly homogeneous set of Teslas in production, there have been reports of FSD performing differently on different cars despite being on the same FSD version. This would also explain why FSD is delayed for the Cybertruck: its dimensions are wildly different from those of any other Tesla, which would suggest the need for its own NN.
Yes, this is a big problem, and a reason I wish Tesla had designed more forward-compatibility into its compute hardware, so that e.g. a HW5 computer could be a drop-in replacement for HW3 or HW4. That would allow them to put more focus on a single newer compute platform, although it would still need to be adaptive to a range of sensor suites. It's also why I vastly prefer Lucid's sensor approach to Tesla's, in terms of forward-thinking. (Not that Lucid doesn't have other tremendous challenges.)
A problem with neural nets is that you are dealing with one giant network. You can add new training data in "hopes" that it will improve a certain behavior, but you risk affecting another behavior. This is why E2E feels less solid to me: you are no longer adjusting explicit logic, where the result is more predictable. There is also the fact that communication between the software and the human is lost. For example, the car can no longer explain its actions with status messages (e.g. "changing lanes to avoid rightmost lane"), and the human can no longer request that the car drive at a specific speed (we can only set the maximum speed).
It would require a parallel neural network with supervised training to re-incorporate such status messages, and it could never perfectly match what the primary NN is "thinking", because the primary NN literally no longer has a "changing lanes to avoid rightmost lane" "bit". So it would be a case of, "the secondary neural network currently thinks that THIS [status message] is why the primary neural network is performing this behavior."
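Roughly what I mean, as a sketch--this assumes you can tap some intermediate feature vector out of the driving network, which is itself an assumption, and the feature size and message list below are invented:
Code:
# Hypothetical "explanation head": a small supervised network that maps an
# intermediate feature vector from the driving net to a status message.
# Feature size and labels are invented for illustration.
import torch
import torch.nn as nn

STATUS_MESSAGES = [
    "following lane",
    "changing lanes to avoid rightmost lane",
    "slowing for traffic ahead",
    "stopping for red light",
]

explanation_head = nn.Sequential(
    nn.Linear(512, 128),   # 512 = assumed size of the driving net's features
    nn.ReLU(),
    nn.Linear(128, len(STATUS_MESSAGES)),
)

features = torch.randn(1, 512)  # stand-in for the real intermediate features
idx = explanation_head(features).argmax(dim=-1).item()
print(STATUS_MESSAGES[idx])     # the head's guess at "why", not ground truth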

Constraining the behavior should still be solvable (e.g. train the network to more highly prioritize the user request such as set speed), but agreed that no one yet knows how to completely tame a beast as complicated as Tesla's full E2E network. This, combined with the number of orders of magnitude improvement required to get there from its current state, is why I think L4/Robotaxi is still at least 6-8 years away.
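As a sketch of what "training the network to prioritize the user request" could look like--with invented shapes and weights, and certainly not how Tesla actually does it--the requested speed can be fed in as an input and weighted heavily in the loss:
Code:
# Hypothetical sketch: the user's requested speed is an input to the policy,
# and the loss penalizes deviating from it on top of the imitation objective.
import torch
import torch.nn as nn

# 64 scene features + 1 requested-speed input -> predicted speed command
policy = nn.Sequential(nn.Linear(65, 32), nn.ReLU(), nn.Linear(32, 1))

scene = torch.randn(8, 64)            # stand-in for perception features
requested = torch.full((8, 1), 29.0)  # user asks for 29 m/s
expert = torch.full((8, 1), 27.0)     # what the expert driver actually did

pred = policy(torch.cat([scene, requested], dim=1))
imitation_loss = ((pred - expert) ** 2).mean()
request_loss = ((pred - requested) ** 2).mean()
loss = imitation_loss + 2.0 * request_loss  # weight the user request higher
loss.backward()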
 
I'll give another example.

For years, Chuck Cook thought the car couldn't handle his cross traffic because the cameras couldn't see it. That was not the issue. Watch some dashcam recordings. Assuming you're viewing the video at 100% scale, you can see traffic a long way away. And you're already looking at a video stream that has been compressed for storage purposes. The issues at Chuck's intersection are 100% related to planning and control, which the v12 versions are addressing.

Not sugar coating it--the car can't handle that intersection reliably at all yet. No doubt it would be a death sentence for anyone who went through it many times and just let the car do its thing. But the reason isn't that the camera can't see the cars.
Again, you are misinterpreting things.

He had an issue with 1) the placement of the B-pillar camera not being able to see what a human leaning forward can (accurate, but not to the level he originally thought), and 2) determining the speed/placement of objects at a distance. A low-megapixel camera, regardless of AI, can have an issue with this--it would basically blur distant objects together--but the car is not typically making decisions at that distance.
 
Think about this: you are not resolving any detail on the sun from 93 million miles away, so it's not comparable to deciphering a car 200m away. It was just a dumb thing to say and it ruined whatever point you were trying to make.

Camera quality in megapixels, focal length, aperture, etc. absolutely matters with FSD. Signs can be read from further away with higher megapixels, scenes can be deciphered in more detail, etc. AI cannot correct bad images or piece together missing detail from blur without limit; that's not reality.
Thank you for confirming my point.

I'm not disagreeing with you and I have no idea why you think what you said above contradicts anything that I wrote.

What I'm saying is: Tesla published some sort of "range" value on their AP cameras. And many people, like Chuck, took that to mean that the cameras can't see past that range.

My whole point was that you can still see stuff in a camera image that is well beyond the "200m range" or whatever they put on the AP site. So I don't understand why people just assume that beyond some distance x the camera just can't see cars anymore.
 
Thank you for confirming my point.

I'm not disagreeing with you and I have no idea why you think what you said above contradicts anything that I wrote.

What I'm saying is: Tesla published some sort of "range" value on their AP cameras. And many people, like Chuck, took that to mean that the cameras can't see past that range.

My whole point was that you can still see stuff in a camera image that is well beyond the "200m range" or whatever they put on the AP site. So I don't understand why people just assume that beyond some distance x the camera just can't see cars anymore.
I'm not disagreeing with your point. I just think the Sun example was poor and that most of Chuck's qualms were about the placement of the B-pillar camera.

I've heard him say a higher-megapixel camera could see that better, and I agree with you that it's unlikely.
 
Again, you are misinterpreting things.

He had an issue with 1) the placement of the B-pillar camera not being able to see what a human leaning forward can (accurate, but not to the level he originally thought), and 2) determining the speed/placement of objects at a distance. A low-megapixel camera, regardless of AI, can have an issue with this--it would basically blur distant objects together--but the car is not typically making decisions at that distance.
I am confusing and misinterpreting nothing.

He had concerns with both. I agree with his concern about the B-pillar cameras. Still don't understand why Tesla didn't place them further forward. I think aesthetics trumped there.

But he also verbalized concerns about camera range in many videos. I don't care enough to dig them up.
 
I'm not disagreeing with your point. I just think the Sun example was poor and that most of Chuck's qualms were about the placement of the B-pillar camera.

I've heard him say a higher-megapixel camera could see that better, and I agree with you that it's unlikely.
Anyone who says a higher megapixel camera, or one with a higher dynamic range, can see better is absolutely--right.

Of course. That's why HW4 is higher resolution than HW3. And why HW5 will be better than HW4.

And anyone who says that the human eye has greater dynamic range and resolution than HW3 or HW4 or HW5 or HW6 is absolutely--right.

But my dad was blind in one eye. My uncle was completely colorblind. Both of them drove fine. My wife has significant issues seeing at night, and she drives fine.

Low light sensitivity helps--but that's why we have headlights. Being able to see a breadcrumb stuck to a fly's thorax across the room is great, but the road system doesn't demand such visual acuity. It's more than what is needed to drive safely.
 
What I'm saying is: Tesla published some sort of "range" value on their AP cameras. And many people, like Chuck, took that to mean that the cameras can't see past that range.

My whole point was that you can still see stuff in a camera image that is well beyond the "200m range" or whatever they put on the AP site. So I don't understand why people just assume that beyond some distance x the camera just can't see cars anymore.

I read your sun test idea, but I still don't think the camera sees, or the computer behind it processes, as far as the human eye can, especially in low-light conditions. Example: when you drive on unlit country roads at night, the pillar cameras tell you they are blocked. I can still see the side of the road, the trees, etc. with the ambient light available, but the Tesla cameras think they are blocked, and the visualization shows me a pitch-black picture of the roadside.

Not saying vision only can't ever work. All I'm saying is that with the current cameras onboard our HW4 vehicles, the capability is still far from reaching, let alone exceeding, that of the human eye.

EDIT - I see your post above; seems like we were typing at about the same time. Your points are valid; however, the issue is what is and isn't possible with what we have in our cars as of today. I for one will be very happy to be proven wrong if Tesla can pull off L4/Robotaxi with HW4 vision only, as that would mean I don't have to replace my car, which I otherwise like very much! :)
 
  • Like
Reactions: Ben W