I'll say this, then I have to get back to point cloud processing :).

Think about all the complaints we have with just this initial end-to-end version. They are overwhelmingly not issues with cameras seeing things. They are:
-Poor speed control
-Stopping too soon/early at stop signs
-Creeping too slowly for visibility at the intersection
-Waffling when changing lanes or entering a turn lane
-Taking some turns too tight, curbing wheels
-Etc.

Even Chuck's turn is a case of 12.3 not being able to handle that intersection well--it's not that it can't see the cars coming down the road.

When I had v11.4.9 the car would try to pull out into traffic right in front of a fast-moving car almost every time. And I have HW4.

It's not that the cameras can't see. It's planning, or some neural network in the system that needs more work. 99.99% of FSD's issues right now are downstream of the perception stage.
 
I read your sun test idea, but I still don't think the cameras (or the computer behind them) can see as far as the human eye, especially in low-light conditions. Example: when you drive on unlit country roads at night, the pillar cameras report that they are blocked. I can still see the side of the road, the trees, etc. with the ambient light available, but the Tesla cameras think they are blocked, and the visualization shows me a pitch-black picture of the roadside.

Not saying vision only can't ever work. All I'm saying is that with the current cameras we have onboard our HW4 vehicles, the capability is still far from matching, let alone exceeding, the human eye.
There's no way the HW3 or HW4 cameras see anywhere near as well as the human eye. You are absolutely right and I 100% agree with you. I am not claiming that and there's tons of scientific evidence that proves that few cameras in existence can approach the capabilities of the human eye.

What I'm saying is that driving can be done safely with cameras that are known to have less capability than the human eye. USAF drone pilots regularly fly multi-million dollar UAVs using cameras that have much lower acuity than the human eye, for example.
 
For L4 performance, you have to be able to accurately assess distance and velocity reliably.

For red lights, or yellow lights turning red, I've MANY times seen the car react well in advance of the time required to come to a nice relaxed stop. (I'm not saying it came to a relaxed stop -- the car doesn't know how to stop -- just that it reacted with AMPLE margin and then completely f'ed up the profile.) Other times, I have seen the car fail to react for a long time, then suddenly react, slamming on the brakes, etc.
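For reference, a quick back-of-the-envelope on what "ample margin" means at typical speeds -- this is purely my own arithmetic, assuming a relaxed stop is roughly 2 m/s² of braking plus a bit of reaction latency:

```python
# My assumptions, not Tesla's planner: a "relaxed" stop ~= 2 m/s^2 of braking.
speed_mph = 45
decel = 2.0                       # m/s^2, comfortable deceleration
reaction_delay = 0.5              # s, assumed perception/planning latency

v = speed_mph * 0.44704           # ~20.1 m/s
braking_dist = v**2 / (2 * decel)             # ~101 m
total_dist = v * reaction_delay + braking_dist
time_needed = reaction_delay + v / decel      # ~10.6 s

print(f"A gentle stop from {speed_mph} mph needs ~{total_dist:.0f} m / ~{time_needed:.1f} s")
```

So "well in advance" means reacting 10+ seconds and 100+ meters out, which the car clearly does sometimes -- and then fails to do other times.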

There's no way for us to determine why this is. One possibility is that the reliability of detection is poor (I don't think that's the case for the red light case - I think it is planning - but it is possible the vehicle did not properly perceive the red light on occasion due to lighting, etc.).

Anyway, for the cross traffic case on Chuck's turn, it's very clear the cameras have ample range to perform the turn reliably at least 9/10 times. We've seen it make the right go/no-go decision in recent videos 100% of the time (resulting in a 3/7 success rate). Based on many examples of the car's response, they can detect at least around 150 meters out (presumably including some estimate of velocity), sometimes more. But it's really unclear how reliably the cameras and perception system can detect position and velocity at this range.
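To put rough numbers on why ~150 meters matters, here's my own back-of-the-envelope (assumed values: oncoming traffic at ~55 mph and roughly 4-5 seconds to launch and clear the lane):

```python
# Margin math for a Chuck-style unprotected turn; all values are my assumptions.
detection_range_m = 150           # approx range at which the car seems to react
cross_speed_mph = 55              # assumed speed of approaching traffic
turn_clear_time_s = 4.5           # assumed time to accelerate across/into the lane

v = cross_speed_mph * 0.44704             # ~24.6 m/s
time_to_arrival = detection_range_m / v   # ~6.1 s until that car reaches you
margin = time_to_arrival - turn_clear_time_s

print(f"Detected at {detection_range_m} m -> {time_to_arrival:.1f} s away; "
      f"spare margin after the turn: {margin:.1f} s")
```

Roughly 6 seconds of warning against 4-5 seconds to clear: enough for the first 9, but tight enough that noise in the range and velocity estimates at that distance could easily matter at the 99/100 level.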

It's possible that reliability/noise becomes an issue when we get to the 99/100 or 999/1000 level. It seems extremely unlikely to prevent getting to the first 9. You can still have one fatal t-bone collision out of 10 attempts to achieve the first 9 (I guess it has to be the last one)! (I hope Chuck continues to be extremely careful - it's quite a bit more hazardous to supervise the system than it is to drive the turn yourself, as I'm sure we all know from experiencing this.)
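On that point, simple independence math (my illustrative numbers, nothing measured) shows how quickly "pretty reliable" still bites you:

```python
# Chance of at least one failure over many attempts, assuming independent attempts.
def p_at_least_one_failure(per_attempt_success, attempts):
    return 1 - per_attempt_success ** attempts

for p in (0.9, 0.99, 0.999):
    print(f"per-attempt success {p}: "
          f"P(>=1 failure in 100 tries) = {p_at_least_one_failure(p, 100):.1%}")
```

Even at 99% per attempt you would more likely than not see a failure somewhere in 100 tries, which is why the first 9 tells us so little about the later 9s.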

I have no idea how we could assess this, because we can't know what happened on any particular failure.

The better the visual system and the perception system, the lower the rate of failure will be. It needs to be really low, and we have no idea what the failure rate currently is.
 
There's no way the HW3 or HW4 cameras see anywhere near as well as the human eye. You are absolutely right and I 100% agree with you. I am not claiming that and there's tons of scientific evidence that proves that few cameras in existence can approach the capabilities of the human eye.

What I'm saying is that driving can be done safely with cameras that are known to have less capability than the human eye. USAF drone pilots regularly fly multi-million dollar UAVs using cameras that have much lower acuity than the human eye, for example.

I really want you to be right and my 2023 MYLR to reach L4/Robotaxi without any need for retrofit. I really want Tesla/Elon to be right on this. I just feel the rate of progress is not fast enough.
 
EDIT - I see your post above; seems like we were typing at about the same time. Your points are valid; however, the issue is what is and isn't possible with what we have in our cars as of today. I for one will be very happy to be proven wrong if Tesla can pull off L4/Robotaxi with HW4 vision only, as that would mean I don't have to replace my car, which otherwise I like very much! :)
Elon and Ashok have seen 12.4 and early 12.5 versions. Elon accelerated RT plans right around the time he would have been seeing 12.4 or early 12.5 results. I'll just say this: I don't think that's a coincidence.
 
I really want you to be right and my 2023 MYLR to reach L4/Robotaxi without any need for retrofit. I really want Tesla/Elon to be right on this. I just feel the rate of progress is not fast enough.
The rate sucked for quite a while as the team was pursuing hard-coded control code. That was never going to work. But they've now seen the light. I would classify the improvement from 11.4.9 to 12.3 as huge.
 
Elon and Ashok have seen 12.4 and early 12.5 versions. Elon accelerated RT plans right around the time he would have been seeing 12.4 or early 12.5 results. I'll just say this: I don't think that's a coincidence.

There are numerous threads bashing Elon so I don't mean to add more on it but allow me to say this much. Elon is saying or not saying things based on how he wants to shape the Tesla narrative and stock ticker curve. I think many of us have learned to take things with a grain of salt when it comes from his mouth or X posts.
 
There are numerous threads bashing Elon so I don't mean to add more on it but allow me to say this much. Elon is saying or not saying things based on how he wants to shape the Tesla narrative and stock ticker curve. I think many of us have learned to take things with a grain of salt when it comes from his mouth or X posts.
You're fairly new here. If you think Elon's concerned about the TSLA stock price, you definitely haven't done your homework. As someone who has owned TSLA since the IPO, I have "enjoyed" the fruits of Elon's stock pumping.

Like the time he said the TSLA price was too high. Or said that without autonomy, Tesla is worth nothing.

Ask me how I know. I've had many 6-figure down days as a stockholder as a result of Elon's "significant interest" in propping up the TSLA stock price.

But I'll tell you one thing, if there's one thing Elon doesn't care much about, it's short-term stock price swings.
 
The rate sucked for quite a while as the team was pursuing hard-coded control code. That was never going to work. But they've now seen the light. I would classify the improvement from 11.4.9 to 12.3 as huge.
I was hoping that the move to V12 would result in really fast new releases, but so far that hasn't been the case. We've now at times waited as long as the longest gaps between previous versions, and it hasn't been a real update every 2 weeks...only minor bug fixes, which were frequent on previous versions as well.

I think there's substantive proof that Elon isn't happy with the sales or direction of the company right now and potentially the stock price (as he wants more voting percentage and his comp package).

Until we see what the next real update is like vs. 12.3 in 12.4, it's all a guess at this point.
 
Three is a pattern. I'll be waiting to see 12.3, 12.4, and 12.5 before I start to form an opinion about progress moving forward.
Maybe, but 2 certainly will give us insight.

Odd that it is taking so long for 12.4, when Elon spoke about how great it was before 12.3 went wide. How much "cleaning up" needs to be done on each version before it's released? (obviously rhetorical).

I think the assumption was that releases, without as much hard coding, would be produced and deployed rapidly; probably our/my mistake in interpretation.
 
A little back on topic. To me, from 10.2 - 11.4, FSD was a fun CHALLENGE. I almost always used FSD, but it was "video game" type work/stress, and many times I would kinda dread driving in challenging urban areas. No doubt for me V12 is a "game changer" and still a fun experience, but NO longer a challenge or stressful. Also, instead of going to the gym, home, work, etc. directly, I now almost always put in waypoints to drive through different areas for "fun".

I have also been VERY skeptical that our cars will ever be L4. Now with V12 I have softened a little, but I'm still overall skeptical. I feel the camera locations preclude L4, but I can almost see a vision-only L4 Tesla robotaxi with the addition of a few cameras and a faster, more powerful HW5 processor. It still needs 4 cross-traffic cameras for full safety and will also need a front bumper camera. It can't be blind in the front bumper area and risk a "Cruise"-type accident where a person in front of the bumper is dragged under the car because it can't see them.

I'm still happy with our cars getting as good as they have and can definitely see a reliable point-to-point L2 for us within a year.
 
The best way to do that is to be refining one version (12.4) while you’re still earlier in the process for the next version (12.5).
I would guess 12.4 is fine-tuning from 12.3 failures whereas 12.3 was mostly refining from 11.x failures, and these earlier 12.0-12.4 were built on some base foundation understanding that needed to go through various iterations of training and fixing regressions before a new version like 12.4 is ready for customers.

Potentially 12.5 differs in that the base understanding has been retrained with more data, an improved architecture, and for longer with the available compute. The same type of fine-tuning needed before 12.3 was widely released would need to happen with this new foundation to avoid regressions for customers, but the Autopilot team can already see where it's much improved even if it's still weeks out from release.
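Purely my speculation on what that two-phase pattern could look like in the abstract: broad "foundation" training on fleet data, then targeted fine-tuning on curated failure clips. None of these names, sizes, or data are Tesla's; it's just a toy imitation-learning sketch:

```python
import torch
import torch.nn as nn

class TinyDrivingPolicy(nn.Module):
    """Stand-in for an end-to-end net: fused camera features in, controls out."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 2))

    def forward(self, feats):
        return self.net(feats)        # [steer, accel]

def train(model, clips, lr, epochs):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for feats, human_controls in clips:   # imitate what the human driver did
            opt.zero_grad()
            loss_fn(model(feats), human_controls).backward()
            opt.step()

model = TinyDrivingPolicy()
broad_fleet_clips = [(torch.randn(32, 256), torch.randn(32, 2)) for _ in range(10)]
curated_failure_clips = [(torch.randn(32, 256), torch.randn(32, 2)) for _ in range(3)]

train(model, broad_fleet_clips, lr=1e-3, epochs=2)      # "foundation" pass: broad data
train(model, curated_failure_clips, lr=1e-4, epochs=2)  # fine-tune pass: targeted fixes
```

If 12.5 really is a retrained foundation, that second pass is the part that still has to be redone before it's safe to ship widely.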
 
I think the assumption was that releases, without as much hard coding, would be produced and deployed rapidly; probably our/my mistake in interpretation.
They still have development tasks, only now they're curating training data, ordering training data, tuning the depth and width of network layers, tracking the consequences of each change, understanding which approaches converge on good behavior or bad, and so on. They have a research project of a safety-critical system on production hardware. I do not envy them.
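To make that list concrete, here's a hypothetical set of knobs such a team might sweep per training run; every name here is invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class TrainingRunConfig:
    # Curation: which fleet clips make it into the dataset at all
    include_categories: list = field(
        default_factory=lambda: ["unprotected_left", "stop_sign", "lane_change"])
    min_driver_quality_score: float = 0.8    # keep only clips from "good" drivers
    # Ordering: curriculum, e.g. common scenarios before rare edge cases
    curriculum: list = field(
        default_factory=lambda: ["highway", "urban", "edge_cases"])
    # Architecture sweep: depth and width of the network
    num_layers: int = 24
    hidden_width: int = 1024

run_a = TrainingRunConfig()
run_b = TrainingRunConfig(num_layers=32, hidden_width=1536)   # then compare behavior
```

Each change to one of those knobs presumably means another long training run before they even know whether it helped.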
 
Ben W said:
Removing humans from the equation won't be a reality for another 50 years, except perhaps for limited "autonomous-only" freeway lanes. There's no practical way to skip the intermediate step, unfortunately.

Yes, it's a long ways off and you can't skip the intermediate human+robot driver step. Though I imagine it could be done within 20 to 30 years. There are probably not many cars on the road older than that at any given time. I imagine it would require just a few years to draft and begin implementation of a protocol for inter-operation of autonomous vehicles. The government could mandate that all new vehicles incorporate the system. The system would add a modest cost to the car, but consumers could be incentivized with credits (similar to our current EV consumer incentives), or buy a used car in their price range. In 30 years the number of running cars without the system would be minimal. Those cars would need to be prohibited from being driven so the switch could be made to autonomous only roads.

Ben W said:
Yes, this is a big problem, and a reason I wish Tesla had designed more forward-compatibility into its compute hardware, so that e.g. a HW5 computer could be a drop-in replacement for HW3 or HW4. That would allow them to put more focus on a single newer compute platform, although it would still need to be adaptive to a range of sensor suites. It's also why I vastly prefer Lucid's sensor approach to Tesla's, in terms of forward-thinking. (Not that Lucid doesn't have other tremendous challenges.)

The thing is, there will always be advancements in hardware (better cameras, more cameras, new sensors, better placement of sensors), and it doesn't seem possible to future-proof the neural nets. Anytime a new sensor is added or enhanced, the NN will need to be rebuilt. For example, let's say Tesla adds 3 forward-facing cameras near the front bumper to enhance cross-traffic detection. Under the previous approach, only the NN responsible for perception needs to be retrained; the layer responsible for making decisions based on output from the perception layer would not need any adjustments. It would simply have a more detailed map of its environment with which it could make better decisions. With E2E, is there even a separate perception layer anymore? As I understand it, there is one continuous NN that would need to be retrained with a brand new set of training data featuring the new sensors. And that new data is not going to be of identical scenarios if you are re-capturing real-world footage, which means the new NN might introduce regressions for some situations.
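Here's a toy contrast of the two approaches, just to show where the retraining burden lands when a bumper camera is added; the module names and feature sizes are all made up:

```python
import torch
import torch.nn as nn

class Perception(nn.Module):
    """Maps N camera feeds to a fixed-size scene description."""
    def __init__(self, num_cameras):
        super().__init__()
        self.encoder = nn.Linear(num_cameras * 64, 128)   # fused camera features in

    def forward(self, cams):
        return self.encoder(cams)                         # 128-dim scene vector out

class Planner(nn.Module):
    """Consumes the scene description; never sees raw cameras."""
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(128, 2)                     # [steer, accel]

    def forward(self, scene):
        return self.head(scene)

# Modular stack: adding 3 bumper cameras only changes Perception's input width;
# the Planner keeps consuming the same 128-dim interface and needs no retraining.
perception_v2 = Perception(num_cameras=11)   # was 8 cameras, now 11
planner = Planner()                          # reused as-is

# End-to-end: one net from pixels to controls, so a sensor change means
# retraining the whole thing on new footage that includes the added cameras.
end_to_end_v2 = nn.Sequential(nn.Linear(11 * 64, 128), nn.ReLU(), nn.Linear(128, 2))
```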

Furthermore, since we are using machine learning, would adding sensors even help? The current placement of the cameras approximates what the human driver can see from the driver's seat. You could add 3 front bumper cameras, but since the human driver cannot see information from that vantage point, whatever information is picked up by those cameras wouldn't influence a human driver's decisions, and therefore wouldn't influence the behavior demonstrated in the training footage. Though I suppose this could be addressed with simulated footage.
 
Just a PSA for anyone who hasn't checked their Software tab in a few days:

I received 2024.3.20 (FSD v12.3.5) a few days ago, and then never got any app notifications about 2024.3.25 (FSD v12.3.6). Just checked the software tab in my car, and not only was 2024.3.25 available, but it was already downloaded and my car was prompting to install immediately or schedule an installation.