
March of 9s: The Beginning of the End

I have had enough experience now with FSD 12 to say that, in my opinion, Ashok is correct:


I would also stress that we are at the beginning of the end. That is, we will hopefully now start to see significant improvement in the "known FSD issues" and "corner case disengagements" with every major FSD release. "The end" (Level 3/4 autonomy within a wide Operational Design Domain (ODD)) is almost certainly a year or more away. I'd like to document where we are "starting from," using one consistent anecdotal use case that can be repeated over time.

I will be driving a 90ish mile (2-3 hrs) "loop" under FSD to cover a range of driving scenarios. (Mileage is approximate):

For privacy reasons, about 15 miles of my loop (the portion that takes me to/from my actual home) are not included in the linked route.

[Attached image: 1713797820977.png]


Here is a Link to inspect the route in detail

This route takes me from the NJ suburbs into and out of Manhattan (NYC) and includes approximately:
  • 10 miles of suburban driving
  • 65 miles of "limited access highway" driving (this should use the FSD 'highway stack', which is not the same as the FSD 12 stack)
    • Includes interchanges
    • Includes tolls
    • Includes a tunnel
  • 8 miles of other "highway type" driving (will probably fall under the FSD 12 stack)
  • 6 miles of dense city driving, including areas around Times Square, Rockefeller Center, etc., which will have dense vehicle and pedestrian traffic.
I will not be recording with a phone or anything like that. However, I will try to save dashcam footage of anything notable.

I will report on:
  1. Interventions (Accelerator presses, particularly if safety related)
  2. Disengagements (comfort or safety related)
  3. Overall impressions
As we know, version 12.3.x does not support the following (but will need to in the future):
  1. Smart Summon or "Banish", so what I call the "first 100 yards and final 100 yards" (drop-offs / pick-ups) is not available to test.
  2. "Reversing" while on FSD is not yet supported.
Finally, there are what I would call two well-documented "comfort / safety" issues with FSD 12.3.x that I have also regularly experienced first hand:
  1. "Lane Selection Wobble": for example, approaching an intersection where a single driving lane splits into multiple lanes (turning vs. straight), FSD may act "indecisively".
  2. Unprotected turn (stop sign) behavior. Notably, it stops early, then creeps. If no cars are detected, it may creep into the intersection instead of "just going". Further, if it has crept into the intersection and THEN detects a car approaching, it may still hesitate and require an intervention (accelerator press) to get it going.
In addition to those two consistent issues, I expect to encounter some issues related to routing and any number of other 'corner case' issues. These are all things that will ultimately need to be handled, and that we expect to see dealt with as we progress through the "March of 9s" toward the "end of the end".

Although I have driven FSD regularly over the past 3 weeks...I have yet to take it into NYC.

Vehicle: Refresh Model S (2023), Vision Only, HW4. First test will be using:
Firmware version: 11.1 (2024.3.15)
FSD Version: 12.3.4

So...there's the set-up. I expect later today to drive the first loop.
 
Loops are a great way to test the car against road and signage conditions. Traffic too, if it is consistent. I have a much shorter drive that I usually end up doing at night so not many cars to deal with. In my loop, the Satnav takes me one way to the destination, but another way home, which seems odd, but it is consistent.

I have the same two issues, which I describe as 'The car drives like my sister, not me.' I don't think safety-related slowing is really a concern, nor is driving on a painted line for a bit. I love that I can actually affect the car by prodding with my foot when I want to go a bit faster.

I'm finding that around Phoenix, Arizona, FSD works well enough I can usually just let it get me somewhere. It seems to blend in well at night, but drives more timidly than normal daytime traffic. Most of my disengagements are simply to go faster, but some are because I don't want to turn at some intersection chosen by the Satnav. I disengage, drive the route I want, then re-engage. There should be a way to alter the route by touching the Satnav screen somewhere.

The car seems really safe under FSD except for the times other drivers can't figure out what it will do. They must think I'm crazy when I use a blinker to change lanes and the car overrules me.
 
There should be a way to alter the route by touching the Satnav screen somewhere.
You can long press on the Satnav screen on the location of the "waypoint" you want to create, then click on the "+Pin" button to add the waypoint to your drive, altering your route. In many cases, it will add the waypoint in the correct order, but you may need to "edit route" and drag the waypoints into the correct order if you have multiple waypoints.

When using Waypoints with FSD, I have found that the car may want to "stop" at the waypoints...so you may need to accelerate through them or better yet, remove the waypoint from the navigation once you are getting close to it.
 
Thanks in advance for this test. Could you provide a little more detail when you post your final result? E.g. year of your MS, actual software version. I’m on V11.1 2024.3.15. I don’t know how the versions vary by model but it would help to understand what you are using.
 
Sure...I updated the first post with vehicle and software info. I will repeat that info when reporting my impressions.
 
would also stress we are at the beginning of the end. That is, we will hopefully now start to see significant improvement with the "known FSD issues" and "corner case disengagements" with every major FSD release. "The end" (level 3 /4 autonomy within a wide Operational Design Domain (ODD)) is almost certainly year+ away.
What is unknown is how long it will actually take: 1+ year, 5+ years, 10+ years? Nobody knows.

For example, how long will it take for Tesla to recognize and respond to school buses and school zones? Surely they have enough training material. It isn't an edge case, and it is very easy to gather more training data if they want. They know the deficiency but have not tried to address it. That makes me think it isn't easy.
 
Crossing/riding the left line isn’t usually a problem, but when there’s a wall and no shoulder it can be very uncomfortably close. Sadly, I’ve recently gotten a flat tire because of this. I’m responsible still and should have disengaged, but something flashed under the car too quickly and I didn’t see it in time before the front left tire hit it and blew out.
 
What is unknown is how long it will actually take: 1+ year, 5+ years, 10+ years? Nobody knows.

For example, how long will it take for Tesla to recognize and respond to school buses and school zones? Surely they have enough training material. It isn't an edge case, and it is very easy to gather more training data if they want. They know the deficiency but have not tried to address it. That makes me think it isn't easy.
It most definitely is not easy. You need to train for edge cases for proper behavior without "overtraining" leading to regressions in other behaviors.
 
Yes - more specifically, with V10 and V11 we saw that FSD hit a local maximum and didn't improve significantly for a long time. Why would it be any different this time? Or, more accurately, what will be the proof that it is the same or different this time?
 
The proof this time will be:
1) How often "major" releases are released to the public. The current talk is that FSD 12.4 is the next "significant" release. The first V12.3.x release (to the public) was around April 1. Let's see when the first 12.4 release reaches the public. (Is Tesla able to iterate relatively quickly, which would back their claim of no longer being "compute" limited?)

2) What kinds of improvements happen (or not) with those releases, and are there significant regressions?

If the next 'big release' comes and we see little actual improvement, or major regressions, we could be looking at a new local maximum.

Note, we also still have to see Tesla integrate Smart Summon and Banish, and potentially integrate the highway so there is again "one stack to rule them all". Though it's not clear to me whether Tesla will prefer to maintain different stacks for highway vs. city streets. Until these things happen, "door to door" Level 3+ autonomy cannot happen. So we'll also see how long it takes Tesla to finally integrate those features.
 
2) What kinds of improvements happen (or not) with those releases, and are there significant regressions?

If the next 'big release' comes and we see little actual improvement, or major regressions, we could be looking at a new local maximum.
Right - it's the pace of improvement that we need to figure out. Without that, it's all speculation.

Specifically, I want Tesla to publish categorized hard data on interventions/disengagements. We need to be able to track how that metric changes from release to release. If Tesla is willing to do that, it shows they are confident they can make progress. If they don't publish it, that suggests they are not confident.
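
As a rough illustration of the kind of per-release metric described above, here is a minimal Python sketch that turns a drive log into "miles per event" for each category and release. Everything in it is an assumption for illustration: the record layout, the function name `miles_per_event`, the category labels, and the release labels "A" and "B". The sample numbers are invented placeholders, not results from any real drive or from Tesla.

```python
from collections import defaultdict


def miles_per_event(drives):
    """Return {release: {category: miles driven per logged event}}.

    Each drive is a dict with hypothetical keys:
      "release": label of the software release used for the drive
      "miles":   miles driven under FSD on that drive
      "events":  {category: count} of interventions/disengagements
    A category with zero events in a release maps to float("inf").
    """
    total_miles = defaultdict(float)
    event_counts = defaultdict(lambda: defaultdict(int))

    for drive in drives:
        release = drive["release"]
        total_miles[release] += drive["miles"]
        for category, count in drive["events"].items():
            event_counts[release][category] += count

    result = {}
    for release, miles in total_miles.items():
        result[release] = {
            category: (miles / count if count else float("inf"))
            for category, count in event_counts[release].items()
        }
    return result


# Invented placeholder log: two loops on release "A", one on release "B".
sample_log = [
    {"release": "A", "miles": 90.0, "events": {"safety": 3, "comfort": 6}},
    {"release": "A", "miles": 90.0, "events": {"safety": 1, "comfort": 0}},
    {"release": "B", "miles": 90.0, "events": {"safety": 0, "comfort": 2}},
]

for release, rates in miles_per_event(sample_log).items():
    print(release, rates)
```

With only a handful of loops per release, numbers like these are anecdotal at best, which is the same caveat that applies to any single repeated route.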
 
Right - it's the pace of improvement that we need to figure out. Without that, it's all speculation.

Specifically, I want Tesla to publish categorized hard data on interventions/disengagements.

While I'm sure they will be (and probably have been) sharing that detail with regulators, I doubt they would share it publicly, at least in detail. Maybe in grand generalizations, sure.

The purpose of this thread is to try and at least (anecdotally and sometimes subjectively) see how the improvements are tracking around a fairly consistent "course".
 
1) I'm not seeing the step change people are talking about.
[Attached image: Screenshot 2024-04-22 at 18.59.42.png]


2) ADAS won't become ADS without major re-architecting and new approaches. Drago commented on this the other day.

3) The march of nines begins when you have hundreds of drives in a row without interventions. If Ashok doesn't understand that, then he's an overpaid clown. Or perhaps it's simply that he has another goal for his bonus-check than autonomy.
 
Right - it's the pace of improvement that we need to figure out. Without that, it's all speculation.

Specifically, I want Tesla to publish categorized hard data on interventions/disengagements. We need to be able to track how that metric changes from release to release. If Tesla is willing to do that, it shows they are confident they can make progress. If they don't publish it, that suggests they are not confident.
Tesla is confidently introverted.
 
The march of nines begins when you have hundreds of drives in a row without interventions. If Ashok doesn't understand that, then he's an overpaid clown. Or perhaps it's simply that he has another goal for his bonus-check than autonomy.
Keep this (several times cross posted) crap out of this thread, please. If you want to start (another) thread ranting about how this is impossible...go ahead. A different thread.
 