So will the next Gigafactory just be to house the server farm required for the march of the nines?

TL;DR: Probably better suited to the FSD thread, but hey, it's the weekend! It is happening faster than I think most realize, but you have a good grasp of the challenges at hand. I realize Tesla has hit some local maxima in the past and had to re-architect, but this time their approach appears to have no fundamental flaws, and they've exposed nearly the entire process. The more I typed below, the more questions might come up, so I'm happy to respond if pinged in the FSD thread...
To address your points:
I think a practical ramification would be demonstrating that safety is higher with FSD enabled, and thus that drivers who have it are inherently safer on the road with it on versus off. It would also let Tesla recognize revenue.
The method of releasing the beta first to a small subset of objectively measured, extraordinarily safe drivers was, I thought, brilliant. Demonstrate >10x safety statistically, then expand and repeat. At wide release, the eyes of the world will descend on the data even more than they do now. That's a good thing, because the data will demonstrate >10x safety, possibly much higher, like >100x based on what I'm calculating.
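To sanity-check a claim like that, the math is just a ratio of miles per collision with the feature engaged versus disengaged. Here's a minimal sketch; the function and the numbers in it are purely illustrative, not Tesla's actual figures:

```python
# Hypothetical illustration of the >10x safety math -- not Tesla's real data.
# The safety multiple is miles-per-collision with FSD engaged divided by
# miles-per-collision without it.

def safety_multiple(miles_on, collisions_on, miles_off, collisions_off):
    """Return how many times safer driving is with the feature on vs. off."""
    rate_on = miles_on / collisions_on      # miles per collision, feature on
    rate_off = miles_off / collisions_off   # miles per collision, feature off
    return rate_on / rate_off

# Made-up example numbers, chosen only to show the shape of the calculation:
print(safety_multiple(miles_on=5_000_000, collisions_on=1,
                      miles_off=500_000, collisions_off=1))  # -> 10.0
```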
For wide release, I would think they have objective and subjective key metrics that define a minimum level of practical usefulness and safety. I'd imagine one of them is the percentage of miles driven with FSD on versus off. I use it basically all the time, but until it gets smoother, with near-zero unnatural slowdowns, my wife won't use it. This is why I say wide release is imminent: they are targeting, and have largely solved, these top objective and subjective issues with the current 10.69.3.
With 10.69.2.4 there are still legitimate safety issues. I have two that recur on my normal routes (an unprotected left (UPL) without a creep wall, and several multi-lane roundabouts), but that is the entirety of where it fails on critical safety.
Do they need to solve these prior to wide release? No, as FSD will still have "beta" in its name.
Does it need to demonstrate objective >10x safety? I think they are targeting this.
Do they need the majority of people to use it for the majority of their miles? Totally, because that is the key to getting more useful data.
Do they have a way to consume the vast amounts of data that will be produced? That is what the 'human-out-of-the-loop' auto-labeller and Dojo are for. Obviously, they can do that with their GPU clusters in the meantime, just in slower iterative steps.
Is it possible they could hit a "local max" (aka a technological wall, as you put it)? Sure, but that is highly unlikely at this point, and it seems 10.69.3 is a test of that. This build looks like the first real step towards a wide release candidate. The items in the release notes suggest they are polishing out as much as they can to put out a very stable release.
The other thing I want to address for folks is this idea that the build is going to get exponentially better. For me, it has; however, that is a subjective judgment that will vary widely from person to person. Subjective human opinion is great and valued, but so are the objective measures. Objectively, we can measure it simply as (FYI, these are my actual numbers):
1. What % of the time is AP used on highway? >90% (meets criteria for wide release)
2. What % of the time is FSD used for straight surface streets? >70% (needs to be >90%)
3. What are the top issues holding back the biggest gains for surface streets?
Then you can more accurately guess: What *would* be the % used if the top three issues are addressed? >90% (thus meeting the criteria for wide release)
- 40% of all interventions are due to unnecessary slowdowns
- 30% are due to short-duration lane changes or missed turns because of being in the wrong lane
- 20% are due to the high jerk rate in steering wheel and throttle
- 10% are due to construction/school zones and roundabouts
- 10% other
So, practically, you'd start looking at ways to address these and I think they have done that with this build.
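As a back-of-the-envelope way to turn that intervention breakdown into a projected usage number, here's a toy sketch. The assumption that manually driven miles shrink in proportion to the interventions removed is mine alone, not anything Tesla has published:

```python
# Toy projection, not a Tesla methodology: assume the share of miles still
# driven manually shrinks in proportion to the interventions that get fixed.

current_usage = 0.70                      # my surface-street FSD usage today (>70% above)
top_three_share = 0.40 + 0.30 + 0.20      # slowdowns + lane selection + jerk

manual_share = 1.0 - current_usage        # miles I currently drive manually
projected_usage = 1.0 - manual_share * (1.0 - top_three_share)
print(f"Projected usage if the top three are fixed: {projected_usage:.0%}")  # -> 97%
```

Which is roughly how I arrive at the >90% guess above.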
Assuming they achieve wide release, what is next? That is when the true march of 9's begins.
My worry is that when their current implementation of the lane connectivity graph is applied, it is not able to achieve a 99.99999% success rate over time (where "over time" could be a year), and that the neural planner on top of it is also not able to achieve a 99.99999% success rate. These stacked tolerances are key to fully unsupervised (no human) driving. The reason I see it succeeding is that they are not only optimizing the NNs but are adding new ones that essentially supervise other NNs. While this might be considered a 'crutch', it is a faster way to get to a full solution: one NN might achieve some aspects of driving to an exceptionally high degree while not being as good at others. Take its output and feed it as input to another NN that *CAN* achieve high levels of success in those areas. This is why they have a 3D occupancy NN whose output feeds the occupancy flow NN, which feeds an object detection NN that outputs vectors for those objects, which is then used by the neural planner and finally for control. All of these have tolerances, or limits, on their accuracy, precision and recall. Stack them all together and the end-to-end figure becomes much lower. But if one layer can supervise another and correct its errors before they are pushed to the next (aka a two-layer model), then you start to reap the benefits of a layer without consuming the entirety of its weaknesses.
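To put rough numbers on why stacked tolerances matter: if each stage in a perception-to-control chain were independently right 99.9% of the time (a reliability I'm assuming purely for illustration), the end-to-end figure is the product, and it erodes quickly:

```python
# Illustration of stacked tolerances (assumed reliabilities, not measured ones).
# If each stage is independently correct with some probability, the end-to-end
# success rate is the product, which erodes fast as stages are chained.

stages = {
    "occupancy": 0.999,
    "occupancy flow": 0.999,
    "object detection + vectors": 0.999,
    "neural planner": 0.999,
    "control": 0.999,
}

end_to_end = 1.0
for name, reliability in stages.items():
    end_to_end *= reliability

print(f"End-to-end: {end_to_end:.5f}")
# ~0.99501 -- three nines per stage already costs about half a percent end to end.
# A supervising layer that catches another stage's errors before they propagate
# effectively raises that stage's reliability instead of multiplying in its flaws.
```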
And to close out, this is why I'm most convinced they have the right architecture, which is the lane connectivity graph. When it outputs *about* where the destination is located (beyond the current perspective of the real-time camera system), the neural planner can essentially *always* start you roughly in the right direction. This might sound simple and obvious, but creating this lane connectivity graph was seemingly profoundly difficult and complex. Have you noticed how on AI Day they used satellite photos of intersections to demonstrate how they build the topology? Obviously, they can't rely on satellite images in production, as those go stale almost immediately. They must be harvesting data from the fleet with a known age (freshness), building out their understanding of the intersection from those images (not necessarily NeRFs, but enough to build the topology), guiding the car with that learning, and at some point the information goes stale and has to be refreshed via the same or a similar process. It is this process that will enable the foundation of self-driving: being able to graph the possible destination options, then predict which one is best and be right 99.99999% of the time.
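For intuition only, here's a toy version of what a lane connectivity graph with freshness tracking might look like. The structure, field names, and the 90-day staleness threshold are all my own guesses, not anything Tesla has described:

```python
# Toy lane-connectivity graph with freshness tracking -- my own sketch,
# not Tesla's implementation. Nodes are lane segments; edges say which
# lane plausibly connects to which through an intersection.
from dataclasses import dataclass, field
from datetime import datetime, timedelta

@dataclass
class LaneNode:
    lane_id: str
    successors: list[str] = field(default_factory=list)  # reachable lane_ids
    last_observed: datetime = datetime.min                # last fleet observation

class LaneGraph:
    def __init__(self, max_age=timedelta(days=90)):       # staleness threshold (assumed)
        self.nodes: dict[str, LaneNode] = {}
        self.max_age = max_age

    def observe(self, lane_id: str, successors: list[str], when: datetime):
        """Fold a fresh fleet observation into the graph."""
        node = self.nodes.setdefault(lane_id, LaneNode(lane_id))
        node.successors = successors
        node.last_observed = when

    def candidates(self, lane_id: str, now: datetime) -> list[str] | None:
        """Return candidate exit lanes, or None if the data is stale."""
        node = self.nodes.get(lane_id)
        if node is None or now - node.last_observed > self.max_age:
            return None   # stale: fall back to real-time perception only
        return node.successors
```

The point isn't the data structure itself; it's that every lookup carries an age, so stale topology can be demoted back to pure real-time perception rather than silently steering the planner wrong.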
/s