FSD Beta 10.69

When some of us experience "regressions" in behavior (i.e., worse handling of certain situations) with FSDb software updates, at least initially, before things seem to improve, what is the mechanism/reason? Is this a resetting/zeroing-out of fleet learning? Or something else?
This is purely speculation, but it may be bias built into, let's call it, the "certification" dataset. In training a neural network, you have training data and validation data that are used to train and tune the network. I believe they put a lot of effort into updating the training/validation data with new scenarios to capture new issues and add new capabilities. You then run the neural network over a testing dataset to confirm your results. This dataset must also be carefully curated to include scenarios exercising the new features/fixes being implemented. I would bet that, in addition to data captured from the fleet, they also simulate a lot of the testing dataset to test very specific scenarios.
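
To make the jargon concrete, here's a toy sketch of that train/validation/test split. Everything in it (the function name, the notion of "clips" as the unit of data) is my own illustration, not anything from Tesla:

```python
import random

def split_dataset(clips, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle labeled clips, then carve out train/validation/test partitions."""
    rng = random.Random(seed)
    clips = list(clips)
    rng.shuffle(clips)
    n_train = int(len(clips) * train_frac)
    n_val = int(len(clips) * val_frac)
    return (clips[:n_train],                   # training: fit the weights
            clips[n_train:n_train + n_val],    # validation: tune the network
            clips[n_train + n_val:])           # test: confirm the results

# Each cycle, new fleet captures and simulated scenarios would be folded
# into `clips` before re-splitting; that's the curation step described above.
```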

At some point, however (and this is where the speculation comes in), they must run some sort of simulated tests over a comprehensive "certification" dataset to make sure they haven't introduced some horrible or dangerous new behavior into FSD before releasing it to employees and other early-access program testers. This dataset may be fairly static, representing a baseline of functionality for certifying releases. And if that dataset was built from, e.g., California driving, then when the release rolls out to employees and EAP users in California, the testing and certification results will likely hold up. But what about when it reaches users in Montana? Outside of the particular Montana scenarios captured in the training/validation datasets, there's been no verification of how the new neural networks function in Montana driving. Thus regressions may show up in some areas or contexts that simply don't appear in others.
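
And here's the kind of "certification" gate I'm imagining: replay a frozen scenario suite through the candidate build and block the release if any scenario scores worse than the baseline. The names and structure are purely hypothetical; the point is that if the suite skews toward one region, the other regions never get checked:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class Scenario:
    id: str
    region: str  # if most entries say "CA" here, Montana is never exercised

def certify(score_fn: Callable[[Scenario], float],
            baseline: Dict[str, float],
            suite: List[Scenario],
            tolerance: float = 0.0) -> Tuple[bool, List[str]]:
    """Pass only if no scenario regresses below its frozen baseline score.

    score_fn stands in for whatever simulator/replay harness runs the
    candidate build over a scenario and returns a quality metric.
    """
    failures = [s.id for s in suite
                if score_fn(s) < baseline[s.id] - tolerance]
    return (not failures), failures
```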

I will add that this could be the case for the feature-recognition (road markings, lighted signals, signs, etc.) and driving-decision (lane selection, stop light/stop sign behavior, etc.) neural networks. When it comes to the vision-processing and occupancy networks, this type of bias should have no effect.

Let the flaming begin.
 
The term "regression" is often used when a previously hard-coded function is transitioned to a newly created neural net, or when old neural nets are decommissioned and replaced with newer, streamlined nets. The nets have to be trained, so the functionality seems worse until the fleet data is processed and updates tweak the performance.

I think this is the point that causes frustration for many FSD Beta users who feel that an update made things worse. The goal is to improve functionality and capability, but it's our job to test the update, which provides valuable data to Tesla for training the NNs in real-world situations.

Any time I read about the creation of a neural net, or the removal of an older one, in the release notes, I know there will be some rough driving ahead. I prepare for it, expect it, and report issues so that the net can be trained. As updates roll out, that issue is usually smoothed out.

Remember, people: we signed up for this. We asked to be invited, had to prove we were safe drivers, had to agree to disclosures and language telling us that the car can do the wrong thing at the wrong time, and have to pay extreme attention while using the system, reporting as we go. We do this to make the system better, and safer, for everyone.

43,000 people died in car crashes last year in the US. If our efforts can produce an ADAS system that saves even a few of those lives, it's worth it in my book.
 
An excellent post, Goose. It does seem that California drivers benefit more from the updates than other areas do. That said, not all of the influencer videos I see showing excellent FSD performance are from California, which could indicate that those regions are similar to California in terms of street design, signage, traffic controls, etc.

Hopefully the testers in Montana are reporting their issues, which should give Tesla valuable data for training. We just don't know how that data is being used, or whether Tesla prioritizes training for the more common traffic conditions.
 
Cool. My understanding also is that there is no "learning" in the vehicle: the model is trained centrally, then runs on the individual cars. There is no local AI.
Thing is, Everybody Says That. But I have my doubts. First: I do have a background in some serious DSP. And for some time I kept up with the beginnings and development of neural networks.

Feedback in neural networks, where the weights interior to the network and a selection of its outputs feed back into the input data array, has been part of the model since Day One. Heck, we of the wetware persuasion do this all the time.

Doing this sans constraints is likely a good way to descend into madness. But multivariate constraints on multivariate input, multivariate output, with multivariate feedback have been a thing for a very, very long time. One can constrain interior state variables, the range of inputs, the range of outputs, and the ranges of feedback pretty much as one sees fit, without even resorting to what goes on in the various stages of neural networking. And, yeah, I've done this, non-neural-network style, on and off as part of work for decades.
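
For the curious, here's roughly what I mean in a dozen lines of NumPy: a generic recurrent step where a slice of the outputs gets fed back into the next input, with hard clips serving as the constraints. A textbook construction, not a claim about Tesla's actual nets:

```python
import numpy as np

def recurrent_step(x, state, W_in, W_rec, W_out,
                   state_lim=1.0, out_lim=1.0):
    """One tick of a net with interior feedback and multivariate constraints."""
    state = np.tanh(W_in @ x + W_rec @ state)      # interior state feedback
    state = np.clip(state, -state_lim, state_lim)  # constrain the state variables
    y = np.clip(W_out @ state, -out_lim, out_lim)  # constrain the outputs
    return y, state

# Output-to-input feedback: splice part of y into the next input vector, e.g.
# x_next = np.concatenate([sensor_vector, y[:4]])
```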

It gets even weirder. I mentioned the words "State Variables". Let's see if I can give an example. There's this class of problems called "System Identification". A classic one was a Soviet-era problem involving predicting the water level of a large river system with feeder rivers in Northern Siberia, the better to predict flooding, set dam flow levels, and all that. One might have a bunch of rain gauges scattered across several thousand square miles as basic input, and some automated water-level monitors here and there. But complete monitoring was pretty much impossible: way too much area and not enough money to instrument it all.

So, create models. Lots of models. Create the models algorithmically. The main intent is to feed input data into the model, look at the data coming out, compare it with real, measured data from the field, then run algorithms that re-run the model as a function of time, changing weights, function types, and the number of time-varying state variables, modifying the model to minimize the error between predicted and actual outputs. Do enough of this, stepping the complexity of the model up as one goes, and eventually it can be used to predict what the various water levels in the river system are going to be like due to rainfall here, there, and everywhere, given a starting configuration of water levels at T=0. As I said, this is a problem in system identification.
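
In code, the simplest version of that fitting loop looks something like the toy ARX model below: fit the weights by least squares against the measured outputs, and step up the complexity (here, just the number of lags) until the prediction error is acceptable. This is a classroom sketch, not the Soviet hydrology code:

```python
import numpy as np

def identify(u, y, n_lags=3):
    """Fit y[t] = sum(a_i * y[t-i]) + sum(b_i * u[t-i]) by least squares.

    u: measured inputs (say, rain-gauge readings)
    y: measured outputs (say, water levels)
    Minimizes the error between predicted and actual outputs.
    """
    rows, targets = [], []
    for t in range(n_lags, len(y)):
        rows.append(np.concatenate([y[t - n_lags:t], u[t - n_lags:t]]))
        targets.append(y[t])
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return theta  # fitted weights; raise n_lags to step up model complexity
```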

Thing is, the model itself has state variables that change over time. But those state variables don't have to have a basis in reality. They may not be water flow rates, pond levels, resistance to water flow, or anything else: what values these variables take doesn't matter, so long as the end result works. Further, if constraining those state variables makes the model work better, then, well, why not?

Interestingly, the starting values for those not-based-in-reality state variables can make big differences in what the end results might be.
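
A five-line demonstration of that sensitivity, using the chaotic logistic map as a stand-in for any nonlinear model with internal state; two nearby starting values end up in completely different places:

```python
def simulate(x0, steps=50, r=3.9):
    """Iterate the logistic map (chaotic at r = 3.9) from initial state x0."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x

print(simulate(0.100))  # one starting value...
print(simulate(0.101))  # ...a nearby one lands somewhere else entirely
```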

So, now let's switch back to Tesla. Big, complicated algorithms making now-you-see-it, now-you-don't decisions on which way to steer, how to set the accelerator, how to set the brakes, with a fast-processing neural network built in, multitudinous interior state variables both inside the neural network and outside it, feedback of all sorts up the wazoo, and constraints everywhere. Finally, the whole business is a research project: NOBODY has done stuff like this before. Research is sometimes jokingly referred to as "the process of running up alleys to find out if they're blind."

I've mentioned the idea that there might be feedback-sensitive neural network algorithms built into Tesla's software and hardware. Others have pulled up tweets and such from people inside Tesla stating that They Don't Do That. Well, that may or may not be completely true. But even if it is, initial values count. So do constraints. And... Tesla needs data on how well all this works in the real world.

As an example of how this might work: seed the state variables at fixed intervals (on power-up? before each drive? who knows?) with a random number generator, with or without constraints, with or without correlation in how the other state variables are set. See how it all works against performance criteria that Tesla designs and tracks. Report the numbers back to the mothership. Use that to change stuff going forward: either with the next software release or, if one wants to get strange, on the next drive 😁.
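
Sketched out, the scheme might look like this. Every identifier below is invented; it's just the shape of the idea: seed constrained state at the start of a drive, score the drive, phone home:

```python
import json
import random

def seed_state(n_vars=32, limits=(-0.5, 0.5)):
    """Seed the free state variables with a constrained RNG at power-up."""
    rng = random.Random()  # fresh, unseeded: different values every drive
    return [rng.uniform(*limits) for _ in range(n_vars)]

def drive_report(seed_values, metrics):
    """Bundle the seed and the tracked performance numbers for the mothership.

    metrics: whatever Tesla designs and tracks (disengagements, interventions,
    comfort scores...); all hypothetical here.
    """
    return json.dumps({"seed_state": seed_values, "metrics": metrics})
```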

Thing is: Proving that Tesla does or doesn't do any of the above is hard to do. Except... I've certainly seen different behavior on different days. Just watch our guy with his unprotected left turns: is it just minor changes in the environment, or is it something deeper than that?

Fun.
 
Thing is: Proving that Tesla does or doesn't do any of the above is hard to do.
Well, I think there's a measure of common sense that can be injected here. It sounds to me like the type of ML you're talking about in your post is recurrent NNs used to predict trends or states in chaotic data over time, like weather forecasting or price/demand modeling. These NNs predict future state over long periods starting from current-state data and past trends, some of which must be assumed at the beginning. Pseudo-randomizing the initial current-state and/or trend data can help in training and verifying different models.

However, in vehicle automation, the entire input state is environmental: camera, radar, and USS input, speed, direction, acceleration, temperature, light level, drive mode, map data, etc. all can be determined from sensors. None of the current state depends on output from the NNs (not counting indirectly through manipulation of the environment by the steering/acceleration). Even with a substantial amount of recurrency in the NNs, any effect from initially randomizing any of the state values would be gone in seconds.
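
You can see that washout in a trivial leaky-state model: with a recurrent gain below 1, two wildly different initial states converge onto the same trajectory after a few seconds of identical sensor input (the 36 Hz frame rate is just an assumption for the arithmetic):

```python
# Two very different initial states, identical input stream.
leak = 0.9            # recurrent gain < 1
steps = 36 * 5        # ~5 seconds at an assumed 36 Hz frame rate

s1, s2 = -3.0, 5.0    # wildly different starting states
for _ in range(steps):
    u = 1.0           # the same sensor-driven input every tick
    s1 = leak * s1 + (1 - leak) * u
    s2 = leak * s2 + (1 - leak) * u

print(abs(s1 - s2))   # about 8 * 0.9**180, i.e. effectively zero
```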

As far as actually randomizing (or even locally optimizing) the weighting in the NNs themselves, I don't see how the car would be capable of doing such without vast amounts of onboard validation and testing data to make sure some tweak didn't take things completely off the rails.
 
I don't mind regressions, because they're typical, and after a week or so FSD seems to settle down: the regressions either lessen or I get used to them. But every once in a while I get a release with some nasty and dangerous regressions. 10.69.2.4 has been that for me. And yet FSD has also handled some situations better than ever.
 
I brought up that "System Identification" business not because I think Tesla runs that kind of code, but because when one gets into seriously complex control systems, and that pretty much describes FSDb if anything does, the control loops, state variables, feedback systems, and what-all begin to resemble a lot of other mathematical/coding methods used in other places.

State variables are used in System Identification, sure. But state variables and their initial values are used in control systems everywhere. And the point I wanted to make was that these variables may or may not have any direct correlation with physical parameters like mass, velocity, speed, and what-all. In fact, when one is trying to optimize a created complex control system to Do Something That One Wants, the state variables that pop out of the mess, and what they represent, may be a complete mystery to the control system's creators. Just like in System ID, sure.

If that's the case (and if you haven't gotten the idea that I'm waving my arms around pretty hard right now, I'll let you know that I'm levitating at the moment), varying the initial values of the state variables/vectors to find out just what they're doing to FSDb's operation on the road might very well be a high-priority project within Tesla. Sure: set the initial values over that way and the car ends up driving the wrong way down a one-way street; set them over this way and the car drives like it's being navigated by a 100-year-old granny with lousy reflexes. Somewhere in the middle shows much better results... but what's the range, anyway? And why is it doing all that, anyway?
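
If I were running that hypothetical project, the sweep itself would be almost trivial; the expensive part is the simulator behind score_drive, which I'm hand-waving into a single callback here:

```python
import itertools

def sweep_initial_states(score_drive, grid=(-1.0, -0.5, 0.0, 0.5, 1.0),
                         n_vars=3):
    """Replay the same test route for every combination of initial state values.

    score_drive: assumed simulation harness; takes a tuple of initial state
    values and returns a driving-quality score (higher is better).
    """
    results = {init: score_drive(init)
               for init in itertools.product(grid, repeat=n_vars)}
    best = max(results, key=results.get)
    return best, results  # the sweet spot, plus the shape of the whole range
```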

It's all handwaving, and it may not be what Tesla is doing anyway, but, well, people do see different driving dynamics on a day-to-day basis around here. So, maybe?
 
Regarding regressions, I have a tricky intersection that I often use in day-to-day driving. Each release of FSDb handles this intersection differently. It's a right-hand turn onto a three-lane tollway frontage road that requires the car to quickly get into the left lane to enter the tollway. It used to be that FSDb always insisted on turning into the rightmost lane, then could not make the two required lane changes in the short space before the onramp. A few versions ago, FSDb started turning into the middle lane and was able to get onto the tollway successfully, most of the time. The last two versions of FSDb tend to stop the car as it acts confused about changing into the left lane. It doesn't matter whether there are any other cars around me. Big fail.

This is consistent to the point that I simply disengage and make the turn manually. It is absolutely a regression, as FSDb was making this turn successfully just a couple of versions ago.

And for those who think the car can learn, I can say that after attempting this intersection about 10 times, the car has shown no signs of either improvement or further regression. It makes the same mistake every time.
 
The car has to stay in the two leftmost lanes in order to make the turn. However, every time, the car moves to the rightmost lane and then tries to squeeze in and turn from the middle lane, which is not allowed. On previous attempts it would get stuck and give up. This time, since there were not that many cars around, it made a clumsy left turn, but I disengaged at that point.
 
It's actually really nice that the Autopilot team is able to roll out multiple FSDb versions at once. That way, we can be bumped up to the 2022.36 feature branch and start providing data on how the velocity estimation/lane guidance is performing, while the employees collect highway performance data and refine the single stack.

Much preferable to being stuck on a several-months-old feature branch while kinks in the bleeding edge are ironed out.