Welcome to Tesla Motors Club
Discuss Tesla's Model S, Model 3, Model X, Model Y, Cybertruck, Roadster and More.
Chess? Please... That's a trivial sandboxed small problem space that you can brute force
I bring up chess because it's simpler in the hopes that people can better understand how reinforcement learning for predicting end-to-end controls progresses. The core part is that the same network size can be improved with examples of where it performed poorly as well as examples of how it could do better. Chess neural networks can beat humans even when only considering the current board position to predict which one action to take, i.e., it doesn't need to brute-force explore any next possible moves. (Yes, it'll play even better if it can think about future board positions too, but that doesn't seem as applicable to 12.x.)
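The "one action from the current position, no lookahead" idea can be sketched as a tiny policy network: board features go in, a probability over candidate moves comes out, and the move is just the argmax. Everything here (sizes, names, random weights) is hypothetical, purely to illustrate the shape of the computation.

```python
import numpy as np

rng = np.random.default_rng(0)

def policy_forward(board_vec, W1, W2):
    """One hidden-layer policy: board features -> move probabilities."""
    h = np.maximum(0.0, board_vec @ W1)      # ReLU hidden layer
    logits = h @ W2
    e = np.exp(logits - logits.max())        # stable softmax
    return e / e.sum()

# 64-feature board encoding, 8 candidate moves (made-up sizes)
W1 = rng.normal(size=(64, 32))
W2 = rng.normal(size=(32, 8))
board = rng.normal(size=64)

probs = policy_forward(board, W1, W2)
best_move = int(np.argmax(probs))            # one action, no tree search
```

The point of the sketch: the forward pass never enumerates future positions, it maps the current state straight to an action, which is the analogy being drawn to per-frame control prediction.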

FSD Beta 12.x has complications chess lacks, such as real-time processing of video at 36fps, but for each frame it's still predicting the appropriate control action. So for a relatively "early" network like 12.2.1, this could result in decision wobble if the 36 frames in a second don't all predict the same best control. Similarly, human disengagements and correct-driving examples provide reinforcement learning data for the end-to-end network to improve controls in the next release without requiring a larger neural network that might approach hardware compute limits. Instead of brute-forcing with a chess simulator to find better behaviors, Tesla can deploy 12.x to find real-world examples of where humans drive better.
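To make "decision wobble" concrete: if per-frame predictions disagree, even something as simple as a majority vote over a short window of frames yields a stable choice. This is a toy illustration of the failure mode, not a claim about how 12.x actually smooths its outputs.

```python
from collections import Counter

def stable_control(frame_predictions):
    """Majority vote over a window of per-frame control predictions."""
    return Counter(frame_predictions).most_common(1)[0][0]

# 36 frames in one second; a wobbly network flips between two actions
frames = ["keep_lane"] * 30 + ["nudge_left"] * 6
choice = stable_control(frames)   # "keep_lane" wins the vote
```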
 
How do they tweak a NN release? Add a bunch of miles/incidents of special drives featuring how to do a certain maneuver correctly or how to behave in certain situations and then retrain the whole thing again? Or can they retrain certain subsets? I'm not an expert on NN, so I'm asking for info from those of you who are knowledgeable in the field.
They do lots of pushes to their master branch. Some add more data, some tune parameters, some tune loss functions, some relabel old data, etc. Then they train for a certain time period, validate, and if the result is better than the previous version, they release it. If you wanna see all the ways they improve the software, read the previous release notes.
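The train/validate/gate loop described above can be sketched in a few lines. The function names and the scalar "score" are invented for illustration; real validation would involve a large metric suite.

```python
def maybe_release(train_fn, validate_fn, current_score):
    """Train a candidate; ship it only if it validates better."""
    candidate = train_fn()
    score = validate_fn(candidate)
    if score > current_score:
        return candidate, score      # release the new build
    return None, current_score       # keep the old build

# Hypothetical example: the candidate scores 0.97 vs the current 0.95
model, best = maybe_release(
    train_fn=lambda: {"version": "12.3"},
    validate_fn=lambda m: 0.97,
    current_score=0.95,
)
```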
 
How do they tweak a NN release? Add a bunch of miles/incidents of special drives featuring how to do a certain maneuver correctly or how to behave in certain situations and then retrain the whole thing again? Or can they retrain certain subsets? I'm not an expert on NN, so I'm asking for info from those of you who are knowledgeable in the field.

Most likely through fine-tuning (additional training data applied on top of the entire model) as others have said. But one interesting thing to note is that we do actually have the technology now to localize where certain behavior in a model comes from and surgically modify just those weights.

Folks have been using it to identify where an LLM might be storing incorrect information, and modifying the weights to correct it without changing anything else about the model: [2401.07526] Editing Arbitrary Propositions in LLMs without Subject Labels

So in theory, if Tesla was observing one specific bad behavior from FSD Beta, they could isolate the exact weights causing it and try directly tweaking them.
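A toy version of "isolate the weights behind one behavior and tweak just those": apply an update only where a mask is nonzero, leaving every other weight frozen. This is purely illustrative and is not the method used in the linked paper.

```python
import numpy as np

def edit_weights(W, mask, gradient, lr=0.1):
    """Update only the masked entries; all other weights stay frozen."""
    return W - lr * gradient * mask

W = np.ones((3, 3))
mask = np.zeros((3, 3))
mask[0, 0] = 1.0                      # the "localized" weight
grad = np.full((3, 3), 2.0)

W_new = edit_weights(W, mask, grad)
# Only W_new[0, 0] changed: 1.0 - 0.1 * 2.0 = 0.8; the rest stay 1.0
```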
 
If you mean merge as in training the sub-actions on multiple identically sized networks and turning them into one same-size network, no.
If you mean merge as in training the sub-actions on multiple identically sized networks and then putting them in parallel followed by an additional network (with more training for that section), yes.
I was after the former as a means of accelerating the training process. It would be interesting to talk to a mathematician about the potential for building a data set that could be used that way. That is, instead of building final weights and trying to merge that, build some structure that allows for merging. It may be a mathematical impossibility or simply intractable, but it's one of those shower thoughts that intrigues me.
 
I was after the former as a means of accelerating the training process. It would be interesting to talk to a mathematician about the potential for building a data set that could be used that way. That is, instead of building final weights and trying to merge that, build some structure that allows for merging. It may be a mathematical impossibility or simply intractable, but it's one of those shower thoughts that intrigues me.
If the independent nets had weights that matched all the way up to the input layer, they could be merged, but that's unlikely to occur.
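One way to see why naive merging fails unless the weights already line up: two networks can compute the exact same function with their hidden units permuted, yet averaging their weights produces a different function. A hypothetical two-unit example:

```python
import numpy as np

def forward(x, W1, W2):
    """Tiny one-hidden-layer net: x -> ReLU -> output."""
    return np.maximum(0.0, x @ W1) @ W2

x = np.array([1.0, 2.0])
W1 = np.eye(2)
W2 = np.array([[2.0], [1.0]])

# Net B computes the identical function with its hidden units swapped
P = np.array([[0.0, 1.0], [1.0, 0.0]])
W1b, W2b = W1 @ P, P @ W2
assert np.allclose(forward(x, W1, W2), forward(x, W1b, W2b))

# But the naive weight average is a different function (4.5, not 4.0)
W1m, W2m = (W1 + W1b) / 2, (W2 + W2b) / 2
merged_out = float(forward(x, W1m, W2m))
```

This permutation ambiguity is exactly why matching weights "all the way up to the input layer" is the special case where averaging is safe.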
 
Most likely through fine-tuning (additional training data applied on top of the entire model) as others have said. But one interesting thing to note is that we do actually have the technology now to localize where certain behavior in a model comes from and surgically modify just those weights.

Folks have been using it to identify where an LLM might be storing incorrect information, and modifying the weights to correct it without changing anything else about the model: [2401.07526] Editing Arbitrary Propositions in LLMs without Subject Labels

So in theory, if Tesla was observing one specific bad behavior from FSD Beta, they could isolate the exact weights causing it and try directly tweaking them.

A shame we can’t do that to the human brain. My doofus weighting is too high; would like to crank that back a bit.
 
Getting closer? First indication that the issues may be resolved, so maybe something will happen this weekend. Even so, it would likely just mean employees get a new build first, but who knows.
Thanks for the update. I don't think I remember Tesla ever releasing a version to the public on a weekend though. As hard as Elon is said to work his employees, it seems like autopilot/FSD team gets the weekend off.

They do like to release on Fridays, though, which I find odd: if it turns out to be a terrible version, they would let it play out over the weekend before they pull it. (If it were truly bad, I'm sure someone would get to it even on a weekend, but still.)
 
The more I use V12, the more I have that feeling that it's the real deal: the final boss of FSD

Some of the maneuvers and decisions V12 makes create lasting memories

V12 is getting scary good at fixing its wrong lane selections

The missing piece is some form of lane or area memory that helps it avoid the same mistakes every time

But, it's better to first be good at fixing mistakes
 
The most delightful thing about 12.2.1 is that it works better when there's more traffic
I wonder if that’s because it’s inadvertently trained to follow lead cars, essentially piggybacking off of human drivers? If that’s the case it will only be as good as the other drivers around, or potentially limited by them.
Completely setting aside edge cases, FSD can't be causing accidents
And there’s the rub. FSD *will* cause accidents and people won’t accept it like they do people causing accidents.

It’s just like vaccines - they save millions of lives but people focus on the one adverse reaction and want to throw the baby out with the bath water.
 
The missing piece is some form of lane or area memory that helps it avoid the same mistakes every time
That’s one of the most frustrating things about FSD. I remember my wife asking once why it kept making the same mistake and I explained to her that it had no memory and therefore no ability to learn.

Having a memory and learning would imply the ability to modify the NN on the vehicle which I don’t see happening in the near future.
 
I wonder if that’s because it’s inadvertently trained to follow lead cars, essentially piggybacking off of human drivers? If that’s the case it will only be as good as the other drivers around, or potentially limited by them.

That's part of it yes

It works better when there are other cars around so it can have a better reference for speed

Combine that with its adept lane changing ability and other very good behaviors
 
The more I use V12, the more I have that feeling that it's the real deal: the final boss of FSD....
The more I read about the more you use V12, the more I want to say: you suck, you lucky....... 🤣
 
FSD *will* cause accidents and people won’t accept it like they do people causing accidents
The point of my post was that they need to be rare, not extremely common. And they would be extremely common if FSD were just good enough to eliminate 80% of human-caused accidents (or whatever): at that level it would probably increase the total number of accidents many times over, and of course no one would be OK with that.