
FSD rewrite will go out on Oct 20 to limited beta

You can see a notable difference in how the Tesla stack operates in areas that have those maps versus areas that don't, which is fully consistent with my claim that "when you have HD maps, system performance is improved".
Can you tell us more about the improvements that you see? Can you, for instance, give us two clips of video, one in a region with these maps and one in a region without, and then point out where the maps are assisting the vision?


I don't agree with the "cheating" characterization, but they do use very detailed maps. They don't want to call them HD maps? Fine, but if it walks like a duck and quacks like a duck....
Can you define what a "very detailed map" is? What is the source of these maps? Can we examine these maps? (If not "we" (the general public), can "you" (a gifted Tesla hacker)?)
 
  • Helpful
Reactions: mikes_fsd


I don't have the FSD Beta but on 2020.40.8, I've noticed this behavior too. My car stopped well short of the stop line and then lunged forward twice to reach the stop line.

That’s hilarious about the mirrors automatically folding in
 
I hope Lex has Karpathy on next. The next few months are going to be so interesting... the rate of improvement of the FSD builds already seems mighty quick. I wonder where we will stand in a year; this might really be the big one :)
I hope so, but there aren't many Karpathy interviews. He is far more interesting and knowledgeable to me than Musk. I wonder if his employment contract restricts his ability to do these types of things, or if he simply 'avoids the limelight'.
 
  • Like
Reactions: diplomat33
L5 is actually what Elon mentions when he talks about FSD, and the way Tesla determines path planning from vision is an L5 requirement: if you don't have that, you don't have an L5 system. An L5 vehicle needs to be able to drive on roads that the system has never seen before.
In that scenario, it could be argued that L5 isn’t “anything a human can do”
It seems to me that people are no longer defining L5 as human capability in any situation
 
  • Love
Reactions: GSP
I wish people would be a bit clearer about what they mean, rather than arguing that their semantics were somehow not technically incorrect and wasting energy defending self-conceptions. Listen to this guy at 18:00.

Back to the question. I would be very surprised if, in version 2020.48.8.11, any form of map were an input to the same neural network that takes the cameras and radar and whose outputs are later visualized on the display. I believe some other neural network might use map data, together with the outputs of that NN, to decide which lane to change into.

None of us really know, of course. Probably the only thing it might be using map info for is lane information (if it's available), so that it can weight the NN to look for the expected lane markings ... but ultimately I think the NN and cameras will have the final say.
 
None of us really know, of course. Probably the only thing it might be using map info for is lane information (if it's available), so that it can weight the NN to look for the expected lane markings ... but ultimately I think the NN and cameras will have the final say.
Well, to be honest, we have access to sources that do know; for instance, Karpathy knows.
This year alone, in two public engagements, he said that they do not use HD maps. The problem is that we have "experts" here who say otherwise.
 
None of us really know, of course. Probably the only thing it might be using map info for is lane information (if it's available), so that it can weight the NN to look for the expected lane markings ... but ultimately I think the NN and cameras will have the final say.
Yeah, we don't know. But we do know that a neural network prefers a fixed input size, and that a map in many representations lacks a fixed size. Making it fixed size, like a 2D image, involves some form of quantization with a loss of quality; getting centimeter-level precision requires a pretty large input. Then, for it to be useful, we must localize ourselves in it, which is a whole process in itself, likely better done after the perception step. The map also corresponds to reality in a very biased way, i.e., since it very seldom differs from reality, the neural network might learn to trust it too much. If I were engineering the neural network, I would choose to decouple the map from the sensor input at training and inference.
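
A minimal Python sketch of that size/precision trade-off, assuming a toy map of ego-centred lane polylines; nothing here comes from Tesla's stack, and the 256-cell grid and 0.5 m cell resolution are invented for illustration (centimeter-level precision over the same extent would need a grid thousands of cells per side).

```python
import numpy as np

def rasterize_lanes(lane_polylines, grid_size=256, meters_per_cell=0.5):
    """Rasterize ego-centred lane centerlines (in meters) into a fixed-size grid.

    Coarse cells lose precision; shrinking the cells to centimeters blows up
    the input size, which is the quantization trade-off mentioned above.
    """
    grid = np.zeros((grid_size, grid_size), dtype=np.float32)
    half_extent = grid_size * meters_per_cell / 2.0
    for polyline in lane_polylines:
        for x, y in polyline:  # ego frame: x forward, y left
            col = int((x + half_extent) / meters_per_cell)
            row = int((y + half_extent) / meters_per_cell)
            if 0 <= row < grid_size and 0 <= col < grid_size:
                grid[row, col] = 1.0  # mark "lane here", quantized to the cell
    return grid

# Toy example: a single straight lane 3.5 m to the left of the ego vehicle.
lane = [[(float(x), 3.5) for x in range(-60, 60)]]
raster = rasterize_lanes(lane)
print(raster.shape, raster.sum())  # (256, 256): fixed size regardless of map extent
```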
 
  • Helpful
Reactions: mikes_fsd
So having watched a few of the City Streets beta videos, I think we are about to enter an interesting transition period:

First, it's clear that 'City Streets' is running independently of the older NoA system, so Tesla are apparently running both the old "2D stack" (including the legacy NN) and the new "3D stack" (with the new NN that requires HW3). This is apparent from the hand-off between City Streets and NoA, and from the lack of any change in behavior when the car is in the older NoA mode on freeways.

So, what happens when City Streets is more stable (later in 2021)? I'm guessing that at that point Tesla will re-work NoA into "NoA+" (my name), completely updating it to use the new 3D stack. This will allow the car to use the same sophisticated 3D stack for both City Streets and NoA, working seamlessly on all roads, and will significantly enhance the smarts of NoA+ compared to NoA (better lane awareness, ramp negotiation, etc.).

Of course, NoA+ will require HW3. So I'm guessing that at this point Tesla will branch the stacks/software. Those without the FSD package will stay on the current NoA and HW2.x (which will in effect become part of EAP), while the combination of City Streets and NoA+ will become the FSD package.

Of course, they could package things differently (and probably will), but I'm still betting on the NoA+ re-write, as they will be anxious to retire the old 2D NN stack and move everything onto the new 3D stack (and thus focus development efforts only on the 3D stack).

(Note: the 2D/3D names are not really accurate here, but you know what I mean ... the old non-integrated per-camera NN stack vs. the newer integrated NN stack.)
 
Yeah, we don't know. But we do know that a neural network prefers a fixed input size, and that a map in many representations lacks a fixed size. Making it fixed size, like a 2D image, involves some form of quantization with a loss of quality; getting centimeter-level precision requires a pretty large input. Then, for it to be useful, we must localize ourselves in it, which is a whole process in itself, likely better done after the perception step. The map also corresponds to reality in a very biased way, i.e., since it very seldom differs from reality, the neural network might learn to trust it too much. If I were engineering the neural network, I would choose to decouple the map from the sensor input at training and inference.

No, that's not the way stuff like that works (well, not usually). You are not going to render the map data into an image and then feed it visually to the NN. What you do is train the NN with an additional non-visual input that says "expect X lanes" (where X might be "unknown" when there is no map data). This allows the NN to weight its recognition based on expectations from the map data (much like a human does, actually). So if it expects 2 lanes it can focus, as it were, on finding them. However, such an NN will still base its decisions on what it sees with the cameras, not on what the map data says (if it clearly sees 3 lanes, it will ignore the map data) ... it's more of a hint for when the camera data is ambiguous (bad weather or lighting, or markings obscured by vehicles).
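
As a hypothetical illustration of that "non-visual hint" idea (this is not Tesla's actual architecture; the layer sizes, names, and the 0 = "unknown" convention are all invented for the example), here is a small PyTorch sketch where a map-derived lane-count category is embedded and concatenated with camera features, so the cameras carry the bulk of the signal and the hint acts only as a prior.

```python
import torch
import torch.nn as nn

class LaneDetectorWithMapHint(nn.Module):
    """Hypothetical lane detector that accepts a map-derived lane-count hint.

    The hint is categorical (0 = unknown, 1..max_lanes = expected lanes); it is
    embedded and concatenated with visual features so the network can treat it
    as a soft prior while the camera input keeps the final say.
    """
    def __init__(self, max_lanes=4):
        super().__init__()
        self.backbone = nn.Sequential(  # stand-in for a real camera encoder
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.hint_embed = nn.Embedding(max_lanes + 1, 8)  # index 0 = "unknown"
        self.head = nn.Linear(16 + 8, max_lanes + 1)      # predict lane count

    def forward(self, image, lane_hint):
        visual = self.backbone(image)      # features from the cameras
        hint = self.hint_embed(lane_hint)  # features from the map hint
        return self.head(torch.cat([visual, hint], dim=1))

model = LaneDetectorWithMapHint()
image = torch.randn(1, 3, 128, 128)
with_hint = model(image, torch.tensor([2]))  # map says "expect 2 lanes"
no_hint = model(image, torch.tensor([0]))    # no map data available
```

Because the hint is just one more input feature, training on plenty of examples where it is wrong or set to "unknown" is what would keep the network from over-trusting the map.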

This is also often (though not always) how dynamic NNs work. On a clear sunny day they "see" two lanes and note that in a store somewhere. Then, the next time they are at that location, they look at that stored data and use it as an input to assist the recognition process. (BTW, I'm not saying this is how the car works; in fact, I'm pretty sure the car does not attempt any kind of dynamic learning.)
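
A toy sketch of that "note it in a store somewhere" idea (purely illustrative, and, as the post says, not how the car works): remembered lane counts are keyed by a coarsely quantized position and served back as the hint input the next time the vehicle is there. The tile size is an arbitrary choice for the example.

```python
from collections import defaultdict

class LocationMemory:
    """Toy per-location store: remember what was observed last time we were here.

    Keys are coarsely quantized (lat, lon) tiles; the stored lane count could be
    fed back as the 'hint' input of a detector like the one sketched above.
    """
    def __init__(self, tile_deg=1e-4):    # roughly 11 m tiles at the equator
        self.tile_deg = tile_deg
        self.store = defaultdict(int)      # missing tile -> 0, i.e. "unknown"

    def _key(self, lat, lon):
        return (round(lat / self.tile_deg), round(lon / self.tile_deg))

    def remember(self, lat, lon, lanes_seen):
        self.store[self._key(lat, lon)] = lanes_seen

    def hint(self, lat, lon):
        return self.store[self._key(lat, lon)]

memory = LocationMemory()
memory.remember(37.7749, -122.4194, lanes_seen=2)  # clear sunny day: saw 2 lanes
print(memory.hint(37.7749, -122.4194))             # next visit: prior of 2 lanes
print(memory.hint(40.7128, -74.0060))              # never visited: 0 ("unknown")
```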
 
I wish they did that when entering the garage. They should have a "tight parking mode" that stops the constant beeping, reduces maximum speed to a crawl, shows images from all the cameras, folds the mirrors ...

You can already set the mirrors to fold based on location. Mine fold at the same time the car fires my garage door opener, which ends up being about a car length from the edge of my driveway most times.
 
  • Like
Reactions: mikes_fsd
So... when is 4D not 4D?

There are parts of this that look very much like "4D". When driving by cars that are parked on the side of the street for example.

But other parts seem less so. For example, when the car is not moving, we still see some "pop" and "jitter" of objects around the car. A good example is when pedestrians walk around the car in the teslaraj video.