HW2.5 capabilities

It should have enough resolution to detect a vehicle directly ahead of you at 100 meters pretty easily. I have also noticed that it comes in pretty hot on stopped cars and my interpretation has been that autopilot wants to see radar confirmation if possible.

.42 is a lot better about detecting a car at a greater distance and applying braking sooner. It also blends the regen and physical brakes better than before.

At 40mph, it will detect stopped vehicles that previously went undetected and bring the car to a stop without any white-knuckle action on the steering wheel. I haven't tried above 40mph, but I will in the coming days and report back. I can't really test much above 50mph, but previously, above 45mph it would see the car (it would appear on the IC) and then the car would disappear. The end result was that AP would brake lightly at first, then accelerate, and then I or it would SLAM on the brakes at the last second. Usually it was me losing that game of chicken and slamming on the brakes. Obviously I don't do that with anyone else in the car, and I don't let it even get to the point where I feel it can't stop with ample room to spare.

My feeling is that, if it was going to take action beyond emergency braking, it would do so about 10% of the time at 50mph. Based on my 40mph tests this morning, I think that number might be above 75% with this latest update. Prior to the .40 firmware it would certainly fail about 20% of the time at 40mph. It aced 4/4 this morning, and none of the stops felt anything but human (better than the typical ICE driver, actually, who slams on the brakes at the last possible moment for some reason rather than coasting to a stop).
 
Why do these things always come in crazy file formats? Why don't they just make them in Paint, Word, Excel or Windows Media Player? So annoying.
Haha. I was hoping there was a plugin for Notepad++, but I didn't find one. In the meantime I'm just looking at the same 'code' he provided. I used the 'quote' option on it to expand it in a temp post so that I could cut and paste it into Notepad++.
 
Who knows.

Here's a sample if anybody wants to dissect it (from the above snapshot).

Do you also see a .proto file to go along with it? Content would look something like this:

Code:
syntax = "proto2";

package tutorial;

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phones = 4;
}

message AddressBook {
  repeated Person people = 1;
}
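If you do end up with a matching .proto, reading the serialized data is pretty mechanical. Here's a minimal Python sketch for the example schema above (the file names addressbook.proto / addressbook.bin are just placeholders, not anything pulled from the car):

Code:
# Assumes bindings were generated first with:
#   protoc --python_out=. addressbook.proto
import addressbook_pb2

book = addressbook_pb2.AddressBook()
with open("addressbook.bin", "rb") as f:
    book.ParseFromString(f.read())   # parse the protobuf wire format

for person in book.people:
    print(person.name, person.id, person.email)
    for phone in person.phones:
        print("  ", phone.number, addressbook_pb2.Person.PhoneType.Name(phone.type))

Even without any .proto, protoc --decode_raw will dump raw field numbers and wire types from a serialized blob, which is often enough to start guessing at the schema.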
 
It should have enough resolution to detect a vehicle directly ahead of you at 100 meters pretty easily. I have also noticed that it comes in pretty hot on stopped cars and my interpretation has been that autopilot wants to see radar confirmation if possible.

100m is medium-range at highway speeds. At 75mph, you will cover 100m in just about exactly 3s. That is not enough time to stop comfortably -- think being on a highway at 75mph coming up on a traffic jam or accident. 100m is fine for driver assist. An FSD system needs to see farther than 100m.

Now let's say you're FSDing down an undivided 45mph street. Most people are driving at 50-55mph. There's only a tiny little double-yellow line separating you from oncoming traffic. The oncoming traffic on this particular evening includes a drunk driver, who of course is speeding and swerving. Your FSD car, of course, is doing the limit, which partly offsets the drunk driver's speeding, but that still leaves a closing speed of around 100+mph. 100m gives you a little more than 2s from the moment the NN first recognizes the vehicle as a vehicle, and you need to not only recognize it but also localize it and its trajectory accurately enough to know that it has just swerved across the double yellow line. Think fast, little GPU...

Humans can do this very easily, if they're paying attention. (Whether they can react properly is another question...) A proper L3+ system can do this, but not at the resolution of the current AP nets.

Thankfully, I think we've just gotten confirmation that they do in fact have something better in the works, but I think a GPU/CPU upgrade is going to be required for true L3+.
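For what it's worth, the closing-time arithmetic behind those numbers is simple enough to sanity-check (the speeds and the 100m detection range are just the assumptions from above):

Code:
MPH_TO_MS = 0.44704  # meters per second per mph

def seconds_to_cover(distance_m, closing_speed_mph):
    """Time to cover distance_m at a constant closing speed."""
    return distance_m / (closing_speed_mph * MPH_TO_MS)

print(seconds_to_cover(100, 75))    # highway case: ~3.0 s to reach a stopped queue
print(seconds_to_cover(100, 100))   # undivided road, ~100 mph closing: ~2.2 s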
 
.42 is a lot better about detecting a car at a greater distance and applying braking sooner...

Compared to .34 or .40?
 
roads, road_splines, lanes, lane_splines, traffic signals....

BTW the autopilot track log got extended again. There's a lot more extra info, and now it contains some sort of binary log at the end that grows as you travel. 6-byte chunks are added per some unit of travel/time?

sample of a new trip log:

Looks like they're just starting to get serious about gathering performance metrics from the fleet. This is step 1 of being able to incrementally improve the system and know whether they're making progress or regressing.

You'd think maybe they would have gotten to step 1 a little sooner...
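If that tail really is fixed six-byte records, a throwaway script is usually enough to spot a counter or timestamp ticking along. Something like this -- the <HI (uint16 + uint32) layout is a pure guess on my part:

Code:
import struct

RECORD_SIZE = 6

def dump_records(path, tail_offset):
    """Hex-dump the trailing binary section of the trip log as 6-byte chunks."""
    with open(path, "rb") as f:
        f.seek(tail_offset)                  # offset where the binary section starts
        while True:
            chunk = f.read(RECORD_SIZE)
            if len(chunk) < RECORD_SIZE:
                break
            a, b = struct.unpack("<HI", chunk)   # guessed layout; try others too
            print(f"{chunk.hex()}  {a:6d}  {b:10d}")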
 
100m is medium-range at highway speeds...

With pulse Doppler or chirping continuous wave radar reading the locations twenty times per second, the car knows the location and trajectory of anything with that much relative motion far sooner than any human can sort it out.

The scenario you describe isn't easily handled after the car sees it, though - depending on what you think the oncoming car will do next, there may or may not be a way to avoid the accident.
 
With pulse Doppler or chirping continuous wave radar reading the locations twenty times per second, the car knows the location and trajectory of anything with that much relative motion far sooner than any human can sort it out.

The scenario you describe isn't easily handled after the car sees it, though - depending on what you think the oncoming car will do next, there may or may not be a way to avoid the accident.

There may be some badass radar out there, but I don't think the radars currently in use on Teslas have enough spatial resolution to reliably know that the car is over the line from far away. I could be mistaken. Radar is definitely good at detecting relative velocity.

I also readily admit that perceiving the disaster in advance is only part of the problem, and avoiding it is quite another. As I mentioned, even a human may not be able to respond well in such a situation. But I am at least confident that a human that's paying attention to the road rather than their phone would see it coming.
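As a sanity check on that, the cross-range math is rough but telling. The 4-degree azimuth resolution below is an assumed ballpark for this class of automotive radar, not a published spec for the unit in the car:

Code:
import math

def lateral_bin_m(range_m, azimuth_res_deg):
    """Approximate cross-range bin size at a given range (small-angle approximation)."""
    return range_m * math.radians(azimuth_res_deg)

print(lateral_bin_m(100, 4))   # ~7 m at 100 m -- roughly two lane widths
print(lateral_bin_m(30, 4))    # ~2 m at 30 m -- a bit over half a lane

In other words, at 100m a radar like that can tell you something is closing fast, but not reliably which side of the double yellow it's on.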
 
There may be some badass radar out there, but I don't think the radars currently in use on Teslas have enough spatial resolution to reliably know that the car is over the line from far away. I could be mistaken. Radar is definitely good at detecting relative velocity.

I also readily admit that perceiving the disaster in advance is only part of the problem, and avoiding it is quite another. As I mentioned, even a human may not be able to respond well in such a situation. But I am at least confident that a human that's paying attention to the road rather than their phone would see it coming.

Not sure about over the line. But headed to the left of me, headed to the right of me, or headed straight at me, absolutely.
 
Not sure about over the line. But headed to the left of me, headed to the right of me, or headed straight at me, absolutely.

This is actually a really good point. But it would be pretty late to the game: if it's swerving, it's going to oscillate between headed to the left of me and headed to the right of me, and only at just the right moments headed straight at me, until it gets pretty close. That reinforces for me the idea that radar is good for ACC and last-ditch collision mitigation (e.g., AEB), not as a primary sensor for L3+. That said, with more advanced logic consuming the radar output, you could probably do something reasonable in this specific situation -- at least know that something was kinda wrong, even if you don't know exactly what.
 
100m is medium-range at highway speeds...

That 100m number is a total swag on my part - it was intended to be a lower bound. You can look at the downscales that @lunitiks posted to get a feel for what you can see at what distance. I was just thinking that if braking wasn't showing up, it probably wasn't because vision couldn't see a car stopped at a light ahead of you due to resolution limits. Rather, the radar's problem with seeing stopped vehicles was probably the bigger contributor.

My experience is that TACC is really good at slowing to match the speed of a vehicle 100m ahead of me in moving traffic, but it often fails to brake for a stopped vehicle that is 100m ahead of me on local roads. I've always assumed this was because Doppler radar sucks at seeing stopped vehicles but is really good at moving vehicles, even if they are pretty far away.

Of course, without knowing what kind of confidence values the vision network is producing I'm only guessing.

Edit: here's the post with the downscales: #1254. It seems like you can see cars at distance pretty well, and a network can probably see them even better if it's well done.
 
Ok, I guess I have to take back some of my trash talking. After looking at the fisheye wiper and repeater networks it's pretty clear that these aren't strictly cut-and-paste. They are all based on GoogLeNet, but tweaked. The wiper network is pretty dramatically cut down.

Of note - the wiper network seems to be outputting 5 different classes of wiper operation, so I'm guessing "off" plus 4 speeds of "on".

The repeater networks are straight GLN with outputs that suggest they are being used to generate 4 flavors of bounding box at 104x160 resolution and 6 other kinds of output at full resolution. That feels a lot like what you'd use to look for, say, vehicles in your blind spot. Could the bounding boxes be 4 different kinds of vehicle maybe? The other full res categories could help to understand the context of the view: pavement, walls, trees, curbs, etc?

The fact that they bother to use a different network for the repeaters suggests that the GLN layers aren't being trained as generic recognizers. If they were, then you could use the same network for the main camera as for the repeaters. If you find value in using a different network, it's probably because you want to take advantage of the fact that what you see from a repeater is different from what you see from the main camera.

There's only one repeater network, so I'm assuming that maybe it gets used for both repeaters? I wonder if they reverse the image for the left one relative to the right one to make them match better? Or are they only using one of the repeater cameras right now?

Also, crawling through the differences in these newer networks led me to notice something about the main camera network that I missed the first time - the final two pre-deconvolution layers in the GLN portion (5a and 5b) are both using dilation. Dilation is a new-ish technique as of this year (more or less). The presence of dilation in this network suggests that whoever is generating it is keeping up with the latest developments and actively trying new stuff.
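For anyone who hasn't run into dilation before: it spaces the kernel taps apart so a layer sees a wider receptive field without adding weights or downsampling. A toy 1-D illustration (obviously not Tesla's code, just the idea):

Code:
import numpy as np

def dilated_conv1d(x, kernel, dilation=1):
    """1-D convolution with kernel taps spaced `dilation` samples apart."""
    k = len(kernel)
    span = (k - 1) * dilation + 1              # effective receptive field
    out = np.empty(len(x) - span + 1)
    for i in range(len(out)):
        taps = x[i : i + span : dilation]      # sample the input with gaps
        out[i] = np.dot(taps, kernel)
    return out

x = np.arange(10, dtype=float)
print(dilated_conv1d(x, np.array([1.0, 1.0, 1.0]), dilation=1))  # plain convolution
print(dilated_conv1d(x, np.array([1.0, 1.0, 1.0]), dilation=2))  # same weights, wider view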

Finally, I took a look at the 17.26.76 main camera network from four months ago and found it to be different in some interesting ways from the recent stuff. The core was still built on GLN, but the interpretation layers were quite different and arguably more complicated than in the recent main camera networks.

This is kind of a reach based on not a lot of info, but I wonder if maybe they were unhappy with what it was doing and started over with something simpler sometime in the last 4 months.
 
My backup method of getting a definitive answer about usage of the additional NNs failed as well, so I broke down and went with the indirect approach (for now, anyway).

If you turn up the debug level on the vision task, it only tells you that it loads and initializes the main_narrow NN, not the other two.

The file locations were reshuffled a bit compared to prior releases, so it's possible somebody forgot to filter the new names out of the final release, or it might also mean that the code to use the new NNs really is just around the corner.
 
That's about the time the last AP head guy left and the new one started... There was a reason, after all, that Elon suggested he find employment elsewhere...
Ditto. He “championed a rewrite” of the neural network, but it sure looks like they rolled back his work and went in a different direction. That probably cost them valuable time on the AP2 roadmap.