
HW2.5 capabilities

Just for fun I downsampled some frames grabbed from some of the HW2 cameras under different conditions (source) to 104x160. Kind of gives you a sense of what level of detail we're talking about. These are full size (Try zooming in 500x.)...
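If anyone wants to reproduce that view from their own frame grabs, something like this should do it; it assumes Pillow is installed, and the file names are just placeholders.

from PIL import Image

# Downsample a saved frame grab to the 104x160 working resolution discussed above.
# "hw2_frame_grab.png" is a placeholder name for whatever file you saved.
frame = Image.open("hw2_frame_grab.png")
small = frame.resize((160, 104))   # Pillow takes (width, height)
small.save("hw2_frame_104x160.png")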

I'd say this jibes with what I was saying earlier. This is perfectly adequate resolution for identifying lane lines and vehicles at short to medium range -- roughly the range you see on the IC display. Which is to say, this is adequate to reproduce what the Mobileye system did to achieve AP1 parity (roughly speaking, except for the rain sensing wipers, of course).

The network we're looking at clearly isn't the architecture they intend to use for FSD or even the EAP features that go beyond AP1 parity. At least, we certainly hope they're working on something else...

... they must be, right?

RIGHT?!
 
Whoa, I just got my hands on 2017.42 - lots of goodies there. Two more NNs, one for the wide angle cam and one for the repeaters. Also even more maps stuff (perhaps that stuff will finally start working now?)

I am a bit nervous to actually install it due to the suspension problems that are widely reported.
Awesome stuff!

I may be incorrect, but I thought the suspension issues were mostly (exclusively?) related to AP2.5 cars. Somebody speculated on a thread that there may have been some suspension hardware/software/sensors that were added/removed on AP2.5 "era" cars, and thus 42.*1* was supposed to fix that?
 
Whoa, I just got my hands on 2017.42 - lots of goodies there. Two more NNs, one for the wide angle cam and one for the repeaters. Also even more maps stuff (perhaps that stuff will finally start working now?)

I am a bit nervous to actually install it due to the suspension problems that are widely reported.

The suspension warning is a false error. You can still drive with it.

GO GO GO GO :)
 
Whoa, I just got my hands on 2017.42 - lots of goodies there. Two more NNs, one for the wide angle cam and one for the repeaters. Also even more maps stuff (perhaps that stuff will finally start working now?)

I am a bit nervous to actually install it due to the suspension problems that are widely reported.

Whoa, this is interesting. I'm speculating that the wide angle NN is for rain detection, as otherwise it'd be used for some FSD functionality, if the original Tesla EAP vs. FSD camera distinction is to be believed.

Wonder how long until we see the repeaters being used... I think rendering cars in blind spots on the IC is the next step. I think we have a clue this is the plan in the way the Model 3 toy car is rendered so high up on the Y axis.
 
So I got a chance to look at the network specification for the AP2 neural network in 40.1. As @verygreen previously reported, the input is a single 416x640 image with two color channels - probably red and grey. Internally the network processes 104x160 reduced frames as quantized 8 bit values. The network itself is a tailored version of the original GoogLeNet inception network plus a set of deconvolution layers that present the output. Output is a collection of 16 single color frames, some at full and some at quarter resolution. The network is probably a bit less than 15 million parameters given the file size.

So what does this mean? Images go into it, and for every frame the network produces a set of 16 interpretations, which also come in the form of grayscale images. Some external process takes those processed frames and makes control decisions based on them, probably after including radar and other sensors. This is not an end-to-end network: it doesn't have any scalar outputs that could be used directly as controls.
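For anyone trying to picture the shapes, here's a rough PyTorch sketch of the input/output contract described above. To be clear, the layers are stand-ins (plain convolutions instead of the inception blocks), not the actual network, and for simplicity all 16 outputs are emitted at full resolution here, whereas the real thing mixes full and quarter resolution maps.

import torch
import torch.nn as nn

# Shape-level sketch only: a GoogLeNet-style encoder stand-in followed by
# deconvolution layers that emit 16 single-channel "interpretation" maps.
class VisionNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(2, 64, 7, stride=2, padding=3), nn.ReLU(),     # 416x640 -> 208x320
            nn.Conv2d(64, 192, 3, stride=2, padding=1), nn.ReLU(),   # -> 104x160
            nn.Conv2d(192, 480, 3, stride=2, padding=1), nn.ReLU(),  # -> 52x80
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(480, 128, 4, stride=2, padding=1), nn.ReLU(),  # -> 104x160
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),   # -> 208x320
            nn.ConvTranspose2d(64, 16, 4, stride=2, padding=1),               # -> 416x640
        )

    def forward(self, x):  # x: (N, 2, 416, 640) two-channel input frame
        return self.decoder(self.encoder(x))

out = VisionNetSketch()(torch.randn(1, 2, 416, 640))
print(out.shape)  # torch.Size([1, 16, 416, 640]): 16 grayscale-like outputs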

Also, the kernel library includes a number of items with intriguing names that are unused in the current network. At a minimum this must mean that there are variations of this network that have more features than they are exploiting in the current version, or that they have other networks with enhanced functionality which share the same set of kernels.

Awesome!! Thank you for sharing. Also, how are you able to see the network specification?? It's blowing my mind that this is a barely modified GoogLeNet network.


So if we cautiously assume that the current network handles data at 60 fps (2 cameras at 30 fps each), and we'll get 2 more NNs of similar complexity at another 60 fps each (2x repeater + 2x pillars), that's 180 fps in total, plus a hopefully simpler wide camera NN. That just leaves the backup camera, which has a totally different picture pattern but, on the other hand, would not need to be used all the time.
It looks like there's a chance overall performance would be about where it should be, unless they drastically redo their NNs to make them much heavier, I imagine.
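Spelling out that arithmetic (assuming every camera runs at 30 fps and the new NNs cost roughly as much per frame as the current one):

per_camera_fps = 30
current = 2 * per_camera_fps          # main + narrow cameras: 60 frames/s
new_nns = (2 + 2) * per_camera_fps    # 2x repeater + 2x pillar: 120 frames/s
print(current + new_nns)              # 180 frames/s, before the wide and backup cams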

I imagine it is very likely they will redo their networks in the future, and those may be heavier.

Actually 70% is on the good side. Utilization is basically what fraction of the theoretical maximum FLOPS for some particular chip a particular piece of code is using. For instance, CPU utilization on a typical laptop might bounce around 10% for a typical heavy workload. Only pretty well optimized stuff that is well tuned to the particular hardware you're using will ever get above about 25%. That's just the nature of today's general purpose computer ICs: they can do a lot of different things but they aren't optimized for any of them.
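Put as a tiny formula (the numbers below are placeholders, not measurements of anything):

# "Utilization" as used here: achieved throughput divided by the chip's
# theoretical peak throughput.
def utilization(achieved_flops_per_s: float, peak_flops_per_s: float) -> float:
    return achieved_flops_per_s / peak_flops_per_s

print(utilization(2.8e12, 4.0e12))  # 0.7, i.e. the kind of 70% figure discussed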

Sorry, that's kind of a digression.

I don't actually know that this piece of code is running at 70%, so my guess might be optimistic. I found a benchmark running Inception-V2 on a GP104 IC which showed 70% utilization. The first 2/3 of this 40.1 network is basically identical to Inception-V2, and the last 1/3 is computationally similar. I think the AP2 hardware is likely based on a GP106 IC, which is architecturally identical to the GP104 but about half the size, so I think it should be able to get similar utilization. That seemed to be the best way for me to generate an estimate.

I think the hardware isn't underpowered at all. I haven't seen a network that is known to be able to do FSD, so it's only speculation on my part, but I'd expect this chip to be able to do it if the FSD algorithm is decently mature. Properly tuned, this chip can probably run a much more competent network than this one on 8 cameras simultaneously at 30fps. The fact that the encoding portion of this network is a trivial cut-and-paste from an ImageNet contest winner from 2014 suggests to me that they aren't even trying to optimize this puppy.
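Here's the back-of-envelope version of that claim. Every number is an assumption for illustration only: the per-frame cost of a GoogLeNet-class forward pass at this input size, the GPU's peak FP32 throughput, and the ~70% achievable utilization from above.

flops_per_frame = 3e9        # assumed cost of one forward pass
peak_flops = 4e12            # assumed GPU peak FP32 FLOPS
achievable = 0.7 * peak_flops

needed = flops_per_frame * 8 * 30   # 8 cameras at 30 fps, roughly 0.72 TFLOPS
print(needed / achievable)          # roughly 0.26, i.e. well within budget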

I mean seriously, it's like an intern did the network architecture for this. That's a big part of why I think they must be working on something else. Because this sure doesn't feel like the product of a world class team with tons of resources.

Comma AI seems to be doing the basic stuff for lane keeping on a smart phone, which is probably somewhere between a hundred and a thousand times less capable than this GPU. That gives me hope that the problem will turn out to be tractable with existing hardware.

My thoughts exactly! It looks as if they do not see the network as a bottleneck at this time, and have not lifted a finger to try to improve it yet. Either that, or this is their temporary solution while they are working on a more advanced network that is not / will not be ready to be deployed for quite a while.

This actually just makes me more confident in Tesla's ability to make AP2 more capable.

Isn't it possible, actually plausible, that the NN deployed and evaluated by us is a placeholder legacy version? I mean, it seems obvious to me that this is the case, that the future EAP/FSD stuff hasn't been deployed and thus we can't infer or validate its existence (nor can any competitors!). Yes, this is bad for the implied benefits of a "shadow mode" but I don't think it's necessarily bad for the suspected advantage of Tesla's fleet learning. Our cars could be collecting data and shipping it home where it is used to train the model(s) we've yet to see. That wouldn't require a NN at all, just the sensors/cameras and a data feed. It would also provide excellent data for the simulator we keep hearing about...

As an aside, and I admit I am not certain of the relevance, but I once interviewed one of the guys that built the original Forza Motorsport. He told me an interesting story. Up until that time all racing games had an AI "line" which the non-player cars would use to get around each track. So, when they built Forza they applied this strategy and built an ideal "line" for every track in the game. Because of the Xbox's increased compute power the game had far better physics - each car was its own model and behaved its own way - a major selling point when they pitched the game for funding. The physics were so good (relatively speaking) that the classic, generalized line that was used for the car to follow simply didn't work. Cars would go careening off the track or into walls. Amazing, right!? Thus, they had to rethink the car AI. They subsequently took each car and had to "train" it to drive around each track, each car producing its very own line for each track based on the conditions (wet, dry) and modifications (tires, suspension, engine, etc). Cool stuff. (And probably not all that different from Tesla's simulator.)

There was a post earlier in this thread about Tesla's pathing being algorithmic, with the NN (CNN, DLNN, whatever) building the world "model". That comment got me thinking about the Forza example. It also got me thinking about the recent air suspension woes in 2017.41 - it may answer why they are messing with air suspension in the first place; obviously suspension performance is a major input to the capabilities of the car, which matters in algorithmic pathing. You need to know exactly how the car is going to behave or it might go flying off the track. :)

Also, if an Xbox 360 can model these physics (for all the cars in a race at once, in real time), smartphones today probably can (Comma.ai), and AP2 or AP2.5 absolutely can.

Videogame physics != Real world physics.

Also, an Xbox does not need to process image data, which is perhaps the most computationally expensive part. Video game engines already have the ground truth information.

Just pointing out, 2.5 has 2x the GPU capacity of 2.0, so it's 100% more powerful.
Source?

Whoa, I just got my hands on 2017.42 - lots of goodies there. Two more NNs, one for the wide angle cam and one for the repeaters. Also even more maps stuff (perhaps that stuff will finally start working now?)

I am a bit nervous to actually install it due to the suspension problems that are widely reported.

Whoa!!! Again, how are you finding this information??!! So there are three NNs in 2017.42?? Main front cam, wide angle cam, and repeaters? Also, what about the long range front camera?

So this must mean that 2017.42 uses at least 4 or 5 cameras then? Why would the NNs be there if they don't get used? Once you have it installed, you should put tape over the cameras to verify which ones are being used.

And I am curious about the new map stuff you mention, please share!!

"(perhaps that stull will finally start working now?)" what will start working now?

Again! How are you seeing the NN files in the 2017.42 update?

Thanks
 
I bet we may be seeing those new NNs in the much ballyhooed "stealth mode", collecting data before they are rolled out to everyone. It would explain why they seem intent on rolling this out to everyone. Exciting times. :)

Yes, this makes sense. I imagine they will use those repeaters for lane changing, like without driver input, and they're also necessary for exiting/merging from on-ramp to off-ramp.

The NNs are visible in the firmware. Yes, there are 3 NNs in 2017.42.

But how do you examine the firmware?

What NN does the front long/narrow range camera use? The same one as the main front camera?

I'd like ramp-to-ramp AP without nags to ease going to the in-laws' for Thanksgiving...

You need maps and side cameras for this... which it looks like they are working on.
 
OK, I wonder if somebody just got ahead of themselves here and forgot to exclude the extra NNs or what, but so far it does not look like anything majorly different is being done with the cameras. Sadly, they removed the explicit logging of what camera does what.

Of course there's no calibration data for all those other cameras either (there's a very visible warning about that).

I still need to actually drive the car to see if anything changes; the map messages have certainly changed a bit now too.