Welcome to Tesla Motors Club

Inside the NVIDIA PX2 board on my HW2 AP2.0 Model S (with Pics!)

The other SoC has nowhere near 3 TFLOPS: an ARM Cortex-A57 core does about 16 single-precision GFLOPS at 2 GHz, so four of them get you to 0.064 TFLOPS. Denver 2 should be similar, so Parker's CPU complex contributes only ~0.1 TFLOPS.
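That back-of-the-envelope math can be written out (a sketch only; the 8 FLOPs/cycle figure is an assumed FP32 peak for a Cortex-A57's NEON FMA pipes, and Denver 2 is simply taken as comparable):

```python
# Rough peak-FLOPS estimate for the Parker CPU complex, per the post above.
# Assumptions: ~8 FP32 FLOPs/cycle/core (NEON FMA), Denver 2 roughly like an A57.
clock_ghz = 2.0
flops_per_cycle = 8
a57_gflops = flops_per_cycle * clock_ghz        # ~16 GFLOPS per A57 core
quad_a57_gflops = 4 * a57_gflops                # four A57 cores -> 64 GFLOPS
denver_gflops = 2 * a57_gflops                  # two Denver 2 cores, assumed similar
cpu_tflops = (quad_a57_gflops + denver_gflops) / 1000
print(f"Parker CPU complex: ~{cpu_tflops:.3f} TFLOPS")  # ~0.096, i.e. ~0.1 TFLOPS
```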

Edit: I suspect @stopcrazypp is correct, still doesn't explain the "Titan supercomputer" comment though.
You are forgetting the integrated GPU in Parker, with 256 CUDA cores. That's why Parker is quoted at 1.5 TFLOPS (FP16).
 
One possibility is that the Pascal family offers full-rate FP16 at double the speed of FP32. NN calculations, especially image calculations, might not need FP32. So a GP106 @ 4.5 TFLOPS, doubled, is 9. Add another 1.5 TFLOPS x 2 = 3 TFLOPS for the two Parker SoCs, and we're at 12 TFLOPS.
Only GP100 (Tesla P100) and the Tegra X1 and X2 SoCs support FP16 at double speed.

GP106 offers INT8 at quadruple speed. So 4.5 TFLOPS FP32 = 18 TOPS INT8.

You don't want to train the neural networks in the car; that you do on a supercomputer somewhere. What you need in the car is fast inference for the network, and for that the INT8 capability of the GP106 is good.
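The rate relationships quoted above reduce to simple multipliers (a sketch using the in-thread figures; the 0.75 TFLOPS FP32 value for the Parker iGPU is inferred from the 1.5 TFLOPS FP16 quote, not a spec I've verified):

```python
# Precision-rate multipliers as quoted in the posts above.
gp106_fp32 = 4.5                    # TFLOPS FP32, quoted figure for GP106
gp106_int8 = gp106_fp32 * 4         # INT8 runs at 4x the FP32 rate -> 18 TOPS
parker_fp32 = 0.75                  # TFLOPS; assumed half the FP16 figure
parker_fp16 = parker_fp32 * 2       # FP16 at 2x FP32 -> 1.5 TFLOPS, matching the quote

print(f"GP106 INT8:  {gp106_int8:.0f} TOPS")
print(f"Parker FP16: {parker_fp16:.1f} TFLOPS")
```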
 
I believe I have read that the reference/dev PX2 uses the more powerful GP102 discrete GPU, but I am not sure.
Has Tesla used a smaller, less powerful discrete GPU (GP106 vs. GP102)?

Huge thanks to @kdday. I am very surprised to see a GP106 on there; I was expecting a GP102, which is ~3x more powerful and ~4x more expensive.

I thought so too initially, but @haakonks has explained about the performance above. Also the GP102 requires a lot more power and cooling.
 
Reference Drive PX2 boards from Nvidia have 2x GP106 GPUs (similar to the Quadro P2000, but with a 128-bit bus and only 4 GB of memory). If you need more memory and compute cores, you can swap the GPUs out for 2x underclocked GP104 boards (similar to the GPU found in the Tesla P4). Both have a TDP of 75 W.

A single GP102 has a 250 W TDP.
 

I disagree. Nvidia's DLTOPS figure refers to training, not inference. The entire Drive PX2 unveiling conference focused on it being a one-stop shop for training and running models live. They even compared how long it took to train on 10 million images on the Drive PX2 versus a regular GPU/CPU.

Their AlexNet test, where the Drive PX2 came out 6 times faster than the Titan X (2015), was based on training, not inference.

The Titan X (2016), for example, has 44 TOPS of INT8 inference, and no deep learning library supports INT8 at the moment.

The 24 DLTOPS of the Drive PX2 is a training figure.

EDIT: But since we never got an FP16 number, we will never really know for sure. Nvidia has a problem of creating new labels and comparisons, like "150 MacBook Pros = Drive PX 2".
 
Last edited:

They may have said that but they didn't mean it. Here's a quote from the Nvidia website:
With a unified architecture, deep neural networks can be trained on a system in the data center, and then deployed in the car.

People don't normally train at 8-bit ints. Google's first TPU, for instance, was only for inference; the new one supports FP16, so they are finally able to do training on it.

It's possible they could augment the network or train a smaller network based on local patterns, but the major networks are going to have been trained outside the car.
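As a toy illustration of that train-offline, infer-in-car split: FP32 weights produced by offline training can be quantized to INT8 for deployment (a minimal symmetric-quantization sketch in plain Python; this is not Nvidia's actual pipeline):

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization of FP32 weights."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map INT8 values back to approximate FP32 weights."""
    return [x * scale for x in q]

# Weights as they might come out of offline FP32 training (made-up values).
weights = [0.8, -1.27, 0.05, 0.33, -0.6]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q)                              # [80, -127, 5, 33, -60]
print(f"max error: {max_err:.4f}")    # at most scale/2 for in-range values
```

The car then runs the cheap INT8 arithmetic at inference time, while the accuracy-sensitive gradient math stays on the data-center hardware.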
 
I suspect FSD option pricing is more of a contract to make it happen rather than an actual specific hardware purchase. This is what Musk said from the most recent earnings call:

"The sensor hardware and compute power required for at least level 4 to level 5 autonomy has been in every Tesla produced since October of last year, approximately. So it's a matter of upgrading the software, and we can reach level 5. And if it does seem that we need to upgrade the compute power, it's designed to be easy to upgrade, basically access it through the glove box and plug in a more powerful computer, so we don't think it will be, but if it is, that's pretty easy to do. So the important thing to appreciate is that the sensor hardware and wiring harness is necessary for full autonomy, which is essentially having the eight cameras, the radar, and ultrasonics, that's in place, so with each passing release, the car's autonomy level will improve."

Obviously, wiring harness changes to an existing car drives up retrofit costs dramatically. Swapping out the autopilot hardware is not expensive and likely already folded into the option price. The existing boards can be put into the stockpile for warranty or repair replacement of EAP option hardware. And if it comes to pass, there will be newer hardware at higher computing power for less cost.

Just think about the Service center capacity needed to swap out all the boards. By the end of this year there will be more than 100,000 AP2 cars in the wild, at the same time as Tesla is fixing the teething pains of the Model 3.
 
Btw, Tesla's FRT on this procedure is one hour (remove and replace the AP2 ECU, incl. reset). So 100,000+ man-hours? :D

But keep in mind it would only apply to cars that purchased FSD; the retrofits wouldn't need to happen all at the same time, and they could likely be done in the parking lot with the customer present. It seems Tesla made the unit easily accessible for just this reason.

I see it being no worse than a minor recall really.

Is everyone in agreement that the hardware as it sits is likely not capable of full Level 5 autonomy?
 

Service techs are on salary anyway... and the take rate for FSD is probably pretty low up until now. And again, the FSD option likely already builds this into the price. But they should have a handle on this *before* massive Model 3 volumes.
 
Personally I'm not worried about the cost to Tesla, as Tesla has already baked this in. I'm worried about the availability of swap appointments.
 
@techmaven thanks for your post. I think Musk is pretty clear that Tesla will do whatever is required to ensure that anyone purchasing the FSDC option now will have their car upgraded in the future with whatever software AND hardware is required to enable that feature when it starts to roll out (and of course it will be incremental, not happen all at once).
"The sensor hardware and compute power required for at least level 4 to level 5 autonomy has been in every Tesla produced since October of last year, approximately. So it's a matter of upgrading the software, and we can reach level 5. And if it does seem that we need to upgrade the compute power, it's designed to be easy to upgrade, basically access it through the glove box and plug in a more powerful computer, so we don't think it will be, but if it is, that's pretty easy to do. So the important thing to appreciate is that the sensor hardware and wiring harness is necessary for full autonomy, which is essentially having the eight cameras, the radar, and ultrasonics, that's in place, so with each passing release, the car's autonomy level will improve."
@Matias as others have pointed out, it is very likely that only a small fraction of all Teslas sold include the FSDC option. Most buyers probably don't select that option (which should not be surprising, since the functionality does not exist yet and the initial implementation is likely over a year away, some say "years away", and even some Tesla fans say "never going to happen"). So when the time comes for FSDC to start to roll out, only a fraction of all Teslas sold since October 2016 will need upgrades.
 
  • Like
Reactions: ABVA and Haxster