Welcome to Tesla Motors Club
Discuss Tesla's Model S, Model 3, Model X, Model Y, Cybertruck, Roadster and More.

Neural Networks

Is it possible that they will stick with the Parker SoC?
It seems like the easiest (though not best) path forward. Swap out the GP106 for their NN chip (assuming it also speaks PCIe) and all associated hardware (RAM and such attached to the GP106, for whatever the NN chip needs). Minimal changes to board layout, minimal re-routing, re-tooling, etc. for production and assembly.

"Best" is debatable depending on what your priorities are, but a custom ARM SoC (perhaps integrating the NN chip itself) with just enough of everything they need and no more is my idea of "best".
 
If they could still use the Parker SoC, then delays are less likely and a 2.5 revision makes more sense. However, I wonder how Nvidia feels about it.
 
Nvidia will sell them every Parker they can, as long as Tesla is buying. Outside of the Nintendo Switch they don't have many major design wins for their "mobile" SoCs (Tegra & friends) - mostly because they run hotter and are more power hungry than the competition (which only really matters for portable devices, not EVs). Plus, this lets Nvidia still brag about powering Tesla vehicles even if they aren't doing the heavy lifting.

Hopefully those decisions have been made and set in concrete given that they claim to have cars running AP3 test hardware on the road.
I'm sure that whatever the design is, it's quite fixed at this point. All we can do is speculate until they see fit to tell us more, or deliver it in vehicles for someone to take apart.

Doesn't mean I won't enjoy speculating ;)
 

In the automotive market NVidia has a lot of potential as lots of car makers are working with them for their next generation models. NVidia is arguably the hottest company in AI right now.

The Nintendo Switch basically uses an NVidia Jetson TX1, and that's their most well-known win.

The Jetson TX1 is also used in Skydio's self-flying drone that rich people can buy at an Apple store.

I would expect to see a lot more consumer products with Jetson SoCs in them, as it really wasn't that long ago that they started to come out. Lots of the early projects are just coming onto the market now. Also, keep in mind that odds are a kiosk robot at a Lowe's (or elsewhere) is going to have NVidia inside. I imagine they have a lot of design wins, but in stuff that you never even think about.

I looked into using them where I work, but they were too expensive. I tried to work with Movidius (what's used in DJI drones), but they ignored me. I was hoping things would change after the Intel acquisition, but no dice so far.

Keep in mind I'm only talking about SoCs capable of neural network acceleration, not older chips like the one used in MCU1.

NVidia doesn't appear to even be pursuing low-end, low-power stuff. In fact, with the Xavier they're targeting even higher-end stuff like kiosk robots.

As for Tesla, I fully expect them to continue using the Parker SoC, and to connect to their own silicon through a PCI-Express interface. I don't see them wanting to work with ARM to license anything.
 
Extra power? The vehicle is sitting on top of the equivalent of three days' worth of summertime electricity usage for my entire house. The batteries hold so much energy that people use them to store solar so they can run every appliance in their house. Plugging in to a home charger adds range at about the same speed the car drives through a town. All of which is to say these batteries offer an enormous amount of electrical energy, and even if post-processing of data consumed 700W (we know it doesn't, but let's say it did), that's still only 2-3 miles of range per hour of processing.
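As a back-of-envelope sketch of that range-cost claim: the 250 Wh/mile driving efficiency below is my own assumed figure, not something from the post, so treat this as illustrative arithmetic only.

```python
# Rough range cost of a hypothetical 700 W compute load.
# The 250 Wh/mile driving consumption is an assumed figure, not from the post.
COMPUTE_W = 700        # hypothetical post-processing draw, watts
WH_PER_MILE = 250      # assumed driving consumption, Wh per mile

energy_per_hour_wh = COMPUTE_W * 1.0              # Wh burned per hour of processing
range_cost_miles = energy_per_hour_wh / WH_PER_MILE
print(f"{range_cost_miles:.1f} miles of range per hour of processing")  # 2.8 miles
```

With those assumptions the cost lands at about 2.8 miles per hour of processing, consistent with the 2-3 mile figure above.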

He said:
...when there is computational power extra

He wasn't talking about electrical power.
 
Nvidia didn't so much give up on the "low end" as their hardware was never really competitive there. Yes, the Tegras would nearly always deliver better gaming performance, but when they were trying to court the tablet and related markets they got practically zero design wins, because they couldn't reduce power consumption enough for competitive run times, and they ran hotter than the competition. Most products in that segment value run time and thermals over gaming performance.

Give them even slightly less constrained power and thermal envelopes, though, and they'll stomp on almost everything else in that segment - this is partly why they started getting wins in things like infotainment systems; the other reason being that they priced them aggressively to move units. So for certain use cases, they're a no-brainer choice.

They pretty much had to pivot away from their original target market to find success, but now that they have, they're doing well with the Tegra products. Rather than trying to compete in the low end they're going bigger, and it's working for them, as there are plenty of use cases for such hardware.
 
To confirm: are you saying the single net processing all the cameras is NOT the one you now believe is running the car, and rather it's individual NNs processing each camera independently, like v8?

Just trying to understand.

In a word, yes.

You could read his comment a few different ways, since he's talking about how much computation the car is employing, which isn't the same thing as how much computation is needed by the camera network (one of the things I talked about in my post). The two are related, but there's a lot more going on in the car than just the camera network, and how much computation the camera network needs depends on the cost of one frame multiplied by how many frames you run through it. I can estimate the former, but I don't know the latter. The simplest explanation that makes sense to me is that 1) the car is currently driving on the non-AKNET_V9 networks, 2) those networks are an evolutionary extension of the V8 networks (more capable, more power hungry, but architecturally similar), 3) Elon's comment is about how much computation the car is using on those networks, and 4) AKNET_V9 is an early attempt at a next-generation system, perhaps something intended for HW3 / FSD, which is getting into the V9 firmware for some kind of testing purpose (or possibly just by accident).
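The per-frame-times-frame-rate relationship above can be sketched as a toy calculation. The per-frame cost and frame rate here are illustrative placeholders (exactly the unknowns discussed above), not measurements; only the camera count comes from the post.

```python
# Toy model of camera-network compute: ops per frame x frame rate x cameras.
# ops_per_frame and fps are illustrative placeholders, not measured values.
ops_per_frame = 5e9    # assumed inference cost of one frame, in operations
fps = 30               # assumed frames per second per camera
cameras = 8            # external cameras, from the post

total_tops = ops_per_frame * fps * cameras / 1e12
print(f"{total_tops:.1f} Tops to run every camera at this rate")  # 1.2 Tops
```

The point is just that total load scales linearly in both unknowns, so knowing the per-frame cost alone doesn't pin down what the car is actually spending.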

An interesting question to consider: is there some way to tell which network is being used to drive? In V8 the number of cameras used by the networks differed, so you could do things like tape over a camera, see if it had an effect on the car, and try to figure out from the number of cameras in use which networks were running. In V9 both sets of networks use all the cameras, so this trick doesn't work. There might be some other way to tell if we knew more about the non-AKNET_V9 networks, but right now we don't know much beyond the fact that they are separate networks for each camera and that there's a complete set of them for all 8 external cameras.
 
Do we know which NN family (V8/V9) is producing the data that @verygreen is using for those visualisations?

Just wondering if one set is actually feeding the other, rather than both sets running independently.
 
Speaking of APv3... which CPU architecture do you think it will have?

Oh this is a topic I really love:

What do we know?:
- Tesla vision is CNN based (in particular, inception style CNNs)
- Tesla vision is, at least so far, using 8bit quantized networks
- Tesla vision's networks are now hardware limited on a system that uses a Tegra+GP106 (about 10 Tops at 8-bit)
- HW3 is about a 10x speedup over the above - so it would need to be 100 Tops to be 10x the GP106
- Tesla has HW3 silicon testing in the field now
- Tesla spent between 2 and 3 years developing the silicon
- Tesla has a relatively modestly sized IC development team (not Intel, not AMD, not even Apple)

That's a lot to go on, actually. But of course there are a lot of different architecture options even so. I'm probably biased, but I really like the 8-bit systolic array (which is what the TPU uses) as an option for Tesla here. For one thing, it can be made very high performance with simple scaling. The core compute unit is a very large matrix multiplier (in the TPU's case, 256x256, or 64k multiply/add units). On top of that you need onboard weight memory, onboard memory for intermediate results, a set of simple nonlinear operator engines, accumulator memory, and a simple pipeline manager/control state machine. For another thing, the architecture is relatively simple in ways that make it much more doable for a small team, and quickly. Another point is that systolic arrays are very well suited to CNNs (as opposed to, say, RNNs). And there's the point that Google already showed how to do it and detailed it in a paper, so it's a known quantity - there's not much risk compared to other comparably powerful architectures. Google showed that you can get better than 90% utilization on a 256x256 systolic array. To get 100 Tops of usable performance with a 256x256 systolic array you need about an 800MHz clock rate, which is easily achievable with current IC processes even if you have pretty simple clock management and shallow pipelines.
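The clock-rate arithmetic above checks out with a quick sketch: counting each multiply/add cell as two operations, a 256x256 array at 800MHz with the ~90% utilization figure cited above lands in the neighborhood of 100 Tops usable.

```python
# Throughput of a 256x256 8-bit systolic array at the clock rate discussed above.
N = 256                # array dimension (256x256 multiply/add cells)
CLOCK_HZ = 800e6       # 800 MHz clock
OPS_PER_MAC = 2        # one multiply plus one add per cell per cycle
UTILIZATION = 0.90     # achievable utilization figure cited in the post

peak_tops = N * N * OPS_PER_MAC * CLOCK_HZ / 1e12
usable_tops = peak_tops * UTILIZATION
print(f"peak ~{peak_tops:.0f} Tops, usable ~{usable_tops:.0f} Tops")
```

That gives roughly 105 Tops peak and about 94 Tops usable, so "about 800MHz" for a ~100 Tops target is the right ballpark.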

If Tesla built their own systolic array IC then HW3 APE would probably end up looking pretty much just like HW2 APE, but with Tesla's NN processor (the systolic array chip in this case) substituting for the GP106 on the current APE. Of course there would be a bunch of small hardware and software changes, and Tesla would have to develop a set of software tools to enable efficient use of the new chip.
 