Welcome to Tesla Motors Club
Discuss Tesla's Model S, Model 3, Model X, Model Y, Cybertruck, Roadster and More.
Makes sense, as the u-blox M8 series has been available since 2014. Early u-blox M8 receivers were hackable for raw data, but that is no longer possible on the latest shipped devices, from firmware 3.01 (early 2016) onward.
Tesla could in principle get bespoke raw-data-enabled receivers from u-blox; the data is all available inside the chip, just not accessible unless it's an M8T, but that one does not have dead reckoning. They might be better off waiting for cheaper and more robust dual-frequency receivers, though.
 
Awesome info on this thread, but I'm gonna toss a wrench in like I'm known to do. Even I have made the mistake of thinking Tesla could use Xavier in the future, but then I remembered they have a deal with Samsung to make a chip. At first I assumed this was for the MCU, if that's what it's called. But then I thought about how Apple made its own ARM processors for the iPhone and iPad.

They could be licensing something from Nvidia, but I think the goal would be to design a chip that has only what Tesla needs and is not as full-featured, a.k.a. expensive, as the PX2/Xavier. No lidar support, for example.

The goal would be twofold: one processor for everything, built specifically for Tesla, that can be iterated over time to support upgrades of existing cars and future cars, with price and power consumption minimized. Support for an augmented-reality HUD? Tesla won't want several disparate systems and CPUs that drive up complexity and cost.

Now, what to do with 100,000 Drive PX2 boards? Can they rack them up and build a machine-learning supercomputer in a data center?

Sorry to derail this thread again, but I think it's the right audience, minus one guy who won't be named.

The chip with Samsung is supposed to still be two-plus years out (end of 2019)... Also, lidar "support" adds no additional cost, because the board is not driving a lidar unit at any time; a lidar would provide raw data on its own.

What concerns me is that Parker alone should only be around 1.5 TFLOPS half precision (750 GFLOPS single precision). This is not nearly enough to run what they want without discrete GPUs.

If I play devil's advocate for a second and just think out loud: a single Parker alone would be plenty of power to emulate, and even improve on, AP1, and at the same time would be able to generate mapping data which would later be used for FSD. The up-charge on FSD is enough to afford a later replacement of the board for those owners, while people who opted only for EAP don't need to be upgraded. It'd be a little underhanded, but they could have been planning for the upgrade the entire time and simply wanted to be first to market with a car with all the required sensors on board.

Or they secretly have a custom board with a Parker chip and a currently disabled GP102 chip from the Pascal-based Titan X, coming in at 10.8 TFLOPS, or 40x EyeQ3. This would take up less real estate than the normal Drive PX2 system, have more computing power, and be almost exactly in line with Elon's comments during the AP2 announcement.

Do we know anything more specific about this?
What kind of 'operations per second' is he talking about? TFLOPS? DLTOPS?
Typically, when speaking of GPU computation speeds, we refer to single precision (FP32) unless otherwise stated.
DLTOPS are 8-bit integer operations.
EyeQ3 was measured in MAC/s (multiply-accumulate operations per second), which you multiply by two to get FLOPS. Example: the EyeQ3 was capable of 102 GMAC/s, or 0.204 TFLOPS. Nvidia's older Tegra X1 was computationally faster but required five times the power. The Nvidia Volta architecture, which Xavier is based on, corrects that former disadvantage.

Although, now that I've actually calculated the computational power of the EyeQ3 myself instead of trusting the TFLOPS numbers on random news websites, 40x EyeQ3 is only 8.16 TFLOPS, or basically a full Drive PX2 system.
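Since this thread keeps converting between MAC/s and FLOPS, here is the arithmetic spelled out as a tiny sketch (the 102 GMAC/s figure is the one quoted above; nothing else is assumed):

```python
# One multiply-accumulate (MAC) counts as two floating-point operations,
# so GMAC/s converts to TFLOPS by doubling and shifting the SI prefix.
def gmacs_to_tflops(gmacs: float) -> float:
    return gmacs * 2 / 1000.0

eyeq3 = gmacs_to_tflops(102)     # 0.204 TFLOPS
forty_x = 40 * eyeq3             # the "40x EyeQ3" claim
print(eyeq3, round(forty_x, 2))  # 0.204 8.16
```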
 
If I play Devil's advocate for a second and just think out loud. A single Parker alone would be plenty of power to emulate and even be better than AP1 …

I get your point and hadn't thought about only a partial upgrade of the hardware. That having been stated, all cars will enable FSD if FSD ever exists, whether they do it the day it becomes active, or the next owner does it, or the one after that. But your point is still very valid: you don't have to replace 100% of the boards on day one; you could do it over years and years, which would drive down the cost as those chips get cheaper and cheaper over time.

Great info from everyone, really this is a hyper technical thread and I am learning a lot.
 
@verygreen do you have lspci output? I presume the GPU would be connected via PCIe, and there's a bit of bridge enumeration in the dmesg, but it's been so long since I've worked with Linux that I don't remember if it'll print out enumerated devices as part of bootup.

(Hopefully that'll settle the autochauffeur debate. I'm guessing it's the board with at least one discrete GPU, as there's no way they can add up to 8 TFLOPS using an integrated GPU and those crappy CPU cores.)
 
@chillaban in the dmesg there are several references to "kernel-a" and "kernel-b", like:
[ 0.000000] Kernel command line: root=/dev/mmcblk0p2 rootwait ip=off ro console=ttyS1,115200n8 no_console_suspend=1 tegra_keep_boot_clocks bootpart=kernel-b envpart=2 sdhci_tegra.en_boot_part_access=1 mtdparts=mtd0:65536K@0K(whole_device),524288@0(stuff),524288@20709376(env),262144@21233664(recovery-linux-dtb),262144@21495808(kernel-a-dtb),262144@21757952(kernel-b-dtb),11534336@22020096(recovery-linux),15728640@33554432(kernel-a),15728640@49283072(kernel-b) tegraid=18.1.2.0.0 vpr=0xba00000@0xec500000 lp0_vec=0x10000@0xf7ff0000 earlycon=uart8250,mmio32,0x03110000 bl_prof_dataptr=65536@0x0f7fe0000
and
[ 0.892559] 0x000001480000-0x0000014c0000 : "kernel-a-dtb"
and
[ 0.893697] 0x0000014c0000-0x000001500000 : "kernel-b-dtb"
and
[ 0.896306] 0x000002000000-0x000002f00000 : "kernel-a"
[ 0.897113] 0x000002f00000-0x000003e00000 : "kernel-b"
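The `mtdparts=` list in that command line is just comma-separated `SIZE@OFFSET(name)` entries, so the layout can be unpacked mechanically. A quick sketch (the partition spec is copied from the log above; the parser itself is only illustrative):

```python
import re

# Partition spec copied from the "mtdparts=mtd0:" portion of the dmesg above.
mtdparts = ("65536K@0K(whole_device),524288@0(stuff),524288@20709376(env),"
            "262144@21233664(recovery-linux-dtb),262144@21495808(kernel-a-dtb),"
            "262144@21757952(kernel-b-dtb),11534336@22020096(recovery-linux),"
            "15728640@33554432(kernel-a),15728640@49283072(kernel-b)")

def parse_mtdparts(spec: str) -> dict:
    """Return {name: (size_bytes, offset_bytes)} from SIZE@OFFSET(name) entries."""
    def to_bytes(tok: str) -> int:
        return int(tok[:-1]) * 1024 if tok.endswith("K") else int(tok)
    return {name: (to_bytes(size), to_bytes(off))
            for size, off, name in re.findall(r"(\d+K?)@(\d+K?)\(([^)]+)\)", spec)}

parts = parse_mtdparts(mtdparts)
# Two equal 15 MiB kernel slots, laid out back to back:
print(parts["kernel-a"])  # (15728640, 33554432)
print(parts["kernel-b"])  # (15728640, 49283072)
```

So this flash layout holds exactly one pair of kernel slots; by itself that says nothing about how many Tegra/Parker chips are on the board, only that this particular SoC keeps two copies of its kernel.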

Does this tell us anything about the single/double Tegra/Parker question, or? (I'm no Linux-guy, so I'm asking.)
 
Awesome info on this thread but I'm gonna toss a wrench in like I am known to do. … Sorry to derail this thread again, but I think it's the right audience, minus one guy who won't be named.

Disagree on many counts. Tesla does not need to waste money in a bleeding-edge market where specialist chip companies are spending billions on R&D.

You are derailing the thread. Why not start a new topic on this?
 
@chillaban in the dmesg there are several references to "kernel-a" and "kernel-b", like:
That's different.
It's their failsafe system, similar to Chrome OS and newer Androids:
have two bootable images, one "online" and one "offline". You update the offline image, then reboot into it; if things go wrong, you return to the previous image effortlessly.
A bit in flash determines which image is currently "online" to boot into. The IC and CID operate in the same way.
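That update flow can be sketched as a toy state machine (class and method names are made up for illustration; this is not Tesla's actual code):

```python
# Toy model of A/B ("online"/"offline") image slots with fallback on a bad boot.
class ABSlots:
    def __init__(self) -> None:
        self.slots = {"kernel-a": "v1", "kernel-b": "v1"}
        self.online = "kernel-a"  # stands in for the selector bit in flash

    def offline(self) -> str:
        return "kernel-b" if self.online == "kernel-a" else "kernel-a"

    def update(self, image: str) -> None:
        # New images only ever land in the offline slot, so the running
        # system stays bootable no matter what happens mid-update.
        self.slots[self.offline()] = image

    def reboot(self, healthy: bool) -> None:
        # Flip the selector bit only if the new image boots successfully;
        # otherwise fall back to the previous image "effortlessly".
        if healthy:
            self.online = self.offline()

ab = ABSlots()
ab.update("v2")
ab.reboot(healthy=True)   # now running v2 from kernel-b
ab.update("v3-broken")
ab.reboot(healthy=False)  # bad image: still running v2 from kernel-b
print(ab.online, ab.slots[ab.online])  # kernel-b v2
```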
 
Yes, I agree that customers have an L2 system now and that EAP is also L2, but what is the potential of the system, in your opinion?
L2? L3? L4? L5?

Let's assume for now that the processing power of the Autopilot ECU is not a limitation.


Elon Musk has said 'all cars exiting the factory have hardware necessary for Level 5 autonomy' and 'with Level 5 literally meaning hardware capable of full self-driving for driverless capability' (but I have a feeling you have an opinion about Elon's credibility... ;-) ).

On their own Autopilot page they actually do not go that far:
" the hardware needed for full self-driving capability at a safety level substantially greater than that of a human driver".
"enabling full self-driving in almost all circumstances, at what we believe will be a probability of safety at least twice as good as the average human driver".

To me it seems like they are saying the potential is L4 on the web page(?)
Well, as you can already tell, they hyped it up by calling it "Level 5", but their statement is far from it, as it says "in almost all circumstances".

As far as the full system's capabilities, we already know there are multiple situations where the side repeater cameras will be blocked when pulling out of driveways/parking lots adjacent to a highway. Not only that, but road infrastructure is full of concrete side barriers, bushes, gardening, whatever, at the height of the side repeaters, which would blind them, making these situations far more frequent than you think. This is why Nissan said all cameras must be on top unless the views are jeopardized, and why Mobileye moved its camera design up to the mirrors.

Then you look at their use of ultrasonics for parking and how they fail to see low/thin objects.

Finally, the problem with a vision-only system (and I'm the first and only person who argued that this wasn't a vision/radar system but a vision-only one, with the radar's task eventually reduced to something like watching the car ahead and the car ahead of it) is that the vision system doesn't see what it doesn't understand. If a UFO lands in the middle of the street, using only a vision system, the car will run into it. There are possible ways to circumvent this: use semantic free space that detects the road texture, and treat the edge of the free space as an obstacle.

But that's only a band-aid on a sinkhole-sized problem.
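For what it's worth, the "edge of free space becomes an obstacle" idea from the previous paragraph can be sketched on a toy grid (the mask, helper name, and 4-neighbour rule are all illustrative assumptions, not any shipping system):

```python
# Given a binary drivable-free-space mask (1 = road texture recognized,
# 0 = unrecognized), flag every free cell that touches a non-free cell.
# An unclassified object ("UFO") then still carves a hole whose rim the
# planner must treat as an obstacle, even though the detector has no idea
# what the object is.
def free_space_edges(mask):
    rows, cols = len(mask), len(mask[0])
    edges = set()
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and not mask[nr][nc]:
                    edges.add((r, c))
    return edges

road = [[1, 1, 1],
        [1, 0, 1],  # the unrecognized object punches a hole in free space
        [1, 1, 1]]
print(sorted(free_space_edges(road)))  # [(0, 1), (1, 0), (1, 2), (2, 1)]
```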


The full potential for this system is L3 on highways and L2 on urban roads.
Elon has been proclaiming, over a span of two years, that Level 5 will be ready: by Dec 2017, then June 2018, and lately April 2019. "I really consider autonomous driving a solved problem," Musk said in June 2016 in The Guardian. "I think we are probably less than two years away."

Fast-forward 12 months and now he is saying that if they solve vision, then they solve autonomous driving.

He's a hype machine; you will never know Tesla's true capabilities by listening to what he says. He vowed in 2016 that 100-200k Model 3s would be made by the end of 2017. The true reality is more like 10-20k.

But I won't be surprised when Elon releases their FSD software as driver assistance in early 2018. Elon has always shown he has a big ego and wants to be seen as first and best. The AP2 software was released in a pre-alpha state and QA-tested by customers. I won't be surprised if they do the same with FSD. But it will be a pre-alpha product with a disengagement every 1-10 miles.

To Elon, this is just another way to assert that they are first to FSD and that it's currently in "beta", and he knows his fans/media will lap it up. Just like his hyped media cross-country drive.

About processing power:

The Tesla web page says "Processing Power Increased 40x".
At the AP2 presentation Elon said: "The compute power increases by a factor of 40 (40x increase in compute power); it's such a gigantic increase in computing power, in fact the computer is capable of 12 trillion operations per second."

Do we know anything more specific about this?
What kind of 'operations per second' is he talking about? TFLOPS? DLTOPS?
I have seen the EyeQ3 described with a performance of 0.2 and 0.256 TFLOPS, but I don't know whether this is comparable to what is presented for the Nvidia PX2 with 8 TFLOPS / 24 DLTOPS (I do sometimes follow and read news regarding new CPUs/GPUs on AnandTech etc., and I know that the values differ depending on whether it is FP32 or FP16 TFLOPS, etc.).

I remember Nvidia saying that one PX2 is equivalent to 4 or 6 Titan Xs:
[Nvidia slide comparing the Drive PX2 with the Titan X]
 
Well as you can already tell, they hyped it up by calling it "level 5" but their statement is far from it … The full potential for this system is L3 on highways and L2 on urban roads.



i remember Nvidia saying that one PX2 is equivalent to 4 or 6 titan x,
There are multiple Nvidia cards called Titan X. The latest uses the same GP102 as the PX2. Also, please stop derailing this thread.
 
There are two Titan Xs, unfortunately, which causes confusion... In that slide, the Drive PX2, which is Pascal-based, was being compared to the 2015 Titan X, which was Maxwell-based. The 2016 Titan X was Pascal-based (and faster than the Drive PX2).

The new $700 1080 Ti is also faster than the Drive PX2.

Not quite, based on these benchmarks:
NVIDIA GeForce GTX 1070 On Linux: Testing With OpenGL, OpenCL, CUDA & Vulkan - Phoronix
GitHub - jcjohnson/cnn-benchmarks: Benchmarks for popular CNN models

The Drive PX2, for example, is 6 times faster than the Titan X (2015) on AlexNet.

There is also no mention of the Titan X (2016) FP16 capabilities, which have probably been gimped. The same goes for the 1080 Ti's FP16 performance.
 
not quite, based on these benchmarks. … drive px2 for example is 6 times faster than titan x (2015) on AlexNet.
What do you mean, "not quite"?

10 TFLOPS > 8 TFLOPS obviously

Your first link doesn't even have a 1080 Ti.
 
What do you mean "not quite"? 10 TFLOPS > 8 TFLOPS obviously. …


It's not that cut and dried: those FLOPS figures are FP32, not FP16.

"Most Deep Learning only requires half precision (FP16) calculations, so make sure you choose a GPU that has been optimised for this type of workload. For instance, while most GeForce gaming cards are optimised for single precision (FP32), they do not run FP16 significantly faster. Similarly, many older Tesla cards, such as those based on the Kepler architecture, were optimised for single (FP32) and double (FP64) precision and so are not such a good choice for Deep Learning. In contrast, Tesla GPUs based on the Pascal architecture can process two half precision (FP16) calculations in one operation, effectively halving the memory load and leading to a big speed up in Deep Learning. However, this is not true for all Pascal GPUs, which is why we don't recommend GeForce cards in our Deep Learning systems."

For example: "The Tegra X1 (Maxwell) is able to do 0.512 teraflops in FP32 and 1.024 in FP16. The Tegra P1 (Pascal) is able to do 0.750 teraflops in FP32 and 1.500 in FP16."

A deep-learning test run by Nvidia (notice the FP16 vs. FP32 performance of the Tegra X1):
[chart: deep-learning inference performance, Tegra X1 vs. Titan X]

FP16 performance was artificially limited on the GTX 1080.

FP16 performance on GTX 1080 is artificially limited to 1/64th the FP32 rate • r/MachineLearning

Hence the Drive PX2 is still more powerful than all of them, because Nvidia made it that way.
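A back-of-envelope sketch of the FP16 arithmetic in this exchange. The 2x packing factor for the Tegra parts comes from the quotes above; the GTX 1080 FP32 figure (~8.9 TFLOPS) and its 1/64 FP16 rate are assumptions taken from public spec sheets and the linked Reddit thread:

```python
# FP16 throughput from FP32 throughput and an architecture-dependent factor:
# 2x for Tegra parts that pack two FP16 ops into one FP32 unit, 1/64 for the
# artificially limited FP16 rate reported for the GTX 1080.
def fp16_tflops(fp32_tflops: float, factor: float) -> float:
    return fp32_tflops * factor

tegra_x1 = fp16_tflops(0.512, 2)       # 1.024 TFLOPS FP16
parker   = fp16_tflops(0.750, 2)       # 1.5 TFLOPS FP16
gtx_1080 = fp16_tflops(8.873, 1 / 64)  # ~0.139 TFLOPS FP16 (crippled)
print(tegra_x1, parker, round(gtx_1080, 3))
```

So on this arithmetic, a desktop card that wins on paper FP32 can still lose badly to an embedded part on FP16 deep-learning workloads.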
 
Its not that cut and dry. those flops are based on FP32 and not fp16. … FP16 performance was artificially limited on the GTX 1080.
As I said, when measuring GPU performance we generally use FP32, so as far as FP32 goes, the Pascal Titan X and 1080 Ti are faster than the Drive PX2; that's undeniable.

There's no question that the Drive PX2 will outperform the Titan X in FP16. Just so you know, 1080 != 1080 Ti, btw.

Now, did you hear about Google's new TPU 2.0? 180 TFLOPS (possibly at FP16, unlike the older one, which only did 8-bit ops). I want one so bad...
 
Thank you all for this info. Are there reasons why we haven't seen a PX2 board extracted out of a HW2 car yet? I have the car in my garage and just had it taken apart during a stereo installation. Is there any reason I shouldn't open up the glove compartment area and pull out the board?
 
Thank you all for this info. Are there reasons why we haven't seen a PX2 board extracted out of a HW2 car yet? …
If you're decent with handling electronics and taking stuff apart, then there's no reason you can't open it up to take some pics. If you don't feel comfortable, then it's up to you. I'm sure such pics would be featured on Electrek and other news sites, so prepare for your 15 minutes of fame.