
HW2.5 capabilities

I would love to get some input from the sleuths in this thread on Tesla's camera specs. Two-part question: 1) what specs are needed to avoid motion blur and other visual artifacts while driving at high speeds, and 2) do Tesla's Hardware 2 cameras meet those specs?

Trent, there are thousands of posts about the cameras on the Tesla Model S and X. Instead of asking the people in this thread to repeat information that has already been discussed, I would recommend that you use the search feature on TMC to answer your questions.
 

The paper I'm referring to was published just last month and my post is the first one on the TMC forums to mention the title. My post is also the first to mention "motion blur" in the context of Hardware 2 cameras or autonomy (the rest are about motion blurred photos of the Model 3, etc.). To my knowledge, this information has never been discussed before here.

I tried to search for the exact model of camera used in Hardware 2, but it is not easy info to find. Searching "what cameras does Tesla use", "Hardware 2 cameras", "camera model", etc. doesn't get you anywhere. Thankfully verygreen was kind enough to let me know it's the ON Semiconductor AR0132. (Hopefully now if someone searches those same terms, this post will come up.)
 
  • Helpful
Reactions: scottf200
You were using the wrong search terms. Hardware 2 is a term that practically nobody uses in discussion. At most you see HW2, but people typically just say AP2.0.

We have a long thread dedicated to AP2.0 cameras right here:
AP2.0 Cameras: Capabilities and Limitations?
 

"AP 2.0 camera model" still doesn't turn up the answer. Try searching yourself and see if the answer comes up.

I am probably ten or so pages into the AP2.0 Cameras thread. But I think it's reasonable to ask for information that isn't easily searchable rather than have to read a 48-page thread in the hopes of finding information that might not be in there.
 
"AP 2.0 camera model" still doesn't turn up the answer. Try searching yourself and see if the answer comes up.

I am probably ten or so pages into the AP2.0 Cameras thread. But I think it's reasonable to ask for information that isn't easily searchable rather than have to read a 48-page thread in the hopes of finding information that might not be in there.
"ap2.0 camera" turns up the thread, but not the model.

"camera model" turns up this though in the 5th result (would be 4th without our recent posts):

And this follows shortly after.
All cameras, except the rearview cam, use the Aptina / ON Semiconductor AR0132. The rearview cam uses the OmniVision OV10635.

The datasheet links are also in that thread.
 
Trent, there are thousands of posts about the cameras on the Tesla Model S and X. Instead of asking the people in this thread to repeat information that has already been discussed, I would recommend that you use the search feature on TMC to answer your questions.

OK, please point out the prior discussion on global vs. rolling shutters for @Trent Eady and the rest of us, because I could not find it after searching the phrase "global shutter" just now. The question he raises is an interesting one, as anyone with some experience in digital filmmaking understands.
 
stopcrazypp, okay, but the actual camera model doesn't appear in the search result. Unless you already knew what to look for, you wouldn't know the answer is on the next page. It would be nice to have this information be more accessible. I don't think asking questions is bad, and I'm personally happy to give people information that can't easily be found.

Anyway, my question wasn't just about the HW2 camera specs. It was also about what specs are required to prevent motion blur and other visual artifacts that might interfere with visual localization and mapping at high speeds. This question has not yet been discussed on this forum, as far as I'm aware.

I heard from a reliable source who has professional experience in computer vision that 60 fps and a global shutter should be enough. I was happy to see Tesla's HW2 cameras are 60 fps. I was a bit concerned to see they have an electronic rolling shutter rather than a global shutter. Motion blur will occur at high speeds with this hardware.

However, motion blur can apparently be counteracted with software. This is the next thing I'm looking into. I want to see how effective this software is.

In case the stakes are not clear here, this is about closing in on hard, quantitative evidence to support Elon Musk's assertion that "you can absolutely be superhuman with just cameras". If a multi-camera system is demonstrably better than or at least as good as humans at localization at low driving speeds, and if there is nothing to stop a multi-camera system from being as accurate at high driving speeds, then this is hard evidence that a multi-camera system (plus GPS) provides all the sensor input a self-driving car needs to be better at driving than humans.
 
Last edited:
  • Like
Reactions: scottf200
I heard from a reliable source who has professional experience in computer vision that 60 fps and a global shutter should be enough.
As pretty much anybody serious about videography would tell you, fps does not really matter; it's the actual shutter speed that matters, since nothing prevents you from having 30 fps video where the exposure for every frame is 1/120 s or less. Lots of cameras internally use a shorter exposure time when the light permits it (or rather, requires it, I guess) to maintain correct exposure.
Motion blur also depends heavily on how wide your lens is: the more zoomed in you are (the longer the focal length), the more motion blur you get.

There are several ways of doing a global shutter, I imagine. One is to physically block the sensor while you are reading it (sensor readout is not infinitely fast, for obvious reasons, and the more pixels there are the slower it is). I don't think there's a physical shutter in the car.
The other is to somehow block the sensor pixels from updating while you are reading out the frame; I am not sure there's anything that does this in the Aptina sensor used.
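To put rough numbers on the exposure-vs-blur point, here is a quick back-of-the-envelope Python sketch. The camera parameters in it (pixel pitch, focal length, exposure time, frame rate) are my own assumptions for an AR0132-class setup, not confirmed specs of the AP2.0 cameras.

```python
# Back-of-the-envelope motion blur and rolling-shutter skew estimate.
# Every camera parameter below is an assumption for illustration,
# NOT a confirmed spec of the AP2.0 cameras.

PIXEL_PITCH_M  = 3.75e-6    # assumed pixel size (m), typical of AR0132-class sensors
FOCAL_LENGTH_M = 6.0e-3     # assumed lens focal length (m) -- a guess
FRAME_RATE_HZ  = 60.0       # assumed frame rate
EXPOSURE_S     = 1.0 / 500  # assumed exposure time in bright daylight

def blur_pixels(rel_speed_mps, distance_m, exposure_s=EXPOSURE_S):
    """Pixels an object smears across during one exposure.

    A point moving at rel_speed_mps perpendicular to the optical axis at
    range distance_m travels (speed * exposure) metres; the pinhole model
    projects that to (f / distance) times as much on the sensor, which we
    divide by the pixel pitch to get pixels.
    """
    motion_m = rel_speed_mps * exposure_s
    motion_on_sensor_m = motion_m * FOCAL_LENGTH_M / distance_m
    return motion_on_sensor_m / PIXEL_PITCH_M

def rolling_shutter_skew_pixels(rel_speed_mps, distance_m):
    """Horizontal offset between the first and last rows an object spans,
    assuming it fills the frame and readout time is roughly 1 / frame rate."""
    readout_s = 1.0 / FRAME_RATE_HZ
    motion_m = rel_speed_mps * readout_s
    return motion_m * FOCAL_LENGTH_M / distance_m / PIXEL_PITCH_M

# Example: ~55 m/s closing speed (two cars at ~60 mph), object 30 m away.
print(f"blur during exposure: {blur_pixels(55, 30):.1f} px")
print(f"rolling-shutter skew: {rolling_shutter_skew_pixels(55, 30):.1f} px")
```

With these made-up numbers the blur during a single short exposure is only a few pixels, while the rolling-shutter skew across the frame is several times larger, which matches the point that exposure time, not frame rate, is what controls blur.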
 
Last edited:
The AR0132AT camera apparently has an electronic rolling shutter, according to the specs on ON Semiconductor's website.

I found this super useful and apropos explanation of what an electronic rolling shutter is:

Most CMOS image sensors (to save one transistor per cell compared to a true "snapshot" shutter) use Electronic Rolling Shutter. Basically it implements two pointers to sensor pixels, both proceeding in the same line-scan order across the sensor.

One is erase pointer, the other one - readout. Erase pointer runs ahead discharging each photosensitive cell, then follows the readout one. Each pixel sees (and accumulates) the light for the same exposure time (from the moment erase pointer passes it till it is read out), but that happens at different time.

If you will make an image of a fast passing car with ERS with short exposure - it will all be sharp (no blurring), but the car will seem to lean backwards. The roof will be captured at earlier time than the wheels and this time difference across the frame can be as long as 1/15 of a second for the full frame of the MT9P001 5MPix sensor (it is equal to the frame readout time that is equal to 1 sec divided by the frame rate in most circumstances).

So, this isn't exactly motion blurring, but it is a similar visual artifact created by an electronic rolling shutter. Hmmm!

Has anyone seen a blurred image taken from an HW2 camera, or have they all been sharp, as described above? I looked through the images posted here, and most seem to be from when the car is stopped. But I haven't looked through every image yet.

I wonder how easy or hard it would be to adjust for the fact that e.g. a car's roof is ahead of its wheels in an image.
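If the image velocity of the object can be estimated (say, from optical flow between frames), undoing the skew is at least conceptually simple, because each sensor row has a known time offset. Here is a toy Python sketch of that idea; the numbers and the per-row-timestamp approach are illustrative assumptions on my part, not how Tesla's software actually handles it.

```python
import numpy as np

def deskew_points(points_xy, image_velocity_px_per_s, line_time_s):
    """Shift detected points back to where they would have been if the
    whole frame had been captured at the time of row 0.

    points_xy               : array of (x, y) pixel coordinates; y is the row index
    image_velocity_px_per_s : estimated horizontal image velocity of the object
    line_time_s             : time to read out one sensor row
    """
    pts = np.asarray(points_xy, dtype=float)
    # Row y was sampled y * line_time_s later than row 0, so by then the
    # object had already moved (velocity * y * line_time_s) pixels.
    pts[:, 0] -= image_velocity_px_per_s * pts[:, 1] * line_time_s
    return pts

# Example: a 960-row frame read out in 1/60 s, object moving 3000 px/s.
line_time = (1.0 / 60) / 960
corners = [(400, 100), (405, 500), (410, 900)]  # roof, door, wheel points
print(deskew_points(corners, 3000.0, line_time))
```

The hard part in practice is estimating the image velocity accurately for every object, not the correction itself.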
 
stopcrazypp, okay, but the actual camera model doesn't appear in the search result. Unless you already knew what to look for, you wouldn't know the answer is on the next page. It would be nice to have this information be more accessible. I don't think asking questions is bad, and I'm personally happy to give people information that can't easily be found.
Just a general tip when searching (I do it a lot): if using specific and narrow search terms doesn't find what you want, broaden them and use synonyms (the search here is not sophisticated enough to do that automatically). Also, searching by phrasing it as a question in a human format rarely works, as the extra words are completely irrelevant to the search engine (it would only work if someone worded the question exactly as you did).

Anyway, my question wasn't just about the HW2 camera specs. It was also about what specs are required to prevent motion blur and other visual artifacts that might interfere with visual localization and mapping at high speeds. This question has not yet been discussed on this forum, as far as I'm aware.

I heard from a reliable source who has professional experience in computer vision that 60 fps and a global shutter should be enough. I was happy to see Tesla's HW2 cameras are 60 fps. I was a bit concerned to see they have an electronic rolling shutter rather than a global shutter. Motion blur will occur at high speeds with this hardware.

However, motion blur can apparently be counteracted with software. This is the next thing I'm looking into. I want to see how effective this software is.

In case the stakes are not clear here, this is about closing in on hard, quantitative evidence to support Elon Musk's assertion that "you can absolutely be superhuman with just cameras". If a multi-camera system is demonstrably better than or at least as good as humans at localization at low driving speeds, and if there is nothing to stop a multi-camera system from being as accurate at high driving speeds, then this is hard evidence that a multi-camera system (plus GPS) provides all the sensor input a self-driving car needs to be better at driving than humans.
The naysayers about cameras aren't going to be convinced by this argument. I don't think I have seen them question localization in general. Rather it's about situations with glare, poor weather, and low light.

Edit: I looked at that thread and I see the same thing. People aren't worried about rolling shutter artifacts at all, it's about the other things I mentioned.

PS: as someone with a bit of experience in video, motion blur doesn't depend directly on fps but rather on the shutter speed. You can shoot something at 60 fps but use a faster shutter speed to get less motion blur. In the industry this relationship between frame rate and shutter speed is described as shutter angle. Here's an article on this, with examples (all shot at the same 24 fps frame rate):
[Still frame: 24 fps video shot at 1/192 second shutter speed]

[Still frame: 24 fps video shot at 1/24 second shutter speed]
Shutter Angles & Creative Control
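For reference, the relationship the article is using is simple: exposure time = (shutter angle / 360°) ÷ frame rate. A couple of lines of Python reproduce the two stills above:

```python
def exposure_time(frame_rate_fps, shutter_angle_deg):
    """Exposure time in seconds for a given frame rate and shutter angle."""
    return (shutter_angle_deg / 360.0) / frame_rate_fps

# The two examples above: 24 fps at a 45-degree and a 360-degree shutter angle.
print(1 / exposure_time(24, 45))   # 192.0 -> 1/192 s (little motion blur)
print(1 / exposure_time(24, 360))  # 24.0  -> 1/24 s  (heavy motion blur)
```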
 
Last edited:
  • Helpful
Reactions: strangecosmos
The naysayers about cameras aren't going to be convinced by this argument. I don't think I have seen them question localization in general. Rather it's about situations with glare, poor weather, and low light.

From what I understand, lidar performs even worse than cameras in poor weather because when a laser pulse hits a raindrop, snowflake, or fog it refracts. Low light is an interesting one. I've seen some examples that suggest deep neural networks might be able to see in the dark better than humans. Even if a human looking at the camera image can't pick out a vehicle, a deep neural network might be able to. Don't know about glare.

The objection to eschewing lidar I hear most often is around visual SLAM. For example, Ben Evans at the venture capital firm Andreessen Horowitz wrote this:

...Tesla will have to use imaging alone to build most of the model of the world around itself, but, as I noted above, we don’t yet know how to do that accurately. This means that Tesla is effectively collecting data that no-one today can read (or at least, read well enough to produce a complete solution). Of course, you would have to solve this problem both to collect the data and actually to drive the car, so Tesla is making a big contrarian bet on the speed of computer vision development. Tesla saves time by not waiting for cheap/practical LIDAR (it would be impossible for Tesla to put LIDAR on all of its cars today), but doing without LIDAR means the computer vision software will have to solve harder problems and so could well take longer. And if all the other parts of the software for autonomy - the parts that decide what the car should actually do - take long enough, then LIDAR might get cheap and practical long before autonomy is working anyway, making Tesla’s shortcut irrelevant. We’ll see.

Paulo Santos, an analyst for Seeking Alpha, wrote this:

In lacking LIDAR, [a Tesla car] also lacks the ability to precisely know positions, directions, speeds, even for the objects it can detect. ... Worse still, in lacking LIDAR it also doesn’t have the ability to position itself exactly in the world.

The paper I found demonstrated visual SLAM with high accuracy, but only at low speeds. If there's no barrier to a similar system achieving high accuracy at high driving speeds, then this objection is overcome.
 
The paper I found demonstrated visual SLAM with high accuracy, but only at low speeds. If there's no barrier to a similar system achieving high accuracy at high driving speeds, then this objection is overcome.
I think speed-measuring devices based purely on a video camera were at least somewhat common in the past. This is even without taking potential stereo vision into account.
I suspect that, given infinite processing speed, you could do pretty well even with a single-pixel sensor, like here: A faster single-pixel camera (some examples here: https://www.ams.org/samplings/math-history/hap7-pixel.pdf)
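The arithmetic behind measuring speed from video is straightforward once the geometry is pinned down. A hedged Python sketch, assuming the camera views the target roughly side-on at a known range; every number and parameter here is invented for illustration:

```python
def speed_from_displacement(delta_px, frame_rate_fps, distance_m,
                            focal_length_m=6.0e-3, pixel_pitch_m=3.75e-6):
    """Estimate the transverse speed of an object from how many pixels a
    tracked feature moves between two consecutive frames.

    Inverts the pinhole projection: pixels -> metres on the sensor ->
    metres in the scene at the given range, then divides by the frame time.
    """
    sensor_motion_m = delta_px * pixel_pitch_m
    scene_motion_m = sensor_motion_m * distance_m / focal_length_m
    return scene_motion_m * frame_rate_fps

# A feature that moves 12 px per frame at 60 fps on an object 30 m away:
print(f"{speed_from_displacement(12, 60, 30):.1f} m/s")
```

The real work is in tracking the feature reliably and knowing the range, which is where stereo (or structure from motion) would come in.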
 
Comments in code are super common.

In shipping binaries?

Comments do not make it into the compiled binaries. They remain in the source files, for the programmer to remember what he or she did or for the poor souls who inherit the code down the line.

Yes, you'll see lots of readable characters. However, those readable characters are not comments. They are strings used by the binary to log messages, open files, request URLs, etc. You'll also see strings used to reference symbols in other modules, and to export symbols from the library you're looking at. If you are super lucky, you'll also have debug symbols (like function names for non-exported functions, variables, etc.). I've never seen source code (e.g., comments) included in a binary.

I suspect @verygreen was talking about something interpreted, like shell, perl, or python, etc.
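To make the distinction concrete: what you can actually pull out of a compiled binary looks like the output of the Unix strings tool, which just scans for runs of printable bytes. A minimal Python sketch of that idea follows; it is not a disassembler, and it will never recover comments, because they were never compiled in.

```python
import string
import sys

# Printable ASCII bytes, minus most whitespace control characters.
PRINTABLE = set(string.printable.encode()) - set(b"\t\n\r\x0b\x0c")
MIN_LEN = 6  # ignore short accidental runs of printable bytes

def extract_strings(path, min_len=MIN_LEN):
    """Yield runs of printable ASCII bytes from a binary file, similar in
    spirit to the Unix `strings` utility. What comes out are string
    literals, log messages, symbol names, etc. -- never source comments."""
    with open(path, "rb") as f:
        data = f.read()
    run = bytearray()
    for byte in data:
        if byte in PRINTABLE:
            run.append(byte)
        else:
            if len(run) >= min_len:
                yield run.decode("ascii")
            run.clear()
    if len(run) >= min_len:
        yield run.decode("ascii")

if __name__ == "__main__":
    for s in extract_strings(sys.argv[1]):
        print(s)
```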

Code can be obfuscated by encrypting strings, but it adds overhead and makes things more complex. In most programs you can find strings in the compiled code, but they are generally strings used by the code, not comments written by the programmer. For example, the text of something shown to the user at some point would be in there. A hacker can guess that the code around it is related to the message in some way, but without a decompiler or a disassembler (and a lot of time), it's going to be tough to figure out what the code actually does.

Scripting languages are a different matter. They are shipped as plain text, and anyone who can get to the files can read the code. A lot of web languages, like PHP, are scripting languages.

But only a moron of a programmer would write complex algorithms in a scripting language. Even in web code there is usually some kind of compiled code for the more complex parts. Scripting languages are more difficult to make secure (usually you have to rely on some kind of OS-level security to prevent people from accessing the files) and they are much slower than compiled languages.

For anything "real time", i.e. anything that needs to do things on a time scale imperceptible to the user, C and C++ are the two most common languages. The most time-critical code is written in assembly, but that's very rare these days; I haven't had to write assembly in close to 30 years (and I've worked on a number of real-time systems). Most games of any complexity are written in C and/or C++ these days too.
 
  • Helpful
  • Informative
Reactions: VT_EE and croman