I’m super excited by this recent paper: “3D Visual Perception for Self-Driving Cars using a Multi-Camera System: Calibration, Mapping, Localization, and Obstacle Detection”. The authors used a four-camera system — similar to the eight-camera system in Hardware 2 Teslas — to determine the locations of obstacles in real time with an accuracy of under 10 cm (3.9 in). By comparison, lidar has an accuracy of 1.5 cm (0.6 in).
My strong hunch is that an accuracy of under 10 cm is good enough for full self-driving. For reference, a credit card is 8.6 cm (3.4 in) long. At that point, you’re getting to the limit of how accurately a human driver can control a car. I found a study where drivers were only able to park with about 10 cm of accuracy at best.
The big caveat here is that the multi-camera system was only tested at low speeds. The experiments occurred in a parking garage. I have not been able to find any published research on multi-camera systems at high driving speeds.
Here’s what I’m trying to figure out now. What would it take to adapt a multi-camera system to high driving speeds, while retaining an accuracy of under 10 cm?
Based on my interactions on Quora, Facebook, Twitter, and Stack Exchange, and on an email exchange with the paper’s first author, the main challenge seems to be motion blur and other visual artifacts that occur at higher speeds. Some people I have talked to have suggested that this can be overcome with cameras that use a global shutter (i.e. that capture every pixel simultaneously, as opposed to a rolling shutter, which captures pixels line by line) and a high enough frame rate. One person suggested shutter speed is also important.
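To get a feel for why shutter speed matters, here’s a rough back-of-envelope calculation of motion blur using a simple pinhole camera model. All the numbers below (focal length, pixel pitch, exposure time) are illustrative assumptions I picked, not the specs of any actual Tesla camera, so treat this as a sketch of the relationship rather than a real analysis:

```python
def motion_blur_px(speed_mps, exposure_s, focal_mm, distance_m, pixel_um):
    """Rough blur, in pixels, for an object moving laterally across the frame.

    Pinhole model: the image of a point at distance Z shifts on the sensor
    by (motion * f / Z) when the point moves laterally by `motion`.
    """
    # How far the object moves during the exposure (metres).
    motion_m = speed_mps * exposure_s
    # Project that motion onto the sensor plane (result in mm,
    # since focal length is in mm and the motion/distance ratio is unitless).
    shift_mm = motion_m * focal_mm / distance_m
    # Convert the sensor shift from mm to pixels.
    return shift_mm * 1000.0 / pixel_um

# Illustrative numbers: 30 m/s (~108 km/h) relative lateral speed,
# 1 ms exposure, 6 mm lens, object 20 m away, 4.2 µm pixels.
blur = motion_blur_px(30.0, 0.001, 6.0, 20.0, 4.2)
print(f"{blur:.1f} px of blur")  # roughly 2 px under these assumptions
```

Under these made-up but plausible parameters, a 1 ms exposure already smears a fast-moving object across a couple of pixels, which is why a shorter exposure (and enough light or sensor sensitivity to afford it) matters at highway speeds, independent of the global-vs-rolling-shutter question.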
I’m hoping the community here can help me answer this question conclusively — or as conclusively as possible without running a test of a multi-camera system at high speeds. People here really go deep on the cameras used in Hardware 2 Teslas as well as the software. I don’t have a deep understanding of the technical details. So I’m looking for some help.