My perspective on this: when I did my master's thesis, we could choose between doing Lidar SLAM or Camera SLAM. We chose Lidar because we felt it suited us better, while the other team was more suited for the camera project.
The camera team had it somewhat easier because they could pretty much just download ORB-SLAM and have a fancy demo running without too much work. We on the Lidar team had to struggle through many of the steps ourselves, such as key point extraction and feature descriptors, which back then were far from trivial. But we had an easier time with particle filters etc. for positioning.
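To illustrate what "particle filters for positioning" means in practice, here is a minimal 1D localization sketch. The landmark setup, noise values, and all numbers are hypothetical, chosen only to show the predict/weight/resample cycle, not taken from our thesis.

```python
import numpy as np

# Minimal 1D particle-filter localization sketch (illustrative only).
# Particles are hypotheses about the robot's position; each step they are
# moved with the motion model, weighted by how well they explain a noisy
# measurement, and resampled. All numbers here are made up.

rng = np.random.default_rng(0)

LANDMARK = 10.0                                 # known landmark position
true_pos = 2.0                                  # hidden ground truth
particles = rng.uniform(0.0, 10.0, size=1000)   # initial belief: uniform

for _ in range(5):
    # Motion update: robot commands +1.0 m, with process noise.
    true_pos += 1.0
    particles += 1.0 + rng.normal(0.0, 0.1, size=particles.size)

    # Measurement update: noisy signed displacement to the landmark.
    z = (LANDMARK - true_pos) + rng.normal(0.0, 0.2)
    expected = LANDMARK - particles
    weights = np.exp(-0.5 * ((z - expected) / 0.2) ** 2)
    weights /= weights.sum()

    # Resample in proportion to weights (simple multinomial resampling).
    particles = rng.choice(particles, size=particles.size, p=weights)

print(particles.mean())  # estimate should land near the true position (7.0)
```

The nice property for Lidar-style positioning is that nothing here needs image features: any sensor model that can score "how likely is this reading given this hypothesized pose" plugs into the weighting step.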
It seems that we have two fields converging:
- Probabilistic Robotics (Sebastian Thrun et al.): particle filters, GraphSLAM, and classical hand-made tools, trying a little bit of Machine Learning
- Computer Vision (Andrej Karpathy et al.): CNNs and other computer-science tools, trying a little bit of Robotics
The probabilistic robotics guys love their Lidars: they work in the same bird's-eye framework in which they see the world. The Computer Vision guys love their cameras: the input comes in a nice structured matrix, the same way they see the world.
We are now seeing deep learning make great depth maps out of camera images, and we are seeing classical point clouds derived from camera images make great object detections. The first runs great on GPUs/TPUs; the latter complicates the processing pipeline considerably. The main takeaway, though, is that the two fields are starting to overlap: a very interesting fusion of domains that will confuse a lot of people on both sides.
We are at a time where we have far more computer scientists coding than roboticists coding, but more roboticists building vehicles than computer scientists building vehicles. Cameras are cheaper; they are passive sensors. Lidars are getting cheaper fast, but there will likely always be a difference of some orders of magnitude. Lidars rely less on intelligence: if you don't get a reading in front of you, you can be pretty certain that there is free space there. But with some clever software and a gigantic amount of data, the camera is catching up. Thus the price and power benefits start to favor the camera.
Imo, at this point, cameras are easier to work with but hard to do well; Lidars are hard to work with, but easier to do well. I think Tesla's approach will turn out to be the right one, and I am very impressed by Elon's ability to come to this conclusion much earlier than most other experts. I was wrong on this.