Paper: “Minimizing Supervision for Free-space Segmentation”
Here's an awesome application of weakly supervised learning for semantic segmentation of free space (i.e. unobstructed roadway that a car can safely drive on). The researchers use human driving as a form of automatic labelling instead of manual annotation. They exploit the fact that wherever humans drive is free space (or at least it is 99.99%+ of the time). The researchers note:
“Of course, fully supervised somewhat outperforms our results (0.853 vs 0.835). Nonetheless, it is impressive that our technique achieves 98% of the IoU [Intersection over Union] of the fully supervised model, without requiring the tedious pixel-wise annotations for each image. This indicates that our proposed method is able to perform proper free-space segmentation while using no manual annotations for training the CNN [convolutional neural network].”
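For anyone unfamiliar with the metric, here is a minimal sketch of how IoU is computed for a binary free-space mask, plus the arithmetic behind the "98%" figure in the quote. The function name `iou` and the toy masks are made up for illustration; this is not the paper's evaluation code.

```python
def iou(pred, truth):
    """Intersection over Union for two binary masks given as nested lists of 0/1."""
    inter = 0  # pixels marked free space in both masks
    union = 0  # pixels marked free space in at least one mask
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += int(bool(p) and bool(t))
            union += int(bool(p) or bool(t))
    # Two empty masks agree perfectly, so treat that case as IoU = 1.
    return inter / union if union else 1.0


# Toy 2x2 example: the prediction over-segments by one pixel.
pred = [[1, 1],
        [0, 0]]
truth = [[1, 0],
         [0, 0]]
print(iou(pred, truth))

# Ratio of the two IoU scores reported in the quote above.
print(round(0.835 / 0.853, 3))
```

The ratio comes out just under 0.98, which matches the "98% of the IoU" in the quote.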
If you can automatically label 10,000x as much data (or more) as you can afford to manually label — which is true for a company like Tesla — then I would imagine weakly supervised learning would outperform fully supervised learning. A hybrid approach in which you use a combination of manually labelled and automatically labelled data might outperform both.
According to research from Baidu, on the ImageNet benchmark for image recognition, accuracy improves sub-linearly with labelled training data, such that a 10,000x increase in data would yield a roughly 10x to 100x improvement (strictly speaking, a 10x to 100x reduction in error, since accuracy itself tops out at 100%), assuming the neural network doesn't run into any fundamental limits.
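To make the sub-linear scaling concrete, here is a back-of-the-envelope sketch. It assumes error follows a power law in dataset size, error ∝ N^(-beta), and derives the exponents from the 10x and 100x figures in this post rather than from the Baidu paper itself. The function names are hypothetical.

```python
import math


def error_reduction(data_factor, beta):
    """Factor by which error shrinks when data grows by data_factor,
    assuming error is proportional to N ** (-beta)."""
    return data_factor ** beta


def implied_exponent(data_factor, reduction):
    """Power-law exponent beta that would turn data_factor into the given error reduction."""
    return math.log(reduction) / math.log(data_factor)


# Exponents implied by "10,000x data gives 10x to 100x improvement".
low = implied_exponent(10_000, 10)
high = implied_exponent(10_000, 100)
print(low, high)

# Sanity check in the other direction.
print(error_reduction(10_000, 0.5))
```

Under that assumption the range corresponds to exponents of roughly 0.25 to 0.5. Both are well below 1, which is what "sub-linear" means here: each 10x of extra data buys much less than a 10x improvement.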
Paper abstract:
“Identifying "free-space," or safely driveable regions in the scene ahead, is a fundamental task for autonomous navigation. While this task can be addressed using semantic segmentation, the manual labor involved in creating pixel-wise annotations to train the segmentation model is very costly. Although weakly supervised segmentation addresses this issue, most methods are not designed for free-space. In this paper, we observe that homogeneous texture and location are two key characteristics of free-space, and develop a novel, practical framework for free-space segmentation with minimal human supervision. Our experiments show that our framework performs better than other weakly supervised methods while using less supervision. Our work demonstrates the potential for performing free-space segmentation without tedious and costly manual annotation, which will be important for adapting autonomous driving systems to different types of vehicles and environments.”
A key excerpt:
“We now describe our technique for automatically generating annotations suitable for training a free-space segmentation CNN. Our technique relies on two main assumptions about the nature of free-space: (1) that free-space regions tend to have homogeneous texture (e.g., caused by smooth road surfaces), and (2) there are strong priors on the location of free-space within an image taken from a vehicle. The first assumption allows us to use superpixels to group similar pixels. ... The second assumption allows us to find “seed” superpixels that are very likely to be free-space, based on the fact that free-space is usually near the bottom and center of an image taken by a front-facing in-vehicle camera.”
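A toy sketch of the two assumptions in that excerpt, to show the idea rather than reproduce the paper. Big caveats: square tiles stand in for real superpixels, mean intensity stands in for texture, and the region growing from the bottom-centre seed is my own simplification, since the excerpt elides what the authors do after seeding. All names and parameters (`block_means`, `seed_free_space`, `block`, `tol`) are hypothetical.

```python
import numpy as np


def block_means(image, block):
    """Mean intensity of each block x block tile (a crude stand-in for superpixels)."""
    h, w = image.shape
    rows, cols = h // block, w // block
    tiles = image[:rows * block, :cols * block].reshape(rows, block, cols, block)
    return tiles.mean(axis=(1, 3))


def seed_free_space(image, block=4, tol=0.1):
    """Grow a free-space mask over tiles, starting from the bottom-centre tile.

    Assumption 2 (location prior): the bottom-centre tile is the seed.
    Assumption 1 (homogeneous texture): neighbours join if their mean
    intensity is within tol of the seed's mean.
    """
    means = block_means(image, block)
    rows, cols = means.shape
    seed = (rows - 1, cols // 2)
    mask = np.zeros((rows, cols), dtype=bool)
    stack = [seed]
    while stack:
        r, c = stack.pop()
        if mask[r, c]:
            continue
        mask[r, c] = True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not mask[nr, nc] and abs(means[nr, nc] - means[seed]) <= tol:
                stack.append((nr, nc))
    return mask


# Synthetic 8x8 "image": bright sky on top, darker uniform road below.
image = np.ones((8, 8))
image[4:, :] = 0.2
print(seed_free_space(image, block=4, tol=0.1))
```

On this synthetic image the bottom row of tiles is marked as free space and the top row is not. A mask like this could then serve as an automatically generated training label, which is the role the excerpt describes for the generated annotations.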
Open access PDF:
http://openaccess.thecvf.com/conten...inimizing_Supervision_for_CVPR_2018_paper.pdf
Examples of segmentations included in the paper: