If I were designing the system, the car would flag each situation with a reliability score, meaning how sure it was that it knew what to do, both before the situation happened and in hindsight.
All the data where the car was almost 100% sure it would do well, and where things did go well, would be filtered out by the big-data pipeline and the NN. It...
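
A rough sketch of the flag-and-filter idea described above, in Python. Everything here is my own illustration and not a real system: the `Situation` record, its field names, the `keep_informative` helper, and the 0.99 cutoff standing in for "almost 100% sure" are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    """One flagged driving situation (hypothetical record layout)."""
    situation_id: str
    confidence_before: float  # how sure the car was beforehand, 0.0 to 1.0
    confidence_after: float   # how sure it is looking back, 0.0 to 1.0
    went_well: bool           # whether the outcome was actually fine


def is_routine(s: Situation, threshold: float = 0.99) -> bool:
    # "Routine" = the car was almost certain beforehand AND it went well.
    # The 0.99 threshold is an assumed stand-in for "almost 100% sure".
    return s.went_well and s.confidence_before >= threshold


def keep_informative(situations, threshold: float = 0.99):
    # Drop the routine cases; keep everything uncertain or gone wrong.
    return [s for s in situations if not is_routine(s, threshold)]
```

For example, given one confident success, one uncertain success, and one confident failure, only the last two would survive the filter, since those are the cases where the car either was not sure or was sure and wrong.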