Conventional vehicle safety is ~1 fatality per 1M miles. A failure doesn't always equal a fatality, so if we assume one fatality per 20 failures, that's 50k miles per failure, so it's not a completely unreasonable KPI target for autonomous driving. I'd agree that's definitely on the low end.
Thanks for explaining how you came up with 50k miles per safety critical intervention. I would add that you want your AV to avoid non-fatal accidents as well. Some non-fatal accidents are still pretty serious (physical damage and injury), so we can't just look at fatal accidents. So yeah, 50k is definitely on the low end.
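To make the arithmetic explicit, here is a minimal sketch that takes the figures quoted above as given (1 fatality per 1M miles, 1 fatality per 20 failures). The function name is hypothetical, and the second ratio (1 fatality per 10 failures) is an assumption added only to show how sensitive the target is to that guess.

```python
def miles_per_failure(miles_per_fatality: float, failures_per_fatality: float) -> float:
    """Miles between failures implied by a fatality rate and an assumed
    number of failures per fatality."""
    return miles_per_fatality / failures_per_fatality


# Figures quoted in the thread: 1 fatality per 1M miles, 1 fatality per 20 failures.
quoted_target = miles_per_failure(1_000_000, 20)

# Hypothetical alternative: if failures were twice as likely to be fatal
# (1 fatality per 10 failures), the required miles per failure doubles.
stricter_target = miles_per_failure(1_000_000, 10)

print(f"quoted assumptions:   {quoted_target:,.0f} miles per failure")
print(f"stricter assumption:  {stricter_target:,.0f} miles per failure")
```

The result scales directly with the assumed ratio, which is why the 50k figure is only as good as the "1 in 20" guess behind it.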
I like Mobileye's definition of a failure: a failure is a perception error that would lead to an accident if the driving policy relied on it. They also assume that the perception error starts no more than 10 seconds before the accident.
So they are defining failure such that 1 failure = 1 accident. That is much simpler. It excludes more minor perception errors that don't cause accidents. You don't have to make any assumptions about how many failures there are per accident. And the failure rate becomes easier to calculate: it simply equals the accident rate that you want to achieve.
FYI, Mobileye has an interesting paper on how you can estimate the accident rate directly from the perception MTBF.
https://arxiv.org/pdf/2205.02621.pdf
Another way to look at this is in terms of behavior capabilities. Here is the NHTSA list of behavior capabilities:
Parking (Note: ODD may include parking garages, surface lots, parallel parking)
•Navigate a parking lot, locate spaces, make appropriate forward and reverse parking maneuvers
Lane Maintenance & Car Following (Note: ODD may include high and low speed roads)
•Car following, including stop and go, lead vehicle changing lanes, and responding to emergency braking
•Speed maintenance, including detecting changes in speed limits and speed advisories
•Lane centering
•Detect and respond to encroaching vehicles
•Enhancing conspicuity (e.g., headlights)
•Detect and respond to vehicles turning at non-signalized junctions
Lane Change (Note: ODD may include high and low speed roads)
•Lane switching, including overtaking or to achieve a minimal risk condition
•Merge for high and low speed
•Detect and respond to encroaching vehicles
•Enhancing conspicuity (e.g., blinkers)
•Detect and respond to vehicles turning at non-signalized junctions
•Detect and respond to no passing zones
Navigate Intersection (Note: ODD may include signalized and non-signalized junctions)
•Navigate on/off ramps
•Navigate roundabouts
•Navigate signalized intersection
•Detect and respond to traffic control devices
•Navigate crosswalk
•U-Turn
•Car following through intersections, including stop and go, lead vehicle changing lanes, and responding to emergency braking
•Navigate rail crossings
•Detect and respond to vehicle running red light or stop sign
•Vehicles turning - same direction
•LTAP/OD at signalized junction and non-signalized junction
•Navigate right turn at signalized and non-signalized junctions
Navigate Temporary or Atypical Condition
•Detect and respond to work zone or temporary traffic patterns, including construction workers directing traffic
•Detect and respond to relevant safety officials that are overriding traffic control devices
•Detect and respond to citizens directing traffic after an incident
•N-point turn
OEDR: Vehicles
•Detect and respond to encroaching, oncoming vehicles
•Vehicle following
•Detect and respond to relevant stopped vehicle, including in lane or on the side of the road
•Detect and respond to lane changes, including unexpected cut ins
•Detect and respond to cut-outs, including unexpected reveals
•Detect and respond to school buses
•Detect and respond to emergency vehicles, including at intersections
•Detect and respond to vehicle roadway entry
•Detect and respond to relevant adjacent vehicles
•Detect and respond to relevant vehicles when in forward and reverse
OEDR: Traffic Control Devices and Infrastructure
•Follow driving laws
•Detect and respond to speed limit changes or advisories
•Detect and respond to relevant access restrictions, including one-way streets, no-turn locations, bicycle lanes, transit lanes, and pedestrian ways (See MUTCD for more complete list)
•Detect and respond to relevant traffic control devices, including signalized intersections, stop signs, yield signs, crosswalks, and lane markings (potentially including faded markings) (See MUTCD for more complete list)
•Detect and respond to infrastructure elements, including curves, roadway edges, and guard rails (See AASHTO Green Book for more complete list)
OEDR: Vulnerable Road Users, Objects, Animals
•Detect and respond to relevant static obstacles in lane
•Detect and respond to pedestrians, pedal cyclists, animals in lane or on side of road
ODD Boundary
•Detect and respond to ODD boundary transition, including unanticipated weather or lighting conditions outside of vehicle's capability
Degraded Performance/Health Monitoring, Including Achieving Minimal Risk Condition
•Detect degraded performance and respond with appropriate fail-safe/fail-operational mechanisms, including detect and respond to conditions involving vehicle, system, or component-level failures or faults (e.g., power failure, sensing failure, sensing obstruction, computing failure, fault handling or response)
•Detect and respond to vehicle control loss (e.g., reduced road friction)
•Detect and respond to vehicle road departure
•Detect and respond to vehicle being involved in incident with another vehicle, pedestrian, or animal
•Non-collision safety situations, including vehicle doors ajar, fuel level, engine overheating
Failure Mitigation Strategy
•Detect and respond to catastrophic event, for example flooding or debilitating cyber attack
Source:
https://www.nhtsa.gov/sites/nhtsa.g...82-automateddrivingsystems_092618_v1a_tag.pdf
I like this list because it gives a good idea of what an AV needs to be able to do. To be "eyes off", the AV needs to be ~99.9999% reliable in the capabilities listed that are relevant to its ODD. So for which capabilities on this list is FSD Beta ~99.9999% reliable? That gives us a sense of what capabilities FSD Beta still needs to work on in order to achieve "eyes off". I think it shows that FSD Beta is a long way from "eyes off".
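For a rough feel of what ~99.9999% means, here is a small sketch. It rests on assumptions that are not in the NHTSA document or stated above: that reliability is measured per event (e.g. per encounter or per mile), and that capabilities fail independently of each other. The function names are hypothetical.

```python
def events_per_failure(reliability: float) -> float:
    """Average number of events between failures for a given per-event reliability."""
    return 1.0 / (1.0 - reliability)


def combined_reliability(reliability: float, n_capabilities: int) -> float:
    """Chance that all capabilities succeed together, ASSUMING each has the
    same reliability and they fail independently (a simplification)."""
    return reliability ** n_capabilities


six_nines = 0.999999

# 99.9999% per event works out to roughly one failure per million events.
print(f"{events_per_failure(six_nines):,.0f} events per failure")

# Hypothetical example: 10 capabilities exercised together, each at six nines.
print(f"combined reliability over 10 capabilities: {combined_reliability(six_nines, 10):.6%}")
```

If the unit is taken to be miles, six nines corresponds to about one failure per 1M miles, and chaining many capabilities together only makes the combined figure worse, so each individual capability would have to clear that bar.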