This weekend, I watched Lex Fridman's interviews with Elon Musk (part 1: Autopilot), as well as his discussions around Moore's Law, processors, and FSD with Jim Keller. The relevant extracts of these interviews are contained in this video (LINK). For the purposes of this discussion, the last link is the best. If you're an engineer or want to dive deep, watch the first two links.
As someone who has designed communication protocols, as well as studied fundamental truths of human nature and behavior, it appears to me that a lot of the orchestration of vehicles and driving is viewed in an overly simplistic way by those who work on autopilot. But maybe I just don't have enough data about the thought that has gone into those systems. While AP and FSD have no doubt been making great advances, I'm curious as to how these issues are viewed, since I haven't heard them publicly discussed, so I wanted to share my thoughts and observations here. Warning: this is a long post written linearly and unedited, so take this as raw opinion.
First, let me give you my background since it certainly colors my views and experience in this area. I was born in Boston and learned to drive there 30 years ago, without GPS devices, where the legitimate answer to a navigation query was sometimes "You can't get theyah from heyah." I received my BS and MS in Electrical & Computer Engineering at Carnegie Mellon in Pittsburgh, then spent 25 years working in hardware design, network engineering, and hyperscale datacenter infrastructure operations in San Francisco and Silicon Valley. I am also a Reserve Police Officer, went through the full police academy, and worked part-time as a solo patrol officer in the Bay Area for 15 years. So I know the vehicle code very well, and have carefully observed (and given warnings / written citations for) various driving behaviors. I have also traveled to over 70 countries, and driven in more than 10 of them. I now reside in Austin, Texas, where I work remotely for a Silicon Valley startup that creates advanced data center network automation and orchestration software.
--
During his interview, Lex pushed back pretty hard against Jim Keller regarding whether driving (and FSD) can be easily accomplished by machines. Jim's postulate is that "You don't have to be super smart to drive a car." He later states that it's harder to affix brightwork (chrome trim) to a vehicle in a factory than it is to drive a car and that it (the former) is something that Lex Fridman couldn't do. I found the flippant comments (which Lex alluded to by saying that Jim was mocking him) a bit distasteful, but given the substandard alignment of brightwork on my 2017 Model S, I can understand why Jim thinks it's so hard. Tesla is still learning how complex and difficult manufacturing is. And while AI systems get exponentially better at mimicking human behaviors of various kinds, it's not clear to me that that's "enough" to allow them to mingle, unsupervised, with human drivers in various scenarios. I have to side with Fridman on this topic. (Though Jim is a brilliant computer designer and I loved listening to him in this interview. He made many excellent and amazing points.)
Something that is entirely lacking in Jim's and Elon's dialog regarding autopilot and FSD is a discussion around the etiquette and interactions that occur when driving. There is a tremendous amount of gray area regarding how we navigate our vehicles with respect to others' vehicles, yield or protect our right of way, let others merge, resolve deadlocks in parking lots or driveways when vehicles are blocking each other, and many other driving scenarios where facial expressions, hand gestures, yelling out the window, honking of the horn, and other verbal and nonverbal communication methods are used. While US state laws are clear, they are by no means exhaustive, and there are many gray areas. Here, we rely on human interaction and common sense to resolve situations. Machines do not have common sense; they do not have ambiguity resolution capabilities. None of this has been discussed in any of the interviews I've heard from these guys. My impression given Jim's comments is that you can just "muscle in" to any open spot, or simply create one by forcing others to yield right of way to you.
Now, let's consider the etiquette for resolving these situations, and what counts as polite. From my perspective, having driven around the country and around the world, how you request, and grant, a yielding of the right of way is *highly* differentiated based on the region that you're in and its local rules. Even in the Bay Area, I've seen driving norms and etiquette change quite dramatically in the 30+ years I've watched drivers there. What's OK to do in downtown Boston, New York, New Delhi, or Ho Chi Minh City would inspire anger, and even rage, in Texas or Switzerland or Idaho.
How do drivers handle this? What's the psychological aspect? When drivers get enraged, what happens? How does law enforcement evaluate the justness of a maneuver? If you came up to a 4-way intersection, and three other cars stopped at exactly the same time as you, and those three cars had deeply tinted windows and windshields so that you could not see the drivers' faces at all, how would you go about proceeding through the intersection? What if the driver behind the deeply tinted windows (e.g. an FSD computer) wanted to yield right of way to you? How would it signify this? By waving you through? What does it look like when an FSD computer "waves" you through to yield right of way?
What if a driver is a total "a-hole"? Maybe we are not super surprised if the person is driving a specific model of car, because people with those personality traits tend to buy a car that fits a specific stereotype. Will we expect a Toyota Prius FSD and a BMW M3 FSD to behave differently? What about a Tesla Model 2 and a Tesla Roadster 2020? Will their FSDs also behave and interact with other cars differently?
When a Tesla in FSD mode is pulled over by a police officer (a scenario Elon has said they've prepared for, and which he called pretty easy) because it just got in a crash, will the officer be able to ask questions of the driver as to why they did what they did? Will the software have an API to query the various layers of the deep learning model, so that the officer can understand that the computer thought it saw a cat running into the street and slammed on its brakes, even though what it really saw was the shadow from a plastic bag flying through the air?
Each state has standards for the tests a driver must pass in order to be granted a driver's license, which carries both the authority to drive and the liability, under civil or criminal penalties, for violating the vehicle code. How will Tesla FSD pass these tests? Will they be regression tests that are run nightly when the CI/CD pipeline has new code that developers have checked in? What if the software goes wonky and drives like someone who is under the influence, passes over the center divider, and causes a head-on collision with a mom driving her kids home? Who will be criminally responsible for these deaths? The owner of the car who was sleeping in the back seat? The developer who checked in a buggy piece of code two weeks before, that was pushed with last night's software update? Tesla as a corporation?
Now, while a 25-year-old PhD graduate on Pete Bannon's team may have sufficient knowledge to work on cutting-edge chip microarchitecture or the deep learning pipeline for the next FSD chip, and a software developer on Andrej Karpathy's team may be able to implement the best object recognition or bounding box capabilities in the industry, what I'm really curious about is Stuart Bowers's work regarding the FSD software itself. During Autonomy Day, he said that they review all the times that autopilot is disengaged, crashes, etc., and try to learn from those. But what are they really learning? He said that they choose behaviors from "good" drivers, even when in shadow mode, to train the AI. But what is a good driver? Suppose I take someone from rural Idaho who has never been in an accident, and has also never traveled outside the state, and put them into downtown LA traffic. On a busy road, they slow down so much that people are honking at them, angry that they haven't acted. Maybe a fast-moving delivery van comes up behind this car at 50 mph, and the Idaho driver slowly changes lanes in front of it, causing a sudden rear-end collision and two fatalities.
Likewise, if the delivery driver going 50 mph (in a 35 mph zone), who has only had two minor fender benders (and 1 traffic citation) in 5 years of driving 60k miles per year, goes to rural Idaho, what's going to happen? If he's so aggressive in this small town that he gains a bad reputation, what are the consequences? What if the AI running in small-town Idaho was trained on the vast number of miles driven by this delivery driver in a Tesla Semi in LA? The point I'm trying to illustrate is that driving is not abstract; it's not in a vacuum; and local norms vary significantly and are context-dependent.
If a new playground can be created for FSD (new roads, a town, or an area) where we all agree on the protocol for acquiring and yielding right of way, signaling intent, and setting norms and the opportunistic intensity level ("chill" vs "Mad Max" on changing lanes to get ahead), that will make sense. When all cars are FSD, will they have a communication protocol to exchange navigation and right of way / synchronization / orchestration info? What is the protocol expected to be between humans and FSD computers? If FSD cars are just thrown into the mix all over the country / world, these issues will cause friction, and probably some serious injuries or deaths will result, specifically because of this choice.
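To make the question about a car-to-car protocol a bit more concrete, here is a minimal Python sketch of how FSD cars might break the simultaneous-arrival tie at a 4-way stop. Everything in it is hypothetical: the `StopClaim` message, its fields, the `crossing_order` function, and the 250 ms tie window are names and values I made up for illustration, not anything Tesla or any standard defines. The only idea it demonstrates is that if every car sees the same set of claims and applies the same deterministic rule, they all compute the same crossing order without anyone needing to "wave."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StopClaim:
    """Hypothetical message a car broadcasts when it stops at an intersection."""
    vehicle_id: str   # assumed unique identifier for the car
    arrival_ms: int   # time the car came to a full stop, in milliseconds
    approach: str     # direction the car arrived from: "N", "E", "S", or "W"


def crossing_order(claims, tie_window_ms=250):
    """Return vehicle ids in the order the cars should proceed.

    Earlier arrivals go first. Arrivals within tie_window_ms of the first
    car in a group are treated as simultaneous and ordered by vehicle_id,
    an arbitrary but deterministic tie-break (an assumption of this sketch).
    """
    by_arrival = sorted(claims, key=lambda c: c.arrival_ms)
    ordered = []
    group = []  # cars currently considered "tied"
    for claim in by_arrival:
        # Close the current group once a car arrives clearly later.
        if group and claim.arrival_ms - group[0].arrival_ms > tie_window_ms:
            ordered.extend(sorted(group, key=lambda c: c.vehicle_id))
            group = []
        group.append(claim)
    ordered.extend(sorted(group, key=lambda c: c.vehicle_id))
    return [c.vehicle_id for c in ordered]


# Three cars stop within 100 ms of each other; a fourth arrives a second later.
claims = [
    StopClaim("car-c", 1000, "N"),
    StopClaim("car-a", 1100, "E"),
    StopClaim("car-b", 1050, "S"),
    StopClaim("car-d", 2000, "W"),
]
print(crossing_order(claims))
```

Note what this sketch leaves out: it assumes every car participates, reports honest timestamps, and receives every message. It says nothing about the harder question above, which is how a human driver in the same intersection learns what the FSD cars have agreed on.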
Note, I'm not talking about the ability of FSD to detect objects, plan routes, respond to condition changes, and follow standard routes in light to moderate traffic. Those will undoubtedly improve at a fast pace. I'm talking about the propensity of FSD to create dangerous situations due to its AI mimicking human behaviors that are not appropriate for the situation, and its inability, as of yet, to decide appropriateness, break ties, and make other complex judgments that are necessary to drive.
Well, those are my thoughts; I hope they are at least somewhat thought-provoking or illuminating. I have the utmost respect for Jim and Elon, and I believe Lex brought up some great points, which I hope I have elaborated on further in this post. Please share your thoughts, reactions, or additional knowledge in this area that can help develop the conversation further.