Wow. MCU almost killed me.

I just filed NHTSA complaint number 11365827. In a nutshell: all my exterior lights just turned off while driving on a twisty, limited-access road with no shoulder (Garth Woods section of the Bronx River Parkway, for those of you in the NYC area) in the dark. Head and tail. Neither high nor low beams would come on (high beams wouldn't even "blink" when I pulled the stalk). If a driver behind me hadn't noticed and used his high beams to guide me about a mile onto the next safe shoulder, I could easily have died.

I have to say that, though Tesla's replaced my MCU once (killed early, probably because an engineer turned on extra debugging trying to root-cause a media player problem - back in the days when you could actually email JonMC and someone actually might tap an engineer on the shoulder and ask them to look at your problem!), I have always thought of the MCU1 failures as a nuisance problem with some, but minimal, safety impact. Yes, you lose the rear-view camera. You might be unable to engage the defroster. But I never expected it to plunge my car into complete darkness (and keep it that way) while I was driving. I simply could not see how this could happen, since I did not believe the MCU controlled the exterior lights.

I've been concerned my MCU (remember, this is the _replacement_, but it is several years old) was failing again, for about a month now. Screen corruption while charging, very long car start up times, weird voice control behavior, etc -- all the stuff I saw just before it failed the first time.

Today when the lights went out, even before I was able to pull over, I started an MCU reboot, then an IC reboot. The MCU took nearly 3 minutes to reboot. When both were online the IC displayed the "driver assistance features unavailable" message. Other stuff was weird -- for example, the "report" voice control function repeatedly said "That feature is not available yet" after recognizing my speech, then eventually failed further and just started displaying "Microphone calibrating" when I hit the speak button.

Lights all stayed off, would not turn back on. The hazards did work, but that was all.

It finally occurred to me to try to turn on the fog lights via the MCU. Navigating there, I saw that the "exterior lights" setting was AUTO, as I expected. The fog lights actually did turn on! I spent a minute trying to decide if I was willing to try to drive all the way home with just the fogs, then inspiration struck. What if the MCU was displaying AUTO but the actual setting -- either on the MCU or some other controller -- had been corrupted?

I moved the slider to "ON". Nothing happened. I moved the slider to "OFF". Nothing happened. Turned off the fogs to be sure -- nope, no lights. Moved the exterior lights slider back to "AUTO" and...light! Everything came back on as normal.

On the last block on my way home, at about 25MPH, I tested to see what would happen if I turned "exterior lights" off via the MCU while the car was moving at speed. Guess what? They turn off. So the MCU can, in fact, kill your head and tail lights if its notion of how they should be gets screwed up. And, it appears, this is likely what happened to me.

I have no idea how to effectively report this to Tesla. It's a severe safety issue. I could have died. But if I call service I have every expectation that they will do nothing but offer to replace my MCU at my expense, since the vehicle is now long out of warranty. Thus the NHTSA report.

If this happens to you -- since, much to my surprise, clearly it can -- and you can't immediately pull over, as I couldn't, all I can advise is to try to very quickly navigate the MCU to the "lighting" screen and turn the exterior lights off, then on again. It seems to have saved my butt and it might save yours in a much more dramatic way.
 
Tesla offers two options for this issue.

(1) MCU/1 eMMC replacement, which I think is around $350 - $400 (and covered by warranty if your vehicle is under warranty)
(2) The optional $2500 Infotainment Upgrade Package, which is NOT a warranty replacement, but an upgrade. The radio option for this, which is coming in the following months, is an extra amount on top of the $2500.

I'm just presenting the options that are available to almost all MCU/1 Model S/X out there...
 
>>Tesla offers two options for this issue.<<
Well, no. You accurately describe the options offered by Tesla for replacing a particular MCU that is exhibiting memory corruption.

However, neither one addresses what I saw tonight: that the overall control system is misdesigned or misimplemented such that memory corruption on the supposedly non-safety-critical entertainment/UI computer can plunge the entire car into darkness while driving at speed, while displaying that the lights were on - even after a reboot.

You don't describe any fix for that issue.

Until tonight, whatever Tesla's faults, I had enough faith in their safety culture and software/system engineering to think this could not happen. No more.
 
I could be wrong, but those functions (lights, HVAC, radio, etc...) are all controlled by the MCU, so the two options above are the solution.

I think what you want is for them to re-architect it to not be part of the MCU but another computer...
 
Guess you missed the memo on the battery fires and their “fix” for that.

Your safety is not their priority. Their bottom line is.

I believe the NHTSA is also investigating them for the known MCU memory issues specifically because of the type of issue you describe.
 
Can’t even give Tesla $$ if they wanted here. I’ve been requesting the infotainment upgrade since May of this year. Some parts arrived in June, and I’ve been getting emails weekly since then.

Last night I had some real frustration with MCU1. My wife was experiencing a stroke and had woken me up to take her to the hospital and wouldn’t you know it, it took several minutes to route.

I really wish either that I had had the time to think to grab my phone in the haste to get out the door, or that Tesla actually had good parts and service. The fact that it keeps getting worse really sucks.
 
Twice now I have had the MCU just throw up random panels that I cannot close. Both of these happened luckily when starting again after a short stop. Rebooting took over 10 minutes in both cases and restored all functions.

But during the events, no button presses, menu selections, or any other action would close the dialogs, and I would get two or three on top of each other.

I reported the last one with a time stamp the other day and created a service request. I am sure they are early signs of MCU going south. Service date is Nov 5th. That date is obviously suspect until diagnosis is done and part received.
 
OP, I can vouch for what you experienced. I just picked up my car yesterday after having the Tegra card replaced. When my MCU died, all vital systems like headlights and turn signals worked. You just wouldn’t hear the click click click.
HOWEVER, one night after pulling out of my driveway and signaling to turn, I noticed the green arrow did not show up on the IC. Strange I thought. Surely the actual turn signal is still working I thought? I pulled over at a dark section of the road and sure enough, exterior blinker wasn’t on. Still had full head and taillights though so kept driving to my short distance destination. I tried activating the high beams and like you, they wouldn’t engage. By the time I arrived, everything was back to normal. All of this with a dead/black screen MCU.
So, I think our experiences are extreme corner cases. Throughout all the threads about dead MCU1s, I don’t recall anyone ever reporting critical systems like headlights and signals being affected. Always been HVAC stuck on, can’t turn off PIN-to-drive, and charging fubar’d due to scheduled charging.
I assume everyone leaves their headlights on auto so when the MCU dies, it should still be stuck in auto, but clearly something went wrong in your case.
 
>>Last night I had some real frustration with MCU1. My wife was experiencing a stroke and had woken me up to take her to the hospital and wouldn’t you know it, it took several minutes to route.<<

Dreadful experience: hope she's recovered OK.
Keep safe.
 
>>I think what you want is for them to re-architect it to not be part of the MCU but another computer...<<
And that other computer can experience a similar fault. Cut the computer out of the equation? Then you'll still end up with some single point of failure (a fuse, a wire, a battery) that can cause all your lights to turn off.

Here, this was obviously an edge case in which the MCU turned off the lights while the state it held in memory said they were still on. It's a software bug, probably a logic bug.
 
>>I think what you want is for them to re-architect it to not be part of the MCU but another computer...<<
You are assuming a solution here. It doesn't have to be another computer. If the MCU is classified as safety critical, Tesla should do a proper FMEA on it (as other auto manufacturers do for safety critical parts) and redesign the MCU to handle failures safely. For example, one option could be to have some error checking for RAM and eMMC, detect failures and handle them appropriately (e.g. fail to lights ON, warn the user if possible). Tesla wants to do autonomous driving, so they ought to know how to design safety critical components. Of course safety certifying an entire MCU might be harder (good luck safety certifying a Linux-based system) than just adding a microcontroller to control the lights, but that's on Tesla to decide.
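
To make the "detect failures and fail to lights ON" idea concrete, here is a minimal Python sketch. It is purely illustrative and not how Tesla's firmware works: the record layout (one mode byte plus a CRC-32), the `LightMode` enum, and the function names are all assumptions invented for this example. The point is only that a stored setting which fails its integrity check gets treated as ON and flagged, rather than trusted.

```python
import zlib
from enum import Enum
from typing import Tuple


class LightMode(Enum):
    # Hypothetical encoding of the exterior-lights setting.
    OFF = 0
    AUTO = 1
    ON = 2


def encode_setting(mode: LightMode) -> bytes:
    """Serialize the setting as 1 payload byte followed by its CRC-32."""
    payload = bytes([mode.value])
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def decode_setting(record: bytes) -> Tuple[LightMode, bool]:
    """Return (mode, healthy).

    Any sign of corruption (wrong length, CRC mismatch, unknown value)
    yields (LightMode.ON, False): lights forced on, fault flagged so the
    driver can be warned.
    """
    if len(record) != 5:
        return LightMode.ON, False
    payload, stored_crc = record[:1], record[1:]
    if zlib.crc32(payload).to_bytes(4, "big") != stored_crc:
        return LightMode.ON, False
    try:
        return LightMode(payload[0]), True
    except ValueError:
        return LightMode.ON, False
```

For example, `decode_setting(encode_setting(LightMode.AUTO))` gives back AUTO with the healthy flag set, while a record with a flipped bit in the mode byte comes back as ON with the flag cleared.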
 
Quite. Using a simplex system, however "clever" (my quotes, yes) in a complex environment with potentially fatal failure modes will possibly kill the manufacturer that uses it.
 
>>And that other computer can experience a similar fault. Cut the computer out of the equation? Then you'll still end up with some part of single point of failure (a fuse, a wire, a battery) that can cause all your lights to turn off.<<
You don't have to "cut the computer out of the equation". It is quite possible to design safety critical systems which have proper failure modes. Check out the ISO 26262 standard and its different ASIL levels. The MCU should be appropriately classified, then designed, implemented and tested according to the automotive safety standards.
 
>>Last night I had some real frustration with MCU1. My wife was experiencing a stroke and had woken me up to take her to the hospital and wouldn’t you know it, it took several minutes to route.<<

And MCU1/HW3 FSD is supposed to work the same just fine. How, if the nav system doesn't even work?
 
I think the car should have a physical switch for headlights.

It is possible for physical switches to fail also. An open circuit in a mechanical headlight switch would leave you in the same situation. However, I believe the failure rate of a well-designed switch (most likely borrowed from M-B, as many parts have been) would be far lower than that of the MCU, whose failures (in various forms) have been rather common.
 
I've done my share of embedded systems work myself though I've never worked in automotive.

I assume a programmer / system architect, also without automotive or aerospace experience, made the same mistake in analysing this system's safety properties that I had made until it was dramatically brought home to me last night.

Sure, the MCU does not directly control the lights. It doesn't have the I/Os required for that function onboard. The computer controlling the lights is probably externally sourced, safety certified, connected to the MCU by CAN. So the MCU at first glance doesn't appear to be "safety critical" with regard to lighting.

However, the MCU replaces (or overrides) all physical switches that can tell that other safety-critical computer what to do. Again, this doesn't look like a potentially fatal flaw: if the headlights won't turn on, don't drive at night.

However, this discounts the possibility that the MCU may erroneously command that other computer while the vehicle is in motion and then, since there is no other user interface for the lights, the driver may be unable to override. In fact for this function the MCU might be "just a replacement for the knob on the dashboard" but it's actually safety critical because it's just like a knob that breaks inside and gets stuck in the electrically-off state while physically turned to the "on" position.

This had never occurred to me before. One reason is that the control systems I've worked on that have potentially life-ending consequences *have* had bugs of this nature but their rated, safety-critical components have had internal watchdogs and sanity checks that prevent the situation from escalating in this way. Evidently our cars do not.

For obviously safety-critical variables like, say, boiler pressure, the displays / user accessible controls on those systems I've worked on have also always used defensive programming techniques like polling relevant variable values from the actual safety critical low level controller and, if any discrepancy was detected, putting everything in a failsafe state. On the other end, a watchdog in the low level controller should do the same kind of safety reset if that polling isn't detected. This can be designed to MITIGATE the risk that is otherwise ACCEPTED by removing physical switches and thus inherently involving a non-redundant, uncertified component like an LCD touchscreen in the control of a dangerous physical process. And those are exactly the terms we'd discuss it in while building such a thing, anywhere I've worked on controls anyhow.
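
A rough Python sketch of that two-sided pattern, for anyone who hasn't seen it: the display polls the real state from the low level controller and forces a failsafe on any mismatch, and the controller's watchdog forces the same failsafe if the polling stops. Everything here is hypothetical and made up for illustration (the `LightController` and `DisplayUnit` classes, the 1-second timeout, "lights ON" as the failsafe state); it says nothing about how any real vehicle is built. Time is passed in explicitly so the logic is easy to follow.

```python
from enum import Enum


class LightMode(Enum):
    OFF = 0
    AUTO = 1
    ON = 2


class LightController:
    """Hypothetical low level controller that owns the real lamp state."""

    def __init__(self, heartbeat_timeout_s: float = 1.0):
        self.mode = LightMode.AUTO
        self.timeout = heartbeat_timeout_s
        self.last_heartbeat = 0.0
        self.failsafe = False  # latched once a fault has been handled

    def command(self, mode: LightMode) -> None:
        self.mode = mode

    def heartbeat(self, now: float) -> None:
        # Called whenever the display polls us.
        self.last_heartbeat = now

    def force_failsafe(self) -> None:
        # Failsafe state assumed here: lights on.
        self.mode = LightMode.ON
        self.failsafe = True

    def tick(self, now: float) -> None:
        # Watchdog: if the display has stopped polling, stop trusting it.
        if now - self.last_heartbeat > self.timeout:
            self.force_failsafe()


class DisplayUnit:
    """Hypothetical touchscreen side, which only mirrors the controller."""

    def __init__(self, controller: LightController):
        self.controller = controller
        self.displayed = controller.mode

    def set_mode(self, mode: LightMode) -> None:
        self.displayed = mode
        self.controller.command(mode)

    def poll(self, now: float) -> bool:
        """Cross-check displayed vs actual state; False means a fault."""
        self.controller.heartbeat(now)
        if self.controller.mode != self.displayed:
            self.controller.force_failsafe()
            self.displayed = self.controller.mode
            return False
        return True
```

With this arrangement, a screen showing AUTO while the controller holds OFF is caught on the next poll and the lights are forced on, and a hung display is caught by the controller's own `tick()`.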

I'd assumed Tesla was not crazy, not ill intentioned, basically competent at this stuff, and thus had gotten this kind of thing right. That illusion was definitely dispelled for me yesterday.
 
>>It was a bug. You experienced it once, and chances are you'll never experience it again. You filed an NHTSA complaint and that is it. And the bug is maybe fixed in a later software version.<<
I have to assume you've never actually worked on a life-safety system. A bug that can leave a 3-ton hunk of metal hurtling into the dark at 45MPH is a bug that isn't supposed to be able to happen. The entire system is supposed to be designed so no single bug can cause a failure of such high consequence. The big issue isn't that it did happen, it's that it even could happen.

Cars have been recalled for mechanical analogues of this failure (e.g. ignition switches that can get stuck "on", throttles that can get stuck open) many many times. Modern safety engineering practices are responsive to exactly that kind of concern and I'd assumed, as I expect most of us with embedded systems experience had, that Tesla was getting it right. Evidently in this case not so.
 
>>I have to assume you've never actually worked on a life-safety system.<<
Correct. Just like you, I am a customer of Tesla. So unless you are also an engineer at Tesla, you're limited to alerting Tesla, alerting the NHTSA, posting your one-time encounter with this bug on the internet, and not much more.

This was a bug that affected the headlights, just like a single fuse is responsible for the lights in my old analogue car, which by the way only had one 12V battery and one alternator, so if one of those things failed, I would have been left in the dark as well. Oh, not to mention the single switch to turn the lights on or off, which could also fail.

I am looking forward to the recall of... pretty much nothing? It's a software bug? Or can probably be solved with software? Because, as with my analogue cars, Tesla won't install a redundant MCU, switching, 12V battery and HV battery. So it'll probably be resolved with an OTA software update and you'll never be the wiser without insight into the individual commit logs of the code or into Tesla's bug tracking system.
 