@Akikiki, nice write-up, however you do seem to be making a lot of excuses for Tesla's lack of proper design and testing.
2012-2018 S and 2016-2018 X's were still being built on tech that Tesla designed in 2010 (the eMMC chip that fails). It's not a flash drive or SD media, but its failure behavior is similar. It's going to fail after a number of writes.
In 2010, the number of P/E cycles (writes) and the MTBF for the chip were readily available to Tesla, so this is no excuse. They could have done highly accelerated life testing/profiling on the chip, estimated how many writes they needed, predicted that it would fail in 2-6 years, and implemented mitigations accordingly. I've worked with other manufacturers who do just that with eMMC and other parts, and I know of other manufacturers that have produced cars with eMMC chips in them which are designed to last 15+ years, and where a fix doesn't cost $3,000. If a technology is not suitable for automotive use, you don't just use it and blame the technology later. There are a number of mitigations Tesla could have planned had they actually done proper design, characterization and testing. For example, they could have made the eMMC easily replaceable: it's a $10 part, and a module with just the eMMC would not cost more than $20. They could have added an SD slot for log writing, or a completely separate eMMC, to reduce the number of writes. They could have used 2 eMMC chips in a RAID-0 configuration to distribute the writes. Tons of possibilities, had they actually done proper design. Or are you suggesting they did proper design, knew very well that MCUs were going to fail, and thought that a $3,600 replacement was an acceptable maintenance cost for a $100K car?
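To show what kind of estimate I mean, here is a back-of-envelope sketch in Python. Every number in it (capacity, rated P/E cycles, daily write volume, write amplification) is a made-up placeholder for illustration, not Tesla's actual figures, and the function names are my own. It also assumes ideal wear leveling, which real parts only approximate.

```python
def emmc_lifetime_years(capacity_gb, pe_cycles, host_writes_gb_per_day,
                        write_amplification=1.0):
    """Rough years until the rated endurance is used up.

    Assumes ideal wear leveling: every block gets the same number of
    program/erase cycles. All inputs are illustrative assumptions.
    """
    if host_writes_gb_per_day <= 0 or write_amplification <= 0:
        raise ValueError("write volume and write amplification must be positive")
    # Total data the flash can absorb before reaching its rated P/E count.
    total_endurance_gb = capacity_gb * pe_cycles
    # What actually hits the NAND each day after controller overhead.
    nand_writes_gb_per_day = host_writes_gb_per_day * write_amplification
    return total_endurance_gb / nand_writes_gb_per_day / 365.0


def max_host_writes_gb_per_day(capacity_gb, pe_cycles, target_years,
                               write_amplification=1.0):
    """Daily write budget that would meet a target service life."""
    if target_years <= 0 or write_amplification <= 0:
        raise ValueError("target life and write amplification must be positive")
    return capacity_gb * pe_cycles / (target_years * 365.0 * write_amplification)


if __name__ == "__main__":
    # Hypothetical part: 8 GB, 3000 rated P/E cycles, write amplification of 2.
    print(round(emmc_lifetime_years(8, 3000, 10, 2.0), 2), "years at 10 GB/day")
    print(round(max_host_writes_gb_per_day(8, 3000, 15, 2.0), 2),
          "GB/day budget for a 15-year life")
```

With those placeholder inputs the part lasts a bit over 3 years at 10 GB/day, and a 15-year life would need the write volume held to roughly 2 GB/day. The point is only that this arithmetic can be done at design time, before the first car ships.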
It's not only logging that writes to the chip. We've also heard that a number of seemingly minor or unrelated settings contribute to the early failures. Things like: not clearing trip meters, because they store enormous energy data,
It sounds a little like you are trying to blame the customer here. How is the customer, who sees 2 trip meters and an average energy consumption, supposed to know that Tesla implemented it in some crazy and/or lazy way without considering how it will kill the underlying eMMC? Keeping track of 2 counters and 2 average power consumption numbers should not take a large amount of eMMC space or a large number of writes. They could keep 2 mileage counters plus 2 cumulative energy counters in memory, and write them to eMMC every time the system is about to restart, or every midnight. 1 block write per day would provide all the functionality that the customer sees today from the trip meters.
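A minimal sketch of that idea in Python, just to show how little needs to be persisted. The class names, the `persist` callback and the units are all hypothetical. I have no knowledge of how Tesla's firmware actually stores trip data.

```python
from dataclasses import dataclass


@dataclass
class TripMeter:
    """One trip meter: two running totals held in RAM."""
    distance_km: float = 0.0
    energy_kwh: float = 0.0

    def add(self, km, kwh):
        self.distance_km += km
        self.energy_kwh += kwh

    def average_wh_per_km(self):
        # The average is derived on demand, so it never needs to be stored.
        if self.distance_km == 0:
            return 0.0
        return self.energy_kwh * 1000.0 / self.distance_km

    def reset(self):
        self.distance_km = 0.0
        self.energy_kwh = 0.0


class TripStore:
    """Accumulates in memory; touches flash only when flush() is called."""

    def __init__(self, persist):
        # persist is a hypothetical callback that writes one small record
        # to non-volatile storage (e.g. a single eMMC block).
        self._persist = persist
        self._dirty = False
        self.meters = {"A": TripMeter(), "B": TripMeter()}

    def record(self, km, kwh):
        # Called as often as you like; costs no flash writes.
        for meter in self.meters.values():
            meter.add(km, kwh)
        self._dirty = True

    def flush(self):
        # Call at midnight and before a planned restart.
        if not self._dirty:
            return False
        snapshot = {name: (m.distance_km, m.energy_kwh)
                    for name, m in self.meters.items()}
        self._persist(snapshot)
        self._dirty = False
        return True
```

However many times `record()` is called during a day of driving, one `flush()` results in one write, and a day with no driving results in none. The trade-off is that an unplanned power loss drops whatever accumulated since the last flush, which seems acceptable for a trip meter.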
When Tesla replaces the MCU, they are also replacing the center screen, the eMMC chip mounted on the Tegra board, and upgrading the car from 3G to LTE (that was previously a $500 cost). About a year ago, Tesla was charging $3600+ to replace the MCU; then, with refurbished center screens, the price dropped to $2300-2400. In Feb the first reports started appearing that the cost was now $1300 plus labor and tax. For these refurbished prices, Tesla is keeping the core (MCU and related items). Some folks want to keep the parts and sell them on eBay or someplace. Tesla is charging a core fee, because they need to recover the core to refurbish and use again.
You make it sound like they were surprised that the eMMCs were failing, then scrambled for a solution, which is consistent with a lack of proper design, characterization and testing. I hope they do a better job with the starship going to Mars, so they don't wake up halfway there and realize their components are dying because nobody bothered determining their MTBF and it turns out it's less than a full trip to Mars.
We've been collecting data on failed MCUs - reported on Tesla's forum and TMC.
Out of curiosity, who is "we" in this context and is this data available somewhere to view?