I've not experienced loss of maps for extended periods, although I have had several losses for short periods. Some of these go right back to early ownership days, when I was seeing how the car behaved on entering new areas for the first time.
I've worked with our own tile servers, including building and seeding from scratch. I am wondering if we are seeing something similar to the scenario below, with 6, 7 (and possibly 8) being recent behaviour.
1. A new tile server needs to get a large download of vector data. I believe this is what we see in the 'annual' map updates.
2. Over a period of time, this vector data is then rasterised into tiles. This can happen on demand, or an area can be pre-'seeded'/rasterised, or, most often, a bit of both (see 3).
3. Rasterised tiles are created at various zoom levels. One pre-seeding may be a country-wide set of tiles but at a fairly coarse zoom level. If you then zoom in, a higher-detail tile is created. It's often easy to see this, as a tile suddenly becomes clearer when the higher zoom level tile replaces the lower resolution proxy view. This should be pretty seamless, but is not always. I was especially aware of this just after our MCU got replaced, when I had a reference of previous MCU performance vs new MCU: it was almost as if the whole country was no longer pre-rasterised at a low zoom level.
4. These raster tiles take up significantly more space than the vector definitions, especially when held at various zoom levels. So tiles would effectively be cached, with more frequently used tiles being instantly available whilst others would have to go through the rasterisation process again.
5. Certainly as far as our MCU2 car is concerned, I have seen tile updates become much less fluid over the years, almost as if it is not caching as much tile data, maybe because MCU resources are having to be managed more tightly. This is based on several observations over the years and, more recently, on the functionality to delete some unused resources such as games, i.e. the MCU is running low on disc space.
I believe that the above is approximately what has been happening until recently, although not entirely, as I have had the very odd occasion where I lost a new tile for a few seconds. That may just have been down to an unusually slow update when a low resolution tile is replaced by a higher resolution tile.
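To make points 3 and 4 a bit more concrete, here is a rough Python sketch of a size-limited tile cache that falls back to a coarser tile when the sharp one is not cached. This is purely illustrative and says nothing about how Tesla actually does it. The names (`TileCache`, `parent`, `best_available`) are made up, and I am assuming the common XYZ tile scheme, where each step up in zoom splits a tile into four.

```python
from collections import OrderedDict


class TileCache:
    """Tiny LRU cache of rasterised tiles keyed by (zoom, x, y)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._tiles = OrderedDict()

    def put(self, key, image):
        self._tiles[key] = image
        self._tiles.move_to_end(key)
        while len(self._tiles) > self.capacity:
            self._tiles.popitem(last=False)  # evict the least recently used tile

    def get(self, key):
        if key not in self._tiles:
            return None
        self._tiles.move_to_end(key)  # a hit counts as "recently used"
        return self._tiles[key]


def parent(key):
    """Tile one zoom level coarser covering the same ground (assumed XYZ scheme)."""
    zoom, x, y = key
    return (zoom - 1, x // 2, y // 2)


def best_available(cache, key):
    """Return (key, image) for the sharpest cached tile covering `key`, else None."""
    while key[0] >= 0:
        image = cache.get(key)
        if image is not None:
            return key, image
        key = parent(key)  # fall back to a coarser proxy tile
    return None


cache = TileCache(capacity=2)
cache.put((5, 10, 12), "coarse")
# Nothing cached at zoom 7 or 6, so the zoom 5 tile is used as the proxy.
print(best_available(cache, (7, 41, 50)))
```

With a small `capacity`, tiles are evicted sooner and have to be rasterised again, which is the sort of thing that would make updates feel less fluid if the cache had been shrunk.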
6. I think what we are experiencing now is a result of recent changes in how Tesla is using map data, possibly coinciding with 'alternate route' functionality?
7. Since alternate route, Tesla now seem (documented elsewhere) to be sending regular additional map information updates, which are then incorporated into the current tile, or even back into the underlying vector data.
8. There may also be a UI change with a slightly different map view also requiring an extra set of tile data which is contributing to the issue.
If there were an error in this process, and it was not tolerant of download/update errors (i.e. poorly implemented), then I can see that you may well lose tiles until the condition had been recovered from, i.e. after multiple downloads or a reboot.
Errors, especially mobile network errors, are a fact of life. Handling those cleanly, especially ones that may sit on a long timeout before an error is flagged (TCP may try to recover from a network issue and keep hold of the connection for a significant period of time), should hopefully be a software change, possibly a tweak to the timeouts. This would certainly explain some of the extended outages reported here.
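As a minimal sketch of what "tolerant" handling could look like, the Python below uses a short timeout, a few retries with backoff, and only replaces the existing tile once a complete download has arrived. Everything here is hypothetical (`refresh_tile`, `http_fetch`, the `store` dict and the default values); it is only meant to show the idea, not what the car's software does.

```python
import time
import urllib.request


def http_fetch(url, timeout_s):
    """Fetch bytes; `timeout_s` bounds each blocking socket operation."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return response.read()


def refresh_tile(store, key, url, fetch=http_fetch,
                 timeout_s=5.0, attempts=3, backoff_s=1.0, sleep=time.sleep):
    """Try to update store[key]; on failure keep whatever tile is already there.

    Returns True if a fresh tile was stored, False otherwise.
    """
    for attempt in range(attempts):
        try:
            data = fetch(url, timeout_s)
            if data:
                store[key] = data  # swap in only after a complete download
                return True
        except OSError:
            # Covers URLError, socket timeouts and connection resets.
            pass
        if attempt < attempts - 1:
            sleep(backoff_s * 2 ** attempt)  # wait 1s, 2s, 4s, ... between tries
    return False
```

The point of the design is that a failed or stalled download leaves the old tile on screen rather than a blank one, and that a short explicit timeout stops one stuck connection from holding things up for minutes.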
Just my empirical 2p.