@Knightshade ok, I'll explain again, as I'm patient. I'd also appreciate it if you'd skip the ad hominem attacks; they are rarely constructive in my experience.
The term "latency" is a general term commonly used to mean a delay before a requested operation is actually performed. There are many kinds of latencies. Hard disks, for example, have both seek latency and rotational latency. Flash of course does not suffer from either of these, and so it's internal latency is often very low, measured, as you state, in microseconds. However, it is not ALWAYS that low, for the reasons I've already stated. The internal controllers on flash have to handle a variety of tasks, some of which are quite slow (block erase for example), and, in cheaper drives, these operations are serialized with host read/write requests. That means SOME write operations are delayed for a significant time period before they are completed. This means that, while the majority of writes do indeed have microsecond latency, a small number have a
much greater delay. This is what I was referring to when I discussed
worst case latency. Note the phrase "worst case" here .. not typical, not average, WORST.
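To make the "typical vs worst case" distinction concrete, here is a tiny simulation sketch (Python, with made-up numbers; the 1% housekeeping rate and the stall durations are my assumptions, not measurements from any real drive):

```python
import random

# Toy model, not real drive data: most writes finish in ~100 microseconds,
# but an occasional one lands behind a block erase and stalls for tens of ms.
random.seed(0)

def write_latency_us():
    if random.random() < 0.01:                    # assume 1% of writes hit housekeeping
        return random.uniform(50_000, 150_000)    # 50-150 ms stall
    return random.uniform(80, 200)                # typical microsecond-class write

samples = [write_latency_us() for _ in range(100_000)]
print(f"average latency: {sum(samples) / len(samples) / 1000:.2f} ms")
print(f"worst case     : {max(samples) / 1000:.2f} ms")
```

The average comes out at around a millisecond, which looks perfectly healthy, while the worst case is two orders of magnitude larger. That gap is the whole point.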
Now let's talk about sustained write speed. What does this mean? It means that, if you measure write speed over a sufficiently long period of time, it will average to a value that indicates the speed the drive can sustain indefinitely (or at least until it wears out). It says nothing at all about the moment-to-moment actual transfer rate of the drive. The drive MIGHT be writing continually at the measured sustained rate, or it MIGHT be writing at double the sustained rate for 5 seconds, then pausing for 5 seconds (not saying this is a real-world example). Both drives have the same sustained rate, but one has a worst-case latency of zero, while the other has a worst-case latency of 5 seconds. So, same sustained write speed, but very different latencies. Real drives of course are somewhere between these extremes, and typically the SSDs have the lowest worst-case latencies and the cheaper USB flash drives the highest.
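Here is that same thought experiment as a few lines of Python. Both toy drives use numbers I've invented purely for illustration, and both report the same sustained rate, yet their worst-case stalls differ enormously:

```python
# Two toy drive profiles with the SAME sustained rate but very different stalls.
# The numbers are purely illustrative, matching the 5s-on / 5s-off example above.

drive_a = {"burst_mbps": 20, "busy_s": 10, "pause_s": 0}   # steady writer, never pauses
drive_b = {"burst_mbps": 40, "busy_s": 5,  "pause_s": 5}   # bursts, then stalls for 5 s

for name, d in (("A", drive_a), ("B", drive_b)):
    period = d["busy_s"] + d["pause_s"]
    sustained = d["burst_mbps"] * d["busy_s"] / period
    print(f"Drive {name}: sustained = {sustained:.0f} MB/s, worst-case stall = {d['pause_s']} s")
```

Both lines print 20 MB/s sustained; only the stall column tells them apart, and a benchmark that reports only sustained speed will never show it.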
So why does this matter? Because the cameras are generating a continuous byte stream of data, non-stop, without pause. This is why a buffer is needed; it acts as an elastic temporary holding pen for the data, so that video data can be poured in continually, and then sucked out in blocks if and when the flash drive is ready for it. This buffer must be sized so that it can store all the camera data while the drive is sitting at its worst-case latency (actually it would typically be double this if a ping-pong model is used). If it isn't big enough, data loss will occur, which is basically what this entire thread is about .. data loss in DashCam/Sentry.
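For what it's worth, the sizing rule that falls out of this is simple enough to write down. A rough sketch (the 4 MB/s camera rate and the stall times are figures I've picked for illustration, not Tesla's actual numbers):

```python
# Back-of-the-envelope buffer sizing; all numbers here are assumed, not Tesla's.

def buffer_needed_mb(camera_mb_per_s, worst_case_stall_s, ping_pong=True):
    """Smallest buffer that can absorb incoming video while the drive is stalled."""
    factor = 2 if ping_pong else 1   # ping-pong: fill one half while the other drains
    return camera_mb_per_s * worst_case_stall_s * factor

for stall_s in (0.05, 0.5, 2.0):     # ~SSD, decent USB stick, cheap USB stick (guesses)
    print(f"stall of {stall_s:4.2f} s -> {buffer_needed_mb(4.0, stall_s):4.1f} MB of buffer")
```

The take-away is that the buffer requirement scales directly with the worst-case stall of the drive you plug in, which is exactly why one fixed buffer size works for some drives and not others.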
Now, I'm going to speculate here. I'm guessing that Tesla are not keen on using a big buffer for DashCam. Why? Because the computers have finite resources, and there are a LOT of other contenders for those resources, many (most?) of which are higher priority than the DashCam (autopilot anyone?). So they probably took an ad-hoc approach of figuring out a buffer size that worked with a decent flash drive and left it at that. Which means, the slower cheaper drives are going to overflow that buffer at some point and BANG .. DashCam failure.
What is the fix for this? Make the buffer bigger? How much? Double it. Maybe that makes some of those cheap drives work, but not all. Quadruple it? At some point, you are going to have to decide if "Making DashCam work with cheap crappy USB drives" is more important than "map scrolling is smooth" and "new AP features". And I know where I would vote.
The fix here probably ISN'T Tesla making the system work with every possible drive. What they SHOULD have done, imho, is test with some of the better flash drive brands, then publish a recommended/tested drive list. But that's just my opinion.
As to your other points...
— Yes, I think I do know what sustained speed and worst-case latency mean. I've defined them above.
— Your sustained speed metrics, as I’ve shown twice now, are irrelevant here. Sustained speed isn't the issue.
— I’m not aware of internal USB use either, but I’d suspect SOME devices are on USB (many devices on a mainboard, audio for example, use USB interfaces even though it's only via internal signal traces). And yes, I agree that they won't use all the bandwidth, but drives usually use bulk transfers, which have very low bus priority. I don't regard this as a major source of latency, but as I stated we are looking at a perfect-storm type of model here, where each small item adds up.
— Your argument about internal house-keeping showing up in sustained write speeds is incorrect, as I have shown. In fact, by definition, sustained write speeds already take those house-keeping tasks into account. See my discussion above about sustained speed being an average over time.
— Thread starvation in this case refers to a thread being starved of CPU resource, that is, unable to run. Given the real-time nature of other critical threads I would speculate this is certainly possible, though again it's a small contributing factor. As with the buffer size, Tesla will, I suspect, prioritize driving systems over DashCam (in fact, they had better!).
— This isn’t an argument about software vs hardware. Argument 1 says the software needs to buffer enough to support ANY drive. Argument 2 says the drives should not have latency worse than a certain acceptable value. Who is right? Where do you draw the line? As I stated above, the real problem isn't finger pointing, it's that Tesla should have created a demarcation by testing known-compatible devices and then making that list available to customers.
— As for video compression rate, again you confuse average with instantaneous. Let's say Tesla decide to make the buffer big enough for ½ second of video. How much space is that? You would argue that at 2MB/sec, you need 1MB of buffer. But the compression rate for a video stream varies WILDLY with the content .. still video compresses WAY better than average, fast-moving video FAR worse. To the tune of 4x worse compression in many cases. So if you want to be sure to capture video when you hit a segment of poorly compressed data, you will need not 1MB of buffer but 4MB (see the sketch after this list). Otherwise you risk buffer overflow again and, worse, whether you overflow depends on the actual content of the video being recorded, which makes things VERY sensitive to circumstances.
— I was unaware that HW2.5 now uses H.265. However, so what?
— I don't get your point about Sentry mode. First you say it DOESN'T record anything, then you say it MOVES the data (not copies) .. “move” here usually means changing a file's location from one folder to another on a drive, which would seem to imply that the video WAS already on the drive to begin with, which contradicts your argument that it wasn't written. Eh?
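On the compression-rate point above, the arithmetic is trivial but worth spelling out (again, the 2MB/sec average and the 4x peak factor are illustrative assumptions, not measured values):

```python
# Rough arithmetic for the compression-rate point above; the average rate and
# the 4x peak factor are illustrative assumptions, not measured values.

avg_rate_mb_s   = 2.0    # assumed average compressed video rate
buffer_window_s = 0.5    # the hypothetical "half a second of video" buffer
peak_factor     = 4.0    # fast-moving scenes can compress several times worse

sized_for_average = avg_rate_mb_s * buffer_window_s
sized_for_peak    = sized_for_average * peak_factor

print(f"buffer sized for the average stream : {sized_for_average:.0f} MB")
print(f"buffer needed for a worst-case scene: {sized_for_peak:.0f} MB")
```

A buffer sized off the average bitrate is a quarter of what a busy scene can demand, which is why the failures look so content-dependent.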
As I said at the start, ad hominem attacks are rarely useful. Shouting “nonsense” with a refutation that is itself nonsense (in the technical sense of the word) seems to me poor judgement.
The purpose of my original post was to add some speculative insight into a possible root cause of the issues people are having, in the hope that it can assist others in further investigations. I do not have access to the inner workings of the Tesla software stack, hence these are speculations. I do, however, have decades of experience with real-time software and hardware .. I'm not saying that means I'm right, far from it, I'm almost certainly wrong in some of the details, but I think my speculations do align with all the observations posted here.
(My apologies to others in this thread for this absurdly long post.)