That was a fantastic post, and exactly what I would've written myself if I weren't so lazy.
But touching on the "is Tesla really benefiting from this?", "How do we test?", etc. subject...
Let's imagine a scenario where there is a small, pre-defined set of rules that must be followed when streets are built. ALL of the specs of the road, from the width of each lane to the type of asphalt used, the color of the paint, the markings... ALL of it is preset and mandatory.
Programming the navigation of these streets would be fairly straightforward. But the part that isn't so easy is the "what is the other driver going to do?" problem. That's hard.
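To make that concrete, here's a toy sketch (mine, not anything Tesla has published) of what driving code can look like when every street follows one fixed spec. A handful of hand-written rules really would cover it:

```python
# Toy sketch (mine, not Tesla's): in a world where every street follows one
# fixed spec, driving logic can be a short list of hand-written rules.
LANE_WIDTH_M = 3.7           # assumed: every lane is exactly this wide
STOP_LINE_COLOR = "white"    # assumed: stop lines are always the same paint

def steer(lateral_offset_m: float) -> float:
    """Proportional correction that re-centers the car in its lane."""
    return -0.5 * lateral_offset_m   # drift right -> steer left, and vice versa

def should_stop(marking_color: str, distance_m: float) -> bool:
    """Hard-coded rule: brake for a stop line once it is within 20 m."""
    return marking_color == STOP_LINE_COLOR and distance_m < 20.0

print(steer(0.4))                  # -0.2: gentle correction back to the left
print(should_stop("white", 12.0))  # True: stop line ahead, start braking
```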
Now add the fact that streets aren't vanilla. There are so many variations that you just can't go by a pre-programmed set of rules. What do you do for those instances? Elon et al. refer to this as "chasing the nines" or "chasing the long tail." Programming for 99% of circumstances is easy. Programming for 99.999999999999% of circumstances is hard.
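Some rough back-of-the-envelope math (all numbers made up, purely illustrative) shows why each extra "nine" hurts: the rarer the event, the more miles of driving you need to collect enough examples of it.

```python
# Back-of-the-envelope math (all numbers made up, purely illustrative):
# each extra "nine" of coverage roughly means chasing a 10x rarer event,
# and the miles needed to collect examples grow right along with it.
examples_wanted = 1_000   # assumed: samples needed to learn one rare case
for label, rate_per_mile in [("common", 1e-2), ("rare", 1e-5), ("long tail", 1e-9)]:
    miles_needed = examples_wanted / rate_per_mile
    print(f"{label}: event every {1 / rate_per_mile:,.0f} miles, "
          f"~{miles_needed:,.0f} miles to gather {examples_wanted} examples")
```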
The solution to the "what is the other guy going to do?" problem and the navigation problem? AI/NN.
But how do you program an AI? One of the Tesla AI guys (sorry, I don't remember his name) used a great example of training an AI to recognize pictures of dogs. The best way to do this is to show the AI as many different, varied pictures of dogs as possible. And the more dogs you show it, the better it becomes at recognizing dogs.
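That dog example is just standard supervised learning. Here's a minimal sketch of what "show it lots of labeled pictures" looks like in code, with PyTorch as a stand-in; a toy illustration, not Tesla's actual training pipeline:

```python
# Minimal supervised-learning sketch: guess, measure how wrong, nudge weights.
# A toy stand-in for "show it lots of labeled dog pictures", not Tesla's code.
import torch
import torch.nn as nn

model = nn.Sequential(                  # tiny classifier: dog vs. not-dog
    nn.Flatten(),
    nn.Linear(3 * 64 * 64, 128), nn.ReLU(),
    nn.Linear(128, 2),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

def training_step(images, labels):
    """One pass over a batch of labeled pictures."""
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)   # how wrong were the guesses?
    loss.backward()                         # more varied examples, better signal
    optimizer.step()
    return loss.item()

# Fake batch standing in for real labeled photos: 8 RGB 64x64 images.
images = torch.randn(8, 3, 64, 64)
labels = torch.randint(0, 2, (8,))          # 1 = dog, 0 = not a dog
print(training_step(images, labels))
```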
Tesla is using this approach for training the AI about driving: showing it as many variations of everyday occurrences as possible so that it becomes better at recognizing its world.
This same thing can be used to help with "what is the other driver going to do?" Show it many instances of that situation, and record what the other driver actually did. Repeat this millions of times, and you end up with a pretty good idea of how the average human is going to react in a given driving situation.
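In code, the dumbest possible version of that idea is just counting outcomes per situation. This is my own hedged sketch; the real thing is a neural network generalizing across situations, not a lookup table:

```python
# Hedged sketch: the simplest possible "what will the other driver do?" model,
# just tallying observed outcomes per situation. Real systems use NNs.
from collections import Counter, defaultdict

outcomes = defaultdict(Counter)   # situation -> how drivers actually behaved

def record(situation: str, what_they_did: str) -> None:
    """Log one real-world encounter and its observed outcome."""
    outcomes[situation][what_they_did] += 1

def predict(situation: str) -> str:
    """After millions of examples, the majority outcome is a decent guess."""
    return outcomes[situation].most_common(1)[0][0]

# Repeat this over millions of logged clips...
record("oncoming car, unprotected left", "yields")
record("oncoming car, unprotected left", "goes")
record("oncoming car, unprotected left", "yields")
print(predict("oncoming car, unprotected left"))   # -> "yields"
```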
This approach requires mind-boggling amounts of data.
Your car isn't wasting bandwidth when it spends the whole night uploading gigabytes of information to the Mothership. That data really is needed to teach the AI about the world it sees, and how to interpret it.
And as others have mentioned, Tesla can set triggers for the car to automatically send video clips of things they are currently working on.
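Conceptually, a trigger is just a condition pushed to the fleet that tells the car which moments to keep. Here's a hedged sketch; the Trigger class, field names, and the example campaign are entirely my invention, not Tesla's actual API:

```python
# Hedged sketch of a fleet "trigger" (all names and fields are my invention):
# the Mothership pushes a condition; the car saves a clip whenever it fires.
from dataclasses import dataclass, field

@dataclass
class Trigger:
    name: str
    condition: callable            # evaluated against each moment of driving
    clips: list = field(default_factory=list)

    def check(self, frame: dict) -> None:
        if self.condition(frame):
            self.clips.append(frame)    # queued for overnight upload

# Example campaign: "send us clips where autopilot disengaged near a cone".
trigger = Trigger(
    name="cone_disengagement",
    condition=lambda f: f["disengaged"] and "cone" in f["objects"],
)
trigger.check({"disengaged": True, "objects": ["cone", "truck"]})   # saved
trigger.check({"disengaged": False, "objects": ["cone"]})           # ignored
print(len(trigger.clips))   # -> 1
```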
So really, the "best way to test" is to just drive the car as much as you can, in as many varied situations as you can find. No need to take my word for it; Tesla has actually said this several times. I'd find an instance of it, but I'm too lazy. Go look for yourself if you're interested; I'm sure it wouldn't take long to find.
Let the car and Tesla determine what they need.
But you are not wasting your time. You really are helping, one gigabyte at a time.
One guy mentioned that there is no way they could have enough engineers to manually watch all of these clips and mark/catalog them all. That's completely correct... they can't. So they have developed a machine to teach the machine. And how accurately that teaching machine can label all this stuff is, once again, dependent on showing it millions of examples.
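Here's a hedged sketch of that "machine to teach the machine" idea, often called auto-labeling or pseudo-labeling. The toy_labeler below is a stand-in for a trained network, not anything real:

```python
# Hedged sketch of auto-labeling (sometimes called pseudo-labeling); the
# toy_labeler is a stand-in for a trained network, not anything real.
def auto_label(labeler, clips):
    """Have the labeler model tag each clip, keeping its confidence score."""
    return [(clip, *labeler(clip)) for clip in clips]

def toy_labeler(clip):
    """Pretend model: returns (label, confidence) for a clip description."""
    if "octagon" in clip:
        return ("stop_sign", 0.97)
    return ("unknown", 0.30)

print(auto_label(toy_labeler, ["red octagon on a pole", "blurry blob"]))
# -> [('red octagon on a pole', 'stop_sign', 0.97),
#     ('blurry blob', 'unknown', 0.3)]
```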
I'm no expert by any means, but IMO, this is going to require YEARS' worth of both data and manual labor. A person is going to need to teach it manually until the machine becomes accurate enough to label stuff without human help. Once that goal is reached, things will really take off, as (obviously) a machine can scrub through videos and label stuff far, far faster than a person could.
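And a hedged sketch of that human-in-the-loop phase (the threshold and data are made up): trust the machine's label only when it's confident, and send everything else to a person. As the labeler improves, fewer clips fall below the cutoff, which is exactly the "things will really take off" moment.

```python
# Hedged sketch of the human-in-the-loop phase (threshold and data made up):
# trust the machine's label only when confident; route the rest to people.
CONFIDENCE_THRESHOLD = 0.95   # assumed cutoff, purely illustrative

labeled_clips = [                        # (clip, machine label, confidence)
    ("clip_001", "stop_sign", 0.99),
    ("clip_002", "road_debris", 0.41),
]

def route(labeled):
    accepted, human_queue = [], []
    for clip, label, confidence in labeled:
        if confidence >= CONFIDENCE_THRESHOLD:
            accepted.append((clip, label))   # machine label trusted as-is
        else:
            human_queue.append(clip)         # a person still labels this one
    return accepted, human_queue

accepted, human_queue = route(labeled_clips)
print(accepted)      # -> [('clip_001', 'stop_sign')]
print(human_queue)   # -> ['clip_002']
```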
All of the above is taken from things that Tesla has already stated. This is just the TL;DR version of all of it.