Nothing?
These would likely be the guys literatively testing very specific situations, and testing pre-public-release updates.
Distributing them to more places reduces some of the oft reported cases where it works great in say the bay area because that's where all their paid testers are but not elsewhere that it sees no testing until wide release.
Note the specific asks of:
attention to detail, and ability to work in a fast-paced dynamic environment.
So they might direct these folks "Test this situation, in these specific conditions, 100 times in a row" or "Test this situation, in these 10 different sets of conditions, 10 times a row" or even "Test this situation X times, then push this developer version button to switch software and retest X times, then reset with version C X times, etc...."
That's literally why safety drivers exist. So you are telling me the whole billions of miles of data is a sham? That just like they sent engineers to test Chuck's left turn for months.
You actually have to send people to test in very specific and repeated way? ZOMG!