No, they must want to ensure it is as safe as possible; thus limited numbers of BETA testers initially.
Safety is measured in rate. Incidents per mile, hour, exposure, use, etc.
You do not achieve safety by only giving it to a limited number of people. Systems do not change their level of safety because of how many people are using them. The system is the system, and each copy of it is independent from the others. You would just have fewer instances per calendar time, but also fewer users and less benefit, so there is no net change.
You do lower overall human risk by exposing fewer people to an unknown system.
However, if you think you can estimate your level of exposure from 71 drivers over 1 month, you are either bad at statistics, or your target level of safety is remarkably low. All you are doing with 71 people is making sure it's not so atrocious that you'll immediately have bad PR if you were to release it more widely. Hence the NDA.
As a simple rule of thumb, you use the rule of three. Example:
If we have 71 drivers, and they each drive 2,000 miles on City Streets Autosteer, and none of them have accidents, that's 142,000 miles with no accidents.
You divide the 142,000 by three: roughly 47,000. You now have 95% confidence that your accident rate is no worse than one accident per 47,000 miles. But you have no idea what the actual rate is. It could be one in 47,001 miles. It could be one in 17M. In fact, you can never really know what your accident rate is until you have some accidents. But you also don't need an accident if you can accumulate enough data to get beyond your safety target.
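The rule-of-three arithmetic above can be sketched in a few lines (function and variable names are mine, just for illustration):

```python
def rule_of_three_bound(accident_free_miles: float) -> float:
    """With zero accidents observed over `accident_free_miles`, the
    one-sided 95% upper bound on the accident rate is 3/n per mile.
    Equivalently, the mean miles between accidents is at least n/3."""
    return accident_free_miles / 3.0

miles = 71 * 2000  # 71 drivers, 2,000 miles each = 142,000 miles
print(round(rule_of_three_bound(miles)))  # → 47333
```

Note the bound is a floor on the mean miles between accidents, not an estimate of the actual rate; the data is equally consistent with a much better system.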
In order for Tesla to have a good sense that their rate is 1:2.05M or better (better than a human in a Tesla with no AP, by Tesla's own numbers), Tesla needs at least 6.15M miles with no accidents. That's roughly 87,000 miles per beta tester, all on city streets Autosteer (no highways).
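Running the same rule of three in reverse gives the miles needed to demonstrate a target rate (again, names are mine; 2.05M is the per-accident mileage cited above):

```python
def accident_free_miles_needed(target_miles_per_accident: float) -> float:
    """Accident-free miles required to show, at 95% confidence via the
    rule of three, a rate of at most one accident per target mileage."""
    return 3.0 * target_miles_per_accident

total = accident_free_miles_needed(2.05e6)  # target: 1 per 2.05M miles
per_tester = total / 71                     # split across 71 beta testers
print(int(total), round(per_tester))        # → 6150000 86620
```

A single accident during the campaign resets nothing statistically, but it does break the zero-event assumption the rule of three depends on, so the bound would then come from a full binomial interval instead.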
They better get crackin', it's going to be a while.
Then, once you're done with that, you have to deal with the fact that your test was synthetic, because by your own test design you picked expert, safe drivers. Your normal user population is not this, so your data does not apply. You have no idea what the rate will be when the system is given to your broad user base. This is like testing a new drug exclusively on 18-year-old men, and then, once they show no adverse reactions, claiming it's safe for pregnant 40-year-old women.
Now, you might say that Tesla has 2,071 testers, not just 71. We know those 2,000 testers are Tesla employees. By definition, they are concentrated near Tesla sites. This means their data is less useful than truly broad, random data. You cannot have 2,000 people test autonomy in/near Fremont, and claim that you have any sense of the incident rate to use in Denver.
So basically, these 71 or 2,071 drivers tell you very little about your system if your goal is 1:2M miles. They tell you a LOT if your goal is 1:1000 miles. There's a reason aerospace uses simulation and design assurance, not real world testing to demonstrate safety. Doing it in the real world is unaffordable when your goal is 1:1B hours.