Words of HABIT
Now I don't have any background in the behavior of aquatic mammals, but, does anyone know if whale song will attract other whales?
Oh, you got it backwards. They have no intention of prosecuting the shorting hedgies. They are attempting to prosecute the WallStreetBets crowd for "collusion" in squeezing the over-shorted stock; they just included Burry because he was pumping GameStop even before it was picked up by WallStreetBets.
I can't imagine why they'd need to subpoena anybody; the whole thing happened right out in the open. The Interactive Brokers guy came right on CNBC and explained what they did and why.
What facts need to be uncovered here? I guess they just need to give the appearance of an investigation, as if they'd ever prosecute these shorting hedgies or MM's.
Holy smokes we’re on schedule!
What else does Lora K. have to do on a Friday night than to come up with something negative like this:
View attachment 713891
Hilarious that anyone cares about waiting an additional 24 hours when people have been waiting years for this lol... but I still expect to see plenty of Twitter whining and complaining.
Anyone waiting for the button may not want to wait just to be disappointed... this is from an autopilot SW engineer:
Chuck, in a podcast with Emmet, said that V10 was probably V9.3 but Elon didn't want to go back on his word. He checked with many other testers and couldn't find the gigantic improvement Elon was alluding to; it just felt like a 0.1 upgrade. Now the AP engineers are saying 10.1 has massive changes, which is kind of odd when it's just a .1 release.
I have a hunch 10.1 will be a bigger jump than it was from V9 to V10, since it sounds like some important things were held out of V10 that needed more time to cook in the oven.
Good points, thanks jhm. I have a few more questions.
Someone should have warned you that you are arguing with a seasoned PhD-level statistician.
Yes, point alternative hypotheses are a thing. And no regulator is requiring that Tesla prove that it is 10X safer than average. Tesla may engineer to that level. So it is a valid alternative hypothesis, but not a relevant null hypothesis for regulatory work.
Finally, there are stratification and weighting methods to overcome your objections about the representativeness of the test exposure. For example, auto fatality rates do vary by state, and vehicle miles can be stratified or weight-adjusted by state so as to remove any bias created by the state distribution of test miles. Much finer analyses are quite doable.
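For anyone who wants to see what the state-level weighting looks like in practice, here is a minimal Python sketch of direct standardization. Every number in it (the three states, their test miles, event counts, and shares of overall vehicle miles) is made up purely for illustration, and the function names are my own; it is a sketch of the general idea, not anyone's actual methodology.

```python
# Hypothetical strata, for illustration only. "vmt_share" is each state's
# assumed share of the vehicle miles in the population we want to represent.
STRATA = {
    "State A": {"test_miles": 4e8, "events": 2, "vmt_share": 0.2},
    "State B": {"test_miles": 1e8, "events": 1, "vmt_share": 0.5},
    "State C": {"test_miles": 1e8, "events": 0, "vmt_share": 0.3},
}

def crude_rate_per_100m(strata):
    """Pooled events per 100M miles, ignoring where the miles were driven."""
    total_events = sum(s["events"] for s in strata.values())
    total_miles = sum(s["test_miles"] for s in strata.values())
    return total_events / total_miles * 1e8

def standardized_rate_per_100m(strata):
    """Weight each state's own rate by its share of population miles."""
    return sum(
        s["vmt_share"] * s["events"] / s["test_miles"] * 1e8
        for s in strata.values()
    )

crude = crude_rate_per_100m(STRATA)
standardized = standardized_rate_per_100m(STRATA)
print(f"crude rate:        {crude:.2f} per 100M miles")
print(f"standardized rate: {standardized:.2f} per 100M miles")
```

In this made-up example the test miles are concentrated in State A, so the crude rate understates what the rate would be if the miles were spread the way the population's miles are. The weighting corrects for that.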
Tesla is still recovering from its March correction. We are still well off 900 despite a 1 billion profit in Q2 and a possible 2 billion profit in Q3.
CLF fell 10% on Monday. Market is picking and choosing what it wants to fall past 5%.
TSLA isn't and shouldn't be one of those 10% fail stocks.
Rather than relying on MM's to prop the stock, how about we prefer long funds, investing, buying and holding?
I had 715 short puts expiring this week which I rolled up to 740 last Friday. Those took a huge bath on Monday and only green today.
I could derisk it but I'll put my money where my mouth is by holding.
If you are so confident in a 750 top, you should be shorting.
Saying "I told you so" at 4:00PM on Friday when price is 749.99 or 700 with no position doesn't count.
Here's my real-time position as of 09/22/2021 - 12:00PM EST:
View attachment 712729
Let's see your shorts.
10.1 is the one which has single stack for both city streets and highway. It's actually a very big deal.
It also sounds like new abilities are going to be introduced, including the ability for the car to go into reverse, which is a big deal for getting out of complex situations.
Good points, thanks jhm.
I have a few more questions.
1) If 6 billion miles is an order of magnitude more data than they need, do you have a guess as to why Elon said "We expect that worldwide regulatory approval will require something on the order of 6 billion miles"? I am still hung up on the fact that a 6 billion mile basic Poisson test with 0 fatalities is precisely enough to show 10x better than humans at 99.9% confidence. Maybe I am inferring too much, but if so that's a crazy coincidence. Maybe Elon was expecting to need a few hundred million miles of data from each of several nations or global regions, instead of just extrapolating from data in regions like the USA, China, and W Europe where lots of Teslas are currently collecting data?
2) Is the weighting and stratification affected by potential neural net overfitting, or by the differences in the failure modes of human drivers vs FSD? For instance, we've seen that FSD has a much larger differential in performance than a human would have between highway vs city, and Silicon Valley vs North Carolina. Another example is that autonomous vehicles perform better if a highway is straight and boring, whereas humans perform better on highways with some curves to keep their attention on the road. Since the risk factors have different effects on FSD than on humans, would relying on actuarial data from human drivers to do the weighting & stratification still work? Or would this be done by just doing analysis on the FSD data set instead?
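The "crazy coincidence" in question 1 can be checked with a few lines of Python. This is only a sketch of the zero-fatality Poisson argument: it assumes a human baseline of 1.2 deaths per 100M miles (so that "10x better" means 0.12), and the function name is my own.

```python
import math

def miles_for_zero_event_test(rate_per_100m, confidence):
    """Miles of exposure at which observing zero events would be this
    unlikely if the true rate were rate_per_100m.

    With a Poisson count, P(0 events) = exp(-rate * miles / 1e8), so we
    solve exp(-rate * miles / 1e8) = 1 - confidence for miles.
    """
    alpha = 1.0 - confidence
    return -math.log(alpha) / rate_per_100m * 1e8

# Assumed target: 10x better than a 1.2 per 100M human baseline.
miles_needed = miles_for_zero_event_test(rate_per_100m=0.12, confidence=0.999)
print(f"{miles_needed / 1e9:.2f} billion miles")
```

Under these assumptions the answer lands a little under 6 billion miles, which is why the number looks suspicious.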
I don't know why people laugh & disagree here.
I think other AI systems will quickly have FSD capabilities once Tesla releases their FSD. They could directly copy the "weights" in Tesla's neural nets, or use Tesla FSD in supervised training: generate lots of simulations and train their own systems to behave like Tesla's by hacking the simulation videos into the Tesla camera feeds.
Outstanding response! Thank you. I am convinced now. Frankly, this was the best post I can ever remember having seen on this forum.
1) Different jurisdictions or nations will likely require testing to be done within their own road systems. They may even require separate NN fitting and hand coding. So if 400M to 800M miles are required per jurisdiction, you could easily need on the order of 6B miles across some 10 or so different jurisdictions. I doubt that Musk would be construing this as one giant test in which Tesla must demonstrate zero fatalities over the course of 6B miles. Indeed, as your own calculation has shown, at 0.12 deaths per 100M miles (implying 7.2 expected fatalities), Tesla would have less than a 0.1% chance of being able to go 6B miles with 0 fatalities. In practical terms, this is a nigh-impossible test, doomed to failure. Maybe it would help to follow the implications of setting 7.2 fatalities as your proposed null hypothesis. The test statistic, the actual number of fatalities over 6B miles, is Poisson distributed with mean 7.2 under the null hypothesis. It also has a standard deviation of 2.68 = sqrt(7.2). With probability about 95%, the test statistic will be between 3 and 13 fatalities. So I think the test you are really trying to set up is that you reject this null hypothesis if there are 2 or fewer deaths or 14 or more. In this case, we are talking about a two-sided test. So it is not really clear why Tesla would ever need to show that its fatality rate is so much lower than 0.12 per 100M miles that they'd be able to reject this 0.12 rate. If Tesla merely wants to demonstrate that it is safer than a human driver, it would be better just to estimate the rate and provide a confidence interval. The hypothesis testing framework is not really the best framing for that public communications objective. Rather, hypothesis testing is relevant when getting approval from a governing body; moreover, that body is likely to state what rate to test against. That is, the regulator sets the null hypothesis, while Tesla needs to supply sufficient information to reject that regulatory hurdle.
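The numbers in point 1 are easy to reproduce. Here is a small sketch using only the Python standard library; the helper names are mine, and the only inputs are the 0.12 per 100M rate and the 6B miles from the discussion.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson variable with the given mean (mean > 0)."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def poisson_cdf(k, mean):
    """P(X <= k); returns 0.0 for negative k."""
    if k < 0:
        return 0.0
    return sum(poisson_pmf(i, mean) for i in range(k + 1))

RATE_PER_100M = 0.12   # deaths per 100M miles under the proposed null
TEST_MILES = 6e9       # total test exposure

expected_fatalities = RATE_PER_100M * TEST_MILES / 1e8
std_dev = math.sqrt(expected_fatalities)
p_zero_fatalities = poisson_pmf(0, expected_fatalities)
p_between_3_and_13 = (
    poisson_cdf(13, expected_fatalities) - poisson_cdf(2, expected_fatalities)
)

print(f"expected fatalities: {expected_fatalities:.1f}")
print(f"standard deviation:  {std_dev:.2f}")
print(f"P(0 fatalities):     {p_zero_fatalities:.5f}")
print(f"P(3 <= X <= 13):     {p_between_3_and_13:.3f}")
```

The zero-fatality probability comes out below 0.1%, and the 3 to 13 range carries roughly 95% of the probability, so rejecting at 2 or fewer or at 14 or more is the two-sided test described above.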
2) The weighting and stratification I was writing about was in reference to testing for regulatory approval. Collecting data for training FSD is actually a much more complex task to do well. Certainly weighting and stratification can play a role in training a model, but that was not my point. Indeed, for training you want to be sure that you have broad data on the full scope of driving conditions, but you also want to oversample certain data where critical and rare events happen. For example, you will likely want to oversample collisions, especially collisions involving injuries and fatalities. Tesla is even using simulations of critical collisions to augment the data and revisit each scenario under varying conditions. Basically, Tesla wants to learn as much as possible from each critical event so that FSD will never make mistakes again in such scenarios. So these are the sorts of considerations for curating training data.
Regulators will likely be interested in how Tesla curates training data and can do analysis on how representative the coverage is, the quality of the data, and many other issues. This is data review, but it is not road testing. For road testing, the regulator will likely want data on where and when the miles were driven and much more detailed information on critical events: collisions, injuries, fatalities, etc. They will analyze the data for representativeness and consider any weighting methodology deployed. The test data may well be segmented. For example, Tesla may require a certain number of beta testers from each state. State or city segmentation helps assure representativeness. Time of day, day of year, and weather conditions are other factors that may call for weighting or segmentation. So the regulator will need to be persuaded that the test exposure miles are sufficiently representative and have adequate coverage. They will also want to analyze the critical event data with special attention to any factors that may reveal a weakness or flaw in the driving system.
But after all that work, the regulators are confronted with the final counts of types of critical events. The regulators will want data to show that the true frequencies of certain outcomes are far enough below a critical threshold that random error can be ruled out. This is where hypothesis testing comes in. The regulator may say to Tesla, you need to demonstrate that your fatality rate is below 1.2 per 100M miles. That's the null hypothesis. And Tesla might know that it is likely functioning at or below 0.12 per 100M. So 0.12 is the alternative hypothesis they want to optimize around. The regulator doesn't care what the alternative hypothesis is. But for Tesla it gives them the basis for planning how many test miles of exposure they will want to accumulate before they present their results to the regulators. The power of the test is extremely important to Tesla, as they want a high probability that they will be able to submit enough data to pass the test, assuming their alternative hypothesis. But it is the regulator who cares about the significance of the test, as they want there to be a low likelihood that they are fooled by statistical error. Type I error is the error that the regulator wants to avoid, while Type II is the error that the regulated entity wants to avoid.
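To make the significance/power distinction concrete, here is a rough Python sketch of a one-sided exact Poisson test. It assumes the regulator's threshold of 1.2 per 100M miles as the null, Tesla's 0.12 as the alternative, and a 5% significance level; the 5% level, the exposure figures, and the function names are my own assumptions for illustration.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson variable with the given mean (mean > 0)."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def poisson_cdf(k, mean):
    """P(X <= k); returns 0.0 for negative k."""
    if k < 0:
        return 0.0
    return sum(poisson_pmf(i, mean) for i in range(k + 1))

def critical_count(null_mean, alpha):
    """Largest count c with P(X <= c | null) <= alpha, or -1 if none exists.

    Observing c or fewer events rejects the null (passes the test)."""
    cumulative, k = 0.0, 0
    while True:
        cumulative += poisson_pmf(k, null_mean)
        if cumulative > alpha:
            return k - 1
        k += 1

def power(miles, null_rate, alt_rate, alpha=0.05):
    """Probability of passing the test if the true rate is alt_rate."""
    c = critical_count(null_rate * miles / 1e8, alpha)
    return c, poisson_cdf(c, alt_rate * miles / 1e8)

# Rates are per 100M miles; exposures are hypothetical planning scenarios.
c_small, power_small = power(5e8, null_rate=1.2, alt_rate=0.12)
c_large, power_large = power(1e9, null_rate=1.2, alt_rate=0.12)
print(f"500M miles: pass with <= {c_small} fatalities, power {power_small:.3f}")
print(f"1B miles:   pass with <= {c_large} fatalities, power {power_large:.3f}")
```

The significance level (alpha) is what protects the regulator against a Type I error; the power is what tells Tesla how likely it is to pass if its own estimate of the rate is right.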
Now suppose that Tesla makes significant advances in training its FSD neural nets, enough to convince them that their fatality rate is at or below 0.06. In this situation, Tesla could choose 0.06 as their alternative hypothesis. What happens here is that for the same amount of test exposure, the power of their test goes up: they have a higher chance of passing the regulator's test. Or put another way, they could proceed with less data and still have sufficient power. The implication here is that the choice of alternative hypothesis for Tesla drives how much exposure data they need. This is why I put the emphasis on Tesla engineering a better FSD system. The better it truly is, the less exposure data will be required to show that it can reject the null hypothesis posed by the regulator. This means Tesla can get through regulatory testing faster and cheaper. So how does Tesla engineer a safer FSD system? Primarily, it must do a damn careful job of curating training data. Any lack of vital experience in the training data exposes Tesla to incremental risk in beta testing (and full public release).
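Here is a sketch of how the choice of alternative hypothesis drives the required exposure. It searches a grid of mileages for the smallest one that reaches a target power under the same one-sided Poisson test. The 5% significance level, the 95% target power, the 10M-mile grid step, and all function names are assumptions of mine for illustration.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson variable with the given mean (mean > 0)."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def poisson_cdf(k, mean):
    """P(X <= k); returns 0.0 for negative k."""
    if k < 0:
        return 0.0
    return sum(poisson_pmf(i, mean) for i in range(k + 1))

def critical_count(null_mean, alpha):
    """Largest count c with P(X <= c | null) <= alpha, or -1 if none exists."""
    cumulative, k = 0.0, 0
    while True:
        cumulative += poisson_pmf(k, null_mean)
        if cumulative > alpha:
            return k - 1
        k += 1

def min_miles_for_power(null_rate, alt_rate, alpha=0.05, target_power=0.95,
                        step=10_000_000, max_miles=2_000_000_000):
    """Smallest grid mileage whose test reaches target_power, else None.

    Power is not monotone in miles because counts are discrete, so this
    returns the first grid point that qualifies, not a guarantee for all
    larger exposures."""
    for miles in range(step, max_miles + 1, step):
        c = critical_count(null_rate * miles / 1e8, alpha)
        if poisson_cdf(c, alt_rate * miles / 1e8) >= target_power:
            return miles
    return None

# Null rate 1.2 per 100M miles; two candidate alternatives from the text.
miles_at_012 = min_miles_for_power(null_rate=1.2, alt_rate=0.12)
miles_at_006 = min_miles_for_power(null_rate=1.2, alt_rate=0.06)
print(f"alternative 0.12: {miles_at_012 / 1e6:.0f}M miles")
print(f"alternative 0.06: {miles_at_006 / 1e6:.0f}M miles")
```

Under these assumptions the safer system (0.06) reaches the target power with noticeably fewer test miles than the 0.12 system, which is the point about engineering a better FSD being the cheapest route through testing.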
One other issue: suppose the regulators will be testing multiple outcomes. Say they require demonstration that the fatality rate is significantly below 1.2 per 100M miles, bicycle collisions are significantly below 10 cases per 100M miles, and pedestrian collisions are significantly below 6 cases per 100M miles. Now the power calculations become much more difficult. You need to have enough miles of exposure to have a very good chance of passing all three tests. So this multi-test situation could push Tesla to do more test miles than the fatality test alone would call for. This also could help explain question 1. Some jurisdictions might require multiple outcomes to be tested, and that could drive up the required sample size. Indeed, it looked like the NHTSA was going on a fishing expedition, just looking for any outcome that might be higher than average. If a regulator aggressively pursues finding any fault, they will likely succeed. And no amount of data would have an adequate chance of passing all the tests. But this is veering off into a hostile political situation, not good statistical or regulatory practice. At any rate, the point here is that if regulators will be testing many end points, that can drive up the amount of exposure miles needed for regulatory approval. But again, even if that is what the regulators are demanding, the best strategy for Tesla is simply to work on improving FSD along every conceivable test dimension, and to do that Tesla will need to curate substantial training data on every conceivable misstep a driver could make.
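A rough sketch of the multi-test problem follows. The null rates are the three thresholds mentioned above; the alternative rates for bicycle and pedestrian collisions are hypothetical values I picked, and the joint power is computed by treating the three counts as independent, which is a simplifying assumption (a single crash can count toward more than one end point).

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson variable with the given mean (mean > 0)."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def poisson_cdf(k, mean):
    """P(X <= k); returns 0.0 for negative k."""
    if k < 0:
        return 0.0
    return sum(poisson_pmf(i, mean) for i in range(k + 1))

def critical_count(null_mean, alpha):
    """Largest count c with P(X <= c | null) <= alpha, or -1 if none exists."""
    cumulative, k = 0.0, 0
    while True:
        cumulative += poisson_pmf(k, null_mean)
        if cumulative > alpha:
            return k - 1
        k += 1

# (null rate, assumed true rate), both per 100M miles.
# The assumed true rates for the two collision end points are hypothetical.
END_POINTS = {
    "fatalities": (1.2, 0.12),
    "bicycle collisions": (10.0, 5.0),
    "pedestrian collisions": (6.0, 3.0),
}

def joint_power(miles, alpha=0.05):
    """Per-test power and the chance of passing every test at once,
    assuming the end points are independent (a simplification)."""
    powers = {}
    for name, (null_rate, alt_rate) in END_POINTS.items():
        c = critical_count(null_rate * miles / 1e8, alpha)
        powers[name] = poisson_cdf(c, alt_rate * miles / 1e8)
    return powers, math.prod(powers.values())

powers, joint = joint_power(1e9)
for name, p in powers.items():
    print(f"{name}: power {p:.3f}")
print(f"all three together: {joint:.3f}")
```

The chance of passing everything can never exceed the weakest individual test, so the end point with the least headroom ends up dictating how many miles are needed.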
Just to get through regulatory approval, to pass all the stated and unstated tests, Tesla may well need FSD to be 10 times better than human drivers.