Welcome to Tesla Motors Club

Discussion of statistical analysis of vehicle fires as it relates to Model S

So the average rate is 2.9 fires per 1,000 collisions, and there have been 472 collisions for Teslas (where did you guys get this number?). If we make the hypothesis that the Model S has exactly the same affinity for collision fires, then the expected number of fires for that many collisions is 1.3688. The probability distribution for the observed events is then:

0 -> 25.4%
1 -> 34.8%
2 -> 23.8%
3 -> 10.9%
4+ -> 5%

As I have stated multiple times, the count is too low to draw conclusive results. Based on this expectation, the Model S is perfectly compatible with observation, because there is a 15.9% probability of seeing three or more fires for this number of collisions.
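The distribution quoted above is just the Poisson pmf with mean 472 × 2.9/1000 = 1.3688. A minimal stdlib-only Python sketch (not from the original post) that reproduces the percentages:

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 472 * 2.9 / 1000  # expected fires under the ICE rate: 1.3688

probs = {k: poisson_pmf(k, lam) for k in range(4)}
probs["4+"] = 1 - sum(probs.values())  # everything above 3 lumped together

for k, p in probs.items():
    print(f"{k} -> {p:.1%}")
```

Running it prints 25.4%, 34.8%, 23.8%, 10.9% and 5.0% for 0, 1, 2, 3 and 4+ fires, and P(3) + P(4+) gives the 15.9% chance of three or more fires mentioned above.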

Now what we can exclude at the 95% confidence level is that the Model S has a collision-fire rate below 1.73 fires / 1,000 collisions (that would mean 0.818 expected fires for 472 collisions, which gives a 5% chance of observing 3 or more fires). That's about all we can conclude, and even that conclusion is wrong 1 out of 20 times.

So assuming the input information is correct (2.9 fires per 1,000 collisions on average, 472 Model S collisions, 3 fires), the maximum I'm willing to conclude is that at the 95% confidence level we can exclude the claim that the Model S is 1.7x (or more) less likely to catch fire after a collision than ICE cars. That is the maximum we can conclude at this point. As you can see, we can't exclude that the Model S is, for example, 1.5x less likely to catch fire, nor can we exclude 1x, beyond which the Model S would start to be more prone to fires than ICEs.
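The 1.73 fires / 1,000 collisions bound can be reproduced by bisecting for the Poisson mean at which 3 or more fires becomes a 5% tail. A stdlib-only sketch (my own, not from the post):

```python
import math

def p_at_least(n, lam):
    """P(X >= n) for Poisson(lam), via the complement of the lower tail."""
    return 1 - sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(n))

# Bisect for the mean lam at which P(X >= 3) crosses 5%.
lo, hi = 0.0, 5.0
for _ in range(60):
    mid = (lo + hi) / 2
    if p_at_least(3, mid) < 0.05:
        lo = mid
    else:
        hi = mid

lam_95 = (lo + hi) / 2       # ~0.818 expected fires in 472 collisions
rate = lam_95 / 472 * 1000   # ~1.73 fires per 1,000 collisions
ratio = 2.9 / rate           # ~1.7x, the exclusion quoted above
```

Any hypothetical rate below `rate` would make 3 observed fires a less-than-5% fluke, which is exactly the exclusion stated above.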

Now if someone refutes the input data, then we can redo the exercise.
 
Mario:
If the number of collisions and the fires per 1000 collisions are US-numbers, you are only allowed to take the 2 US-fires into account. You'd need to make a separate calculation for Mexico or you'd need to make a World-wide calculation.
Furthermore, one can argue the 2 fires are not necessarily counted as a "collision" in the fires-per-1,000-collisions statistic. That depends on how that statistic is calculated. Is driving over a nail and getting a puncture a collision? Is driving over a wooden beam and losing your exhaust a collision? Is hitting a tree a collision? Or is it only a collision when you hit another vehicle? What is a collision in that statistic anyway?

Can someone give a source of the numbers? It would be nice to know where they come from and understand better what they mean (A real reference, not just "from the NHTSA", please.)
 
luvb2b, others have unsuccessfully tried to explain that 3 fires are not a very dependable estimator for the mean. Sure, it's the best unbiased estimator we can get, but the standard deviation is 1.73 fires with 10,820 car-years [sqrt(np(1-p)) where p = 3/10820 and n = 10820] using your binomial distribution of events.

The standard deviation is the measure of dispersion of outcomes, which right now is 1.73/3 = 58% of the observed fires.

It's easy to see the difference in robustness when we have 30 fires: assume the mean rate is the same 3/10820 fires per car-year, but with ten times as many observed car-years. With 30 observed fires our standard deviation is sqrt(108200*(3/10820)*(1-3/10820)) = 5.48

5.48/30 = 18% of observed fires

This measure is in fact the inverse of the signal-to-noise ratio, the ratio of useful information to irrelevant data.

One fire more or less doesn't have a big impact on our estimations when we have 30 fires but right now we get huge swings in our estimates and hence sensitivity to outliers. This will of course make any predictions into the future just as sensitive to the same volatility.
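The two standard-deviation ratios above can be checked with a couple of lines of Python (a sketch of the binomial formula quoted in the post):

```python
import math

def binom_sd(n, p):
    """Standard deviation of a Binomial(n, p) count: sqrt(n*p*(1-p))."""
    return math.sqrt(n * p * (1 - p))

p = 3 / 10820                  # estimated rate: 3 fires in 10,820 car-years

sd_now = binom_sd(10820, p)    # ~1.73 fires; 1.73/3 observed fires ~ 58%
sd_10x = binom_sd(108200, p)   # ~5.48 fires; 5.48/30 observed fires ~ 18%
```

The absolute spread grows with more data, but the relative spread shrinks like 1/sqrt(n), which is why ten times the car-years cuts the 58% down to 18%.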

In probability theory you can explain everything if you have the probability distribution, the mean and the standard deviation. In reality you never have any of them, but the formulas swallow our estimated variables like god-given truths and spit out 95% certainties :)

I did like your work but we do not have enough data yet for any robust conclusion.
 
I took a look at some statistical studies and found that with a number of 432 incidents (collisions) a stochastic calculation can already be done. Of course, being a statistical calculation, it would be better to have more data. But for the time being the probability of 2.11 fires / 1,000 collisions for the Model S has a mathematical grounding IMO, also according to the literature in this field.
On this point I would note that the difference of almost one unit from the same calculation for ICE cars (2.9) should not be undervalued, considering that such a statistical comparison is being made between a very consolidated technology (ICE) and a completely new technology (Tesla). In fact, in Tesla's case there are big margins for improvement in the coming years that will surely make this 2.11 figure decrease.
Finally, I would point out that collisions and road-debris accidents should be considered different kinds of accidents, for which separate statistical calculations should be done IMO.
 
luvb2b, others have unsuccessfully tried to explain that 3 fires are not a very dependable estimator for the mean. Sure, it's the best unbiased estimator we can get, but the standard deviation is 1.73 fires with 10,820 car-years [sqrt(np(1-p)) where p = 3/10820 and n = 10820] using your binomial distribution of events.

The standard deviation is the measure of dispersion of outcomes, which right now is 1.73/3 = 58% of the observed fires.

Not going to quote the rest, but at such low statistics the one-standard-deviation interval is not symmetric, because the Poisson distribution is asymmetric. So you'd be better off computing confidence intervals with the same probability in each tail (i.e. 16% on either side of the mean gives you the 1-sigma interval, 2.5% on either side gives the central 95% CL, etc.). That's why I picked a specific hypothesis and showed a 95% CL exclusion that used it as an upper bound. Anyway, for statistics we need more fires and more car-years; for investing we need fewer fires, so let's invest :)
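The equal-tail intervals described here can be computed with a small bisection. A stdlib-only sketch (mine, not from the post) for 3 observed fires, showing how lopsided the band is around 3:

```python
import math

def poisson_cdf(n, lam):
    """P(X <= n) for Poisson(lam)."""
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(n + 1))

def bisect(f, target, lo=0.0, hi=50.0, increasing=True):
    """Solve f(lam) = target by bisection, for monotone f on [lo, hi]."""
    for _ in range(80):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def poisson_central_interval(n_obs, tail):
    """Equal-tail interval for the Poisson mean, given n_obs observed events:
    lower end has P(X >= n_obs) = tail, upper end has P(X <= n_obs) = tail."""
    lam_lo = bisect(lambda lam: 1 - poisson_cdf(n_obs - 1, lam), tail, increasing=True)
    lam_hi = bisect(lambda lam: poisson_cdf(n_obs, lam), tail, increasing=False)
    return lam_lo, lam_hi

lo68, hi68 = poisson_central_interval(3, 0.16)   # ~1-sigma band, roughly (1.4, 5.9)
lo95, hi95 = poisson_central_interval(3, 0.025)  # central 95% band, roughly (0.6, 8.8)
```

Note how far the upper ends sit from 3 compared to the lower ends: a symmetric "3 ± 1.73" band would misrepresent both tails.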

- - - Updated - - -

Mario:
If the number of collisions and the fires per 1000 collisions are US-numbers, you are only allowed to take the 2 US-fires into account. You'd need to make a separate calculation for Mexico or you'd need to make a World-wide calculation.
Furthermore, one can argue the 2 fires are not necessarily counted as a "collision" in the fires-per-1,000-collisions statistic. That depends on how that statistic is calculated. Is driving over a nail and getting a puncture a collision? Is driving over a wooden beam and losing your exhaust a collision? Is hitting a tree a collision? Or is it only a collision when you hit another vehicle? What is a collision in that statistic anyway?

Can someone give a source of the numbers? It would be nice to know where they come from and understand better what they mean (A real reference, not just "from the NHTSA", please.)

I was working with a worst-case scenario :) If we take 2 fires, then the 95% CL exclusion is of the claim that the Model S is 3.8x less likely to catch fire. And the probability of observing two or more events is ~40%. Again, the inputs need checking, but with those inputs that's what we get.
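The 2-fire version of the exclusion can be checked the same way as the 3-fire one: bisect for the Poisson mean at which 2 or more fires becomes a 5% tail (a sketch, assuming the same 1.3688 expected fires for 472 collisions):

```python
import math

def p_at_least(n, lam):
    """P(X >= n) for Poisson(lam)."""
    return 1 - sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(n))

lam_expected = 472 * 2.9 / 1000              # 1.3688 expected fires at the ICE rate

# Bisect for the mean at which P(X >= 2) crosses 5%.
lo, hi = 0.0, 5.0
for _ in range(60):
    mid = (lo + hi) / 2
    if p_at_least(2, mid) < 0.05:
        lo = mid
    else:
        hi = mid
lam_95 = (lo + hi) / 2                       # ~0.355 expected fires

ratio = lam_expected / lam_95                # ~3.8x, the exclusion quoted above
p_two_or_more = p_at_least(2, lam_expected)  # ~40%
```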
 
@Chrisdl
@Bearman
@Mario...

Thank you!

You simply can't be making, or suggesting, any predictive statements at this time. The numbers are too low, some of the input is questionable, and the confidence intervals are too wide.

This is what I'm calling out on, luvb2b, and what you just can't seem to accept. Try as you might to persuade everyone else with the unbiased research, math, and logic you use (some of which we agree on)... it falls down on this fact alone. The numbers are TOO low.

ANY suggestion is pure conjecture at this point.
 
49% of all statistics are meaningless.

Seriously though, I'm in the "need more data" camp. And as every day goes by with more Model Ses on the road and no new fires (boy, I hope I'm not jinxing us by saying that), the Model S's odds get better. Plus the fact that not every impact on or puncture of the battery results in a fire. More media sensationalism, as they make it sound like every time someone runs over something that hits the battery pack it catches on fire. But why bother with facts.
 
More media sensationalism, as they make it sound like every time someone runs over something that hits the battery pack it catches on fire. But why bother with facts.

Well, you have to bear in mind that news in the media today is all about entertainment and ratings. Nothing to do with "objective factual reporting".
 
Well, you have to bear in mind that news in the media today is all about entertainment and ratings. Nothing to do with "objective factual reporting".

That's why I only watch the local news for the weather and traffic reports. The rest of it is pretty much garbage. If I want news I generally turn to the BBC or online sources. Funny, did you know there's a whole world on the other side of the ocean?
 
Not to beat a dead horse too much, but one more thing: a take-away from how medical research is often done.

1. You choose to study something quantitatively
2. You define your problem
3. Many times you want to compare two groups, for example active drug vs. placebo, surgery vs. no surgery, etc. Here we have Teslas vs. "all ICE cars grouped together".
4. You define what outcome you want to study, for example deaths, change in blood pressure, etc. Here we have mostly discussed the number of fires per car-year or per collision.
5. You estimate what kind of difference you expect to see between the groups (ballpark figures).
6. Using the estimate above, you calculate the statistical power needed to show such a difference within a given confidence interval (95% is typically used).
7. You size your study accordingly

Some take-aways from this:
- For small differences or for uncommon events you need to design very large studies to be able to find statistically significant differences.
- Sometimes after point 6 above you conclude that it's impossible, or takes too many resources, to perform the study, so you change the question and/or the outcome variables studied.
- If you design a study and size (power) it to answer one question, you should be very careful about drawing other conclusions from the data. For example, suppose you design a study to show that a new drug lowers blood pressure by an average of 10 mm Hg or more compared to a previous drug, you calculate that you need 2,000 participants studied for 2 years to show this, and the study does show the finding within a 95% confidence interval (p < 0.05). But suppose that at the same time there seems to be a lower total number of deaths in the "new drug" group than in the "old drug" group, say 2 deaths versus 5. Applying a statistical test, this may even come off as significant (p < 0.05). HOWEVER, since the study was not designed/powered to answer that question, the finding is disregarded.

In this case luvb2b has tried to do the best possible statistical analysis from the data available, but if we were to first define and design a method of studying this we would surely conclude that
1. We need more time and more incidents (higher statistical power)
2. A much better-defined control group with much better-defined outcomes (as suggested by many others, it's very hard to know what's actually hiding behind the numbers we are trying to work with here)
 
the new information i want to add to the thread today is to do a quick comparison with the 2013 ford escape recall that was announced last night.

Ford said it is recalling nearly 140,000 2013 Escape SUVs with 1.6-liter engines in the United States — and 161,333 worldwide — because of fires caused by overheating of the engine cylinder head, which can crack and leak oil. Ford said it had received reports of 13 fires, including one in Canada, stemming from the engine issue.

From The Detroit News: http://www.detroitnews.com/article/20131126/AUTO0102/311260052#ixzz2lrealshA


i would highlight 3 points here - first, they counted a fire that happened in canada in deciding on this recall. so i don't feel so bad including a mexico fire, if the conditions are appropriate.

second, a total of 161,333 2013 escapes experienced 13 fires. the escape went on sale in the first half of 2013. even if we're conservative and say the average escape has only been on the road 6 months, that's 13 fires in 80,667 car-years of experience, or 1 per 6,205 car-years. that's still nearly half the rate of model s collision-fires. my point here is simply that statistics weaker than what we've seen for model s collision-fires have been enough to confirm the presence of a legitimate problem.

third, the fact remains that model s has had far fewer electrical and mechanical fires than ices.

ok and since i'm here, a few followups to others...

luvb2b, others have unsuccessfully tried to explain that 3 fires are not a very dependable estimator for the mean. Sure its the best unbiased estimator we can get but the standard deviation is 1.73 fires with 10820 car-years [sqrt(np(1-p)) where p=3/10820 and n=10820] using your binomial distribution of events.

The standard deviation is the measure of dispersion of outcomes which right now is 1.73/3 = 58% of observed fires

i saw mario replied to you already, and his reply is correct. the standard deviation is not as accurate a measure here as it is when normal-distribution approximations apply.

what you did is attempt to calculate the confidence interval for number of observed fires given that you know what the rate of fires is *estimated* to be. that figure gives you no comparative information from which to draw an inference.

what i had done was estimate a confidence interval for the probability of a model s collision-fire, based on the observation of 3 fires, and then test whether the confidence interval for this estimate was above that for the observed rate of ice collision-fires.

as long as i'm posting on this thread today, i'll go over two other user comments as well.

Actually, these numbers originate from this Automotive News article:
http://www.autonews.com/article/20131108/BLOG06/131109827/tesla-firetraps-numbers-dont-back-it-up#axzz2lOMLFo26
My improved(?) calculation:
Facts:
250,000,000 vehicles total
190,000,000 passenger cars (short wheel base) [!]
20,000 Model Ses

=> 1 in 9,500 cars is a Model S

190,000 car fires / year
3 Model S fires / year
Chance to catch fire (not related to collisions):
Other cars => 190,000 / 190,000,000 = 0.001 = 1 in 1,000 cars catches fire
Model S => 3 / 20,000 = 0.00015 = 1 in 6,667 Model Ses catches fire
...

Then let's look at the Model S:
=> 6,000,000 / 9,500 = 632 Model S accidents predicted per year (linear extrapolation based the numbers above)


632 accidents / year [!]
3 car fires / year
1 car fires caused by an accident (33%) [!]


Chance that an accident causes a fire in a Model S:
=> 1 / 632 = 1 in 632 accidents cause a fire


the user above has a few different errors, but one glaring one is that the number of passenger cars is way too high. the statistic that 4% of car fires happen in collisions comes from the nfpa reports, and those reports use 128-130 million as the number of cars. here the user has taken a statistic from the nfpa data, which comes from a known pool of vehicles, and applied it to a much larger number. also, the figure of 190,000 car fires per year is incorrect. by quoting a journalist instead of using the original source, the user doesn't realize the journalist has used the figure for total passenger road vehicle fires. i posted the document where car fires specifically are analyzed in great detail, and you can see pretty clearly that in 2010 the number of car fires is 135,000 (see the bottom half of page 9).
http://www.nfpa.org/~/media/Files/Research/NFPA reports/Vehicles/osautomobilefires.pdf

you can go through my original post with all the source data. i was very careful in trying to use best data available from original sources. use of second-hand sources with mixed definitions will lead to garbage results, imo.

another error i've seen this and other users make is that they attempt to estimate collisions overall, then estimate collisions for the model s, and then fires. by introducing an estimate of the number of collisions, and then an estimate of collisions for the model s, more uncertainty is introduced into the statistics used to draw the inference. that reduces your ability to have confidence in the conclusion unless the introduced error is taken into account. this is very difficult to do, because we really have little idea of the true rate of model s collisions.

that's why when i did this analysis, i minimized the sources of error by basing it on just 3 numbers: the number of model s collision-fires, model s car-years on the road, and the probability of a collision-fire in an ice. the two model s numbers we know pretty much for sure. even if you want to argue 2 tesla collision-fires versus 3, that is a debate about definitions. as for the tesla model s car-years estimate, i'm sure it's conservative, as i used a linear model for deliveries and we know the quarters were back-end loaded.

the main unknown that i used was my "p" which is the probability of a collision-fire in an ice. i mentioned numerous times how this figure came from a report on about 130 million vehicles tracked over many years, which would give it a relatively low standard deviation. finally, my analysis also showed that this figure could have been quite a bit higher than my estimate, and the results would still hold.

I'll look into it a bit more, but a simple Poisson probability for an average expected event occurrence of 0.42 gives 5.8% probability for 2 fires and 6.7% probability for >= 2 fires. I'm assuming the collision stats you took were for US therefore it would not be statistically quite valid to include the Mexican fire. Which means that we're around the 93% region so can't exclude at 95% confidence level nor can we claim it significant because we're still far from 3 sigma. Even if we include the third fire and I'm not 100% sure that'd be quite valid with the mean expectation as it may well be the crash ratio with fire is far higher in Mexico, then we get that the probability of >=3 is 0.9% and < 3 is 99.1%. The 3-sigma level is 99.7% and we have so far not accounted for any uncertainties therefore the real significance is smaller for sure so can't even claim 100% that we could really exclude at 95% CL as the uncertainties might very well shift the outcome. If I get time I'll try to add some uncertainties to the estimates and run it through the Higgs exclusion and significance estimator tools to find some more precise numbers.
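The Poisson numbers in the post above can be verified in a few lines (a sketch, taking the post's expected value of 0.42 US collision-fires as given):

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 0.42  # expected US collision-fires, as stated in the post

p2 = poisson_pmf(2, lam)                               # ~5.8%
p_ge2 = 1 - poisson_pmf(0, lam) - poisson_pmf(1, lam)  # ~6.7%
p_ge3 = p_ge2 - p2                                     # ~0.9%
```

All three tail probabilities match the figures quoted: 5.8% for exactly 2 fires, 6.7% for 2 or more, and 0.9% for 3 or more.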

regarding mr. kadastik's post above, i verified and agreed with his figures and we had a brief pm discussion about them. the way he wrote this post is a bit difficult to read, but i will try to clarify what he's saying (without distorting it).

first, he believes the poisson distribution is the correct one to use. i had used the binomial distribution. the binomial distribution starts to approximate the poisson distribution as the number of observations gets large, so this is fine.

mario believes the mexico fire should be excluded, and if it is, the significance is around the "93% region". he says therefore the result is not significant at 95% with 2 fires (i can agree). our difference is that i find a 93% confidence to be meaningful, but to him it is not.

mario points out that if the third fire is included, "then we get that the probability of >=3 is 0.9% and < 3 is 99.1%". so he agrees that 3 model s collision-fires is significant at the 99% level, but he feels that significance at the 99.7% level is required for a valid result. as he wrote, "The 3-sigma level is 99.7%." in my pm with him, i came to understand that his line of work generally requires 99.7% significance for valid results. personally i am more than happy with 90%+ odds when i can find them.

finally, he mentions that "we have so far not accounted for any uncertainties therefore the real significance is smaller". i addressed this point above and in one of my middle posts, that the only place i introduced additional uncertainty into my analysis was in my estimate of p for the probability of ice collision-fires. in my view, the estimate of probability of an ice collision fire i used is too low, because it includes many older cars. and i also feel the accuracy of the estimate is high, because it's based on well over 100 million of car-years of observation.

so for those who wanted mario to verify the work - he more or less agrees with my numbers, but he wants a 99.7% confidence level before he's convinced and i can show only 93-99% confidence. for those who doubted that 3 observations could yield 99% confidence, he is confirming it. i'm back off into the sunset!
 