Q: The read-across prediction is equal to 30.8 (the mean of 72.1, 50.0, 25.0, 3.787 and 3.230). At the 95% confidence level (the default value), the left and right endpoints of this prediction are computed as -60.4 and 122. Given that the standard deviation (unbiased estimate, with “n-1” in the denominator) of the five EC3 values associated with the five closest neighbors is equal to 30, I expected the endpoints to be equal to:
Left endpoint (95% level) = 30.8 - 1.96*30 = -28
Right endpoint (95% level) = 30.8 + 1.96*30 = 89.6
On the other hand, it would seem that values approaching a 99.7% interval are computed instead:
Left endpoint (99.7% level) = 30.8 - 3*30 = -59.2
Right endpoint (99.7% level) = 30.8 + 3*30 = 120.8
Could you clarify how the prediction confidence range is computed?
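A: The confidence interval for a prediction based on a sample drawn from a distribution with unknown mean and unknown variance is:
EC3min = mean - Tn-1 * s * sqrt(1 + 1/n)
EC3max = mean + Tn-1 * s * sqrt(1 + 1/n)
where mean is the sample mean (30.8), s is the unbiased standard deviation (30), n is the sample size (5), and Tn-1 is the 97.5th percentile of Student's t distribution with n-1 degrees of freedom (2.776 for n = 5).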
After calculating you will see that:
EC3min ≈ -60.4
EC3max ≈ 122.0
The term Tn-1 (the Student's t quantile with n-1 degrees of freedom) takes the sample size into account and matters for small samples.
As the sample size goes to infinity, Tn-1 approaches 1.96, which is the value used in your calculations.
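The figures above can be checked with a short script. This is only a sketch, not the Toolbox's own code: it assumes the interval is the standard Student's t prediction interval, mean ± Tn-1 * s * sqrt(1 + 1/n), which reproduces the quoted endpoints. The function name prediction_interval is hypothetical, and the t quantile (2.776 for 4 degrees of freedom at the 95% level) is hardcoded from a standard t table rather than computed.

```python
import math
import statistics


def prediction_interval(values, t_quantile):
    """Return (low, high) for mean +/- t * s * sqrt(1 + 1/n).

    Hypothetical helper: assumes a Student's t prediction interval.
    t_quantile is the two-sided t value for n-1 degrees of freedom.
    """
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # unbiased estimate, n-1 in the denominator
    half_width = t_quantile * s * math.sqrt(1 + 1 / n)
    return mean - half_width, mean + half_width


# EC3 values of the five closest neighbors, as given in the question
ec3 = [72.1, 50.0, 25.0, 3.787, 3.230]

# 97.5th percentile of Student's t with 4 degrees of freedom (table value)
T_4 = 2.776

low, high = prediction_interval(ec3, T_4)
print(f"EC3min = {low:.1f}, EC3max = {high:.1f}")
# prints: EC3min = -60.4, EC3max = 122.0
```

Replacing T_4 with 1.96 and dropping the sqrt(1 + 1/n) factor gives the -28 and 89.6 endpoints expected in the question, which shows where the two calculations diverge for a sample of only five values.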
(This question is also posted at the Toolbox Discussion forum: https://community.oecd.org/thread/27296 )