Mean comparison using Johnson distribution

35 views Asked by At

I have to use Johnson distribution to see how different means are when modifying the skewness value (.3 in the code below):

library(moments)
library(SuppDists)

k <- 500

parms <-JohnsonFit(c(0, 1, .3, 6))
sJohnson(parms)
poblacion <- rJohnson(1000, parms)

mu.pob <- mean(poblacion)
sd.pob <- sd(poblacion)

p <- vector(length=k)
for (i in p){
  muestra <- poblacion[rJohnson(1000, parms)]
  p[i] <- t.test(muestra, mu = mu.pob)$p.value
}

a_teo = 0.05
a_emp = length(p[p<a_teo])/k
sprintf("alpha_teo = %.3f <-> alpha_emp = %.3f", a_teo, a_emp)

If I change the .3 for 1, I got different values for mean and standard deviation, but I got exactly the same empirical value for alpha: 1.000. What is wrong with my code?

1

There are 1 answers

0
Rui Barradas On BEST ANSWER

There are a few errors in your code.

  • Instead of for(i in p) it should be for(i in seq.int(k));
  • The way you sample/resample, poblacion[rJohnson(1000, parms)], is wrong. The index rJohnson is not integer and it will not index ppoblacion. The correct ways should be any of the ways in the for loop below.
  • Though not wrong your calculation of a_emp is too complicated, see the equivalent mean(p < a_teo).

And here is the complete code corrected.

library(moments)
library(SuppDists)

set.seed(2024)

k <- 500

parms <-JohnsonFit(c(0, 1, .3, 6))
# sJohnson(parms)
poblacion <- rJohnson(1000, parms)

mu.pob <- mean(poblacion)
sd.pob <- sd(poblacion)

p <- numeric(k)
for(i in seq.int(k)){
  # muestra <- sample(poblacion, 1000, TRUE)
  muestra <- rJohnson(1000, parms)
  p[i] <- t.test(muestra, mu = mu.pob)$p.value
}

a_teo <- 0.05
a_emp <- length(p[p<a_teo])/k
mean(p < a_teo)
#> [1] 0.042
sprintf("alpha_teo = %.3f <-> alpha_emp = %.3f", a_teo, a_emp)
#> [1] "alpha_teo = 0.050 <-> alpha_emp = 0.042"

Created on 2024-03-10 with reprex v2.1.0


Edit

Instead of the for loop above, you can use replicate. From help("replicate"), my emphasis:

replicate is a wrapper for the common use of sapply for repeated evaluation of an expression (which will usually involve random number generation).

p <- replicate(k, {
  muestra <- rJohnson(1000, parms)
  t.test(muestra, mu = mu.pob)$p.value
})

Note that with replicate you do not need to create p with the required length beforehand, it will be replicate's return value.