dummy_test combines the p-values from dummy_test_matrix by Fisher's method.
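
As a rough illustration of the combination step, Fisher's method sums -2*log(p) over the k p-values being combined and compares the result with a chi-squared distribution with 2k degrees of freedom. The sketch below uses a hypothetical helper, fisher_combine, which is not part of this package; it assumes that NA entries of the p-value matrix (pairs for which no test was done) are simply skipped and that the remaining p-values are treated as independent.

```r
# Hypothetical helper, for illustration only: not the package's own code.
fisher_combine <- function(p) {
  p <- p[!is.na(p)]                 # assumption: untested pairs are stored as NA
  chi2stat <- -2 * sum(log(p))      # Fisher's statistic
  dof <- 2 * length(p)              # two degrees of freedom per p-value
  list(
    dof = dof,
    chi2stat = chi2stat,
    p.value = pchisq(chi2stat, df = dof, lower.tail = FALSE)
  )
}

# Toy p-value matrix with NA on the diagonal
p.matrix <- matrix(c(NA, 0.5, 0.5, NA), nrow = 2)
res <- fisher_combine(p.matrix)
res$p.value
```

The element names dof, chi2stat and p.value mirror the Value section of this page; the real function may handle edge cases differently.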

dummy_test_matrix generates a matrix of p-values for the dummy tests (t-tests and chi-squared tests). The null hypothesis (H0) is that the missingness mechanism is MCAR (missing completely at random). Entry [i,j] of this matrix is the p-value of the test that the missingness in Yi does not depend on the value of Yj.

We denote by Yj_1 the part of Yj where Yi is missing, and by Yj_0 the part of Yj where Yi is observed. Mj_1 and Mj_0 are the missingness masks of Yj_1 and Yj_0, respectively, and Mi is the missingness mask of Yi. A mask entry is 1 where the value is missing and 0 where it is observed. For example, if Yi[3] is missing and Yj[3] is observed, then Mj_1[3]=0 and Mi[3]=1.
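
The notation can be made concrete with a small base R sketch. The vectors yi and yj are made-up toy data, and the variable names are chosen here only to mirror the notation. Row labels are kept as names so that positions still refer to the original rows after subsetting.

```r
# Toy data: row 3 has Yi missing and Yj observed
yi <- c(1.2, NA, NA, 0.7, NA)
yj <- c(5, NA, 3, NA, 8)
names(yj) <- seq_along(yj)          # keep the original row labels

mi <- as.integer(is.na(yi))         # mask of Yi: 1 = missing, 0 = observed

yj_1 <- yj[mi == 1]                 # part of Yj where Yi is missing
yj_0 <- yj[mi == 0]                 # part of Yj where Yi is observed

# Masks of the two parts of Yj, still labelled by original row
mj_1 <- setNames(as.integer(is.na(yj_1)), names(yj_1))
mj_0 <- setNames(as.integer(is.na(yj_0)), names(yj_0))

mi[3]        # 1: Yi[3] is missing
mj_1[["3"]]  # 0: Yj[3] is observed
```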

There are four situations:

  • Yj is completely missing. In this case, no test will be done.

  • Yj is partially observed, but Yj_1 (or Yj_0) is completely missing. In this case, a t-test is performed to test if the mean of Mj_0 (or Mj_1) is 1.

  • Yj is numerical, Yj_1 and Yj_0 are both partially observed. In this case, a paired t-test is performed to test if Yj_1 and Yj_0 have the same mean.

  • Yj is categorical, Yj_1 and Yj_0 are both partially observed. In this case, a chi-squared test is performed to test if Yj and Mi are independent.
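
The case selection can be sketched as a small function that only reports which situation applies to a pair (Yi, Yj), without running any test. The function classify_case and its return strings are hypothetical and written for this illustration; the package's actual implementation may differ, for example in how it treats a Yi with no missing values.

```r
# Hypothetical helper, for illustration only: not the package's own code.
classify_case <- function(yi, yj, categorical = FALSE) {
  mi <- is.na(yi)
  yj_1 <- yj[mi]                    # Yj where Yi is missing
  yj_0 <- yj[!mi]                   # Yj where Yi is observed

  if (all(is.na(yj))) {
    return("no test")               # Yj is completely missing
  }
  # Note: an empty part counts as completely missing in this sketch
  if (all(is.na(yj_1)) || all(is.na(yj_0))) {
    return("t-test on the mask")    # one part of Yj is completely missing
  }
  if (categorical) {
    "chi-squared test"              # independence of Yj and Mi
  } else {
    "t-test on the means"           # compare the means of Yj_1 and Yj_0
  }
}

classify_case(c(NA, 1, NA, 2), c(3, 4, 5, NA))
```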

dummy_test(df, col_cat = c())

Arguments

df

An incomplete data frame.

col_cat

Indices of the categorical columns. Defaults to c() (no categorical columns).

Value

p.matrix

A matrix of p-values, where entry [i,j] is the p-value of the test that the missingness in Yi does not depend on the value of Yj.

dof

Degrees of freedom of the chi-squared statistic in Fisher's method.

chi2stat

Chi-squared statistic from Fisher's method.

p.value

Combined p-value for the MCAR test.
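
A possible call, following the usage shown above, might look like the sketch below. The data frame is simulated toy data, and the sketch assumes that col_cat takes 1-based column positions and that the result is a list whose elements can be accessed with $; neither is confirmed by this page. The call is guarded so that the snippet still runs when the package providing dummy_test is not attached.

```r
set.seed(1)

# Toy data: two numerical columns and one categorical column
df <- data.frame(
  y1 = rnorm(50),
  y2 = rnorm(50),
  y3 = factor(sample(c("a", "b", "c"), 50, replace = TRUE))
)

# Remove some values completely at random
df$y1[sample(50, 10)] <- NA
df$y2[sample(50, 15)] <- NA

if (exists("dummy_test")) {
  # Assumption: column 3 is declared categorical by its position
  res <- dummy_test(df, col_cat = c(3))
  print(res$p.matrix)   # pairwise p-values
  print(res$p.value)    # combined p-value for the MCAR test
}
```

A small combined p-value would count as evidence against MCAR, since H0 is that the mechanism is MCAR.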

References

Missing value analysis & Data imputation, G. David Garson, 2015