MIFAMD: modified multiple imputation with FAMD

MIFAMD is a modified multiple imputation function based on FAMD (Factor Analysis of Mixed Data) that returns the imputed categorical columns both as factors and as one-hot probability vectors. The original MIFAMD is documented in detail in the 'missMDA' package; only the modifications are explained on this page.

In addition to the list of multiply imputed datasets, MIFAMD returns their disjunctive counterparts, in which each categorical column is expanded into a one-hot probability vector. Furthermore, the final imputed dataset is no longer obtained from a single FAMD imputation: it is obtained by combining the multiple imputation results with Rubin's rule.
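
As a rough illustration of the pooling step, the sketch below averages a list of disjunctive imputed tables cell by cell, which is the point-estimate part of Rubin's rule. The toy matrices and the helper name `pool_disj` are hypothetical, and the actual MIFAMD implementation may differ in detail.

```r
# Hypothetical helper: pool M disjunctive imputed tables by taking the
# element-wise mean (the point estimate in Rubin's rule).
pool_disj <- function(disj_list) {
  Reduce(`+`, disj_list) / length(disj_list)
}

# Toy stand-in for res.MI.disj: three imputations of two rows with one
# numeric column and one two-level categorical column coded as probabilities.
cols <- c("age", "sex_F", "sex_M")
imp1 <- matrix(c(1.0, 2.0, 0.8, 0.1, 0.2, 0.9), nrow = 2,
               dimnames = list(NULL, cols))
imp2 <- matrix(c(1.2, 2.0, 0.6, 0.1, 0.4, 0.9), nrow = 2,
               dimnames = list(NULL, cols))
imp3 <- matrix(c(1.1, 2.0, 0.7, 0.1, 0.3, 0.9), nrow = 2,
               dimnames = list(NULL, cols))

pooled <- pool_disj(list(imp1, imp2, imp3))
print(pooled)
```

Because each imputed row of category probabilities sums to 1, the pooled rows also sum to 1, so the averaged values can still be read as probabilities.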

MIFAMD(
  X,
  ncp = 2,
  method = c("Regularized", "EM"),
  coeff.ridge = 1,
  threshold = 1e-06,
  seed = NULL,
  maxiter = 1000,
  nboot = 20,
  verbose = T
)

Arguments

X: Data frame with missing values.
ncp: Number of components used to reconstruct data with the FAMD reconstruction formula.
method: Imputation method, either "Regularized" (default) or "EM".
coeff.ridge: 1 by default, which performs the regularized imputeFAMD algorithm. Set a value less than 1 to regularize less (closer to the results of the EM method) or more than 1 to regularize more (closer to the results of the proportion imputation).
threshold: Threshold for the convergence criterion.
seed: Integer. By default, seed = NULL means that missing values are initially imputed by the mean of each variable for continuous variables and by the proportion of each category for categorical variables, which are coded as indicator matrices of dummy variables. Any other value leads to a random initialization.
maxiter: Maximum number of iterations for the algorithm.
nboot: Number of multiple imputations.
verbose: If TRUE, iteration numbers are printed to the screen.
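
The default initialization described for seed = NULL can be sketched in base R as follows. This is only an illustration of the description above on a hypothetical toy data frame, not the code used inside the package.

```r
# Toy data frame with one continuous and one categorical variable.
X <- data.frame(
  height = c(170, NA, 180, 160),
  color  = factor(c("red", "blue", NA, "red"))
)

# Continuous variable: replace NA with the observed mean.
height_init <- ifelse(is.na(X$height), mean(X$height, na.rm = TRUE), X$height)

# Categorical variable: build the indicator (disjunctive) matrix, which is
# NA in the rows where the category is missing, then fill those rows with
# the observed category proportions.
lev   <- levels(X$color)
disj  <- sapply(lev, function(l) as.numeric(X$color == l))
props <- colMeans(disj, na.rm = TRUE)
miss  <- is.na(X$color)
disj[miss, ] <- matrix(props, nrow = sum(miss), ncol = length(lev), byrow = TRUE)

init <- cbind(height = height_init, disj)
print(init)
```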

Value

res.MI: A list of imputed datasets obtained by multiple imputation.
res.MI.disj: A list of disjunctive imputed datasets obtained by multiple imputation.
ximp: Final imputed dataset, obtained by combining res.MI.disj with Rubin's rule.
ximp.disj: Disjunctive imputed data matrix of the same type as 'ximp' for the numeric columns. For the categorical columns, the predicted probability of each category is given as a one-hot probability vector.
res.imputeFAMD: Output obtained with the function imputeFAMD (single imputation).
call: The matched call.
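
To show how the one-hot probability form relates to the factor form, the sketch below picks the most probable category in each row. It assumes that the factor value corresponds to the category with the highest predicted probability, which this page does not state explicitly; the matrix `probs` is a hypothetical stand-in for the columns of ximp.disj that belong to one categorical variable.

```r
# Toy stand-in for the categorical part of ximp.disj: each row holds the
# predicted probability of every category of one variable.
probs <- matrix(c(0.7, 0.1, 0.2,
                  0.2, 0.5, 0.3),
                nrow = 2, byrow = TRUE,
                dimnames = list(NULL, c("low", "mid", "high")))

# Pick the most probable category per row and return it as a factor.
# max.col breaks ties at random by default, so ties.method is set explicitly.
best <- max.col(probs, ties.method = "first")
pred <- factor(colnames(probs)[best], levels = colnames(probs))
print(pred)
```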

References

Audigier, V., Husson, F. & Josse, J. (2015). A principal components method to impute mixed data. Advances in Data Analysis and Classification, 10(1), 5-26. <doi:10.1007/s11634-014-0195-1>

Audigier, V., Husson, F. & Josse, J. (2017). MIMCA: Multiple imputation for categorical variables with multiple correspondence analysis. Statistics and Computing, 27(2), 501-518. <doi:10.1007/s11222-016-9635-4>

Little, R. J. A. & Rubin, D. B. (2002). Statistical Analysis with Missing Data. Wiley Series in Probability and Statistics, New York.