Estimate the number of dimensions for the Factor Analysis of Mixed Data (FAMD) by cross-validation. This function is nearly identical to the estim_ncpFAMD function in the 'missMDA' package. The only difference is that this function uses imputeFAMD, which calls imputeMFA, which in turn calls impute_mod. Changes have been made in impute_mod to avoid the convergence error.

estim_ncpFAMD(
  don,
  ncp.min = 0,
  ncp.max = 5,
  method = c("Regularized", "EM"),
  method.cv = c("Kfold", "loo"),
  nbsim = 100,
  pNA = 0.05,
  ind.sup = NULL,
  sup.var = NULL,
  threshold = 1e-04,
  verbose = TRUE,
  maxiter = 1000
)

Arguments

don

a data.frame with continuous and categorical variables, with or without missing entries.

ncp.min

integer corresponding to the minimum number of components to test.

ncp.max

integer corresponding to the maximum number of components to test.

method

"Regularized" by default or "EM".

method.cv

"Kfold" for cross-validation or "loo" for leave-one-out.

nbsim

number of simulations, useful only if method.cv="Kfold".

pNA

proportion of missing values added to the data set (the default 0.05 corresponds to 5%), useful only if method.cv="Kfold".

ind.sup

a vector indicating the indexes of the supplementary individuals.

sup.var

a vector indicating the indexes of the supplementary variables (quantitative and categorical).

threshold

the threshold for assessing convergence.

verbose

boolean. If TRUE, a progress bar is displayed.

maxiter

maximum number of iterations for imputeFAMD.
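
The roles of pNA and nbsim under method.cv="Kfold" can be illustrated with a short base R sketch. It is not the package's internal code: the toy data set, the variable names and the masking loop are assumptions made for illustration. Each simulation hides a proportion pNA of the cells at random. In the real function the hidden cells would then be imputed for each candidate number of components and compared with the true values.

```r
# Sketch only: shows how pNA and nbsim could interact under method.cv = "Kfold".
set.seed(1)

# Hypothetical mixed data set: two numeric variables and one factor.
don <- data.frame(
  x1 = rnorm(20),
  x2 = runif(20),
  g  = factor(sample(c("a", "b", "c"), 20, replace = TRUE))
)

pNA   <- 0.05  # proportion of cells hidden in each simulation
nbsim <- 3     # number of simulations (kept small for the example)

n_cells  <- nrow(don) * ncol(don)
n_hidden <- round(pNA * n_cells)

masks <- lapply(seq_len(nbsim), function(i) {
  # Draw cell positions without replacement and convert them to (row, column).
  idx <- sample(n_cells, n_hidden)
  pos <- arrayInd(idx, .dim = dim(don))
  don_na <- don
  for (k in seq_len(nrow(pos))) {
    don_na[pos[k, 1], pos[k, 2]] <- NA
  }
  don_na
})

# Each simulated data set contains exactly n_hidden missing cells.
sapply(masks, function(d) sum(is.na(d)))
```

With method.cv="loo" no such random masking is needed, which is why nbsim and pNA only matter for "Kfold".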

Value

ncp

the number of components retained for the FAMD.

criterion

the criterion (the MSEP) calculated for each number of components.
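
A possible way to call the function and read the two returned elements is sketched below. The sketch assumes that the 'missMDA' package and its bundled ozone data set are installed, and it calls missMDA's own estim_ncpFAMD as a stand-in, since the function documented here is described as nearly identical. The values chosen for ncp.max and nbsim are reduced only to shorten the run.

```r
# Sketch only: uses missMDA::estim_ncpFAMD as a stand-in for the function
# documented here (assumption: both accept the same arguments shown below).
if (requireNamespace("missMDA", quietly = TRUE)) {
  # ozone: mixed data set with missing values shipped with missMDA
  data(ozone, package = "missMDA")

  set.seed(123)
  res <- missMDA::estim_ncpFAMD(
    ozone,
    ncp.max   = 3,        # test 0 to 3 components
    method.cv = "Kfold",
    nbsim     = 10,       # fewer simulations than the default, for speed
    pNA       = 0.05,
    verbose   = FALSE
  )

  res$ncp        # number of components retained
  res$criterion  # MSEP for each number of components tested
}
```

The retained ncp can then be passed to imputeFAMD to impute the data set.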

References

Audigier, V., Husson, F. & Josse, J. (2014). A principal components method to impute mixed data. Advances in Data Analysis and Classification.