imputeFAMD.Rd
Impute the missing values of a mixed dataset (with continuous and categorical variables) using the principal component method "factorial analysis for mixed data" (FAMD). Can be used as a preliminary step before performing FAMD on an incomplete dataset.
This function is nearly identical to the imputeFAMD function in the 'missMDA' package. The only difference is that this function uses imputeMFA, which in turn calls impute_mod. In impute, some changes have been made to avoid the convergence error.
imputeFAMD(
X,
ncp = 2,
method = c("Regularized", "EM"),
row.w = NULL,
coeff.ridge = 1,
threshold = 1e-06,
ind.sup = NULL,
sup.var = NULL,
seed = NULL,
maxiter = 1000,
...
)
X
a data.frame with continuous and categorical variables containing missing values.
ncp
integer corresponding to the number of components used to predict the missing entries.
method
"Regularized" by default or "EM".
row.w
row weights (by default, uniform row weights).
coeff.ridge
1 by default to perform the regularized imputeFAMD algorithm; useful only if method="Regularized". Other regularization terms can be implemented by setting the value to less than 1 in order to regularize less (to get closer to the results of the EM method) or to more than 1 in order to regularize more.
threshold
the threshold for assessing convergence.
ind.sup
a vector indicating the indexes of the supplementary individuals.
sup.var
a vector indicating the indexes of the supplementary variables (quantitative and categorical).
seed
integer; by default, seed = NULL implies that missing values are initially imputed by the mean of each variable for the continuous variables and by the proportion of the category for the categorical variables coded with indicator matrices of dummy variables. Any other value leads to a random initialization.
maxiter
maximum number of iterations for imputeFAMD.
...
further arguments passed to or from other methods.
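The default initialization described for seed = NULL can be illustrated with a small base R sketch. It is not the package's internal code: the toy data frame and the names df, height_init, colour_init, and props are made up for illustration, and uniform row weights are assumed.

```r
# Toy mixed dataset with missing values (made up for illustration).
df <- data.frame(
  height = c(1.70, NA, 1.82, 1.65),
  colour = factor(c("red", "blue", NA, "red"))
)

# Continuous variable: replace each NA by the mean of the observed values.
height_init <- df$height
height_init[is.na(height_init)] <- mean(df$height, na.rm = TRUE)

# Categorical variable: build the indicator (dummy) matrix, one column per level.
# Rows where the category is missing are NA in every column at this point.
lv <- levels(df$colour)
colour_init <- sapply(lv, function(l) as.numeric(df$colour == l))

# Fill the missing rows with the observed proportion of each category
# (uniform row weights assumed).
props <- colMeans(colour_init, na.rm = TRUE)
miss <- is.na(df$colour)
colour_init[miss, ] <- matrix(props, nrow = sum(miss), ncol = length(lv),
                              byrow = TRUE)

print(height_init)
print(colour_init)
```

In this sketch the initial indicator rows already sum to one for every individual, which is the same constraint the imputed indicator values satisfy in the output.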
tab.disj
the imputed matrix; the observed values are kept
for the non-missing entries and the missing values are replaced by the
predicted ones. The categorical variables are coded with the indicator matrix
of dummy variables. In this indicator matrix, the imputed values are real
numbers, but they meet the constraint that the sum of the entries corresponding
to one individual and one variable is equal to one. Consequently, they can be
seen as degrees of membership in the corresponding category.
completeObs
the mixed imputed dataset; the observed values are
kept for the non-missing entries and the missing values are replaced by the
predicted ones. For the continuous variables, the values are the same as in
the tab.disj output; for the categorical variables missing values are imputed
with the most plausible categories according to the values in the tab.disj
output.
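The relationship between tab.disj and completeObs for a categorical variable can be sketched in base R as follows. This is only an illustration under stated assumptions: the matrix tab_block and its values are hypothetical, and picking the column with the largest value (ties broken by the first column) is one plausible reading of "most plausible category", not necessarily how the function resolves ties internally.

```r
# Hypothetical block of the indicator matrix for one categorical variable
# with levels "blue", "green", "red" (values made up for illustration).
tab_block <- rbind(
  c(1.00, 0.00, 0.00),  # observed entry: blue
  c(0.15, 0.25, 0.60),  # imputed entry: degrees of membership
  c(0.00, 0.00, 1.00),  # observed entry: red
  c(0.55, 0.35, 0.10)   # imputed entry: degrees of membership
)
colnames(tab_block) <- c("blue", "green", "red")

# The entries for one individual and one variable sum to one.
stopifnot(isTRUE(all.equal(rowSums(tab_block), rep(1, nrow(tab_block)))))

# Most plausible category: the column with the largest degree of membership.
best <- max.col(tab_block, ties.method = "first")
completed <- factor(colnames(tab_block)[best], levels = colnames(tab_block))

print(as.character(completed))
# "blue" "red" "red" "blue"
```

Observed entries are unchanged by this step because their indicator rows contain a single 1.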
call
the matched call.
Audigier, V., Husson, F. & Josse, J. (2013). A principal components method to impute mixed data. Advances in Data Analysis and Classification, 10(1), 5-26. https://arxiv.org/abs/1301.4797