missRanger_mod_draw — missRanger_mod

missRanger_mod_draw creates one imputation result for multiple imputation with the missRanger method. For a detailed explanation of the missRanger single imputation method, see the documentation of missRanger in the 'missRanger' package. This document explains only the differences.

missRanger is an imputation method based on random forests. In missRanger_mod_draw, during the last iteration, a prediction is not computed as the average of the predictions from the individual trees of the random forest. Instead, one value is drawn from the empirical distribution formed by the trees' predictions. All other steps of the imputation are identical to those of missRanger.
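The difference between averaging and drawing can be sketched in base R as follows. This is only an illustration, not the package's internal code: the matrix tree_preds is a hypothetical stand-in for per-tree predictions (with ranger, a matrix of this shape can be obtained from predict() with predict.all = TRUE), and the names mean_pred and draw_pred are made up for the example.

```r
# Hypothetical per-tree predictions (an assumption for illustration):
# rows = observations to impute, columns = trees of the forest.
set.seed(1)
n_obs <- 3
n_trees <- 100
tree_preds <- matrix(rnorm(n_obs * n_trees, mean = c(10, 20, 30)), nrow = n_obs)

# Usual aggregation: average the predictions of all trees.
mean_pred <- rowMeans(tree_preds)

# Draw variant: for each observation, sample one value from the
# empirical distribution of its per-tree predictions.
draw_pred <- apply(tree_preds, 1, function(p) p[sample.int(length(p), 1)])

print(cbind(mean_pred, draw_pred))
```

Repeating the draw gives different values for the same observation, which is the between-imputation variability that multiple imputation relies on; the averaged prediction would be the same every time for a fixed forest.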

missRanger_mod_draw(
  data,
  formula = . ~ .,
  pmm.k = 0L,
  maxiter = 10L,
  seed = NULL,
  verbose = 1,
  returnOOB = FALSE,
  case.weights = NULL,
  col_cat = c(),
  num_mi = 5,
  ...
)

Arguments

data: A data.frame or tibble with missing values to impute.
formula: A two-sided formula specifying variables to be imputed (left hand side) and variables used to impute (right hand side). Defaults to . ~ ., i.e. use all variables to impute all variables. If e.g. all variables (with missings) should be imputed by all variables except variable "ID", use . ~ . - ID. Note that a "." is evaluated separately for each side of the formula. Further note that variables with missings must appear in the left hand side if they should be used on the right hand side.
pmm.k: Number of candidate non-missing values to sample from in the predictive mean matching steps. 0 to avoid this step.
maxiter: Maximum number of chaining iterations.
seed: Integer seed to initialize the random generator.
verbose: Controls how much info is printed to screen. 0 to print nothing. 1 (default) to print a "." per iteration and variable, 2 to print the OOB prediction error per iteration and variable (1 minus R-squared for regression). Furthermore, if verbose is positive, the variables used for imputation are listed as well as the variables to be imputed (in the imputation order). This will be useful to detect if some variables are unexpectedly skipped.
returnOOB: Logical flag. If TRUE, the final average out-of-bag prediction error is added to the output as attribute "oob". This does not work in the special case when the variables are imputed univariately.
case.weights: Vector with non-negative case weights.
col_cat: Indices of categorical columns.
num_mi: Number of multiple imputations.
...: Arguments passed to ranger(). If the data set is large, consider using fewer trees (e.g. num.trees = 20) and/or a low value of sample.fraction. Some ranger() arguments cannot be passed here, e.g. write.forest, probability, split.select.weights, dependent.variable.name, and classification.

Value

ls_ximp: List of imputed datasets for multiple imputation in MI_missRanger.
ls_ximp.disj: List of disjunctive imputed datasets for multiple imputation in MI_missRanger.
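A call might look like the sketch below. It uses only arguments from the signature above plus num.trees, which is passed on to ranger(). The data set iris_na is made up for the example, and the call is guarded because it assumes that missRanger_mod_draw has already been loaded from its package. The exact structure of the returned object is not assumed here; str() is used to inspect it.

```r
# Build an example data set with missing values (illustrative only).
set.seed(42)
iris_na <- iris
for (j in seq_along(iris_na)) {
  iris_na[sample.int(nrow(iris_na), 15), j] <- NA
}

# Run only if the function is available in the current session.
if (exists("missRanger_mod_draw")) {
  res <- missRanger_mod_draw(
    data = iris_na,
    formula = . ~ .,
    maxiter = 5L,
    seed = 1,
    verbose = 0,
    col_cat = c(5),   # Species is the only categorical column
    num_mi = 3,
    num.trees = 20    # forwarded to ranger()
  )
  str(res, max.level = 1)
}
```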