List of F1-scores — ls_F1 • MissImp

ls_F1 returns a list of F1-scores, one for each imputed dataset in the given list. The resample_method argument is needed because with the 'bootstrap' method the imputed datasets can contain repeated rows, and with both 'jackknife' and 'bootstrap' an imputed dataset may not cover all rows of the original data.

ls_F1(
  df_comp,
  ls_df_imp,
  mask,
  col_cat_comp,
  col_cat_imp,
  resample_method = "bootstrap",
  combine_method = "onehot",
  dict_cat = NULL
)

Arguments

df_comp: The original complete dataset.
ls_df_imp: List of imputed datasets.
mask: Mask of missingness (1 means missing value and 0 means observed value).
col_cat_comp: Indices of categorical columns in the complete dataset.
col_cat_imp: Indices of categorical columns in the imputed dataset.
resample_method: Default value is 'bootstrap'; can also be 'jackknife' or 'none'.
combine_method: When resample_method = 'bootstrap', combine_method can be 'factor' or 'onehot'. When combine_method = 'onehot', ls_F1 averages the one-hot probability vectors for each observation, then chooses the position of maximum probability as the predicted category. When combine_method = 'factor', for each observation ls_F1 chooses the mode over the imputed data frames as the predicted category.
dict_cat: Dictionary of categorical column names, used when the "onehot" method is applied. For example, it could be list("Y7"=c("Y7_1","Y7_2"), "Y8"=c("Y8_1","Y8_2","Y8_3")).
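
The two combine_method options described above can be illustrated with a small base R sketch. This is not the MissImp implementation: the toy data, the helper mode_value, and the tie-breaking rule (first maximum wins) are assumptions made for illustration only.

```r
# Toy setup (hypothetical): 3 observations of one categorical variable Y7
# with one-hot columns Y7_1 and Y7_2, imputed in 3 resampled datasets.
levels_Y7 <- c("Y7_1", "Y7_2")

# 'onehot': each imputed dataset provides a probability vector per observation.
probs <- list(
  matrix(c(0.9, 0.1,
           0.4, 0.6,
           0.2, 0.8), ncol = 2, byrow = TRUE),
  matrix(c(0.7, 0.3,
           0.6, 0.4,
           0.1, 0.9), ncol = 2, byrow = TRUE),
  matrix(c(0.8, 0.2,
           0.3, 0.7,
           0.3, 0.7), ncol = 2, byrow = TRUE)
)

# Average the probability vectors element-wise across the imputed datasets.
avg_probs <- Reduce(`+`, probs) / length(probs)

# For each row, pick the category with the highest average probability.
# Assumption: ties go to the first column.
pred_onehot <- levels_Y7[max.col(avg_probs, ties.method = "first")]

# 'factor': each imputed dataset provides a category label per observation.
labels <- cbind(
  imp1 = c("Y7_1", "Y7_2", "Y7_2"),
  imp2 = c("Y7_1", "Y7_1", "Y7_2"),
  imp3 = c("Y7_1", "Y7_2", "Y7_2")
)

# Hypothetical helper: most frequent value of a character vector.
mode_value <- function(x) names(which.max(table(x)))

# For each row, take the mode over the imputed datasets.
pred_factor <- unname(apply(labels, 1, mode_value))

print(pred_onehot)
print(pred_factor)
```

In this toy case both methods agree on every observation. They can differ when a few imputations are very confident about one category while a majority weakly favours another, since 'onehot' weighs probabilities and 'factor' only counts votes.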

Value

list_F1: List of F1-scores corresponding to the given list of imputed datasets.
Mean_F1: Mean value of the F1-scores.
Variance_F1: Variance of the F1-scores.
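
To make the three returned quantities concrete, here is a minimal base R sketch of one F1-score per imputed dataset, followed by their mean and variance. It rests on assumptions that this page does not confirm: that the score is evaluated only on cells where mask is 1, and that the F1-score is macro-averaged over categories. The helper macro_f1 and the toy vectors are hypothetical and not part of MissImp.

```r
# Hypothetical helper: macro-averaged F1 between true and predicted labels.
macro_f1 <- function(truth, pred) {
  classes <- union(truth, pred)
  f1 <- vapply(classes, function(k) {
    tp <- sum(pred == k & truth == k)  # true positives for class k
    fp <- sum(pred == k & truth != k)  # false positives for class k
    fn <- sum(pred != k & truth == k)  # false negatives for class k
    if (tp == 0) return(0)             # convention: F1 is 0 without true positives
    precision <- tp / (tp + fp)
    recall <- tp / (tp + fn)
    2 * precision * recall / (precision + recall)
  }, numeric(1))
  mean(f1)
}

# Toy data: one categorical column with 5 observations.
truth <- c("a", "b", "a", "b", "a")
mask  <- c(1, 0, 1, 1, 1)  # 1 = missing (imputed), 0 = observed

# Two hypothetical imputed versions of the same column.
imputations <- list(
  c("a", "b", "a", "b", "a"),  # every masked cell imputed correctly
  c("a", "b", "b", "b", "a")   # one masked cell imputed incorrectly
)

# Assumption: score only the cells that were actually missing.
miss <- mask == 1
f1_values <- vapply(
  imputations,
  function(p) macro_f1(truth[miss], p[miss]),
  numeric(1)
)

f1_mean <- mean(f1_values)  # analogue of Mean_F1
f1_var  <- var(f1_values)   # analogue of Variance_F1 (sample variance)

print(f1_values)
print(c(mean = f1_mean, variance = f1_var))
```

Note that var() computes the sample variance (denominator n - 1); whether ls_F1 uses this or the population variance is not stated here.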