Generation of missing values in complete or incomplete data according to different missingness mechanisms and patterns

produce_NA(
  data,
  mechanism = "MCAR",
  perc.missing = 0.5,
  self.mask = NULL,
  idx.incomplete = NULL,
  idx.covariates = NULL,
  weights.covariates = NULL,
  by.patterns = FALSE,
  patterns = NULL,
  freq.patterns = NULL,
  weights.patterns = NULL,
  use.all = FALSE,
  logit.model = "RIGHT",
  seed = NULL
)

Arguments

data: [data.frame, matrix] (mixed) data table (n x p)
mechanism: [string] either one of "MCAR", "MAR", "MNAR"; default is "MCAR"
perc.missing: [positive double] proportion of missing values, between 0 and 1; default is 0.5
self.mask: [string] either NULL or one of "sym", "upper", "lower"; default is NULL
idx.incomplete: [array] indices of variables to generate missing values for; if NULL then missing values in all variables are possible; default is NULL
idx.covariates: [matrix] binary matrix such that entries in row i that are equal to 1 indicate covariates that influence the missingness of variable i (sum(idx.incomplete) x p); if NULL then all covariates contribute; default is NULL
weights.covariates: [matrix] matrix of same size as idx.covariates with weights in row i for contribution of each covariate to missingness model of variable i; if NULL then a (regularized) logistic model is fitted; default is NULL
by.patterns: [boolean] generate missing values according to (pre-specified) patterns; default is FALSE
patterns: [matrix] binary matrix with 1=observed, 0=missing (n_pattern x p); default is NULL
freq.patterns: [array] array of size n_pattern containing the desired proportion of each pattern; if NULL then mice::ampute.default.freq will be called; default is NULL
weights.patterns: [matrix] weights used to calculate weighted sum scores (n_pattern x p); if NULL then mice::ampute.default.weights will be called; default is NULL
use.all: [boolean] use all observations, including incomplete observations, for amputation when amputing by patterns (only relevant if the initial data is incomplete and by.patterns=TRUE); default is FALSE
logit.model: [string] either one of "RIGHT","LEFT","MID","TAIL"; default is "RIGHT"
seed: [natural integer] seed for the random number generator; default is NULL
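
The pattern-related arguments can be easier to read with a concrete matrix in hand. The sketch below builds a patterns matrix and matching freq.patterns for a four-column table, then passes them to produce_NA using only the arguments documented above. It assumes produce_NA has already been loaded into the session and that mice is installed; the call is skipped otherwise. The toy data, the pattern layout and the chosen values are illustrative assumptions, not recommended settings.

```r
# Toy data with p = 4 numeric columns (iris ships with base R).
X <- iris[, 1:4]

# Two patterns (n_pattern = 2), one per row; 1 = observed, 0 = missing.
patterns <- matrix(c(0, 1, 1, 1,   # pattern 1: only column 1 is missing
                     1, 1, 0, 0),  # pattern 2: columns 3 and 4 are missing
                   nrow = 2, byrow = TRUE)
freq.patterns <- c(0.7, 0.3)       # desired proportion of each pattern

# Assumption: produce_NA is available in the session; otherwise skip the call.
if (exists("produce_NA")) {
  res <- produce_NA(X, mechanism = "MAR", perc.missing = 0.2,
                    by.patterns = TRUE, patterns = patterns,
                    freq.patterns = freq.patterns, seed = 42)
  # Share of missing values per column after amputation.
  print(colMeans(is.na(res$data.incomp)))
}
```

Each row of patterns must have one entry per column of the data, and freq.patterns needs one entry per row of patterns.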

Value

A list with the following elements:

data.init: original data.frame
data.incomp: data.frame with the newly generated missing values, observed values correspond to the values from the initial data.frame
idx_newNA: a boolean data.frame indicating the indices of the newly generated missing values
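
To make the shape of the returned list concrete, here is a minimal base-R stand-in that produces the same three elements for the MCAR case only. The helper sketch_mcar is hypothetical and is not part of the documented function. It assumes each cell is masked independently with probability perc.missing, which is one common reading of MCAR; the actual implementation may select cells differently.

```r
# Hypothetical helper, NOT the documented function: mimics the structure of
# the returned list for an MCAR mechanism using base R only.
sketch_mcar <- function(data, perc.missing = 0.5, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  data <- as.data.frame(data)
  n <- nrow(data)
  p <- ncol(data)
  # Assumption: every cell is masked independently with prob. perc.missing.
  mask <- matrix(runif(n * p) < perc.missing, nrow = n, ncol = p)
  data.incomp <- data
  data.incomp[mask] <- NA
  # Only cells that were observed before count as newly generated NAs.
  idx_newNA <- as.data.frame(mask & !is.na(data))
  names(idx_newNA) <- names(data)
  list(data.init = data, data.incomp = data.incomp, idx_newNA = idx_newNA)
}

res <- sketch_mcar(iris[, 1:4], perc.missing = 0.3, seed = 1)
names(res)                     # the three elements described above
mean(is.na(res$data.incomp))   # overall fraction of masked cells
```

Values that remain observed in data.incomp are identical to those in data.init, and idx_newNA marks exactly the cells that were turned into NA.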