The bootsample function generates several bootstrapped dataframes, each drawn as a simple random sample of size n, with replacement, from the n rows of the original dataframe.

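On average, covering all n rows of the original dataframe requires more than log(n) samples, where log is the natural logarithm. Each row is left out of a single bootstrap sample with probability (1 - 1/n)^n, which is approximately exp(-1), so after k samples about n * exp(-k) rows remain uncovered in expectation. This falls below one once k > log(n).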
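The log(n) rule of thumb can be checked numerically. The short sketch below is not part of the package; the helper name expected_uncovered is hypothetical and it only evaluates the expected number of rows that every one of k plain bootstrap samples would miss.

```r
# Hypothetical helper (not part of the package): expected number of rows
# that none of k bootstrap samples of size n contains.
expected_uncovered <- function(n, k) {
  # A given row is missed by one sample with probability (1 - 1/n)^n,
  # and by k independent samples with probability (1 - 1/n)^(n * k).
  n * (1 - 1 / n)^(n * k)
}

n <- 10000
log(n)                         # natural logarithm, a little above 9
expected_uncovered(n, k = 4)   # well above 1: many rows still uncovered
expected_uncovered(n, k = 10)  # below 1 once k exceeds log(n)
```

With n = 10000, four samples are far too few for plain resampling to cover every row, which is the situation the coverage guarantee of bootsample is meant to handle.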

To ensure that no row is left uncovered by the resampling, even when the number of samples is smaller than log(n), the function makes the last bootstrap sample include every row that the other samples missed.
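One way such a coverage guarantee could be implemented is sketched below. This is an illustration under stated assumptions, not the package's actual code: the name bootsample_sketch is hypothetical, and the choice to overwrite duplicated draws in the last sample (so that it keeps size n) is an assumption about a detail the documentation does not specify.

```r
# Hypothetical sketch, not the package's implementation.
bootsample_sketch <- function(df, num_sample) {
  n <- nrow(df)
  # Draw num_sample index vectors of size n, with replacement.
  idx <- replicate(num_sample, sample.int(n, n, replace = TRUE),
                   simplify = FALSE)
  # Rows that no sample has drawn.
  uncovered <- setdiff(seq_len(n), unlist(idx))
  if (length(uncovered) > 0) {
    last <- idx[[num_sample]]
    # Positions holding a repeated draw of some row. Overwriting them keeps
    # the first occurrence, so rows already in the last sample stay in it.
    # There are always at least as many such positions as uncovered rows.
    spare <- which(duplicated(last))
    last[spare[seq_along(uncovered)]] <- uncovered
    idx[[num_sample]] <- last
  }
  # Return a list of resampled dataframes.
  lapply(idx, function(i) df[i, , drop = FALSE])
}

# Usage: two samples of 50 rows still cover every original row.
set.seed(1)
toy <- data.frame(id = 1:50, x = rnorm(50))
boots <- bootsample_sketch(toy, 2)
all(toy$id %in% unlist(lapply(boots, `[[`, "id")))
```

Overwriting only duplicated positions means the last sample stays at size n, but it is no longer a pure simple random sample; the real function may make a different trade-off.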

bootsample(df, num_sample)

Arguments

df

A dataframe, either complete or incomplete (that is, possibly containing missing values).

num_sample

Number of bootstrapped samples. We suggest choosing num_sample > log(n), where n is the number of rows of df.

Value

A list of bootstrapped dataframes.

References

Little, R. J. A. and Rubin, D. B. (2002). Statistical Analysis with Missing Data, 2nd edition. Wiley.

Examples

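set.seed(1)  # make the example reproducible
n <- 10000
mu.X <- c(1, 2, 3)
Sigma.X <- matrix(c(9, 3, 2, 3, 4, 0, 2, 0, 1), nrow = 3)
# Simulate complete multivariate normal data
X.complete.cont <- MASS::mvrnorm(n, mu.X, Sigma.X)
# Introduce missing values in about half of the entries (MNAR2 mechanism)
rs <- generate_miss(X.complete.cont, 0.5, mechanism = "MNAR2")
# Draw 4 bootstrap samples; 4 < log(n), so the last sample covers the missed rows
ls_boot <- bootsample(rs$X.incomp, 4)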