This function performs Reject Inference using the Twins technique. Note that this technique has no theoretical foundation.

twins(xf, xnf, yf)

## Arguments

xf The matrix of financed clients' characteristics to be used in the scorecard.

xnf The matrix of not financed clients' characteristics to be used in the scorecard (must contain the same characteristics, in the same order, as xf!).

yf The matrix of financed clients' labels.

## Value

List containing the model using financed clients only, the model of acceptance and the model produced using the Twins method.

## Details

This function performs the Twins method on the data. When provided with labeled observations $$(x^\ell,y)$$, it first fits the logistic regression model $$p_\theta$$ of $$y$$ on $$x^\ell$$, then fits the logistic regression model $$p_\omega$$ of $$Z$$, the binomial random variable denoting whether the data is observed, on $$X$$. The predictions of both models on the labeled observations are then used to fit a "meta"-score, itself a logistic regression, whose predicted probabilities are used to reweight the samples and construct the final score $$p_\eta$$.
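The first three fits described above can be sketched with base R's `glm`. This is a rough illustration under stated assumptions, not the package's implementation: the simulated data, the names `x_f`, `x_nf`, `y_f` and `model_meta`, and the choice of linear predictors as scores are all hypothetical. Only the column names `score_acc` and `score_def` follow the call shown in the example output. The final reweighting step is left out because its details are not given here.

```r
set.seed(1)

# Hypothetical simulated data (this is NOT the package's generate_data()).
n <- 200
x_f  <- data.frame(x1 = rnorm(n), x2 = rnorm(n))  # financed clients
x_nf <- data.frame(x1 = rnorm(n), x2 = rnorm(n))  # not financed clients
y_f  <- rbinom(n, 1, plogis(0.5 * x_f$x1 - 0.5 * x_f$x2))

# Step 1: model of the labels, fitted on financed clients only (p_theta).
df_f <- data.frame(labels = y_f, x_f)
model_f <- glm(labels ~ x1 + x2, family = binomial(link = "logit"), data = df_f)

# Step 2: model of acceptance, fitted on all clients (p_omega).
# acc = 1 for financed clients, acc = 0 for not financed clients.
df_all <- rbind(data.frame(acc = 1, x_f), data.frame(acc = 0, x_nf))
model_acc <- glm(acc ~ x1 + x2, family = binomial(link = "logit"), data = df_all)

# Step 3: "meta"-score fitted on financed clients from both scores.
# Assumption: the scores are the linear predictors of the two models.
df_f$score_def <- predict(model_f, newdata = df_f, type = "link")
df_f$score_acc <- predict(model_acc, newdata = df_f, type = "link")
model_meta <- glm(labels ~ score_acc + score_def,
                  family = binomial(link = "logit"), data = df_f)

coef(model_meta)
```

In this sketch both scores are linear combinations of the same characteristics, so the meta-score should essentially reproduce the financed-only model, with a coefficient close to 1 on `score_def` and close to 0 elsewhere. That matches the pattern in the example output and is in line with the remark that the technique has no theoretical foundation.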

## References

Enea, M. (2015), speedglm: Fitting Linear and Generalized Linear Models to Large Data Sets, https://CRAN.R-project.org/package=speedglm

Ehrhardt, A., Biernacki, C., Vandewalle, V., Heinrich, P. and Beben, S. (2018), Reject Inference Methods in Credit Scoring: a rational review,

## See also

glm, speedglm

## Author

Adrien Ehrhardt

## Examples

# We simulate data from financed clients
df <- generate_data(n = 100, d = 2)
xf <- df[, -ncol(df)]
yf <- df$y
# We simulate data from not financed clients (MCAR mechanism)
xnf <- generate_data(n = 100, d = 2)[, -ncol(df)]
twins(xf, xnf, yf)
#> Generalized Linear Model of class 'speedglm':
#>
#> Call: speedglm::speedglm(formula = labels ~ score_acc + score_def, data = df[df$acc == 1, -which(names(df) %in% c("acc"))], family = stats::binomial(link = "logit"))
#>
#> Coefficients:
#> (Intercept)    score_acc    score_def
#>   -9.73e-15     3.05e-13     1.00e+00
#>