The *robust2sls* package provides an implementation of several
algorithms for estimating outlier robust two-stage least squares (2SLS)
models. All currently implemented algorithms are based on trimmed 2SLS,
i.e. procedures where certain observations are identified as outliers
and subsequently removed from the sample for re-estimation.

The algorithms can be customized using a range of arguments. After
estimation, corrected standard errors can be computed, which take false
outlier detection into account. Moreover, the *robust2sls*
package provides statistical tests for whether (a subset of) the
parameters between the full and the trimmed sample are statistically
significant. Currently, tests for the presence of outliers in the
original sample are being developed and will be added in the future.

The model setting is a standard 2SLS setup. Suppose we have cross-sectional iid data \(\{(y_{i}, x_{i}, z_{i})\}\), where \(y_{i}\) is the outcome variable, \(x_{i}\) is a vector of regressors of which some can be endogenous, and \(z_{i}\) is a vector of instruments. The set of regressors can be decomposed into the exogenous regressors, \(x_{1i}\), and the endogenous regressors \(x_{2i}\). The total dimension of \(x_{i}\) is then \(d_{x} = d_{x_{1}} + d_{x_{2}}\). The instruments \(z_{i}\) consist of both the exogenous regressors, \(z_{1i} = x_{1i}\), and a set of excluded instruments \(z_{2i}\). The total dimension of \(z_{i}\) is then \(d_{z} = d_{x_{1}} + d_{z_{2}}\).

The structural equation is:

\[y_{i} = x_{i}^{\prime}\beta + u_{i} = x_{1i}^{\prime}\beta_{1} + x_{2i}^{\prime}\beta_{2} + u_{i},\]

where \(\mathsf{E}(x_{1i}u_{i}) = 0_{d_{x_{1}}}\) for the exogenous regressors and \(\mathsf{E}(x_{2i}u_{i}) \neq 0_{d_{x_{2}}}\) for the endogenous regressors.

The first stage equation is:

\[\begin{pmatrix} x_{1i} \\ x_{2i} \end{pmatrix} = \Pi^{\prime} \begin{pmatrix} x_{1i} \\ z_{2i} \end{pmatrix} + \begin{pmatrix} r_{1i} \\ r_{2i} \end{pmatrix} = \begin{pmatrix} I_{d_{x_{1}}} & 0_{d_{x_{1}} \times d_{z_{2}}} \\ \Pi_{1}^{\prime} & \Pi_{2}^{\prime} \end{pmatrix} \begin{pmatrix} x_{1i} \\ z_{2i} \end{pmatrix} + \begin{pmatrix} 0_{d_{x_{1}}} \\ r_{2i} \end{pmatrix}.\] The instruments \(z_{i}\) are assumed to be valid and informative (rank condition fulfilled). The projection errors for the exogenous regressors \(r_{1i}\) are zero because the exogenous regressors are included themselves as regressors and can therefore be fit perfectly.

The implemented algorithms are all forms of trimmed 2SLS. Their basic principle is as follows: First, an initial estimate of the model is obtained and the standardized residuals are calculated for each observation. These residuals are then compared against a reference distribution that the researchers has chosen. Observations with an absolute standardized residual beyond the cut-off value are classified as outliers and removed from the sample. Next, the model is re-estimated on the trimmed sample of non-outlying observations. The procedure can end after the initial classification or can be iterated.

The workhorse command for different types of trimmed 2SLS algorithms
in the *robust2sls* package is `outlier_detection()`

.
The algorithms differ in the following respect:

- which initial estimator to use
- how the sample is trimmed, which is governed by
- the reference distribution against which the errors are judged to be outliers or not
- the cut-off value \(c\) that determines the size of the standardized errors beyond which observations are labeled as outliers and subsequently removed

- how often the algorithm is iterated, which is represented by the parameter \(m\).

The following subsections briefly explain the different choices.

The initial estimator can be set in `outlier_detection()`

through the `initial_est`

argument.

The most common choice for the initial estimator is to simply use the
2SLS estimator based on the full sample. The corresponding argument is
`"robustified"`

.

Alternatively, the initial estimation can consist of two estimations
based on two sub-samples. For that purpose, the sample is partitioned
into two parts and separate models are estimated on each of them.
However, the estimates of **one** sub-sample are used to
calculate the residuals of observations in the **other**
sub-sample. For example, the estimates of sub-sample 2 are used to
calculate the residuals in sub-sample 1: \(\widehat{u}_{i} = y_{i} - x_{i}^{\prime}
\widehat{\beta}_{2}\) for observations \(i\) in sub-sample 1 and where \(\widehat{\beta}_{2}\) denotes the parameter
estimate obtained from the second sub-sample. The corresponding argument
is `"saturated"`

. The `split`

argument governs in
what proportions the sample is split.

Lastly, the user can specified their own initial estimator if the
argument is set to `"user"`

. In this case, a model object of
class `ivreg`

must be given to the `user_model`

argument.

Whether an observation is an outlier, that means unusual in a certain sense, can only be decided relative to a certain distribution - the reference distribution. While it is unusual to obtain a value of 5 under the standard normal distribution, \(\mathcal{N}(0,1)\), it is not unusual for a normal distribution with mean 5, \(\mathcal{N}(5,1)\), or a uniform distribution over the interval \([2,6]\), \(\mathcal{U}_{[2,6]}\). The reference distribution is the distribution of the standardized structural error, i.e. of \(u_{i}/\sigma\).

Currently, `outlier_detection()`

only allows for the
normal distribution, which is the distribution that is usually chosen in
practice. The theoretical foundation allows for a broader range of
distributions as long as they fulfil certain moment conditions but they
are not yet implemented. The argument `ref_dist`

therefore
currently only accepts the value `"normal"`

.

In addition to the reference distribution, the user needs to decide the cut-off value \(c\), which is the value beyond which absolute standardized residuals are considered as unusual. This is linked to the probability of drawing a standardized residual larger than \(c\) and therefore of classifying an observation as an outlier if there are actually no outliers in the sample (null hypothesis). The probability of falsely classifying an observation as an outlier when there are in fact none is called \(\gamma_{c}\). Together with the reference distribution, it determines the cut-off value \(c\).

Suppose we assume a standard normal distribution of the standardized residuals. A cut-off value of \(c = 1.96\) corresponds to a probability of false outlier detection of approximately 5%. Remember that we classify observations as outliers if the absolute value of the standardized residual is larger than the cut-off value \(c\), i.e. for values that are larger than \(1.96\) or smaller than \(-1.96\). Instead, we could target the expected false outlier detection rate, for example by targeting 1%. Together with the reference distribution being standard normal, this yields a cut-off value of approximately \(c = 2.58\).

It is usually easier to think about the expected false outlier
detection rate than the cut-off itself, which is why the user actually
sets \(\gamma_{c}\) in
`outlier_detection()`

using the argument
`sign_level`

. Remember that the expected false outlier
detection rate refers to the scenario in which we have no actual
outliers in the sample. This is our null hypothesis. It just happens
that sometimes we have unusual draws of the errors, in accordance with
the assumed reference distribution.

The procedure of classifying observations as outlier with subsequent
re-estimation of the model can be applied iteratively. The argument
`iterations`

determines whether the algorithm is iterated
(value \(\geq 1\)) or not (value \(0\)).

The algorithm can also be iterated until convergence, that is until
the selected sub-sample of non-outlying observations does not change
anymore or until the parameter estimates do not change much anymore. For
this, the argument `iterations`

must be set to
`"convergence"`

. The user can also set the
`convergence_criterion`

argument, which is the stopping
criterion. If the L2 norm of the difference of the coefficients between
iteration \(m\) and \(m-1\) is smaller than the criterion, the
algorithm is terminated. The stopping criterion can also be used when
`iterations`

is set to a numeric value \(m\), in which case the algorithm is
iterated at most \(m\) times but stops
earlier if the stopping criterion is fulfilled.

In this section, we create a toy example using artificial data to showcase some of the different algorithms and statistical tests. We first load the necessary packages.

We choose a simple example with an intercept, one endogenous
regressor, and one excluded instrument. The function
`generate_param()`

takes the dimensions of the elements of
the 2SLS model and returns a parameter setup that fulfills the
assumption, such as the validity and informativeness of the instruments.
Based on the random parameter setup, we also create some random
artificial data `generate_data()`

, with which we will work
throughout this section. We create a sample of 1,000 data points.

```
<- generate_param(dx1 = 1, dx2 = 1, dz2 = 1, intercept = TRUE, beta = c(2,-1), sigma = 1, seed = 2)
param <- generate_data(parameters = param, n = 1000)$data
data_full
# have a look at the data
head(data_full)
#> y x1 x2 u x1 z2 r1 r2
#> 1 -0.8047759 1 1.4984397 -1.3063362 1 0.59545164 0 0.1380329
#> 2 0.2101148 1 1.4894080 -0.3004772 1 0.02161768 0 0.5823300
#> 3 1.5154352 1 0.8977204 0.4131555 1 0.93773336 0 -0.7330890
#> 4 2.1006106 1 0.8447688 0.9453794 1 0.49601221 0 -0.4370809
#> 5 0.3822698 1 1.2154664 -0.4022639 1 -0.06204087 0 0.3744787
#> 6 1.4367715 1 2.3234290 1.7602006 1 0.26107165 0 1.2271824
```

In reality, the researcher would only observe \(y\), \(x_{1}\) (intercept), \(x_{2}\) (endogenous), and \(z_{2}\) (instrument). We have created data from the following structural equation:

\[ y_{i} = x_{i}^{\prime} \beta + u_{i} = \beta_{1} x_{1} + \beta_{2} x_{2} + u_{i} = 2 - x_{2} + u_{i}.\] We can quickly check whether the assumptions are approximately fulfilled in our finite sample.

```
cor(data_full$u, data_full$x2) # correlation for endogenous regressor
#> [1] 0.3262246
cor(data_full$u, data_full$z2) # close to zero correlation for valid instrument
#> [1] 0.03581895
cor(data_full$x2, data_full$z2) # correlation for informativeness
#> [1] 0.6392398
```

We can also compare 2SLS to OLS and see how close their estimates are to the true parameters \(\beta^{\prime} = \begin{pmatrix} 2 & -1 \end{pmatrix}\).

```
<- y ~ x2
fml_ols <- lm(formula = fml_ols, data = data_full[, 1:3])
ols print(ols$coefficients)
#> (Intercept) x2
#> 1.4885804 -0.6018424
<- y ~ x2 | z2
fml_tsls <- ivreg(formula = fml_tsls, data = data_full[, c(1:3, 6)])
tsls print(tsls$coefficients)
#> (Intercept) x2
#> 1.9521846 -0.9316108
```

As expected, we clearly see that 2SLS provides estimates closer to the true DGP than OLS.

We have generated the data without any outliers in the sense that none of the structural errors has been drawn from a different distribution, such as a fat-tailed distribution. In our setting, the structural error follows a marginal normal distribution with variance \(\sigma^{2} = 1\), such that \(u_{i} / \sigma \sim \mathcal{N}(0,1)\). So we are working under the null hypothesis of no outliers.

In our first example, we use the full sample 2SLS estimator as the initial estimator. Let us use an expected false outlier detection rate, \(\gamma_{c}\), of 1% so that we expect to find \(1000 \times 0.01 = 10\) outliers even though there are technically none in the sample.

```
# extract the variables we observe
<- data.frame(data_full[, c(1:3, 6)])
data
# not iterating the algorithm
<- outlier_detection(data = data, formula = fml_tsls, ref_dist = "normal",
robustified_0 sign_level = 0.01, initial_est = "robustified", iterations = 0)
print(robustified_0)
#> Outlier-Robust 2SLS Model
#> Initial estimator: robustified
#> Reference distribution: normal
#> Two-stage Least-Squares Model: y ~ x2 | z2
#> Iterations: 0
#> Final selection: Outliers found: 13 Outliers proportion: 0.013
# iterating algorithm until convergence
<- outlier_detection(data = data, formula = fml_tsls, ref_dist = "normal",
robustified_conv sign_level = 0.01, initial_est = "robustified",
iterations = "convergence", convergence_criterion = 0)
print(robustified_conv)
#> Outlier-Robust 2SLS Model
#> Initial estimator: robustified
#> Reference distribution: normal
#> Two-stage Least-Squares Model: y ~ x2 | z2
#> Iterations: 3
#> Final selection: Outliers found: 15 Outliers proportion: 0.015
```

`outlier_detection()`

returns an object of class
`"robust2sls"`

, which is a list that saves - inter alia - the
settings of the algorithm, the 2SLS models for each iteration, and
information on which observations were classified as outliers in which
iteration. The `print()`

method for this class summarizes the
most important results.

Both the non-iterated version that stops after the initial classification of outliers and the iterated version detect a bit more outliers than we would have expected (13 and 15, respectively). This can be due to the estimation error, i.e. we make the classification based on the residuals \(\widehat{u}_{i}\) instead of the true errors \(u_{i}\); or simply because there just happened to be more unusual draws of the error in our finite sample than expected. Since we are working with artificial data that we have created ourselves, we can actually check how many true errors were larger than \(2.58\) or smaller than \(-2.58\).

```
sum(abs(data_full[, "u"]) > 2.58)
#> [1] 15
```

We find that there were 15 true errors that we would technically
classify as outliers, of which we detected 13 or 15, respectively. We
can also check whether we identified those observations with large true
errors as outliers or whether we classified some observations as
outliers that actually had true errors that were smaller than the
cut-off. The `"robust2sls"`

object stores a vector that
records for each observation whether it has been classified as an
outlier (`0`

) or not (`1`

) or whether the
observation was not used in the estimation for other reasons, such as a
missing value in \(y\), \(x\), or \(z\) (`-1`

). This information is
stored for each iteration.

```
# which observations had large true errors?
<- which(abs(data_full[, "u"]) > 2.58)
large_true
# which observations were detected as outliers
# both step 0 and iterated version had same starting point, so same initial classification
<- which(robustified_conv$type$m0 == 0)
large_detected_0 <- which(robustified_conv$type$m3 == 0)
large_detected_3
# how much do the sets of true and detected large errors overlap?
sum(large_detected_0 %in% large_true) # 12 of the 13 detected outliers really had large errors
#> [1] 12
sum(large_detected_3 %in% large_true) # 13 of the 15 detected outliers really had large errors
#> [1] 13
# which were wrongly detected as having large errors?
<- large_detected_3[which(!(large_detected_3 %in% large_true))]
ind # what were their true error values?
"u"] # relatively large values even though technically smaller than cut-off
data_full[ind, #> [1] 2.443443 -2.512298
# which large true errors were missed?
<- large_true[which(!(large_true %in% large_detected_3))]
ind2 # what were their true error values?
"u"] # one of them is close to the cut-off
data_full[ind2, #> [1] -2.580791 2.675392
```

All in all, the algorithm performed as expected.

As explained above, the Saturated 2SLS estimator partitions the
sample into two sub-samples and estimates separate models on each of
them. The estimates of one sub-sample are then used to calculate
residuals of observations in the other sub-sample. Again, the
standardized residuals are compared to the chosen cut-off value. The
`split`

argument of `outlier_detection()`

determines the relative size of each sub-sample. Especially with small
samples, the split should be chosen such that none of the sub-samples is
*too small*.

```
# not iterating the algorithm
<- outlier_detection(data = data, formula = fml_tsls, ref_dist = "normal",
saturated_0 sign_level = 0.01, initial_est = "saturated", iterations = 0,
split = 0.5)
print(saturated_0)
#> Outlier-Robust 2SLS Model
#> Initial estimator: saturated
#> Reference distribution: normal
#> Two-stage Least-Squares Model: y ~ x2 | z2
#> Iterations: 0
#> Final selection: Outliers found: 10 Outliers proportion: 0.01
# iterating algorithm until convergence
<- outlier_detection(data = data, formula = fml_tsls, ref_dist = "normal",
saturated_conv sign_level = 0.01, initial_est = "saturated", split = 0.5,
iterations = "convergence", convergence_criterion = 0)
print(saturated_conv)
#> Outlier-Robust 2SLS Model
#> Initial estimator: saturated
#> Reference distribution: normal
#> Two-stage Least-Squares Model: y ~ x2 | z2
#> Iterations: 4
#> Final selection: Outliers found: 15 Outliers proportion: 0.015
# which did it find in final selection?
<- which(saturated_conv$type$m4 == 0)
large_detected_4 identical(large_detected_3, large_detected_4)
#> [1] TRUE
```

Saturated 2SLS finds 10 outliers initially, which is exactly what we would expect. Once we iterate, it also finds 15 outliers as Robustified 2SLS did. In fact, the converged Saturated 2SLS classified exactly the same observations as outliers as Robustified 2SLS.

After using the trimmed 2SLS algorithms, we want to do inference on
the structural parameters. Since the procedures exclude observations
with large errors, we would underestimate the error variance. Jiao
(2019) has derived the correction factor for the estimator of the
variance, which is both implemented in the algorithms for the selection
and in the command `beta_inf()`

function, which provides
corrected standard errors. It returns both the original and corrected
standard errors, t-statistics, and p-values. The corrected values are
calculated under the null hypothesis of no outliers in the sample and
are indicated by `H0`

. We look at the results from the
converged Robust 2SLS algorithm. The algorithm converged after 3
iterations, so we can either use the correction factor for iteration 3
or for the fixed point.

```
# final model (iteration 3) without corrected standard errors
summary(robustified_conv$model$m3)$coef
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 1.892708 0.08397759 22.53825 5.226824e-91
#> x2 -0.890967 0.05571580 -15.99128 2.454974e-51
#> attr(,"df")
#> [1] 983
#> attr(,"nobs")
#> [1] 985
# final model with corrected standard errors, m = 3 (subset of output)
beta_inf(robustified_conv, iteration = 3, exact = TRUE)[, 1:5]
#> Estimate Std. Error H0Std. Error t value H0t value
#> (Intercept) 1.892708 0.08397759 0.09057768 22.53825 20.89596
#> x2 -0.890967 0.05571580 0.06009470 -15.99128 -14.82605
# final model with corrected standard errors, m = "fixed point" / "converged" (subset of output)
beta_inf(robustified_conv, iteration = 3, exact = TRUE, fp = TRUE)[, 1:5]
#> Estimate Std. Error H0Std. Error t value H0t value
#> (Intercept) 1.892708 0.08397759 0.09058094 22.53825 20.89521
#> x2 -0.890967 0.05571580 0.06009686 -15.99128 -14.82552
```

We can see that the standard inference leads to smaller standard errors, potentially overstating statistical significance because it ignores the preceding selection. For the corrected inference, there is almost no difference between using the correction factor for the exact iteration 3 or using the correction factor for the fixed point.

Instead of corrected standard errors, the researcher could also use bootstrapped standard errors for inference. This functionality is still under development.

```
# use case re-sampling (to save time, use low iteration m = 1)
<- case_resampling(robustified_conv, R = 1000, m = 1)
resampling <- evaluate_boot(resampling, iterations = 1)
beta_boot <- matrix(beta_boot[, 1:2], ncol = 1, nrow = 2) # show subset, put into column vector
mat colnames(mat) <- "bootStd. Error"
cbind(beta_inf(robustified_conv, iteration = 1, exact = TRUE)[, 1:3], mat)
#> Estimate Std. Error H0Std. Error bootStd. Error
#> (Intercept) 1.899757 0.08448197 0.09065763 0.09817717
#> x2 -0.8922874 0.05611039 0.06021208 0.06326314
```

Jiao (2019) also devises a Hausman-type test for whether the difference between the full sample and trimmed sample coefficient estimates is statistically significant. The test can be used on subsets of the coefficient vector or the whole vector itself.

```
# t-test on a single coefficient
# testing difference in beta_2 (coef on endogenous regressor x2), m = 3
beta_t(robustified_conv, iteration = 3, element = "x2")
#> robust m = 3 full se diff t value Pr(>|z|) Pr(>z) Pr(<z)
#> x2 -0.890967 -0.9316108 0.01746118 2.327666 0.01992983 0.009964916 0.9900351
#> attr(,"type of avar")
#> [1] "iteration m = 3"
# testing difference in beta_2 (coef on endogenous regressor x2), m = "fixed point"
beta_t(robustified_conv, iteration = 3, element = "x2", fp = TRUE)
#> robust m = 3 full se diff t value Pr(>|z|) Pr(>z) Pr(<z)
#> x2 -0.890967 -0.9316108 0.01746862 2.326675 0.01998259 0.009991294 0.9900087
#> attr(,"type of avar")
#> [1] "fixed point"
# Hausman-type test on whole coefficient vector, m = 3
beta_hausman(robustified_conv, iteration = 3)
#> Hausman test p value
#> [1,] 5.475185 0.06472597
#> attr(,"type of avar")
#> [1] "iteration m = 3"
#> attr(,"coefficients")
#> [1] "(Intercept)" "x2"
```

While the difference between \(\widehat{\beta}_{2}^{(0) = full}\) and \(\widehat{\beta}_{2}^{(3)}\) is statistically significant at the 5% significance level using a t-test, the difference is not significant at 5% when testing the whole coefficient vector.

We now create a contaminated data set from the previous data. To ensure that we know where the outliers are, we populate the data with deterministic outliers, i.e. by setting their errors to a large value. We populate 3% of the sample with outliers, i.e. 30 outliers. The location of the outliers is random. The size of their errors is either -3.5 or 3.5, which is also determined randomly.

```
<- data_full
data_full_contaminated <- sample(1:NROW(data_full), size = 0.03*NROW(data_full), replace = FALSE)
outlier_location <- sample(c(-3.5, 3.5), size = length(outlier_location), replace = TRUE)
outlier_size
# replace errors
"u"] <- outlier_size
data_full_contaminated[outlier_location, # replace value of dependent variable
"y"] <- data_full_contaminated[outlier_location, "y"] -
data_full_contaminated[outlier_location, "u"] +
data_full[outlier_location, "u"]
data_full_contaminated[outlier_location,
# extract the data that researcher would actually collect
<- data.frame(data_full_contaminated[, c(1:3, 6)]) data_cont
```

We use the same algorithm setup as before.

```
# not iterating the algorithm
<- outlier_detection(data = data_cont, formula = fml_tsls, ref_dist = "normal",
robustified_0 sign_level = 0.01, initial_est = "robustified", iterations = 0)
print(robustified_0)
#> Outlier-Robust 2SLS Model
#> Initial estimator: robustified
#> Reference distribution: normal
#> Two-stage Least-Squares Model: y ~ x2 | z2
#> Iterations: 0
#> Final selection: Outliers found: 31 Outliers proportion: 0.031
# iterating algorithm until convergence
<- outlier_detection(data = data_cont, formula = fml_tsls, ref_dist = "normal",
robustified_conv sign_level = 0.01, initial_est = "robustified",
iterations = "convergence", convergence_criterion = 0)
print(robustified_conv)
#> Outlier-Robust 2SLS Model
#> Initial estimator: robustified
#> Reference distribution: normal
#> Two-stage Least-Squares Model: y ~ x2 | z2
#> Iterations: 7
#> Final selection: Outliers found: 45 Outliers proportion: 0.045
```

The initial selection finds 31 outliers, close to the 30 deterministic outliers that we put into the model. The algorithm converges after 7 iterations and ultimately classifies 45 observations as outliers. Again, this might well be because even under the normal distribution of the regular errors, we have draws that are beyond the cut-off.

```
# which observations were classified as outliers?
<- which(robustified_conv$type$m0 == 0)
detected_0 <- which(robustified_conv$type$m7 == 0)
detected_7
# overlap between deterministic outliers
sum(outlier_location %in% detected_0) # initial found all 30 (+1 in addition)
#> [1] 30
sum(outlier_location %in% detected_7) # converged found all 30 (+ 15 in addition)
#> [1] 30
```

Let us now compare the coefficient estimates between the contaminated full sample model and the Robustified 2SLS estimates.

```
# full sample model
<- ivreg(formula = fml_tsls, data = data_cont)
tsls_cont $coefficients
tsls_cont#> (Intercept) x2
#> 1.9100942 -0.8984509
# updated estimates after initial classification and removal of outliers
$model$m1$coefficients
robustified_conv#> (Intercept) x2
#> 1.9622405 -0.9373524
# updated estimates after convergence
$model$m7$coefficients
robustified_conv#> (Intercept) x2
#> 1.9083319 -0.9030388
```

In the present setting, the contamination of the model made the full sample estimates of the model slightly worse compared to the estimates based on the non-contaminated sample. The model that applies the algorithm once is closer to the true DGP values of \(\beta^{\prime} = \begin{pmatrix} 2 & -1 \end{pmatrix}\) while the converged model performs similarly to the full sample estimates.

We now also see a clearer difference between the corrected and the uncorrected standard errors. The corrected standard errors are close to the bootstrap standard error.

```
# use case re-sampling (to save time, use low iteration m = 1)
<- case_resampling(robustified_conv, R = 1000, m = 1)
resampling <- evaluate_boot(resampling, iterations = 1)
beta_boot <- matrix(beta_boot[, 1:2], ncol = 1, nrow = 2) # show subset, put into column vector
mat colnames(mat) <- "bootStd. Error"
cbind(beta_inf(robustified_conv, iteration = 1, exact = TRUE)[, 1:3], mat)
#> Estimate Std. Error H0Std. Error bootStd. Error
#> (Intercept) 1.962241 0.09142263 0.09720696 0.09693482
#> x2 -0.9373524 0.06086466 0.06471558 0.06556386
```

Finally, we also check the performance of the Saturated 2SLS algorithm.

```
# not iterating the algorithm
<- outlier_detection(data = data_cont, formula = fml_tsls, ref_dist = "normal",
saturated_0 sign_level = 0.01, initial_est = "saturated", iterations = 0,
split = 0.5)
print(saturated_0)
#> Outlier-Robust 2SLS Model
#> Initial estimator: saturated
#> Reference distribution: normal
#> Two-stage Least-Squares Model: y ~ x2 | z2
#> Iterations: 0
#> Final selection: Outliers found: 32 Outliers proportion: 0.032
# iterating algorithm until convergence
<- outlier_detection(data = data_cont, formula = fml_tsls, ref_dist = "normal",
saturated_conv sign_level = 0.01, initial_est = "saturated", split = 0.5,
iterations = "convergence", convergence_criterion = 0)
print(saturated_conv)
#> Outlier-Robust 2SLS Model
#> Initial estimator: saturated
#> Reference distribution: normal
#> Two-stage Least-Squares Model: y ~ x2 | z2
#> Iterations: 7
#> Final selection: Outliers found: 45 Outliers proportion: 0.045
# which did it find in final selection?
<- which(saturated_conv$type$m7 == 0)
detected_7_sat identical(detected_7, detected_7_sat)
#> [1] TRUE
```

Saturated 2SLS leads to a similar outcome. In the initial step, one more observation is classified as an outlier compared to Robust 2SLS but their converged classification is again the same.