In this vignette, we illustrate the use of the sequential quadratic
programming (SQP) algorithm implemented in `mixsqp`, and we compare
its runtime and accuracy against an interior-point (IP) solver
implemented by the MOSEK commercial software (it is called by
the “KWDual” function in the REBayes package).

If you do not have the Rmosek and REBayes packages installed on your computer, you may skip over these steps in the vignette.

Load the `mixsqp` package.

```
library(mixsqp)
```

Next, initialize the sequence of pseudorandom numbers.

```
set.seed(1)
```

We begin with a small example to show how `mixsqp` works.

```
L <- simulatemixdata(1000,20)$L
dim(L)
# [1] 1000 20
```

This call to `simulatemixdata` created an \(n \times m\) conditional
likelihood matrix for a mixture of zero-centered normals, with \(n =
1000\) and \(m = 20\). By default, `simulatemixdata` normalizes the rows
of the likelihood matrix so that the maximum entry in each row is 1.
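To make the structure of this matrix concrete, here is a minimal base-R sketch of how such a likelihood matrix could be built by hand. The data vector `x` and the grid of standard deviations `s` are illustrative assumptions; `simulatemixdata` may simulate the data and choose the grid differently.

```r
# Sketch only: build an n x m likelihood matrix for zero-centered normals.
# The data x and the grid s are hypothetical choices, not necessarily
# what simulatemixdata uses internally.
set.seed(1)
n <- 1000
m <- 20
x <- rnorm(n)                        # hypothetical data points
s <- 10^seq(-2, 1, length.out = m)   # hypothetical grid of standard deviations

# Entry (j,k) is the density of x[j] under a normal with mean 0 and sd s[k].
L <- outer(x, s, function(x, s) dnorm(x, mean = 0, sd = s))

# Scale each row so that its largest entry is 1.
L <- L / apply(L, 1, max)

dim(L)
range(apply(L, 1, max))
```

Scaling a row by a positive constant only shifts the log-likelihood by a constant, so this normalization does not change which mixture weights are optimal.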

Now we fit the mixture model using the SQP algorithm:

```
fit.sqp <- mixsqp(L)
# Running mix-SQP algorithm 0.1-97 on 1000 x 20 matrix
# convergence tol. (SQP): 1.0e-08
# conv. tol. (active-set): 1.0e-10
# zero threshold (solution): 1.0e-06
# zero thresh. (search dir.): 1.0e-08
# l.s. sufficient decrease: 1.0e-02
# step size reduction factor: 5.0e-01
# minimum step size: 1.0e-04
# max. iter (SQP): 1000
# max. iter (active-set): 21
# iter objective max(rdual) nnz stepsize max.diff nqp nls
# 1 +7.434938010e-01 +2.102e-01 20 NA NA NA NA
# 2 +7.117754572e-01 +2.699e+00 3 1.00e+00 4.50e-01 21 1
# 3 +6.626763592e-01 +1.122e+00 4 1.00e+00 5.00e-01 7 1
# 4 +6.457951080e-01 +4.098e-01 5 1.00e+00 2.42e-01 10 1
# 5 +6.360610080e-01 +1.208e-01 4 1.00e+00 3.45e-02 3 1
# 6 +6.319543790e-01 +2.485e-02 5 1.00e+00 1.98e-01 4 1
# 7 +6.288012408e-01 +3.648e-03 4 1.00e+00 8.65e-02 3 1
# 8 +6.282103967e-01 +6.952e-05 4 1.00e+00 8.95e-03 2 1
# 9 +6.281978283e-01 -1.570e-08 4 1.00e+00 1.40e-04 2 1
# Convergence criteria met---optimal solution found.
```

In this example, the SQP algorithm converged to a solution in a small number of iterations; the progress table above ends at iteration 9.

By default, `mixsqp` outputs information on its progress. It begins by
summarizing the optimization problem and the algorithm settings used.
(Since we did not change these settings in the `mixsqp` call, all the
settings shown here are the default settings.)

After that, it outputs, at each iteration, information about the current solution, such as the value of the objective (“objective”) and the number of nonzeros (“nnz”).
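As a rough illustration of these two quantities, the sketch below evaluates an objective and a nonzero count for a tiny made-up likelihood matrix. It assumes the objective is the negative average log-likelihood, \(f(x) = -\frac{1}{n} \sum_j \log \sum_k L_{jk} x_k\); the helper `mix_objective` is hypothetical and not part of `mixsqp`, and counting nonzeros against the 1e-6 "zero threshold (solution)" setting is likewise an assumption.

```r
# Hypothetical helper (not a mixsqp function): negative average
# log-likelihood of mixture weights x given likelihood matrix L.
mix_objective <- function(L, x) {
  -mean(log(drop(L %*% x)))
}

# Tiny made-up likelihood matrix: 3 samples, 2 mixture components.
L <- rbind(c(1.0, 0.5),
           c(0.2, 1.0),
           c(1.0, 1.0))
x <- c(0.5, 0.5)       # uniform mixture weights

f   <- mix_objective(L, x)
nnz <- sum(x > 1e-6)   # assumed way of counting "nonzero" weights
print(c(objective = f, nnz = nnz))
```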

The “max(rdual)” column shows the quantity used to assess convergence.
It reports the maximum value of the “dual residual”; the SQP solver
terminates when the maximum dual residual is less than `conv.tol`,
which by default is \(10^{-8}\). In this example, we see that the dual
residual shrinks rapidly toward zero.
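The exact definition of the dual residual is not spelled out here, so the following is only a sketch of one natural choice. It assumes the objective is the negative average log-likelihood minimized over the probability simplex; for that problem the optimality conditions require every entry of \(g + 1\) to be nonnegative, where \(g\) is the gradient, so \(\max_k -(g_k + 1)\) measures the violation. The helper `max_dual_residual` is hypothetical, and the value `mixsqp` reports may be computed differently.

```r
# Hypothetical helper (not a mixsqp function). Assumes the objective
# f(x) = -mean(log(L %*% x)) with x constrained to the simplex.
max_dual_residual <- function(L, x) {
  n <- nrow(L)
  g <- -drop(crossprod(L, 1 / drop(L %*% x))) / n   # gradient of f at x
  max(-(g + 1))                                     # largest violation of g + 1 >= 0
}

# Symmetric toy problem whose solution is x = (0.5, 0.5).
L <- rbind(c(1.0, 0.5),
           c(0.5, 1.0))

r_opt <- max_dual_residual(L, c(0.5, 0.5))   # at the solution
r_off <- max_dual_residual(L, c(0.9, 0.1))   # away from the solution
print(c(r_opt = r_opt, r_off = r_off))
```

In this toy problem the residual is zero (up to rounding) at the solution and positive away from it, which is the behavior a convergence test of this kind relies on.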

Another useful indicator of convergence is the “max.diff” column, which reports the maximum difference between the solution estimates at two successive iterations. We normally expect these differences to shrink as we approach the solution, which is what we see in the final iterations of this example.
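To show how a quantity like “max.diff” behaves, the sketch below tracks it over a few iterations of a simple EM update for the mixture weights. EM is used here only because it fits in a few lines; it is not the SQP algorithm, and the helper `em_update` and the small matrix `L` are made up for illustration.

```r
# Hypothetical helper: one EM update of the mixture weights.
# This is EM, not an SQP step.
em_update <- function(L, x) {
  x * drop(crossprod(L, 1 / drop(L %*% x))) / nrow(L)
}

# Small made-up likelihood matrix: 4 samples, 2 mixture components.
L <- rbind(c(1.0, 0.2),
           c(0.9, 0.3),
           c(0.1, 1.0),
           c(1.0, 0.1))

x <- c(0.5, 0.5)
maxdiff <- numeric(20)
for (i in 1:20) {
  xnew <- em_update(L, x)
  maxdiff[i] <- max(abs(xnew - x))   # largest change in any mixture weight
  x <- xnew
}
print(signif(maxdiff, 3))
```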

This information is also provided in the return value, which we can use, for example, to create a plot of the objective value at each iteration of the SQP algorithm:

```
numiter <- nrow(fit.sqp$progress)
plot(1:numiter,fit.sqp$progress$objective,type = "b",
     pch = 20,lwd = 2,xlab = "SQP iteration",
     ylab = "objective",xaxp = c(1,numiter,numiter - 1))
```
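A variation that can make the rate of convergence easier to see is to plot the gap between each objective value and the best value reached, on a logarithmic scale. The sketch below uses the objective values copied from the progress output printed earlier so that it runs without a fitted model; with a fit in hand, `fit.sqp$progress$objective` could be used in their place.

```r
# Objective values copied from the progress output shown earlier.
objective <- c(0.7434938010, 0.7117754572, 0.6626763592,
               0.6457951080, 0.6360610080, 0.6319543790,
               0.6288012408, 0.6282103967, 0.6281978283)

# Gap to the best objective value reached.
gap <- objective - min(objective)

# The final gap is exactly zero, which cannot be shown on a log
# scale, so the last iteration is left out of the plot.
keep <- seq_len(length(gap) - 1)
plot(keep, gap[keep], log = "y", type = "b", pch = 20, lwd = 2,
     xlab = "SQP iteration", ylab = "objective minus best objective")
```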

To assess the accuracy of the SQP solution, we can compare against the solution computed by the IP algorithm. (If you do not have the REBayes package installed, you can skip this step.)

```
fit.ip <- mixkwdual(L)
```

If you run the IP algorithm, you should see that the IP and SQP solutions achieve nearly the same objective value.

```
cat(sprintf("Objective at SQP solution: %0.16f\n",fit.sqp$value))
cat(sprintf("Objective at IP solution: %0.16f\n",fit.ip$value))
cat(sprintf("Difference in objectives: %0.4e\n",fit.sqp$value - fit.ip$value))
# Objective at SQP solution: 0.6281978495824518
# Objective at IP solution: 0.6281978456083795
# Difference in objectives: 3.9741e-09
```

We observed that the SQP and IP methods achieve nearly the same solution quality in the example above. Here, we explore the computational properties of the SQP and IP algorithms in a larger data set.

As before, we compute the \(n \times m\) likelihood matrix for a mixture of zero-centered normals. This time, we use a finer grid of \(m = 100\) normal densities, as well as many more samples.

```
L <- simulatemixdata(1e5,100)$L
dim(L)
# [1] 100000 100
```

Now we fit the model using the SQP algorithm:

```
timing <- system.time(fit.sqp <- mixsqp(L))
cat(sprintf("Computation took %0.2f seconds\n",timing["elapsed"]))
# Running mix-SQP algorithm 0.1-97 on 100000 x 100 matrix
# convergence tol. (SQP): 1.0e-08
# conv. tol. (active-set): 1.0e-10
# zero threshold (solution): 1.0e-06
# zero thresh. (search dir.): 1.0e-08
# l.s. sufficient decrease: 1.0e-02
# step size reduction factor: 5.0e-01
# minimum step size: 1.0e-04
# max. iter (SQP): 1000
# max. iter (active-set): 101
# iter objective max(rdual) nnz stepsize max.diff nqp nls
# 1 +7.790692288e-01 +2.447e-01 100 NA NA NA NA
# 2 +6.927578980e-01 +2.219e+01 4 1.00e+00 3.62e-01 101 1
# 3 +6.371645565e-01 +9.824e+00 7 1.00e+00 4.91e-01 47 1
# 4 +6.233189089e-01 +4.545e+00 4 1.00e+00 2.07e-01 14 1
# 5 +6.177060975e-01 +1.251e+00 5 1.00e+00 2.97e-01 13 1
# 6 +6.160189730e-01 +2.574e-01 5 1.00e+00 2.73e-01 14 1
# 7 +6.152400931e-01 +4.100e-02 5 1.00e+00 2.19e-01 35 1
# 8 +6.150662462e-01 +4.111e-03 7 1.00e+00 1.02e-01 24 1
# 9 +6.150526241e-01 +2.241e-04 7 1.00e+00 2.80e-03 2 1
# 10 +6.150522124e-01 +9.288e-07 7 1.00e+00 9.53e-04 2 1
# 11 +6.150522078e-01 -1.543e-08 7 1.00e+00 1.14e-05 2 1
# Convergence criteria met---optimal solution found.
# Computation took 3.31 seconds
```

If you have the REBayes package, you can also run the IP method:

```
timing <- system.time(fit.ip <- mixkwdual(L))
cat(sprintf("Computation took %0.2f seconds\n",timing["elapsed"]))
# Computation took 21.16 seconds
```

If you run the IP algorithm, you should find that the SQP algorithm was considerably faster than the IP solver (3.31 versus 21.16 seconds in the run shown here, roughly a sixfold difference), and that it converged to a solution with nearly the same objective value as the IP solution.

```
cat(sprintf("Objective at SQP solution: %0.16f\n",fit.sqp$value))
cat(sprintf("Objective at IP solution: %0.16f\n",fit.ip$value))
cat(sprintf("Difference in objectives: %0.4e\n",fit.sqp$value - fit.ip$value))
# Objective at SQP solution: 0.6150522313978172
# Objective at IP solution: 0.6150522313757819
# Difference in objectives: 2.2035e-11
```

This next code chunk gives information about the computing environment used to generate the results contained in this vignette, including the version of R and the packages used.

```
sessionInfo()
# R version 3.5.1 (2018-07-02)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Scientific Linux 7.4 (Nitrogen)
#
# Matrix products: default
# BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so
#
# locale:
# [1] C
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] mixsqp_0.1-97
#
# loaded via a namespace (and not attached):
# [1] compiler_3.5.1 Matrix_1.2-15 magrittr_1.5 tools_3.5.1
# [5] Rcpp_1.0.0 Rmosek_8.0.69 stringi_1.2.4 highr_0.7
# [9] grid_3.5.1 REBayes_1.4 knitr_1.20 stringr_1.3.1
# [13] lattice_0.20-38 evaluate_0.12
```