---
title: "OMPR Demo"
output: html_notebook
author: Paul A. Rubin
date: November 9, 2016
---

# Abstract

This is a demonstration of Dirk Schumacher's `ompr` package (https://dirkschumacher.github.io/ompr/). We will exercise it on the MIP discriminant model from an old paper of mine [1], applied to Anderson's iris data. Specifically, we'll try to discriminate between each individual species (setosa, versicolor or virginica) and the other two, using a linear discriminant function that minimizes the number of misclassifications on the overall sample.

For general amusement, we'll also look at how a support vector machine does on the same three classification problems. This is not intended to be a test of MIP classification v. other methods, though, and we won't get into separate training and testing samples or k-fold cross-validation. The focus of this notebook is to exercise the `ompr` package (using CPLEX 12.6.3 as the solver) on a (somewhat) real-world statistical optimization problem.

Before starting, I'll just note one minor glitch. As of this writing, you cannot name a variable "c" in your optimization model. Dirk has been very responsive in checking into issues with `ompr`, but this one traces back to a library that `ompr` uses, so while Dirk has reported it, he cannot fix it himself. So far, "c" seems to be the only variable name that causes any trouble.

# About this document

This document is the HTML output from an [R notebook](http://rmarkdown.rstudio.com/r_notebooks.html). The control in the upper right will let you show or hide code (although, in this case, the code is pretty much the point of the notebook). It will also allow you to extract and save the notebook (R source file) itself.

# Solver

As mentioned, the solver used here is CPLEX 12.6.3. One of the virtues of `ompr` (and the `ROI` package it depends on) is that, armed with appropriate plug-ins, you can switch to alternative solvers. The connection to CPLEX is handled by the `ROI.plugin.cplex` package, loaded below. What is not immediately visible is that, in addition to CPLEX itself, this package also depends on the `Rcplex` library.

# Preparation

We will use the 1-norm of the observations to set some model parameters, so we start by defining that.

```{r}
norm.1 <- function(x) sum(abs(x))
```
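
As a quick sanity check (my example, not part of the original analysis), the 1-norm of the first iris observation's four measurements is just the sum of their absolute values:

```{r}
# |5.1| + |3.5| + |1.4| + |0.2| = 10.2
norm.1(as.numeric(iris[1, 1:4]))
```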

We will arbitrarily set a minimum absolute discriminant value (delta) of 0.01 for a correct classification (so that discriminant values between -0.01 and +0.01 will be considered inconclusive).

```{r}
delta <- 0.01
```

Before creating any models, we need to load the required libraries.

```{r}
library(ROI)               # general solver interface
library(ROI.plugin.cplex)  # connects the ROI package to CPLEX
library(ompr)              # MIP modeling package
library(ompr.roi)          # connects OMPR to ROI
library(magrittr)          # for the pipe operator
library(kernlab)           # for SVMs
```
```
ROI.plugin.cplex: R Optimization Infrastructure
Registered solver plugins: nlminb, cplex.
Default solver: auto.
```

# Function definitions

### Partitioning the data into positive/negative samples

We will be partitioning the original data set into positive and negative samples more than once, so we create a function to do that. The function takes as its sole argument the name (“pos”) of the species in the positive sample and returns a named list of samples. The (now redundant) “Species” column is removed from each subsample.

```{r}
partition <- function(pos) {
  list(positive = subset(iris, subset = (Species == pos), select = -Species),
       negative = subset(iris, subset = (Species != pos), select = -Species)
  )
}
```

### Building and running MIP discriminant models

Since we want to generate several different discriminant functions, we will create a function that takes our list of subsamples, uses it to generate a MIP model, solves the model and returns the result in a list.

Variable names generally conform to the 1990 paper. The variables in the model have ".var" appended to distinguish them from the R variables containing their values (e.g., `w.var` is the discriminant coefficient vector in the model, while `w` is the R variable receiving the values of `w.var` in the solution).
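
In symbols, the model the code below builds is as follows (my transcription of the code, where $x_i$ is the feature vector of observation $i$, and $N$ and $P$ index the negative and positive samples):

$$
\begin{aligned}
\min\ & \sum_{i \in N} y_i + \sum_{i \in P} z_i - \epsilon d \\
\text{s.t.}\ & w'x_i + c + d \le M_1 y_i, && i \in N \\
& w'x_i + c - d \ge -M_2 z_i, && i \in P \\
& -1 \le w_j \le 1, \quad d \ge \delta, \quad c \text{ free}, \quad y_i, z_i \in \{0, 1\}.
\end{aligned}
$$

A misclassified negative observation forces $y_i = 1$ (and similarly $z_i = 1$ for a misclassified positive one), so the objective counts misclassifications, less a small reward ($\epsilon d$) for pushing correctly classified observations away from the boundary.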

```{r}
mipDiscriminant <- function(data) {
  positive <- data$positive
  negative <- data$negative
  #
  # We need the maximum 1-norm of any observation in each sample.
  #
  Delta1 <- max(apply(negative, 1, norm.1))
  Delta2 <- max(apply(positive, 1, norm.1))
  Deltabar <- (Delta1 + Delta2) / 2
  #
  # Those values are used to compute a "big M" coefficient for each sample.
  #
  M1 <- 2 * Delta1 + Deltabar  # big M for negative observations
  M2 <- 2 * Delta2 + Deltabar  # big M for positive observations
  #
  # We also need a coefficient for a term in the objective function
  # that rewards greater separation between the two groups.
  #
  epsilon <- 1 / (2 * Deltabar)
  #
  # Next, we compute the dimensions of the data.
  #
  npos <- nrow(positive)  # size of positive sample
  nneg <- nrow(negative)  # size of negative sample
  nvar <- ncol(positive)  # number of features
  #
  # Now we can build the model.
  #
  model <-
    MIPModel() %>% 
    add_variable(y.var[i], i = 1:nneg, type = "binary") %>%   # indicators for negative misclassifications
    add_variable(z.var[i], i = 1:npos, type = "binary") %>%   # indicators for positive misclassifications
    add_variable(w.var[i], i = 1:nvar, type = "continuous", lb = -1, ub = 1) %>%  # coefficients of discriminant function
    add_variable(c.var, type = "continuous") %>%   # constant term of discriminant function (unbounded)
    add_variable(d.var, type = "continuous", lb = delta) %>%   # minimum absolute score of any successfully classified observation (bounded away from zero)
    set_objective(sum_expr(y.var[i], i = 1:nneg) + sum_expr(z.var[i], i = 1:npos) - epsilon * d.var, sense = "min") %>%   # minimize number of misclassifications with credit for higher spacing from the classification boundary
    add_constraint(sum_expr(negative[i, j] * w.var[j], j = 1:nvar) + c.var + d.var - M1 * y.var[i] <= 0, i = 1:nneg) %>%   # determine negative misclassifications
    add_constraint(sum_expr(positive[i, j] * w.var[j], j = 1:nvar) + c.var - d.var + M2 * z.var[i] >= 0, i = 1:npos)  # determine positive misclassifications
  #
  # Finally, we solve the model and return the result.
  #
  model %>% solve_model(with_ROI(solver = "cplex"))
}
```

### Creating discriminant functions

We will also want a function that takes the solution to our MIP model and turns it into a linear discriminant function, so that we can test that function directly on the data.

```{r}
extractFunction <- function(mipResult) {
  a <- (mipResult %>% get_solution(w.var[j]))[, "value"]
  a0 <- mipResult %>% get_solution(c.var)
  f <- function(x) as.numeric(a %*% as.numeric(x) + a0)
  f
}
```

# Testing the MIP discriminant model

### Setosa v. Versicolor/Virginica

Let's start by trying to discriminate between setosa ("positive") and versicolor/virginica ("negative"). The `sink()` function is used to avoid cluttering the page with a rather lengthy progress report from the solver. (The "CPLEX environment" messages below escape `sink()` because they are written to the message stream rather than to standard output.)

```{r}
samples <- partition("setosa")
sink("/dev/null")
result <- mipDiscriminant(samples)
sink()
```
```
CPLEX environment opened
Closed CPLEX environment
```

The first thing to do is check whether the model solved successfully.

```{r}
cat("Status = ", result$status)
```
```
Status =  optimal
```

Next, we can take a look at the objective value.

```{r}
cat("Objective value = ", result$objective_value)
```
```
Objective value =  -0.04166667
```

Values for scalar model variables are returned as numeric vectors of length 1.

```{r}
c <- result %>% get_solution(c.var)
c
d <- result %>% get_solution(d.var)
d
```
```
c.var 
 3.65 
d.var 
 1.35 
```

Values for indexed model variables are returned in data frames.

```{r}
w <- result %>% get_solution(w.var[j])
y <- result %>% get_solution(y.var[i])
z <- result %>% get_solution(z.var[i])
w
```

Let's see, based on the y and z indicators, how many observations the model thinks it misclassified. (The 0.1 cutoff is a guard against solver round-off: the "binary" values may come back as floating-point numbers close to, but not exactly, 0 or 1.)

```{r}
cat(sprintf("Negative misclassifications: %d\n", nrow(y[y$value >= 0.1, ])))
cat(sprintf("Positive misclassifications: %d\n", nrow(z[z$value >= 0.1, ])))
```
```
Negative misclassifications: 0
Positive misclassifications: 0
```

We allegedly have no errors (100% correct classification for this split). Just to be sure, let’s extract the discriminant function and apply it directly to the data, finding its smallest score on the positive sample (hopefully > 0) and its largest score on the negative sample (hopefully < 0).

```{r}
dFunc <- extractFunction(result)
cat("Minimum score for positive sample: ", min(apply(samples$positive, 1, dFunc)))
cat("\nMaximum score for negative sample: ", max(apply(samples$negative, 1, dFunc)))
```
```
Minimum score for positive sample:  1.35
Maximum score for negative sample:  -1.35
```

So all scores are correct (as expected), and the closest any observation comes to the “boundary” score of 0 is 1.35 (which is the value of the d variable in the model).
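
As a consistency check (the arithmetic here is mine, not in the original notebook), with zero misclassifications the objective reduces to $-\epsilon d$. The largest 1-norms are $\Delta_1 = 20.4$ (negative sample) and $\Delta_2 = 12.0$ (positive sample), so $\bar{\Delta} = 16.2$, $\epsilon = 1/32.4$, and

$$
-\epsilon d = -\frac{1.35}{32.4} \approx -0.04166667,
$$

matching the reported objective value.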

For completeness, since we will be doing it below, we create a “confusion matrix” comparing predictions to actual values.

```{r}
predicted <- as.factor(ifelse(apply(iris[, 1:4], 1, dFunc) > 0, "setosa", "other"))  # predicted class
table(predicted, iris$Species)
```
```
predicted setosa versicolor virginica
   other       0         50        50
   setosa     50          0         0
```

### Versicolor v. Setosa/Virginica

Next, we repeat the experiment, trying to distinguish versicolor from setosa and virginica.

```{r}
samples <- partition("versicolor")
sink("/dev/null")
result2 <- mipDiscriminant(samples)
sink()
cat("Status = ", result2$status)
cat("\nObjective value = ", result2$objective_value)
```
```
CPLEX environment opened
Closed CPLEX environment
Status =  optimal
Objective value =  25.99967
```

This takes considerably longer to solve, because 100% accuracy is not attainable (so the MIP solver has to do a fair bit of branching). The good news is that we have an optimal solution; the bad news is that we apparently misclassified 26 observations (the objective value is 26 less the small separation bonus $\epsilon d$). We can break that down by group.

```{r}
dFunc <- extractFunction(result2)
predicted <- as.factor(ifelse(apply(iris[, 1:4], 1, dFunc) > 0, "versicolor", "other"))  # predicted class
table(predicted, iris$Species)
```
```
predicted    setosa versicolor virginica
  other          50          8        32
  versicolor      0         42        18
```

The best solution misclassifies 8 versicolor (positive) specimens and 18 specimens from the other two species (negative). We can easily check which they are.

```{r}
y <- result2 %>% get_solution(y.var[i])
z <- result2 %>% get_solution(z.var[i])
e1 <- which(y$value > 0.5)
e2 <- which(z$value > 0.5)
cat("Negative:\n")
e1
cat("Positive:\n")
e2
```
```
Negative:
 [1] 53 54 56 58 59 62 67 68 69 70 73 76 80 81 82 84 85
[18] 88
Positive:
[1]  2  7 10 12 15 21 36 49
```

We can take a look at the discriminant values for those observations, just to confirm what we are seeing.

```{r}
cat("Negative:\n")
apply(samples$negative[e1, ], 1, dFunc)
cat("\nPositive:\n")
apply(samples$positive[e2, ], 1, dFunc)
```
```
Negative:
       103        104        106        108        109 
0.03900046 0.24529895 0.35739388 0.55365130 0.45506618 
       112        117        118        119        120 
0.06451392 0.16337289 0.05438156 0.42010497 0.48674121 
       123        126        130        131        132 
0.56428571 0.32193063 0.49007303 0.38879507 0.10960749 
       134        135        138 
0.34169329 0.74228663 0.13439069 

Positive:
         52          57          60          62 
-0.06579188 -0.10093565 -0.06825650 -0.13452761 
         65          71          86          99 
-0.17902784 -0.21499315 -0.21973984 -0.12654039 
```

### Virginica v. Setosa/Versicolor

Finally, let’s see if we can discriminate between virginica and setosa/versicolor.

```{r}
samples <- partition("virginica")
sink("/dev/null")
result3 <- mipDiscriminant(samples)
sink()
cat("Status = ", result3$status)
cat("\nObjective value = ", result3$objective_value)
```
```
CPLEX environment opened
Closed CPLEX environment
Status =  optimal
Objective value =  0.9991571
```

It looks as though we got away with just one misclassification.

```{r}
dFunc <- extractFunction(result3)
predicted <- as.factor(ifelse(apply(iris[, 1:4], 1, dFunc) > 0, "virginica", "other"))  # predicted class
table(predicted, iris$Species)
# one of the negative samples was classified positive - which?
y <- result3 %>% get_solution(y.var[i])
e1 <- which(y$value > 0.5)
cat("Error in observation ", e1)
```
```
predicted   setosa versicolor virginica
  other         50         49         0
  virginica      0          1        50
Error in observation  84
```

So observation 84 from the negative sample, which happens to be a versicolor specimen, seems to be the only one misclassified. Let’s check its discriminant score.

```{r}
dFunc <- extractFunction(result3)
apply(samples$negative[e1, ], 1, dFunc)
```
```
       84 
0.2365741 
```

It is indeed a positive (incorrect) score.

# Classification using an SVM

Just for fun, let's try using a classification support vector machine (SVM), via the `kernlab` package. To keep things simple, we will leave all options at default settings.

### Setosa v. Versicolor/Virginica

First, let’s try to discriminate setosa from the others.

```{r}
class <- factor(iris$Species == "setosa", labels = c("negative", "positive"))
svm <- ksvm(class ~ . - Species, data = iris)
table(predict(svm, iris), iris$Species)
```
```
           setosa versicolor virginica
  negative      0         50        50
  positive     50          0         0
```

The SVM matches the 100% accuracy of the MIP model on setosa.

### Versicolor v. Setosa/Virginica

```{r}
class <- factor(iris$Species == "versicolor", labels = c("negative", "positive"))
svm <- ksvm(class ~ . - Species, data = iris)
table(predict(svm, iris), iris$Species)
```
```
           setosa versicolor virginica
  negative     50          1        48
  positive      0         49         2
```

Attempting to distinguish versicolor from the other two species, the SVM model misclassifies one or two versicolor instances and one or two virginica instances, much better than the MIP model did. (This is due to the SVM not being restricted to a linear discriminant function.) The reason I say "one or two", rather than citing exactly what you see in the confusion matrix, is that the SVM fitting algorithm appears to involve some randomization. If you keep refitting the model, the results periodically change. You can stabilize the results by setting a random seed before calling `ksvm()`, as sketched below, but which results you get will depend on how lucky you are picking the seed.
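
For instance, here is a minimal sketch of a reproducible fit (the seed value is arbitrary, and different seeds may yield slightly different confusion matrices):

```{r}
set.seed(123)  # arbitrary seed, chosen for illustration only
class <- factor(iris$Species == "versicolor", labels = c("negative", "positive"))
svm <- ksvm(class ~ . - Species, data = iris)
table(predict(svm, iris), iris$Species)
```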

### Virginica v. Setosa/Versicolor

```{r}
class <- factor(iris$Species == "virginica", labels = c("negative", "positive"))
svm <- ksvm(class ~ . - Species, data = iris)
table(predict(svm, iris), iris$Species)
```
```
           setosa versicolor virginica
  negative     50         48         2
  positive      0          2        48
```

The SVM has a total of three or four errors when trying to discriminate between virginica and the other two species, worse than the single error the MIP model achieved. Again, the total error count varies depending on the random seed used.

# Reference

[1] Rubin, P. A. Heuristic solution procedures for a mixed-integer programming discriminant model. *Managerial and Decision Economics*, Vol. 11, No. 4, October 1990.
