This notebook looks at an assignment problem posed on OR Stack Exchange. We keep the same notation used in the post where possible. The problem involves assigning users to providers (“service points” in the original question – “users” and “service points” was subsequently edited out) subject to upper bounds on how many users a provider handles and how many service providers a user gets, along with a requirement that every user be served by at least one provider. The author of the question provided the following integer programming formulation: \[\begin{align*}
\max_{d_{u,c}} & \sum_{u=1}^{U}\sum_{c=1}^{C}\omega_{u,c}d_{u,c}\\
\text{s.t. } & \sum_{c=1}^{C}d_{u,c}\le C_{\max}\ \forall u\in\left\{ 1,\dots,U\right\} \\
& \sum_{c=1}^{C}d_{u,c}\ge1\ \forall u\in\left\{ 1,\dots,U\right\} \\
& \sum_{u=1}^{U}d_{u,c}\le U_{\max}\ \forall c\in\left\{ 1,\dots,C\right\} \\
& d_{u,c}\in\left\{ 0,1\right\} \ \forall u,c.
\end{align*}\]Everything other than \(d_{u,c}\) is a parameter, and I have replaced specific numeric limits with parameters \(U_\max\) (maximum number of users a provider can take) and \(C_\max\) (maximum number of providers a user can take).
An answer posted on OR SE asserted that the constraint matrix has the “integrality property”, which means that the LP relaxation of the problem produces an integer solution. I do not have a proof for that, but in experiments the assertion appears to hold up. In this notebook, we will experiment with solving the problem via Lagrangean relaxation (LR). This is feasible only because the TUM property means that there is no duality gap, so the LR optimum should coincide with the optimum of the original problem.
In preparation we load several libraries.
library(magrittr)
# Libraries used to find the optimal solution via LP.
library(ompr)
Registered S3 method overwritten by 'data.table':
method from
print.data.table
library(ompr.roi)
library(ROI)
ROI: R Optimization Infrastructure
Registered solver plugins: nlminb, cplex.
Default solver: auto.
library(ROI.plugin.cplex)
# Library used to find the LR solution via gradient-free optimization.
library(dfoptim)
Problem
We need to create a problem instance. First we set the problem dimensions.
U <- 10 # number of users
C <- 5 # number of providers
Umax <- 4 # maximum number of users any provider can serve
Cmax <- 3 # maximum number of providers that can serve any one user
Note that the maximum number of possible assignments is \(N = \min(U * C_\max, C * U_\max)\). If you experiment with larger problem sizes (larger U
and C
), be sure to adjust Umax
so that Umax * C >= U
.
cat ("Maximum number of assignments = ", min(U * Cmax, C * Umax), "\n")
Maximum number of assignments = 20
Next, we set a random seed and then randomly generate service quality coefficients (\(\omega\)).
set.seed(20210214)
omega <- matrix(runif(U * C), nrow = U, ncol = C)
We will also create a function that takes as input a matrix of assignments (users on rows, providers on columns), calculates the correct objective value, and confirms that the solution is feasible.
report <- function(x) {
# Calculate the objective value.
cat("Objective value = ", sum(omega * x), "\n")
# Confirm constraint satisfaction.
rsums <- apply(x, 1, sum)
csums <- apply(x, 2, sum)
cat("Minimum # of providers for any user = ", min(rsums), "\n")
cat("Maximum # of providers for any user = ", max(rsums), " (limit = ", Cmax, ")\n")
cat("Maximum # of users for any provider = ", max(csums), " (limit = ", Umax, ")\n")
cat("Total number of assignments made = ", sum(x), "\n")
}
Optimal Solution
To find the optimal assignments, we solve the LP relaxation of the binary integer program specified in the post, relying on the integrality property of the constraint matrix to get a binary solution.
model <-
MIPModel() %>%
add_variable(d[u, c], u=1:U, c=1:C, type = "continuous", lb = 0, ub = 1) %>%
set_objective(sum_expr(omega[u, c] * d[u, c], u = 1:U, c = 1:C)) %>%
add_constraint(sum_expr(d[u, c], c = 1:C) <= Cmax, u = 1:U) %>%
add_constraint(sum_expr(d[u, c], c = 1:C) >= 1, u = 1:U) %>%
add_constraint(sum_expr(d[u, c], u = 1:U) <= Umax, c = 1:C)
system.time(
opt.sol <- model %>% solve_model(with_ROI(solver = "cplex"))
)
CPLEX environment opened
Closed CPLEX environment
user system elapsed
0.128 0.011 0.330
cat("Final solver status = ", opt.sol$status, "\n")
Final solver status = optimal
report(matrix(opt.sol$solution, nrow = U, ncol = C, byrow = FALSE))
Objective value = 15.03994
Minimum # of providers for any user = 1
Maximum # of providers for any user = 3 (limit = 3 )
Maximum # of users for any provider = 4 (limit = 4 )
Total number of assignments made = 20
Lagrangean Relaxation
To use Lagrangean relaxation, we treat \(d\) as binary and relax all the constraints, obtaining the following problem: \[\min_{\lambda,\mu,\nu\ge0}LR(\lambda,\mu,\nu)=\\\max_{d\in\left\{ 0,1\right\} ^{U\times C}}\left(\sum_{u}\sum_{c}\omega_{u,c}d_{u,c}-\sum_{u}\lambda_{u}\left[\sum_{c}d_{u,c}-C_{\max}\right]\\+\sum_{u}\mu_{u}\left[\sum_{c}d_{u,c}-1\right]-\sum_{c}\nu_{c}\left[\sum_{u}d_{u,c}-U_{\max}\right]\right)\]where \(\lambda\), \(\mu\) and \(\nu\) are duals of the upper limit on assignments to users, the lower limit on assignments to users, and the upper limit on assignments to providers respectively. (We reverse the lower limit constraints so that all multipliers are nonnegative, for convenience.) The inner maximization can be simplified somewhat to \[\min_{\lambda,\mu,\nu\ge0}LR(\lambda,\mu,\nu)=\\\max_{d\in\left\{ 0,1\right\} ^{U\times C}}\left(\sum_{u}\sum_{c}\left[\omega_{u,c}-\lambda_{u}+\mu_{u}-\nu_{c}\right]d_{u,c}\\+C_{\max}\sum\lambda_{u}-\sum_{u}\mu_{u}+U_{\max}\sum_{c}\nu_{c}\right).\]The solution to the inner problem is trivial: \(d_{u,c}=1\) if \(\omega_{u,c}-\lambda_{u}+\mu_{u}-\nu_{c}>0\), 0 otherwise. (The value of \(d_{u,c}\) when the coefficient is zero is arbitrary, but we can assume 0 for simplicity.) So the value of the function \(LR(\lambda, \mu, \nu)\) is simply \[LR(\lambda,\mu,\nu)=\sum_{u}\sum_{c}\left(\omega_{u,c}-\lambda_{u}+\mu_{u}-\nu_{c}\right)^{+}\\+C_{\max}\sum\lambda_{u}-\sum_{u}\mu_{u}+U_{\max}\sum_{c}\nu_{c}.\] The function is piecewise-linear.
To test Lagrangean relaxation, we first define the \(LR\) function as above. To use it with the optim()
function from the stats
library, we need it to be a function of a single vector variable, so we concatenate the multipliers into a vector \((\lambda, \mu, \nu)\).
LR <- function(duals) {
# Split the input vector in lambda, mu and nu.
lambda <- duals[1:U]
mu <- duals[(U + 1):(2 * U)]
nu <- duals[(2 * U + 1):(2 * U + C)]
# Calculate the coefficient matrix for d in the inner optimization.
rc <- t(t(omega - lambda + mu) - nu)
# Sum only the positive coefficients (d will be 1 if the coefficient is positive, 0 if negative), add the other terms and return the value.
sum(pmax(rc, 0)) + Cmax * sum(lambda) - sum(mu) + Umax * sum(nu)
}
We will also need a function that splits apart a dual vector and computes the corresponding assignment matrix. The following function computes the coefficient matrix \(R\) for \(d\), where \[R_{u,v} = \omega_{u,c}-\lambda_{u}+\mu_{u}-\nu_{c},\] and then sets \(d_{u,v} = 1\) if and only if \(r_{u,c} > 0\). This should work when \(R\) contains no zeros (or components within rounding error of zero). More generally, we might need to look at zero components and make decisions about whether \(d\) should be 0 or 1 based on satisfaction of the constraints.
amatrix <- function(duals) {
# Split up the multiplier vector.
lambda <- duals[1:U]
mu <- duals[(U + 1):(2 * U)]
nu <- duals[(2 * U + 1):(2 * U + C)]
# Compute the coefficient matrix for d.
r <- t(t(omega - lambda + mu) - nu)
# We set set d = 1 only if rc > 0.
a <- matrix(0, nrow = U, ncol = C)
a[which(r > 0, arr.ind = TRUE)] <- 1
return(a)
}
Now we can try to optimize it. If you experiment with larger problem instances, you may need to alter default iteration limits or other parameters for the various methods below.
Derivative-based optimization
We need to impose a lower bound of zero on the variables, and optim()
only supports that for the L- BFGS-B quasi-Newton method. Since the LR function is only piecewise differentiable, we let optim()
approximate gradients by finite difference. We use (1, …, 1) as a starting solution.
# Run the search.
system.time(
bfgs.sol <- optim(rep.int(1, 2 * U + C), LR, method = "L-BFGS-B", lower = 0)
)
user system elapsed
0.091 0.000 0.111
# Print the final objective value.
cat("The Lagrangean objective value = ", bfgs.sol$value, "\n")
The Lagrangean objective value = 15.03994
# Extract and report on the assignment matrix.
report(amatrix(bfgs.sol$par))
Objective value = 15.03994
Minimum # of providers for any user = 1
Maximum # of providers for any user = 3 (limit = 3 )
Maximum # of users for any provider = 4 (limit = 4 )
Total number of assignments made = 20
The BFGS solution objective value matches the LP objective value, and all constraints are satisfied. Experiments with larger problem dimensions, however, have resulted in BFGS terminating with a suboptimal solution to the outer problem (and an infeasible solution to the inner problem).
Derivative-free optimization
We can also use a derivative-free method for optimizing LR (which would avoid any potential issues with LR not being continuously differentiable). We first try the Nelder-Mead “simplex” method (with lower bounds of 0 for the multipliers) as implemented in the dfoptim
package. As before, we use (1, …, 1) as a starting solution.
system.time(
nm.sol <- nmkb(rep.int(1, 2 * U + C), LR, lower = 0)
)
user system elapsed
0.249 0.000 0.258
# Print the final objective value.
cat("Nelder-Mead reports an objective value of ", nm.sol$value, "\n")
Nelder-Mead reports an objective value of 16.96005
# Extract and report on the assignment matrix.
report(amatrix(nm.sol$par))
Objective value = 16.96005
Minimum # of providers for any user = 1
Maximum # of providers for any user = 3 (limit = 3 )
Maximum # of users for any provider = 10 (limit = 4 )
Total number of assignments made = 26
Nelder-Mead did not do well. We can retry it using the solution from the derivative-based BFGS method with some added noise as a starting point. Since Nelder-Mead needs strictly positive starting values and the BFGS solution might have some duals within rounding error of zero, we adjust any tiny duals upward before running Nelder-Mead.
# Get the BFGS solution and tweak any values too close to zero.
init <- pmax(bfgs.sol$par + runif(2 * U + C, -0.1, 0.1), 1e-6)
# Retry Nelder-Mead.
system.time(
nm.sol <- nmkb(init, LR, lower = 0)
)
user system elapsed
0.503 0.000 0.506
rm(init)
# Print the final objective value.
cat("Nelder-Mead reports an objective value of ", nm.sol$value, "\n")
Nelder-Mead reports an objective value of 15.04775
# Extract and report on the assignment matrix.
report(amatrix(nm.sol$par))
Objective value = 15.03994
Minimum # of providers for any user = 1
Maximum # of providers for any user = 3 (limit = 3 )
Maximum # of users for any provider = 4 (limit = 4 )
Total number of assignments made = 20
So Nelder-Mead may work (or at least get close) if started near the optimum, but does not seem very robust. Also, if you try larger problem instances, you may see a a warning message that “Nelder-Mead should not be used for high-dimensional optimization”.
Another method to try is the Hooke-Jeeves algorithm. We will again start the search from (1, …, 1).
system.time(
hj.sol <- hjkb(rep.int(1, 2 * U + C), LR, lower = 0)
)
user system elapsed
0.037 0.000 0.038
# Print the final objective value.
cat("Hooke-Jeeves reports an objective value of ", hj.sol$value, "\n")
Hooke-Jeeves reports an objective value of 15.03994
# Extract and report on the assignment matrix.
report(amatrix(hj.sol$par))
Objective value = 15.03994
Minimum # of providers for any user = 1
Maximum # of providers for any user = 3 (limit = 3 )
Maximum # of users for any provider = 4 (limit = 4 )
Total number of assignments made = 20
The Hooke-Jeeves solution matches the LP objective value with no constraint violations, so at least on this problem it seems to be preferable to Nelder-Mead. It also matched the LP solution on larger instances that were tested.
---
title: "Binary Assignment Problem"
author: Paul A. Rubin
date: 15 February 2021
output: html_notebook
---

This notebook looks at an assignment problem posed on [OR Stack Exchange](https://or.stackexchange.com/questions/5698/how-can-i-find-the-optimal-assignments-for-this-milp-problem-heuristically). We keep the same notation used in the post where possible. The problem involves assigning users to providers ("service points" in the original question -- "users" and "service points" was subsequently edited out) subject to upper bounds on how many users a provider handles and how many service providers a user gets, along with a requirement that every user be served by at least one provider. The author of the question provided the following integer programming formulation: $$\begin{align*}
\max_{d_{u,c}} & \sum_{u=1}^{U}\sum_{c=1}^{C}\omega_{u,c}d_{u,c}\\
\text{s.t. } & \sum_{c=1}^{C}d_{u,c}\le C_{\max}\ \forall u\in\left\{ 1,\dots,U\right\} \\
 & \sum_{c=1}^{C}d_{u,c}\ge1\ \forall u\in\left\{ 1,\dots,U\right\} \\
 & \sum_{u=1}^{U}d_{u,c}\le U_{\max}\ \forall c\in\left\{ 1,\dots,C\right\} \\
 & d_{u,c}\in\left\{ 0,1\right\} \ \forall u,c.
\end{align*}$$Everything other than $d_{u,c}$ is a parameter, and I have replaced specific numeric limits with parameters $U_\max$ (maximum number of users a provider can take) and $C_\max$ (maximum number of providers a user can take).

An answer posted on OR SE asserted that the constraint matrix has the "integrality property", which means that the LP relaxation of the problem produces an integer solution. I do not have a proof for that, but in experiments the assertion appears to hold up. In this notebook, we will experiment with solving the problem via Lagrangean relaxation (LR). This is feasible only because the TUM property means that there is no duality gap, so the LR optimum should coincide with the optimum of the original problem.

In preparation we load several libraries.

```{r}
library(magrittr)
# Libraries used to find the optimal solution via LP.
library(ompr)
library(ompr.roi)
library(ROI)
library(ROI.plugin.cplex)
# Library used to find the LR solution via gradient-free optimization.
library(dfoptim)
```

# Problem

We need to create a problem instance. First we set the problem dimensions.

```{r}
U <- 10    # number of users
C <- 5     # number of providers
Umax <- 4  # maximum number of users any provider can serve
Cmax <- 3  # maximum number of providers that can serve any one user
```

Note that the maximum number of possible assignments is $N = \min(U * C_\max, C * U_\max)$. If you experiment with larger problem sizes (larger `U` and `C`), be sure to adjust `Umax` so that `Umax * C >= U`.

```{r}
cat ("Maximum number of assignments = ", min(U * Cmax, C * Umax), "\n")
```

Next, we set a random seed and then randomly generate service quality coefficients ($\omega$).

```{r}
set.seed(20210214)
omega <- matrix(runif(U * C), nrow = U, ncol = C)
```

We will also create a function that takes as input a matrix of assignments (users on rows, providers on columns), calculates the correct objective value, and confirms that the solution is feasible.

```{r}
report <- function(x) {
  # Calculate the objective value.
  cat("Objective value = ", sum(omega * x), "\n")
  # Confirm constraint satisfaction.
  rsums <- apply(x, 1, sum)
  csums <- apply(x, 2, sum)
  cat("Minimum # of providers for any user = ", min(rsums), "\n")
  cat("Maximum # of providers for any user = ", max(rsums), " (limit = ", Cmax, ")\n")
  cat("Maximum # of users for any provider = ", max(csums), " (limit = ", Umax, ")\n")
  cat("Total number of assignments made = ", sum(x), "\n")
}
```

# Optimal Solution

To find the optimal assignments, we solve the LP relaxation of the binary integer program specified in the post, relying on the integrality property of the constraint matrix to get a binary solution.

```{r}
model <- 
  MIPModel() %>%
  add_variable(d[u, c], u=1:U, c=1:C, type = "continuous", lb = 0, ub = 1) %>% 
  set_objective(sum_expr(omega[u, c] * d[u, c], u = 1:U, c = 1:C)) %>% 
  add_constraint(sum_expr(d[u, c], c = 1:C) <= Cmax, u = 1:U) %>% 
  add_constraint(sum_expr(d[u, c], c = 1:C) >= 1, u = 1:U) %>% 
  add_constraint(sum_expr(d[u, c], u = 1:U) <= Umax, c = 1:C)
```

```{r}
system.time(
  opt.sol <- model %>% solve_model(with_ROI(solver = "cplex"))
)
cat("Final solver status = ", opt.sol$status, "\n")
report(matrix(opt.sol$solution, nrow = U, ncol = C, byrow = FALSE))
```

# Lagrangean Relaxation

To use Lagrangean relaxation, we treat $d$ as binary and relax all the constraints, obtaining the following problem: $$\min_{\lambda,\mu,\nu\ge0}LR(\lambda,\mu,\nu)=\\\max_{d\in\left\{ 0,1\right\} ^{U\times C}}\left(\sum_{u}\sum_{c}\omega_{u,c}d_{u,c}-\sum_{u}\lambda_{u}\left[\sum_{c}d_{u,c}-C_{\max}\right]\\+\sum_{u}\mu_{u}\left[\sum_{c}d_{u,c}-1\right]-\sum_{c}\nu_{c}\left[\sum_{u}d_{u,c}-U_{\max}\right]\right)$$where $\lambda$, $\mu$ and $\nu$ are duals of the upper limit on assignments to users, the lower limit on assignments to users, and the upper limit on assignments to providers respectively. (We reverse the lower limit constraints so that all multipliers are nonnegative, for convenience.) The inner maximization can be simplified somewhat to $$\min_{\lambda,\mu,\nu\ge0}LR(\lambda,\mu,\nu)=\\\max_{d\in\left\{ 0,1\right\} ^{U\times C}}\left(\sum_{u}\sum_{c}\left[\omega_{u,c}-\lambda_{u}+\mu_{u}-\nu_{c}\right]d_{u,c}\\+C_{\max}\sum\lambda_{u}-\sum_{u}\mu_{u}+U_{\max}\sum_{c}\nu_{c}\right).$$The solution to the inner problem is trivial: $d_{u,c}=1$ if $\omega_{u,c}-\lambda_{u}+\mu_{u}-\nu_{c}>0$, 0 otherwise. (The value of $d_{u,c}$ when the coefficient is zero is arbitrary, but we can assume 0 for simplicity.) So the value of the function $LR(\lambda, \mu, \nu)$ is simply $$LR(\lambda,\mu,\nu)=\sum_{u}\sum_{c}\left(\omega_{u,c}-\lambda_{u}+\mu_{u}-\nu_{c}\right)^{+}\\+C_{\max}\sum\lambda_{u}-\sum_{u}\mu_{u}+U_{\max}\sum_{c}\nu_{c}.$$ The function is piecewise-linear.

To test Lagrangean relaxation, we first define the $LR$ function as above. To use it with the `optim()` function from the `stats` library, we need it to be a function of a single vector variable, so we concatenate the multipliers into a vector $(\lambda, \mu, \nu)$.

```{r}
LR <- function(duals) {
  # Split the input vector in lambda, mu and nu.
  lambda <- duals[1:U]
  mu <- duals[(U + 1):(2 * U)]
  nu <- duals[(2 * U + 1):(2 * U + C)]
  # Calculate the coefficient matrix for d in the inner optimization.
  rc <- t(t(omega - lambda + mu) - nu)
  # Sum only the positive coefficients (d will be 1 if the coefficient is positive, 0 if negative), add the other terms and return the value.
  sum(pmax(rc, 0)) + Cmax * sum(lambda) - sum(mu) + Umax * sum(nu)
}
```

We will also need a function that splits apart a dual vector and computes the corresponding assignment matrix. The following function computes the coefficient matrix $R$ for $d$, where $$R_{u,v} = \omega_{u,c}-\lambda_{u}+\mu_{u}-\nu_{c},$$ and then sets $d_{u,v} = 1$ if and only if $r_{u,c} > 0$. This should work when $R$ contains no zeros (or components within rounding error of zero). More generally, we might need to look at zero components and make decisions about whether $d$ should be 0 or 1 based on satisfaction of the constraints.

```{r}
amatrix <- function(duals) {
  # Split up the multiplier vector.
  lambda <- duals[1:U]
  mu <- duals[(U + 1):(2 * U)]
  nu <- duals[(2 * U + 1):(2 * U + C)]
  # Compute the coefficient matrix for d.
  r <- t(t(omega - lambda + mu) - nu)
  # We set set d = 1 only if rc > 0.
  a <- matrix(0, nrow = U, ncol = C)
  a[which(r > 0, arr.ind = TRUE)] <- 1
  return(a)
}
```

Now we can try to optimize it. If you experiment with larger problem instances, you may need to alter default iteration limits or other parameters for the various methods below.

## Derivative-based optimization

We need to impose a lower bound of zero on the variables, and `optim()` only supports that for the L- BFGS-B quasi-Newton method. Since the LR function is only piecewise differentiable, we let `optim()` approximate gradients by finite difference. We use (1, ..., 1) as a starting solution.

```{r}
# Run the search.
system.time(
  bfgs.sol <- optim(rep.int(1, 2 * U + C), LR, method = "L-BFGS-B", lower = 0)
)
# Print the final objective value.
cat("The Lagrangean objective value = ", bfgs.sol$value, "\n")
# Extract and report on the assignment matrix.
report(amatrix(bfgs.sol$par))
```

The BFGS solution objective value matches the LP objective value, and all constraints are satisfied. Experiments with larger problem dimensions, however, have resulted in BFGS terminating with a suboptimal solution to the outer problem (and an infeasible solution to the inner problem).

## Derivative-free optimization

We can also use a derivative-free method for optimizing LR (which would avoid any potential issues with LR not being continuously differentiable). We first try the Nelder-Mead "simplex" method (with lower bounds of 0 for the multipliers) as implemented in the `dfoptim` package. As before, we use (1, ..., 1) as a starting solution.

```{r}
system.time(
  nm.sol <- nmkb(rep.int(1, 2 * U + C), LR, lower = 0)
)
# Print the final objective value.
cat("Nelder-Mead reports an objective value of ", nm.sol$value, "\n")
# Extract and report on the assignment matrix.
report(amatrix(nm.sol$par))
```

Nelder-Mead did not do well. We can retry it using the solution from the derivative-based BFGS method with some added noise as a starting point. Since Nelder-Mead needs strictly positive starting values and the BFGS solution might have some duals within rounding error of zero, we adjust any tiny duals upward before running Nelder-Mead.

```{r}
# Get the BFGS solution and tweak any values too close to zero.
init <- pmax(bfgs.sol$par + runif(2 * U + C, -0.1, 0.1), 1e-6)
# Retry Nelder-Mead.
system.time(
  nm.sol <- nmkb(init, LR, lower = 0)
)
rm(init)
# Print the final objective value.
cat("Nelder-Mead reports an objective value of ", nm.sol$value, "\n")
# Extract and report on the assignment matrix.
report(amatrix(nm.sol$par))
```

So Nelder-Mead may work (or at least get close) if started near the optimum, but does not seem very robust. Also, if you try larger problem instances, you may see a a warning message that "Nelder-Mead should not be used for high-dimensional optimization".

Another method to try is the Hooke-Jeeves algorithm. We will again start the search from (1, ..., 1).

```{r}
system.time(
  hj.sol <- hjkb(rep.int(1, 2 * U + C), LR, lower = 0)
)
# Print the final objective value.
cat("Hooke-Jeeves reports an objective value of ", hj.sol$value, "\n")
# Extract and report on the assignment matrix.
report(amatrix(hj.sol$par))
```

The Hooke-Jeeves solution matches the LP objective value with no constraint violations, so at least on this problem it seems to be preferable to Nelder-Mead. It also matched the LP solution on larger instances that were tested.
