This notebook uses a heuristic to compute a nonconvex polygon of “minimal” area containing a specified set of points. It is motivated by a question on OR Stack Exchange.

We are going to assume that the solution must be a Hamiltonian tour of a subset of the given points such that all remaining points are contained in the closed polygon defined by that tour. The original question is a bit unspecific as to what is sought. A single piecewise-linear path visiting all points (possibly more than once) would generate a set of area 0 containing the points, but this is likely not what was intended.

library(ggplot2)   # used for plotting
library(plotly)    # for plots with "tooltips" identifying points
library(interp)    # for generating triangulations

To start, we generate a set of points and plot it.

# Set the random number seed for reproducibility.
seedValue <- 20211020
set.seed(seedValue)
# Set the size of the point set.
npoints <- 100
# Generate a data frame of points.
points <- runif(2*npoints) |>
          matrix(nrow = npoints) |>
          data.frame()
colnames(points) <- c("x", "y")
points$index <- 1:npoints
# Plot the points.
basePlot <- ggplot(data = points, mapping = aes(x, y, label = index)) + geom_point()
ggplotly(basePlot)

We compute a triangulation of the points and plot that.

triangulation <- tri.mesh(points[, 1:2])
plot(triangulation)

The triangulation provides the convex hull (as a “tour” of the original points), which we identify and add it to the plot.

# Compute the convex hull.
boundary <- triangulation$chull
# Close the tour by repeating the first point.
tour <- c(boundary, boundary[1])
# Compute the area of the convex hull, using the triangle areas computed by tri.mesh.
cat("Hull area = ", sum(triangulation$cclist[, "area"]), "\n")
Hull area =  0.8651773 
# Add the tour to the point plot and display it.
hullPlot <- basePlot + geom_path(points[tour, ], mapping = aes(x, y))
plot(hullPlot)

Before continuing, we build some convenience functions. The first one takes as input the indices of the endpoints of an edge and returns the index of that edge in the edge list belonging to the triangulation. Note that, in the triangulation, edges are oriented (arcs). To compensate, we try both orientations of the edge.

# Inputs: i, j = indices of two points
# Output: the index in triangulation$arcs of the arc between those points
find.edge <- function(i, j) {
  temp <- apply(triangulation$arcs, 1, function(x) (x[1] == i && x[2] == j) || (x[1] == j && x[2] == i))
  which(temp)[1]
}

The next function takes an edge index and finds the indices of all triangles containing that edge

# Input: the index of an edge
# Output: the indices of all triangles containing that edge
find.triangles <- function(i) {
  c(which(triangulation$trlist[, "k1"] == i),
    which(triangulation$trlist[, "k2"] == i),
    which(triangulation$trlist[, "k3"] == i))
}

The following function takes a triangle index and returns the indices of the vertices of that triangle.

# Input the index of a triangle
# Output: the indices of the vertices of the triangle
find.vertices <- function(i) {
  triangulation$trlist[i, c("i1", "i2", "i3")]
}

In what follows, we will be removing edges and triangles. In order to avoid corrupting the original triangulation, we maintain a separate vector showing the number of triangles currently sharing each edge, along with vectors recording the indices of edges and triangles targeted for deletion.

deleted.edges <- c()            # edges to delete
deleted.triangles <- c()        # triangles to delete
n.edges <- triangulation$narcs  # number of edges in the original triangulation
# Count how often each edge is shared.
sharing <- sapply(1:n.edges, function(x) length(find.triangles(x)))
# For future use, list the columns in the triangle matrix that give the indices of the edges in a triangle.
edge.cols <- c("k1", "k2", "k3")

The logic of our heuristic is as follows. Consider an edge [a, b] belonging to the current boundary. It must belong to a single triangle, say [a, b, c]. Now assume that c is not currently on the boundary. If we replace [a, b] with [a, c] and [c, b] on the boundary and delete triangle [a, b, c] from the triangulation (adding point c to the boundary), we reduce the area of the polygon without losing any points. The requirement that c not already be on the boundary stems from the need for the boundary to be a Hamiltonian path (each point visited exactly once).

We now define a function that takes a lap around the boundary. For each qualifying edge (belonging to only one triangle, with the third vertex not yet on the boundary), it identifies the triangle containing it, replaces the qualifying edge with the other two edges of the triangle, and deletes the qualifying edge and its parent triangle. The function returns true if any changes were made, false if not.

# Input: none
# Output: true if at least one change to the boundary was made
shrink <- function() {
  change <- FALSE
  ix <- 1  # index of the first boundary point in the segment being considered
  while (ix < length(tour)) {
    # Find the index of the edge in question.
    ie <- find.edge(tour[ix], tour[ix + 1])
    # Find the parent triangle, removing any deleted triangles.
    owner <- setdiff(find.triangles(ie), deleted.triangles)
    # Sanity check: a boundary edge should be owned by exactly one triangle.
    if (length(owner) != 1) {
      stop("Boundary edge ", ie, " owned by triangle(s) ", as.character(owner))
    }
    # Identify the third vertex of the triangle.
    v <- setdiff(find.vertices(owner), c(tour[ix], tour[ix + 1]))
    # If the third vertex is already on the boundary, punt.
    if (v %in% boundary) {
      ix <- ix + 1
      next
    }
    # Mark the edge and the parent triangle for deletion.
    deleted.edges <<- c(deleted.edges, ie)
    deleted.triangles <<- c(deleted.triangles, owner)
    # Insert the new vertex in the tour.
    tour <<- c(tour[1:ix], v, tour[-(1:ix)])
    # Update the boundary.
    boundary <<- head(tour, -1)
    # Note the change.
    change <- TRUE
    # Move on.
    ix <- ix + 1
  }
  change
}

We repeatedly call the shrink function until it takes a full lap around the boundary without making any changes.

ct <- 1                # counts changes
while (shrink()) {
  cat("Pass ",  ct, " changed the boundary\n")
  ct <- ct + 1
}
Pass  1  changed the boundary
Pass  2  changed the boundary
Pass  3  changed the boundary
Pass  4  changed the boundary
Pass  5  changed the boundary
Pass  6  changed the boundary
Pass  7  changed the boundary
cat("No change after pass ", ct, "\n")
No change after pass  8 

Plot the revised boundary.

hullPlot2 <- basePlot + geom_path(points[tour, ], mapping = aes(x, y))
plot(hullPlot2)

Copy the original triangulation, remove the deleted elements, and display the result.

# Make a copy of the triangulation.
temp <- triangulation
# Remove the deleted triangles.
temp$trlist <- temp$trlist[-deleted.triangles, ]
temp$nt <- nrow(temp$trlist)
temp$cclist <- temp$cclist[-deleted.triangles, ]
# Removing arcs would cause inconsistencies in arc numbering, so instead we set them to NA, which prevents them from plotting.
temp$arcs[deleted.edges, ] <- NA
# Compute the area contained by the surviving triangles.
cat("Revised area = ", sum(temp$cclist[, "area"]), "\n")
Revised area =  0.5106936 
# Plot the surviving triangles.
plot(temp)

Plot what is left.

# Start with a copy of the polygon plot.
shadedPlot <- basePlot + geom_path(points[tour, ], mapping = aes(x, y))
# Shade the surviving triangles.
tlist <- setdiff(1:triangulation$nt, deleted.triangles)
for (i in tlist) {
  temp <- triangulation$trlist[i, 1:3]
  temp <- c(temp, temp[1])
  shadedPlot <- shadedPlot + geom_polygon(points[temp, ], mapping = aes(x, y), fill = "lightblue", alpha = 0.5)
}
plot(shadedPlot)

---
title: "Hull"
output: html_notebook
---

This notebook uses a heuristic to compute a nonconvex polygon of "minimal" area containing a specified set of points. It is motivated by a [question](https://or.stackexchange.com/questions/7158/how-to-find-the-point-on-the-exterior-of-a-given-set-of-points) on OR Stack Exchange.

We are going to assume that the solution must be a Hamiltonian tour of a subset of the given points such that all remaining points are contained in the closed polygon defined by that tour. The original question is a bit unspecific as to what is sought. A single piecewise-linear path visiting all points (possibly more than once) would generate a set of area 0 containing the points, but this is likely not what was intended.

```{r}
library(ggplot2)   # used for plotting
library(plotly)    # for plots with "tooltips" identifying points
library(interp)    # for generating triangulations
```

To start, we generate a set of points and plot it.

```{r}
# Set the random number seed for reproducibility.
seedValue <- 20211020
set.seed(seedValue)
# Set the size of the point set.
npoints <- 100
# Generate a data frame of points.
points <- runif(2*npoints) |>
          matrix(nrow = npoints) |>
          data.frame()
colnames(points) <- c("x", "y")
points$index <- 1:npoints
# Plot the points.
basePlot <- ggplot(data = points, mapping = aes(x, y, label = index)) + geom_point()
ggplotly(basePlot)
```


We compute a triangulation of the points and plot that.

```{r}
triangulation <- tri.mesh(points[, 1:2])
plot(triangulation)
```

The triangulation provides the convex hull (as a "tour" of the original points), which we identify and add it to the plot.

```{r}
# Compute the convex hull.
boundary <- triangulation$chull
# Close the tour by repeating the first point.
tour <- c(boundary, boundary[1])
# Compute the area of the convex hull, using the triangle areas computed by tri.mesh.
cat("Hull area = ", sum(triangulation$cclist[, "area"]), "\n")
# Add the tour to the point plot and display it.
hullPlot <- basePlot + geom_path(points[tour, ], mapping = aes(x, y))
```

```{r}
plot(hullPlot)
```

Before continuing, we build some convenience functions. The first one takes as input the indices of the endpoints of an edge and returns the index of that edge in the edge list belonging to the triangulation. Note that, in the triangulation, edges are oriented (arcs). To compensate, we try both orientations of the edge.

```{r}
# Inputs: i, j = indices of two points
# Output: the index in triangulation$arcs of the arc between those points
find.edge <- function(i, j) {
  temp <- apply(triangulation$arcs, 1, function(x) (x[1] == i && x[2] == j) || (x[1] == j && x[2] == i))
  which(temp)[1]
}
```

The next function takes an edge index and finds the indices of all triangles containing that edge

```{r}
# Input: the index of an edge
# Output: the indices of all triangles containing that edge
find.triangles <- function(i) {
  c(which(triangulation$trlist[, "k1"] == i),
    which(triangulation$trlist[, "k2"] == i),
    which(triangulation$trlist[, "k3"] == i))
}
```

The following function takes a triangle index and returns the indices of the vertices of that triangle.

```{r}
# Input the index of a triangle
# Output: the indices of the vertices of the triangle
find.vertices <- function(i) {
  triangulation$trlist[i, c("i1", "i2", "i3")]
}
```

In what follows, we will be removing edges and triangles. In order to avoid corrupting the original triangulation, we maintain a separate vector showing the number of triangles currently sharing each edge, along with vectors recording the indices of edges and triangles targeted for deletion.

```{r}
deleted.edges <- c()            # edges to delete
deleted.triangles <- c()        # triangles to delete
n.edges <- triangulation$narcs  # number of edges in the original triangulation
# Count how often each edge is shared.
sharing <- sapply(1:n.edges, function(x) length(find.triangles(x)))
# For future use, list the columns in the triangle matrix that give the indices of the edges in a triangle.
edge.cols <- c("k1", "k2", "k3")
```

The logic of our heuristic is as follows. Consider an edge [a, b] belonging to the current boundary. It must belong to a single triangle, say [a, b, c]. Now assume that c is not currently on the boundary. If we replace [a, b] with [a, c] and [c, b] on the boundary and delete triangle [a, b, c] from the triangulation (adding point c to the boundary), we reduce the area of the polygon without losing any points. The requirement that c not already be on the boundary stems from the need for the boundary to be a Hamiltonian path (each point visited exactly once).

We now define a function that takes a lap around the boundary. For each qualifying edge (belonging to only one triangle, with the third vertex not yet on the boundary), it identifies the triangle containing it, replaces the qualifying edge with the other two edges of the triangle, and deletes the qualifying edge and its parent triangle. The function returns true if any changes were made, false if not.

```{r}
# Input: none
# Output: true if at least one change to the boundary was made
shrink <- function() {
  change <- FALSE
  ix <- 1  # index of the first boundary point in the segment being considered
  while (ix < length(tour)) {
    # Find the index of the edge in question.
    ie <- find.edge(tour[ix], tour[ix + 1])
    # Find the parent triangle, removing any deleted triangles.
    owner <- setdiff(find.triangles(ie), deleted.triangles)
    # Sanity check: a boundary edge should be owned by exactly one triangle.
    if (length(owner) != 1) {
      stop("Boundary edge ", ie, " owned by triangle(s) ", as.character(owner))
    }
    # Identify the third vertex of the triangle.
    v <- setdiff(find.vertices(owner), c(tour[ix], tour[ix + 1]))
    # If the third vertex is already on the boundary, punt.
    if (v %in% boundary) {
      ix <- ix + 1
      next
    }
    # Mark the edge and the parent triangle for deletion.
    deleted.edges <<- c(deleted.edges, ie)
    deleted.triangles <<- c(deleted.triangles, owner)
    # Insert the new vertex in the tour.
    tour <<- c(tour[1:ix], v, tour[-(1:ix)])
    # Update the boundary.
    boundary <<- head(tour, -1)
    # Note the change.
    change <- TRUE
    # Move on.
    ix <- ix + 1
  }
  change
}
```

We repeatedly call the `shrink` function until it takes a full lap around the boundary without making any changes.

```{r}
ct <- 1                # counts changes
while (shrink()) {
  cat("Pass ",  ct, " changed the boundary\n")
  ct <- ct + 1
}
cat("No change after pass ", ct, "\n")
```

Plot the revised boundary.

```{r}
hullPlot2 <- basePlot + geom_path(points[tour, ], mapping = aes(x, y))
plot(hullPlot2)
```

Copy the original triangulation, remove the deleted elements, and display the result.

```{r}
# Make a copy of the triangulation.
temp <- triangulation
# Remove the deleted triangles.
temp$trlist <- temp$trlist[-deleted.triangles, ]
temp$nt <- nrow(temp$trlist)
temp$cclist <- temp$cclist[-deleted.triangles, ]
# Removing arcs would cause inconsistencies in arc numbering, so instead we set them to NA, which prevents them from plotting.
temp$arcs[deleted.edges, ] <- NA
# Compute the area contained by the surviving triangles.
cat("Revised area = ", sum(temp$cclist[, "area"]), "\n")
```
```{r}
# Plot the surviving triangles.
plot(temp)
```

Plot what is left.

```{r}
# Start with a copy of the polygon plot.
shadedPlot <- basePlot + geom_path(points[tour, ], mapping = aes(x, y))
# Shade the surviving triangles.
tlist <- setdiff(1:triangulation$nt, deleted.triangles)
for (i in tlist) {
  temp <- triangulation$trlist[i, 1:3]
  temp <- c(temp, temp[1])
  shadedPlot <- shadedPlot + geom_polygon(points[temp, ], mapping = aes(x, y), fill = "lightblue", alpha = 0.5)
}
plot(shadedPlot)
```

