**Update on 20120917**: added W-S normality tests on
residual and changed the plotting code to be compliant with the latest
releases of ggplot

**Update on 20130828**: Added a graph stacking
the estimated results of the second round attribution given
first round preferences.

Certain interesting public opinion traits can be extracted from analyzing double round election results, and the mathematical tools required to do so are freely available. In particular I am going to analyze the results of the 2012 Finnish presidential election with the R numerical analysis software.

The Finnish 2012 presidential election (like all recent Finnish presidential elections) is made by two rounds of voting. In the first one a number of more than two candidates are electable. If the most voted candidate in the first round get less than 50% of the votes a second round of voting takes place where only the two most voted candidates in the first round compete for the win. For completion sake: just two candidates on the first round would make it redundant; 50% or more of the votes for a first round candidate make the second round unnecesary too.

My objective is to find what are the second round results made of. That is, I intend to build a model that explains why the second round results happened by using only the voting tally of the first round.

For the mathematically inclined: I intend to find a vector of weights \( w \) such that \( w'x=y \) where each \( x_{i} \) are the results for the \( i \)-th candidate on the first round and \( y \) are the results of a candidate on the second round. Depending on how \( w \) is formulated second round voting preferences can be uncovered.

Let's start by saying that I have (and so can you) the results from the 2012 election:

First we load some the R packages we are going to take into use.

```
library(reshape2, warn.conflicts=FALSE, quietly=TRUE, verbose=FALSE)
library(plyr, warn.conflicts=FALSE, quietly=TRUE, verbose=FALSE)
library(ggplot2, warn.conflicts=FALSE, quietly=TRUE, verbose=FALSE)
library(limSolve, warn.conflicts=FALSE, quietly=TRUE, verbose=FALSE)
```

The election data is contained in two variables:

```
load("fin_prez_2012.RData")
ls()
```

`## [1] "area.results" "round.totals" `

The variable *area.results* hold the voting results for each
polling area (an aggregation entity smaller or equal than a small
city: small villages are likely to form a polling area, large cities
contain many polling areas) and for both rounds and for each
candidate:

`head(area.results) `

```
## split_level election.round variable value
## 1 Äänekoski-Akanniemi - Karhunlähde first Arhinmäki 154
## 2 Äänekoski-Alkula-Keskusta (Suolahti) first Arhinmäki 193
## 3 Äänekoski-Honkola first Arhinmäki 82
## 4 Äänekoski-Keskusta first Arhinmäki 137
## 5 Äänekoski-Kirkonmäki - Kuhnamo first Arhinmäki 188
## 6 Äänekoski-Konginkangas first Arhinmäki 40
```

Eight candidates ran on the first round:

`levels(area.results$variable) `

```
## [1] "Arhinmäki" "Haavisto" "Lipponen" "Essayah" "Soini" "Niinistö"
## [7] "Biaudet" "Väyrynen"
```

The following color scale should make plotting easier:

```
candidate.colormap <- c("Arhinmäki"='#cd0009ff',
"Haavisto"='#61bf1aff', "Lipponen"='#ed1b24ff', "Essayah"='#f7931dff',
"Soini"='#edd866ff', "Niinistö"='#00577dff', "Biaudet"='#007ac9ff',
"Väyrynen"='#1b9345ff')
```

Their totals for the first round is as follows:

```
totals.1st.round <- ddply( subset(area.results, election.round=='first'), .(variable),
function(x) c(agg=sum(x$value)))
colnames(totals.1st.round) <- c('candidate', 'vote.count')
totals.1st.round
```

```
## candidate vote.count
## 1 Arhinmäki 167663
## 2 Haavisto 574275
## 3 Lipponen 205111
## 4 Essayah 75744
## 5 Soini 287571
## 6 Niinistö 1131254
## 7 Biaudet 82598
## 8 Väyrynen 536555
```

But a graph is worth a thousand words:

```
p <- ggplot(totals.1st.round, aes(x=candidate, y=vote.count/1000, fill=candidate))
p <- p + labs(title='Total votes (1st round)')
p <- p + geom_bar() + theme(legend.position='none')
p <- p + scale_fill_manual(values=candidate.colormap[totals.1st.round$candidate])
p <- p + scale_x_discrete(name="Candidate")
p <- p + scale_y_continuous(name="Count") p
```

The other variable that we loaded, `round.totals`

, holds
the number of votes casted for each round and the difference in votes
casted between the second round and the first.

`head(round.totals) `

```
## split_level first second second.round.diff
## 1 Äänekoski-Akanniemi - Karhunlähde 1480 1321 -159
## 2 Äänekoski-Alkula-Keskusta (Suolahti) 1533 1383 -150
## 3 Äänekoski-Honkola 1075 910 -165
## 4 Äänekoski-Keskusta 1400 1263 -137
## 5 Äänekoski-Kirkonmäki - Kuhnamo 1578 1385 -193
## 6 Äänekoski-Konginkangas 773 676 -97
```

Summarizing the difference in votes casted shows something worth a remark…

`summary(round.totals$second.round.diff) `

```
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -683.0 -104.0 -63.0 -77.8 -35.0 241.0
```

…which is that typically for each polling area electorate turnover is lower on the second round. That is, typically less voters cast their votes on the second round. We will delve into this promptly.

The drop in the turnover that we just show triggers the question of whether it is a generalized phenomena or something affecting to some particular supporters. That is whether the turnover drop is more severe on voters that on the first round voted for some particular candidate.

Hence we will estimate the second round turnover difference by using the vote count for each candidate in the first round. We will do so by assuming that the second round results are a linear combination of the results in the first round while allowing that each voter of a dropped-off candidate may abstain.

The model looks as follows:

\[

\begin{aligned}

& Xw \approx -\Delta \\

\text{where} & 0 \preceq w \preceq 1

\end{aligned}

\]

where \( X \) contains the results of the first voting round (one
row for each polling area, one column for each candidate), \( w \) are
my *mixing* coefficients and \( \Delta \) is the turnover
difference with negative sign.

That is, it tries to find a linear combination of votes in the first round that explains the missing votes on the second round, and such linear combination is constrained to real numbers larger than 0 (the model assumes that only first round voters may take part in the second round) and smaller than 1 (the percentage of voters showing up for a given candidate on the second round cannot be larger than their number on the first round).

The model is the constructed by reducing the problem to an \( \ell^{2}_{2} \)-norm minimization problem while adding the natural mixing limits as constraints thus obtaining a modified Least Squares problem, where the modification is in the form of constraints defining a boxed feasible region:

\[

\begin{aligned}

\min_{w} & & \parallel Xw - \Delta \parallel ^2 \\

\text{s.t} & & 0 \preceq w \preceq 1

\end{aligned}

\]

That is, find the vector \( w \) that minimizes the difference between \( Xw \) and the second round voting difference (notice the sign reversal since I am considering the difference also with reversed sign).

The function `lsei()`

in the R
package `LimSolve`

is suitable for solving such
minimization problem if we augment the parameter constraints as to
suit out problem definition. That is, we have to convert our
constraints specification \( 0 \preceq w \preceq 1 \) into the
canonical form expected by `lsei()`

which is \( Gx \geq h
\) which for our case is

\[

\begin{aligned}

I w \succeq 0 \\

-I w \succeq -1

\end{aligned}

\]

where \( I \) is the identity matrix comformant with our weight vector \( w \).

This is how it looks like in R. First we construct the matrix \( X \)

```
x <- dcast(subset(area.results, election.round == "first"), split_level ~ variable)
candidate.names <- colnames(x)[!colnames(x) == "split_level"]
X <- as.matrix(x[, candidate.names])
rownames(X) <- x$split_level
```

Building \( \Delta \) is trivial:

```
D <- matrix(with(round.totals, first - second), nrow = nrow(round.totals), ncol = 1)
rownames(D) <- round.totals$split_level
colnames(D) <- "turnover.diff"
```

Last but not least the augmented constraints construction:

```
n <- ncol(X)
I <- diag(rep(1, n))
G <- rbind(I, -I)
h <- rbind(matrix(0, n, 1), matrix(-1, n, 1))
```

So now we can move on to solve the minimization problem and show the values for the vector \( w \):

```
sol <- lsei(A = X, B = D, G = G, H = h, type = 2, verbose = FALSE)
show(w <- sol$X)
```

```
## Arhinmäki Haavisto Lipponen Essayah Soini Niinistö Biaudet Väyrynen
## 0.03245 0.00000 0.10747 0.00000 0.06378 0.00000 0.00433 0.24868
```

The most remarkable result is that almost 1 out of 4 people who voted for Väyrynen and 1 out of 10 for Lipponen in the first round did not vote for any of the two candidates on the second round. All resulting values can be seen in the graph below:

```
x <- data.frame(Candidate=names(w), drop.out.rate=w)
p <- ggplot(x, aes(x=Candidate, y=drop.out.rate*100, fill=Candidate))
p <- p + labs(title='Voter Second Round Drop-out Rate by Candidate')
p <- p + geom_bar() + theme(legend.position='none')
p <- p + scale_fill_manual(values=candidate.colormap[x$Candidate])
p <- p + scale_y_continuous(name="Drop-out%")
p
```

Notice that Väyrynen ended third in the first voting round.

At this point it would be good to measure how good is the model fit. A analysis of the model residuals is in place:

```
r <- (X %*% (1 - w) - round.totals$second)/round.totals$second ## normalized by district size
shapiro.test(r)
```

```
## ## Shapiro-Wilk normality test
## ## data: r
## W = 0.8568, p-value < 2.2e-16
```

`summary(r) `

```
## V1
## Min. : -0.1274
## 1st Qu.:-0.0155
## Median : 0.0012
## Mean : 0.0080
## 3rd Qu.: 0.0222
## Max. : 0.4112
```

… which indicates residuals following a normal distribution.

Last we will try to estimate the composition of the second round results from the first round voting.

First we start by compensating the number of votes lost in the second round, which is trivial now that we have our estimate in \( w\)

```
Z <- round(X %*% diag(1 - w))
colnames(Z) <- colnames(X)
```

In the final round there are two candidates only. Let me introduce you their names:

`as.character(unique(subset(area.results, election.round == "second")$variable)) `

`## [1] "Haavisto" "Niinistö" `

The variables \( Y_{H} \) and \( Y_{N} \) represent the number of votes each of them got in the second round (the index denotes the first name of their respective last names).

```
YH <- subset(area.results, election.round == "second" & variable == "Haavisto")
YN <- subset(area.results, election.round == "second" & variable == "Niinistö")
```

I am going to define \( w_{H},w_{N} \) as the *mix*
variables, that is \( w_{H} \) is the vector that holds where first
round votes and \( w_{N} \) for Niinistö both compensated by second
round drop-outs in the election process, that is:

\[

\begin{aligned}

Zw_{H} \approx Y_{H} \\

Zw_{N} \approx Y_{N}

\end{aligned}

\]

Since our model assumes that second round voters choose either of the two candidates the following identity holds:

\[

w^{(i)}_{N} + w^{(i)}_{H} = 1

\]

Let's put is all together. The model we are trying to build now is slighly different than the previous one and it looks as follows (using only \( w_{H} \)):

\[

\begin{aligned}

\min_{w_{H}} & & \parallel

\begin{bmatrix}

Z & 0 \\

0 & Z

\end{bmatrix}

\begin{bmatrix}

w_{H} \\

1-w_{H}

\end{bmatrix} -

\begin{bmatrix}Y_{H}\\

Y_{N}\end{bmatrix}

\parallel ^2 \\

\text{s.t} & & 0 \preceq w_{H} \preceq 1

\end{aligned}

\]

Quick reminder: \( Z \) is the first round results compensated with the second round drop-out rate by candidate, \( Y_{H} \) and \( Y_{N} \) are the second round vote count for each of the two candidates in the second round, and \( w_{H} \) are the mixing coefficient for the second round such that \( Zw_{H} \approx Y_{H} \) and \( Z(1-w_{H}) \approx Y_{N} \).

An alternative way of expressing it is:

\[

\begin{aligned}

\min_{w_{H,N}} & & \parallel

\begin{bmatrix}

Z & 0 \\

0 & Z

\end{bmatrix}

\begin{bmatrix}

w_{H} \\

w_{N}

\end{bmatrix} -

\begin{bmatrix}Y_{H}\\

Y_{N}\end{bmatrix}

\parallel ^2 \\

\text{s.t} & & w_{H},w_{N} \succeq 0 \\

& & \begin{bmatrix}

I; I

\end{bmatrix}

\begin{bmatrix}

w_{H} \\

w_{N}

\end{bmatrix} = 1

\end{aligned}

\]

in which the inequality constraints force the *mix* variable
to be positive and the equality constraints force the exclusive voter
preference.

Let's see how we solve this. Let me do a sanity check first in order to make manipulations easier later on:

```
stopifnot(Reduce(`&`, Map(function(a, b) Reduce(`&`, a == b), c(rownames(Z), rownames(Z)),
c(YH$split_level, YN$split_level))))
```

Now we construct the variables that the solver expects:

```
ZZ <- matrix(0, nrow(Z), ncol(Z))
A <- rbind(cbind(Z, ZZ), cbind(ZZ, Z))
colnames(A) <- rep(colnames(Z), 2)
b <- matrix(c(YH$value, YN$value), nrow(A), 1)
E <- diag(1, ncol(Z), ncol(Z))
E <- cbind(E, E)
f <- matrix(1, ncol(Z), 1)
G <- diag(1, ncol(A), ncol(A))
h <- matrix(0, ncol(A), 1)
```

```
sol <- lsei(A, b, E, f, G, h, type = 1, verbose = FALSE)
show(wh <- sol$X[seq(1, ncol(Z))]) ## Mix variables for Haavisto
```

```
## Arhinmäki Haavisto Lipponen Essayah Soini Niinistö Biaudet Väyrynen
## 1.0000000 1.0000000 0.7799422 0.0009319 0.3448406 0.0067961 0.4775150 0.1603960
```

`show(wn <- sol$X[seq(ncol(Z) + 1, 2 * ncol(Z))]) ## Mix variables for Niinisto `

```
## Arhinmäki Haavisto Lipponen Essayah Soini Niinistö Biaudet Väyrynen
## 0.0000 0.0000 0.2201 0.9991 0.6552 0.9932 0.5225 0.8396
```

Which seems to suggest that the vast majority of Essayah voters supported Niinistö in the second round, as did Väyrynen's supporters mostly and Soini's. Haavisto however got most of his support from Arhimäki's first round voters and most of Lipponen's. Biaudet's supporters look divided according to the model. Notice that the objective is to explain voting sympathies from first round choices and not final results though know the former allows you easity to infer the later.

Last but not least a graph showing second round preferences by first round voting choice

```
x <- data.frame(second.round.choice=c('Haavisto', 'Niinistö'), rbind(wh,wn))
x <- melt(x, c("second.round.choice"), candidate.names,
variable.name='first.round.choice', value.name='Sympathy')
p <- ggplot(x, aes(x=first.round.choice, y=Sympathy, fill=first.round.choice))
p <- p + geom_bar() + labs(title='Second round sympathy') + theme(legend.position='none')
p <- p + scale_fill_manual(values=candidate.colormap[x$first.round.choice])
p <- p + scale_x_discrete(name="First round choice")
p <- p + scale_y_continuous(name="Sympathy %")
p <- p + facet_grid(second.round.choice ~ .) p
```

Again, an analysys of the obtained residuals is in place:

```
## normalizing by individual area result
r <- 100 * (Z %*% matrix(wh) - YH$value)/YH$value
shapiro.test(r)
```

```
## ## Shapiro-Wilk normality test
## ## data: r
## W = 0.6557, p-value < 2.2e-16
```

`summary(r) `

```
## V1
## Min. : -30.82
## 1st Qu.: -3.07
## Median : 1.52
## Mean : 6.08
## 3rd Qu.: 8.49
## Max. : 221.79
```

… which indicates residuals following a normal distribution.

The following graph shows how the first round voting sympathies contributed to the second round final result according to the model estimates:

```
model.results <- merge(merge(second.round.sympathy,
drop.out.rate, by.x = c("first.round.choice"),
by.y = c("Candidate")), totals.1st.round,
by.x = c("first.round.choice"),
by.y = c("candidate"))
model.results$first.round.choice <- as.character(model.results$first.round.choice)
model.results$second.round.choice <- as.character(model.results$second.round.choice)
model.results$contributions = as.integer(round(with(model.results, vote.count *
(1 - drop.out.rate) * Sympathy)))
p1 <- ggplot()
p1 <- p1 + geom_bar(data = model.results, stat = "identity",
aes(x = second.round.choice, y = contributions/1000, fill = first.round.choice))
p1 <- p1 + scale_fill_manual(name = "First round votes for:",
values = candidate.colormap[model.results$first.round.choice])
p1 <- p1 + scale_x_discrete(name = "\nSecond round candidates")
p1 <- p1 + scale_y_continuous(name = "Total seconds round votes (in 000s)\n")
p1 <- p1 + theme(axis.title.x = element_text(size = 24),
axis.title.y = element_text(size = 24))
p1 <- p1 + theme(axis.text.x = element_text(size = 14),
axis.text.y = element_text(size = 14))
p1 <- p1 + theme(legend.title = element_text(size = 16),
legend.text = element_text(size = 14))
p1
```