
Toning Down Polarization in Elections, Note 09/12/2021

Voting is central to collective decision-making. A proper mechanism should inspire trust in the participants and an assurance that their preferences are conveyed.

It has been said that democracy is the worst form of government except all those other forms that have been tried from time to time.
—Winston Churchill

The rules of an election have been more or less set in stone for millennia. The glacial pace of experimentation and evolution is understandable: the framework by which we come to agreements must be so intuitive that no one could possibly question its integrity. Does that remind you of something? We've seen firsthand how even the smallest added complexities (like mail-in ballots... let's not talk about electronic voting) inadvertently feed right into the rhetoric of demagogues.

Nevertheless, people continue to advocate for incremental changes. Ranked-choice voting appears to be a simple-enough upgrade: allow people to rank a number of candidates rather than just their top choice. It is alluring because, in places like the United States, it can slowly make it okay to choose third-party candidates. As long as you list someone more reasonable right after, you can rest easy knowing that you have not contributed to a heinous spoiler ploy.

The most fashionable style for evaluating ranked choices is flavored as Instant Runoff Voting (IRV). You know how parliamentary governments are often split among dozens of parties, and a runoff election must often be called when no one secures the required share of the vote? The idea of IRV is to automate that process by having people turn in ranked choices and simulating as many runoffs as necessary to arrive at a majority candidate. We first treat everyone's ballot as a single vote for their top pick. Until a majority is formed, the lowest-polling candidate is dropped and any ballot listing them as its current choice is revised to its next still-viable candidate. Simple!
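For the procedurally inclined, here is a minimal sketch of IRV in Python. The ballot format, the tie-breaking rule, and the handling of exhausted ballots are all conventions I picked for illustration; real election codes pin these details down in statute.

```python
from collections import Counter

def instant_runoff(ballots):
    """Pick a winner by IRV. Each ballot is a list of candidate names,
    ordered from most to least preferred (partial rankings allowed)."""
    viable = {c for ballot in ballots for c in ballot}
    while True:
        # Count each ballot toward its highest-ranked still-viable choice;
        # fully exhausted ballots drop out of the majority calculation.
        tallies = Counter({c: 0 for c in viable})
        for ballot in ballots:
            for choice in ballot:
                if choice in viable:
                    tallies[choice] += 1
                    break
        leader, top = tallies.most_common(1)[0]
        if 2 * top > sum(tallies.values()) or len(viable) == 1:
            return leader  # majority reached, or last candidate standing
        # Simulated runoff: drop the lowest-polling candidate
        # (ties broken alphabetically, purely for reproducibility).
        viable.discard(min(tallies, key=lambda c: (tallies[c], c)))
```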

But is it ideal? Some scholars do not think so. In my view, ranked choices---ordinal judgement, as they say in the social sciences---reside in that sweet spot of informativeness without too much free rein, as it is difficult to elicit more granularity from individuals without risking confusion. For instance, having them assign percentages to candidates, or choose from a range of scores, increases the attack surface for deception or strategic voting. That is my feeling. Objectively, we have data from actual ranked-choice elections out in the wild, which allows us to explore some realistic hypotheticals.

We have established that electoral processes carry massive inertia, and that this is desirable. I will need you to disregard that as we enter an imaginary world, a utopia where public institutions run on transparent algorithms with no political limits to their sophistication. Indulge me in this thought experiment, if you will:

It's time to think bigger: an overview of the method.

What can be said if we frame this as a Bayesian problem? Let us suppose that the population follows some distribution of preferences for all the candidates. In other words, an individual can be written as a vector of utility values exhibited for each candidate. We observe only a set of constraints in the form of rankings between some of the candidates, and are left to fill out the rest in our minds. Each individual's utilities are a latent variable that we shall infer using the few data on hand, and some impartial prior distribution. The task can be turned into one of sampling the posterior for the population's parameters, like means and variances, relying on guesses for the utilities; then, one can shift gears and sample another viable guess for each utility vector based on the rankings combined with these new parameters for the distribution. We kickstart the chain of samples by drawing a vague initial guess for the population's distribution. As an aside, I learned soon after having these thoughts that my formulation is called a Thurstonian model in psychology literature.
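In symbols, the single-population version looks like this (my notation; the mixture refinement comes below):

\[
\begin{aligned}
\mathbf{u}_i &\sim \mathcal{N}(\boldsymbol{\mu}, \Sigma) && \text{(latent utilities of voter } i\text{)},\\
\text{data:}\quad u_{i,a} &> u_{i,b} && \text{whenever voter } i \text{ ranks candidate } a \text{ above } b,\\
p(\boldsymbol{\mu}, \Sigma \mid \text{rankings}) &\propto \int p(\text{rankings} \mid \mathbf{u}_{1:n})\, p(\mathbf{u}_{1:n} \mid \boldsymbol{\mu}, \Sigma)\, p(\boldsymbol{\mu}, \Sigma)\, d\mathbf{u}_{1:n},
\end{aligned}
\]

where \(p(\text{rankings} \mid \mathbf{u}_{1:n})\) is simply an indicator that the utilities respect every observed ranking constraint.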

The tactic of partitioning the unknown random variables (here, into a. latent utilities and b. population parameters) and drawing from one set at a time while conditioning on the others' latest samples is called Gibbs sampling. It was proven long ago that this see-saw sampler eventually approximates the true joint distribution of all the variables. Once we have such samples, we can marginalize out the latent utilities (i.e. keep only the population's parameters) to magically arrive at an actionable posterior.
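A stripped-down sweep of that see-saw might look as follows, assuming complete rankings, a covariance fixed to the identity, and a \(\mathcal{N}(0, \tau^2 I)\) prior on the means. This is an illustrative simplification of the scheme, not the code behind the results below.

```python
import numpy as np
from scipy.stats import truncnorm

def gibbs_sweep(utilities, mu, rankings, tau2=10.0, rng=None):
    """One see-saw sweep. `utilities` is an (n voters, k candidates) array,
    `mu` the current population means (covariance fixed to the identity),
    `rankings[i]` voter i's candidate indices, best first."""
    rng = rng or np.random.default_rng()
    n, k = utilities.shape
    # (a) Redraw each latent utility from a normal truncated to the interval
    #     allowed by its neighbors in the voter's ranking.
    for i, order in enumerate(rankings):
        for pos, cand in enumerate(order):
            hi = utilities[i, order[pos - 1]] if pos > 0 else np.inf
            lo = utilities[i, order[pos + 1]] if pos < len(order) - 1 else -np.inf
            a, b = lo - mu[cand], hi - mu[cand]  # bounds standardized around the mean
            utilities[i, cand] = truncnorm.rvs(a, b, loc=mu[cand], random_state=rng)
    # (b) Conjugate update of the population means given the utilities.
    prec = n + 1.0 / tau2                  # posterior precision per candidate
    mu = utilities.sum(axis=0) / prec + rng.standard_normal(k) / np.sqrt(prec)
    return utilities, mu
```

Iterating this from a vague initial guess and discarding the burn-in yields the chain of samples described above.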

One final order of business is choosing how to parametrize the population's distribution of utility vectors. The most typical choice in the Thurstonian literature is the well-adored multivariate Gaussian. One formulation I have yet to encounter (though I wouldn't be surprised if I simply missed it) is a Gaussian mixture. It makes eminent sense to me because each mixture component can be interpreted as a voting bloc! We could allow for a composition of cohorts, each identified by ideology, affiliation, demographic, or geography. The Gaussian mixture model is flexible enough to approximate most relevant distribution shapes to an acceptable degree.

I had to incorporate one additional step: sampling a possible cohort of origin for each voter, under a Dirichlet prior on the mixture weights, and aggregating these memberships to re-infer the cohorts' weights.
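In the same sketchy register, that extra step conditions each voter's cohort label on their current utilities, then refreshes the mixture weights through the conjugate Dirichlet update:

```python
import numpy as np
from scipy.stats import multivariate_normal

def resample_cohorts(utilities, weights, means, covs, alpha=1.0, rng=None):
    """Mixture step of the Gibbs sweep: draw one cohort label per voter from
    the posterior responsibilities, then new mixture weights from the
    Dirichlet posterior given the label counts."""
    rng = rng or np.random.default_rng()
    n_cohorts = len(weights)
    labels = np.empty(len(utilities), dtype=int)
    for i, u in enumerate(utilities):
        # P(cohort c | u) is proportional to weight_c * N(u; mean_c, cov_c).
        resp = np.array([weights[c] * multivariate_normal.pdf(u, means[c], covs[c])
                         for c in range(n_cohorts)])
        labels[i] = rng.choice(n_cohorts, p=resp / resp.sum())
    counts = np.bincount(labels, minlength=n_cohorts)
    weights = rng.dirichlet(alpha + counts)  # conjugate Dirichlet update
    return labels, weights
```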

Purpose and application.

As stated at the outset, my aim is to concoct a statistical technique that picks the least polarizing candidate. The idea is that the candidate most agreeable to, say, 99% of the population is the one with the highest 1%-quantile of utilities. Utilities can depend on each other in complicated ways through the covariance and mixture structure of the population---a practical necessity to allow one individual to favor the conservative candidates all together and another the liberal ones. In deciding a winner, however, the only relevant quantiles are those of the marginal distribution of each candidate's utility.
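Given utilities simulated from the posterior over the population, the scoring itself is tiny. A sketch (the names are mine):

```python
import numpy as np

def agreement_scores(simulated_utilities, agreement=0.99):
    """`simulated_utilities`: an (m draws, k candidates) array of utilities
    simulated from the posterior over the population's mixture. A candidate's
    score at 99% agreement is the 1%-quantile of their marginal utility."""
    return np.quantile(simulated_utilities, 1.0 - agreement, axis=0)

# The least polarizing candidate maximizes this score:
# winner = int(np.argmax(agreement_scores(sims, agreement=0.99)))
```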

We first turn our attention to a particularly controversial outcome spotted in real-world IRV:

Burlington's 2009 Mayoral Election Winners

Plurality: Wright
IRV: Kiss
Condorcet: Montroll

Latent Utility Model, 10k samples, 8 cohorts
\(\{75\%, 90\%, 95\%, 99\%\}\) agreement:
  1. Montroll
  2. Smith
  3. Kiss

Latent Utility Model, 10k samples, 1 cohort
\(\{75\%, 90\%, 95\%, 99\%\}\) agreement:
  1. Montroll
  2. Kiss
  3. Wright

Let me explain the other kinds of winners displayed above: a so-called plurality winner is simply the one who garners the most top votes, as if ranked choices were not employed at all, i.e. the tally before IRV's simulated runoffs. The Condorcet winner, if one exists, is the candidate who beats every other in the pairwise comparisons tallied from the relevant ballots.
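A Condorcet winner is easy to check mechanically. Here is a small sketch against the same ballot format as earlier, treating unranked candidates as tied below all ranked ones (one convention among several):

```python
from itertools import combinations

def condorcet_winner(ballots, candidates):
    """Return the candidate who wins every pairwise matchup, or None
    if a Condorcet cycle leaves no such candidate."""
    def rank(ballot, c):
        # Unranked candidates share the worst rank.
        return ballot.index(c) if c in ballot else len(ballot)
    beats = {c: set() for c in candidates}
    for a, b in combinations(candidates, 2):
        a_wins = sum(rank(bal, a) < rank(bal, b) for bal in ballots)
        b_wins = sum(rank(bal, b) < rank(bal, a) for bal in ballots)
        if a_wins > b_wins:
            beats[a].add(b)
        elif b_wins > a_wins:
            beats[b].add(a)
    return next((c for c in candidates
                 if len(beats[c]) == len(candidates) - 1), None)
```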

Again, our scoring mechanism looks for the candidate that most of the population can stand to digest. Here, that happens to be the Condorcet winner! Not bad. Moving on, we turn to city elections in Minneapolis.

In 2017, two of Minneapolis's wards ended up with city-council representatives who differed from those who would otherwise have been picked, sans IRV.

Minneapolis Ward 3's 2017 Election Winners

Plurality: Jentzen
IRV: Fletcher
Condorcet: Fletcher

Latent Utility Model, 20k samples, 8 cohorts
\(\{75\%, 90\%, 95\%, 99\%\}\) agreement:
  1. Fletcher
  2. Bildsoe
  3. Pree-Stinson
  4. Jentzen

Latent Utility Model, 20k samples, 1 cohort
\(\{75\%, 90\%, 95\%, 99\%\}\) agreement:
  1. Fletcher
  2. Jentzen
  3. Bildsoe
  4. Pree-Stinson

Here we see another reassuring result for the novel technique. With multiple cohorts, we pick Fletcher, the Condorcet and IRV (albeit not plurality) winner. Now we've built up to the most interesting result:

Minneapolis Ward 4's 2017 Election Winners

Plurality: Johnson
IRV: Cunningham
Condorcet: Cunningham

Latent Utility Model, 50k samples, 8 cohorts
\(\{95\%, 99\%\}\) agreement:
  1. Gasca
  2. Cunningham
  3. Johnson
  4. Hansen

\(\{75\%, 90\%\}\) agreement:
  1. Cunningham
  2. Gasca
  3. Johnson
  4. Hansen

Who are Cunningham and Gasca, and why is this noteworthy?

Here's a bit of background. I will explain as briefly as possible while passing minimal judgement on the quality of each candidate, since I learned about this election from afar. Essentially, Johnson represented the Democratic establishment of that district: old, white, powerful, with a family legacy in the position. She appeared impactful, but staid. Hansen was a fringe libertarian. Cunningham and Gasca were the diverse, younger challengers. Of the two, Cunningham achieved significantly greater popular support. His fiery and perhaps populist rhetoric landed him neck and neck with Johnson, and he eventually unseated her through IRV. A marvelous result by many accounts.

But my algorithm picked Gasca, who received a mere quarter of Johnson's or Cunningham's number-one votes. Was Gasca more agreeable to Johnson voters, who perhaps felt marginally alienated by Cunningham's campaign? That's my conjecture. Here are the correlation coefficients from the final approximation:

[Figure: heatmap of correlations between Ward 4's candidates]

Gasca preference coincided rather strongly with preference for Cunningham! Johnson-Cunningham correlated negatively, but about as strongly. Johnson-Gasca preference correlation was less negative. So Gasca could have been less polarizing while still providing a refreshing change to the political scenery.

One interesting detail is the mild correlation surrounding the libertarian Hansen. Hansen voters appeared to favor Gasca, then Cunningham, and then Johnson. I expected a stronger correlation with Cunningham as a hint of anti-establishment populism. Perhaps this outcome further corroborates Gasca's agreeability. Please take this storytelling with a grain of salt, and let me know if you disagree with my interpretation.

European football.

On a less serious note, we can apply a similar methodology to evaluating the performance of sports teams in various leagues. Take, for instance, the 2020-2021 English Premier League. The league normally awards three points to the victor of a match, and one to each team in a draw. What if, instead of aggregating points on this arbitrary scale (heavily considered by many smart people, I'm sure, just not as rigorously founded), we treated each victory as a single ballot ranking the winner above the loser, with indifference towards all other teams? What would our procedure yield?
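Concretely, the translation from fixtures to ballots is trivial. A sketch with a made-up match format; I treat draws as contributing no constraint, which is one defensible convention:

```python
def matches_to_ballots(matches):
    """Each match is (home, away, home_goals, away_goals). A decisive match
    becomes a two-entry ballot ranking the winner above the loser, with
    implicit indifference toward the remaining teams; draws are skipped."""
    ballots = []
    for home, away, hg, ag in matches:
        if hg != ag:
            winner, loser = (home, away) if hg > ag else (away, home)
            ballots.append([winner, loser])
    return ballots

# e.g. matches_to_ballots([("Man City", "Arsenal", 2, 0)])
#   -> [["Man City", "Arsenal"]]
```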

Behold, the teams ranked by their expected objective worthiness:
(i.e. approximate game-population means, whatever that means)

    League Ranking

  1. Man City
  2. Man United
  3. Liverpool
  4. Chelsea
  5. Leicester
  6. West Ham
  7. Tottenham
  8. Arsenal
  9. Leeds
  10. Everton
  11. Aston Villa
  12. Newcastle
  13. Wolves
  14. Crystal Palace
  15. Southampton
  16. Brighton
  17. Burnley
  18. Fulham
  19. West Brom
  20. Sheffield United

    Objective Ranking:
    eight cohorts

  1. Man City
  2. Man United
  3. Liverpool
  4. Chelsea
  5. West Ham
  6. Leicester
  7. Tottenham
  8. Arsenal
  9. Everton
  10. Aston Villa
  11. Leeds
  12. Wolves
  13. Newcastle
  14. Brighton
  15. Crystal Palace
  16. Southampton
  17. Burnley
  18. Fulham
  19. West Brom
  20. Sheffield United

    Objective Ranking:
    one cohort

  1. Man City
  2. Man United
  3. Liverpool
  4. Chelsea
  5. West Ham
  6. Leicester
  7. Tottenham
  8. Arsenal
  9. Everton
  10. Aston Villa
  11. Leeds
  12. Wolves
  13. Newcastle
  14. Brighton
  15. Crystal Palace
  16. Southampton
  17. Burnley
  18. Fulham
  19. Sheffield United
  20. West Brom

Remarkably similar overall---good news for the League's system.