A Mathematical Case for Approval Voting

Approval voting is the best voting system (AKA, Taylor gets political)

Introduction

Updated 12/5/2025.

I'm back — Misha (Pokémon GO)

It’s time to get political… No, not like that! This post is based on an amazing paper I recently read: Approval Voting by the legendary Steven Brams and Peter Fishburn. If you are at all interested in voting systems, I highly recommend reading the paper. I have also been reading the Handbook on Approval Voting, and that has been quite enlightening as well. I wanted to share some of my thoughts on approval voting thus far, and why I think it’s the best voting system out there.

First: If there is one thing to take away from this post, it’s the following:

If you are asked to vote on a proposition to adopt an alternative voting system like RCV or Approval Voting, VOTE YES.

The purpose of this post is to explain why approval voting is actually the best voting system, mathematically speaking, but anything is better than the current plurality system we have in the US. Though, proportional representation is the ultimate goal (a topic for another post).

Plurality Problems

If you’ve dabbled in politics, even just a little, you’ve probably heard just how bad the current system used in the US (and many other countries) is. This is the first-past-the-post (FPTP), or plurality, voting system, where each voter selects one candidate and the candidate with the most votes wins.

This system has a number of well-known issues, such as the spoiler effect, where a third-party candidate can siphon votes away from a major candidate, leading to the election of a candidate who is less preferred by the majority. One of the most famous examples of this is the 2000 US Presidential Election, where Ralph Nader’s candidacy is widely believed to have siphoned votes away from Al Gore, leading to George W. Bush winning the election in Florida by only 537 votes (out of 5,963,110 cast in Florida, which is 0.009005% of the votes cast in the state), and thus winning the presidency (despite losing the popular vote).

Nader, being to the left of Gore, likely had much more ideological overlap with Gore than with Bush. Thus, it’s extremely likely most of Nader’s voters would have preferred Gore over Bush. Without the ability to express this preference, Nader voters effectively caused Bush to win by voting with their conscience. Honesty cost them dearly. If they had sucked it up and cast their vote for a “lesser evil”, they might have been better off. I think it’s safe to say, we can certainly do better than this.

Now, here’s the thing: Every voting system sort of defines its own metric for who should win, and FPTP is no exception. For example, the Borda count actually finds the best average rank of the candidates (this is a fun thing to prove). The metric used in FPTP is essentially measuring enthusiasm via first place support. This is not entirely undesirable, but it’s also pretty terrible in many ways.
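For the curious, here is a sketch of why the Borda claim holds. It assumes every one of the $V$ voters ranks all $N$ candidates, and that a candidate in position $r$ on a ballot earns $N - r$ points (the symbols $V$, $r_v(c)$, and $\bar{r}(c)$ are notation introduced just for this sketch). Writing $r_v(c)$ for the position of candidate $c$ on voter $v$’s ballot, the Borda score is

\[\text{Borda}(c) = \sum_{v=1}^{V} \bigl(N - r_v(c)\bigr) = VN - \sum_{v=1}^{V} r_v(c) = V\bigl(N - \bar{r}(c)\bigr),\]

where $\bar{r}(c)$ is the average position of $c$ across all ballots. Since $V$ and $N$ are fixed, the highest Borda score belongs to the candidate with the lowest (best) average rank.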

Spoiled elections have been happening for a long time (take the 1912 US Presidential Election, for example). A lesser known example of plurality disenfranchising voters is the 1970 New York Senate election, where James Buckley (the conservative candidate) won with only 39% of the vote, compared to 37% and 24% for two liberal candidates, Richard Ottinger and Charles Goodell, respectively. Together, the liberal candidates had 61% of the vote, showing that a majority of voters likely preferred a more liberal candidate over the conservative Buckley. However, due to the spoiler effect, the election wasn’t decided based on what the voters at large wanted, but rather which candidate had the largest and most fervent base of first-choice support (by about 2 percent).

Speaking of New York, a major focus of this year’s mayoral race was getting candidates with no chance to drop out, to avoid splitting the vote. Curtis Sliwa was essentially bribed to drop out so that Andrew Cuomo could have a better chance of defeating Zohran Mamdani, though he stayed in. Eric Adams also faced intense pressure to drop out, which he eventually did (with rumors of offers from the Trump administration for a position).

Similarly, Robert F. Kennedy Jr. tried very hard to get his name off of the ballots in 2024 once he decided to ally with Donald Trump. I think we should perhaps consider the absolute absurdity of the idea that it should somehow be a good thing that candidates are bribed to drop out of elections, and voters are encouraged to reduce their selection of candidates because the system itself punishes them for having more choices. But this mo’ candidates mo’ problems issue can be mathematically articulated.

Consider a case where we have $N$ candidates, $A_1,A_2,\ldots,A_N$, each with approximately equal share of support. Let’s say each candidate has $k$ first place votes, for some integer $k$. Then, the final voter decides the election by giving one candidate $k + 1$ votes, making them the winner with $\frac{k + 1}{Nk + 1}$ of the vote. As $k \to \infty$, this approaches $\frac{1}{N}$, meaning that it’s possible for a candidate to win with just over $\frac{1}{N}$ of the vote. In fact, this is actually the worst case scenario for FPTP. As the number of candidates increases, the possible proportion of the electorate who did not vote for or do not feel represented by the winner can approach 100%. Imagine if the supporters of every other candidate had the winner as their least favorite candidate. Then overall voter utility would be abysmal.

For example, say we have $N=10$ candidates, and each candidate has 10% of the vote. Then, one candidate can win with just over 10% of the vote, meaning nearly 90% of voters did not vote for the winner, and possibly strongly dislike them. This is a terrible outcome for representation.
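To get a feel for how quickly the winner’s share sinks toward $\frac{1}{N}$, here is a minimal Python sketch of the formula $\frac{k + 1}{Nk + 1}$ from above. The function name `worst_case_plurality_share` is just a label chosen for this illustration.

```python
from fractions import Fraction

def worst_case_plurality_share(n_candidates: int, k: int) -> Fraction:
    """Winner's share when every candidate has k votes and one final voter breaks the tie."""
    return Fraction(k + 1, n_candidates * k + 1)

# With N = 10 candidates, the winner's share approaches 1/10 as k grows.
for k in (1, 10, 1_000, 1_000_000):
    share = worst_case_plurality_share(10, k)
    print(f"k = {k:>9,}: winner's share = {float(share):.6f}")
```

Using exact fractions avoids any floating point fuzz until the final print, which makes it easy to check small cases by hand (for instance, $k = 1$ gives $\frac{2}{11}$).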

One of the first and most obvious solutions to fix a system like FPTP is to allow voters to express more nuanced preferences. This is where ranked voting systems and approval voting come into play.

Ranked Voting Systems

Now, you RCV fans might be thinking, “Well, RCV would fix that!” And, to that I say, “Possibly!”

Ranked voting systems, such as Ranked Choice Voting (RCV) (also known as Instant Runoff Voting), ask voters to rank the candidates in order of preference.

Proponents of ranked systems like RCV often tout the feature of ranking candidates as a way to express more nuanced preferences. And, indeed, this is an appealing step up from plurality’s “pick only one” mentality! However, one of the main issues comes with the fact that ranks are actually a lot less granular than one might think (not to mention, easily manipulated strategically).

Consider a Nader voter in the 2000 US Presidential Election, who might rank Nader over Gore over Bush. In RCV, this voter can express that they prefer Nader the most, Gore second, and Bush last. But the question is: which outcomes would this voter actually be satisfied with? Clearly, they would probably be most satisfied if Nader wins. There is little ambiguity there. And if Bush wins, they would likely be the most dissatisfied. But what about if Gore wins? We know that their happiness would be somewhere in between, but we have no idea if this voter would see the result and think “I’m okay with this” or not.

More generally, if we have three candidates, $A,B,C$, and a voter who prefers $A > B > C$, it’s impossible to know which results this voter would be satisfied with. Can you think of two lists of three things where you clearly rank them in preference 1, 2, and 3, but where in one list you are satisfied with the second place outcome, and in the other list you would be dissatisfied?

I, for example, love mangoes. Bananas are good, but not as good as mangoes, and I hate avocados. So I rank them Mango > Banana > Avocado. If Mango wins, I’m happy. If Avocado wins, I’m sad. But if Banana wins, I would be okay with it, since it’s still a fruit I like.

On the other hand, while I hate Avocados more than anything in the world, I also strongly dislike Beets. Suppose that bananas drop out of the running, and I have to choose between Mangoes > Beets > Avocados. Now, I’m only okay with Mangoes winning, and I would not be satisfied with anything else.

Suppose we ask a voter who ranks $A > B > C$ to rate each candidate on a scale of 0 to 100. It’s impossible to know if they would say

\[A: 100, B: 95, C: 80\]
\[A: 100, B: 99, C: 0\]
\[A: 100, B: 50, C: 40\]
\[A: 100, B: 1, C: 0\]
etc.
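As a small illustration, the sketch below takes the four rating profiles above and shows that they all collapse to the same ranking, even though the set of outcomes the voter would be satisfied with differs. The cutoff of 50 for “satisfied” is purely an assumption made for this example, as are the helper names `ranking` and `satisfied_with`.

```python
# The four hypothetical rating profiles from above, on a 0 to 100 scale.
profiles = [
    {"A": 100, "B": 95, "C": 80},
    {"A": 100, "B": 99, "C": 0},
    {"A": 100, "B": 50, "C": 40},
    {"A": 100, "B": 1, "C": 0},
]

def ranking(ratings):
    """Candidates sorted from highest rating to lowest."""
    return sorted(ratings, key=ratings.get, reverse=True)

def satisfied_with(ratings, threshold=50):
    """Candidates rated at or above an ASSUMED satisfaction threshold."""
    return {c for c, r in ratings.items() if r >= threshold}

for p in profiles:
    print(ranking(p), "satisfied with:", sorted(satisfied_with(p)))
```

Every profile produces the ranking A, B, C, so a ranked ballot cannot tell these four voters apart.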

I am going to steal an example mentioned in a comment featured in this video from PBS Infinite Series: Consider the option between

$11
$10
$1
A punch to the face

I think most of us would rank the options in the order given above. But how much better is 10 dollars compared to one dollar? Am I okay with only getting one dollar? How much better is 11 dollars compared to 10? How much more do I really value 11 dollars over 1 versus 10 dollars over 1? It’s also very difficult to really express that I would strongly prefer a dollar over a punch to the face.

In short, the ranking system does not capture the intensity of the voter’s preferences. While often one is allowed to simply not rank certain candidates, this can often be utilized strategically (ex. if a candidate says to only rank them and not others). Additionally, this leads to ballot exhaustion issues in RCV, where if a voter’s ranked candidates are all eliminated, their ballot no longer counts in the final rounds. And then, the final winner may not actually even get majority support of the entire electorate!

Now you may be thinking: “Well, we can just use a range/rating voting system then!” The issue with that (in general) is that it gets even more complicated, and is even more susceptible to strategic voting. With one key exception, which I will get to later. (Hint: it’s approval voting.)

Getting back to the different metrics that voting systems define, RCV can actually be much worse in valuing first place support. In fact, in RCV, it’s possible for a candidate to win with just over $\frac{1}{2^{N-1}}$ of the first choice vote (though, the proportion of voters who rank the winner last cannot exceed 50%)! I recommend you try to find an example where this happens as an exercise (try for three candidates first). So, sure, RCV clearly does not measure first choice support well, but that’s also not the point of RCV!
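Without giving away the example, here is a rough sketch of where the $\frac{1}{2^{N-1}}$ bound comes from. It assumes every voter ranks all $N$ candidates (so no ballots are exhausted) and ignores ties; the symbols $W_j$ and $t_j$ are notation introduced only for this sketch. Let $W_j$ be the eventual winner’s share of the vote in round $j$, and let $t_j$ be the share held by the candidate eliminated in round $j$. The winner is never the one eliminated, so $t_j \le W_j$, and at best the winner inherits every transferred ballot:

\[W_{j+1} \le W_j + t_j \le 2W_j.\]

After $N - 2$ eliminations only two candidates remain, and the winner needs a majority there, so

\[\frac{1}{2} < W_{N-1} \le 2^{N-2}\, W_1 \quad\Longrightarrow\quad W_1 > \frac{1}{2^{N-1}}.\]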

Rather, RCV instead looks to eliminate candidates until one candidate has majority support over all others (in terms of the remaining ballots, at least). But we will go into depth later about how problematic this can be.

The point I am trying to make is that the metrics that different methods optimize for can yield strictly different outcomes for (at least) pathological cases. However, pathological cases do not a bad voting system make. It’s easy to point out undesirable outcomes for any voting system if you look hard enough.

What Makes a Good Voting System?

There are cases where different metrics may be arguably desirable over others. What makes a good voting system is indeed subjective. But, as a mathematician, I think the following criteria are important:

1: The system is simple enough for the average voter to understand.
2: The system satisfies as many desirable mathematical properties as possible. This ensures a robustness against various paradoxes and pathological cases.

These, I think, are universally agreed upon, for the most part. However, I will add one more criterion that I think is important:

3: The system results in optimal representation of the electorate.

Now, this last criterion is a bit vague. One could argue that whoever has the most first choice support, and gets sufficient enthusiasm, should win (even if they only get about $1/N$ of the vote). Or perhaps one might propose that the majority should rule, should a majority agree on one candidate.

Optimal representation can mean different things to different people. Is it maximized total voter utility (even if that means a few extremely happy voters and many unhappy ones)? We will delve into this more later, but for now, I want to introduce a concept that helps us formalize this idea of an “optimal” outcome.

There is an objective way to measure whether or not a candidate is really the optimal winner of the election: the Condorcet criterion.

Condorcet Winners

A Condorcet winner is a candidate who would beat every other candidate in a head-to-head matchup. That’s it! Simple enough, right?

But why pick this as the end-all-be-all criterion for optimal representation? Well, the way I think of it is as follows: Suppose you run the election and get a winner, call them $W$. The question one should ask is: “Is there any other candidate $C$ such that a majority of voters would prefer $C$ over $W$?” If the answer is yes, then clearly $W$ is not the optimal representative of the electorate, since a majority would prefer someone else! In a sense, there would be doubt about whether $W$ is really the best choice if another candidate $C$ is preferred by a majority.

More voters (approximately 500,000, in fact) preferred Al Gore over George W. Bush in the 2000 US Presidential Election, yet Bush won due to the Electoral College system, and an absurdly close vote in Florida. Was Bush really the optimal winner if more voters preferred Gore over him?

Therefore, we sort of give a rigorous and deterministic definition of the “optimal” winner, essentially by looking at how the winner compares to every other candidate in a head-to-head matchup.

Now, if every election had a Condorcet winner, life would be easy. We could just pick the Condorcet winner every time, and be done with it. However, this is not generally the case, at least theoretically. The classic paradox is the rock-paper-scissors scenario, where we have three candidates $A,B,C$, and the electorate is split into three groups:

Group 1 prefers $A > B > C$
Group 2 prefers $B > C > A$
Group 3 prefers $C > A > B$

With the right number of voters in each group (ex. an equal split), we can have a situation where $A$ beats $B$, $B$ beats $C$, and $C$ beats $A$, creating a cycle with no Condorcet winner. If the three groups above have equal size, then each candidate wins exactly one head-to-head matchup two to one, and loses by the same margin in another matchup.

Now, there is a way to potentially fix a lack of a Condorcet winner in some cases, by measuring by how much each candidate beats the other candidates, and starting the ranking from there. But the main issue is that this method is very complicated! Thus, it arguably fails our first simplicity criterion. Particularly if we have to hold multiple runoffs or head-to-head matchups to determine the winner.

Luckily, there’s a way around holding multiple runoffs and elections: by asking the voters to rank the candidates. This way, we can simulate head-to-head matchups without actually holding them. If we ask voters to rank candidates, we can see who would win in a head-to-head matchup by simply looking at their rankings. We can compare how many voters rank candidate $A$ over candidate $B$, and vice versa, to see who would win in a head-to-head matchup.
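Here is a minimal Python sketch of that idea: it tallies every head-to-head matchup directly from ranked ballots and reports a Condorcet winner if one exists. The ballot format (a ranking string paired with a count) and the function names are assumptions made for this illustration.

```python
def pairwise_tally(ballots):
    """ballots: list of (ranking, count), e.g. ("ABC", 3) means 3 voters rank A > B > C.
    Returns {(x, y): number of voters who rank x above y}."""
    candidates = ballots[0][0]
    tally = {(x, y): 0 for x in candidates for y in candidates if x != y}
    for ranking, count in ballots:
        for i, x in enumerate(ranking):
            for y in ranking[i + 1:]:
                tally[(x, y)] += count
    return tally

def condorcet_winner(ballots):
    """Return the candidate who wins every head-to-head matchup, or None if there is none."""
    candidates = ballots[0][0]
    tally = pairwise_tally(ballots)
    for c in candidates:
        if all(tally[(c, o)] > tally[(o, c)] for o in candidates if o != c):
            return c
    return None

# The rock-paper-scissors electorate with three equal groups: no Condorcet winner.
cycle = [("ABC", 1), ("BCA", 1), ("CAB", 1)]
print(condorcet_winner(cycle))

# Enlarging the first group breaks the cycle, and A beats both rivals.
lopsided = [("ABC", 3), ("BCA", 1), ("CAB", 1)]
print(condorcet_winner(lopsided))
```

Note that this sketch assumes every ballot ranks every candidate; handling partial rankings would need an extra rule for unranked candidates.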

However, in addition to the granularity issues mentioned earlier with ranked voting systems, ranked voting systems are also notoriously susceptible to dishonest strategic voting. One could, for example, rank their favorites highest, and the other front-runners lowest, regardless of their actual preferences.

It is primarily the granularity which, in some ways, leads to the general issue that Arrow’s Impossibility Theorem highlights.

Arrow’s Impossibility Theorem

Okay, it’s time for math. Okay, not that much, but a little. Without getting into the weeds of the proof of Arrow’s theorem (Veritasium has a very accessible video on it), I want to give you a very rough idea of what exactly it says is impossible. This is a math blog after all!

For an election, we essentially have two desirable ideals for a voting system:

Responsiveness: The result of the election should reflect the preferences of the electorate.
Stability: The result of the election should not be unduly influenced by irrelevant factors.

One can consider a type of stability being that the introduction of a new candidate should not change the relative ranking of existing candidates (i.e. no spoiler effect). If Gore beats Bush in a two candidate race where nobody is allowed to vote for anyone else, then introducing Nader should not cause Bush to suddenly jump over Gore in the final ranking if nobody swapped the relative rankings of Gore and Bush (ex. no Gore voters moved Bush above Gore, and no Bush voters moved Gore above Bush). However, we have a slightly different notion of stability used in Arrow’s theorem.

We articulate these two ideals formally and mathematically as follows:

Pareto Efficiency: If every voter prefers candidate $A$ over candidate $B$, then the final result should have $A$ ranked higher than $B$. This is a formalization of a relatively minimal responsiveness property.
Independence of Irrelevant Alternatives (IIA): The idea is that if everyone in the electorate does not change their relative rankings of two candidates $A$ and $B$, then the final result should not change the relative ranking of $A$ and $B$. This is one formalization of the idea of “stability” mentioned earlier.

To get a sense of what IIA means, let’s continue our example of Gore and Bush. Suppose that Gore does better than Bush in an election. For example, say Gore gets more votes overall. Now, suppose that some Gore voters decide to move Nader around in their ballot, but never change the relative ranking of Gore and Bush, for example by changing $G>B>N$ to $N>G>B$ or $G>N>B$. In all three ballots, Gore is still ranked higher than Bush. If the voting system satisfies IIA, then the final result should still have Gore ranked higher than Bush.

Clearly, IIA would have been violated in the actual 2000 US Presidential Election in Florida. We had that Bush beat Gore (we had Bush > Gore > Nader as the final ranking, obtained by counting first place votes), but suppose that 538 Nader voters in Florida who preferred Gore over Bush had instead voted for Gore. That is, they change Nader > Gore > Bush (i.e. a vote for Nader) to Gore > Nader > Bush (a vote for Gore). Then Gore would have gotten more first place votes than Bush, and thus would have won Florida and the presidency. But we did this without changing anyone’s relative ranking of Gore and Bush! Those 538 voters still preferred Gore over Bush, but the “irrelevant alternative” of Nader caused the final ranking of Gore and Bush to flip.
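The flip can be reproduced in a few lines of Python. The vote counts below are illustrative toy numbers, not the real Florida totals; only the 537-vote margin and the 538 switching voters come from the discussion above. The ballot encoding (B for Bush, G for Gore, N for Nader) and the function names are likewise assumptions for this sketch.

```python
from collections import Counter

def plurality_ranking(ballots):
    """ballots: {ranking string: count}. Only the first choice on each ballot is counted."""
    firsts = Counter()
    for ranking, count in ballots.items():
        firsts[ranking[0]] += count
    return sorted(firsts, key=firsts.get, reverse=True)

def prefer(ballots, x, y):
    """Number of voters who rank x above y anywhere on their ballot."""
    return sum(n for r, n in ballots.items() if r.index(x) < r.index(y))

# Toy electorate (NOT real data): Bush leads Gore by 537 first place votes.
before = {"BGN": 2537, "GNB": 2000, "NGB": 900}
# 538 voters change Nader > Gore > Bush into Gore > Nader > Bush.
after = {"BGN": 2537, "GNB": 2000 + 538, "NGB": 900 - 538}

print(plurality_ranking(before), prefer(before, "G", "B"))
print(plurality_ranking(after), prefer(after, "G", "B"))
```

The number of voters who prefer Gore over Bush is identical in both elections, yet the final order of Gore and Bush flips, which is exactly the IIA failure.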

More formally, if $A > B$ in the final result, then as long as everyone’s relative rankings of $A$ and $B$ stays the same, they should be able to change all other aspects of their ballot (including introducing new candidates) without changing the fact that $A > B$ in the final result. Maybe someone else wins, or $B$ jumps down to last place, but $A$ should still be ranked higher than $B$ in the final result.

I want to be clear about what exactly these two properties mean, though, individually. For example, let’s talk about systems that satisfy one but not the other.

An Example of IIA

Take a system where we sort the candidates alphabetically by their last name. For example,

\[\mathcal{B}\text{ush} > \mathcal{G}\text{ore} > \mathcal{N}\text{ader}\]

This system satisfies IIA, because no matter how voters change their ballots (and in particular when they keep the relative rankings of two candidates fixed), the relative ranking of those candidates in the final result does not change. But, it does not satisfy Pareto efficiency, because it does not respond to the electorate’s preferences at all. What if literally every voter preferred Gore over Bush? The final result would still have Bush ranked higher than Gore. Intuitively, this method definitely has the stability, but no responsiveness at all.

On the other side, Plurality Voting (FPTP) satisfies Pareto efficiency, but not IIA. We saw the IIA failure with the 2000 election above, where 538 voters only changed where Nader sits on their ballots, yet the relative result of Gore and Bush flipped. But we can still see a clear basic responsiveness. If every voter preferred Gore over Bush, then Bush would not have gotten a single vote, and thus could not have finished ahead of Gore (and Gore finishes strictly ahead as long as he gets at least one vote). Thus, FPTP is Pareto efficient, at least in this basic sense.

Dictatorship

But, okay, I want to give you one voting system that does in fact satisfy both properties: dictatorship! That is, it’s me, I decide who wins the election. Taylor wins. Whatever my ranking is, that’s the final result.

Well, if everyone prefers $A$ over $B$, then I, the dictator, must also prefer $A$ over $B$, and thus the final result has $A$ ranked higher than $B$. So Pareto efficiency is satisfied. It’s responsive (at least to my preferences!).
Also, if everyone’s relative rankings of $A$ and $B$ do not change, then that includes me, the dictator! Thus, the final result will still have the same relative ranking of $A$ and $B$ (because I didn’t change my ranking of them, and the final result is just my ballot!). So IIA is satisfied.

Now, what Arrow’s impossibility theorem says is that dictatorship is actually the only ranked voting system that satisfies both properties! In fact, the underhanded sleight-of-hand that Arrow’s theorem rests on basically comes down to the fact that no “reasonable” (non-dictatorial) voting system has IIA. In a sense, dictatorship is the absolute simplest voting system possible, and that simplicity is what allows it to satisfy basically any desirable property you can throw at a voting system, all while being absolutely terrible in practice.

The proof is way too long to go into here, but I’d like to share a bit of intuition for why this is the case. As we said before, one can effectively think of Pareto efficiency as a “responsiveness” property, and IIA as almost a type of “continuity” or “stability”. A crucial lemma in the proof of Arrow’s theorem is that if every voter ranks a candidate $A$ either first or last, then $A$ must either win or be in last place. This is called the “extremal” lemma.

This essentially comes down to the fact that we can rearrange all other candidates without changing anyone’s relative rankings of the extremal candidate $A$. Suppose $A$ landed in the middle of the final result, say $B > A > C$. Since every voter has $A$ at the very top or the very bottom, we can move $C$ above $B$ on every ballot without changing how anyone ranks $A$ against $B$ or against $C$. By IIA, the final result must still have $B > A$ and $A > C$, and so $B > C$ by transitivity. But by Pareto efficiency, since everyone now prefers $C$ over $B$, the final result must have $C > B$. That is a contradiction, so $A$ cannot be in the middle.

For those who are here for the math, the proof of Arrow’s theorem is a little bit like the intermediate value theorem in calculus. The argument essentially comes from the idea that we can “push” a candidate from being everyone’s last choice to being everyone’s first choice, by moving them from the bottom of the ranking to the top one voter at a time. At some point, that candidate has to cross from last place to first, and at that point, there must be a voter who “decides” the outcome. Essentially, we can show this decider is actually a dictator, and their ranking is the final result. More specifically, that they decide all pairwise matchups.

A note on the Gibbard-Satterthwaite theorem

Now, it gets even worse for ranked voting systems. The Gibbard-Satterthwaite theorem states that every minimally responsive non-dictatorial ranked voting system with at least three candidates is susceptible to strategic voting.

That is, there exists at least one scenario where a voter can misrepresent their preferences in a way that leads to a more desirable outcome for them. This means that the voting system might not just be opaque and complicated to understand for the average voter, but it might require them to be strategic in order to get the best outcome (which they may also not be particularly happy with) that they can. Since elections are also often held simultaneously, voters have to guess what other voters will do, and vote strategically based on that. When, for example, polls are misleading, this can lead to disastrous outcomes.

The proof of this theorem is actually quite interesting, in my opinion. The intuitive idea comes down to “blocking sets”. For example if you have a group of 51 senators who agree to vote the same way on everything, then they decide the outcome of any vote in the senate. This is a blocking set.

The proof of the Gibbard-Satterthwaite theorem relies on showing that in a ranked system satisfying strategyproofness (i.e. no voter can get a strictly better outcome by being dishonest), if you split up a blocking set, one of the pieces has to be a blocking set. Repeatedly applying this idea, we can whittle down any blocking set to a single voter, who is then a dictator. What this shows is that strategyproofness is (like IIA) a property so strict in a ranked system that it can literally only be satisfied by dictatorship.

Here’s a fun voting system which is still strategyproof and (slightly) more democratic: random dictatorship! Take a random ballot cast and that’s the result of the election. It’s dictatorship but everyone gets a chance to have their voice heard. I wonder why nobody advocates for this system…

At this point, you may be lowering strategyproofness and IIA into a coffin together. It seems that only dictatorship can satisfy these properties. But, that’s only for a ranked voting system. Don’t leave them to rest in peace just yet!

An Example of Strategic Voting in RCV

Let’s consider a concrete example using the 2016 GOP primary, supposing it was done under RCV. Donald Trump was the clear front‑runner, while Ted Cruz and Marco Rubio were competing for second place. Some polling showed Marco Rubio was often a strong second choice for many GOP primary voters, and was doing better in general election polls, while Cruz was not well-liked outside of his base.

Let’s say Trump has a clear plurality with about 40% of first‑place support. And let’s suppose Ted Cruz and Marco Rubio are nearly tied behind him at about 30% each.

In this scenario, under RCV, the crucial question is not whether Trump will win the first round, but which of Cruz or Rubio will be eliminated first, since that elimination will decide the winner.

Suppose

Cruz voters tend to rank Cruz > Rubio > Trump,
while Rubio voters tend to rank Rubio > Trump > Cruz.

What should an abstract Trump supporter do with their ranked ballot in this situation? The honest option is to put Trump first, but in this configuration there is a counterintuitive strategic move that becomes strategically beneficial: Trump supporters can increase Trump’s chances by ranking Cruz ahead of Trump (Cruz first, Trump second), even if they dislike Cruz.

Why would they do this? Well, if Cruz is eliminated first, a large majority of Cruz transfers may flow to Rubio, pushing Rubio to win the final round’s head‑to‑head against Trump. If, instead, enough Trump supporters rank Cruz first, pushing Rubio to have the fewest first place votes and be eliminated, then Trump only needs about a third of Rubio’s votes to push him to a majority over Cruz and win the primary.

Thus, even when most Trump supporters genuinely place Cruz last, it can still be optimal for them to misorder their ballot to influence which rival is knocked out first. Their honest ranking is not the optimal strategy, and they are encouraged to pretend their least favorite is actually their favorite in order to get a better outcome. This shows that RCV is susceptible to strategic voting. We can also see it’s neither monotonic nor sincere:

Not monotonic: ranking a candidate higher can actually hurt their chances of winning.
Not sincere: voting dishonestly can lead to a better outcome.
Note: to be strategyproof, a voting system must be sincere and have only one optimal strategy (to vote honestly). Thus, RCV is not strategyproof because it is not sincere.

A small, concrete paradox makes the point. Imagine two Trump supporters cast Cruz‑first ballots and those two ballots are exactly the deciding factor that causes Rubio to be eliminated by one vote in round one. Trump then wins in the final round. If those two voters had instead voted honestly (Trump first), the first‑round ordering flips so that Rubio beats Cruz by one vote, Cruz is eliminated, and Rubio goes on to beat Trump. In that case, moving Trump higher in their ranking (voting honestly) actually causes Trump to lose. This is a clear violation of monotonicity, sincerity, and strategyproofness.
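To make the paradox checkable, here is a minimal instant runoff sketch in Python run on a toy electorate of 100 voters. The exact counts (41/30/29) are assumptions chosen to roughly match the 40/30/30 split above and to make two ballots decisive; they are not real polling or election data. The function name `irv_winner` and the ballot format are also just choices for this illustration, and ties are broken arbitrarily.

```python
from collections import Counter

def irv_winner(ballots):
    """ballots: {ranking tuple: count}. Each round, count every ballot for its highest
    ranked remaining candidate; if nobody has a majority, eliminate the lowest tally."""
    remaining = {c for ranking in ballots for c in ranking}
    while True:
        tally = Counter({c: 0 for c in remaining})
        for ranking, count in ballots.items():
            top = next((c for c in ranking if c in remaining), None)
            if top is not None:
                tally[top] += count
        leader, votes = tally.most_common(1)[0]
        if 2 * votes > sum(tally.values()):
            return leader
        remaining.remove(min(tally, key=tally.get))

# Honest ballots: Rubio edges Cruz by one vote in round one, so Cruz is eliminated.
honest = {
    ("Trump", "Rubio", "Cruz"): 41,
    ("Rubio", "Trump", "Cruz"): 30,
    ("Cruz", "Rubio", "Trump"): 29,
}
# Two Trump supporters insincerely rank Cruz first, so Rubio is eliminated instead.
strategic = {
    ("Trump", "Rubio", "Cruz"): 39,
    ("Cruz", "Trump", "Rubio"): 2,
    ("Rubio", "Trump", "Cruz"): 30,
    ("Cruz", "Rubio", "Trump"): 29,
}

print("honest:", irv_winner(honest))
print("strategic:", irv_winner(strategic))
```

Under the honest ballots Rubio wins the final round 59 to 41, while the two insincere ballots hand Trump a 69 to 31 win over Cruz. Read in reverse, moving Trump up on those two ballots makes him lose.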

Further, notice that under these rankings, there is no outcome where Cruz wins this election unless he is ranked first by a majority of voters. This does show that RCV is quite good at not electing candidates who are widely disliked (it cannot elect a Condorcet loser, a candidate who loses all head-to-head matchups against other candidates). But it also means that Cruz is strictly a spoiler in this election, who seems to have siphoned off votes from Rubio, even though Rubio would actually be a Condorcet winner. That is to say, while RCV does help prevent some spoiler effects, it is still absolutely susceptible to them in other ways.

The point of this example is not to make claims about actual GOP primary voters from the 2016 election, or to proclaim realistic theoretical results, but to show that this (relatively) plausible configuration, based on actual perceived dynamics, can lead to RCV pathologies.

Now, it may seem like all hope is lost. For hundreds of years, competing ranked systems squabbled over which was best, until Arrow, Gibbard, and Satterthwaite came along and dashed all hopes of a perfect ranked voting system.

However, there is actually one non-dictatorial voting system that satisfies IIA and a relaxed version of Pareto efficiency, and for which strategic voting is both more straightforward and more sincere (i.e. strategic voting is less about misrepresenting the order of one’s preferences, and being sincere can lead to a better outcome). That system is approval voting.

Approval Voting

Okay, so what is approval voting (or Approval Voting)? If it satisfies IIA and Pareto efficiency, it must be super complicated, right? Actually, it’s dead simple. In fact, it’s arguably even simpler than plurality.

In approval voting, each voter can vote for as many candidates as they want. The candidate with the most votes wins.

I would argue it’s the simplest non-dictatorial system that exists. Plurality is basically approval voting with the added requirement of one vote per voter. This simplicity is part of what makes Approval Voting so robust.
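To show just how little machinery is involved, here is a minimal Python sketch of an approval tally. Each ballot is simply the set of candidates that voter approves of; the toy electorate and the function names are assumptions made for illustration only.

```python
from collections import Counter

def approval_tally(ballots):
    """Each ballot is a set of approved candidates; every approval counts as one vote."""
    tally = Counter()
    for approved in ballots:
        tally.update(approved)
    return tally

def approval_winner(ballots):
    """Candidate with the most approvals (ties broken arbitrarily in this sketch)."""
    tally = approval_tally(ballots)
    return max(tally, key=tally.get)

# Toy electorate (NOT real data): 3 voters like Nader and Gore, 4 like only Gore, 6 like only Bush.
approval_ballots = [{"Nader", "Gore"}] * 3 + [{"Gore"}] * 4 + [{"Bush"}] * 6
# Plurality is the same tally with every ballot restricted to a single favorite.
favorite_only_ballots = [{"Nader"}] * 3 + [{"Gore"}] * 4 + [{"Bush"}] * 6

print(approval_winner(approval_ballots))
print(approval_winner(favorite_only_ballots))
```

The same counting code handles both systems. The only difference is whether a ballot may contain more than one name, and in this toy example that difference changes the winner from Bush to Gore.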

Personally, I can think of a number of examples of elections where I would be perfectly happy to vote for multiple candidates. I often find myself, when watching a debate, thinking “Yeah, I definitely like two of these candidates, and would be happy with either winning. That other one, though, no way.” Approval voting allows me to express that sentiment directly.

Note: This is technically a rated system, but with only two levels: approve or disapprove (a binary system). This gives us the ability to escape the impossibility results that plague ranked systems while also keeping things simple such that strategic voting is way less complicated. For example, you can’t just give your favorites 100,000 points and literally everyone else -100,000 points (even if you actually like those other candidates) in an attempt to game the system. Choosing between giving a candidate a single vote or not giving them a vote is far less vulnerable to strategic manipulation. You can even effectively vote against a candidate by approving all other candidates except them!

It can’t be that simple, can it? Well, yes, it actually is! And not just this, it actually satisfies far more desirable properties than just Pareto efficiency and IIA.

First, let’s talk about Pareto efficiency. Approval is not strictly Pareto efficient, but it does satisfy a relaxed version of it. Take this very simple example: Suppose every voter prefers $A$ over $B$, but still every voter approves of both $A$ and $B$. Then, $A$ and $B$ will be tied for first place with 100% of the vote each. Thus, we can’t say that $A$ will strictly beat $B$, but we can say that $A$ will not lose to $B$!

Now, let’s see why it satisfies IIA. The workaround is that we’ve left rankings behind. Since we only have two tiers of preference, then the only way to adjust one’s ballot while preserving relative preferences of two candidates $A$ and $B$ is to move both candidates between the approve and disapprove buckets together (i.e. we can only change the ballots of voters who are indifferent to them, and must remain indifferent), in which case their total approval votes both go up or down together, preserving the difference in their totals. Therefore, Approval voting actually satisfies IIA! In fact, it almost satisfies a super version of IIA, because no matter how everyone else changes their ballots, the exact difference between two candidates’ approval totals is preserved as long as no voter changes their relative approval of the two candidates. Thus, approval voting is actually even more stable than IIA requires! Additionally, the spoiler effect is almost entirely eliminated.
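Here is a small sketch of that argument, assuming ballots are sets of approved names. The two ballot lists are made up for illustration: between `before` and `after`, voters reshuffle their approval of $C$, and voters who are indifferent between $A$ and $B$ move both in or out together, but nobody changes their relative approval of $A$ and $B$.

```python
def margin(ballots, x, y):
    """Approvals of x minus approvals of y."""
    return sum(x in b for b in ballots) - sum(y in b for b in ballots)

# Hypothetical electorate of five voters.
before = [{"A"}, {"A", "B"}, {"B"}, {"A"}, set()]

# Same voters, same relative approval of A vs. B on every ballot:
#   voter 1 adds C, voter 2 drops A and B together and adds C,
#   voter 3 adds C, voter 4 is unchanged, voter 5 adds A and B together.
after = [{"A", "C"}, {"C"}, {"B", "C"}, {"A"}, {"A", "B"}]

print(margin(before, "A", "B"), margin(after, "A", "B"))
```

Both margins come out the same, because every allowed change either leaves $A$ and $B$ alone or moves both totals by the same amount.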

For the spoiler effect, instead of a strict ranking, for which adding a new candidate requires comparing that new option to all existing options, we instead essentially ask voters to group candidates into two categories: those they approve of, and those they do not. In other words, the candidates for which the outcome is satisfactory, and those for which it is not. Note, this means that instead of the $O(n!)$ growth of possible ballots, we only have $2^n$ possible ballots (each candidate is either approved or not approved). This is a massive reduction in complexity, particularly once you have 4 or more candidates.
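A quick sketch of the two growth rates, assuming a ranked ballot means a full strict ranking (no ties, no truncation) and counting the empty and approve-everyone ballots among the $2^n$ approval ballots. The helper name `ballot_counts` is just for illustration.

```python
from math import factorial

def ballot_counts(n):
    """Return (number of strict rankings, number of approval ballots) for n candidates."""
    return factorial(n), 2 ** n

for n in range(2, 9):
    rankings, approvals = ballot_counts(n)
    print(f"n={n}: {rankings} rankings vs {approvals} approval ballots")
```

For two or three candidates the approval count is actually the larger of the two; the factorial overtakes it at four candidates and pulls away quickly after that.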

Unlike a system like RCV, where first choice support (and ranks in general) matter immensely, adding a new candidate in approval voting only requires voters to ask which bucket the new candidate goes into: approve or not approve. In a ranked system, you have to figure out where the new candidate fits into your existing ranking of all other candidates, whereas approval voting requires an objectively simpler binary decision.

The assumption is that if you are okay with $A$ winning, and we add $C$, that shouldn’t change whether or not you would be okay with $A$ winning. Now, granted, it may change how happy you are with $A$ winning, but if your goal is to elect a candidate that you would be satisfied with, then that granularity is completely unnecessary, and only complicates the process. Therefore, Approval voting almost entirely eliminates the spoiler effect, and satisfies a stronger version of IIA.

However, it is still possible that adding a new candidate could change whether or not you approve of an existing candidate, so we cannot say that approval voting completely eliminates the spoiler effect. But it still gets us close, and the fact that it isn’t swayed by irrelevant alternatives is a massive boon to its robustness.

In general, the more complicated a voting system is, the less robust it will likely be. As mentioned before, each voting system essentially defines its own metric for who should win. The more complicated the system, usually in pursuit of satisfying a more complicated metric, the less robust it will be to pathological cases. And the scenarios where it rates poorly on other metrics will be more glaring. In contrast, approval voting measures the simplest metric possible: who would voters be satisfied with winning? In that sense, it also arguably measures the most important metric, in my opinion.

As a side note: typically runoffs and elimination methods tend to be the worst offenders when it comes to failing to elect the Condorcet winner, because the elimination criterion can often knock out the Condorcet winner early on (essentially, it can be shortsighted). Is having the least first place enthusiasm really the most important disqualifier for a candidate? In my opinion, no. It fails to consider both overall satisfaction, and head-to-head matchups.

Take the example above: where $A$ is preferred over $B$ by every voter, but both are approved of by every voter. From one perspective, we have that $A$ should beat $B$, and it’s frustrating that they are tied. However, from another perspective, no matter how the tie is broken, the outcome is satisfactory to every voter! In particular, for multiwinner elections, such as for a council or committee, having both $A$ and $B$ win is actually a perfectly fine outcome. And the fact that they might split votes in an elimination style system such that one of them might be eliminated early on is actually quite a glaring issue.

For all you RCV fans out there, touting your supposed granularity, you have to grapple with the fact that you are not picking a strictly more granular system. You are picking a ballot type that captures a certain type of granularity. One must decide which is more important:

Capturing only relative positions of candidates
Capturing which candidates a voter would actually be satisfied with

It’s like the example we talked about at the start: ranking $A > B > C$ says absolutely nothing about whether the voter would be satisfied with $B$ winning. Yes, it’s better than $C$ winning, but is it good enough? You may object: but in the case that the voter only approves of $A$, then they can’t fill out their ballot in a way that “defends” against $C$ winning over $B$. In approval voting, the voter is treating $B$ and $C$ as equally undesirable outcomes!

However, this is precisely the point: the voter is unhappy either way. To see why approval voting is actually better at leading to optimal overall outcomes, we need a shift in perspective from too much focus on each voter’s ability to give a type of granularity of their personal preference, to the overall outcome of the election.

If enough voters are unhappy with both $B$ and $C$ winning, then neither is likely to win under an approval voting system. But if the subset of voters who only like $A$ is sufficiently small, then is the ability to guard against $C$ winning really worth the risk of ranked systems? Personally, I think not.

The classic example of RCV failing to elect the Condorcet winner is due to a phenomenon called “center squeeze”. In 2009, Burlington, Vermont held a mayoral election using RCV. After the first round, the results for the active ballots between the three candidates were as follows:

Kurt Wright (R): 3,294 (37.3%)
Bob Kiss (P): 2,981 (33.8%)
Andy Montroll (D): 2,554 (28.9%)

Montroll was eliminated first, and most of his votes transferred to Kiss, who won the final round against Wright with 51.5% of the vote. However, the head to head matchups were as follows:

Kiss vs Wright: 51.5% to 48.5% (The actual final round result where Kiss won)
Montroll vs Wright: 53.0% to 47.0% (Montroll wins by 6 percentage points)
Montroll vs Kiss: 53.9% to 46.1% (Montroll wins by 7.8 percentage points)

Thus, Montroll was the Condorcet winner, beating both other candidates head-to-head! However, being more in the middle of the Republican Wright and the more left-leaning Kiss, Montroll was eliminated first, and thus could not win. This was so upsetting to many Democratic and Republican voters that Burlington eventually repealed RCV and returned to FPTP because of this election!
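The center squeeze can be checked mechanically from the numbers quoted above. This is only a sketch: the dictionaries below simply restate the three-way counts and the head-to-head shares from this post, and the variable names are my own.

```python
# Three-way counts quoted above.
three_way = {"Wright": 3294, "Kiss": 2981, "Montroll": 2554}

# Head-to-head shares quoted above: (winner, loser) -> winner's percentage.
head_to_head = {
    ("Kiss", "Wright"): 51.5,
    ("Montroll", "Wright"): 53.0,
    ("Montroll", "Kiss"): 53.9,
}

# An elimination method removes whoever has the fewest votes in this round.
eliminated = min(three_way, key=three_way.get)

# A Condorcet winner beats every other candidate head-to-head.
beats = {}
for (winner, loser), share in head_to_head.items():
    if share > 50:
        beats.setdefault(winner, set()).add(loser)

candidates = set(three_way)
condorcet = sorted(c for c in candidates if beats.get(c, set()) == candidates - {c})

print("eliminated:", eliminated)
print("Condorcet winner(s):", condorcet)
```

The candidate dropped by the elimination rule and the candidate who wins every pairwise matchup turn out to be the same person.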

And, at that point, what good is RCV as a reform if it can lead to such undesirable outcomes that it causes a repeal back to plurality? Under approval voting, Montroll would have likely won handily, since he was the Condorcet winner, and thus likely had broad approval across the electorate.

In the first round of the Burlington election, before any candidates were eliminated, Montroll had only 23.0% of first choice support (28.9% once the field narrowed to three), but was still the Condorcet winner. If you truly value majority rule, then I think this shows, quite blatantly, that eliminating by first choice support is simply not the right approach. Because, sure, Montroll only had 23.0% of the first choice support, but a majority of voters preferred him over each other candidate head-to-head! If you cannot guarantee the election of a Condorcet winner, then you open yourself up to these kinds of upsetting outcomes which can erode public trust in the electoral system, leading us back to square one (or worse, because voters will be less receptive to future reforms).

A more recent (though also more debated) example happened in a 2022 special election in Alaska (see this paper by Adam Graham-Squire and David McCune for more details).

First-Choice Votes:

Begich: 53,810 (28.5%)
Palin: 58,973 (31.3%)
Peltola: 75,799 (40.2%)

But the head-to-head matchups were:

Begich vs. Peltola:
- Begich preferred: 87,859 voters (52.5%)
- Peltola preferred: 79,451 voters (47.5%)
- Begich wins by: 8,408 votes
Begich vs. Palin:
- Begich preferred: 101,217 voters (61.4%)
- Palin preferred: 63,618 voters (38.6%)
- Begich wins by: 37,599 votes
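The margins and percentages above follow directly from the raw counts. Here is a short sketch that recomputes them; the data is copied from the lists above, and the helper `summarize` is a hypothetical name of my own.

```python
first_choice = {"Begich": 53810, "Palin": 58973, "Peltola": 75799}

# (candidate, opponent) -> (votes preferring candidate, votes preferring opponent)
head_to_head = {
    ("Begich", "Peltola"): (87859, 79451),
    ("Begich", "Palin"): (101217, 63618),
}

def summarize(a_votes, b_votes):
    """Return (margin in votes, a's share in percent, rounded to one decimal)."""
    margin = a_votes - b_votes
    share = round(100 * a_votes / (a_votes + b_votes), 1)
    return margin, share

# Fewest first-choice votes is the first to go under an elimination rule.
eliminated_first = min(first_choice, key=first_choice.get)
results = {pair: summarize(*votes) for pair, votes in head_to_head.items()}

print("eliminated first:", eliminated_first)
for (a, b), (margin, share) in results.items():
    print(f"{a} vs. {b}: {a} ahead by {margin} votes ({share}%)")
```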

The numbers for the Peltola vs. Palin matchup are less clear, but it appears that Peltola almost surely won that head-to-head matchup (especially since she beat Palin in the final round), making Palin a Condorcet loser. Thus, Begich was the Condorcet winner, but was eliminated first due to having the least first choice support (again). This election became the rallying cry for RCV opponents, and in 2024, Alaska voted on whether to repeal RCV. The repeal failed by less than a percentage point (160,230 (49.88%) to 160,973 (50.12%), failing by just 743 votes).

This is why I do think that picking the most robust system has a real practical importance. If you go with “just good enough in practice”, then you risk these kinds of failures which can lead to backlash and repeal of reforms.

The image in some minds that a Condorcet winner is some bland centrist candidate is not always true. Steven Brams has claimed, rightfully, in my opinion, that Ronald Reagan was a Condorcet winner. In their seminal paper on approval voting, Brams and Fishburn also claim that Richard Nixon was a Condorcet winner in 1968. I might add Eisenhower to that list. And not just Republicans: I would argue that Franklin D. Roosevelt, Barack Obama, Harry Truman, and Lyndon Johnson were all Condorcet winners as well. Brams also argued that George McGovern and Barry Goldwater were extremist candidates who lost in landslides because their respective parties nominated them under plurality instead of more broadly acceptable alternatives. I wouldn’t say the outcome would have been much different with alternative candidates, but I doubt the losses would have been as lopsided.

In the above RCV scenarios, the Condorcet winners lost because they were behind by just a few percentage points at the moment of elimination (in both cases, less than 5 percentage points). Begich lost to Palin in the first round by about 3 percentage points, barely over 5 thousand votes. Montroll lost to Kiss in the second round by less than 500 votes. The idea that these Condorcet winners are so bland that they deserved to lose because they weren’t “exciting” enough is extremely suspect. In particular, if you value majority rule so much, then the only logically coherent conclusion is that the candidates who have majority support over every other candidate should be the natural choice!

Suppose we instead used approval voting in the above two elections. In both cases, I believe strongly that the Condorcet winners would have won handily. The progressive supporters of Kiss would have likely approved of Montroll, and the moderate Republicans and independents who supported Palin would have surely approved of Begich over Peltola. Thus, both Condorcet winners would have likely won with strong approval margins.

Now, what if I told you that approval voting actually guarantees that if there is a Condorcet winner, they will win the election? And, not just that, what if I told you approval voting is strategyproof and sincere (there is no strategic advantage to misrepresenting your preferences)? If that were true, then it would be a slam dunk. However, like the above generalization away from ranked systems, these properties come with some caveats and additional assumptions.

The primary assumption is that voters engage with the system in the intended way. That is, vote for all candidates they would honestly be satisfied with winning, and do not vote for candidates they would not be satisfied with winning. We call this “dichotomous” (two-tiered) preferences. Under this assumption, approval voting guarantees that

There is no incentive to vote insincerely (the system is sincere)
There is only one optimal strategy: the sincere strategy where you vote for all candidates you approve of and none that you don’t (the system is strategyproof)
If there are Condorcet winner(s), they will win the election if every voter votes sincerely.
There will, in fact, be at least one Condorcet winner among the set of winners.
Approval Voting is actually also (trivially) Pareto efficient! If every voter ranks $A>B$, then that literally means that every voter approves of $A$ and none approves of $B$, so $A$ must strictly beat $B$ (in fact, 100% to 0%). In effect, this is super Pareto efficiency.

Now, to be fair, these are some pretty big assumptions. I would not be so bold as to claim that Approval Voting is actually strictly Pareto efficient in practice (it is definitely weakly Pareto efficient, though). In reality, voters generally have multiple tiers of preference. For example, we could have

My favorite candidates
Candidates I am lukewarm about
Candidates I dislike

You can also split these tiers further into, for example, candidates you dislike but not hate, or candidates you are lukewarm about but would still be satisfied or dissatisfied with winning. The point is, real voters have more complex preferences than just approve/disapprove. And, thus, strategies can absolutely exist. But, the strategyproofness under dichotomous preferences still sort of leaks into the general method as a whole, and at least greatly simplifies the possible strategies under approval voting.

We must again decide: what is the purpose of the voting system? Is it to maximize individual voter utility, or is it to elect a candidate that the most voters would be satisfied with?

If the voter’s goal is to elect a satisfactory candidate, then they arguably do truly have dichotomous preferences, and these properties do hold. In particular, with this goal,

voters do truly have one and only one optimal vote
that vote is entirely sincere (the vote for all satisfactory candidates)
that vote will be strictly beneficial to their goal (they will never be inadvertently helping an unsatisfactory candidate beat any candidate they find satisfactory, which can happen in RCV)

This also means that for you, as a voter under approval voting, this strategyproofness and sincerity can apply strictly to you. Regardless of what everyone else does, if you choose to view your goal as electing a candidate you find satisfactory, then your optimal strategy is to vote for all candidates you find satisfactory, and none that you do not. Monotonicity means your voice is heard, and strictly helps your approved candidates and hurts your disapproved candidates. Unlike plurality, where a third party is basically a wasted vote, potentially helping your least favorite candidate win, or even RCV, where you might have failed to crunch the numbers and find the optimal, potentially insincere, ranking to get your favorite candidate elected, approval voting gives you a straightforward, sincere strategy that always strictly helps YOU!

Why Approval Satisfies So Many Properties

Let’s go one by one, and see why approval voting satisfies so many desirable properties. At least, under certain assumptions.

Monotonicity

A voting system is monotonic if voting more favorably for a candidate cannot hurt their chances of winning, and voting less favorably cannot help their chances of winning. Approval voting is clearly monotonic, since voting for a candidate gives them an additional vote, and not voting for them does not. There is no way that voting for a candidate can hurt their chances of winning, since it only adds to their total votes.

In fact, changing your ballot from not voting for a candidate to voting for that same candidate does strictly help their chances of winning (this means approval voting actually satisfies something stronger than monotonicity, since changing from a non-vote to a vote can only strictly help them). If they lost before, your vote can make them tie. If they tied for first before, your vote will make them win.
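Both cases can be seen in a tiny sketch. The electorate below is made up, and `winners` is a hypothetical helper that returns everyone tied for the most approvals.

```python
def winners(ballots, candidates):
    """Return the set of candidates with the most approvals."""
    totals = {c: sum(c in b for b in ballots) for c in candidates}
    top = max(totals.values())
    return {c for c in candidates if totals[c] == top}

candidates = ["A", "B"]

# Everyone else's ballots: A has 2 approvals, B has 3.
others = [{"A"}, {"A"}, {"B"}, {"B"}, {"B"}]

# Case 1: A was losing by one; my approval turns the loss into a tie.
without_me = winners(others + [set()], candidates)
with_me = winners(others + [{"A"}], candidates)

# Case 2: A was tied for first; my approval turns the tie into a win.
tied = others + [{"A"}]
tie_broken = winners(tied + [{"A"}], candidates)

print(without_me, with_me, tie_broken)
```

Adding an approval raises exactly one total by one and leaves every other total alone, which is why it can never push that candidate down.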

Brams and Fishburn prove that any optimal strategy (even if situational) involves approving of your most favorite candidates and never approving of your least favorite candidates. What happens in the middle depends on the situation, which we will get to later. This is, in fact, true even if we assume that voters have more than two levels of preference (ex. favorite, okay, dislike). Thus, monotonicity holds in all cases.

You can always vote for your favorites

It deserves extra emphasis that in approval voting, there is literally never a reason to not approve of your favorite candidates. This is in stark contrast to ranked systems, where ranking your favorite candidate first can sometimes make them lose (as we saw in a previous example, due to failing to pick a dishonest strategy which would potentially make them win). In approval voting, strategy is much more straightforward, and never involves candidates you love or hate.

In ranked systems, strategy is a full permutation optimization problem, where one has to decide the entire ordering of candidates. One must consider how well each candidate is doing in the polls, as being honest might give you a worse outcome! This added complexity leads to more opportunities for paradoxes and undesirable outcomes. It also makes it difficult for voters who are not familiar with the candidates to give a full ranking, increasing cognitive load, potentially leading to ballot exhaustion, and (again) a well intentioned vote leading to a worse outcome if they do not pick a strategic ordering.

In contrast, approval voting only requires a binary decision for each candidate: approve or not approve. You never have to consider not voting for your favorite candidates, and you never have to consider voting for your least favorite candidates. Strategy can simply be boiled down to: “I want to vote for these candidates I like, and I guess I can give a vote to a safer frontrunner I prefer over the other frontrunner, just in case my favorite candidate can’t win.” Such a voter is not throwing their vote away to a candidate that won’t win, and can simultaneously include a simple strategy with a safety vote. And they don’t have to break the law by submitting multiple fraudulent ballots!

Sincerity

First, what is “sincerity” in voting systems? Essentially, we say that a voting system is sincere for a given voter if there is no incentive for that voter to misrepresent their preferences. In the context of a ranked system, this might be swapping the order of two candidates in your ranking (which can tip the scales, as we have seen). In approval voting, this would be not approving of a candidate you like more than another candidate you are approving of. For example, if you rank $A > B > C > D$, then approving of $A$ and $C$ but not $B$ would be insincere.
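That definition is easy to turn into a check. In this sketch, a voter’s preferences are a mapping from candidate to tier number (lower is better); that encoding and the name `is_sincere` are assumptions of mine for illustration.

```python
def is_sincere(ballot, tier):
    """True if the ballot never skips a candidate the voter strictly prefers
    to some candidate it approves.

    `tier` maps candidate -> preference level, where lower means more preferred.
    """
    if not ballot:
        return True
    worst_approved = max(tier[c] for c in ballot)
    return all(c in ballot for c in tier if tier[c] < worst_approved)

strict = {"A": 0, "B": 1, "C": 2, "D": 3}      # A > B > C > D
with_tie = {"A": 0, "B": 1, "C": 1, "D": 2}    # A > B = C > D

print(is_sincere({"A", "C"}, strict))     # skips B, who is preferred to C
print(is_sincere({"A", "B"}, strict))
print(is_sincere({"A", "C"}, with_tie))   # B and C are tied, so nothing is skipped
```

Note that the same ballot can be sincere for one voter and insincere for another; sincerity is a property of the ballot relative to the preferences behind it.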

The key to showing that approval voting is sincere in at least some cases requires us to investigate the strategy space of a voter. That is, what are the possible (or, more importantly, optimal) strategies a voter can take?

Well, by monotonicity, we know that, at least, an optimal strategy must involve approving of all favorite candidates, and not approving of all least favorite candidates. The question is: what about the middle candidates?

Strategyproofness

In reality, insincerity can be optimal. Suppose your preferences are $A > B > C > D$, that $A$ and $B$ are tied in polling, and so are $C$ and $D$. If $A$ and $B$ end up tied for first, then approving of both leaves them tied, which is not optimal if you strictly prefer $A$. And if both $A$ and $B$ lose by more than one vote, voting for $B$ makes no difference. Thus, not approving $B$ is actually an optimal strategy over approving both $A$ and $B$ in this scenario. Whereas, if $C$ and $D$ are tied for first, approving of $C$ would at least elect the lesser of two evils. Putting these together, the ballot that approves $A$ and $C$ but skips $B$ can be optimal. This is a very contrived example, but it shows that there are cases where insincere voting can be optimal even in approval voting.

One thing to note, though, is that this strategy cannot be a dominating strategy, because there are other scenarios where approving of $B$ would be strictly better than not approving of them. As mentioned previously, Brams and Fishburn show that the only commonality of all optimal strategies is that you approve of your most favorite candidates and never approve of your least favorite candidates. The middle candidates depend on the situation.

It turns out that this is actually because we used four preference levels, however. If we only had two or three levels of preference, then sincere voting is always optimal in approval voting. You would never vote for your least favorites, and you would always vote for your favorites. The only strategy, in the case of three levels, would be to situationally vote for some of your middle-tier candidates under certain circumstances.

That is, if we actually had $A>B=C>D$ or $A=B>C>D$, then the only possible optimal strategies would be $\{(A), (A,B), (A,C), (A,B,C)\}$ or $\{(A,B), (A,B,C)\}$ respectively. Notice that in both cases, we never skip any candidate we like strictly more than another candidate we approve of. Thus, no optimal strategy involves insincere voting under these preference structures.

In particular, if we assume that voters have only two levels of preference (approve/disapprove), then sincere voting is the only optimal strategy. This is because this would actually partition the candidates into your absolute favorites and absolute least favorites, for which we have already established that the only optimal strategy is to approve of all favorites and none of the least favorites. Further, excluding any candidate you approve of can only hurt their chances of winning against candidates you disapprove of. Thus, there is no incentive to vote insincerely (if the ultimate goal is an outcome one would be satisfied with, rather than securing one satisfactory outcome over another).

So, we have this very strange Schrödinger’s preference space happening here. On one hand, in reality, voters have complex preferences, and it can be optimal to vote insincerely, even in approval voting. On the other hand, there is only one optimal strategy if we assume that voters have only two levels of preference (would be satisfied with/would be dissatisfied with): sincerely voting only for all those they approve of.

What we see is that it ultimately comes down to the perspective and goal of the voter. If the voter is merely trying to achieve an acceptable outcome, then approving honestly is the only optimal strategy. However, if the voter is trying to maximize their individual utility by angling for a favorite candidate over a less preferred (but still acceptable) candidate, then they expand the strategy space, at the cost of potentially contributing to an outcome they would be dissatisfied with.

We come back to the 2000 US presidential election example. Bush won the entire election by 537 votes in Florida. If just 538 of the Gore-preferring voters among Florida’s 97,488 Nader voters had voted for Gore instead, Gore would have won Florida and the presidency. It’s likely most of the Nader voters preferred Gore over Bush, but sincerity cost them dearly. Under approval voting, they would not have had to make that choice. They could have voted for Nader honestly, and optionally voted for Gore as well, strategically or sincerely, depending on their actual opinion of Gore. Even if they did not approve of Gore, the strategic move to also cast an approval vote for him would be more palatable when they can still express their genuine approval of Nader.

Approval Condorcet Winner Theorem

Finally, we arrive at the pièce de résistance: what I call the Approval Condorcet Winner Theorem. This requires that we assume that voters have just two levels of preference: approve/disapprove. Under this assumption, we can show that if there are Condorcet winner(s), Approval Voting will always elect one of them.

In one sense, this is actually pretty obvious. If we have a winner $W$ (more generally a set of winners, who are tied), and denote $n(X)$ as the number of approvals candidate $X$ received, then we obviously need $n(W) \geq n(C)$ for every other candidate, and $n(W)>n(C)$ for every $C$ who didn’t tie for first. That is, the winner(s) must have strictly more voters approving of them than any losing candidate $C$. Otherwise, $C$ would have won instead of $W$, or at least tied for first!

But, it gets more interesting. Because this doesn’t immediately scream “Condorcet winner!” To see why $W$ is necessarily a Condorcet winner, we can observe the following, if we define $n(A\setminus B)$ as the number of voters who approve of $A$ but not $B$:

\[n(W) - n(C) = n(W\setminus C) - n(C\setminus W) > 0 \iff n(W\setminus C) > n(C\setminus W)\]

The key is that $W=(W\setminus C)\sqcup (W\cap C)$ (a partition) and similarly for $C$, where we abuse notation slightly and write $W$ for the set of voters who approve of candidate $W$. That is, every $W$-approver is precisely either one who strictly approves of $W$ but not $C$, or one who approves of both. We can also see that

\[n(W) = n(W\setminus C) + n(W\cap C)\]

\[n(W)-n(C) = n(W\setminus C) + n(W\cap C) - (n(C\setminus W) + n(W\cap C))\]

which is $= n(W\setminus C) - n(C\setminus W)$.

Thus, the difference in cardinality of “strict” approvals (approving of one but not the other) is exactly equal to the difference in total approvals.
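If you’d rather see the identity than trust the algebra, here is a sketch that checks it on randomly generated ballots. The electorate is synthetic (each voter approves each candidate with probability one half), and `identity_holds` is a hypothetical helper of mine.

```python
import random

def identity_holds(ballots, candidates):
    """Check n(W) - n(C) == n(W \\ C) - n(C \\ W) for every ordered pair."""
    for w in candidates:
        for c in candidates:
            if w == c:
                continue
            n_w = sum(w in b for b in ballots)
            n_c = sum(c in b for b in ballots)
            only_w = sum(w in b and c not in b for b in ballots)
            only_c = sum(c in b and w not in b for b in ballots)
            if n_w - n_c != only_w - only_c:
                return False
    return True

rng = random.Random(0)  # fixed seed so the run is repeatable
candidates = ["A", "B", "C", "D"]
ballots = [{c for c in candidates if rng.random() < 0.5} for _ in range(1000)]

print(identity_holds(ballots, candidates))
```

The check passes for any collection of ballots, since voters who approve of both candidates (or neither) contribute equally to both totals and cancel out.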

We can then say that if $W$ won an approval vote, then for any other candidate $C$ who did not win, the number of voters who approve of $W$ but not $C$ must be larger than the number of voters who approve of $C$ but not $W$. In other words, $W$ must beat $C$ in a head-to-head matchup of strict approvals. Therefore, $W$ is a Condorcet winner against all other candidates who did not win!

And what of all the candidates who tied for first? By this same logic, the difference of approvals is zero, so the difference in strict approvals must also be zero. So all winners tie against each other in head-to-head matchups of strict approvals. Therefore, all winners under an approval vote are also weak Condorcet winners against all other winners.

That is, we have that all winners in an approval vote are strong Condorcet winners against all losers, as all losers lose to all winners in head-to-head matchups of strict preferences.

In summary, we can say that the winners of an approval vote are exactly the set of candidates who are Condorcet winners with respect to the approve/disapprove partition induced by the voters’ ballots. That is, no losing candidate has a claim that the electorate would be more satisfied or represented by them.
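The summary can be tested directly: compute the approval winners by counting, compute the candidates who never lose a strict-approval matchup, and compare. This is a sketch on synthetic ballots; `top_scorers` and `acw_set` are names I made up.

```python
import random

def top_scorers(ballots, candidates):
    """Candidates with the most total approvals."""
    totals = {c: sum(c in b for b in ballots) for c in candidates}
    best = max(totals.values())
    return {c for c in candidates if totals[c] == best}

def acw_set(ballots, candidates):
    """Candidates who win or tie every head-to-head matchup of strict approvals."""
    def only(x, y):
        return sum(x in b and y not in b for b in ballots)
    return {
        x for x in candidates
        if all(only(x, y) >= only(y, x) for y in candidates if y != x)
    }

rng = random.Random(1)
candidates = ["A", "B", "C", "D"]
# 200 small random elections of 25 voters each.
elections = [
    [{c for c in candidates if rng.random() < 0.4} for _ in range(25)]
    for _ in range(200)
]
all_match = all(top_scorers(b, candidates) == acw_set(b, candidates) for b in elections)
print(all_match)
```

The two sets agree in every election, ties included, which is what the identity between total and strict approval margins predicts.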

I do want to take a moment to highlight that this theorem does not say that approval voting will always elect the regular Condorcet winner (RCW). Rather, it elects the Condorcet winner with respect to the approve/disapprove partition induced by the voters’ ballots, which I call the approval Condorcet winner (ACW).

This once again highlights a change in perspective:

While a regular Condorcet winner (RCW) is strictly preferred to all other candidates in head to head matchups, this requires focusing entirely on relative preferences that say nothing about any level of satisfaction. This desire to increase a type of granularity also admits cycles such that no candidate may even be a Condorcet winner.
By simplifying the question to be in terms of only considering how many voters would be satisfied versus unsatisfied, and removing those indifferent (with respect to this metric) to the outcome, we do guarantee that such winners must exist, and that approval voting must select them without fail.

I would also like to finally highlight that this theorem is less that “approval happens to elect the ACW”, and more that the nature of an approval vote necessarily projects all voters’ preferences into dichotomous preferences, and the approval vote is exactly measuring head-to-head matchups of strict approvals. Counting up all the checkmarks on the ballots is literally a pseudo-calculation of all head-to-head matchups of strict approvals between every pair of candidates. To know if candidate $A$ beats candidate $B$ in strict approvals, just check who got more total approval votes! If $A$ got like 10 more approval votes than $B$, then we know that $A$ must have beaten $B$ in strict approvals by exactly 10 votes.

Note that this also gives a trivial proof that an ACW always exists, since approval voting always produces at least one winner. Further, it shows that there are never any cycles in the strict approval head-to-head matchups: If $A$ beats $B$ in strict approvals, and $B$ beats $C$ in strict approvals, then $A$ must have more total approvals than $B$, and $B$ must have more total approvals than $C$. Thus, $A$ must have more total approvals than $C$, meaning that $A$ beats $C$ in strict approvals. Therefore, the strict approval head-to-head matchups are always acyclic. Or, put another way, approval voting and dichotomous preferences induce a transitivity to the social preference ordering of candidates.

Notice that we haven’t said anything about strategy yet. This is because, by nature of the approval voting system, every voter necessarily projects their potentially complex, multi-tiered preferences into a two-tiered approve/disapprove system. Thus, if we take all voters at their word (or, rather, their vote), and assume they voted sincerely, then we can say that the winner(s) of the election are necessarily Condorcet winner(s).

Thus, the catch comes from the fact that voters may have voted strategically/insincerely. That is, any flaws in the result of the approval vote were not from the system itself, but the psychology of the voters. This is in stark contrast to ranked systems, where flaws can arise from the system itself interacting with sincere voters in particular cases (ex. center squeeze in RCV eliminating Condorcet winners in the first round).

I’d like to go on a brief tangent with a specific example to explain why I actually believe that an ACW is actually a much more preferable outcome than a regular Condorcet winner. And not just because this goal is one that can be achieved without fail, mathematically.

An Argument Against Condorcet Winners

I know I pumped up Condorcet winners a lot in previous sections, but I do want to bring up a very contrived example where electing the Condorcet winner may not be the best outcome.

Suppose we have three candidates, $A,B,C$, and the election miraculously only has two different ballot types:

$N$ voters prefer $B > C > A$. In particular, they hate $A$.
$N + 1$ voters prefer $A > B > C$. And, let us say that some number of these voters are at least moderately okay with $B$ winning.

Now, $A$ has a majority of the vote, which gets arbitrarily narrow as $N \to \infty$. And, thus, $A$ is trivially the Condorcet winner, since they would beat both $B$ and $C$ in head-to-head matchups $N + 1$ to $N$. Now, if we put all our stock into electing the Condorcet winner, or just a majority winner, then $A$ wins, and arbitrarily close to half the electorate is extremely dissatisfied with the outcome.

Consider the clear runner-up, $B$. In this scenario, no voter hates $B$. And, as long as at least two of the $N + 1$ voters who prefer $A$ are okay with $B$ winning, then $N+2$ or more voters would be satisfied with $B$ winning (also a majority). If we say that all $A$ voters are okay with $B$ winning, then 100% of the electorate is satisfied with $B$ winning. This makes $B$ the approval Condorcet winner! If we consider, rather than the strict rankings of candidates, the approve/disapprove partition induced by the ballots, then we have the $A$ versus $B$ matchup as follows:

$N$ voters approve of $B$ but not $A$
$N + 1$ voters approve of $A$ and some subset also approve of $B$

If no $A$ voters approve of $B$, then we have $N$ strict approvals of $B$ and not $A$, and $N + 1$ strict approvals of $A$ and not $B$. Thus, $A$ would be the approval Condorcet winner as well. Notice this is actually equivalent to a plurality vote between $A$ and $B$, since everyone is bullet voting for their favorite candidate.

If only one of those $A$ voters approves of $B$, then we have $N$ strict approvals of $B$ and not $A$, and $N$ strict approvals of $A$ and not $B$ (with one voter indifferent). Thus, we have a tie in strict approvals, meaning that both candidates are approval Condorcet winners.

If exactly two of those $A$ voters approve of $B$ then we have $N$ strict approvals of $B$ and not $A$, $N-1$ strict approvals of $A$ and not $B$, and two voters who approve of both. The comparison of strict approvals would be $N$ for $B$ and $N-1$ for $A$, meaning that $B$ is the approval Condorcet winner. Notice that the difference in strict approvals ($N - (N-1) = 1$) matches the difference in total approvals ($(N + 2) - (N + 1) = 1$).

If we extend this to say that all $A$ voters approve of $B$, then we have $N$ strict approvals of $B$ and not $A$, and zero strict approvals of $A$ and not $B$ (exactly $N+1$ indifferent approvals). Thus, we get $N$ strict approvals for $B$ and zero for $A$, still making $B$ the approval Condorcet winner. And, we can confirm that the difference in strict approvals ($N-0=N$) matches the difference in total approvals ($(2N+1)- (N+1) = N$).
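The four cases above can be checked in one go. This is a rough sketch under the same assumptions as the example: the $N$ voters who put $B$ first approve of $B$ only, every $A$-first voter approves of $A$, and exactly `k` of them also approve of $B$. The helper name `approval_matchup` and the choice `n = 10` are mine, purely for illustration.

```python
def approval_matchup(n, k):
    """A-vs-B approval tallies when k of the n + 1 A-first voters also approve of B."""
    if not 0 <= k <= n + 1:
        raise ValueError("k must be between 0 and n + 1")
    strict_a = n + 1 - k  # approve of A but not B
    strict_b = n          # approve of B but not A
    total_a = n + 1       # strict A approvals plus the k voters who approve of both
    total_b = n + k       # strict B approvals plus the k voters who approve of both
    if total_a == total_b:
        winner = "tie"
    else:
        winner = "A" if total_a > total_b else "B"
    return strict_a, strict_b, total_a, total_b, winner


n = 10  # illustrative value (an assumption)
for k in (0, 1, 2, n + 1):
    strict_a, strict_b, total_a, total_b, winner = approval_matchup(n, k)
    # The strict-approval margin always equals the total-approval margin.
    assert strict_b - strict_a == total_b - total_a
    print(f"k={k}: strict {strict_a}-{strict_b}, total {total_a}-{total_b}, winner {winner}")
```

With `n = 10` the winner goes from $A$ (`k = 0`) to a tie (`k = 1`) to $B$ (`k = 2` and beyond), and the two margins agree in every case because the voters who approve of both candidates cancel out.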

However, in all these scenarios, $A$ remains the regular Condorcet winner (RCW). The only difference is just how much more widely $B$ is approved of. For those touting majority rule, let’s take the absolute worst case for $A$, where every single voter approves of $B$. We then have a choice between:

A candidate with the narrowest possible majority, where arbitrarily close to half the electorate is extremely dissatisfied
A candidate with potentially unanimous satisfaction, who lost the election because exactly one more voter preferred $A$ over $B$ rather than putting $B$ first.

In my opinion, the second outcome is clearly preferable. The argument that $A$ has more enthusiasm is quite weak. I talked to someone about this scenario, and they claimed that $A$ was the rightful winner, by pure merit of having majority rule.

Suppose we have $N=50,000,000$, so there are $100,000,001$ voters in total. Now suppose that two of those first-place $A$ votes (about $2\times 10^{-6}$% of all votes) were misplaced. Perhaps they fell under a cabinet, or were lost in the mail, or the voters faced an emergency on election day and weren’t able to vote. Now, $B$ wins by one vote! And these same people would claim that $B$ is now the rightful winner, despite the fact that the overall satisfaction of the electorate has not changed at all!

The way I see this scenario is that when we value majority rule above all else, we can get strong chaos in close elections with respect to overall satisfaction. In comparison, if the entire electorate is satisfied with $B$, while nearly half is extremely dissatisfied with $A$, then $B$ appears to be clearly the better representative of the electorate. Thus, a method which measures overall satisfaction (perhaps you see where I am going with this) would not have such chaos in this misplaced ballots scenario: the result would be unanimous for $B$.
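Here is a small sketch of the misplaced-ballots scenario, assuming the worst case for $A$ described earlier (every voter approves of $B$, and only the $A$-first voters approve of $A$). The function names are hypothetical helpers of my own.

```python
def majority_winner(a_first, b_first):
    """Winner when only first choices count (A versus B head-to-head)."""
    if a_first == b_first:
        return "tie"
    return "A" if a_first > b_first else "B"


def approval_totals(a_first, b_first):
    """Approval totals, assuming every voter approves of B and only A-first voters approve of A."""
    return {"A": a_first, "B": a_first + b_first}


N = 50_000_000
results = {}
for lost in (0, 2):  # number of A-first ballots that go missing
    a_first, b_first = N + 1 - lost, N
    totals = approval_totals(a_first, b_first)
    approval_winner = max(totals, key=totals.get)
    b_share = totals["B"] / (a_first + b_first)
    results[lost] = (majority_winner(a_first, b_first), approval_winner, b_share)
    print(lost, *results[lost])
# 0 A B 1.0
# 2 B B 1.0
```

Losing two ballots flips the majority winner from $A$ to $B$, while the approval winner and $B$'s approval share do not move at all.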

Compare this to a close approval election, where the chaos does not touch the overall satisfaction of the electorate at all. If the two frontrunners each have about 80% approval, then no matter who wins, about 80% of the electorate will be satisfied. The only possible difference might be the amount of overlap between the two candidates’ approval bases.

For example, if one candidate $A$ gets 80 approvals versus $B$’s 81, then it’s possible we may have

100% overlap. The single difference is from a single non-indifferent voter who approves of $B$ but not $A$ (ex. 0 strict $A$ approvals to $1$ strict $B$ approval).
0% overlap. In this case, there are 80 voters who strictly approve of $A$ and 81 distinct other voters who strictly approve of $B$.

In either case, the difference in voters who feel unrepresented by $A$ versus $B$ is always exactly 1. The proportion of overall satisfaction is entirely stable.
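The two overlap extremes can be sketched with plain sets. The electorate size of 200 and the voter numbering are assumptions of mine, chosen only so that the zero-overlap case fits; the 80 and 81 approval counts come from the example above.

```python
def unrepresented(electorate, approvers):
    """Number of voters who do not approve of the given candidate."""
    return len(electorate - approvers)


electorate = set(range(200))  # hypothetical electorate of 200 voters
a_approvers = set(range(80))  # 80 voters approve of A

scenarios = {
    "100% overlap": set(range(81)),     # all of A's approvers, plus one more voter
    "0% overlap": set(range(80, 161)),  # 81 entirely different voters
}

gaps = {}
for name, b_approvers in scenarios.items():
    shared = len(a_approvers & b_approvers)
    gap = unrepresented(electorate, a_approvers) - unrepresented(electorate, b_approvers)
    gaps[name] = gap
    print(f"{name}: {shared} shared approvers, gap in unrepresented voters = {gap}")
# 100% overlap: 80 shared approvers, gap in unrepresented voters = 1
# 0% overlap: 0 shared approvers, gap in unrepresented voters = 1
```

The gap depends only on the approval totals (81 minus 80), never on who the approvers are.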

The above scenario ultimately poses a philosophical question. While I clearly prefer the second outcome, I can’t say that those who prefer the first outcome are objectively wrong. Ultimately, in the age of polarization, I think that finding candidates who can unite the electorate is far more important than finding candidates who can eke out a narrow majority due to fervent enthusiasm from a slight majority (or just a slight plurality if using FPTP, for example).

To be fair, in the case where only two of the $A$ voters are okay with $B$ winning, $B$ would again win by one vote ($N+2$ to $N+1$), still leaving us with the same issue: nearly half the electorate (the $N-1$ $A$ voters who do not approve of $B$) is dissatisfied with the outcome, and a majority would still have preferred $A$. This is arguably as contrived as the near perfect tie! But, I would argue that in this case, one more voter feels represented than in the case where $A$ wins. However, you get a different issue with the legitimacy of a winner who does not have majority first choice support.

Now, if you are on board with the approval Condorcet winner being a good outcome to aim for, then the issue becomes strategy (you can read more about strategy in approval voting in this post). If every voter strategically bullet votes for their favorite candidate, then we are just back to plurality voting, and we’ve lost all the progress we’ve made.

The question becomes: is it possible to educate voters to vote sincerely in approval voting? Can we convince them that viewing the goal of the election as electing a candidate they would be satisfied with, rather than maximizing their individual utility, will lead to better outcomes for everyone? Without more real world data, it’s hard to say. However, I and many other academics believe that approval voting is the best system available for single-winner elections.

Summary

To summarize, we saw that

Plurality is really bad
RCV is strictly better than plurality, but is still flawed. It’s not monotonic, can eliminate Condorcet winners, and insincere voting can be optimal.
Arrow’s theorem shows that no ranked system can satisfy desirable properties like IIA and Pareto efficiency simultaneously, without being dictatorial.
Gibbard-Satterthwaite shows that no minimally responsive ranked system can be strategyproof without being dictatorial. There can always be a case where dishonesty is optimal.
We sidestepped these impossibility theorems by using approval voting, which is not a ranked system.
Approval voting is monotonic, satisfies IIA, and is weakly Pareto efficient. Further, under dichotomous preferences, it is sincere, strongly Pareto efficient, and strategyproof. It also almost entirely eliminates the spoiler effect.
Approval voting guarantees that all Condorcet winners (with respect to the approve/disapprove partition) will win the election. In fact, $A$ will get more approval votes than $B$ if and only if $A$ beats $B$ in a head-to-head matchup of strict approvals (and will do so by the exact same margin).
Approval votes are highly stable with respect to overall electorate satisfaction, even in very close elections. If two candidates have similar total approval percentages, then the difference in overall satisfaction between them is bounded by the difference in approval votes. That is, if two candidates have about 60% approval, for example, then no matter who of them wins, about 60% of the electorate will be satisfied. That proportion of satisfaction is almost entirely stable, regardless of how the tie shakes out.

The math is clear. Approval voting is superior to both plurality and RCV in nearly every way, except for having less real world data.

The Political Reality

The post is pretty much over. You can leave now. Again,

Vote for any alternative voting system over plurality
Consider approval voting seriously

The rest of this post is just some waxing poetic about the political reality of electoral reform.

I admit, I have bashed RCV a bit in this post. As I said before, RCV is strictly better than our current plurality system. If you find yourself lucky enough to vote on adoption of RCV, I absolutely encourage you to do so. However, I believe that approval voting is strictly better than RCV, mathematically speaking. It satisfies more desirable properties, and leads to better outcomes in more cases. We saw that in multiple major examples (Burlington, 2009 and Alaska, 2022) RCV’s pathologies DO occur, and they led to major efforts to repeal RCV in those areas. This greatly worries me about the long term viability of RCV as the preferred alternative voting system.

However, the reality is that RCV has gained more political traction than approval voting. Although approval voting adoption would be logistically simpler (since it only requires changing how ballots are interpreted, rather than changing the entire ballot format), RCV has been more heavily marketed and pushed by various advocacy groups.

The quest for an alternate voting system is plagued by multiple opponents. Primarily, the two party system, which benefits immensely from our current plurality system. The two evils love that the spoiler effect exists, because it allows them to pressure voters into voting for the “lesser evil” of the two major party candidates, rather than risk “wasting” their vote on a third party candidate.

However, the push for different voting systems is also plagued by infighting between advocacy groups. Each group believes their preferred system is the best, and thus they often sandbag each other rather than working together to push for any improvement over the current system.

We all agree plurality is terrible. But we also disagree on the best alternative. I believe I have made a strong mathematical case for approval voting. However, there are strong cases to be made for RCV, in opposition to my criticisms. It again comes down to what properties, and what metrics, one values most in a voting system.

Without enough real world data on approval voting, compared to the extensive data on RCV, one can rightly claim that approval voting is untested in practice. Mathematically elegant, sure, but unproven. Though RCV has its pathologies, and they have been documented, they are relatively rare in practice. Should that be good enough for us to accept RCV as the best alternative? I, personally, do not think so. Particularly since we have already had multiple high profile failures of RCV in practice, leading to repeal efforts (one successful, one almost successful). Approval voting has not had that chance to fail in practice yet.

The idea of maximizing electorate satisfaction, rather than individual voter utility, is not exactly the most enticing prospect for many voters, either. The flash of being able to rank candidates, though fundamentally mathematically flawed, is quite appealing. Arrow’s theorem itself, based on a property that most systems don’t even satisfy (IIA), is, in a sense, a bit underhanded, and hard to explain to the average voter.

Proponents of approval voting seem to struggle with marketing and enthusiasm. Not to mention, this idea of “actually, the system that actually has some momentum is not the best one” is admittedly a hard sell, which I am quite, unabashedly, guilty of making in this post.

Frankly, I don’t quite know how to end this post. The obvious solution is to get approval voting adopted in more places, to get the real world data we need to compare it to RCV. Approval voting has only really been adopted in two cities, Fargo, ND and St. Louis, MO, as well as some academic societies (where it has worked well, but elections in academic societies are very different from higher stakes elections for public office). Just this year, North Dakota banned approval voting and ranked choice voting altogether, unfortunately. Though flimsy reasons were given, it is quite clear that the two party system is trying to suppress any attempts at electoral reform, to preserve their duopoly.

Our current two party overlords love plurality voting, and they love that our movements for electoral reform are fractured. If you, reading this, have an opportunity to get approval voting OR RCV adopted in your area, I urge you to do so.

Forgive me for getting up on my soapbox, but the United States is nearly as polarized as it was before the Civil War. This month, one poll showed that while over 85% of Republicans approve of Donald Trump, 95% of Democrats disapprove of him. Among independents, 30% approve and 69% disapprove. This is an astounding level of polarization.

Approval voting, by maximizing overall electorate satisfaction, rather than individual voter utility, and electing Condorcet winners, is a step towards healing that polarization. In the past, a substantial portion of the electorate tended to be satisfied with either the Democratic or Republican nominee. In Brams and Fishburn’s paper, they document that a survey from 1968 found that both Nixon and Humphrey were approved of by over 60% of voters. This is quite apparently no longer the case when it comes to our current major party nominees. Donald Trump has failed to get a majority of votes in all three of his runs for president, and yet won twice (in fact, he won a plurality over his Democratic opponent for the first time only in 2024).

Parties are moving further and further apart, and voters are becoming more unhappy when the other side wins. And why does it happen? Because the two parties can get away with it. Electoral reform has itself become a political pawn. Though it’s typically Democrats pushing for RCV, they get a little uncomfortable when it’s suggested for a Blue state, rather than a Red state. The deck is stacked against us.

This doesn’t have to occur overnight, but it can start with major party primaries adopting approval voting, to select more broadly acceptable nominees. The reality is that most primary voters do not pick candidates that would be broadly acceptable to the general electorate. It’s great for the primary voters until their candidate loses in the general election.

We can quibble over RCV vs approval voting all day. It’s what they want us to do. The important thing is that we push for electoral reform, and break the stranglehold of the two party system. As long as we are stuck with single winner elections, over proportional representation, both RCV and approval voting are strictly better than plurality.

In conclusion, if you can do something to push for electoral reform in your area, do it. Whether it’s RCV, approval voting, or proportional representation, anything is better than what we have now. But please consider approval voting because we need that data. Thanks.

If you find yourself inspired to advocate for approval voting, I recommend checking out The Center for Election Science, which advocates for approval voting and has resources to help you get started.
