Arrow's Impossibility Theorem

Walkthrough and proof of the theorem that proves ranked voting systems cannot satisfy seemingly reasonable properties, unless you have infinitely many voters!

Introduction

I recently made a post on The Gibbard-Satterthwaite Theorem, which shows that ranked voting systems cannot be strategyproof. The Gibbard-Satterthwaite Theorem is the cousin of the more famous Arrow’s Impossibility Theorem, which shows that ranked voting systems cannot satisfy a set of seemingly reasonable properties. In this post, I will give a walkthrough and proof of Arrow’s Theorem based on a very concise 1970 paper by the legendary Peter C Fishburn.

I highly recommend this video by Veritasium, which gives a great introduction to the theorem and its implications. However, the clickbait title of “Why Democracy Is Mathematically Impossible” is a bit misleading, though I respect the grind. I would like to be upfront about what I believe can and should be said about Arrow’s Theorem.

Arrow himself has said:

“Most systems are not going to work badly all of the time. All I proved is that all can work badly at times.” (Kenneth Arrow)

Arrow’s theorem is a beautiful result, but it is not a death sentence for democracy, or even, more specifically, ranked voting systems. I might even call it a magic trick! Like the Gibbard-Satterthwaite Theorem, the underlying idea is as follows:

A magician pulls a dictator out of a hat

The proof I am going to present does it a bit differently, however.

For Gibbard-Satterthwaite, the troublesome property was strategyproofness. For Arrow’s Theorem, the killer property is the “independence of irrelevant alternatives” (IIA). Rather than start with a heavy definition, I’d like to explain the loose intuitive idea with a famous joke:

Morgenbesser, ordering dessert, is told by a waitress that he can choose between blueberry or apple pie. He orders apple. Soon the waitress comes back and explains cherry pie is also an option. Morgenbesser replies “In that case, I’ll have blueberry.”

At the risk of ruining the joke by explaining it, the humor comes from the fact that Morgenbesser clearly prefers apple to blueberry. Therefore, it would make little sense for him to ever pick blueberry if apple is on the menu. How could the presence of cherry pie cause him to change his mind about apple vs. blueberry?

I recently published a post all about IIA, so I won’t go into too much detail here. I recommend reading that for a deeper understanding, or perhaps a primer for this post, but the key point is that Morgenbesser’s behavior violates the independence of irrelevant alternatives. While easy to maintain a coherent preference at the individual level, aggregation of coherent preference orders often leads to irrationality.

Definition: (Independence of Irrelevant Alternatives) If every voter maintains the same relative ranking of $x$ and $y$ when moving between two profiles, then society should maintain the same relative ranking of $x$ and $y$ as well, even if we move around any or all other candidates.

In short, just because I change my mind about how I feel about Apple and Cherry, if I maintain the same relative preference of Apple to Blueberry, then the societal preference of Apple to Blueberry should not change.

In my post on IIA, I give detailed examples for how systems like Ranked Choice Voting and even pristine Condorcet methods can violate IIA (and how that manifested as a controversial result in the 2009 mayoral election in Burlington and the 2022 Alaska House special election). The general pattern, for RCV specifically, is that the presence of a spoiler candidate, who lost in the final round, can cause a consensus choice to be eliminated early on, despite being able to defeat any other candidate in a head-to-head matchup.

What Arrow’s theorem tells us is not that these systems are just “not good enough”–or that we just need to tweak the math a bit to be better–but that ranked voting inherently cannot maintain coherence when aggregating preference orders of three or more candidates while actually being responsive to the voters.

IIA, the articulation of this coherence, is an extremely restrictive property. Essentially no “reasonable” ranked voting system can satisfy IIA. The typical proof of Arrow’s Theorem shows that if you have a “responsive” voting system that satisfies IIA, then it makes the system so rigid that it can only be responsive to one voter, and that voter is hence a dictator.

Fishburn’s proof actually goes a different route, and shows that if you have a “responsive”, “stable” (IIA) voting system that isn’t a dictatorship, then you can instead pull infinitely many voters out of a hat. Which is too mathematically cute for me not to write a blog post about!

Preliminaries

Arrow’s theorem concerns a voting system attempting to satisfy the following properties:

  1. The electorate is comprised of a finite number of voters
  2. Three or more candidates are allowed
  3. Unrestricted domain: Allowing any possible input ranking of candidates (with ties allowed) from all voters as input, and outputting a societal ordering of candidates (with ties allowed). No ballot is “illegal”.
  4. Unanimity: If every voter ranks $a>b$, then society ranks $a>b$.
  5. IIA: If every voter keeps the same relative order of $a$ and $b$ when moving between two profiles, then society must keep the same relative order of $a$ and $b$ as well.
  6. Non-dictatorship: No single voter’s ballot decides the whole election. This would be a “dictator”. More specifically, we don’t have a single voter such that if they rank $a>b$, then society necessarily ranks $a>b$.

Theorem: (Arrow’s Theorem) No ranked voting system can satisfy all six of the above properties simultaneously.

Unlike the usual proof, which shows that the first five conditions force a dictator, we are going to walk through Fishburn’s proof of the equivalent statement: if a system satisfies conditions 2 through 6, then it cannot have only finitely many voters. If you can show that some combination of five properties implies that the sixth property must be false, that’s equivalent to showing that all six properties cannot be satisfied simultaneously.

So let $V$ be the set of voters, let there be at least three candidates, and let $\sigma$ be a “voting system” that takes every possible ranking with ties from the voters and outputs a societal ranking with ties. We will just call these outputs and inputs “rankings”; the only structural fact we need is transitivity.

It may seem strange for a voting system to output a ranking (a “social welfare function”) instead of a candidate (a “social choice function”). But we can associate such a social welfare function with a social choice function by just picking the top-ranked candidate(s) as the winner(s).

For the proof, we define an “Arrovian voting function” $\sigma$ on $V$ to satisfy exactly properties 2 through 6. That is,

Definition: We call a voting function $\sigma$ an “Arrovian” voting function on $V$ if $\sigma$ is a voting system that takes in any possible profile of voters’ ranked preferences and outputs a transitive societal ranking of three or more candidates, satisfies unanimity and IIA, and for which no single voter in $V$ is a dictator.

This is essentially our ideal “fair” voting system. To show that the 6 properties Arrow and Fishburn define cannot be satisfied simultaneously, we will prove that if $\sigma$ is Arrovian on $V$, then $V$ must have infinitely many voters.

The Proof

Note: This proof is loosely based on Fishburn’s proof, but I have made some adjustments to the structure, notation, terminology, and filled in some details to make it more accessible. Fishburn’s proof is concise and elegant, but it’s a bit hard to follow. Lemma \ref{overrule} is not in Fishburn’s paper, but it makes the proof of Proposition \ref{single-block} much more intuitive and easier to follow, in my opinion.

Lemma: If $\sigma$ is Arrovian on $V$, then there is more than one voter in $V$.\label{more-than-one}

Proof: If there was only one voter in $V$, they would be a dictator. Therefore, $V$ must have more than one voter. $\square$

Definition: We say that a voter $v$ can be overruled with $a$ over $b$ if whenever every voter except $v$ ranks $a>b$, then society ranks $a>b$, even if $v$ ranks $b>a$. In other words, $v$’s vote cannot flip the $a$ vs. $b$ outcome against the unanimous preference of every other voter. We denote this property $avb$, and say that “$v$ is overruled on $a$ over $b$”.

We denote the set of all voters except for $v$ as $V’$.

Proposition: In an Arrovian voting system, any single voter can be overruled by all other voters, $V’$, in their preference of any two candidates. That is, for any voter $v$ and any distinct candidates $a$ and $b$, we have $avb$ and $bva$.\label{single-block}

By IIA, it’s sufficient to show that there exists a single “critical” profile where $v$ ranks $b>a$ and both $V’$ and society rank $a>b$. If such a critical profile exists, then no matter how anyone ranks any other candidate in any other profile where $v$ ranks $b>a$ and everyone in $V’$ ranks $a>b$, IIA would guarantee the same societal relative ranking of $a>b$ as this critical profile, since they would have to agree on the relative rankings of $a$ and $b$.

We will prove this incrementally by showing that if $v$ can be overruled on a single pair of candidates, then they can be overruled on any pair of candidates. In a paper by Blair and Muller (1983), they basically describe this as a “contagion” property of “decisiveness” on $V’$, which I think is a great visceral metaphor for the idea. We’re about to unleash a virus of overruling that will infect every pair of candidates, making $v$ powerless and inconsequential.

Lemma: (Overruling lemma) If $pvq$ in an Arrovian voting system, then for any $r$ distinct from $p,q$,\label{overrule}

  • (i) $rvq$
  • (ii) $pvr$

This is a crucial step, but it’s not sufficient to prove the Proposition on its own. This lemma says that if we know that $v$ can be overruled, in one direction, on a single pair of candidates ($p$ over $q$: if $V’$ ranks $p>q$ and $v$ ranks $q>p$, then society ranks $p>q$), then we can show that $v$ can be overruled on another pair of candidates involving a third candidate $r$. But we have yet to show that $v$ can be overruled just yet. Unanimity is not sufficient to get us the first overruling (Unanimity only applies when 100% of voters agree, not when all but one voter agrees).

Click to expand proof

Proof: By Lemma \ref{more-than-one}, we have more than one voter, meaning $V’$ is nonempty. Consider this first profile:

Voters Preferences
$v$ $q > r > p$
$V’$ $r > p > q$

By the assumption that $pvq$, we must have that society ranks $p>q$, since $V’$ ranks $p>q$. By unanimity, every voter ranks $r>p$, so society must rank $r>p$. By transitivity, society ranks $r>q$, and in total ranks $r>p>q$. Thus, we have that $v$ ranks $q>r$ but $V’$ ranks $r>q$, and society ranks $r>q$. Therefore, we have $rvq$. That is, we showed that $v$ can be overruled with $r$ over $q$.

Consider the similar second profile where we adjust and shift the ranking of $V’$:

Voters Preferences
$v$ $q > r > p$
$V’$ $p > q > r$

By the same logic as before,

  • $pvq \implies$ society ranks $p>q$
  • Unanimity implies society ranks $q>r$
  • Transitivity implies society ranks $p>r$

Thus, $V’$ ranks $p>r$ even though $v$ ranks $r>p$, so we have $pvr$.

Therefore, if $pvq$, then $rvq$ and $pvr$. $\square$

Basically, if $v$ can be overruled with $p$ over $q$, then $v$ can also be overruled with $p$ over any other candidate, and any other candidate over $q$. We have shown that if we have a single overruling, then that overruling “propagates” like a contagion to all other pairs of candidates.

This is one of the most important steps of the proof, so I’ve included an intuitive explanation to try to ground the idea in terms of a story about a voter named Victor, as I think might be more illuminating than reading the formal proof. You can click to expand the intuition block if you want to read it.

Intuition

Suppose we have a voter Victor, who is a bit of a contrarian. Victor likes to go against the grain: if everyone else prefers Pete to Quincy, Victor will prefer Quincy to Pete.

We assume that Victor can be overruled in this preference: if all the other voters prefer Pete to Quincy, then society will rank Pete over Quincy even if Victor prefers Quincy to Pete. And let’s say this is the baseline scenario: Victor is being overruled on Pete versus Quincy by the entire electorate who all agree that Pete should be elected over Quincy.

Group Preferences
Victor Quincy $>$ Pete
Rest Pete $>$ Quincy
Society Pete $>$ Quincy

Now, suppose that there is a third candidate, Rachel, who enters the race. Victor still likes Quincy the most, but prefers Rachel to Pete. The rest of the electorate, still prefers Pete to Quincy, but loves Rachel. How does this affect the societal ranking?

Voters Preferences
Victor Quincy $>$ Rachel $>$ Pete
Rest Rachel $>$ Pete $>$ Quincy
Society Pete $>$ Quincy

In this scenario, everyone agrees that Rachel is preferred to Pete, so society must rank Rachel over Pete. It would be silly to do anything else! Thus, we have that society ranks Rachel over Pete and Pete over Quincy by transitivity, so society ranks Rachel over Quincy as well. This means that society’s ranking is exactly the same ranking as what everyone besides Victor prefers: Rachel over Pete over Quincy.

Voters Preferences
Victor Quincy $>$ Rachel $>$ Pete
Rest Rachel $>$ Pete $>$ Quincy
Society Rachel $>$ Pete $>$ Quincy

The winner of the elections is thus the candidate ranked at the top by society: Rachel.

Observe what we have now:

  • Victor prefers Quincy over Rachel
  • The rest of the electorate prefers Rachel over Quincy
  • Society ranks Rachel over Quincy

Victor is thus overruled on the pair Quincy versus Rachel in this profile.

Now, by IIA, we know that whenever Victor ranks Quincy over Rachel while everyone else ranks Rachel over Quincy, society must still rank Rachel over Quincy.

Therefore, Victor can be overruled by the rest of the electorate on the pair Quincy versus Rachel in all profiles. We can move Pete around in the rankings of everyone else, introduce new candidates, or whatever else we want. But as long as Victor ranks Quincy over Rachel and everyone else ranks Rachel over Quincy, society must rank Rachel over Quincy. And that’s the fundamental idea behind the proof of the lemma. IIA allows this overruling to spread to all relevant profiles.

As an exercise to ensure your understanding, see if you can reason through why Victor can be overruled by the rest of the electorate on Pete versus Rachel if everyone else besides Victor ranks Pete $>$ Quincy $>$ Rachel. If you need a hint, feel free to look back at the proof!

Let’s be sure we understand what has been shown thus far:

Now we can prove Proposition \ref{single-block}. To do this, we just need to show that we can get a single overruling of $v$ on some pair of candidates, and then the Overruling Lemma will let the contagion spread to all pairs of candidates, making $v$ powerless.

Proposition \ref{single-block}: In an Arrovian voting system, any single voter can be overruled by all other voters in their preference of any two candidates. That is, for any voter $v$ and any distinct candidates $a$ and $b$, we have $avb$ and $bva$.

Click to expand proof

Proof: Let $v$ be any voter and $a,b$ be any distinct candidates. We want to show that $avb$ and $bva$. First, we show that for any single voter $v$ who ranks $a>b$ in an Arrovian voting system, there exists a candidate $r$ such that $bvr$.

Since $v$ cannot be a dictator by assumption, we must have some profile $f$ where $v$ ranks $a>b$ but society does not rank $a>b$ (society ranks $b\geq a$). By IIA, any profile where these same voters maintain the same relative ranking of $a$ and $b$ must also have society rank $b\geq a$.

We construct such a profile $g$ where everyone has the same relative ranking of $a$ and $b$ as in $f$, but we take a third candidate $r$, which must exist since we have three or more candidates, and construct the profile

Voters Preferences
$v$ $a>r>b$
$V’$ $a>r$, $b>r$
$\sigma$ $b\geq a$

The precise placement of $a$ and $b$ in $V’$ may vary from voter to voter, but we always place $r$ below both $a$ and $b$ in all of their rankings. This new profile $g$ agrees with $f$ on ${a,b}$, so by IIA, society must rank $b \geq a$ in $g$ as well.

By unanimity, society must rank $a$ over $r$. This means society ranks $b\geq a$ and $a>r$, so by transitivity, society ranks $b > r$. Thus, we have that $v$ ranks $r>b$ but $V’$ ranks $b>r$, and society ranks $b>r$. Therefore, we have $bvr$.

We now apply the Overruling Lemma repeatedly.

  • $bvr$ directly gives us $avr$, by the Overruling Lemma (i).
  • $bvr$ implies $bva$ by the Overruling Lemma (ii).
  • By the Overruling Lemma (ii) applied to $avr$, we have $avb$.

Since $a$, $b$, and $v$ were arbitrary, this completes the proof. $\square$

Similar to the above proof, we use very similar logic. However, rather than the entire electorate being in total agreement, here we only need something weaker. In the following intuition block, we walk through the logic carefully, as we did above, continuing the story of Victor.

Intuition

We assume that our favorite little contrarian, Victor, is still not getting his way. Victor isn’t a dictator, so even if he prefers Alice to Bob, there must be some way that Bob doesn’t lose to Alice. That is, Bob is ranked equally or higher than Alice in society’s ranking (Bob $\geq$ Alice). Now let’s say Rachel, like last time, jumps into the race! Victor, like last time, puts Rachel in the middle of his ranking, under Alice but above Bob, while every other voter hates Rachel. They put Rachel dead last in their rankings. Poor Rachel.

IIA tells us that since society ranked Bob $\geq$ Alice before, the introduction of Rachel shouldn’t change that. So even with Rachel in the race, society must maintain that preference.

Group Preferences
Victor Alice $>$ Rachel $>$ Bob
Others Alice $>$ Rachel, Bob $>$ Rachel
Society Bob $\geq$ Alice

We don’t know how the others rank Alice or Bob. All we know is that every voter ranks Rachel last, and society at least ranks Bob $\geq$ Alice.

However, notice that everyone agrees that Alice is better than Rachel, so Alice has to defeat Rachel. But we assumed that Bob is ranked at least as high as Alice in society’s ranking. So society must rank Bob $\geq$ Alice > Rachel, and thus Bob strictly defeats Rachel.

Group Preferences
Victor Alice $>$ Rachel $>$ Bob
Others Alice $>$ Rachel, Bob $>$ Rachel
Society Bob $\geq$ Alice $>$ Rachel

This means that Victor, once again, is a contrarian when it comes to Rachel and Bob. Everyone but Victor agrees Bob is better than Rachel, and society agrees! Thus, Victor has been overruled on Bob versus Rachel.

In summary, by just knowing that Victor didn’t get his way on Alice versus Bob, we were able to show that Victor can be overruled on Bob versus some other candidate, who in this case is named Rachel. This is the start of our chain of overrulings, which we will use to show that Victor can be overruled on any pair of candidates.

Now, we know that Victor can be overruled on Bob over Rachel. We can apply the overruling lemma to create a chain of overrulings. If we go back to the scenario described in the intuition block for the proof of the overruling lemma, we can apply that to use the Bob versus Rachel overruling to get an overruling of

  • Bob over Rachel $\implies$ Bob over Alice
  • Bob over Rachel $\implies$ Alice over Rachel
  • Alice over Rachel $\implies$ Alice over Bob

Thus, we get that Victor can be overruled on Alice over Bob and Bob over Alice. This can be applied to any pair of candidates, so Victor has been shown to be powerless and inconsequential in the decision of any pair of candidates, should every other voter disagree with him on that pair.

Therefore, we have shown that any single voter $v$ can be overruled by all other voters in their preference of any two candidates. This is a whole lot of rigorous math to show the most basic assumption I think anyone would intuitively expect from literally any voting system with at least three voters: that if every voter except one prefers $a$ to $b$, then society should also prefer $a$ to $b$.

However, Proposition \ref{single-block} is actually a lot stronger than it appears. Even with three voters, it’s not always possible in an Arrovian voting system that two voters can overrule the third voter in a way that establishes a clear societal ranking, if the preferences are arranged in a certain way.

Example: Take this example with three voters:\label{rock-paper-scissors}

Voters Preferences
Voter 1 Rock $>$ Scissors $>$ Paper
Voter 2 Scissors $>$ Paper $>$ Rock
Voter 3 Paper $>$ Rock $>$ Scissors

As an exercise, use Proposition \ref{single-block} to show that it’s impossible to establish a societal ranking for this profile if we assume the system is Arrovian (and thus, the Proposition applies). Therefore, three voters is insufficient, and we can’t define an Arrovian voting system on a set of three voters.

We will see such a cyclic profile used in the next step of the proof, which is to show that if $n$ voters can be overruled, then,

  1. We must have more than $n+1$ voters, and
  2. $n+1$ voters can be overruled.

We have shown $n=1$ voter can be overruled, and that we need more than $n=1$ voters in $V$. If we can prove the above two points, then that shows that 2 voters can be overruled, and we need more than 2 voters. Then we can apply the same logic again to show that 3 voters can be overruled, and we need more than 3 voters. And so on, and so on. This is an idea called “mathematical induction”.

This will show that we cannot have any finite number of voters, and thus we must have infinitely many voters.

To Infinity and Beyond

Proposition: If any strict subset of size $n$ voters can be overruled (implying $V$ contains more than $n$ voters) in an Arrovian voting system, then $V$ must have more than $n+1$ voters, and $n+1$ voters can be overruled with any pair of candidates. \label{induction-step}

Click to expand proof

Suppose that $V$ contains more than $n$ voters, and let $U$ be a strict subset of $V$ containing $n$ voters. Further, suppose that any strict subset of exactly $n$ voters can always be overruled with any pair of candidates. Then let $v$ be a voter outside of $U$, and denote $S’=V-(U\sqcup{v})$. Meaning $S’$ is all voters besides those in $U$ and $v$. We allow the possibility that we have only $n+1$ voters and $S’$ is empty. Consider the following profile:

Voters Rankings
$v$ $a>b>c$
$U$ $b>c>a$
$S’$ $c>a>b$

We know that any single voter can be overruled on any pair of candidates by Proposition \ref{single-block}, and we have that every voter besides $v$ ranks $c>a$. Therefore, since $cva$, then society must rank $c>a$. Similarly, by assumption that a strict subset of size $n$ can be overruled on any pair of candidates, we have that $U$ ranks $b>a$, but everyone outside of $U$ ranks $a>b$. Thus, society must rank $a>b$. By transitivity, we must have that society ranks $c>b$. At this point, we expect society to rank $c>a>b$, exactly as $S’$ does.

If $S’$ is empty, then we would only have voters $v$ and those in $U$. So, by unanimity, we would need that society ranks $b>c$, which would be a contradiction to the fact we showed that society ranks $c>b$. We must then have that $S’$ is nonempty, so $V$ must contain strictly more than $n+1$ voters. Therefore, the set of size $n+1$, $U\cup{v}$, can be overruled. Since $U,v,a,b,c$ were arbitrary, we have sufficiently shown that any set of size $n+1$ can be overruled in the decision of any two candidates. $\square$

You may read an intuitive explanation of the above proof, using our friend Victor and a group called the Bob fan club, by expanding the intuition block below!

Intuition

Let’s go through the logic carefully, as we did before.

Back to our friend Victor, that little scamp. An election is being held, and there are three candidates: Alice, Bob, and Clark.

Victor, on this day, has the whim to support Alice then Bob and then Clark. And now, imagine we have a group of diehard Bob supporters who run a fan club at the local university. The Bob fan club is just small enough that they can be overruled if literally everyone else disagrees with them. Let’s say the Bob fan club has $n$ voters. The Bob fan club hates Alice, but they tolerate Clark. Victor is in agreement with the Bob fan club that Bob is better than Clark, but isn’t allowed to join the club because he loves Alice.

Unfortunately, it seems that everyone else who is planning to vote besides Victor and the Bob fan club, who we will call the “Outsiders,” are fans of Clark and hate Bob. Let’s think about who has to win based on these preferences.

Voters Rankings
Victor Alice $>$ Bob $>$ Clark
Bob fan club Bob $>$ Clark $>$ Alice
Outsiders Clark $>$ Alice $>$ Bob

Now, we know that Victor can be overruled. That was the big result from before. And we’re assuming the Bob fan club is just small enough to be overruled. So what happens?

  • Well, everyone but Victor agrees that Clark is better than Alice, so society is going to definitely have to rank Clark higher than Alice since a single voter can be overruled by everyone else. (society ranks Clark $>$ Alice)
  • Similarly, everyone but the Bob fan club agrees that Alice is better than Bob, so society must rank Alice higher than Bob by our assumption that any strict subset of size $n$ can be overruled. (society ranks Alice $>$ Bob)

By transitivity, that means society must rank Clark then Alice then Bob, meaning society’s ranking matches exactly the Outsiders’ ranking. If nobody except Victor and the Bob fan club shows up to vote, then that would make no sense! Because if none of the Outsiders show up, then every voter would agree that Bob is better than Clark, so Bob would have to beat Clark. Therefore, at least one Outsider had to show up to vote.

This tells us two important things, based purely on our assumption that a strict subset of size $n$ can be overruled:

  • the turnout of the election is strictly bigger than the size of the Bob fan club plus one (Victor) ($\vert V\vert > n+1$)
  • and that a set of voters of size $n+1$ (ex. Victor and the Bob fan club) can be overruled.

Thus, if $n$ voters can be overruled, then $n+1$ voters can also be overruled, and there must be strictly more than $n+1$ voters in the electorate in order for this to be possible.

Since $n+1$ voters can be overruled, suppose a new member joins the Bob fan club. We repeat the scenario again, and come to the same conclusion! Repeating this over and over essentially closes the trap on the idea that the Outsiders are a finite group.

You could start with the Bob fan club starting with only a single founder. The above logic implies two voters can be overruled. A new member joins, and the Bob fan club now has two members. Apply the above logic and now three voters can be overruled. We can add another member and repeat, and so on, and so on. Eventually, the Bob fan club contains every person in the country. And then every person in the world! And then more people than could fit in the universe! At every point, we must have even more people outside the Bob fan club who can decide the whole election if they all stand in agreement.

We are now ready to prove Arrow’s theorem!

Theorem: (Arrow’s Theorem) If $\sigma$ is an Arrovian voting system, then it must be defined on a set of infinitely many voters.

This is certainly not the regular formulation of Arrow’s theorem that we saw above, but it is equivalent:

Proof: Proof by induction. Lemma \ref{more-than-one} tells us that we must have more than one voter. Proposition \ref{single-block} tells us that any single voter can be overruled. Proposition \ref{induction-step} tells us that if a strict subset of $n$ voters can be overruled, then $V$ must contain more than $n+1$ voters, and that $n+1$ voters can be overruled. Therefore, $V$ cannot contain any finite number of voters. $\square$.

To ensure the logic is clear, we had the proposition that a single voter can be overruled. By the inductive proposition, this tells us that we must have more than $1+1=2$ voters, and that 2 voters can be overruled. Applying the inductive proposition again tells us we must have more than 3 voters and that 3 voters can be overruled. And so on, and so on. If you ask me “what about a set of a billion voters?”, then it would take… a while… but I could go from the fact that one voter can be overruled to show that a strict subset of a “billion minus one” voters can be overruled and we must have strictly more than “a billion” voters in $V$.

Thus, an Arrovian voting system could only possibly be defined on a set of infinitely many voters. But can it, actually? Just because we’ve proven that $V$ cannot be finite does not necessarily prove that an Arrovian voting system can actually exist. Only that if it did, then $V$ would have to be infinite.

Infinitely Many Voters

Fishburn explains in his paper that, yes, indeed you can, in fact, satisfy all of Arrow’s criteria so long as you allow infinitely many voters! This gets into the territory of measure theory, which is far beyond the scope of this post. However, I think I can explain the basics at an intuitive level, though I’m going to be a bit fast and loose with the language and properties of measures.

We define a function $\mu$ (a “probability measure”), which takes in a subset of voters, and outputs a “measure” of either 0 or 1. If a subset outputs 1 (“has measure 1”), it “has the power”. Otherwise, a measure 0 set “does not have the power”.

If a group “has the power”, then they can dictate the societal ranking. For example, in our example above, both Victor and the Bob fan club would have measure 0, since they were overruled. But the Outsiders would have measure 1, and thus they decided the whole election.

We might understand “having the power” intuitively as, for example, having a “majority” if we had a finite number of voters. But that notion loses a concrete meaning when we have infinitely many voters, so we need to define a more abstract notion of “power” that can be applied to any subset of voters. This is what the $\mu$ function does.

The collection of all “powerful” sets–those with measure 1–turns out to be what mathematicians call an ultrafilter on $V$: a consistent, maximal notion of “majority” that generalizes the idea of having more than half the votes to infinite electorates. Arrow’s theorem, viewed through this lens, is almost trivial. Because the only ultrafilter on a finite set is a “principal ultrafilter” (generated by a single voter), only a single voter can “have the power.” This is exactly a dictatorship. I’m smoothing over a lot of details here, but that’s the basic idea.

With infinitely many voters, “non-principal ultrafilters” (ones not generated by a single voter) exist, and those correspond to Arrovian voting systems with no dictator. The fact that a non-principal ultrafilter exists on an infinite set is nontrivial, but relies on the Axiom of Choice, which is a perfectly intuitive sounding axiom that delivers completely unintuitive results.

We focus our attention on sets like $\mu(a>b)$, where $\mu(a>b)$ is simply the measure of the set of all voters who rank $a>b$. For any pair of candidates $a,b$, we can split $V$ into three parts (with no overlap):

\[V=\{v:a>_v b\}\sqcup\{v:b>_v a\}\sqcup\{v:a=_v b\}\]

The measures have to add up perfectly, because the sets don’t overlap and they cover all of $V$:

\[\mu(V)=1=\mu(a>b)+\mu(b>a)+\mu(a=b)\]

Exactly one of these measures, out of $\mu(a>b)$, $\mu(b>a)$, or $\mu(a=b)$ has to equal 1, and the rest zero. We then, simply, decide societal rankings by asking if the voters who give a certain ranking “have the power”. Whichever has measure 1 decides how society ranks $a$ and $b$.

Definition: We define our voting function $\sigma_\mu$ to rank $a>b$ if and only if $\mu(a>b) = 1$. If $\mu(a>b) = \mu(b>a)$, then we define $\sigma_\mu$ to rank $a=b$. We also define that $\mu({v})=0$ for all voters $v\in V$.\label{measure-theory-voting-function}

Details for the curious

We require a few basic, but seemingly obvious properties from $\mu$:

\[\mu(V)=1,\mu(\varnothing)=0\]

The set of all voters should absolutely have the power, that’s just unanimity! And no voters at all should have power, that’s just… well, it would be weird if that were the case.

Further, if we split the set of voters into a finite partition like $V=S_1\sqcup S_2\sqcup S_3$, then the measure of each set has to add up to the measure of the whole set, so

\[\mu(S_1)+\mu(S_2)+\mu(S_3)=\mu(V)=1\]

Since $\mu$ can only output 0 or 1, this means exactly one of these sets can “have the power” and no others.

We also require that the measure of any single voter is 0, or else we would have a dictator. This also corresponds to our proof that any single voter can be overruled. The induction proof that shows any finite set of voters can be overruled also corresponds to the fact the measure has “finite additivity”. That is, for a finite subset of voters $U$, we have

\[\mu(U)=\sum_{v\in U}\mu(\{v\})=0\]

If the set of voters were finite, then we would have

\[\sum_{v\in V}\mu(\{v\})=\mu(V)=1\]

Meaning that some single voter would have to have measure 1, and thus be a dictator. So we need infinitely many voters to avoid this.

This gobbledygook is necessary to generalize the notion of “majority rule” which fails for finite voters when majority preferences fail to induce a coherent ranking, as we saw in Example \ref{rock-paper-scissors}.

But can our abstract $\mu$ function define a coherent ranking where majority rule on finite voters failed? As it turns out, yes!

Lemma: The voting system $\sigma_\mu$ defined by $\mu$ on a set of voters $V$ induces a transitive ranking.\label{infinite-transitive}

Click to expand proof

We will only have to use a single fact from measure theory to prove this:

\[X\subset Y\cup Z\implies \mu(X)\leq\mu(Y)+\mu(Z)\]

Intuitively, if I “count” everyone whose first name is “Bob”, and I “count” everyone whose last name is “Dylan”, then the sum of those two counts is at least as big as the “count” of everyone whose first name is “Bob” or whose last name is “Dylan”. This is because I will be double counting everyone named “Bob Dylan”, since they are included in both of those sets.

Note that if a voter prefers Clark to Alice, then they must prefer either Clark to Bob or Bob to Alice (why?). So, if I want to “count” everyone who prefers Clark to Alice, then that will include everyone who prefers Clark to Bob, and everyone who prefers Bob to Alice. However, I will have double counted everyone who prefers Clark $>$ Bob $>$ Alice, since they are included in both of those sets. Therefore, the “count” of everyone who prefers Clark to Alice is at most the “count” of everyone who prefers Clark to Bob plus the “count” of everyone who prefers Bob to Alice:

\[\{v: c>_v a\}\subset \{v: c>_v b\}\cup \{v: b>_v a\}\] \[\mu(c>a)\leq\mu(c>b)+\mu(b>a)\]

Proof: If $\mu(a>b)=1$, then $\mu(b>a)=0$ so society can’t rank both $a>b$ and $b>a$. We will show that $\sigma_\mu$ is transitive, which is the only non-trivial part of being a ranking (“weak order”). Suppose that society ranks $a\geq b$ and $b\geq c$. We need that society ranks $a\geq c$.

If a voter ranks $c>a$, then they must rank either $c>b$ or $b>a$. Thus,

\[\{v: c>_v a\}\subset \{v: c>_v b\}\cup \{v: b>_v a\}\]

By the subadditivity bound above, we have that

\[\mu(c>a)\leq \mu(c>b)+\mu(b>a)\]

If society ranks $a\geq b$ and $b\geq c$, then $\mu(b>a)=0$ and $\mu(c>b)=0$. Hence,

\[\mu(c>a)\leq 0+0=0\]

Therefore, $\mu(c>a)=0$, and society ranks $a\geq c$. $\square$

Theorem: For the $\mu$ measure as defined above, $\sigma_\mu$ defined by the relation \(a>_{\sigma_\mu}b\) if and only if $\mu(a>b)=1$ is Arrovian.\label{measure-theory-arrovian}

Click to expand proof

Proof: We show that $\sigma_\mu$ satisfies all of Arrow’s conditions.

Unanimity: If every voter ranks $a>b$, then $\mu(a>b)=\mu(V)=1$, and hence society ranks $a>b$.

Non-dictatorship: \(\mu(\{v\})=0\) for all voters $v$, so if $v$ ranks $a>b$, and every other voter ranks $b>a$, then we have that $\mu(b>a)=\mu(V)-\mu({v})=1-0=1$. Thus, society ranks $b>a$ and not $a>b$. Thus, $v$ is not a dictator.

IIA: The societal ranking of $a$ and $b$ is determined entirely by the relationship between $\mu(a>b)$ and $\mu(b>a)$. If we adjust a profile such that we preserve the relative order of $a$ and $b$, then that precisely means that the membership, and hence the precise value, of $\mu(a>b)$ and $\mu(b>a)$ is preserved. Therefore, the societal ranking must be preserved.

By Lemma \ref{infinite-transitive}, $\sigma_\mu$ defined by our constructed $\mu$ measure induces a transitive ranking, allows for three or more candidates, and allows voters to submit any possible ranking. Therefore, $\sigma_\mu$ is Arrovian. $\square$

The Approval Voting Exception

From Wikipedia:

“Rated voting rules, where voters assign a separate grade to each candidate, are not affected by Arrow’s theorem. Arrow initially asserted the information provided by these systems was meaningless and therefore could not be used to prevent paradoxes, leading him to overlook them. However, Arrow would later describe this as a mistake, admitting rules based on cardinal utilities (such as score and Approval voting) are not subject to his theorem.”

One such rated voting system we will focus on is Approval voting.

Definition: Approval voting is a voting system where voters can “approve” of as many candidates as they find acceptable, and the candidate with the most approvals wins.

As mentioned in my post on IIA, if you extend IIA to rated voting systems by assuming that voters score candidates independently, then basically all rated voting systems trivially satisfy IIA and thus break Arrow’s theorem. If you maintain the requirement via relative ordinal rankings, then only Approval voting satisfies IIA among rated systems.

Maniquet and Mongin (2015) and Vorsatz (2007) both wrote fascinating papers axiomatically characterizing Approval voting as a function on “dichotomous preferences”. That is, if you ask each voter to partition the candidates into “acceptable” or “not acceptable”. Then as a system which aggregates those preferences, both papers show that Approval satisfies IIA in a very real sense. In that way, Approval is essentially Arrovian (though you must say this carefully). To call Approval Arrovian is a little intellectually dishonest, unless you include an appropriate number of asterisks and qualifiers to that statement.

However, like IIA, there is a very real sense that Approval satisfies unanimity, strategyproofness, and other desirable properties. I will publish a post on these topics soon, and what Approval’s satisfaction of these properties actually means for real voters in real life when they use Approval.

But we must be careful not to overstate the facts. Approval voting is not an exception to Arrow’s theorem or Gibbard’s theorem(s) at the level of general preference. Domain restrictions are common loopholes for impossibility theorems. As mentioned in my post on IIA, there is a case to be made that Approval, and other rated systems, do not satisfy IIA.

Conclusion

As stated in the introduction, Democracy is not “mathematically impossible”. Arrow’s theorem simply tells us that if we want a responsive ranked voting system that is immune to irrelevant options screwing with the results, then it’s either a dictatorship or we have infinitely many voters. Thus, we essentially have to accept that irrelevant options will screw with the results of an election using ranked ballots sometimes.

We can use a system that is more resistant to these failures, like a Condorcet-consistent method, which only fails IIA when there is no Condorcet winner, rather than a system like RCV, which is a disaster both theoretically and practically. Or… just not use a ranked voting system? Approval voting is an excellent choice too, and my primary recommendation.

Either way, just remember, folks: Democracy is not doomed, and voting is important. As I was writing this, an election in Boca Raton, Florida was literally decided by a single vote (7,568 to 7,567). IIA may be effectively impossible, but a decent voting system is not.

Appendix

For mathematically advanced readers, I will include a more general theorem I found for the conditions to define an Arrovian voting system via a measure.

Theorem: Suppose that we have a measure function $\mu$ on subsets of some set of voters $V$. If the following requirements are satisfied\label{measure-theory-general}

  1. $\mu(\varnothing)=0$, $\mu(V)=1$
  2. \(\mu(\{v\})\leq\frac{1}{2}\) for all $v\in V$
  3. The relation \(a>_{\sigma_\mu}b\) if and only if $\mu(a>b)>\mu(b>a)$ is transitive.

Then $\sigma_\mu$ defined by this relation is unanimous, non-dictatorial, and satisfies IIA.

I encourage you to try to prove this and show that Approval voting is Arrovian on the dichotomous domain via the “Approval measure” $\mu_A$ which simply counts the proportion of voters who approve of a candidate or a certain ranking.

\[\mu_A(S)=\frac{\lvert S\rvert}{n}\]

where $n\geq 2$ is the total number of voters, $S$ is any set of voters, and $\lvert S\rvert$ is the number of voters in $S$.

Hint

If we define $\mu_A(a)$ to be the proportion of voters who approve of $a$, then notice that $\mu_A(a)=\mu_A(a>b)+\mu_A(a=b)$, where $\mu_A(a>b)$ is the proportion of voters who approve of $a$ but not $b$, and $\mu_A(a=b)$ is the proportion of voters who approve of both $a$ and $b$.

References

Blair, D., & Muller, E. (1983). Essential aggregation procedures on restricted domains of preferences. Journal of Economic Theory, 30(1), 34-53. https://doi.org/10.1016/0022-0531(83)90092-3

Brams, S. J., & Fishburn, P. C. (1978). Approval Voting. The American Political Science Review, 72(3), 831–847. https://doi.org/10.2307/1955105

Fishburn, P. C., Arrow’s impossibility theorem: Concise proof and infinite voters, Journal of Economic Theory, Volume 2, Issue 1, 1970, Pages 103-106, ISSN 0022-0531, https://doi.org/10.1016/0022-0531(70)90015-3

Maniquet, F., & Mongin, P. (2015). Approval voting and Arrow’s impossibility theorem. Social Choice and Welfare, 44(3), 519–532. http://www.jstor.org/stable/43662604

Wikipedia contributors. (2024, June 10). Independence of irrelevant alternatives. Wikipedia. https://en.wikipedia.org/wiki/Independence_of_irrelevant_alternatives

Veritasium, “Why Democracy Is Mathematically Impossible”, https://www.youtube.com/watch?v=qf7ws2DF-zk

Vorsatz, M. (2007). Approval voting on dichotomous preferences. Social Choice and Welfare, 28(1), 127–141. http://www.jstor.org/stable/41106808

Enjoy Reading This Article?

Here are some more articles you might like to read next:

  • Independence of Irrelevant Alternatives
  • Is the Condorcet Winner the True Compromise?
  • Why I Currently Only Support Approval
  • Approval is the Perfect Condorcet Method
  • The Gibbard-Satterthwaite Theorem