Page MenuHomePhabricator

Implement STV tallying in STVTallier::finishTally [XL]
Closed, ResolvedPublic

Description

Implement the algorithm as suggested in: T281035#7116140.

General algorithm (source)

  1. Compute the quota.
  2. Assign votes to candidates by first preferences.
  3. Declare as winners all candidates who received at least the quota.
  4. Transfer the excess votes from winners to hopefuls.
  5. Repeat 3–4 until no new candidates are elected. (Under some systems, votes could initially be transferred in this step to prior winners or losers. This might affect the outcome.)

If all seats have winners, the process is complete. Otherwise:

  1. Eliminate one or more candidates, typically either the lowest candidate or all candidates whose combined votes are less than the vote of the lowest remaining candidate.
  2. Transfer the votes of the losers to remaining hopeful candidates.
  3. Repeat 3–7 until all seats are full.

The quota we will use (step 1) is the Droop quota (source):

floor( no. votes / (no. seats + 1) ) + 1

The method for transferring votes from elected or eliminated candidates will be the Meek method:

Acceptance criteria:

  • Add an implementation for STVTallier::finishTally (should be modular and will probably call out to several private methods)
  • Add STVTallierTest with test cases for finishTally

It may or may not be desirable to implement addVote for this task. Tallying won't work via the interface without it, but this task is only concerned with providing the algorithm, not the full tallying capability.

It would help to look at SchulzeTallier and PairwiseTallier for how to structure this.

Event Timeline

ARamirez_WMF renamed this task from Implement STV tallying in STVTallier::finishTally to Implement STV tallying in STVTallier::finishTally [XL].May 26 2021, 4:16 PM

Change 697899 had a related patch set uploaded (by TsepoThoabala; author: TsepoThoabala):

[mediawiki/extensions/SecurePoll@master] Implement STV tallying in STVTallier::finishTally

https://gerrit.wikimedia.org/r/697899

It may or may not be desirable to implement addVote for this task. Tallying won't work via the interface without it, but this task is only concerned with providing the algorithm, not the full tallying capability.

Just had a meeting with Tšepo and I would like to propose we split up the addVote() portion of this so that this ticket only deals with implementing the algorithm. Another ticket can deal with the glue step between voting (done in T282690: Voting form display for STV polls [M]) and the tallying done here. That step would deal with writing the STVTallier constructor, unpackRecord(), and addVote(). This ticket only needs to know what gets returned from the aggregate addVotes which might change a bit but at the end of the day must be an array of arrays (votes) that denote the ranking of candidates. So something like:

[
  [
    1 => 'A',
    2 => 'B',
    3 => 'C'
  ],
  [
    1 => 'A',
    2 => 'C'
  ]
]

or

[
  ['A', 'B', 'C'],
  ['A', 'C']
]

Change 704138 had a related patch set uploaded (by TsepoThoabala; author: TsepoThoabala):

[mediawiki/extensions/SecurePoll@master] Implement STV tallying in STVTallier::finishTally

https://gerrit.wikimedia.org/r/704138

Change 704138 abandoned by TsepoThoabala:

[mediawiki/extensions/SecurePoll@master] Implement STV tallying in STVTallier::finishTally

Reason:

https://gerrit.wikimedia.org/r/704138

Change 704454 had a related patch set uploaded (by TsepoThoabala; author: TsepoThoabala):

[mediawiki/extensions/SecurePoll@master] Implement STV tallying in STVTallier::finishTally

https://gerrit.wikimedia.org/r/704454

@Niharika We should decide how we want to eliminate candidates.

tl;dr: should we eliminate lowest ranking candidate any round where we can't elect one or should we only exclude candidates when they have no mathematical chance of overtaking?

The paper that first proposed the Meek's method [1] (and the following addendum [2]) don't specific an elimination methodology. This lead me to research prior STV implementations, including the Hare method (which preceded the droop quota that's in common use now) and suggests that in lieu of any other mechanism, elimination happens whenever a candidate is not elected in that round.

From this fairly well-cited summary of STV [3]

The different varieties of STV share the following features: a quota of votes is established, and any candidate who attains the quota is elected; surplus votes of elected candidates are transferred to other candidates favored by those who voted for the elected candidates; candidates are eliminated sequentially and their votes transferred to other candidates, with the candidate eliminated at each stage generally being the one with the fewest current votes.

Hare specifies that it happens after the surplus is transferred:

In 1865 Hare revised his proposal, incorporating the provision that after all surplus votes have been transferred, if the prescribed number of positions has not been filled, candidates are eliminated successively beginning with the one with the fewest votes.

I can't speak to the specific assumptions in the Meek's method where they're not declared but given that Meek published his paper in 1969, I would assume that Meek's builds upon the elimination assumptions already built into STV at this point, which is that when a candidate cannot be elected, eliminate the candidate with the fewest votes and re-distribute those votes until a candidate meets the quota. There are other variations in the STV implementations, mostly around how to distribute votes and how many (Senate rules, Droop quota, etc). As far as I could tell*, no one was debating over when to eliminate a candidate.

All of this is to say I don't think we would be wrong to eliminate a candidate any round where we were unable to elect one.

In what I think is the first programmatic implementation of Meek's method [4] (the ballot format may look familiar - it's what OpaVote uses), 2.7 describes a method of elimination with a "pseudo-random choice" tiebreaker:

If no candidate were elected under rule 2.6, then the hopeful candidate with the fewest votes changes state from hopeful to excluded. Any tie is resolved by a pseudo-random choice.

(I would like to point out again that the pseudo-random choice thing is an implementation choice. In what I believe is the only state where it would matter (2 candidates tie for the last seat), we can simply opt to not declare either of them the winner programmatically and instead leave it for the committee to decide upon)

BUT only after exhausting all surplus votes, as described by 2.9 (sorry for bad copy-pasta - see paper for actual symbols; emphasis mine):

The convergent iterative scheme is as follows: set +* equal to 0 for excluded candidates, 1 for hopeful candidates, and their last calculated values ,* . for elected candidates. (Immediately after election of any candidate the last calculated value is 1 initially.) Applying rule 2.3, using these weights, let -* be the total value of votes received by candidate . and let / be the total excess. Using this value for / , calculate the new quota q using rule 2.5. Finally update the weights for elected candidates to values %0*21 ,*43  -5* . Repeat the process of successively updating -5*46)/ 6 3 and +* until every fraction 3  -5* , for elected candidates, lies within the limits 0.99999 and 1.00001 (inclusive).

I didn't want to read PASCAL so I didn't but I do think this is the first implementation of such iteration, since it's stated in the introduction that it's only possible because they're making computers do it instead of people:

It should be emphasised that the results will not always be the same as by manual counting methods. The algorithm deliberately uses the power of the computer to get better results than are easily achievable by hand.

It's from this work that the same author, Hill, publishes some optimizations on Algorithm 123 [5], notably the "Short-cut exclusion rule"

During the iterations, if it is found that the lowest candidate’s current votes plus the total surplus of the elected candidates is less than the current votes of the next lowest candidate, it is certain that, if the iterations were continued all the way to convergence, that lowest candidate would necessarily still be the lowest and would have to be excluded. It is therefore safe to exclude the candidate at once. The next iterations will then start from a different point than would otherwise have been the case, but it follows from Woodall’s proof that the next solution vector will still be the same, so the eventual result must be unchanged.

tl;dr: lowest + surplus < second lowest, then lowest has no chance to catch up and can be safely eliminated.

Based on comparisons with OpaVote results, this is how they've implemented STV. New Zealand has also implemented STV using Hill's specifics.

The "eliminate in a round with no numbers" strategy seems to have been put in place because of human limitations. Given that other large elections have used this iteration methodology, we also would not be wrong in leveraging computers to do math.

*I did do some diligence in checking credibility of the authors, papers, and journals, but I wouldn't call it exhaustive. Mileage may vary.

[1] https://web.archive.org/web/20200712142326/http://www.votingmatters.org.uk/ISSUE1/P1.HTM
[2] https://rangevoting.org/MeekSTV2.html
[3] https://pubs.aeaweb.org/doi/pdfplus/10.1257/jep.9.1.27
[4] https://ucid.es/system/files/meekm_0.pdf
[5] https://web.archive.org/web/20200712143325/http://www.votingmatters.org.uk/ISSUE22/I22P2.pdf

After discussion of both options within the team and with @Matanya and Katie, we have decided to go ahead with Hill's method (short cut exclusion rule) to decide eliminations.

@TThoabala @STran I think we are getting an infinite recursion with data like: (EDIT sorry I made a mistake in the data I copied before)

array (
  2 => 
  array (
    'count' => 1275,
    'rank' => 
    array (
      1 => 2,
    ),
  ),
  '5_6' => 
  array (
    'count' => 1,
    'rank' => 
    array (
      1 => 5,
      2 => 6,
    ),
  ),
  3 => 
  array (
    'count' => 1275,
    'rank' => 
    array (
      1 => 3,
    ),
  ),
  1 => 
  array (
    'count' => 1275,
    'rank' => 
    array (
      1 => 1,
    ),
  ),
  4 => 
  array (
    'count' => 1274,
    'rank' => 
    array (
      1 => 4,
    ),
  ),
)

@dom_walden Pushed a patch in attempt to fix recursion issue, seems to be working. please let me know how the test goes.

Change 704454 merged by jenkins-bot:

[mediawiki/extensions/SecurePoll@master] Implement STV algorithm

https://gerrit.wikimedia.org/r/704454

Change 708413 had a related patch set uploaded (by Phuedx; author: TsepoThoabala):

[mediawiki/extensions/SecurePoll@wmf/1.37.0-wmf.16] Implement STV algorithm

https://gerrit.wikimedia.org/r/708413

Change 708413 merged by jenkins-bot:

[mediawiki/extensions/SecurePoll@wmf/1.37.0-wmf.16] Implement STV algorithm

https://gerrit.wikimedia.org/r/708413

Mentioned in SAL (#wikimedia-operations) [2021-07-29T18:19:51Z] <urbanecm@deploy1002> Started scap: rESPO796fe8ef10b5: rESPO927763c0a42d: SecurePoll backports (T283728, T284585)

Mentioned in SAL (#wikimedia-operations) [2021-07-29T18:36:57Z] <urbanecm@deploy1002> Finished scap: rESPO796fe8ef10b5: rESPO927763c0a42d: SecurePoll backports (T283728, T284585) (duration: 17m 06s)

Change 709380 had a related patch set uploaded (by Phuedx; author: Phuedx):

[operations/mediawiki-config@master] vote: Enable Single Transferable Vote

https://gerrit.wikimedia.org/r/709380

Change 709380 merged by jenkins-bot:

[operations/mediawiki-config@master] votewiki: Enable Single Transferable Vote

https://gerrit.wikimedia.org/r/709380

Mentioned in SAL (#wikimedia-operations) [2021-08-02T11:10:02Z] <urbanecm@deploy1002> Synchronized wmf-config/InitialiseSettings.php: 43020b72e8f466188d738aa73f2023f3017804d0: votewiki: Enable Single Transferable Vote (T283728) (duration: 00m 57s)

Change 697899 abandoned by TsepoThoabala:

[mediawiki/extensions/SecurePoll@master] Implement STV tallying in STVTallier::finishTally

Reason:

already shipped with https://gerrit.wikimedia.org/r/c/mediawiki/extensions/SecurePoll/+/704454/

https://gerrit.wikimedia.org/r/697899

Most of the testing of this has been recorded in https://www.mediawiki.org/wiki/Anti-Harassment_Tools/SecurePoll_Improvements/Test_Results and other tickets.

There are a few open bugs left and more testing will be done for those. But, as we have had a successful election with the algorithm, I am happy to move this along.