Request creation of voterlists VPS project
Closed, Resolved (Public)

Description

Project Name: voterlists

Developer account usernames of requestors: sd, foks

Purpose: Running scripts for generating voter lists for global elections (BoT, UCoC, U4C, etc)

Brief description: Will be used to run https://gitlab.wikimedia.org/sd/global-election-list-builder. Voter lists are currently built in the production realm. This is a replacement that is faster and avoids issues such as T355594. Software used: Node.js, Valkey. Resource needs: 1 or 2 g4.cores16.ram32.disk20 instances (since 8 Node.js processes need to run in parallel, one for each section in the wiki replicas, and each process can use up to 4 GB of memory).
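
The sizing above can be checked with a quick sketch. It assumes the replica sections are named s1 through s8 and that each process is capped with Node.js's `--max-old-space-size` flag, which limits the V8 heap rather than total process memory. The script name `build-list.js` and its `--section` argument are hypothetical placeholders, not the tool's real interface.

```python
# Hypothetical launch plan: one Node.js process per wiki replica section.
SECTIONS = [f"s{i}" for i in range(1, 9)]   # assumed section names s1..s8
HEAP_LIMIT_MB = 4096                         # "up to 4 GB" per process
INSTANCE_RAM_MB = 32 * 1024                  # ram32 in g4.cores16.ram32.disk20


def build_commands(script="build-list.js"):
    """Return one command line per section (script name and --section are placeholders)."""
    return [
        ["node", f"--max-old-space-size={HEAP_LIMIT_MB}", script, "--section", section]
        for section in SECTIONS
    ]


commands = build_commands()
peak_heap_mb = len(commands) * HEAP_LIMIT_MB

print(f"{len(commands)} processes, up to {peak_heap_mb} MB of heap in total")
print("fits in one instance:", peak_heap_mb <= INSTANCE_RAM_MB)
```

With all eight processes at their cap, the heap alone would fill a 32 GB instance and leave no headroom for Valkey or the operating system, which may be why the request allows for a second instance.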

How soon you are hoping this can be fulfilled: This month?

Event Timeline

+1, this could be an interesting project to port to Toolforge once we have persistent volumes available for Valkey (not in the short term, though).

-1, it is forbidden to put private SQL tables on Wikimedia Cloud. This task does not give a reason to move it. I also find the reasoning in T355594 very lackluster. The issues mentioned there should rather be fixed by moving the table out of a wiki section and over to something like x3. It would not be the first SQL table to be moved that way.

From modules/mediawiki/files/mariadb/tables-catalog.yaml:

name: securepoll_voters
source: securepoll_main
canonicality: canonical
visibility: private

Putting this on hold until this concern is addressed.

Raymond_Ndibe changed the task status from Open to Stalled. Jul 15 2025, 11:36 AM

This would not be equivalent to moving the securepoll_voters table, which is the list of people who voted.

A closer equivalent would be moving the securepoll_lists table, which is the list of people eligible to vote (and some other lists).

It looks like all SecurePoll SQL tables are marked as private for some reason: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/mediawiki/files/mariadb/tables-catalog.yaml#1260

I disagree with all these tables being private. I think about half can be changed to public. See T381197: Create views for SecurePoll db tables on Wiki Replicas

But that might be beside the point. I think the data ingested by the above ToolForge tool would use the replicas and just ingest user_name and user_editcount. Is this correct, @SD0001? If so, I don't see any privacy issues here. I think those fields of the user table are all public data.

I agree with @Novem_Linguae that the data being consumed is public, so there is no privacy issue. If the concern is transparency, the tables can be made publicly queryable, as @Novem_Linguae mentions, or the results can be published on wiki (as they are for enwiki ArbCom elections, for example). I see no problem with having a VPS project for this.

I think the data ingested by the above ToolForge tool would use the replicas and just ingest user_name and user_editcount. Is this correct, @SD0001?

Essentially yes, though not those columns precisely. It ingests the user ids and counts of edits in the given time windows for each wiki, and aggregates them. All data is taken from the wiki replicas, so naturally it's all public data. The VPS won't have any access to private data in production, of course.
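
A minimal sketch of that aggregation step, assuming each wiki yields rows of (user key, edit count in the window) and that the key already identifies the same person across wikis. The row format, the `aggregate` and `eligible` helpers, the edit threshold, and the sample data are all illustrative assumptions, not taken from global-election-list-builder.

```python
from collections import Counter


def aggregate(per_wiki_counts):
    """Sum per-wiki edit counts into one global total per user key."""
    totals = Counter()
    for rows in per_wiki_counts.values():
        for user_key, edits in rows:
            totals[user_key] += edits
    return totals


def eligible(totals, min_edits):
    """Return user keys whose global total meets a (hypothetical) threshold."""
    return sorted(user for user, count in totals.items() if count >= min_edits)


# Made-up sample data: wiki name -> list of (user key, edits in window).
sample = {
    "enwiki": [(101, 250), (102, 10)],
    "dewiki": [(101, 60), (103, 300)],
}

totals = aggregate(sample)
print(totals[101])                       # 310
print(eligible(totals, min_edits=300))   # [101, 103]
```

Only counts keyed by user are combined here, which matches the point made above that nothing beyond public replica data is needed.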

fnegri changed the task status from Stalled to In Progress. Jul 24 2025, 2:41 PM

Creating with 32 cores and 64G of memory, which should be enough for 2 instances of type g4.cores16.ram32.disk20.

fnegri@cloudcumin1001:~$ sudo cookbook wmcs.vps.create_project --user sd --user foks --cores 32 --ram 64G --cluster-name eqiad1 --project voterlists --description "Running scripts for generating voter lists for global elections (BoT, UCoC, U4C, etc)." --task-id T399418
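
The quota arithmetic behind that command can be sketched as follows, assuming the flavor name encodes its core count, RAM in GB, and disk in GB. The `parse_flavor` and `fits_quota` helpers are hypothetical and are not part of the cookbook.

```python
import re


def parse_flavor(name):
    """Extract cores, RAM (GB) and disk (GB) from a name like g4.cores16.ram32.disk20."""
    match = re.fullmatch(r"g\d+\.cores(\d+)\.ram(\d+)\.disk(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognised flavor name: {name}")
    cores, ram_gb, disk_gb = (int(group) for group in match.groups())
    return {"cores": cores, "ram_gb": ram_gb, "disk_gb": disk_gb}


def fits_quota(flavor_name, instances, quota_cores, quota_ram_gb):
    """Check whether N instances of a flavor fit within a core and RAM quota."""
    flavor = parse_flavor(flavor_name)
    return (flavor["cores"] * instances <= quota_cores
            and flavor["ram_gb"] * instances <= quota_ram_gb)


# Two instances use exactly 32 cores and 64 GB; a third would exceed the quota.
print(fits_quota("g4.cores16.ram32.disk20", 2, quota_cores=32, quota_ram_gb=64))  # True
print(fits_quota("g4.cores16.ram32.disk20", 3, quota_cores=32, quota_ram_gb=64))  # False
```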

Mentioned in SAL (#wikimedia-cloud-feed) [2025-07-24T15:53:21Z] <fnegri@cloudcumin1001> START - Cookbook wmcs.vps.create_project for project voterlists in eqiad1 (T399418)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-07-24T15:54:07Z] <fnegri@cloudcumin1001> END (PASS) - Cookbook wmcs.vps.create_project (exit_code=0) for project voterlists in eqiad1 (T399418)