Page MenuHomePhabricator

Investigate work involved in adding log for election admins for SecurePoll [8 hours]
Closed, ResolvedPublicSpike

Description

Motivation

As part of SecurePoll improvements, we want to add two logs to the tool. This task is for the first one.

Log for scrutinizing elections admins being added or removed

Sometimes elections on some projects are run with only one or two admins without enough people to effectively check for biases and keep each other accountable. Then just before the election ends, more people can be added as admins which gives the false impression that all those people were election admins throughout. To mitigate this, it would be helpful to have a log that indicates the timestamp when election admins were added/removed for a given election. At minimal, it would be helpful to capture:

  • Which election was the new admin added/removed for
  • Who added the new admin
  • Username for the new admin
  • Which action was performed (addition/removal/something else?)
  • Timestamp for this action

This log should be available on a wiki page and downloadable as a CSV or Wikitext.

Requirements for this task:
  • Technical plan for implementing the logs
    • If we run into any technical roadblocks or questions, we should circle back with T&S
  • Get DBA approval on the plan

Event Timeline

Niharika triaged this task as Medium priority.Dec 16 2020, 4:55 PM
Niharika created this task.
Niharika renamed this task from Investigate work involved in adding new logs for SecurePoll to Investigate work involved in adding log for election admins for SecurePoll [8 hours].Dec 16 2020, 5:37 PM
Niharika added a project: Spike.
Restricted Application changed the subtype of this task from "Task" to "Spike". · View Herald TranscriptDec 16 2020, 5:37 PM
Niharika updated the task description. (Show Details)
Niharika updated the task description. (Show Details)

I ended up doing some of this work as part of my research for T270342#6702969. The broad strokes are the same - we'll need a logging table for the SecurePoll extension as a whole which can then be used to log various events.

Logging

Here's the example from CheckUser (cu_log) and my proposal for SecurePoll's table:

sqlite> .schema cu_log
CREATE TABLE cu_log (
 cul_id INTEGER  not null primary key AUTOINCREMENT,
 cul_timestamp BLOB not null,
 cul_user INTEGER  not null,
 cul_user_text TEXT  not null,
 cul_reason TEXT  not null,
 cul_type BLOB not null,
 cul_target_id INTEGER  not null default 0,
 cul_target_text BLOB not null,
 cul_target_hex BLOB not null default '',
 cul_range_start BLOB not null default '',
 cul_range_end BLOB not null default ''
 );
CREATE INDEX cul_user ON cu_log (cul_user, cul_timestamp)
;
CREATE INDEX cul_type_target ON cu_log (cul_type,cul_target_id, cul_timestamp)
;
CREATE INDEX cul_target_hex ON cu_log (cul_target_hex, cul_timestamp)
;
CREATE INDEX cul_range_start ON cu_log (cul_range_start, cul_timestamp)
;
CREATE INDEX cul_timestamp ON cu_log (cul_timestamp)

an example row:

sqlite> select * from cu_log limit 1;
cul_id      cul_timestamp   cul_user    cul_user_text  cul_reason  cul_type    cul_target_id  cul_target_text  cul_target_hex  cul_range_start  cul_range_end
----------  --------------  ----------  -------------  ----------  ----------  -------------  ---------------  --------------  ---------------  -------------
1           20201203023926  1           Admin                      userips     1              Admin

We'd probably have to do something similar (spitballing - id, timestamp, user, user_text, type, election_id, election_text) and use an id like editadmins as well as log the diff (see logging for more details). Something like:

  1. Get existing admins, if any
  2. Get updated admins
  3. array_diff
  4. Log the diff

Admins are added as part of the create/edit page (CreatePage.php) and you cannot add users who do not exist:

image.png (410×1 px, 33 KB)

The validation happens here: https://github.com/wikimedia/mediawiki-extensions-SecurePoll/blob/11b1988728d402ec7dbfd4f851628038471ad08d/includes%2FPages%2FCreatePage.php#L292

This means that at the time we insert/update the database w/this election, we can be assured if admins are being added, they are valid. The properties (including admins) are saved to the db here: https://github.com/wikimedia/mediawiki-extensions-SecurePoll/blob/11b1988728d402ec7dbfd4f851628038471ad08d/includes%2FPages%2FCreatePage.php#L636

Get existing admins (if any)

A new election will not have an id ($this->election->getId()) and therefore no existing admins. If the election does already exist, we can pull $currentAdmins from the database w/a statement like:

			$dbw->selectRow(
				'securepoll_properties',
				'pr_value',
				[
					'pr_key' => 'admins',
					'pr_entity' => $election->getId()
				],
				__METHOD__,
				[ 'FOR UPDATE' ]
			);

Which is the wrapper method(ish) for select pr_value from securepoll_properties where pr_key = "admins" and pr_entity = $ID;

Here's what this looks like in the table:

sqlite> select * from securepoll_properties where pr_key = "admins";
pr_entity   pr_key      pr_value           
----------  ----------  -------------------
1           admins      Admin|Admin2|Admin3
11          admins      Admin|Admin2       
8           admins      Admin              
31          admins      Admin

We can convert this to an array for a diff via explode()

Get updated admins (if any)

If the admins are being updated, it will be available as data on $formData['property_admins'] as an array.

array_diff

New election:
It's a new election so we're adding admins for the first time.

Adding admins:

<?php

$array1 = ['A'];
$array2 = ['A','B'];

var_dump(array_diff($array1,$array2));
// array(0) {
// }
var_dump(array_diff($array2,$array1));
// array(1) {
//   [1]=>
//   string(1) "B"
// }

Removing admins:

<?php

$array1 = ['A', 'B'];
$array2 = ['A'];

var_dump(array_diff($array1,$array2));
// array(1) {
//   [1]=>
//   string(1) "B"
// }
var_dump(array_diff($array2,$array1));
// array(0) {
// }

I didn't dig much deeper than this but I think we can do a diff both ways to determine both additions and subtractions to the Admins list.

Log the diff

We'll have to decide how we want to log this since if we're using the same logging table for both this and #T270342, we might have some extra rows. As far as I can tell, cu_log records only ever need to log who accesses what (read only), whereas we'd be logging both read and write types of interactions in SecurePoll. I don't know what sort of architecture we'd settle on when we're inserting it into the db, but here are all the sources for that information:

Displaying/Downloading

copy-pasta from the other investigation

CheckUser has a /Special:CheckUserLog page where all the access logs can be viewed. It explicitly searches for the following types of log lines: [ 'userips', 'useredits', 'investigate' ] and otherwise allows the user to filter by target/initiator/date range. I mention this because if you look at the cu_log index, it covers all these possible values.

Presumably, we'd do something similar to create a wikipage (/Special:SecurePollLog?).

As for downloading as a wikitext or a csv, SecurePoll does a dump in what appears to be xml format (https://github.com/wikimedia/mediawiki-extensions-SecurePoll/blob/master/includes/Pages/DumpPage.php) by sending it back as a Content-Disposition: attachment which then gets downloaded. SecurePoll also apparently does some kind of csv dumping already (https://github.com/wikimedia/mediawiki-extensions-SecurePoll/blob/master/cli/dumpVoteCsv.php) so I assume it's also possible w/some kind of adjustment to the source (db instead of xml). I'd have to look at other examples but I can't imagine it's not impossible.

As for wikitext, same - I can't imagine its impossible and it would be nice to grok it from other examples (that I'm not aware of atm) but in general, we'd generate the markup ourselves and output it as the correct file type and send it back as an attachment (since we know we can access the database in some kind of array format, it then just leaves post-processing).