Page MenuHomePhabricator

Enable read new support in Special:CheckUser's 'Get edits' mode
Closed, ResolvedPublic

Description

Now that CheckUser writes to the new tables for event table migration (T330158), reading from these tables when the migration stage is set to read new can commence.

This task is to add support to read new in Special:CheckUser's 'Get edits' mode.

Acceptance criteria
  • Alias the current query info fields in CheckUserGetEditsPager to prepare for migration
  • Add code to read from the two new tables when reading new in CheckUserGetEditsPager
  • Use LogFormatter generated actiontext for entries that are read from the new tables.
  • Link to the specific log ID for entries that are read from the new tables.

Related Objects

StatusSubtypeAssignedTask
ResolvedFeatureDreamy_Jazz
OpenFeatureDreamy_Jazz
ResolvedSecurityZabe
ResolvedSecurityDreamy_Jazz
ResolvedSecurityDreamy_Jazz
ResolvedSecurityDreamy_Jazz
OpenBUG REPORTNone
ResolvedBUG REPORTDreamy_Jazz
ResolvedBUG REPORTDreamy_Jazz
ResolvedDreamy_Jazz
OpenNone
OpenNone
OpenNone
OpenFeatureDreamy_Jazz
DuplicateNone
OpenFeatureNone
ResolvedFeatureDreamy_Jazz
ResolvedFeatureDreamy_Jazz
ResolvedFeatureDreamy_Jazz
OpenFeatureNone
OpenFeatureNone
ResolvedBUG REPORTDreamy_Jazz
DeclinedFeatureNone
OpenFeatureNone
ResolvedGlaisher
OpenFeatureNone
ResolvedDreamy_Jazz
OpenFeatureDreamy_Jazz
ResolvedDreamy_Jazz
OpenNone
ResolvedFeatureDreamy_Jazz
ResolvedGlaisher
Resolved Niharika
ResolvedNone
ResolvedFeatureDreamy_Jazz
DeclinedNone
ResolvedFeatureDreamy_Jazz
ResolvedFeatureDreamy_Jazz
DuplicateNone
OpenNone
DuplicateNone
ResolvedDreamy_Jazz
ResolvedDreamy_Jazz
ResolvedDreamy_Jazz
OpenFeatureNone
ResolvedDreamy_Jazz
ResolvedBUG REPORTDreamy_Jazz
ResolvedBUG REPORTDreamy_Jazz
OpenNone
OpenFeatureNone
DeclinedFeatureNone
OpenFeatureNone
ResolvedDreamy_Jazz
OpenNone
Resolvedtstarling
OpenNone
ResolvedTchanders
OpenBUG REPORTNone
Resolvedkostajh
ResolvedDreamy_Jazz
ResolvedDreamy_Jazz
ResolvedDreamy_Jazz
OpenNone
OpenFeatureDreamy_Jazz
ResolvedDreamy_Jazz
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
ResolvedDreamy_Jazz
ResolvedDreamy_Jazz
ResolvedBUG REPORTDreamy_Jazz

Event Timeline

Change 884132 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/CheckUser@master] Prepare CheckUserGetEditsPager for event table migration

https://gerrit.wikimedia.org/r/884132

Change 884132 abandoned by Dreamy Jazz:

[mediawiki/extensions/CheckUser@master] Prepare CheckUserGetEditsPager for event table migration

Reason:

Replaced with Ia52d8ac125f902fa8549866816300d06327e1a5f

https://gerrit.wikimedia.org/r/884132

Change 886482 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/CheckUser@master] Add support for writing both new and old for event table migration

https://gerrit.wikimedia.org/r/886482

Change 937455 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/CheckUser@master] [WIP] Enable read new support for CheckUserGetEditsPager

https://gerrit.wikimedia.org/r/937455

Change 938006 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/CheckUser@master] Make multiple fixes to CheckUserUnionSelectQueryBuilder

https://gerrit.wikimedia.org/r/938006

Change 938006 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] Make multiple fixes to CheckUserUnionSelectQueryBuilder

https://gerrit.wikimedia.org/r/938006

Maintenance script has been created and is stable enough to allow progress on this.

When testing this patch locally I have generated the following table of times took for the old code vs the new code:

The time displayed in the "Waiting" line in Firefox's Timings tab.

Number of resultsWhen reading oldWhen reading new
5,0009s11.2s
2,5004.45s5.975s
2000.51s0.57s

Time given by DB when running the associated SQL query in the console:

Number of resultsWhen reading oldWhen reading new
5,0000.026s0.047s
2,5000.011s0.029s
2000.006s0.014s

Running the queries on production using the UNION query were very slow, so an alternative method is needed (likely three SQL queries).

Dreamy_Jazz renamed this task from Use the CheckUserUnionSelectQueryBuilder in Special:CheckUser's 'Get edits' mode to Enable read new support in Special:CheckUser's 'Get edits' mode.Aug 30 2023, 10:57 PM
Dreamy_Jazz updated the task description. (Show Details)

Change 954051 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/CheckUser@master] [WIP] Event table migration: Implement read new support for 'get edits'

https://gerrit.wikimedia.org/r/954051

Change 937455 abandoned by Dreamy Jazz:

[mediawiki/extensions/CheckUser@master] Implement read new support for CheckUserGetEditsPager

Reason:

Testing queries in production found that the CheckUserUnionSelectQueryBuilder had unacceptable performance issues when running 'Get edits' over a large range. Using three separate queries was better for performance.

https://gerrit.wikimedia.org/r/937455

Change 957755 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/Flow@master] Add cuc_ prefix to comment_text and comment_data if they are set

https://gerrit.wikimedia.org/r/957755

Change 957755 merged by jenkins-bot:

[mediawiki/extensions/Flow@master] Add cuc_ prefix to comment_text and comment_data if they are set

https://gerrit.wikimedia.org/r/957755

Change 954051 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] Implement read new support for CheckUserGetEditsPager

https://gerrit.wikimedia.org/r/954051

Change 959253 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/CheckUser@master] Remove CheckUserUnionSelectQueryBuilderFactory from SpecialCheckUser

https://gerrit.wikimedia.org/r/959253

Suggested QA steps (requires a local instance to test):

  1. Install the CheckUser extension
  2. Check that no definition of $wgCheckUserEventTablesMigrationStage exists in your LocalSettings.php
  3. Add $wgCheckUserLogLogins = true; into your LocalSettings.php
  4. Create a few testing accounts
  5. Make a few testing edits, log events (such as moves), and log in and out of the testing accounts
  6. Log into an account with the checkuser right
  7. Open Special:CheckUser, enter one account username you created in step 4 to the "Username" field, change the check subtype to 'Get IP Addresses', and then run a check.
  8. Click on the IP address shown in the results list
  9. Ensure the check type is 'Get edits' on this new page and run the check
  10. Verification step 1: Ensure the actions you performed in steps 4 and 5 appear in the results list
  11. Save a screenshot of the results for later comparison
  12. Add $wgCheckUserEventTablesMigrationStage = SCHEMA_COMPAT_NEW; to your LocalSettings.php file
  13. Press the check button on the 'Get edits' check opened in step 9
  14. Verification step 2: Verify that the results list did not visually change, including checking for duplicate result lines.

If QA needs suggested steps that work on patchdemo, please ping me and I will create them.

Differences that I have observed so far between SCHEMA_COMPAT_READ_OLD and SCHEMA_COMPAT_READ_NEW:

  1. Usernames, temporary usernames and IPs are links, have mw-userlink class and are wrapped in <bdi>. Before they were plain text.
  2. Temp user links to special:contribs and has mw-tempuserlink class, so they are properly styled.
  3. Many entries now have two log links (log | logs), for the Special:Log&logid=<log id> and Special:Log&page=<page or user> respectively.
    log_logs.png (100×2 px, 63 KB)
  4. Entries for group membership changes use the public(?) name for user groups, not their internal name. For example, it says administrator rather than sysop.
  5. Entries for blocks have the expiry in a <span class="blockExpiry" title="‎infinity">indefinite</span>, which means you get a tooltip.
  6. It recognises where a page is a redirect and adds redirect=no to the page's URL.
  7. Entries of page suppressions show in full. Before they showed as (username removed) (log details removed).
    page_suppress_log.png (100×1 px, 66 KB)

These all seem OK to me.

The entries do not always appear in the same order when you switch between SCHEMA_COMPAT_READ_OLD and SCHEMA_COMPAT_READ_NEW. It seems to happen if different entries have the same timestamp. I don't know if that matters, although it does make side-by-side comparison a bit harder. Perhaps T346044 will fix this.

Differences that I have observed so far between SCHEMA_COMPAT_READ_OLD and SCHEMA_COMPAT_READ_NEW:

  1. Usernames, temporary usernames and IPs are links, have mw-userlink class and are wrapped in <bdi>. Before they were plain text.
  2. Temp user links to special:contribs and has mw-tempuserlink class, so they are properly styled.
  3. Many entries now have two log links (log | logs), for the Special:Log&logid=<log id> and Special:Log&page=<page or user> respectively.
    log_logs.png (100×2 px, 63 KB)
  4. Entries for group membership changes use the public(?) name for user groups, not their internal name. For example, it says administrator rather than sysop.
  5. Entries for blocks have the expiry in a <span class="blockExpiry" title="‎infinity">indefinite</span>, which means you get a tooltip.
  6. It recognises where a page is a redirect and adds redirect=no to the page's URL.

Thanks. I agree that these are okay (and most are intentional and/or caused by intentional changes)

  1. Entries of page suppressions show in full. Before they showed as (username removed) (log details removed).
    page_suppress_log.png (100×1 px, 66 KB)

I presume this is only when the user account you are using has the viewsuppressed right? If not, then this is an issue.

The entries do not always appear in the same order when you switch between SCHEMA_COMPAT_READ_OLD and SCHEMA_COMPAT_READ_NEW. It seems to happen if different entries have the same timestamp. I don't know if that matters, although it does make side-by-side comparison a bit harder. Perhaps T346044 will fix this.

This is an unfortunate side effect of selecting results from three tables, and then ordering them based on their timestamp. I can try to improve this, but there isn't another easy way to determine what event happened "first" when they have the same timestamp and are from different tables.

  1. Entries of page suppressions show in full. Before they showed as (username removed) (log details removed).
    page_suppress_log.png (100×1 px, 66 KB)

I presume this is only when the user account you are using has the viewsuppressed right? If not, then this is an issue.

Based on a quick inspection (had a few moments to check) it doesn't seem to be checking the deleted status. This will need a fix. This is a security issue, but IMO it is minor as:

  • When reading old, actiontext suppressed or deleted after the fact doesn't get hidden (and this change allows this to happen but mistakenly doesn't apply it).
  • The code isn't used unless read new is set (so can't be done on any wiki, including WMF wikis, unless the configuration was changed to read new).
  • You have to have checkuser rights to see it if read new was enabled.

I have an hour of free time and I'll work on this now.

  1. Entries of page suppressions show in full. Before they showed as (username removed) (log details removed).
    page_suppress_log.png (100×1 px, 66 KB)

I presume this is only when the user account you are using has the viewsuppressed right? If not, then this is an issue.

Based on a quick inspection (had a few moments to check) it doesn't seem to be checking the deleted status. This will need a fix. This is a security issue, but IMO it is minor as:

  • When reading old, actiontext suppressed or deleted after the fact doesn't get hidden (and this change allows this to happen but mistakenly doesn't apply it).
  • The code isn't used unless read new is set (so can't be done on any wiki, including WMF wikis, unless the configuration was changed to read new).
  • You have to have checkuser rights to see it if read new was enabled.

I have an hour of free time and I'll work on this now.

This was filed as https://phabricator.wikimedia.org/T347773 and is ready for QA.

I think everything else is okay so QA shouldn't be blocked for this task (let me know if I've missed something).

I think everything else is okay so QA shouldn't be blocked for this task (let me know if I've missed something).

Thanks. I haven't noticed any more differences other than the ones listed above.

I tested a variety of different actions, including those generated by the populateCheckUserTablesWithSimulatedData.php script.

I performed checks against usernames, temporary users, IPs, ranges and XFF.