Page MenuHomePhabricator

Create schema to track users opting in/out of desktop improvements
Closed, ResolvedPublic5 Estimated Story Points

Description

Background

Due to some issues with the preference update schema, we will need to create a schema that will only track the opt-ins and opt-outs of desktop improvements. This task tracks the creation of that schema.

Target

We are hoping to see less than 40% of logged-in users opting out of the new experience

Acceptance criteria

We will need the following

  • User ID
  • Initial state of skin version
  • Final state of skin version
  • User edit bucket (if applicable)

Developer notes

This schema already exists, but doesn't quite do what we want it to do, so implementation would essentially be copying the existing preference logging in WikimediaEvents, via the same hooks, but specifically looking for the skin version option.

For purpose of estimation, it would probably make sense to update PrefUpdateInstrumentation in WikimediaEvents with some desktop refresh special handling, logged to another schema.

Using onPreferencesFormPreSave to distinguish between the switched skin option.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

@phuedx https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/668530 has been updated to handle the private salt. I'm putting this back in code review for your review

Change 668529 merged by jenkins-bot:

[schemas/event/secondary@master] Add new analytics/pref_diff schema

https://gerrit.wikimedia.org/r/668529

nray reassigned this task from Edtadros to cjming.
nray reassigned this task from cjming to phuedx.
nray moved this task from Code Review to QA on the Readers-Web-Backlog (Kanbanana-FY-2020-21) board.
nray moved this task from QA to Code Review on the Readers-Web-Backlog (Kanbanana-FY-2020-21) board.
nray added a subscriber: Edtadros.
nray added a subscriber: cjming.

Change 668530 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] Add VectorPrefDiffInstrumentation to track all skin changes involving Vector

https://gerrit.wikimedia.org/r/668530

I moved this into Blocked on Others but it's really blocked on… us. We need to generate, configure, and deploy a value for $wgWMEVectorPrefDiffSalt. Regrettably, I haven't yet had time to do this.

Mentioned in SAL (#wikimedia-operations) [2021-04-08T18:51:35Z] <phuedx@deploy1002> Synchronized private/PrivateSettings.php: PrivateSettings: Add value for (T261842) (duration: 01m 06s)

Mentioned in SAL (#wikimedia-operations) [2021-04-08T18:52:57Z] <phuedx> phuedx@deploy1002 Synchronized private/PrivateSettings.php: PrivateSettings: Add value for $wgWMEVectorPrefDiffSalt (T261842)

@MNeisler - looks like this is deployed and ready for QA

The new schema and instrumentation have now been deployed (thank you @phuedx for deploying!) . I believe additions to config (involving wgEventStreams and wgEventLoggingStreamNames) are necessary in order to see the events though which is why I've moved this back to doing

nray removed nray as the assignee of this task.Apr 15 2021, 5:30 AM

Config patches have been created, I'll plan to backport Thursday

Change 679613 had a related patch set uploaded (by Nray; author: Nray):

[operations/mediawiki-config@master] Add mediawiki.pref_diff stream to wgEventLoggingStreamNames/wgEventStreams

https://gerrit.wikimedia.org/r/679613

Change 679614 had a related patch set uploaded (by Nray; author: Nray):

[operations/mediawiki-config@master] Add $wgWMEVectorPrefDiffSalt to private/readme

https://gerrit.wikimedia.org/r/679614

Change 679613 merged by jenkins-bot:

[operations/mediawiki-config@master] Add mediawiki.pref_diff stream to wgEventLoggingStreamNames/wgEventStreams

https://gerrit.wikimedia.org/r/679613

Mentioned in SAL (#wikimedia-operations) [2021-04-15T18:32:01Z] <jdrewniak@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:679613|Add mediawiki.pref_diff stream to wgEventLoggingStreamNames/wgEventStreams (T261842)]] (duration: 01m 18s)

Change 679614 merged by jenkins-bot:

[operations/mediawiki-config@master] Add $wgWMEVectorPrefDiffSalt to private/readme

https://gerrit.wikimedia.org/r/679614

Mentioned in SAL (#wikimedia-operations) [2021-04-15T18:39:46Z] <jdrewniak@deploy1002> Synchronized private/readme.php: Config: [[gerrit:679614|Add $wgWMEVectorPrefDiffSalt to private/readme (T261842)]] (duration: 01m 08s)

@MNeisler As I mentioned in slack I believe all the config necessary to see the opt-in/opt-outs has been setup and deployed. I think it will be presented to you either as mediawiki.pref_diff or analytics/pref_diff (if you don't see anything with the name "pref_diff", that's probably a bad sign

I confirmed that events for this schema are logged and stored in eventlogging data as event.mediawiki_pref_diff.

Leaving this task open as I still need to do a more thorough QA check next week but based on an initial check of the data it appears that events are logging as expected. Thanks @nray and @phuedx for your help getting this deployed!

I completed a QA of the events logged from the new schema (mediawiki_pref_diff).

All events appear to be logging as expected but, when reviewing the data, I noticed some events (about 3.5% of all events) where the final_state was recorded as 'vector' (instead of 'vector1' or 'vector2'). These events were only recorded for users switching from an initial_state of 'vector2' during the reviewed time period (15 April 2021 through 26 April 2021). @nray @phuedx - any ideas as to what these events represent?

See breakdown of preference changes by final selected state below:

Note: Data below reflects user-initiated desktop skin preference update events recorded across all Wikis from 15 April 2021 through 26 April 2021.

initial_statefinal_statetotal_selectionstotal_users
colognebluevector133
colognebluevector233
minervavector1149139
minervavector210195
modernvector19089
modernvector27765
monobookvector110083
monobookvector26665
timelessvector1136133
timelessvector2117112
vector1minerva371361
vector1modern303298
vector1monobook163144
vector1timeless340334
vector1vector2453421
vector2minerva10796
vector2modern11097
vector2monobook5249
vector2timeless9392
vector2vector11753
vector2vector1432406

Here's a link to the notebook where I've been performing the checks. Note the full analysis of data will be done in T269664

@MNeisler Thank you for letting me know.

I looked at the code today, and I'm still perplexed how the vector value is happening. There are only two places in the code where that would even be allowed to happen [1][2], but I haven't been able to explain yet how it would happen in production. @phuedx Would you mind taking a look to make sure I haven't missed anything? I believe it's happening, I'm just confused how it's happening.

I tried every permutation I could think of locally, and I didn't find one instance where an event had the vector value. However, in the midst of trying to replicate this issue, I did discover some edge cases about the hook I'm using that might be undesirable, but is information that you should probably know about as it may influence the proceeding work:

  • At the bottom of the preferences form, there is a button with red text, "Restore all default settings (in all sections)". Unfortunately, I learned today that resetting the form through this button, won't trigger the hook I used for this instrumentation (onPreferencesFormPreSave), and therefore an event won't be sent when a user resets the form. Notably, the prefupdate instrumentation would be notified of this change as it uses a different hook (onUserSaveOptions). However, the prefupdate instrumentation currently only sends an event for skin resets ("Minerva", "Vector") and does not currently send an event for skin version resets ("1", "2") so more work would be required for that instrumentation as well to handle this scenario.
  • I installed the GlobalPreferences extension locally to see if that could be responsible for the vector value. I learned that users who change their preferences through the Global Preferences form won't trigger the hook I used for this instrumentation and therefore an event won't be sent when using the global preferences form. The prefupdate instrumentation does not currently handle this, and an event wouldn't be sent using that either. Additional code added in the GlobalPreferences extension would be needed for an event to fire.
Developer notes

[1] https://github.com/wikimedia/mediawiki-extensions-WikimediaEvents/blob/c0e907a32fd3dd61044e27318f1276d487bb93e9/includes/VectorPrefDiffInstrumentation.php#L135

  • if VectorSkinVersion was not one of the form fields, the null coalescing operator would result in a blank string which would result in vector. However, this field should always be set per Vector Hooks?
  • if VectorSkinVersion was one of the form fields but had a null value, the null coalescing operator would result in a blank string which would result in vector. However, this should always be a string per HTMLLegacySkinVersionField ?

[2] https://github.com/wikimedia/mediawiki-extensions-WikimediaEvents/blob/c0e907a32fd3dd61044e27318f1276d487bb93e9/includes/VectorPrefDiffInstrumentation.php#L91

  • if vector !== vector, vector could be returned ?

I admit that I'm at a loss here too. I too tested this with/without the GlobalPreferences extension loaded and by setting and with/without JavaScript enabled and I can't reproduce this issue.

I do have a couple more observations to add:

  • Looking at the event.mediawiki_pref_diff Hive table, there appear to be instances of the same user submitting the form multiple times in a similar timeframe with the same initial and final values. The events with duplicate user_hash fields have different request IDs so it's reasonable to conclude that the user is submitting the form again
  • Unfortunately, EventLogging::submit doesn't log the URI of the page that the event was logged from
  • HTMLFormField::hasField has a different meaning than array_key_exists( $formData[ ... ] ). The former asks the form whether the field has been defined whereas the latter gets the value of the form field.

[1] https://github.com/wikimedia/mediawiki-extensions-WikimediaEvents/blob/c0e907a32fd3dd61044e27318f1276d487bb93e9/includes/VectorPrefDiffInstrumentation.php#L135

  • if VectorSkinVersion was not one of the form fields, the null coalescing operator would result in a blank string which would result in vector. However, this field should always be set per Vector Hooks?
  • Following on from my last point above, on VectorPrefDiffInstrumentation.php#L113 and L114 we check whether the form has the skin and VectorSkinVersion are defined before continuing, which means that Vector's GetPreferences hook handler has run
  • if VectorSkinVersion was one of the form fields but had a null value, the null coalescing operator would result in a blank string which would result in vector. However, this should always be a string per HTMLLegacySkinVersionField ?

This is the only case that I can think of and we should defend against it. We should also defend against the skin field being bogus data too.

Change 684972 had a related patch set uploaded (by Nray; author: Nray):

[mediawiki/extensions/WikimediaEvents@master] Add more guard statements to VectorPrefDiffInstrumentation

https://gerrit.wikimedia.org/r/684972

This is the only case that I can think of and we should defend against it. We should also defend against the skin field being bogus data too.

Thank you for looking at it @phuedx. The patch above adds more guard statements which should stop those events from being sent. I'm still perplexed what would make the data look like that in the first place though. I've even tried mutating the form fields in the DOM to try to get it to send bad data, but I've not been able to replicate (also the backend code for the form fields appears to have code in place that protects against this).

@Jdlrobson as someone who likes mystery (and if you have time), you might be interested in taking a look at this as well as I'm still not sure what would lead to this :)

nray removed nray as the assignee of this task.May 6 2021, 8:13 PM
nray assigned this task to phuedx.

This is blocked until @nray's patch rides the train. IIRC there were some events with erroneous data logged from group0 wikis, so we should know more by EOD tomorrow.

Change 684972 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] Add more guard statements to VectorPrefDiffInstrumentation

https://gerrit.wikimedia.org/r/684972

Change 690065 had a related patch set uploaded (by Nray; author: Nray):

[mediawiki/extensions/WikimediaEvents@master] Fix 'final_state: vector' bug

https://gerrit.wikimedia.org/r/690065

nray removed phuedx as the assignee of this task.May 13 2021, 12:04 AM
nray assigned this task to phuedx.

Okay, with the helpful context that the recently added logging code sent, I was finally able to reproduce the final_state: vector bug locally.

How to reproduce

If the following conditions are met, this instrumentation would send an event with the final_state field set to vector (incorrect) instead of vector1 (correct) or vector2 (correct):

  1. A user uses the GlobalPreferences form and saves the GlobalPreferences VectorSkinVersion form field unchecked (corresponding to modern Vector).
  2. The above user then changes to the Vector skin from another skin (e.g. from Minerva to Vector) using the default Preferences form (not using the GlobalPreferences form) and without having checked the "Set a local exception for this global preference" for the VectorSkinVersion field.

Although I tried to reproduce this bug using the GlobalPreferences extension earlier, I must have saved the GlobalPreferences form with the VectorSkinVersion form field checked (corresponding to legacy Vector) which doesn't surface this issue.

Why does this happen?

This occurs because:

  • The GlobalPreferences extension disables any form field that the user has set with the GlobalPreferences form if the user is on the default preferences form (not using the GlobalPreferences form) AND if there is not a "local exception" for the form field. [1]
  • If a form field is disabled, the HTMLForm class calls the field's getDefault method instead of the field's loadDataFromRequest method. The former call returns a bool while the latter call returns a string in the form of "1" or "2". [2]
  • Not knowing the preceding point, this instrumentation expected the VectorSkinVersion field to always return a string of either "1" or "2" from the field's loadDataFromRequest method [3]. Regardless of whether the VectorSkinVersion form field returned a bool or a string, this instrumentation would try to concat the value with the skin name. [4] In PHP, "vector" . false === "vector" (which reproduces the issue). What make this even more confusing is that "vector" . true === "vector1" (which doesn't reproduce the issue).

How can we fix it?

To quickly fix this instrumentation's handling of such data, I refactored the generateSkinVersionName method to correctly handle a skin version that is either a bool or a string.

[1] https://github.com/wikimedia/mediawiki-extensions-GlobalPreferences/blob/ccf4c9d470bfc0714119153b592bcc167dceccc6/includes/GlobalPreferencesFactory.php#L175
[2] https://github.com/wikimedia/mediawiki/blob/fca9c972de9333bfbf881fdaa639abe27b9de4da/includes/htmlform/HTMLForm.php#L1861-L1865
[3] https://github.com/wikimedia/Vector/blob/25ee08077345e0709fc9185ced9dd8757653b4ae/includes/HTMLForm/Fields/HTMLLegacySkinVersionField.php#L65-L69
[4] https://github.com/wikimedia/mediawiki-extensions-WikimediaEvents/blob/92450158484308e2416a2591bd7e2dcf278f3cbd/includes/VectorPrefDiffInstrumentation.php#L92

Change 690065 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] Fix 'final_state: vector' bug in VectorPrefDiffInstrumentation

https://gerrit.wikimedia.org/r/690065

@nray's change can be backported next Monday (17th), if necessary. Otherwise, I'd expect to see no related entries in the AdHocDebug channel by next Thursday (20th).

Change 690081 had a related patch set uploaded (by Nray; author: Nray):

[mediawiki/extensions/WikimediaEvents@wmf/1.37.0-wmf.5] Fix 'final_state: vector' bug in VectorPrefDiffInstrumentation

https://gerrit.wikimedia.org/r/690081

Change 690081 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@wmf/1.37.0-wmf.5] Fix 'final_state: vector' bug in VectorPrefDiffInstrumentation

https://gerrit.wikimedia.org/r/690081

Mentioned in SAL (#wikimedia-operations) [2021-05-13T23:22:30Z] <thcipriani@deploy1002> Synchronized php-1.37.0-wmf.5/extensions/WikimediaEvents: Backport: [[gerrit:690081|Fix "final_state: vector" bug in VectorPrefDiffInstrumentation (T261842)]] (duration: 01m 07s)

Change 690789 had a related patch set uploaded (by Nray; author: Nray):

[mediawiki/extensions/WikimediaEvents@wmf/1.37.0-wmf.4] Fix 'final_state: vector' bug in VectorPrefDiffInstrumentation

https://gerrit.wikimedia.org/r/690789

Change 690789 abandoned by Nray:

[mediawiki/extensions/WikimediaEvents@wmf/1.37.0-wmf.4] Fix 'final_state: vector' bug in VectorPrefDiffInstrumentation

Reason:

I think I will let this ride the train instead.

https://gerrit.wikimedia.org/r/690789

nray reassigned this task from Edtadros to MNeisler.

@MNeisler A fix for the final_state: vector bug has been deployed to all wikis. Can you please verify that the data you see is free of anomalies?

I rechecked the data logged in mediawiki_pref_diff and confirmed that we stopped recording final_state = 'vector' events on May 17th.

During my review, I noticed one event where a user switched from initial_state = 'vector1' to final_state = 'wikimediaapiportal' on api.wikimedia.org. See details regarding the event below: Any idea what occurs to fire this event? @phuedx

dateinitial_statefinal_stateedit_countwikinum_selections
2021-05-19vector1wikimediaapiportal0 editsapi.wikimedia.org1

Besides that one event, I don't see any other anomalies and events are logging as expected. @ovasileva I'll plan to go ahead and start on T269664 next week.

QA notebook

I rechecked the data logged in mediawiki_pref_diff and confirmed that we stopped recording final_state = 'vector' events on May 17th.

During my review, I noticed one event where a user switched from initial_state = 'vector1' to final_state = 'wikimediaapiportal' on api.wikimedia.org. See details regarding the event below: Any idea what occurs to fire this event? @phuedx

dateinitial_statefinal_stateedit_countwikinum_selections
2021-05-19vector1wikimediaapiportal0 editsapi.wikimedia.org1

Besides that one event, I don't see any other anomalies and events are logging as expected. @ovasileva I'll plan to go ahead and start on T269664 next week.

QA notebook

Great news - thanks @MNeisler!

@MNeisler - seems like we're all done here! I'm going to go ahead and resolve but feel free to re-open in case there's anything missing

Change 700702 had a related patch set uploaded (by Nray; author: Nray):

[mediawiki/extensions/WikimediaEvents@master] Remove debug logging from VectorPrefDiffInstrumentation

https://gerrit.wikimedia.org/r/700702

Change 700702 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] Remove debug logging from VectorPrefDiffInstrumentation

https://gerrit.wikimedia.org/r/700702