Page MenuHomePhabricator

Refactor Table of Contents A/B test to bucket by user
Closed, ResolvedPublic3 Estimated Story Points

Description

Background

While working on preserving the TOC A/B test code in T310527, the next iteration of the experiment will test logged-in users by bucketing by user ID rather than page ID. This requires a refactor of the TOC A/B test feature.

Description

Refactor the TOC A/B test to bucket by user instead of by page (excluding anons).

Acceptance criteria

  • When TOC A/B test is enabled, it applies only to logged-in users.
  • Control group sees old treatment of TOC
  • Control group loads legacy TOC styles (skins.vector.AB.styles)
  • Treatment group sees new treatment of TOC
  • Treatment group doesn't load legacy TOC styles (skins.vector.AB.styles) << path of least resistance is to load AB overriding styles when test is enabled
  • Unsampled users see whatever is specified by default config. << with the current implementation and test scope, all logged-in users will be sampled so users will be in either control or treatment buckets

Developer notes

Previous TOC A/B test functionality leveraged backend bucketing based on page ID using the PageSplitter object referenced by the WebABTestArticleIdStrategy and WebABTestArticleIdFactory classes in WME and applies the necessary classes to the body using a Vector hook.

To switch to using user ID, we can use the Vector hook to assign logged users in the treatment or control groups and assigning relevant body classes accordingly. However the current state of CSS for showing the appropriate TOC in an A/B test context does not account for grid being active so there is some added complexity to display the correctly styled TOC/page with grid enabled.

It should simplify showing the correct TOC to control/treatment buckets once grid has landed (instead of having to account for both use cases of grid being on or off). In SkinVector22:getTemplateData(), the data that's passed to the template only considers whether grid is enabled which means many of the styles related to the classes (i.e. vector-layout-grid vector-toc-visible vs vector-layout-legacy) applied by that method need to be overridden during an active A/B test.

QA steps

  • Set up local event logging with local config (toggling the enabled key between true and false):
$wgVectorWebABTestEnrollment = [
	'name' => 'skin-vector-toc-experiment',
	'enabled' => true,
	'buckets' => [
		'unsampled' => [
			'samplingRate' => 0,
		],
		'control' => [
			'samplingRate' => 0.5,
		],
		'treatment' => [
			'samplingRate' => 0.5,
		]
	]
];
  • With the experiment enabled:
    • Log in as a user with an even id and navigate to a page with a TOC:
      • The treatment should be visible - collapsible TOC (no TOC inline in main content)
      • There should be a stylesheet called skins.vector.AB.styles upon inspection via console
      • An event is logged with group property set to treatment
    • Log in as a user with an odd id and navigate to a page with a TOC:
      • The treatment should not be visible - TOC inline in main content (no collapsible TOC)
      • There should be a stylesheet added called skins.vector.AB.styles upon inspection via console
      • An event is logged with group property set to control
    • Navigate to a page with a TOC as an anonymous user:
      • You should see TOC as defined by default config (collapsible TOC - no inline TOC)
      • There should be a stylesheet added called skins.vector.AB.styles upon inspection via console
      • No event should be logged
  • With the experiment disabled:
    • Log in as a user with an even or odd id and navigate to a page with a TOC:
      • You should see TOC as defined by default config (collapsible TOC - no inline TOC)
      • There should be no stylesheet called skins.vector.AB.styles upon inspection via console
      • No event should be logged
    • Navigate to a page with a TOC as an anonymous user:
      • You should see TOC as defined by default config (collapsible TOC - no inline TOC)
      • There should be no stylesheet called skins.vector.AB.styles upon inspection via console
      • No event should be logged

Event Timeline

Change 815788 had a related patch set uploaded (by Clare Ming; author: Clare Ming):

[mediawiki/skins/Vector@master] POC Refactor TOC A/B test to bucket by user

https://gerrit.wikimedia.org/r/815788

Attached POC patch has problems -- most notably by moving bucketing to the client, the classes needed to show correctly styled TOC to control/treatment groups are added/removed by js (is there a better approach?) and causes the dreaded FOUC for some use cases. This can probably avoided by setting the default CSS to not show the new TOC on initial page load.

We should just use the backend Vector hook to assign logged users to the appropriate bucket -- attached POC is not the approach we want to take

ovasileva raised the priority of this task from Medium to High.Jul 26 2022, 2:57 PM
cjming updated the task description. (Show Details)

Change 815788 abandoned by Clare Ming:

[mediawiki/skins/Vector@master] POC Refactor TOC A/B test to bucket by user

Reason:

different approach needed

https://gerrit.wikimedia.org/r/815788

Change 819694 had a related patch set uploaded (by Clare Ming; author: Clare Ming):

[mediawiki/skins/Vector@master] Refactor TOC A/B test to bucket users on backend

https://gerrit.wikimedia.org/r/819694

hi @ovasileva cc @Jdlrobson @jwang

For the 2nd iteration of the TOC A/B test, I'm assuming we don't have to worry about unsampled users because we are testing only logged-in users for 2 smaller wikis ? The refactor of the relevant test (all temporary code that will be removed once test is over) basically throws users with even ids into the treatment bucket and users with odd ids into the control bucket but I will need to futz with it more if we have to account for unsampled users.

I made a note about this next to the last AC in the ticket description.

cjming removed cjming as the assignee of this task.Aug 8 2022, 4:59 PM

discussed during SHDT -- we will go with the even/odd user id bucketing method which assumes 100% sampling of all logged-in users. This method of bucketing does not take into account sampling rates passed in via config.

Change 819694 merged by jenkins-bot:

[mediawiki/skins/Vector@master] Refactor TOC A/B test to bucket users on backend

https://gerrit.wikimedia.org/r/819694

hi @jwang @ovasileva -- there was a note in this week's planning doc to figure out which wikis would be part of the 2nd round of the TOC A/B experiment.

As a refresher, we deployed the 1st TOC A/B test to the following:

  • euwiki + hewiki at 100% sampling rate T306606
  • pilot wikis at 100% sampling rate with frwiki + ptwiki at 50% sampling rate T306607

For the 2nd round, we will be only testing logged-in users at 100% -- is it correct to assume we'll be running this on all the pilot wikis again same as for round 1?

hi @jwang @ovasileva -- there was a note in this week's planning doc to figure out which wikis would be part of the 2nd round of the TOC A/B experiment.

As a refresher, we deployed the 1st TOC A/B test to the following:

  • euwiki + hewiki at 100% sampling rate T306606
  • pilot wikis at 100% sampling rate with frwiki + ptwiki at 50% sampling rate T306607

For the 2nd round, we will be only testing logged-in users at 100% -- is it correct to assume we'll be running this on all the pilot wikis again same as for round 1?

@jwang - would it be possible to repeat the test on a smaller set of wikis? Potentially we can do 5 of the pilot wikis? Since this test is logged-in only, I think it would also be safe to have sampling be at 100%

Hi Olga, as we discussed in the meeting, a smaller set of wikis still works for analysis, just will limit the model selection. With a set of 5 wikis, we cannot use t-test analysis, but we still can select the other modeling methods to analyze the session based data.

This isn't working. With the following config I expect to see the TOC in the sidebar but I don't - it always displays in the article:

	$wgVectorWebABTestEnrollment = [
		'name' => 'skin-vector-toc-experiment',
		'enabled' => true,
		'buckets' => [
			'unsampled' => [
				'samplingRate' => 0,
			],
			'control' => [
				'samplingRate' => 0,
			],
			'treatment' => [
				'samplingRate' => 1,
			]
		]
	];

@Jdlrobson maybe i misunderstood this ticket -- basically sampling rates will not matter for this config -- if the test is enabled, it assumes 100% sampling of logged-in users such that users with even ids will see the treatment, odd users won't. So if you try a login with an even user id, you should see the TOC in the sidebar.

See https://github.com/wikimedia/Vector/blob/master/includes/FeatureManagement/Requirements/TableOfContentsTreatmentRequirement.php#L76-L80

If this is problematic, I can try to re-work to take sampling rates into account (though I'm unclear how this would work). I was following the approach of other recent experiments (from our team as well as other team's extensions) that seem to make use of even/odd user ids (and not sampling rates) to bucket logged-in users.

But I'm also a bit puzzled about this experiment now -- unless I'm mistaken, seems like most of the pilot wikis (I didn't test all of them) have the new TOC in the sidebar for anons. Would we run the experiment even tho as anons, they see the TOC in the sidebar but when users log in, some will see the TOC inline while others see the TOC in the sidebar?

Change 828066 had a related patch set uploaded (by Jdlrobson; author: Jdlrobson):

[mediawiki/skins/Vector@master] AB test: Complain when TOC experiment setup incorrectly

https://gerrit.wikimedia.org/r/828066

if the test is enabled, it assumes 100% sampling of logged-in users such that users with even ids will see the treatment, odd users won't. So if you try a login with an even user id, you should see the TOC in the sidebar.

Ack! I remember now. Yes this sounds like confusion on my part. I think the software could have helped me by throwing a runtime exception so I suggest adding a check to isMet (and throw when expectations not met). I've raised a patch to make this more obvious. Once that's merged happy to resolve.

But I'm also a bit puzzled about this experiment now -- unless I'm mistaken, seems like most of the pilot wikis (I didn't test all of them) have the new TOC in the sidebar for anons. Would we run the experiment even tho as anons, they see the TOC in the sidebar but when users log in, some will see the TOC inline while others see the TOC in the sidebar?

I believe so yes. Most users stay logged in, but that's really a consideration for Olga when we turn this on.

Change 828066 merged by jenkins-bot:

[mediawiki/skins/Vector@master] AB test: Complain when TOC experiment setup incorrectly

https://gerrit.wikimedia.org/r/828066

cjming assigned this task to Jdlrobson.