
Add rich snippet instrument to WikimediaEvents
Closed, Resolved · Public · 5 Estimated Story Points

Description

User story

As a product owner, I'd like pages to be bucketed randomly for A/B testing of rich snippets.

Background

Coming out of T299215, we can re-purpose the code artifacts from work done to add split testing to page schemas in order to bucket pages into distinct test groups (i.e. control and treatment) for the purpose of running A/B tests.

Once that random page split tester is in place, we can apply the new treatment using a WME hook to override the header of a qualified page in the treatment group, as well as log an indicator of whether the new treatment was applied on all test pages (control and treatment) for data analysis of the experiment.

Acceptance criteria

  • PageSplitTester class and needed hooks are added to WME.
  • The max-snippet:400 directive (per T299133#7707177) is added to the relevant meta tag for treatment pages.
  • A new max-snippet key-value pair is added to the X-Analytics header for both control and treatment pages.
  • The new max-snippet key-value pair is documented on the X-Analytics Wikitech page. << TK once patch passes code review

Developer notes

The PageSplitTester class was introduced in T206868 (relevant patch) and removed in T209377 (relevant patch).

Config for the sameAs property A/B test was introduced in T208755 (relevant patch) and removed in T209377 (relevant patch). We can leverage similar config for the current rich snippets A/B instrument when it is ready to be deployed.

The first step is to reconfigure the PageSplitTester class (originally done in the Wikibase extension) to be part of WikimediaEvents:

  • Add a new instrumentation class to WME with updated methods/properties from the PageSplitTester class.
  • Add isSchemaTreatmentActiveForPageId to WikimediaEventsHooks.php.
  • Check whether the current page is in the treatment group using isSchemaTreatmentActiveForPageId, and add a handler for the onOutputPageAfterGetHeadLinksArray hook to the new class that changes the value of $tags['meta-robots'] if it exists, or adds the tag if it is missing.
  • Append max-snippet:400 to the content key of $tags['meta-robots'] for treatment pages only.
  • For both treatment and control pages, inside the onOutputPageAfterGetHeadLinksArray hook, add a boolean value for a new max-snippet key in the X-Analytics header: true for treatment, false for control (see the sketch below).
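For orientation, here's a minimal sketch of the hook flow described above. It is hypothetical, not the merged patch: it assumes isSchemaTreatmentActiveForPageId() returns null for unsampled pages, false for control, and true for treatment, and that the XAnalytics extension's XAnalytics::addItem() helper is available.

// Sketch only -- names and return conventions are assumptions, not the
// merged WikimediaEvents patch.
public static function onOutputPageAfterGetHeadLinksArray( array &$tags, OutputPage $out ) {
	$pageId = $out->getTitle() ? $out->getTitle()->getArticleID() : 0;
	$treatment = self::isSchemaTreatmentActiveForPageId( $pageId );
	if ( $treatment === null ) {
		// Unsampled page: no directive, no X-Analytics key.
		return;
	}
	// Log the bucket for both groups: 1 = treatment, 0 = control.
	\XAnalytics::addItem( 'max-snippet', $treatment ? '1' : '0' );
	if ( !$treatment ) {
		return;
	}
	if ( isset( $tags['meta-robots'] ) ) {
		// $tags['meta-robots'] is an HTML string, so append the directive
		// to its content attribute (see the note about refactoring
		// OutputPage further down).
		$tags['meta-robots'] = str_replace(
			'content="', 'content="max-snippet:400,', $tags['meta-robots']
		);
	} else {
		$tags['meta-robots'] = Html::element( 'meta', [
			'name' => 'robots',
			'content' => 'max-snippet:400',
		] );
	}
}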

QA steps on local

  • To test the bucketing, sampling, and max-snippets locally, you'll need to install the XAnalytics extension and add the following config in local settings:
wfLoadExtension( 'XAnalytics' );
$wgWMEPageSchemaSplitTestSamplingRatio = 1;
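// Assumption: a sampling ratio of 1 puts every page with a valid page ID
// into the experiment, split between control and treatment.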
  • You will also need a few pages created locally (not via content provider) in order to get bona fide pageIds for bucketing/sampling.
  • Navigate to a page and check the response headers. For pages in the treatment group, you will see "max-snippet=1" in the X-Analytics response header and a corresponding meta robots tag with the max-snippet directive in the console:

[Screenshot: Screen Shot 2022-02-17 at 3.28.23 PM.png]

  • Navigate to another locally created page until you find one that has the X-Analytics response with "max-snippet=0" and no max-snippet directive in any robots meta tag. This means the page is sampled and bucketed in the control group:

[Screenshot: Screen Shot 2022-02-17 at 3.31.00 PM.png]

  • To test unsampled pages (which should have max-snippet neither in the X-Analytics header nor in the console), navigate to a local page that is populated by the content provider. These pages have a null pageId and therefore won't contain any of the max-snippet markers.
  • To make sure pages are not sampled and not bucketed when the A/B test is disabled, comment out $wgWMEPageSchemaSplitTestSamplingRatio from local settings.

QA steps on beta cluster

  • See T301584#7812096 for sample URLs (with cache-busting query params) to test the robots meta tag containing the max-snippet directive. The X-Analytics header is not presently visible in the response headers on the beta cluster (even though the extension is installed); this is still being investigated.

QA Results - Beta

AC | Status | Details
1  | ✅ Pass | T301584#7864568

Event Timeline


Note: $tags['meta-robots'] is a string of HTML, so if the content is anything complicated we may want to consider refactoring OutputPage to allow easier generation of the robots tag, e.g. OutputPage::getRobotsTag( 'max-snippet=400' );

$p = "{$this->mIndexPolicy},{$this->mFollowPolicy}";
if ( $p !== 'index,follow' ) {
	// http://www.robotstxt.org/wc/meta-user.html
	// Only show if it's different from the default robots policy
	$tags['meta-robots'] = Html::element( 'meta', [
		'name' => 'robots',
		'content' => $p,
	] );
}
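A rough sketch of what such a helper might look like (hypothetical signature and body; the core change that eventually merged was titled "Add option to amend robots meta tag"):

// Sketch of the suggested OutputPage helper, not the merged core patch.
public function getRobotsTag( string $extraDirectives = '' ): string {
	$policy = "{$this->mIndexPolicy},{$this->mFollowPolicy}";
	if ( $extraDirectives !== '' ) {
		$policy .= ",{$extraDirectives}";
	}
	// Only emit the tag when it differs from the default robots policy.
	if ( $policy === 'index,follow' ) {
		return '';
	}
	return Html::element( 'meta', [
		'name' => 'robots',
		'content' => $policy,
	] );
}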

Change 763469 had a related patch set uploaded (by Clare Ming; author: Clare Ming):

[mediawiki/extensions/WikimediaEvents@master] Add rich snippet instrument

https://gerrit.wikimedia.org/r/763469

cjming updated the task description.

hi @ovasileva - just a heads up on status for this ticket: it got blocked during code review over a legitimate concern from the database team about querying the page table for the page_random field (what was proposed for random distribution of pages for experiment bucketing) on every web request. This wasn't caught/flagged when this team implemented the solution back in 2018, but I suspect it may have contributed to the major database overloads/outages that apparently occurred around that time.

One possible, simple workaround I will inquire about here with the data analyst:

hi @jwang -- question for you about the data analysis for the rich/section snippet experiment. The main reason page_random was originally selected is its persistence and stability. But given the unacceptable performance degradation of querying for this field on every web request over the lifetime of the proposed experiment, we need an alternative to page_random.

Would it be acceptable to pivot to using page_id instead for random distribution of pages, despite its instability (i.e. deleted pages are not guaranteed to retain the same IDs upon restoration -- https://www.mediawiki.org/wiki/Manual:Page_table#page_id)? This would mean deleted pages would need to be removed from the sample. If it's possible to exclude these deleted pages during analysis, we have a relatively straightforward path to bucketing pages for the experiment. If not, we may need to rethink how to approach random bucketing of pages.

cc @phuedx

@cjming, Can you give me some background on:

  • How page_random is generated?
  • Does each page have only one page_random id?
  • How do we sample the pages?
  • What's the metric that we will measure in AB test?

In general, I want to understand whether bucketing will easily skew in the test.

  • How page_random is generated?

From what I understand, it's randomly generated and populated upon page creation - https://www.mediawiki.org/wiki/Manual:Random_page

  • Does each page have only one page_random id?

I believe so

  • How do we sample the pages?

The proposal here is to randomly assign pages into buckets (per sampling rates passed in as config) based on a hashed page_id (which we get automatically) rather than page_random. An older experiment used that field, but it queried the database for it on every web request, which is not possible now and probably shouldn't have been possible back then; we are trying to re-purpose the code from that old experiment but need to adjust this piece.

I'm erring on the side that the concern about using a hashed page_id instead of page_random with respect to deleted/restored pages is negligible relative to the full sample of pages.
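For illustration, here is a minimal sketch of hashing a page ID into a uniform [0, 1) value that can drive both sampling and bucketing (function names are hypothetical, not the WME implementation):

// Map a page ID to a stable, roughly uniform value in [0, 1).
function pageIdToFraction( int $pageId ): float {
	// md5 is deterministic, so a page always lands in the same bucket.
	$hash = hexdec( substr( md5( (string)$pageId ), 0, 8 ) );
	return $hash / ( 0xFFFFFFFF + 1 );
}

// A page is sampled if its hash falls below the configured ratio...
function isSampled( int $pageId, float $samplingRatio ): bool {
	return $pageId > 0 && pageIdToFraction( $pageId ) < $samplingRatio;
}

// ...and sampled pages are split 50/50 using a different slice of the
// hash, so the sampling and bucketing decisions stay independent.
function isTreatment( int $pageId ): bool {
	$hash = hexdec( substr( md5( (string)$pageId ), 8, 8 ) );
	return ( $hash % 2 ) === 1;
}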

  • What's the metric that we will measure in AB test?

For this particular experiment, we're testing the rich section snippet (the max-snippet directive added to the head of pages in the treatment group, plus the max-snippet key-value pair added to the X-Analytics header for all pages in both the treatment and control groups) and how it affects the search-result performance of sampled pages. Here's the original proposal: T299215.

In general, I want to understand whether bucketing will easily skew in the test.

I don't know if there are other concerns around using page_id for bucketing. The main one pointed out to me was the issue of persistence, but I'm hoping the number of pages affected is statistically insignificant and that, regardless, deleted/restored pages can be excluded during analysis.

I noticed the hypotheses in T299215 are for the pageview metric.

Hypotheses in T299215
- Pages that are in the test group (which are marked to display snippets within Google search results) will have more pageviews from external search referrals than the control group for all languages studied
- Overall pageviews in the test group will not be affected negatively for all languages studied

If these are still the hypotheses we want to test with the current instrumentation plan, randomizing on page (page_id or random page id), we need to modify the hypotheses in the original proposal T299215. I am afraid pageviews from pages in the control and test groups are hard to bucket evenly, because each page naturally has a different level of traffic, which can also fluctuate due to unpredictable global events. I propose the alternative hypotheses below, which test metrics that are more robust to external factors. @ovasileva Let me know whether they are OK for you to achieve the same measurement goals.

Proposed hypotheses:

  • The percentage of pageviews from external search referrals on the pages in the test group (which are marked to display snippets within Google search results) will be higher than in the control group for all languages studied.
  • The percentage of pageviews from external search referrals on the pages in the test group will not be affected negatively for all languages studied.

@jwang - confirming that we can go forward with these hypotheses

cjming removed cjming as the assignee of this task. Mar 1 2022, 8:10 PM
cjming assigned this task to phuedx.
cjming moved this task from Doing to Code Review on the Web-Team-Backlog (Kanbanana-FY-2021-22) board.
cjming removed cjming as the assignee of this task. Mar 9 2022, 5:47 AM
cjming moved this task from Doing to Code Review on the Web-Team-Backlog (Kanbanana-FY-2021-22) board.

Change 769786 had a related patch set uploaded (by Clare Ming; author: Clare Ming):

[mediawiki/core@master] Add a method to OutputPage to amend robots policy meta tag

https://gerrit.wikimedia.org/r/769786

Change 769786 merged by jenkins-bot:

[mediawiki/core@master] Add option to amend robots meta tag

https://gerrit.wikimedia.org/r/769786

Change 763469 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] Add rich snippet instrument

https://gerrit.wikimedia.org/r/763469

Change 773331 had a related patch set uploaded (by Jdlrobson; author: Jdlrobson):

[operations/mediawiki-config@master] Enable split A/B testing on beta cluster

https://gerrit.wikimedia.org/r/773331

Change 773331 merged by jenkins-bot:

[operations/mediawiki-config@master] Enable split A/B testing on beta cluster

https://gerrit.wikimedia.org/r/773331

Mentioned in SAL (#wikimedia-operations) [2022-03-23T21:24:00Z] <cjming@deploy1002> Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:773331|Enable split A/B testing on beta cluster (T301584)]] (duration: 00m 50s)

On beta cluster I'm currently seeing <meta name="robots" content="noindex,nofollow"/> on all articles, so I think this needs more work (I expected to see either no robots tag OR the max-snippet set, since on production we don't output a robots tag). @cjming I have a question for you on https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/773331

thanks @Jdlrobson - so testing just now, I do see the robots tag on a few pages:

https://en.wikipedia.beta.wmflabs.org/wiki/Spain?foo=bar&safemode=1

[Screenshot: Screen Shot 2022-03-28 at 11.54.59 AM.png]

https://en.wikipedia.beta.wmflabs.org/wiki/Domestication?foo=bar&safemode=1

[Screenshot: Screen Shot 2022-03-28 at 12.14.58 PM.png]

While on other pages in beta cluster I don't see the robots tag at all - i.e. https://en.wikipedia.beta.wmflabs.org/wiki/Barack_Obama?foo=bar&safemode=1

[Screenshot: Screen Shot 2022-03-28 at 12.22.41 PM.png]

I was stumped about this until this thread in the Slack engineering channel popped up a short while ago. I don't know if that is the reason it suddenly seems to be working, because at the end of last week I wasn't seeing the meta robots tag on any pages.

I'm still trying to investigate why we aren't seeing X-Analytics headers in beta cluster, but for now does this suffice to move to QA?

cjming moved this task from Doing to Code Review on the Web-Team-Backlog (Kanbanana-FY-2021-22) board.
Jdlrobson moved this task from Code Review to QA on the Web-Team-Backlog (Kanbanana-FY-2021-22) board.

For QA purposes, the beta cluster is set up to show 50% of pages with max-snippet and 50% without.

An update on the latest sleuthing around the elusive X-Analytics header and why it appears to be missing from beta cluster (and from production for that matter). Big thanks to @phuedx and @Ladsgroup for advising me as I descend down the rabbit hole of this quandary.

Apparently Apache Traffic Server strips the X-Analytics header before responses are sent to clients, which makes sense since the header is used for internal analytics purposes.

So to try to verify that the X-Analytics header is appending the correct values for this instrument, I ssh'd into the beta cluster server to curl a test treatment page that is adding the meta robots tag correctly (i.e. https://en.wikipedia.beta.wmflabs.org/wiki/Spain?foo=bar&safemode=1 -- at the very least we can confirm that the WME hook to add the max-snippet directive is working):

curl -H Host:en.wikipedia.beta.wmflabs.org --head http://<IP redacted>/wiki/Spain

Interestingly the response header returned from the above curl command did not contain the X-Analytics header either.

In the meantime, for my future self, here's the query that can be run in Hue to verify that the data is being sent after the decision is made to deploy this instrument:

SELECT x_analytics_map['max-snippet'] FROM webrequest WHERE x_analytics_map['max-snippet'] IS NOT NULL AND year = 2022 AND month = 3 AND day = 31 AND hour = 1;

I will continue investigating to see if there's another way to confirm that the WME hook is correctly adding the key-value pair in the X-Analytics header since I'd rather not rely on data collection post-deployment to see whether this is working.

Latest update about the X-Analytics header. Ginormous thanks to @Krinkle for schooling me on how to properly curl for this information:

$ curl -I 'https://en.wikipedia.beta.wmflabs.org/wiki/Spain' --connect-to ::localhost
HTTP/1.1 200 OK
date: Fri, 15 Apr 2022 23:53:04 GMT
server: <beta cluster>
x-analytics: ns=0;page_id=<redacted>;max-snippet=1

So happy to report that this instrument appears to be doing what it intends: adding the max-snippet directive to the robots meta tag and the max-snippet key to the X-Analytics header on treatment pages of an enabled experiment.

Test Result - Beta

Status: ✅ PASS
Environment: beta
OS: macOS Monterey
Browser: Chrome
Device: MBP
Emulated Device: NA

Test Artifact(s):

QA Steps

✅ AC1: See T301584#7812096 to test sample urls (with cache-busting query params) for the robots meta tag containing the max-snippet directive.

[Screenshot: Screen Shot 2022-04-19 at 8.53.32 AM.png]

[Screenshot: Screen Shot 2022-04-19 at 8.51.50 AM.png]

[Screenshot: Screen Shot 2022-04-19 at 8.49.52 AM.png]

[Screenshot: Screen Shot 2022-04-19 at 8.50.43 AM.png]

Edtadros updated the task description.

Change 820807 had a related patch set uploaded (by Clare Ming; author: Clare Ming):

[mediawiki/extensions/WikimediaEvents@master] Update class name, method to PageHash

https://gerrit.wikimedia.org/r/820807

Change 820807 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] Update class name, method to PageHash

https://gerrit.wikimedia.org/r/820807