
Track interaction with ReferencePreviews
Closed, Resolved | Public | 2 Estimated Story Points

Description

Motivation

We want to find out to what extent Reference Previews is improving readers' behavior. This ticket is for getting more info about Reference Previews usage; T231529 is about creating a baseline that we can compare the usage against.

Acceptance Criteria
  • How often do people look at a reference pop up relative to the pages being opened? E.g. On average, there were 0.03 showings of a reference pop up per page opened where Reference Previews was deployed.
  • In how many percent of the cases do users click on “go to references section” when they see a reference preview?
  • In how many percent of the cases do users click on a link inside the reference itself when they see a reference preview?
  • How often (absolute number per page being opened) do people (who have referencepreview enabled) click on a link inside the reference itself when they see a reference preview?
  • How often (absolute number per page being opened) do people (who have referencepreview enabled) click on a link inside the reference itself, but from the references section?
  • In how many percent of the cases does an opened reference pop up contain scroll bars?
    • Of those times, how often do users scroll? (For now, let’s not separate between horizontally and vertically)

Metrics

We're using a new schema "ReferencePreviewsPopups", which tracks the following actions:

  • action: poppedOpen is sent when a reference preview is rendered. scrollbarsPresent: true is set when the reference overflows its bounding box and a vertical scrollbar is present.
  • action: scrolled is sent exactly once, if the user scrolls a reference preview either horizontally or vertically.
  • action: clickedReferencePreviewsContentLink is sent when a reference content link is clicked in a reference preview.
  • action: clickedGoToReferences is sent when "jump to references" is clicked.
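
As a rough illustration of how these four actions map onto the per-popup acceptance criteria, here is a minimal Python sketch. It is not the reportupdater query used for the dashboards. The event dictionaries, the function name `popup_metrics`, and the result keys are hypothetical; only the `action` values and the `scrollbarsPresent` field come from the schema description above. It also assumes that `scrolled` events only occur for popups that had scrollbars.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def popup_metrics(events: Iterable[Mapping[str, object]]) -> Dict[str, float]:
    """Aggregate ReferencePreviewsPopups-style events into per-popup ratios."""
    counts: Counter = Counter()
    for event in events:
        action = event["action"]
        counts[action] += 1
        # Track popups that overflowed separately, as the denominator
        # for the "how often do users scroll" criterion.
        if action == "poppedOpen" and event.get("scrollbarsPresent"):
            counts["poppedOpenWithScrollbars"] += 1

    def ratio(numerator: str, denominator: str) -> float:
        # Missing keys count as zero; avoid dividing by zero.
        if counts[denominator] == 0:
            return 0.0
        return counts[numerator] / counts[denominator]

    return {
        "go_to_references_per_popup": ratio("clickedGoToReferences", "poppedOpen"),
        "content_link_clicks_per_popup": ratio(
            "clickedReferencePreviewsContentLink", "poppedOpen"
        ),
        "scrollbars_share": ratio("poppedOpenWithScrollbars", "poppedOpen"),
        # Assumption: "scrolled" is only sent for popups with scrollbars.
        "scrolled_share_when_scrollbars": ratio("scrolled", "poppedOpenWithScrollbars"),
    }
```

The per-pageview variants of these criteria additionally need a pageview count, which this schema alone does not provide.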

This can be supplemented with ReferencePreviewsBaseline to collect all of the metrics we need here, although note that the sampling rate may be different. That will already collect the following:

  • action: pageview is recorded when opening a page with Cite enabled. When Reference Previews are enabled for the user, this event will include referencePreviewsEnabled: true.
  • action: clickedFootnote is recorded when clicking a reference link ("[1]"). Note that this also fires in other circumstances, such as when "Jump to references" is clicked, so we need to refine it a bit or at least figure out how to calculate the correct statistics. Maybe it's as simple as ignoring this when referencePreviewsEnabled is set, and using poppedOpen instead.
  • action: clickedReferenceContentLink is recorded when clicking on a link in the References section.
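
Because the two schemas may be sampled at different rates, raw event counts cannot be divided directly. The following Python sketch shows one way to scale each count by its sampling rate before forming a per-pageview ratio. The function names and the example numbers are hypothetical, and it assumes simple uniform sampling, which may not match how the extensions actually sample (for example, per session rather than per event).

```python
def estimated_total(sampled_count: int, sampling_rate: float) -> float:
    """Scale a sampled event count up to an estimate of the true total."""
    if not 0 < sampling_rate <= 1:
        raise ValueError("sampling_rate must be in (0, 1]")
    return sampled_count / sampling_rate


def rate_per_pageview(
    event_count: int,
    event_sampling_rate: float,
    pageview_count: int,
    pageview_sampling_rate: float,
) -> float:
    """Estimate events per pageview when the two counts are sampled differently."""
    pageviews = estimated_total(pageview_count, pageview_sampling_rate)
    if pageviews == 0:
        return 0.0
    return estimated_total(event_count, event_sampling_rate) / pageviews


# Hypothetical example: 30 popup events sampled 1:1, and 1 pageview event
# sampled 1:1000 (standing in for about 1000 real pageviews).
print(rate_per_pageview(30, 1.0, 1, 1 / 1000))
```

With very small sampled counts, such as the single pageview in this example, the estimate is statistically weak even though the arithmetic is correct.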

Previous comments

I see the schema has been updated to include ReferenceTooltips. I wonder if it might make sense to use a new schema, given you'll likely want to log reference tooltip previews separately from page previews. I'd imagine reference tooltips will display less often than link previews, so if you are leaning heavily on the existing logging code, the sample size for reference tooltips is likely to be very small in comparison to link previews. Does it make sense to "fork" this schema and update the code so that it doesn't log link previews at all, or is data relating to link previews important for the questions you are trying to answer?

Conclusion

To summarize how the acceptance criteria map onto Grafana dashboards:

| Criterion | Graph |
| --- | --- |
| How often do people look at a reference pop up relative to the pages being opened? | References viewed, baseline vs. RP |
| In how many percent of the cases do users click on “go to references section” when they see a reference preview? | Popups Go To References Clicks per Pageview |
| In how many percent of the cases do users click on a link inside the reference itself when they see a reference preview? | This seems to be redundant with the next criterion. |
| How often do people who have referencepreview enabled click on a link inside the reference itself when they see a reference preview? | Popups Content Clicks per Pageview |
| How often do people who have referencepreview enabled click on a link inside the reference itself, but from the references section? | References Section Content Clicks per Pageview, with Reference Previews |
| In how many percent of the cases does an opened reference pop up contain scroll bars? | Reference Popup has Scrollbars |
| Of those times, how often do users scroll? | Reference Popup is Scrolled when has Scrollbars |

Details

| Subject | Repo | Branch | Lines +/- |
| --- | --- | --- | --- |
| | mediawiki/extensions/Popups | master | +12 -12 |
| | mediawiki/extensions/Cite | wmf/1.35.0-wmf.5 | +24 -21 |
| | mediawiki/extensions/Popups | wmf/1.35.0-wmf.5 | +11 -8 |
| | mediawiki/extensions/Popups | master | +11 -11 |
| | mediawiki/extensions/Cite | master | +24 -21 |
| | mediawiki/extensions/Popups | master | +13 -8 |
| | mediawiki/extensions/Popups | wmf/1.35.0-wmf.5 | +6 -3 |
| | mediawiki/extensions/Popups | master | +6 -3 |
| | analytics/reportupdater-queries | master | +1 -2 |
| | analytics/reportupdater-queries | master | +206 -0 |
| | analytics/reportupdater-queries | master | +106 -0 |
| | mediawiki/extensions/Popups | master | +9 -3 |
| | mediawiki/extensions/Cite | master | +24 -17 |
| | mediawiki/extensions/Popups | master | +3 -3 |
| | mediawiki/extensions/Popups | wmf/1.35.0-wmf.1 | +3 -3 |
| | mediawiki/extensions/Popups | master | +3 -3 |
| | mediawiki/extensions/Popups | master | +3 -3 |
| | mediawiki/extensions/Popups | master | +68 -6 |
| | mediawiki/extensions/Popups | master | +14 -23 |

Event Timeline

There are a very large number of changes, so older changes are hidden.

Change 542326 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Popups@master] Tune referencePreviews sampling from 1:1000 to 1:10

https://gerrit.wikimedia.org/r/542326

Change 542326 abandoned by Awight:
Tune referencePreviews sampling from 1:1000 to 1:10

Reason:
wrong branch

https://gerrit.wikimedia.org/r/542326

Change 542327 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Popups@wmf/1.35.0-wmf.1] Tune referencePreviews sampling from 1:1000 to 1:10

https://gerrit.wikimedia.org/r/542327

Change 542327 abandoned by Awight:
Tune referencePreviews sampling from 1:1000 to 1:10

Reason:
Let's do the train, instead.

https://gerrit.wikimedia.org/r/542327

New plan due to holiday schedules: this sampling rate change will go out with the 1.35.0-wmf2 train, next week.

There are now 15 rows for ReferencePreviewsPopups, all with the "poppedOpen" action. These are all from Oct 21, so we can probably expect this number daily. We should increase sampling again.

Change 545219 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Popups@master] Sample ReferencePreviewsPopups 1:1

https://gerrit.wikimedia.org/r/545219

Change 545219 merged by jenkins-bot:
[mediawiki/extensions/Popups@master] Sample ReferencePreviewsPopups 1:1

https://gerrit.wikimedia.org/r/545219

One of my assumptions was completely off the mark: there are so few referencePreviewsEnabled users that I had to tune the sampling to 1:1. Meanwhile, I was relying on the ReferencePreviewsBaseline tracking (sampled at 1:1000) to give us a few of the most important metrics, such as pageviews. This will probably be statistically unusable, so we need to consider splitting the sampling (e.g. track if inSample(1000) or refPreviewsEnabled) or doing some other dramatic workaround.
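
The split-sampling idea ("track if inSample(1000) or refPreviewsEnabled") could look roughly like the Python sketch below. This is only an illustration of the decision rule, not the extension's JavaScript. The function name `should_track` and its parameters are hypothetical, and a random draw stands in for whatever sampling mechanism the real code uses.

```python
import random
from typing import Callable


def should_track(
    reference_previews_enabled: bool,
    sample_size: int = 1000,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Decide whether to log events for this page view.

    Users with Reference Previews enabled are always tracked (1:1),
    because there are too few of them to sample further. Everyone else
    is tracked at a rate of 1:sample_size.
    """
    if reference_previews_enabled:
        return True
    # Assumption: uniform random sampling stands in for inSample().
    return rng() < 1 / sample_size
```

A consequence of this rule is that the two populations end up with different sampling rates, so any report that mixes them has to weight the counts accordingly.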

Change 545869 had a related patch set uploaded (by Awight; owner: Awight):
[analytics/reportupdater-queries@master] Report for ReferencePreviews popups

https://gerrit.wikimedia.org/r/545869

Change 545870 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Cite@master] Stop sampling for referencePreviewsEnabled

https://gerrit.wikimedia.org/r/545870

Change 545942 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Popups@master] Record pageviews where Reference Previews is enabled

https://gerrit.wikimedia.org/r/545942

Change 545942 merged by jenkins-bot:
[mediawiki/extensions/Popups@master] Record pageviews where Reference Previews is enabled

https://gerrit.wikimedia.org/r/545942

Change 545870 merged by jenkins-bot:
[mediawiki/extensions/Cite@master] Stop sampling when Reference Previews is enabled

https://gerrit.wikimedia.org/r/545870

Change 545869 abandoned by Awight:
Report for ReferencePreviews popups

Reason:
merge with parent.

https://gerrit.wikimedia.org/r/545869

Change 542419 had a related patch set uploaded (by Awight; owner: Awight):
[analytics/reportupdater-queries@master] [WIP] New reports for Reference Previews

https://gerrit.wikimedia.org/r/542419

I think the reportupdater queries are ready, but still waiting for our new schemas to roll out so I can smoke test the ReferencePreviewsCite schema query.

as far as I know we don't use the graphite writer.

Uh, oh! I think I'm pretty far out on a limb in that case. /me tosses a pinch of salt over their left shoulder

Change 542419 merged by Mforns:
[analytics/reportupdater-queries@master] New reports for Reference Previews

https://gerrit.wikimedia.org/r/542419

Need to tweak the ReferencePreviewsCite report...

Change 549054 had a related patch set uploaded (by Awight; owner: Awight):
[analytics/reportupdater-queries@master] Fix nonexistent field in query

https://gerrit.wikimedia.org/r/549054

awight changed the point value for this task from 8 to 3. (Nov 6 2019, 1:13 PM)

Change 549054 merged by Mforns:
[analytics/reportupdater-queries@master] Fix nonexistent field in query

https://gerrit.wikimedia.org/r/549054

There are many more Popups pageviews than I had expected (or many fewer from Cite). Look into this, and tweak the tracking to be idempotent in case that's the issue.
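
To make "idempotent" concrete, here is a minimal Python sketch of a tracker that sends the pageview event at most once per page load, however many times it is called. The class name `PageviewTracker` and the `send` callback are hypothetical and do not mirror the Popups code.

```python
from typing import Callable, Dict


class PageviewTracker:
    """Sends a pageview event at most once, even if called repeatedly."""

    def __init__(self, send: Callable[[Dict[str, str]], None]) -> None:
        self._send = send
        self._logged = False

    def track_pageview(self) -> bool:
        """Log the pageview; return True only if an event was actually sent."""
        if self._logged:
            return False
        self._logged = True
        self._send({"action": "pageview"})
        return True


# Usage: repeated calls (for example from several init hooks) send one event.
sent = []
tracker = PageviewTracker(sent.append)
tracker.track_pageview()
tracker.track_pageview()
print(len(sent))  # 1
```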

Change 549820 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Popups@master] Ensure that pageviews are recorded at most once

https://gerrit.wikimedia.org/r/549820

Change 549820 merged by jenkins-bot:
[mediawiki/extensions/Popups@master] Ensure that pageviews are recorded at most once

https://gerrit.wikimedia.org/r/549820

Change 550104 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Popups@wmf/1.35.0-wmf.5] Ensure that pageviews are recorded at most once

https://gerrit.wikimedia.org/r/550104

Change 550104 abandoned by Awight:
Ensure that pageviews are recorded at most once

Reason:
This doesn't solve our problem.

https://gerrit.wikimedia.org/r/550104

I'm failing to find a simple answer to the pageviews thing. I did discover that the Popups pageviews include Special page and Edit page impressions, but these alone shouldn't add up to the 2.5x disparity found in the real data. I'm going to wrap up this phase, and put a warning on the graph with suspect data. If anything, the difference will *under*-estimate the impact of Reference Previews, and we're already getting much higher popup view rates than baseline footnote views.

One quick fix would be to recalculate the ratios using Cite pageview counts, but this doesn't feel safe either.

Change 550489 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Popups@master] [WIP] Reduce double-counting of Reference Previews pageviews

https://gerrit.wikimedia.org/r/550489

Change 550489 abandoned by Awight:
[WIP] Reduce double-counting of Reference Previews pageviews

https://gerrit.wikimedia.org/r/550489

Change 550674 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Popups@master] [WIP] Don't record Popups actions on non-content pages

https://gerrit.wikimedia.org/r/550674

Change 550675 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/Cite@master] Track pageviews only on content page views, not edits

https://gerrit.wikimedia.org/r/550675

Change 550683 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/Popups@master] [WIP]

https://gerrit.wikimedia.org/r/550683

Change 550675 merged by jenkins-bot:
[mediawiki/extensions/Cite@master] Track pageviews only on content page views, not edits

https://gerrit.wikimedia.org/r/550675

Change 551389 had a related patch set uploaded (by Awight; owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/Cite@wmf/1.35.0-wmf.5] Track pageviews only on content page views, not edits

https://gerrit.wikimedia.org/r/551389

Change 550674 merged by jenkins-bot:
[mediawiki/extensions/Popups@master] Don't record Popups actions on non-content pages

https://gerrit.wikimedia.org/r/550674

Change 551397 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Popups@wmf/1.35.0-wmf.5] Don't record Popups actions on non-content pages

https://gerrit.wikimedia.org/r/551397

Change 551389 merged by jenkins-bot:
[mediawiki/extensions/Cite@wmf/1.35.0-wmf.5] Track pageviews only on content page views, not edits

https://gerrit.wikimedia.org/r/551389

Change 551397 merged by jenkins-bot:
[mediawiki/extensions/Popups@wmf/1.35.0-wmf.5] Don't record Popups actions on non-content pages

https://gerrit.wikimedia.org/r/551397

Mentioned in SAL (#wikimedia-operations) [2019-11-18T11:26:54Z] <awight@deploy1001> Synchronized php-1.35.0-wmf.5/extensions/Popups: SWAT: [[gerrit:551397|Don't record Popups actions on non-content pages (T214493)]] (duration: 00m 51s)

Mentioned in SAL (#wikimedia-operations) [2019-11-18T11:28:16Z] <awight@deploy1001> Synchronized php-1.35.0-wmf.5/extensions/Cite: SWAT: [[gerrit:551389|Track pageviews only on content page views, not edits (T214493)]] (duration: 00m 51s)

Waiting 24hr for new metrics to land.

We also discovered a major source of discrepancy between the numbers coming from the Popups and Cite metrics: Cite is only loaded on pages which include references. Equipped with this information, I think we might be able to make concrete statements about what these metrics mean and whether they're safe to rely on.

awight changed the point value for this task from 3 to 2.

I'm trying to verify our pageview counts, and we think the only remaining discrepancy should be accounted for by the Cite UI extensions not loading for pages with zero references.

Relying on MRedi's research from July 2018, we see that a readership study on 2M pages found that approximately 1M of those had zero references. Unfortunately, this distribution won't be proportional to the pages actually visited, and we should expect that the pages with more references are also the pages receiving more visits. All I can get from this is an upper bound on the expected discrepancy: Popups pageviews should be well less than twice the number of Cite pageviews.

In the second round of analysis, MRedi found that in a larger readership sample of 5.4M enwiki pages, 24.5% have zero references. Again, without the proportion of zero-reference pages to actual pageviews, all we get is a rough upper bound: less than 25% of Popups pageviews should be missed by Cite tracking.

On November 19th, we counted 77k Popups pageview events and 50k Cite pageview events on enwiki. Assuming that our hypothesis is true and the discrepancy is entirely due to zero-reference pages, we get 35% of pageviews hitting pages with no references. This seems reasonable, so I'm closing this task now.
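
The arithmetic behind these figures can be sketched in Python as follows. It assumes, as the comment above does, that every Popups pageview missing from the Cite counts is a view of a zero-reference page. The function names are hypothetical; the input numbers are the ones quoted in this thread.

```python
def zero_reference_share(popups_pageviews: float, cite_pageviews: float) -> float:
    """Share of pageviews on zero-reference pages implied by the two counts."""
    return 1 - cite_pageviews / popups_pageviews


def max_popups_to_cite_ratio(zero_reference_page_share: float) -> float:
    """Popups/Cite pageview ratio if pageviews followed the page distribution."""
    if not 0 <= zero_reference_page_share < 1:
        raise ValueError("share must be in [0, 1)")
    return 1 / (1 - zero_reference_page_share)


observed_share = zero_reference_share(77_000, 50_000)
print(f"{observed_share:.1%}")                    # 35.1%
print(f"{77_000 / 50_000:.2f}")                   # 1.54
print(f"{max_popups_to_cite_ratio(0.5):.2f}")     # 2.00
print(f"{max_popups_to_cite_ratio(0.245):.2f}")   # 1.32
```

The first line reproduces the 35% figure. The same relation turns the 24.5% share of zero-reference pages into a Popups-to-Cite ratio of about 1.32, compared with the observed 1.54.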

I've left a question on the Research page in case the original data is available and we can get a better estimate of what proportion we should expect here.

Change 550683 abandoned by Thiemo Kreuz (WMDE):
Stop tracking on special pages, previews, and such

https://gerrit.wikimedia.org/r/550683

Before closing this ticket: is there a condensed version of the findings where we have an answer for each question from above? Or is this just about the groundwork to answer the questions above, and there's still something missing to put it all together?* @awight

*) Sorry if I'm missing something essential here.

Summarized in the task description. Now I think we can close and continue work by reviewing the dashboards in the next ticket.