
[EPIC] Instrument page interactions
Closed, ResolvedPublic5 Story Points

Description

Background

With the deployment of Page Previews, we introduce a new form of reading Wikipedia content apart from the standard pageviews. We need to measure this for the same reasons as we do for pageviews. These include providing executives with accurate numbers on the overall level of usage of our content, and the editor community with accurate numbers on the readership of the individual articles and projects they are working on. In particular, based on the previous A/B tests, we expect that the deployment of previews on a wiki will cause the total pageviews to decrease for that wiki, but that "page interactions" – any intentional interaction with a page, i.e. page previews + pageviews – will increase. We would like a way to track this metric over time.

This task captures the frontend instrumentation work needed to ensure clients send all required information to the servers. The backend work for storing this data and aggregating it in a form suitable for analysis is captured in T186728.

QA Steps

On the beta cluster with Page Previews enabled, please visit an article in debug mode (add ?debug=true to the URL). When a popup is visible for more than 1 second, the VirtualPageView event should be triggered. (https://meta.wikimedia.org/wiki/Schema:VirtualPageView )

Notes

  1. In T184793#3953351, @Ottomata asks that we follow the EventLogging schema guidelines when creating the Pageview schema (strawdog name) so that it's easier to get events into Pivot and Superset.

AC

  • Page interactions are recorded by logging an EventLogging event.
  • Per T184793#3952974, this instrumentation should be sampled for testing purposes. The default sampling rate should be 50%.
NOTE: We estimate that recording a page interaction when a preview has been open for > 1000 ms will correspond to an increase in webrequests per pageview of 0.13%, which corresponds to ~700-800 events/sec (or, roughly, 2x the peak rate from the Page Previews instrumentation). AIUI the Hive EventLogging backend can handle this event rate 💪 but the processors need to be monitored to see if more need to be added. SIGN OFF NOTE: This was not done and should not be necessary. We are able to enable this per wiki; having enabled it on Hungarian Wikipedia, we can do the same on other wikis.
  • As soon as a preview has been visible for > 1000 ms, a page interaction is recorded with the following information:
    • Namespace, title and ID of the page that's being previewed
    • Namespace, title and ID of the page that's currently being viewed
  • If a preview can't be generated for a page, i.e. the "generic" preview is shown (defined as in the Popups schema), then no page interaction is recorded.
  • The instrumentation should be feature flagged and the flag should be disabled by default.
  • The standard schema documentation has been added to the schema talk page and a purging strategy has been defined (in this case, no fields need to be whitelisted, as we're fine with all events being deleted after they have been aggregated per T186728). (See https://meta.wikimedia.org/wiki/Schema_talk:VirtualPageView.)
  • Events are being sent consistently regarding DNT (T190188).
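
As a rough illustration of the acceptance criteria above, the recording decision and event payload might be sketched as follows. This is a hypothetical sketch only — the helper names here, and any wiring beyond the Schema:VirtualPageView field conventions, are assumptions, not the actual Popups implementation:

```javascript
// Minimal sketch of the acceptance criteria above (illustrative, not the
// actual Popups code).

var VISIBLE_THRESHOLD_MS = 1000; // record once a preview has been visible > 1000 ms
var DEFAULT_SAMPLING_RATE = 0.5; // default 50% sampling rate for testing purposes

// Decide whether a page interaction should be recorded at all.
function shouldRecord(visibleMs, isGenericPreview, featureFlagEnabled) {
  if (!featureFlagEnabled) { return false; } // feature-flagged, disabled by default
  if (isGenericPreview) { return false; }    // no event when only the "generic" preview is shown
  return visibleMs > VISIBLE_THRESHOLD_MS;
}

// Build the event payload: namespace, title and ID of both the page currently
// being viewed (source_*) and the page being previewed.
function buildVirtualPageViewEvent(sourcePage, previewedPage) {
  return {
    source_page_id: sourcePage.id,
    source_namespace: sourcePage.namespace,
    source_title: sourcePage.title,
    page_id: previewedPage.id,
    page_namespace: previewedPage.namespace,
    page_title: previewedPage.title
  };
}

// Simplistic per-event sampling helper; the real instrumentation may bucket
// per session or per pageview instead.
function isInSample(rate) {
  return Math.random() < rate;
}
```

In the real instrumentation the built event would then be handed to the EventLogging pipeline only when the sampling check passes.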

Open questions


Closed Questions

When a preview has been visible for > 1000 ms

This is subtly different from $totalInteractionTime > 1000 as $totalInteractionTime includes the >= 700 ms to actually show the preview. For reference, the current median time to show a preview is ~740 ms.

When do we want to record the page interaction?

See T184793#3896845. When $totalInteractionTime - $perceivedWait > 1000.
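
The distinction above can be made concrete with a tiny sketch (illustrative only): $totalInteractionTime includes the perceived wait, so the trigger must subtract it before comparing against 1000 ms.

```javascript
// Illustrative sketch: the preview counts as a page interaction only once it
// has been *visible* for > 1000 ms, i.e. after subtracting the perceived wait
// (the time taken to actually show the preview; current median ~740 ms).
function shouldTrigger(totalInteractionTimeMs, perceivedWaitMs) {
  return (totalInteractionTimeMs - perceivedWaitMs) > 1000;
}
```

For example, with the median perceived wait of ~740 ms, a total interaction of 1700 ms (960 ms visible) would not be recorded, while one of 1800 ms (1060 ms visible) would.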

  1. Which URL do we want to request?

We'll use the EventLogging pipeline (per https://lists.wikimedia.org/pipermail/analytics/2018-January/006136.html).

  3. The instrumentation should ignore DNT. Requests with the DNT header set are included in our webrequest_raw, webrequest, and pageviews (aggregated) tables, and this instrumentation is to be considered analogous. After discussion in T187277, this subtask is tracked in T190188.

Testing criteria

  • Hover over any link. Ensure the hover lasts at least 1s. Check that a message was sent to EventLogging to the VirtualPageViews schema. Check that the contents of the message reflect the current page (the source_ fields) and the page being hovered.
  • Hover over any link. Ensure the hover lasts less than 1s. Check that a message was NOT sent to EventLogging to the VirtualPageViews schema.

Related Objects

StatusAssignedTask
ResolvedDereckson
ResolvedJdlrobson
Resolvedovasileva
DuplicateNone
OpenNone
Resolvedmforns
Resolvedovasileva
ResolvedJdlrobson
DuplicateNone
DuplicateNone
Resolvedovasileva
Resolvedovasileva
Resolvedovasileva
Resolvedphuedx
Resolvedphuedx
DuplicateNone
ResolvedJdlrobson
ResolvedJdlrobson
DuplicateNone
Duplicateovasileva
Resolvedovasileva
DuplicateNone
DeclinedNone
DuplicateJdlrobson
ResolvedMhurd
DeclinedJMinor
Resolvedphuedx
ResolvedPchelolo
ResolvedJdlrobson
DeclinedPchelolo
DeclinedNone
OpenNone
Resolvedphuedx
DeclinedJdlrobson
DuplicateNone
ResolvedFjalapeno
Resolvedphuedx
Declinedpmiazga
DeclinedNone
Resolvedphuedx
DeclinedNone
ResolvedPchelolo
ResolvedbearND
ResolvedMholloway
ResolvedMSantos
ResolvedMholloway
InvalidNone
ResolvedJdlrobson
InvalidNone
DuplicateNone
ResolvedJdlrobson
ResolvedJdlrobson
ResolvedJdlrobson
ResolvedJdlrobson
Resolvedphuedx
ResolvedbearND
ResolvedMholloway
DuplicateNone
ResolvedJdlrobson
ResolvedJdlrobson
Resolvedphuedx
ResolvedJdlrobson
ResolvedJdlrobson
ResolvedbearND
ResolvedJdlrobson
ResolvedMholloway
ResolvedMholloway
ResolvedJdlrobson
ResolvedJdlrobson
ResolvedbearND
Resolved Tbayer
ResolvedNone
Resolvedovasileva
Resolved Tbayer
ResolvedNone
Resolvedmforns
Resolvedphuedx
DeclinedNone

Event Timeline


^ @Ottomata we are now logging VirtualPageViews to Hungarian Wikipedia.
@Tbayer these should be going to Hive if you want to take a look.

Jdlrobson reassigned this task from Ottomata to Tbayer.Mar 13 2018, 11:54 PM

Passing to Tilman. We'll meet at end of the week. Next thing to confirm is that the Hive data is sound and that the amount of events is meeting our expectation. Once that's done, we can be more aggressive about roll out and shift focus to T187277.

Jdlrobson updated the task description. (Show Details)Mar 16 2018, 10:18 PM

T189906 captures the remaining work on our end. It might make sense to track bypassing DNT in another ticket once a decision has been reached (which I think it has!). If not, we can move this task back to "todo" with updated acceptance criteria and re-estimate. Not sure what makes most sense.

I am not so sure we have a decision regarding DNT.

Jdlrobson reassigned this task from Tbayer to ABorbaWMF.Mar 20 2018, 5:16 PM
Jdlrobson updated the task description. (Show Details)

Anthony, could you take a look at this? Hungarian Wikipedia would be the best place to test.

Looks good on Beta. Checked out Hungarian Wikipedia as well, but don't have debug running there yet.

More than 1s hover


Less than 1s hover

Looks good from my side. @Tbayer, @Jdlrobson - is there anything else we need for signoff?

Jdlrobson closed this task as Resolved.Mar 23 2018, 7:00 PM

We're good!

Tbayer reopened this task as Open.Mar 26 2018, 5:59 PM
Tbayer removed ABorbaWMF as the assignee of this task.
Tbayer updated the task description. (Show Details)
Tbayer added a subscriber: ABorbaWMF.
Jdlrobson updated the task description. (Show Details)Mar 26 2018, 6:08 PM

@Tbayer can you elaborate on why you reopened this task?

In order to keep the conversation contained we decided (in a standup) to track the DNT work inside T190188. We don't need to keep this open - it was creating lots of confusion within the team and fragmenting the conversation around DNT.

If you need it open for other reasons, I suggest we track it outside the sprint board and backlog (e.g. on a personal board).

Passing to Tilman. We'll meet at end of the week. Next thing to confirm is that the Hive data is sound and that the amount of events is meeting our expectation. Once that's done, we can be more aggressive about roll out and shift focus to T187277.

For the record, last week I posted some notes from that meeting at T186728#4069562 .

I have also been monitoring the ratio of events per pageview, see below. It's a bit higher on huwiki than one might expect from the enwiki+dewiki A/B tests (especially considering that T190188 has not been implemented yet), but that might just be a natural difference between these wikis. Once we have this rolled out to enwiki or dewiki we will be able to compare directly.

year	month	day	events	pageviews	ratio
2018	3	13	902	811425	0.001111624611023816
2018	3	14	203696	780293	0.26105065661232385
2018	3	15	234867	758504	0.3096450381276829
2018	3	16	245158	776662	0.3156559738985556
2018	3	17	252319	806245	0.3129557392603985
2018	3	18	298695	924122	0.3232203107381926
2018	3	19	272950	924593	0.2952109739096013
2018	3	20	264871	979082	0.2705299453978319
2018	3	21	253392	914155	0.27718712909736315
2018	3	22	232718	857182	0.2714919352016258
2018	3	23	204810	744587	0.27506523750750417
2018	3	24	211290	725962	0.2910482917838675
2018	3	25	245733	831343	0.2955855765911303
2018	3	26	153488	544476	0.28190039597704947

SELECT pvsperday.year AS year,
pvsperday.month AS month,
pvsperday.day AS day, 
events, pageviews,
events/pageviews AS ratio FROM (
  SELECT year, month, day,
  COUNT(*) AS events 
  FROM event.virtualpageview 
  WHERE year = 2018 AND wiki = 'huwiki' 
  AND NOT useragent.is_bot
  GROUP BY year, month, day ) AS eventsperday
JOIN (
  SELECT year, month, day, SUM(view_count) AS pageviews
  FROM wmf.projectview_hourly
  WHERE year = 2018 AND month = 3 AND day >= 13
  AND project = 'hu.wikipedia'
  AND access_method = 'desktop'
  AND agent_type = 'user'
  GROUP BY year, month, day ) AS pvsperday
ON eventsperday.year = pvsperday.year 
AND eventsperday.month = pvsperday.month
AND eventsperday.day = pvsperday.day
ORDER BY year, month, day LIMIT 10000;

@Tbayer can you elaborate on why you reopened this task?

Can you elaborate on why you closed the task without having verified that all stakeholders consider it done?

I reopened it because two subtasks had not been completed yet (please check the task description diff), and I thus don't yet consider the task fully done as written ("Instrument page interactions").

In order to keep the conversation contained we decided (in a standup) to track the DNT work inside T190188. We don't need to keep this open - it was creating lots of confusion within the team and fragmenting the conversation around DNT.

What kind of confusion? Isn't it standard practice to keep a task open until all subtasks are completed? Myself, I rather find it confusing that the task was closed even though two checkboxes were still unchecked (even before the overlooked item that I added). This has in the past led to some rather significant oversights - I believe there was a retro about this some months ago.

Hey @Tbayer! It looks like @Jdlrobson updated the acceptance criteria to reflect that the checkboxes in question were postponed to future tasks, leaving all of the Acceptance Criteria done. That would indicate to me that the task is, in fact, resolved. I do recall the decision to track the DNT work in another task (T190188: VirtualPageView schema should not use EventLogging api to send virtual page view events).

What kind of confusion? Isn't it standard practice to keep a task open until all subtasks are completed?

Good point! Perhaps the AC should be updated, too.

Maybe we can simply remove this task from Ready for Signoff and track T190188 through the board (which it seems we already are)? @ovasileva , what do you think about adding this task to the Quarterly Goals column? Then we can resolve it when the subtasks are done, as Tilman suggests, as we would with any other epic.

Maybe we can simply remove this task from Ready for Signoff and track T190188 through the board (which it seems we already are)? @ovasileva , what do you think about adding this task to the Quarterly Goals column? Then we can resolve it when the subtasks are done, as Tilman suggests, as we would with any other epic.

I like this - going even further, as there are subtasks that are quite important, could we maybe make it an epic? Either way, moving now.

could we maybe make it an epic?

No objection here. Emergent epics are as valid as prepared ones, assuming the work is still in scope. :)

Hey @Tbayer! It looks like @Jdlrobson updated the acceptance criteria to reflect that the checkboxes in question were postponed to future tasks, leaving all of the Acceptance Criteria done. That would indicate to me that the task is, in fact, resolved.

That seems a rather circular argument to me. Of course one can mark any task done instantly by removing all acceptance criteria ;) A more interesting discussion would be why folks disagree with these acceptance criteria and see this task (which as written encompasses all "the frontend instrumentation work needed") as completed even though these two things have not been done yet.

What's more, in T184793#4081743 the second AC was marked "Postponed to future task" too, even though no such future task actually exists. (The one mentioned there concerns something else.) Frankly, that's the kind of sloppiness that has cost the team dearly in case of some past instrumentations. I am sure scrum masters can play a valuable role in avoiding this, by checking such changes more closely.

I do recall the decision to track the DNT work in another task (T190188: VirtualPageView schema should not use EventLogging api to send virtual page view events).

Agreed - and that task is a subtask of this one.

What kind of confusion? Isn't it standard practice to keep a task open until all subtasks are completed?

Good point! Perhaps the AC should be updated, too.
Maybe we can simply remove this task from Ready for Signoff and track T190188 through the board (which it seems we already are)? @ovasileva , what do you think about adding this task to the Quarterly Goals column? Then we can resolve it when the subtasks are done, as Tilman suggests, as we would with any other epic.

Sounds fine to me! I also had a good chat with @Niedzielski earlier today regarding coordination about such data tasks in general (between myself and the rest of the team); let's continue the conversation.

In T184793#4082025, @MBinder_WMF wrote:
Hey @Tbayer! It looks like @Jdlrobson updated the acceptance criteria to reflect that the checkboxes in question were postponed to future tasks, leaving all of the Acceptance Criteria done. That would indicate to me that the task is, in fact, resolved.

That seems a rather circular argument to me. Of course one can mark any task done instantly by removing all acceptance criteria ;) A more interesting discussion would be why folks disagree with these acceptance criteria and see this task (which as written encompasses all "the frontend instrumentation work needed") as completed even though these two things have not been done yet.

Forgive me, @Tbayer, if I added to the confusion. I believe it was implied that a task can change and evolve, including its acceptance criteria and title, and I meant to reinforce that. You make a great point that there appears to be disagreement about whether or not the task is complete, or if it can, in fact, be split in the way suggested, and that may also have to do with the AC and the rest of the description being in conflict.

@Jdlrobson pointed out that there was an agreement made in standup to alter the task and how the work is tracked, but that wasn't written down explicitly (beyond "we agreed to do something") and I'm sure that that adds to the tumult. That isn't meant to blame him, just to point out how we might arrive at our divergence in opinion and action. :)

Further, since the task has been on the board as long as it has (including in a signoff position for more than a cycle), it's causing the team some anxiety (they are determined to finish tasks as quickly as possible, and a lingering task is a sign of an impediment, and a flag for conversation).

What's more, in T184793#4081743 the second AC was marked "Postponed to future task" too, even though no such future task actually exists. (The one mentioned there concerns something else.) Frankly, that's the kind of sloppiness that has cost the team dearly in case of some past instrumentations. I am sure scrum masters can play a valuable role in avoiding this, by checking such changes more closely.

I appreciate that you've captured a potential pitfall, particularly since you're aware of the team's previous challenges with instrumentation. I am sure the team will find the lesson valuable. It sounds like the tension here is a disagreement about how to organize this work, and possibly room for improvement in the future as the team engages with ambiguity and tasks that evolve over time. It's not always easy to do initial work breakdown if it's not known for sure how the work will play out. In this case, if the work was, in fact, postponed, then you're right to say that a task should be made (and even if it is made, your observation is a signal that it is incomplete or unclear).

What's more, in T184793#4081743 the second AC was marked "Postponed to future task" too, even though no such future task actually exists. (The one mentioned there concerns something else.) Frankly, that's the kind of sloppiness that has cost the team dearly in case of some past instrumentations. I am sure scrum masters can play a valuable role in avoiding this, by checking such changes more closely.

@Tbayer - here, did you mean the schema documentation? I just want to make sure we're not missing another task.

MBinder_WMF updated the task description. (Show Details)Mar 27 2018, 5:08 PM

As a further check, following up on the results for huwiki in T184793#4081754, here is the ratio of events per pageview for the more recently launched languages. This shows larger differences, which is interesting, but I would still consider it plausible that the feature might be more popular in Russia than in Japan for a variety of reasons.

lang	events	pageviews	ratio
fr	19145318	66405783	0.288
hu	1302359	4741503	0.275
ja	15879046	72266837	0.22
ru	40850916	100722199	0.406

Data via

SELECT pvsperlang.lang AS lang,
events, pageviews,
ROUND(events/pageviews,3) AS ratio FROM (
  SELECT SUBSTR(wiki,0,2) AS lang,
  COUNT(*) AS events 
  FROM event.virtualpageview 
  WHERE year = 2018 AND month = 3 AND day >= 23 AND day <=28
  AND wiki IN ('huwiki', 'ruwiki', 'jawiki', 'frwiki') -- cf. https://phabricator.wikimedia.org/T189906#4073729
  AND NOT useragent.is_bot
  GROUP BY wiki ) AS eventsperlang
JOIN (
  SELECT SUBSTR(project,0,2) AS lang, SUM(view_count) AS pageviews
  FROM wmf.projectview_hourly
  WHERE year = 2018 AND month = 3 AND day >= 23 AND day <=28
  AND project IN ('hu.wikipedia', 'ru.wikipedia', 'ja.wikipedia', 'fr.wikipedia')
  AND access_method = 'desktop'
  AND agent_type = 'user'
  GROUP BY project) AS pvsperlang
ON eventsperlang.lang = pvsperlang.lang
ORDER BY lang LIMIT 10000;

Looks good on Beta. Checked out Hungarian Wikipedia as well, but don't have debug running there yet.

Which browser and OS was this tested under - Chrome on OSX? Asking because @Jdlrobson has since raised the valid concern (via email) that there might be issues that only occur on certain browsers.
Also, are we confident that there are no differences between beta and production (i.e. that testing in beta suffices)? Just want to nail things down here fully, also considering that the subtasks (checkboxes) under "Testing criteria" in the task are still not marked as completed.

What's more, in T184793#4081743 the second AC was marked "Postponed to future task" too, even though no such future task actually exists.

...

@Tbayer - here, did you mean the schema documentation? I just want to make sure we're not missing another task.

Yes. I took a stab at this myself afterwards (to be vetted by Jon; since he has marked this as done since, I assume it looked good to him).

@Nuria checked the data for me yesterday and confirmed she was seeing events from Firefox and various other browsers but the more eyes the better. I think looking at the data (variance of user agents) will give us more answers than QA at this point.

@Nuria checked the data for me yesterday and confirmed she was seeing events from Firefox and various other browsers but the more eyes the better. I think looking at the data (variance of user agents) will give us more answers than QA at this point.

Thanks - perhaps @Nuria could post the results of her analysis here so that we avoid double efforts?

Nuria added a comment.Mar 29 2018, 5:51 PM

I checked that no major browser was absent (IE, Chrome, FF, Safari) and that the data was not polluted by bots, which it wasn't. Data looks super clean actually.

I checked that no major browser was absent (IE, Chrome, FF, Safari) and that the data was not polluted by bots, which it wasn't. Data looks super clean actually.

Thanks, but I meant results more in the sense of concrete query results (and the queries themselves). What does it mean that the data looks super clean?

Nuria added a comment.Mar 29 2018, 9:58 PM

@Tbayer: that there are no garbage requests like the ones we see in our webrequest data — spoofing of host headers, plain spammy requests and such. Please take a look at the data yourself and let me know if you find issues.

Below is the ratio of events per pageview broken down for the 20 most frequent browsers (browser families).

As a sanity check, it's very low for several (infrequently occurring) mobile browsers, as one would expect: Firefox Mobile, Mobile Safari, Opera mini.

The two browsers with the highest ratio (Mail.ru Chromium Browser and Yandex Browser) appear to be associated with Russia somehow, which would be consistent with the observation in T184793#4091625 that ruwiki may have an unusually high ratio.

The most likely problem indicator is the low ratio for IE, which might be another manifestation of our long-standing data quality problem arising from a large number of spurious IE7 views (see e.g. T148461, T157404). Will look at that next.

browser_family	ratio	events	pageviews
Chrome	0.417	40585529	97290357
IE	0.096	5229727	54202689
Firefox	0.283	10810659	38136715
Edge	0.344	4235490	12298847
Safari	0.297	3458268	11635379
Yandex Browser	0.598	6580882	10995833
Opera	0.54	5452955	10100189
Other	0.001	3907	3585221
Chromium	0.171	116403	678772
Mobile Safari	0.013	8454	645042
Vivaldi	0.202	104600	517932
Mail.ru Chromium Browser	0.667	319823	479377
Chrome Mobile	0.074	32326	437731
CFNetwork	0.0	7	379889
Sleipnir	0.301	61717	205262
Maxthon	0.517	62151	120195
Pale Moon (Firefox Variant)	0.214	24250	113131
Firefox Mobile	0.002	152	91347
Opera Mini	0.0	3	70193
Android	0.235	14366	61017

Data via

SELECT pvsperbrowser.browser_family AS browser_family,
ROUND(events/pageviews,3) AS ratio,
events, pageviews
FROM (
  SELECT useragent.browser_family AS browser_family,
  COUNT(*) AS events 
  FROM event.virtualpageview 
  WHERE year = 2018 AND month = 3 AND day >= 23 AND day <=28
  AND wiki IN ('huwiki', 'ruwiki', 'jawiki', 'frwiki') -- cf. https://phabricator.wikimedia.org/T189906#4073729
  AND NOT useragent.is_bot
  GROUP BY useragent.browser_family ) AS eventsperbrowser
JOIN (
  SELECT user_agent_map['browser_family'] AS browser_family, SUM(view_count) AS pageviews
  FROM wmf.pageview_hourly
  WHERE year = 2018 AND month = 3 AND day >= 23 AND day <=28
  AND project IN ('hu.wikipedia', 'ru.wikipedia', 'ja.wikipedia', 'fr.wikipedia')
  AND access_method = 'desktop'
  AND agent_type = 'user'
  GROUP BY user_agent_map['browser_family'] ) AS pvsperbrowser
ON eventsperbrowser.browser_family = pvsperbrowser.browser_family
ORDER BY pageviews DESC LIMIT 20;
Jdlrobson added a comment.EditedMar 29 2018, 10:11 PM

The most likely problem indicator is the low ratio for IE, which might be another manifestation of our long-standing data quality problem arising from a large number of spurious IE7 views (see e.g. T148461, T157404). Will look at that next.

Popups will not work on < IE 11. Which versions of IE are you seeing?

The most likely problem indicator is the low ratio for IE, which might be another manifestation of our long-standing data quality problem arising from a large number of spurious IE7

mmm.. no, this is incorrect, we stopped serving javascript to IE7 ages ago, no chance those users/bots ever see a popup.

https://www.mediawiki.org/wiki/Compatibility#Browsers

The most likely problem indicator is the low ratio for IE, which might be another manifestation of our long-standing data quality problem arising from a large number of spurious IE7

mmm.. no, this is incorrect, we stopped serving javascript to IE7 ages ago, no chance those users/bots ever see a popup.
https://www.mediawiki.org/wiki/Compatibility#Browsers

Not sure what you mean by "incorrect". The data quality problem (as long ago pointed out by various people, e.g. at T148461 in 2016 or in the "fishy browser stats" thread on Analytics in 2017) is that these old browsers are way overrepresented in our pageview stats compared to their actual usage. That affects the ratios here, precisely because there may be no corresponding previews arising from those.

Here is the result by IE version. Indeed those old versions drag the ratio down, but even for IE11 it is still very low.

IE version (browser_major)	ratio	events	pageviews
11	0.122	5219640	42818813
8	0.0	190	3555931
7	0.002	6863	3359674
9	0.001	933	1586298
6	0.0	11	1573591
10	0.002	2056	1242816
5	0.003	7	2708
12	0.605	26	43
19	1.0	1	1
SELECT pvsperversion.browser_major AS browser_major,
ROUND(events/pageviews,3) AS ratio,
events, pageviews
FROM (
  SELECT useragent.browser_major AS browser_major,
  COUNT(*) AS events 
  FROM event.virtualpageview 
  WHERE year = 2018 AND month = 3 AND day >= 23 AND day <=28
  AND useragent.browser_family = 'IE'
  AND wiki IN ('huwiki', 'ruwiki', 'jawiki', 'frwiki') -- cf. https://phabricator.wikimedia.org/T189906#4073729
  AND NOT useragent.is_bot
  GROUP BY useragent.browser_major ) AS eventsperversion
JOIN (
  SELECT user_agent_map['browser_major'] AS browser_major, SUM(view_count) AS pageviews
  FROM wmf.pageview_hourly
  WHERE year = 2018 AND month = 3 AND day >= 23 AND day <=28
  AND user_agent_map['browser_family'] = 'IE'
  AND project IN ('hu.wikipedia', 'ru.wikipedia', 'ja.wikipedia', 'fr.wikipedia')
  AND access_method = 'desktop'
  AND agent_type = 'user'
  GROUP BY user_agent_map['browser_major'] ) AS pvsperversion
ON eventsperversion.browser_major = pvsperversion.browser_major
ORDER BY pageviews DESC LIMIT 10;

My point being that you are comparing the previews/pageviews ratio for browsers in which previews are not available.
That being the case, we already know that ratio is low. The ratio that probably represents usage better is the previews/pageviews ratio in browsers in which previews are well supported.

Why are there events for versions of IE before 11?

The number of events for those should be 0. The ratio should be 0. It's worth digging into those events and working out if they can be filtered out. Likely to be from different hosts.

Nuria added a comment.Mar 30 2018, 4:19 AM

The number of events for those should be 0. The ratio should be 0. It's worth digging into those events and working out if they can be filtered out. Likely to be from different hosts.

@Jdlrobson the data is in Hive, in the event database, table virtualpageview. Do take a look.

I would not worry too much about browsers that appear in very low numbers; there is no guarantee that the UA you see is "real". Example: IE "19" or "12" just happens to "parse"; for "12", most requests seem to come from the very same place in JP.

Nuria added a comment.Mar 30 2018, 4:28 PM

@Jdlrobson : let's please not change anything on Monday as it is a major EU holiday, rather let's launch further on Tuesday.

ovasileva moved this task from Incoming to Epics/Goals on the Readers-Web-Backlog board.
Jdlrobson renamed this task from Instrument page interactions to [EPIC] Instrument page interactions.Apr 4 2018, 8:54 PM
Jdlrobson added a project: Epic.
Tbayer updated the task description. (Show Details)Apr 24 2018, 8:24 PM
Tbayer moved this task from Triage to Tracking on the Product-Analytics board.Apr 24 2018, 8:27 PM

Why are there events for versions of IE before 11?
The number of events for those should be 0. The ratio should be 0. It's worth digging into those events and working out if they can be filtered out. Likely to be from different hosts.

See T193578#4242326 - it seems that this was an issue in ua-parser (now fixed) where e.g. IE11 in IE7 compatibility mode was registered as IE7 instead of as IE11.
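
The issue can be illustrated with a small sketch. IE11 in compatibility view reports MSIE 7.0 but still includes the real engine version in its Trident token (Trident/7.0 corresponds to IE11, Trident/6.0 to IE10, and so on); preferring the Trident token over the MSIE token is, in essence, what the parser fix amounts to. The code below is an illustration of that idea, not the actual ua-parser implementation:

```javascript
// A compatibility-view IE11 user agent reports "MSIE 7.0" but still carries
// the real engine version in the Trident token. Preferring Trident over MSIE
// recovers the true major version even in compatibility mode.
var TRIDENT_TO_IE = { '4.0': '8', '5.0': '9', '6.0': '10', '7.0': '11' };

function ieMajorVersion(ua) {
  var trident = /Trident\/(\d+\.\d+)/.exec(ua);
  if (trident && TRIDENT_TO_IE[trident[1]]) {
    return TRIDENT_TO_IE[trident[1]]; // real version, even in compat mode
  }
  var msie = /MSIE (\d+)/.exec(ua);
  return msie ? msie[1] : null; // fall back to the MSIE token (genuinely old IE)
}
```

A genuinely old IE7 UA (no Trident token) still parses as IE7, while a compatibility-view IE11 UA is correctly attributed to IE11.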

☝️ /cc @Nuria.

From Web's POV, this epic is now done (all the AC are met and the subtasks are now closed)! Are there any outstanding issues on your side @Nuria/@Tbayer?

Nuria added a comment.Jul 17 2018, 5:24 PM

Closing sounds good

Tbayer closed this task as Resolved.Jul 18 2018, 9:32 AM

Yes, I think we are done with everything here. There may be some followup work once page previews are rolled out to Wikidata (T111231), in case the setup doesn't work out of the box there.

There may be some followup work once page previews are rolled out to Wikidata (T111231), in case the setup doesn't work out of the box there.

Indeed. Thanks for pointing this out!

Great work, everyone!

🎉🎉🎉