ReadingDepth events are not being sent in browsers where navigator.sendBeacon should be supported but in practice isn't
Closed, Resolved (Public)

Description

This issue was first observed for Safari during QA (see below), and we decided to circumvent it by removing Safari user agents from analysis.

Now that the A/B test has launched, we have an additional method to check this: At the beginning of a pageview, the page issues instrumentation (from T191532) is supposed to send a pageLoaded event to both the PageIssues and ReadingDepth schemas. Because ReadingDepth depends on (in particular) support for sendBeacon and the Page Visibility API, we expect the ReadingDepth event to be missing for some browsers or browser versions, in particular Safari as discussed earlier. However, it looks like many other browsers apart from Safari are missing ReadingDepth events too ("only_pi" >> 0% in the table below). In particular, Chrome Mobile iOS, Android (stock browser) and (desktop) Chrome seem worth a closer look.

pageLoaded events logged in the PageIssues or the ReadingDepth schema as part of the page issues A/B test:

browser | both | only_pi | only_rd | all_pageloads
Chrome Mobile96.243.010.7527759714
Mobile Safari78.5421.360.122231489
Samsung Internet89.689.380.94rEMFR27519159ecbd
Chrome Mobile WebView97.710.881.412646714
Chrome80.2118.71.091928052
Mobile Safari UI/WKWebView35.2664.630.11786079
Android4.8695.040.09474722
UC Browser85.9111.892.21459774
Chrome Mobile iOS37.8662.010.12420661
Firefox Mobile96.581.312.11390622
Yandex Browser87.0311.931.04273847
Opera Mobile95.493.351.15267045
Amazon Silk98.390.31.31169278
YandexSearch93.095.761.15119939
Edge Mobile94.255.310.4449999
Facebook90.978.630.448057
Baiduspider-render96.740.862.4146165
NetFront NX0.0100.00.039353
Opera92.176.471.3625821
Firefox iOS53.8346.050.1221739
Puffin61.8136.731.4719311
BingPreview0.0499.960.016452
Firefox99.270.490.2415882
Opera Mini0.0100.00.011722
BlackBerry WebKit0.0299.980.010913
QQ Browser Mobile80.5115.513.9910783
Sleipnir27.072.970.029258
Crosswalk97.490.691.828723
Edge97.52.020.487366
Safari64.8634.440.77251
IE1.998.10.03428
IE Mobile0.7599.220.033208
Pinterest89.539.361.112244
Opera Coast0.0100.00.02215
Apache-HttpClient0.0100.00.01064

(Data from October 1-7. Browsers with fewer than 1000 pageviews in this sample were removed for readability.)
Query:

SET hive.mapred.mode=nonstrict;
SELECT 
browser,
ROUND(100*SUM(IF((pipageToken IS NOT NULL) AND (rdpageToken IS NOT NULL),1,0))/SUM(1),2) AS both, 
ROUND(100*SUM(IF((pipageToken IS NOT NULL) AND (rdpageToken IS NULL),1,0))/SUM(1),2) AS only_pi, 
ROUND(100*SUM(IF((pipageToken IS NULL) AND (rdpageToken IS NOT NULL),1,0))/SUM(1),2) AS only_rd, 
SUM(1) AS all_pageloads
FROM (
  SELECT IF(pi.pageToken IS NOT NULL, pi.browser, rd.browser) AS browser,
  pi.pageToken AS pipageToken, rd.pageToken AS rdpageToken
  FROM (
    SELECT useragent.browser_family AS browser,
    event.pageToken AS pageToken
    FROM event.pageissues 
    WHERE year = 2018 AND month = 10 AND day <=7
    AND event.action = 'pageLoaded') AS pi
  FULL OUTER JOIN (
    SELECT useragent.browser_family AS browser,
    event.pageToken AS pageToken
    FROM event.readingdepth
    WHERE year = 2018 AND month = 10 AND day <=7
    AND event.action = 'pageLoaded'
    AND ( event.page_issues_a_sample OR event.page_issues_b_sample )) AS rd
  ON pi.pageToken = rd.PageToken) AS alltokens
GROUP BY browser
ORDER BY all_pageloads DESC;
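
The classification that the FULL OUTER JOIN performs can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions, not part of the analysis: `classifyPageloads` is a hypothetical helper, and each page token is assumed to appear at most once per schema.

```javascript
// Hypothetical helper mirroring the query's both / only_pi / only_rd columns.
// Assumes every pageToken occurs at most once in each input list.
function classifyPageloads(piTokens, rdTokens) {
  const pi = new Set(piTokens);
  const rd = new Set(rdTokens);
  // The FULL OUTER JOIN keeps tokens that appear on either side.
  const all = new Set([...pi, ...rd]);

  let both = 0;
  let onlyPi = 0;
  let onlyRd = 0;
  for (const token of all) {
    if (pi.has(token) && rd.has(token)) {
      both += 1;
    } else if (pi.has(token)) {
      onlyPi += 1;
    } else {
      onlyRd += 1;
    }
  }

  // Percentages rounded to two decimals, like ROUND(100 * ... / SUM(1), 2).
  const pct = (n) => (all.size === 0 ? 0 : Math.round((10000 * n) / all.size) / 100);
  return {
    both: pct(both),
    only_pi: pct(onlyPi),
    only_rd: pct(onlyRd),
    all_pageloads: all.size
  };
}

console.log(classifyPageloads(['a', 'b', 'c'], ['b', 'c', 'd']));
```

With the toy input above, two of the four distinct tokens are present in both lists, one only in the PageIssues list and one only in the ReadingDepth list.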

Initial bug report from QA:

In T191532#4575809 @Ryasmeen noticed that it's possible for PageIssues events to be sent without ReadingDepth

Background

ReadingDepth events are only sent if sendBeacon is available.
If sendBeacon is not available, PageIssues events can still be sent (using the fallback method)
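
The behaviour described in these two points can be sketched roughly as follows. This is a minimal illustration, not the actual instrumentation code: `chooseTransport` and the `'fallback'` label are hypothetical names, and `nav` stands in for the browser's `navigator` object so the snippet can run outside a browser. Only `navigator.sendBeacon` itself is a real API.

```javascript
// Hypothetical sketch: decide how (or whether) a schema can send an event.
function chooseTransport(nav, schema) {
  const hasBeacon = Boolean(nav) && typeof nav.sendBeacon === 'function';
  if (hasBeacon) {
    return 'sendBeacon';
  }
  // Per the notes above: ReadingDepth is not sent without sendBeacon,
  // while PageIssues can still be sent through a fallback method.
  return schema === 'ReadingDepth' ? null : 'fallback';
}

const withBeacon = { sendBeacon() { return true; } };
console.log(chooseTransport(withBeacon, 'ReadingDepth'));
console.log(chooseTransport({}, 'ReadingDepth'));
console.log(chooseTransport({}, 'PageIssues'));
```

A client without sendBeacon therefore shows up as "PageIssues only", which is the pattern this task is about.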

Developer notes

Strangely, @Jdlrobson can replicate this in Safari 11.1.1, even though according to the release notes and Caniuse it should support sendBeacon.

Screen Shot 2018-09-11 at 4.21.09 PM.png (331×414 px, 45 KB)

On the other hand, we see lots of other Safari 11.1.x clients sending ReadingDepth events (T204143#4578937).

While it is not clear how this is possible, it is technically possible given the current state of the code.

We should investigate possible causes and clarify the implications for the reliability of the ReadingDepth data.

AC:

  • @Tbayer to write up outcomes of this investigation and takeaways for data analysis

--> T204143#4895679 and https://meta.wikimedia.org/wiki/Schema_talk:ReadingDepth

Event Timeline

Jdlrobson renamed this task from "It is possible to send PageIssues events without ReadingDepth events" to "ReadingDepth events are not being sent in browsers where navigator.sendBeacon should be supported but in practice isn't". Sep 13 2018, 6:37 PM
Jdlrobson updated the task description.
Tbayer raised the priority of this task from Medium to High. Sep 13 2018, 8:48 PM

Chatted to Tilman and this does not impact the PageIssues analysis. He would like to understand the issue a bit more though. Dropping priority and removing from A/B test blockers.

That's not what I said (looks like we may have had a misunderstanding on Slack when I responded "no" to a different question than the one you had in mind - if so, apologies!). Restoring some previous task settings accordingly.

Regarding the proposal "If navigator.sendBeacon is undefined do not enable Schema:PageIssues" :

That doesn't solve the problem - actually, it would exacerbate it, by extending the problem "our instrumentation sometimes sends data when it should, but sometimes not, and we don't know why" from the ReadingDepth schema to the PageIssues schema.

Change 460441 had a related patch set uploaded (by Jdlrobson; owner: Jdlrobson):
[mediawiki/skins/MinervaNeue@master] Restrict PageIssues schema logging to browsers that support sendBeacon

https://gerrit.wikimedia.org/r/460441

The task as written will guarantee PageIssues and ReadingDepth behave consistently with one another.

The question of why sendBeacon support differs for certain browsers from published information is a separate and curious question that will need further analysis +research (feel free to open a task to investigate how that can happen)

Following up on T204143#4578937, here is a closer look at which Safari 11.x clients send ReadingDepth events in production. Both 11.1.1 and 11.1.2 occur, so we can discard the hypothesis that the discrepancy is just due to sendBeacon having been added in between these two releases. Also, there was one event with @Jdlrobson's exact 11.1.1 user agent during that timespan (not included in the list below, which is limited to UAs with >100 events yesterday).

user_agent | events (Sep 12)
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1.2 Safari/605.1.15 | 730538
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1.2 Safari/605.1.15 | 164367
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1 Safari/605.1.15 | 100163
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1.1 Safari/605.1.15 | 62207
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1 Safari/605.1.15 | 10163
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1.1 Safari/605.1.15 | 10093
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6; Tesseract/1.0) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1.2 Safari/605.1.15 | 151
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/604.3.5 (KHTML, like Gecko) Version/11.0.1 Safari/604.3.5 | 127
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.0 Safari/605.1.15 | 107

Data via

SELECT user_agent, COUNT(*) AS events 
FROM wmf.webrequest WHERE month = 9 AND day = 12 
AND  user_agent_map['browser_major'] = '11'
AND  user_agent_map['browser_family'] = 'Safari'
AND uri_query  LIKE '%ReadingDepth%' 
GROUP BY user_agent
ORDER BY events DESC LIMIT 100;

Also, there was one event with @Jdlrobson's exact 11.1.1 user agent during that timespan (not included in the list below, which is limited to UAs with >100 events yesterday).

Was that me? I was fiddling with ReadingDepth yesterday and overriding the check against production.

No, it was an IP from Germany on the German Wikipedia (can send you the details offline in case they are of interest).

The task as written will guarantee PageIssues and ReadingDepth behave consistently with one another.

The question of why sendBeacon support differs for certain browsers from published information is a separate and curious question that will need further analysis +research (feel free to open a task to investigate how that can happen)

OK, to be a bit clearer. This is a pointless and counterproductive change, and I'm not sure why the idea is still being pursued (cf. T204143#4582135 + T204143#4582202 ). It seems to be based on some mistaken assumptions about the planned data analysis, which does not require the two schemas to "behave consistently with one another" in this sense. It does however require that the instrumentation won't unpredictably fail to send data for browsers where we expect it to work (e.g. based on sendBeacon support per browser vendor's public documentation).

This issue is also happening for Chrome on Android.

For the record: @Jdlrobson is pointing out that this test was done using Chrome for Android 35 (released in 2014), whereas sendBeacon support was only added in Chrome for Android 42. So that means that, fortunately, Safari remains the only browser at this point where we have encountered these inconsistencies so far.

@Ryasmeen - what version of Chrome did you see this on?

Rummana told me this was Chrome 35 on Android, which doesn't support sendBeacon. This is why I've not included it in the bug report description. This is working as expected.

See https://phabricator.wikimedia.org/T204143#4585115

See also T204143#4585115 - I assume we are all talking about the same thing here?

Sorry, I didn't phrase this correctly. I meant to ask if we did not see this behavior on later versions of Chrome (sendBeacon was working as expected for Chrome versions known to support it)

Spoke with @Ryasmeen shortly:

  • Originally reported on Safari 11.1.1
  • Reproduced in Chrome 35
  • We still need to test a more recent version of Chrome

Change 460441 abandoned by Jdlrobson:
Restrict PageIssues schema logging to browsers that support sendBeacon

Reason:
per discussion on task this is not necessary

https://gerrit.wikimedia.org/r/460441

In standup we talked about this bug and agreed to exclude Safari 11.1.1 user agents from analysis.
Is there anything else left to do that's blocking the A/B test or can we close this task?

(If we want to spend time investigating the Safari 11.1.1 issue some more I'd suggest a new task outlining exactly that)

We want an extra confirmation that this is not an issue in later versions of Chrome. @Ryasmeen will be testing this today/tomorrow.

@ovasileva: Checked on the latest version of Chrome on Android (69.0.3497.100). This issue is not occurring there.

Perfect. Thanks @Ryasmeen! In that case, let's go ahead and close this task and confirm that we will be avoiding Safari in our analysis @Tbayer

I have added a note to the schema talk page: https://meta.wikimedia.org/wiki/Schema_talk:ReadingDepth#Likely_broken_on_Safari
(CC @Groceryheist as this will impact his upcoming work as well)

I didn't realise we were excluding all of Safari. That seems a bit extreme imo given we have seen this issue only on 11.1.1 on desktop and we could just exclude that user agent.

I hope this doesn't mean we are excluding iPhone/iPad.
If so, I recommend more testing on different Safari versions to increase our confidence. We have no reason right now to believe that all Safari versions are bad based on 2 desktop browsers.

Remind me, did we do QA for this schema on Mobile Safari? If you and/or @Ryasmeen saw valid events on that browser, I would agree that it's reasonable to assume for now that we can use its data.

I agree it would be good to do further testing to see whether we can narrow down the issue to particular versions, but considering that (per T204143#4578937 ) almost all desktop Safari events come from 11.1 currently, it wouldn't make much of a difference right now.

(Since T153207, raw user agents are no longer available in the EL tables, so we can't efficiently narrow queries down to subversions like 11.1.1.)

iOS Safari sendBeacon support was only added in 11.4 (Mar 2018). Thus older versions of 11 will not have it. Are we seeing events from 11.4 Mobile Safari?

Yes, see above (T204143#4578937) - but also from (clients that are logged as) Mobile Safari 11.0.

Did we do QA for this schema on Mobile Safari?

Tbayer updated the task description.

Now that the A/B test has launched, we have an additional method to check this: At the beginning of a pageview, the page issues instrumentation (from T191532) is supposed to send a pageLoaded event to both the PageIssues and ReadingDepth schemas. Because ReadingDepth depends on (in particular) support for sendBeacon and the Page Visibility API, we expect the ReadingDepth event to be missing for some browsers or browser versions, in particular Safari as discussed earlier. However, it looks like many other browsers apart from Safari are missing ReadingDepth events too ("only_pi" >> 0% in the table below). In particular, Chrome Mobile iOS, Android (stock browser) and (desktop) Chrome seem worth a closer look.

For a ReadingDepth event to fire, it must be true that:

  • JavaScript is enabled
  • There are no JavaScript client side errors at runtime (we'll know more about this when T205582 is live in production)
  • the user is in the sample group (wgWMEReadingDepthSamplingRate) OR the user is in the page issues A/B test
  • There are no Event errors relating to the schema

For a PageIssues event to fire, it's a little less complicated. The following needs to be true:

  • JavaScript is enabled
  • There are no JavaScript client side errors at runtime (we'll know more about this when T205582 is live in production)
  • mw.config.get( 'wgEventLoggingBaseUri' ) must be set
  • The user falls within wgMinervaABSamplingRate
  • The page in question has issues.
  • navigator.doNotTrack is not set
  • There are no Event errors relating to the schema
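
The two checklists above can be written down as predicates, which makes it easier to see where the schemas differ. This is a rough sketch of the lists exactly as written, not the real instrumentation: the function names and every field on the `env` object are hypothetical, and "JavaScript is enabled" is implied by the code running at all.

```javascript
// Hypothetical predicates encoding the two checklists above.
// All `env` field names are made up for illustration.
function canFireReadingDepth(env) {
  return !env.hasClientError &&
    (env.inReadingDepthSample || env.inPageIssuesABTest) &&
    !env.hasSchemaError;
}

function canFirePageIssues(env) {
  return !env.hasClientError &&
    Boolean(env.eventLoggingBaseUri) && // wgEventLoggingBaseUri must be set
    env.inMinervaABSample &&            // within wgMinervaABSamplingRate
    env.pageHasIssues &&
    !env.doNotTrack &&                  // navigator.doNotTrack is not set
    !env.hasSchemaError;
}

const env = {
  hasClientError: false,
  hasSchemaError: false,
  inReadingDepthSample: false,
  inPageIssuesABTest: true,
  eventLoggingBaseUri: '/beacon/event', // placeholder value
  inMinervaABSample: true,
  pageHasIssues: true,
  doNotTrack: false
};
console.log(canFireReadingDepth(env), canFirePageIssues(env));
```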

In cases where ReadingDepth is missing, but a PageIssues event exists, we can expect one of the following to be true:

With regards to the first 2, I'd need more detailed information on the versions data is missing for Chrome Mobile iOS, Android (stock browser) and (desktop) Chrome. Chrome iOS is very different from Chrome for Android (one uses webkit and the other blink for rendering). For desktop, at least Chrome 39 is needed and for Android stock browser I still don't really understand why this browser is still around and I suspect it's in maintenance mode - I wouldn't be surprised if it doesn't support sendBeacon or performance.

3 actionables I can pull out of this:

  • I'd suggest looking at the EventLogging errors
  • Let's see what T205582 tells us about client errors
  • Let's get information on browser versions for the set of events fired for page issues but not reading depth for further investigation. browser family is not enough here.

I'd suggest looking at the EventLogging errors

Quick follow up on this:
I'm seeing at least one issue in kafkacat relating to sectionNumbers being set as null which relates to this.
It occurs on a page which has no issues and I cannot replicate.

ssh stat1004.eqiad.wmnet
kafkacat -C -b kafka-jumbo1001.eqiad.wmnet -t eventlogging_EventError | grep PageIssues

I see at least one event with validation message "None is not of type 'integer'".

{"event": {"code": "validation", "message": "None is not of type 'integer'", "rawEvent": "

Decoding that with decodeURIComponent

"{"event": {"code": "validation", "message": "None is not of type 'integer'", "rawEvent": "?{"event":{"pageTitle":"شهرآورد تهران","namespaceId":0,"pageIdSource":3259010,"issuesVersion":"new2018","issuesSeverity":["DEFAULT"],"sectionNumbers":[null],"isAnon":true,"editCountBucket":"0 
....
webHost":"fa.m.wikipedia.org","wiki":"fawiki"};	cp3030.esams.wmn

The issue is sectionNumbers being set as null:

sectionNumbers":[null],

Cannot replicate myself on this page, but looks like a legit bug in the PageIssues logic which would explain the discrepancy we are seeing.
Am seeing this event quite regularly across various wikis.
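
For anyone repeating this check, the decode-and-inspect step could look roughly like the sketch below. It is an illustration only: `findInvalidSectionNumbers` is a hypothetical helper, and the sample payload is a made-up, heavily trimmed stand-in that keeps just a few of the fields visible in the error above.

```javascript
// Hypothetical helper: decode a URI-encoded event and list sectionNumbers
// entries that would fail an "integer" schema check.
function findInvalidSectionNumbers(encoded) {
  const payload = JSON.parse(decodeURIComponent(encoded));
  const sections = (payload.event && payload.event.sectionNumbers) || [];
  return sections.filter((n) => !Number.isInteger(n));
}

// Made-up sample, trimmed to the fields relevant here.
const sample = encodeURIComponent(JSON.stringify({
  event: { issuesSeverity: ['DEFAULT'], sectionNumbers: [null] },
  wiki: 'fawiki'
}));

console.log(findInvalidSectionNumbers(sample));
```

An empty result means every entry is an integer; for the sample the single `null` entry is returned.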

User agents:

  • Mozilla/5.0 (Linux; Android 9; PH-1 Build/PPR1.181005.034) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.123 Mobile Safari/537.36
  • Mozilla/5.0 (Linux; Android 7.0; SM-G610F Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.137 Mobile Safari/537.36
  • Mozilla/5.0 (Linux; Android 8.0.0; RNE-L21 Build/HUAWEIRNE-L21) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.91 Mobile Safari/537.36
  • Mozilla/5.0 (Linux; Android 7.1.1; TA-1032 Build/NMF26O) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Mobile Safari/537.36

We'd need some real devices to isolate and understand this problem. I cannot replicate on any of the pages it happens, so likely to be browser specific.

[...]

With regards to the first 2, I'd need more detailed information on the versions data is missing for Chrome Mobile iOS, Android (stock browser) and (desktop) Chrome. Chrome iOS is very different from Chrome for Android (one uses webkit and the other blink for rendering). For desktop, at least Chrome 39 is needed and for Android stock browser I still don't really understand why this browser is still around and I suspect it's in maintenance mode - I wouldn't be surprised if it doesn't support sendBeacon or performance.

Here is the table from the task description broken down by browser version (and limited to rows with >= 100 views).

For desktop Chrome we indeed see "only_pi" drop to 0-2% with version 39. For Chrome Mobile iOS and Android, it looks much less clear.

browser | major | minor | both | only_pi | only_rd | all_pageloads
Android0594.663.911.42281
Android2010.6389.370.0207
Android2338.5760.950.48210
Android409.7890.110.134531
Android410.0399.970.087482
Android420.2199.780.0202736
Android430.9499.050.0224806
Android440.9199.060.03102051
Android5061.2637.671.081115
Android5156.0842.581.345824
Android6082.8514.552.65039
Android7085.6512.961.382962
Android7197.051.341.61746
Android8096.072.661.264432
Android8194.963.51.542143
Chrome11032.5267.260.221393
Chrome18015.0384.880.092143
Chrome2500.0100.00.0280
Chrome2600.1699.840.01894
Chrome2800.0199.990.012557
Chrome3000.3599.640.01160436
Chrome3100.3599.650.0849
Chrome3200.099.580.42236
Chrome3300.0199.990.0124059
Chrome3400.0100.00.04601
Chrome3500.0100.00.02270
Chrome3600.0599.950.04252
Chrome3700.3399.660.0137244
Chrome38018.7381.120.161901
Chrome39098.120.611.289081
Chrome40097.631.430.9431069
Chrome41097.930.691.38435
Chrome42097.61.470.934297
Chrome43098.290.890.8223004
Chrome44097.770.481.75627
Chrome45099.350.260.391547
Chrome46098.770.420.87102
Chrome47097.910.511.592960
Chrome48099.00.330.66603
Chrome49098.080.381.534960
Chrome50098.560.421.0311290
Chrome51098.40.271.337830
Chrome52097.920.881.29444
Chrome53097.181.11.725988
Chrome5373060.0100.00.01793
Chrome54097.060.552.395277
Chrome55097.70.511.7912662
Chrome56097.970.651.3830794
Chrome57094.963.691.3533667
Chrome58097.620.431.9510156
Chrome59097.650.571.799185
Chrome60097.930.591.4810340
Chrome61098.190.561.2510321
Chrome62096.861.661.4810982
Chrome63097.770.431.8114511
Chrome64098.050.421.5221378
Chrome65097.950.351.728622
Chrome66097.980.361.6735210
Chrome67097.840.331.8357601
Chrome68097.720.361.9294760
Chrome69098.550.221.231064053
Chrome70098.340.231.431327
Chrome71098.610.550.83721
Chrome Mobile iOS1900.0100.00.0499
Chrome Mobile iOS2300.0100.00.0111
Chrome Mobile iOS2800.0100.00.0124
Chrome Mobile iOS3000.0100.00.0166
Chrome Mobile iOS3100.0100.00.0177
Chrome Mobile iOS3300.0100.00.0199
Chrome Mobile iOS3400.0100.00.0101
Chrome Mobile iOS3500.0100.00.0137
Chrome Mobile iOS3600.0100.00.0154
Chrome Mobile iOS3700.0100.00.0691
Chrome Mobile iOS3900.0100.00.0209
Chrome Mobile iOS4000.0100.00.0276
Chrome Mobile iOS4100.0100.00.0218
Chrome Mobile iOS4200.0100.00.0267
Chrome Mobile iOS4300.0100.00.0608
Chrome Mobile iOS4400.3299.680.0314
Chrome Mobile iOS4500.0100.00.0634
Chrome Mobile iOS4600.0100.00.0350
Chrome Mobile iOS4700.0299.980.04725
Chrome Mobile iOS4806.6193.390.0363
Chrome Mobile iOS4908.2191.790.0463
Chrome Mobile iOS50013.0686.940.0697
Chrome Mobile iOS51013.9786.030.0981
Chrome Mobile iOS5209.690.40.0906
Chrome Mobile iOS53011.0588.890.061773
Chrome Mobile iOS5408.9790.960.071460
Chrome Mobile iOS5508.8290.960.222278
Chrome Mobile iOS5609.1690.670.181714
Chrome Mobile iOS57010.8988.810.32020
Chrome Mobile iOS5806.4593.550.02620
Chrome Mobile iOS5909.0590.950.03304
Chrome Mobile iOS6009.7590.120.133035
Chrome Mobile iOS61013.1886.710.14786
Chrome Mobile iOS62010.2789.70.035969
Chrome Mobile iOS6303.8496.140.0231921
Chrome Mobile iOS64013.8186.090.15017
Chrome Mobile iOS65017.1482.840.025741
Chrome Mobile iOS66020.479.540.067786
Chrome Mobile iOS67020.7579.210.0416317
Chrome Mobile iOS68037.9261.930.1553723
Chrome Mobile iOS69049.6250.230.15257396

Data via

SET hive.mapred.mode=nonstrict;
SELECT 
browser, major, minor,
ROUND(100*SUM(IF((pipageToken IS NOT NULL) AND (rdpageToken IS NOT NULL),1,0))/SUM(1),2) AS both, 
ROUND(100*SUM(IF((pipageToken IS NOT NULL) AND (rdpageToken IS NULL),1,0))/SUM(1),2) AS only_pi, 
ROUND(100*SUM(IF((pipageToken IS NULL) AND (rdpageToken IS NOT NULL),1,0))/SUM(1),2) AS only_rd, 
SUM(1) AS all_pageloads
FROM (
  SELECT 
  IF(pi.pageToken IS NOT NULL, pi.browser, rd.browser) AS browser,
  IF(pi.pageToken IS NOT NULL, pi.major, rd.major) AS major,
  IF(pi.pageToken IS NOT NULL, pi.minor, rd.minor) AS minor,
  pi.pageToken AS pipageToken, rd.pageToken AS rdpageToken
  FROM (
    SELECT useragent.browser_family AS browser,
    useragent.browser_major AS major,
    useragent.browser_minor AS minor,
    event.pageToken AS pageToken
    FROM event.pageissues 
    WHERE year = 2018 AND month = 10 AND day <=7
    AND event.action = 'pageLoaded') AS pi
  FULL OUTER JOIN (
    SELECT useragent.browser_family AS browser,
    useragent.browser_major AS major,
    useragent.browser_minor AS minor,
    event.pageToken AS pageToken
    FROM event.readingdepth
    WHERE year = 2018 AND month = 10 AND day <=7
    AND event.action = 'pageLoaded'
    AND ( event.page_issues_a_sample OR event.page_issues_b_sample )) AS rd
  ON pi.pageToken = rd.PageToken) AS alltokens
WHERE browser IN ('Chrome Mobile iOS', 'Chrome', 'Android')
GROUP BY browser, major, minor
HAVING all_pageloads >= 100
ORDER BY browser, major, minor;

Inspired by a suggestion of @Jdlrobson, here is a version of the above query by iOS version, showing a clear change at iOS 11.3, but also some oddities at earlier versions like 9.1:

os | major | minor | both | only_pi | only_rd | all_pageloads
iOS2080.019.050.95105
iOS3211.7988.020.19526
iOS4067.4931.690.82366
iOS4361.3738.630.0233
iOS5014.9684.640.41491
iOS5142.4157.120.472556
iOS607.4692.470.077681
iOS610.0199.990.024585
iOS700.499.590.0127056
iOS710.0499.960.0147914
iOS805.1294.70.187949
iOS810.0399.970.040893
iOS820.0999.910.09232
iOS830.0299.980.024382
iOS840.0299.980.039227
iOS900.0999.910.022259
iOS9165.7932.591.6265859
iOS920.499.60.056658
iOS930.4399.570.0541170
iOS1000.0299.980.0138981
iOS1010.4799.530.0154626
iOS1020.0599.950.0448834
iOS1030.0999.910.01301205
iOS1103.1996.810.0439571
iOS1110.0899.920.0424893
iOS1120.0299.980.01546486
iOS11397.622.260.121139376
iOS11498.121.770.127677034
iOS12098.491.360.159388477
iOS12198.671.170.1764524

Data via

SET hive.mapred.mode=nonstrict;
SELECT 
os, major, minor,
ROUND(100*SUM(IF((pipageToken IS NOT NULL) AND (rdpageToken IS NOT NULL),1,0))/SUM(1),2) AS both, 
ROUND(100*SUM(IF((pipageToken IS NOT NULL) AND (rdpageToken IS NULL),1,0))/SUM(1),2) AS only_pi, 
ROUND(100*SUM(IF((pipageToken IS NULL) AND (rdpageToken IS NOT NULL),1,0))/SUM(1),2) AS only_rd, 
SUM(1) AS all_pageloads
FROM (
  SELECT 
  IF(pi.pageToken IS NOT NULL, pi.os, rd.os) AS os,
  IF(pi.pageToken IS NOT NULL, pi.major, rd.major) AS major,
  IF(pi.pageToken IS NOT NULL, pi.minor, rd.minor) AS minor,
  pi.pageToken AS pipageToken, rd.pageToken AS rdpageToken
  FROM (
    SELECT useragent.os_family AS os,
    useragent.os_major AS major,
    useragent.os_minor AS minor,
    event.pageToken AS pageToken
    FROM event.pageissues 
    WHERE year = 2018 AND month = 10 AND day <=7
    AND event.action = 'pageLoaded') AS pi
  FULL OUTER JOIN (
    SELECT useragent.os_family AS os,
    useragent.os_major AS major,
    useragent.os_minor AS minor,
    event.pageToken AS pageToken
    FROM event.readingdepth
    WHERE year = 2018 AND month = 10 AND day <=7
    AND event.action = 'pageLoaded'
    AND ( event.page_issues_a_sample OR event.page_issues_b_sample )) AS rd
  ON pi.pageToken = rd.PageToken) AS alltokens
WHERE os = 'iOS'
GROUP BY os, major, minor
HAVING all_pageloads >= 100
ORDER BY os, INT(major), INT(minor);

I took a deep dive into this data today.
I compiled a table, cross checking browser versions with browser capabilities:
https://www.mediawiki.org/wiki/User:Jdlrobson/Page_issues_analysis

In general, the browser capabilities matched what we're seeing.

  • if the majority of events were 100% only page issues (or close to 100%) it was consistent with the lack of support for ReadingDepth (sendBeacon AND NavigationTiming support)
  • if the majority of events were 100% both events (or close to 100%) that was consistent with support for ReadingDepth and PageIssues.

Where we were only seeing ReadingDepth events when we expected both, the margin of error was generally small (<= 10%).
The Android browsers proved the most problematic with discrepancies from 10-50%!

The most peculiar cases were:

  • Seeing ReadingDepth events where ReadingDepth should be impossible.
    • When this happened it was generally a small fraction of our data. Where it was more problematic:
    • Chrome <= 38 on Android 4-7 (Note that sendBeacon was introduced in Chrome 39, so how sendBeacon is being used in these older versions is not 100% clear)
    • iOS Chrome prior to 11.3 (NavigationTiming was disabled in iOS 8.1 and it's unclear when it got re-added and supported in Chrome, but our data seems to indicate 11.3 )
    • the native Android browser.
  • Seeing events where JS is supposed to be disabled
    • Limited only to native Android browser.

From this analysis, I'd strongly recommend ignoring ReadingDepth data coming from Android native browser; iOS Chrome prior to 11.3 and Chrome <=38.
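
As a rough sketch, this recommendation could be turned into a filter like the one below. It is illustrative only: `isReadingDepthTrusted` is a hypothetical name, the field names follow the `useragent.*` columns used in the queries on this task, and two points are assumptions rather than settled decisions: the iOS rule is applied to every iOS browser, and "Chrome" is taken to mean the `Chrome` and `Chrome Mobile` families.

```javascript
// Hypothetical filter for the recommendation above. Values arrive as strings,
// as in the useragent.* columns.
function isReadingDepthTrusted(ua) {
  const browserMajor = parseInt(ua.browser_major, 10);
  const osMajor = parseInt(ua.os_major, 10);
  const osMinor = parseInt(ua.os_minor, 10) || 0;

  // Android native (stock) browser.
  if (ua.browser_family === 'Android') {
    return false;
  }
  // iOS prior to 11.3 (assumption: applied to all iOS browsers, not only Chrome).
  if (ua.os_family === 'iOS' && (osMajor < 11 || (osMajor === 11 && osMinor < 3))) {
    return false;
  }
  // Chrome <= 38 (assumption: covers the 'Chrome' and 'Chrome Mobile' families).
  const chromeFamilies = ['Chrome', 'Chrome Mobile'];
  if (chromeFamilies.includes(ua.browser_family) && browserMajor <= 38) {
    return false;
  }
  return true;
}

console.log(isReadingDepthTrusted({
  browser_family: 'Chrome Mobile iOS', browser_major: '69',
  os_family: 'iOS', os_major: '11', os_minor: '2'
}));
```

Rows with missing or unparseable version fields are not excluded by this sketch; whether they should be is a separate decision.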

I have no exact answer to why we are dropping ReadingDepth events in cases where we should be sending them, but the following theories may provide answers. Given that the ReadingDepth and PageIssues events are sent at different times in the code, a variety of factors could lead to only one being sent. These include:

  • client's web connection speed/stability
  • client side error occurring in either page issues or reading depth
  • event could not be decoded

One other thing that's worth pointing out - ReadingDepth will not run if navigationStart is 0. I'm not sure if that is ever true (per spec it should always be non-zero) but would also account for cases where ReadingDepth is not being sent.
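
A guard of the kind described here might look like the sketch below. This is not the ReadingDepth code itself: `hasUsableNavigationStart` is a hypothetical name, and `perf` stands in for the browser's `window.performance` object so the snippet can run anywhere.

```javascript
// Hypothetical guard: only proceed when Navigation Timing data is present
// and navigationStart is non-zero.
function hasUsableNavigationStart(perf) {
  return Boolean(perf && perf.timing) && perf.timing.navigationStart > 0;
}

console.log(hasUsableNavigationStart({ timing: { navigationStart: 1538352000000 } }));
console.log(hasUsableNavigationStart({ timing: { navigationStart: 0 } }));
console.log(hasUsableNavigationStart(undefined));
```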

None of this accounts for how ReadingDepth events can be sent without sendBeacon support.

To add to this analysis
@Nuria had this to say today:

jdlrobson: do not trust user agents 100% "android 2" could be a who-knows-bot with user agent "android 2" this happens everyday
4:42 PM jdlrobson: or also, could be a misslabeled UA, that is, parser thinks is Android 2 but it is really something else
4:42 PM jdlrobson: this does not happen a lot but it does happens
4:43 PM jdlrobson: i just run some numbers yesterday and by my early estimates 5% of our traffic labeled as "user" is really bots
4:44 PM jdlrobson: so i would not expect 100% consistancy, bots have "made up" UAs

I apologise for the non-sequitur but I mentioned that I'd follow up with the latest ReadingDepth and PageIssues server-side error rates so that the conversation was all happening in one place.

I'm seeing at least one issue in kafkacat relating to sectionNumbers being set as null which relates to this.

For 2018/10/09, the number of erroneous events received by the server for the PageIssues and ReadingDepth schemas are as follows:

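Errors (% of events received by the server):

+---------------+--------+--------+
|    Schema     |  Min   |  Max   |
+---------------+--------+--------+
| PageIssues    | 0.01   | 0.03   |
| ReadingDepth  | 0.001  | 0.005  |
+---------------+--------+--------+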

The maximum is calculated assuming that all events that are categorised as "unknown" were actually events of that schema.
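
The arithmetic behind the minimum and maximum can be reproduced with the sketch below, using the counts for 2018-10-09 reported in this thread. One assumption is labeled in the code: the row counts of the event tables are used as the denominator for "events received by the server". `errorRatePercent` is a hypothetical helper name.

```javascript
// Counts for 2018-10-09 as reported in this thread.
const received = { PageIssues: 12342261, ReadingDepth: 72358845 };
const errors = { PageIssues: 1395, ReadingDepth: 818, unknown: 2847 };

// Assumption: the event table row counts stand in for "events received by the server".
function errorRatePercent(schema) {
  // Minimum: only errors attributed to this schema.
  const min = (100 * errors[schema]) / received[schema];
  // Maximum: additionally count every "unknown" error as belonging to this schema.
  const max = (100 * (errors[schema] + errors.unknown)) / received[schema];
  return { min, max };
}

for (const schema of ['PageIssues', 'ReadingDepth']) {
  const { min, max } = errorRatePercent(schema);
  console.log(schema, min.toFixed(3), max.toFixed(3));
}
```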

Regardless, nice investigation, y'all 💪

[0]
select
    count(*) as n
from
    event.readingdepth
where
    year = 2018 and
    month = 10 and
    day = 9
;

+-----------+
|     n     |
+-----------+
| 72358845  |
+-----------+

select
    count(*) as n
from
    event.pageissues
where
    year = 2018 and
    month = 10 and
    day = 9
;

+-----------+
|     n     |
+-----------+
| 12342261  |
+-----------+

select
    event.schema as schema,
    count(*) as n
from
    event.eventerror
where
    year = 2018 and
    month = 10 and
    day = 9 and

    event.schema in ("ReadingDepth", "PageIssues", "unknown")
group by
    event.schema
;

+---------------+-------+
|    schema     |   n   |
+---------------+-------+
| PageIssues    | 1395  |
| ReadingDepth  | 818   |
| unknown       | 2847  |
+---------------+-------+

I took a deep dive into this data today.
I compiled a table, cross checking browser versions with browser capabilities:
https://www.mediawiki.org/wiki/User:Jdlrobson/Page_issues_analysis

[...]
Thanks again for the analysis and the recommendations!

From this analysis, I'd strongly recommend ignoring ReadingDepth data coming from Android native browser; iOS Chrome prior to 11.3 and Chrome <=38.

I guess that this was meant to read "iOS prior to 11.3", correct? (cf. above)

I'd assume the NavigationTiming issues would exist across all platforms, but you may want to run some checks on iOS Safari, as I've only accounted for Chrome.
It's likely the API in iOS Safari was fixed earlier than Chrome and Chrome reacted to their change later.
That said, to avoid complicated queries using user agents, it's probably best to exclude all of iOS prior to 11.3.

To add to this analysis
@Nuria had this to say today:

jdlrobson: do not trust user agents 100% "android 2" could be a who-knows-bot with user agent "android 2" this happens every day
4:42 PM jdlrobson: or also, could be a mislabeled UA, that is, parser thinks is Android 2 but it is really something else
4:42 PM jdlrobson: this does not happen a lot but it does happen
4:43 PM jdlrobson: i just ran some numbers yesterday and by my early estimates 5% of our traffic labeled as "user" is really bots
4:44 PM jdlrobson: so i would not expect 100% consistency, bots have "made up" UAs

Well yes, we don't expect perfect accuracy with this kind of thing, which is why the task description already said " >> 0%" instead of "!= 0%". (BTW, there are also non-bot clients with forged user agents.)

Regarding Android 2 specifically, note that only a very small number of events were classified as coming from that OS version, see T204143#4650771. (We could double-check the full UA in the webrequest data to see if it's indeed a bug in ua-parser, but that doesn't seem worthwhile right now.)

Regarding undetected bots in general, I'm looking forward to the detection improvements planned for this fiscal year. FWIW, having noted the obvious bot UA "Baiduspider-render" in the data posted in the task description, I had also checked for *detected* bots earlier in the pageissues data, via useragent.is_bot, and their ratio was negligible.

Apropos, @Jdlrobson: The big browser+OS table at https://www.mediawiki.org/wiki/User:Jdlrobson/Page_issues_analysis#Results is great. But I got a bit confused about the "NO" in the "Matches expectations" column for many rows. E.g.

  • Chrome 69 on Android 4.4: "Expected: both" seems to match the data (98.34% both)
  • Chrome 36 on Android 4.4: "Expected: only_pi": seems to match the data (99.95% only_pi)

Is this because only exact 100.0% matches were marked as "YES" in "Matches expectations"?

But I got a bit confused about the "NO" in the "Matches expectations" column for many rows. E.g.

I was looking for 100% matches, yes. I've clarified the table with a "MOSTLY" value.

Jon and I talked about this some days ago, but some of what we determined isn't reflected here yet. A couple of quick thoughts and some additional analysis follow, to hopefully move this forward:

Mobile Safari added support for Navigation Timing in iOS 9.0, not 10.x or 11.x. It was previously available on iOS 8.0, but Apple removed it in 8.1 due to problems with their implementation. It was back in 9.0 and has been available since.

Events from old browsers where our Grade A feature test fails are most certainly the result of User-Agent mangling. The Hive dataset being used in this task is incapable of perfection. This is by design. Being curious and scrupulous is good, but be sure to not have any expectation of it becoming perfect nor to fully understand why it isn't. It should be used to inform holistic information, not individually. Because:

  1. The UA string sent by web clients can be trivially manipulated by users via their browser settings, by browser extensions modifying requests, and by headless processes such as bots, scrapers and crawlers that may or may not accurately expose their true internals via the User-Agent string.
  2. The string is aggregated before it reaches Hive by the ua-parser library, which simplifies long and complex strings like Mozilla/5.0 (Linux; U; Android 2.3.4; en-us; Kindle Fire Build/GINGERBREAD) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1 into something more digestible, such as Android 2.3, or maybe Kindle Fire 6.3, or maybe Mobile Safari 4.0? This is a lossy transformation and inherently produces an incorrect summary for some results. Contrary to what one might think, misclassifications or unknowns rarely end up as "Other", rather they typically end up filed under another name. This is unavoidable due to the complicated history of user-agent strings and their mixed purpose. E.g. most unknowns are insignificant variations of knowns and we want them grouped together.. except when we find them significant, which is hard to define, or at least subjective. The whole system depends on the goodwill and competence of device manufacturers and their users.
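To illustrate why this summarisation is lossy, here is a toy Python sketch. The three rules below are hypothetical simplifications written for this example; they are not the actual ua-parser rule set, which is far larger and ordered differently. The point is only that a single string can legitimately match several rules, so the resulting label depends on which rule wins.

```python
import re

UA = ("Mozilla/5.0 (Linux; U; Android 2.3.4; en-us; Kindle Fire Build/GINGERBREAD) "
      "AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1")


def candidate_labels(ua):
    """Return every label that one of the toy rules could assign to `ua`."""
    labels = []
    # Device-based rule (hypothetical).
    if "Kindle Fire" in ua:
        labels.append("Kindle Fire")
    # OS-based rule (hypothetical): keep only major.minor, drop the patch level.
    match = re.search(r"Android (\d+)\.(\d+)", ua)
    if match:
        labels.append("Android %s.%s" % match.groups())
    # Browser-based rule (hypothetical).
    match = re.search(r"Version/(\d+)\.(\d+) Mobile Safari", ua)
    if match:
        labels.append("Mobile Safari %s.%s" % match.groups())
    return labels


print(candidate_labels(UA))
```

A parser has to pick exactly one of these candidates per field, and whichever it picks discards the information carried by the others.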

The current task title speculates about one of several possible explanations about why the schemas don't have the same number of received events. I can help narrow down the cause when more information is available, but my gut feeling tells me it is highly unlikely that it relates to availability of NavTiming or sendBeacon APIs.

Instead, I suspect the reason is that our code and the browser work fine, but our code just isn't triggered in the first place sometimes. This isn't a single reason, it's a category of reasons, all of which are likely true to some extent:

  • The page can close between event A being sent and event B still awaiting asynchronous code and lazy-loading of modules. This is among the reasons I advocate against client-side event validation and against async abstractions and abstractions such as mw.eventLog.Schema. As part of T187207, I'm collaborating with Analytics to establish a much more direct and lightweight method that effectively provides a straight path to navigator.sendBeacon, which, once reached, provides a fairly strong guarantee of delivery (as far as that is possible over the Internet). Even on slow or intermittent connections, or when the page is closed before the beacon is delivered, the browser is meant to remember beacons and send them whenever it can, even outside the bounds of the tab that was once open. This is why the API was created: a new primitive separate from XHR and Fetch.
  • Network loss. Aside from client-side guarantees, there is also the network. The Internet doesn't provide 100% delivery of requests and responses. Stuff happens. This is normal and expected. It's our responsibility to balance this with a compromise, or a heavy investment in complexity, based on the needs and their relative importance. E.g. sending a beacon from a script without user input won't have the same guarantees as someone deciding to save an edit. In the latter case, the application and the user's browser can provide direct feedback and allow the user to interpret what and how much worked, and whether they are willing to try again. A retry is significant and usually requires user negotiation (or developer negotiation) because it may differ in subtle ways from the original (e.g. one second later is a different timestamp, potentially a different IP address, potentially different cookies and expiry times etc.). Even if such negotiation existed for scripts, it couldn't run after the tab is closed.
  • Response to loss. If a browser tried sending an event and couldn't get confirmation from the server, it doesn't know whether it was delivered. It can try a second time and risk over-representing the event. Or it may be cautious and not send again unless it knows it failed, which may under-represent the event.

Given the size of the anomaly, I'm not sure how much further we should investigate. But do let me know if you find a particular problem or have questions about something, as we should certainly make sure that anything we control works the best it can.

The UA string sent by web clients can be trivially manipulated by users via their browser settings, by browser extensions modifying requests, and by headless processes such as bots, s

+1. As I mentioned to @Jdlrobson, it is very likely that up to 5% of our "user-classified" traffic is actually unidentified bots with fake UAs.

I'm about to post the more detailed summary of the finding and data analysis recommendations that resulted from the above discussion, and then close this task, but just to follow up on some interesting remarks by @Krinkle:

...

Mobile Safari added support for Navigation Timing in iOS 9.0, not 10.x or 11.x. It was previously available on iOS 8.0, but Apple removed it in 8.1 due to problems with their implementation. It was back in 9.0 and has been available since.

This is a discrepancy we weren't able to resolve. (@Krinkle was referring here to the fact that based on our data in T204143#4653278, almost all iOS devices with version 11.3 and newer are sending ReadingDepth events, and almost all with older versions aren't.)
We have been circumventing this issue by conservatively excluding data from all versions prior to 11.3, see T204143#4661935.

Events from old browsers where our Grade A feature test fails are most certainly the result of User-Agent mangling. The Hive dataset being used in this task is incapable of perfection. This is by design. Being curious and scrupulous is good, but be sure to not have any expectation of it becoming perfect nor to fully understand why it isn't.

Agreed, that's why we only focused on the larger discrepancies in this task and let many smaller ones slide (cf. T204143#4663878).

It should be used to inform holistic information, not individually. Because:

  1. The UA string sent by web clients can be trivially manipulated by users via their browser settings, by browser extensions modifying requests, and by headless processes such as bots, scrapers and crawlers that may or may not accurately expose their true internals via the User-Agent string.

Agreed, but that seems to be something to keep in mind every single time we rely on user agent data, not just in this particular investigation ;)

  1. The string is aggregated before it reaches Hive by the ua-parser library, which simplifies long and complex strings like Mozilla/5.0 (Linux; U; Android 2.3.4; en-us; Kindle Fire Build/GINGERBREAD) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1 into something more digestible, such as Android 2.3, or maybe Kindle Fire 6.3, or maybe Mobile Safari 4.0?

True - as mentioned in T204143#4617685, the raw user agent was removed from EL data a while ago, leaving only that ua-parser result. (That said, for recent EL events it should be possible to reconstruct the full UA from the webrequest table while it is not yet purged, but that probably would not have been worth the effort here.)

This is a lossy transformation and inherently produces an incorrect summary for some results. Contrary to what one might think, misclassifications or unknowns rarely end up as "Other", rather they typically end up filed under another name.

Indeed, see also the results of T193578 ...

[...]

Instead, I suspect the reason is that our code and the browser work fine, but our code just isn't triggered in the first place sometimes. This isn't a single reason, it's a category of reasons, all of which are likely true to some extent:

  • The page can close between event A being sent and event B still awaiting asynchronous code and lazy-loading of modules. This is among the reasons I advocate against client-side event validation and against async abstractions and abstractions such as mw.eventLog.Schema. As part of T187207, I'm collaborating with Analytics to establish a much more direct and lightweight method that effectively provides a straight path to navigator.sendBeacon, which, once reached, provides a fairly strong guarantee of delivery (as far as that is possible over the Internet).

Apropos, I noticed that T187207 was closed a couple of days ago. We can't repeat the queries above as the PageIssues schema has been deactivated, but if there are other ways to check whether T187207 has an impact on the kind of issues that had been investigated here, that would be very interesting.

[...]
Given the size of the anomaly, I'm not sure how much further we should investigate. But do let me know if you find a particular problem or have questions about something, as we should certainly make sure that anything we control works the best it can.

No, it seemed that with the findings up to that point we had already covered the most important UA segments that needed to be excluded. That said, insights about the limitations of the ReadingDepth data continue to be relevant and welcome.

Thanks again to everyone who had weighed in with various insights, enabling us to launch the page issues A/B test without much further delay back in October!

We kept this task open as someone still needed to review the rather complex discussion on this ticket, tie up some loose ends and summarize the resulting recommendations for the analysis of data from the ReadingDepth schema. Back in October I already left a preliminary summary at https://meta.wikimedia.org/wiki/Schema_talk:ReadingDepth , which @Groceryheist and I have been using since then.

As far as I am aware, the only question that remained open back then was whether it was too conservative to exclude all Safari mobile clients (in addition to just desktop Safari, and iOS versions prior to 11.3). After reading through everything again and running yet another query[1] of the kind we have been employing in the investigation above, it looks like using mobile Safari on iOS 11.3 and newer should be fine; at least it doesn't exhibit the large discrepancies that motivated us to exclude the other cases.

So to sum up, the final recommendation is to exclude the following user agents from data analysis involving ReadingDepth events:

  • iOS versions prior to 11.3
  • the native Android browser
  • desktop Chrome <=38
  • desktop Safari

I'm also updating https://meta.wikimedia.org/wiki/Schema_talk:ReadingDepth with a readymade Hive clause.
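For analyses done outside Hive, the exclusion list above could be expressed as a small predicate. This is only a sketch under stated assumptions: the dictionary keys mirror the parsed `useragent` fields (`browser_family` and `browser_major` appear in the queries on this task; `os_family`, `os_major` and `os_minor` are assumed here), the label "iOS" for `os_family` is an assumption, and the browser labels "Android", "Chrome" and "Safari" are taken to denote the native Android browser, desktop Chrome and desktop Safari as in the tables on this task. The function name `is_excluded` is hypothetical.

```python
def is_excluded(ua):
    """Return True if a parsed user agent falls under the exclusion list.

    `ua` is a dict of parsed user-agent fields (values may be strings or None).
    """
    def to_int(value):
        # Parsed version components may be missing or non-numeric.
        try:
            return int(value)
        except (TypeError, ValueError):
            return None

    browser = ua.get("browser_family")

    # iOS versions prior to 11.3 (any browser on that OS)
    if ua.get("os_family") == "iOS":
        major = to_int(ua.get("os_major"))
        minor = to_int(ua.get("os_minor")) or 0
        if major is not None and (major, minor) < (11, 3):
            return True

    # the native Android browser
    if browser == "Android":
        return True

    # desktop Chrome <= 38
    if browser == "Chrome":
        major = to_int(ua.get("browser_major"))
        if major is not None and major <= 38:
            return True

    # desktop Safari
    return browser == "Safari"


print(is_excluded({"browser_family": "Mobile Safari", "os_family": "iOS",
                   "os_major": "11", "os_minor": "2"}))
```

Events whose version fields cannot be parsed are kept by this sketch; whether to drop them instead is a judgment call for the analysis at hand.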

[1]

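+----------------+--------+--------+--------+----------+----------+----------------+
|    browser     | major  | minor  |  both  | only_pi  | only_rd  | all_pageloads  |
+----------------+--------+--------+--------+----------+----------+----------------+
| Mobile Safari  | 10     | 0      | 0.07   | 99.93    | 0.0      | 1609573        |
| Mobile Safari  | 10     | 1      | 0.0    | 100.0    | 0.0      | 12610          |
| Mobile Safari  | 10     | 2      | 0.0    | 100.0    | 0.0      | 37442          |
| Mobile Safari  | 10     | 3      | 0.04   | 99.96    | 0.0      | 107775         |
| Mobile Safari  | 11     | 0      | 80.79  | 19.12    | 0.09     | 10294206       |
| Mobile Safari  | 11     | 1      | 0.54   | 99.46    | 0.0      | 37000          |
| Mobile Safari  | 11     | 2      | 0.04   | 99.96    | 0.0      | 132207         |
| Mobile Safari  | 11     | 3      | 82.82  | 17.02    | 0.16     | 12832          |
| Mobile Safari  | 11     | 4      | 89.65  | 10.19    | 0.16     | 104214         |
| Mobile Safari  | 12     | 0      | 99.43  | 0.42     | 0.15     | 9080654        |
| Mobile Safari  | 12     | 1      | 98.06  | 1.51     | 0.43     | 465            |
| Mobile Safari  | 3      | 1      | 76.85  | 22.22    | 0.93     | 108            |
| Mobile Safari  | 4      | 0      | 34.34  | 65.3     | 0.37     | 1095           |
| Mobile Safari  | 5      | 0      | 40.0   | 59.76    | 0.24     | 415            |
| Mobile Safari  | 5      | 1      | 39.79  | 59.68    | 0.53     | 3212           |
| Mobile Safari  | 6      | 0      | 1.97   | 98.02    | 0.01     | 30218          |
| Mobile Safari  | 6      | 1      | 0.0    | 100.0    | 0.0      | 814            |
| Mobile Safari  | 7      | 0      | 0.17   | 99.82    | 0.01     | 62173          |
| Mobile Safari  | 7      | 1      | 0.0    | 100.0    | 0.0      | 3432           |
| Mobile Safari  | 8      | 0      | 0.42   | 99.56    | 0.01     | 99883          |
| Mobile Safari  | 8      | 1      | 0.0    | 100.0    | 0.0      | 3526           |
| Mobile Safari  | 8      | 2      | 0.0    | 100.0    | 0.0      | 717            |
| Mobile Safari  | 8      | 3      | 0.0    | 100.0    | 0.0      | 1976           |
| Mobile Safari  | 8      | 4      | 0.0    | 100.0    | 0.0      | 3154           |
| Mobile Safari  | 9      | 0      | 1.17   | 98.82    | 0.01     | 544644         |
| Mobile Safari  | 9      | 1      | 0.0    | 100.0    | 0.0      | 1487           |
| Mobile Safari  | 9      | 2      | 0.0    | 100.0    | 0.0      | 4381           |
| Mobile Safari  | 9      | 3      | 0.0    | 100.0    | 0.0      | 41213          |
| Safari         | NULL   | NULL   | 15.0   | 83.65    | 1.36     | 1107           |
| Safari         | 10     | 0      | 0.0    | 100.0    | 0.0      | 123            |
| Safari         | 10     | 1      | 0.0    | 100.0    | 0.0      | 397            |
| Safari         | 11     | 0      | 0.39   | 99.61    | 0.0      | 254            |
| Safari         | 11     | 1      | 82.11  | 17.72    | 0.18     | 1710           |
| Safari         | 12     | 0      | 95.67  | 3.32     | 1.01     | 2587           |
| Safari         | 4      | 0      | 83.44  | 15.72    | 0.84     | 477            |
| Safari         | 8      | 0      | 66.8   | 32.79    | 0.41     | 244            |
| Safari         | 9      | 1      | 0.0    | 100.0    | 0.0      | 120            |
+----------------+--------+--------+--------+----------+----------+----------------+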

Data via

SET hive.mapred.mode=nonstrict;
SELECT 
browser, major, minor,
ROUND(100*SUM(IF((pipageToken IS NOT NULL) AND (rdpageToken IS NOT NULL),1,0))/SUM(1),2) AS both, 
ROUND(100*SUM(IF((pipageToken IS NOT NULL) AND (rdpageToken IS NULL),1,0))/SUM(1),2) AS only_pi, 
ROUND(100*SUM(IF((pipageToken IS NULL) AND (rdpageToken IS NOT NULL),1,0))/SUM(1),2) AS only_rd, 
SUM(1) AS all_pageloads
FROM (
  SELECT 
  IF(pi.pageToken IS NOT NULL, pi.browser, rd.browser) AS browser,
  IF(pi.pageToken IS NOT NULL, pi.major, rd.major) AS major,
  IF(pi.pageToken IS NOT NULL, pi.minor, rd.minor) AS minor,
  pi.pageToken AS pipageToken, rd.pageToken AS rdpageToken
  FROM (
    SELECT useragent.browser_family AS browser,
    useragent.browser_major AS major,
    useragent.browser_minor AS minor,
    event.pageToken AS pageToken
    FROM event.pageissues 
    WHERE year = 2018 AND month = 10 AND day <=7
    AND event.action = 'pageLoaded') AS pi
  FULL OUTER JOIN (
    SELECT useragent.browser_family AS browser,
    useragent.browser_major AS major,
    useragent.browser_minor AS minor,
    event.pageToken AS pageToken
    FROM event.readingdepth
    WHERE year = 2018 AND month = 10 AND day <=7
    AND event.action = 'pageLoaded'
    AND ( event.page_issues_a_sample OR event.page_issues_b_sample )) AS rd
  ON pi.pageToken = rd.PageToken) AS alltokens
WHERE browser LIKE '%Safari'
GROUP BY browser, major, minor
HAVING all_pageloads >= 100
ORDER BY browser, major, minor;
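The classification logic of the query (a full outer join on pageToken, then bucketing each token by which side is present) can be sketched in a few lines of Python. This is an illustration only: the function name `classify_pageloads` and the sample tokens are made up, and unlike the Hive query it deduplicates tokens by using sets, so it would differ if a pageToken occurred more than once in either schema.

```python
def classify_pageloads(pi_tokens, rd_tokens):
    """Percentage of page tokens seen in both schemas, only in PageIssues,
    or only in ReadingDepth, plus the total number of distinct tokens."""
    pi, rd = set(pi_tokens), set(rd_tokens)
    all_tokens = pi | rd  # counterpart of the FULL OUTER JOIN on pageToken
    total = len(all_tokens)
    if total == 0:
        return {"both": 0.0, "only_pi": 0.0, "only_rd": 0.0, "all_pageloads": 0}
    return {
        "both": round(100 * len(pi & rd) / total, 2),
        "only_pi": round(100 * len(pi - rd) / total, 2),
        "only_rd": round(100 * len(rd - pi) / total, 2),
        "all_pageloads": total,
    }


# Made-up tokens: "a" and "b" reached both schemas, "c" and "d" only
# PageIssues, "e" only ReadingDepth.
result = classify_pageloads(["a", "b", "c", "d"], ["a", "b", "e"])
print(result)
```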

It would also be wise to exclude events that occur (for one entity) at too high a rate, even if they are marked as user; those probably indicate automated traffic. You can set a high threshold: events from one entity with more than, say, 30 requests per minute are probably automated. User agents in those cases mean very little, and that type of data is just going to add more noise.

Thanks for the suggestion! I'm not going to incorporate it into the data recommendation outcomes from this task for now, considering that this potential issue sounds more like something that would affect EventLogging in general, or at least multiple schemas. (The primary purpose of this task was to determine whether this particular schema shows widespread unexpected behaviour for entire browser families or (ranges of) browser versions.)

That said, it's indeed an intriguing question and I'm inclined to run some queries to understand how this might affect EL in general. What definition of "entity" would you suggest for investigating it? How about the combination of IP and (raw) user agent?

(The ReadingDepth schema in particular also contains the page token which can be used to catch extraneous events sent during the same page view; @Groceryheist had already been excluding those for the purposes of the Reading Time investigation and IIRC they were very infrequent.)

It affects EL in general, but not all events alike: many bots click around links in pages, so schemas that capture clicks are affected most prominently by bots, meaning that their numbers are more distorted.
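The rate threshold discussed above could be prototyped roughly as follows. This is a sketch under assumptions, not an agreed-upon method: an "entity" is taken to be the (IP, raw user agent) pair floated earlier in this thread, the threshold of 30 per minute is the example figure mentioned above, and fixed one-minute buckets are used for simplicity (a sliding window would flag somewhat more). The function name and sample data are hypothetical.

```python
from collections import Counter


def flag_high_rate_entities(events, max_per_minute=30):
    """Return the entities with more than `max_per_minute` events in any
    fixed one-minute bucket.

    `events` is an iterable of (entity, unix_timestamp_seconds) pairs.
    """
    per_minute = Counter((entity, int(ts) // 60) for entity, ts in events)
    return {entity for (entity, _minute), n in per_minute.items()
            if n > max_per_minute}


# Made-up sample: one entity sends 31 events within a single minute,
# another sends 3 events spread over several minutes.
suspect = ("203.0.113.7", "ExampleBot/1.0")
reader = ("198.51.100.23", "Mozilla/5.0 (example)")
events = [(suspect, ts) for ts in range(31)]
events += [(reader, ts) for ts in (5, 70, 200)]

flagged = flag_high_rate_entities(events)
print(flagged)
```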