== Symptom
Between 8 and 20% of logged Popups events contain either duplicated link interaction tokens or are identical events [0]. Furthermore, very nearly all of these events (~98% per T161769#3195922) are logged by Firefox, including the latest version. This was first written up in {T161769}. This task attempts to summarise what we know we know and what we know we don't know so that we can work to isolate the source of the event duplication.
== Hypotheses
=== 1. Page Previews (PP) is logging the duplicate events
NOTE: This was invalidated by our implementation of the client-side deduplicator, the instrumentation for which is available on [the Page Previews Perf dashboard](https://grafana.wikimedia.org/dashboard/db/reading-web-page-previews?refresh=1m&orgId=1).
The most likely source of the duplicate events is PP. We tested this by instrumenting the piece of code that receives events for logging from the central state container, [the `eventLogging` change listener](https://github.com/wikimedia/mediawiki-extensions-Popups/blob/master/src/changeListeners/eventLogging.js) (see {T163198}). We began discarding both types of duplicate event when {c44fddf8cdfd4c8f3b9e59ee6ab10c3ea0e0b719} was deployed on Thursday, June 1st.
While he was reviewing those changes, @Jhernandez submitted {293d7ebe8dd84880cb8e7e8163d7d908e48baf21}, which, when deployed on June 1st, all but eradicated the duplicate events generated by PP [1][2].
It's also worth noting that this kind of duplication has already been observed in {T142667}. The MediaViewer, WikimediaBlogVisit, and NavigationTiming schemas, as well as a previous revision of the Popups schema, are all affected differently. I've also observed this kind of duplication in the ReadingDepth schema (see T161769#3270707).
=== 2. PP is logging events too frequently
This still needs to be tested.
Some time ago, I created [the TestEventDuplication schema](https://meta.wikimedia.org/wiki/Schema:TestEventDuplication) and [an associated script](https://gist.github.com/phuedx/9e47367349c1796191a7814e496c9d20) that would try to provoke event duplication. The script can be run directly from the browser's console on any MediaWiki instance that has `$wgEventLoggingBaseUri` config variable set to a production beacon URI, e.g. `"https://en.wikipedia.org/beacon/event"`.
=== 3. The UA is at fault
We mentioned above that very nearly all of these events are logged by Firefox, including the latest version. This is in line with earlier informal observations about the behaviour of Firefox (see T142667#2891207).
In T166115#3285770, I observed that the majority of the ReadingDepth schema's duplicate events are logged by Chrome and Chrome Mobile. If the level of duplication were constant, then we could write this off as a bug in the implementation. It isn't, however, and [the implementation](https://github.com/wikimedia/mediawiki-extensions-WikimediaEvents/blob/master/modules/ext.wikimediaEvents.readingDepth.js#L187-L227) is relatively straightforward. This doesn't strictly invalidate the above, it simply highlights that all browsers are capable of logging duplicate events.
=== 4. ~~The OS is at fault~~
This was invalidated early on in {T161769}:
>>! In T161769#3195922, @Tbayer wrote:
> … the distribution by OS doesn't seem to deviate too much from the [[https://analytics.wikimedia.org/dashboards/browsers/#desktop-site-by-os| general distribution]].
=== 5. ~~The duplicates are being generated by Varnish/varnishkafka~~
After a thorough explanation of the server-side EventLogging pipeline from @Ottomata, I'm fairly confident that this hypothesis can be invalidated. As I understand it, every entry in the webrequest table has the hostname of the Varnish server and that instance's internal sequence ID for that request. If the duplicate events were being duplicated by anything after Varnish in the pipeline, then the sequence IDs should be duplicated.
In T161769#3194893, for example, the HTTP requests are all served by the same Varnish server and have different sequence IDs. Querying the hostname and sequence of all Popups events logged in June yields the same result [4].
=== 6. ~~The events are being generated by a Kafka consumer~~
NOTE: Invalidated per T167391#3402475.
In T161769#3194893 and T161769#3247604 we see the same duplication in the `log.Popups_16364296` MySQL and `webrequest` Hadoop tables, which are written to by different Kafka consumer that subscribe to different topics.
It might be worth checking for discrepancies between the `log.Popups_16364296` MySQL table, which has messages from the `eventlogging-valid-mixed` topic written to it, and those in the corresponding Hadoop table, which has messages from the `eventlogging_Popups` topic written to it. However, this wouldn't validate this hypothesis given the above.
=== ~~7. The events are being duplicated because the code is being loaded more than once~~
NOTE: Invalidated per T167391#3384593 and T167391#3402980 (as well as other informal observations on related tasks).
It's possible a gadget or piece of code in a project/user's Common.js is causing Popups JS code to be loaded again leading to multiple events being loaded.
It would be worth checking if the duplicate events only appear on certain projects.
=== 8. A bot/person is replaying the events with malicious intent/accidentally to drive Tilman insane.
It's possible to replay any XHR request either using the JS console or programatically by accident. How does EventLogging deal with these?
---
```lang=mysql,name=[0],lines=13
+----------+--------+-------+-------+
| ymd | nDupes | n | % |
+----------+--------+-------+-------+
| 20170525 | 11028 | 58367 | 18.89 |
| 20170526 | 7910 | 49686 | 15.92 |
| 20170527 | 7257 | 38524 | 18.84 |
| 20170528 | 7160 | 43665 | 16.40 |
| 20170529 | 7157 | 60953 | 11.74 |
| 20170530 | 7419 | 59861 | 12.39 |
| 20170531 | 5518 | 58975 | 9.36 |
| 20170601 | 14204 | 64431 | 22.05 |
| 20170602 | 13068 | 59266 | 22.05 |
| 20170603 | 5052 | 43240 | 11.68 |
| 20170604 | 7249 | 52642 | 13.77 |
| 20170605 | 4902 | 59737 | 8.21 |
| 20170606 | 4837 | 58618 | 8.25 |
| 20170607 | 7060 | 57580 | 12.26 |
| 20170608 | 761 | 12960 | 5.87 |
+----------+--------+-------+-------+
select
dupesPerDay.ymd,
nDupes,
n,
round( 100 * nDupes / n, 2) as '%'
from
(
select ymd, sum(nDupes) AS nDupes from
(
select
left(timestamp, 8) as ymd,
count(*) AS nDupes
from
log.Popups_16364296
where
date(timestamp) >= '2017-05-25'
and date(timestamp) <= '2017-06-08'
and event_linkInteractionToken is not null
group by
ymd,
event_linkInteractionToken
having nDupes > 1
) AS dupesPerDayInt
group by ymd
) as dupesPerDay
join
(
select
left(timestamp, 8) as ymd,
count(*) AS n
from
log.Popups_16364296
where
date(timestamp) >= '2017-05-25'
and date(timestamp) <= '2017-06-08'
and event_linkInteractionToken is not null
group by
ymd
) as nPerDay
on dupesPerDay.ymd = nPerDay.ymd
order by ymd;
```
[1] https://grafana.wikimedia.org/dashboard/db/reading-web-page-previews?refresh=1m&orgId=1&from=1496188800000&to=1496275200000
[2] https://grafana.wikimedia.org/dashboard/db/reading-web-page-previews?refresh=1m&orgId=1&from=1496880000000&to=1496966400000
[4] {P5558,lines=14}