
Document event duplication in Reading Web maintained EL instrumentation
Closed, Resolved (Public)

Description

TODO

TODONE

  • Run queries for ReadingDepth and Popups schemas.
  • Document the results on this task and on the "Extraneous EL events investigation summary [draft]" document.

Event Timeline

For the ReadingDepth schema:

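-- Count duplicated pageLoaded events per browser family for 10-19 May 2017.
select
  -- Extract the value of "browser_family" from the JSON-like userAgent string.
  -- 19 is the length of the literal '"browser_family": "'.
  substring(
    userAgent,
    instr(userAgent, '"browser_family": "') + 19,
    instr(substring(userAgent, instr(userAgent, '"browser_family": "') + 19), '"') - 1
  ) as BrowserFamily,
  -- Number of (userAgent, event_pageToken) pairs that were logged more than once.
  count(*) as Dupes
from (
  select userAgent, count(*) as nDupes
  from log.ReadingDepth_16325045
  where timestamp like '2017051%'
    and event_pageToken is not null
    and event_action = 'pageLoaded'
  group by userAgent, event_pageToken
  having nDupes > 1
) as dupeList
group by BrowserFamily
order by Dupes desc;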

+------------------+-------+
| BrowserFamily    | Dupes |
+------------------+-------+
| Chrome           |   463 |
| Chrome Mobile    |    33 |
| Safari           |    15 |
| Mobile Safari    |    11 |
| Firefox          |    11 |
| IE               |     4 |
| Edge             |     3 |
| Samsung Internet |     2 |
| AppEngine-Google |     2 |
| Yandex Browser   |     1 |
| Python Requests  |     1 |
| IE Mobile        |     1 |
+------------------+-------+
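
For readers who find the string offsets in the query hard to follow, here is a minimal Python sketch of the same two-step logic. It assumes that the userAgent column holds a JSON object with a browser_family key, which the offsets in the query suggest but the task does not state. The function name count_dupes_by_browser, the in-memory row format, and the sample rows (including the "pageUnloaded" action) are made up for illustration and are not taken from the log table.

```python
import json
from collections import Counter


def count_dupes_by_browser(rows):
    """Count duplicated page tokens per browser family.

    `rows` is an iterable of (user_agent_json, page_token, action) tuples,
    a hypothetical in-memory stand-in for log.ReadingDepth_16325045.
    """
    # Inner query: count pageLoaded events per (userAgent, pageToken).
    per_token = Counter(
        (user_agent, token)
        for user_agent, token, action in rows
        if token is not None and action == "pageLoaded"
    )

    # Outer query: keep pairs seen more than once, then count them per family.
    dupes = Counter()
    for (user_agent, _token), n in per_token.items():
        if n > 1:
            # Assumption: userAgent is valid JSON with a browser_family key.
            family = json.loads(user_agent)["browser_family"]
            dupes[family] += 1

    # Mirror "order by Dupes desc".
    return dupes.most_common()


# Made-up sample rows, not real log data.
chrome = '{"browser_family": "Chrome", "os_family": "Windows"}'
firefox = '{"browser_family": "Firefox", "os_family": "Linux"}'
rows = [
    (chrome, "t1", "pageLoaded"),
    (chrome, "t1", "pageLoaded"),
    (chrome, "t2", "pageLoaded"),
    (firefox, "t3", "pageLoaded"),
    (firefox, "t3", "pageLoaded"),
    (firefox, "t3", "pageUnloaded"),
    (chrome, "t4", "pageLoaded"),
    (chrome, "t4", "pageLoaded"),
]
print(count_dupes_by_browser(rows))  # [('Chrome', 2), ('Firefox', 1)]
```

Note that, as in the query, the Dupes figure counts page tokens that were logged more than once, not the number of surplus events.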

Here's a dump of the data that I've collected while investigating whether the frequency of logged events triggers duplication.

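select
  -- timestamp and event_testLength are not in the group by clause, so this
  -- relies on the server returning a value from one row of each group
  -- (e.g. MySQL with ONLY_FULL_GROUP_BY disabled).
  timestamp as t,
  event_testID as 'Test ID',
  event_testLength as 'Test Length (Expected)',
  -- Number of events actually logged for the test run.
  count(*) as 'Test Length (Actual)'
from
  log.TestEventDuplication_16757884
group by event_testID;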

+----------------+------------------+------------------------+----------------------+
| t              | Test ID          | Test Length (Expected) | Test Length (Actual) |
+----------------+------------------+------------------------+----------------------+
| 20170526083002 | 1d79cb8ed87c8bda |                     50 |                   50 |
| 20170514053424 | 265fbd258eb06c3c |                     20 |                   20 |
| 20170514072830 | 3c016715ca71fe09 |                     50 |                   50 |
| 20170514070226 | 63bd7c82e4739056 |                     50 |                   50 |
| 20170526082851 | a41956a47e5dca74 |                     50 |                   50 |
| 20170514053637 | d65f157cc11ae213 |                     20 |                   20 |
| 20170514053654 | ea4fae0f8bf7c06d |                     20 |                   20 |
+----------------+------------------+------------------------+----------------------+
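
The comparison the table supports is whether, for each test run, the number of logged events matches the number the test intended to send. A minimal Python sketch of that check follows. It assumes each logged event is available as a (test_id, expected_length) pair; the function name find_mismatched_tests and the input shape are hypothetical and not part of the instrumentation.

```python
from collections import Counter


def find_mismatched_tests(events):
    """Return {test_id: (expected, actual)} for runs whose counts differ.

    `events` is an iterable of (test_id, expected_length) pairs, one per
    logged event; a hypothetical stand-in for rows of
    log.TestEventDuplication_16757884.
    """
    actual = Counter()
    expected = {}
    for test_id, expected_length in events:
        actual[test_id] += 1
        expected[test_id] = expected_length

    # actual > expected suggests duplication; actual < expected suggests loss.
    return {
        test_id: (expected[test_id], count)
        for test_id, count in actual.items()
        if count != expected[test_id]
    }


# Run 265fbd258eb06c3c from the table: 20 expected, 20 logged.
clean_run = [("265fbd258eb06c3c", 20)] * 20
# Made-up run in which one event was logged twice.
duplicated_run = [("example-run", 3)] * 4

print(find_mismatched_tests(clean_run + duplicated_run))
# {'example-run': (3, 4)}
```

For the seven runs listed in the table, the expected and actual lengths match, so this check would return an empty dict for them.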
phuedx updated the task description.