Page MenuHomePhabricator

Weird German sessions before/since 2.1.141
Closed, ResolvedPublic1 Estimated Story Points

Description

Something appears to be wrong with how the MobileWikiAppSessions schema measures sessions on dewiki - and it changed drastically with the rollout of 2.1.141 on February 15:

Android app pageviews per session, dewiki vs rest, November 2015-May 2016.png (439×711 px, 18 KB)

Android app sessions by version, dewiki, November 2015-May 2016.png (626×1 px, 54 KB)

Android app sessions by version, all wikis, November 2015-May 2016.png (649×1 px, 68 KB)

Android app sessions per user (largest wikis), monthly, November 2015-Arpil 2015.png (496×802 px, 30 KB)

There are some options for attempting to isolate the cause further by examining this along some other dimensions (e.g. perhaps it's certain click sources, counted in the schema's "from" fields, that behave differently on dewiki). But hopefully someone from the team already has an idea what's going on here.

Data sources (all queries measuring production releases only, i.e. leaving out beta versions):

SELECT YEARWEEK(LEFT(timestamp, 8)) AS yearweek, AVG(event_totalPages) AS avgtotalPages FROM log.MobileWikiAppSessions_14031591 WHERE userAgent LIKE '%-r-%' AND wiki = 'dewiki' GROUP BY yearweek ORDER BY yearweek;
SELECT YEARWEEK(LEFT(timestamp, 8)) AS yearweek, AVG(event_totalPages) AS avgtotalPages FROM log.MobileWikiAppSessions_14031591 WHERE userAgent LIKE '%-r-%' AND wiki <> 'dewiki' GROUP BY yearweek ORDER BY yearweek;

SELECT LEFT(timestamp, 8) AS date, COUNT(*) AS sessions, MID(LEFT(userAgent,INSTR(userAgent,' ')-1),INSTR(userAgent,'/')+1) AS app_version FROM log.MobileWikiAppSessions_14031591 WHERE userAgent LIKE '%-r-%' GROUP BY date, app_version ORDER BY date, app_version
SELECT LEFT(timestamp, 8) AS date, COUNT(*) AS sessions, MID(LEFT(userAgent,INSTR(userAgent,' ')-1),INSTR(userAgent,'/')+1) AS app_version FROM log.MobileWikiAppSessions_14031591 WHERE userAgent LIKE '%-r-%' AND wiki = 'dewiki' GROUP BY date, app_version ORDER BY date, app_version
SELECT LEFT(timestamp, 6) AS yearmonth, COUNT(*)/COUNT(DISTINCT event_appInstallID) AS sessionsperuser FROM log.MobileWikiAppSessions_14031591 WHERE userAgent LIKE '%-r-%' GROUP BY yearmonth ORDER BY yearmonth

See also Spreadsheet

Event Timeline

May be related to T135854. If the app crashes and restarts, +1 session and possibly no page views. There's some ringing in the previous version on upgrades but I'm not sure how that fits with this hypothesis.

@Niedzielski Thanks, I did a quick test of that hypothesis by looking at the percentage of sessions recorded on dewiki with zero page views. It does not show a noticeable change corresponding to that jump in mid-February, so that seems unlikely to be the entire reason.

ratio of Android app sessions with no pageviews, dewiki, November 2015-May 2016.png (371×600 px, 13 KB)

Data source:

SELECT YEARWEEK(LEFT(timestamp, 8)) AS yearweek, SUM(1) AS totalsessions, SUM(IF(event_totalPages = 0,1,0)) AS sessions_with_no_pageviews, SUM(IF(event_totalPages = 0,1,0))/SUM(1) AS ratio_of_sessions_with_no_pageview FROM log.MobileWikiAppSessions_14031591 WHERE userAgent LIKE '%-r-%' AND wiki = 'dewiki' GROUP BY yearweek ORDER BY yearweek;

@Tbayer So far, I'm as puzzled as you are. Can we get raw pageviews for dewiki that date range, and compare them with the session data?

When I re-run your queries, I'm getting "wrong checksum" errors from the server... in fact, I get the same error when passing any kind of LIKE parameter for userAgent (for this particular schema).

@Tbayer So far, I'm as puzzled as you are. Can we get raw pageviews for dewiki that date range, and compare them with the session data?

I already did that for monthly data, see the second sheet. There was no analogous increase in pageviews in February. If you can think of a more fine-grained pageview query that might be useful, let me know.

When I re-run your queries, I'm getting "wrong checksum" errors from the server... in fact, I get the same error when passing any kind of LIKE parameter for userAgent (for this particular schema).

Ugh. I don't recall encountering this error ever. Perhaps @jcrespo has insight into what it means?

...

When I re-run your queries, I'm getting "wrong checksum" errors from the server... in fact, I get the same error when passing any kind of LIKE parameter for userAgent (for this particular schema).

Ugh. I don't recall encountering this error ever. Perhaps @jcrespo has insight into what it means?

PS: I (re-)ran a shorter version of one of the queries in the task without issues:

mysql:research@analytics-store.eqiad.wmnet [(none)]> SELECT table_name, TABLE_ROWS FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA ='staging' ORDER BY TABLE_ROWS DESC;
...
73 rows in set (23 min 8.74 sec)

BTW did you run them on analytics-store or elsewhere? Just in case it has anything to do with T131236.

Aha, I was connecting to s1-analytics-slave... When I connect to s2, the queries work.

Dbrant claimed this task.

Please reopen if there's still more work to be done here.