Investigate rise in iOS app pageviews since around December 20, 2016
Closed, ResolvedPublic


This looks a bit too good to be true:

iOS app pageviews by version, Nov 14, 2016 - Jan 5, 2017 (Pivot).png (768×1 px, 90 KB)

(Source: Pivot)

Event Timeline

Restricted Application added a subscriber: Aklapper.

@Tbayer does 5.3.4 have the same issue? It was released on January 9th so it didn't quite make it onto the chart you posted.

[edit] Nevermind - just found the source link

I went through and ran 5.3.1 and 5.3.4 and browsed some articles while looking at network traffic. I don't see 5.3.4 making more calls to api.php?action=mobileview than 5.3.1 did. One thing that did change was that the WebView gained WikipediaApp as part of its user agent string. However, the WebView user agent doesn't have the app version information - it's in this format: Mozilla/5.0 (iPhone; CPU iPhone OS 10_2 like Mac OS X) AppleWebKit/602.3.12 (KHTML, like Gecko) WikipediaApp. Also, the WebView mostly makes requests to our internal proxy which grabs the HTML from mobile view. Going to keep investigating.

BTW, in case it may be relevant here, recall also the recent discussion on user agents and the pageview definition at T148663 (with @JAllemandou et al).

@Tbayer it looks like if a user saves a page without viewing it first, the app erroneously makes 2 or more api.php?action=mobileview requests. This is a bug introduced in 5.3.2 which seems to have caused the spike in pageviews. Previously only one request was made. Investigating a fix.

JMinor triaged this task as Medium priority.
JMinor moved this task from Needs Triage to Bug Backlog on the Wikipedia-iOS-App-Backlog board.

@JoeWalsh Great find! But are save actions frequent enough to explain a rise of this magnitude? From Piwik, it seems not.

@Tbayer it doesn't seem like it would be enough, even after accounting for Piwik only sampling 10% of events. I have a patch up for that fix, but will keep looking for something else to explain the roughly 4x rise in pageviews. It's odd that nothing is jumping out - to account for a rise that big, the extra requests would need to be happening on every article view, but looking at the requests there's at most one mobileview API call per article view, same as it was in 5.3.1.

Perhaps the next step here is to look at a sample of the logs for this version on the api side and see if there is anything unusual about the user agents?

JMinor added a subscriber: JMinor.

@JoeWalsh can you provide a full request example from the app (ie. the full http request including headers)?

GET /w/api.php?action=query&continue=&exchars=525&exintro=1&exlimit=3&explaintext=&format=json&generator=search&gsrinfo=&gsrlimit=3&gsrnamespace=0&gsroffset=0&gsrprop=redirecttitle&gsrsearch=morelike%3ATudor%20period&gsrwhat=text&ns=ppprop&pilimit=3&piprop=thumbnail&pithumbsize=640&prop=pageterms%7Cpageimages%7Cpageprops%7Crevisions%7Cextracts&rrvlimit=1&rvprop=ids&wbptterms=description HTTP/1.1
Accept: */*
Cookie: WMF-Last-Access=25-Jan-2017;
User-Agent: WikipediaApp/ (iOS 10.2; Phone)
Accept-Language: en
Accept-Encoding: gzip
Connection: keep-alive

I ran a test as follows:

  1. switch on the phone
  2. start the Wikipedia app (TestFlight version "5.4.0 (1074)" )
  3. tap on today's featured article in the explore feed
  4. scroll to the bottom of the article

This generated only one pageview in the webrequest table, as expected.

I may repeat this test for some other user actions - suggestions welcome.

SELECT dt, uri_host, uri_path, uri_query
FROM wmf.webrequest
WHERE client_ip = '...'  -- Tilman's home IP, no other users at that time 
AND year = 2017 AND month = 2 AND day = 13 AND hour = 8
AND is_pageview
AND user_agent_map['os_family'] = 'iOS';

dt	uri_host	uri_path	uri_query
2017-02-13T08:43:52	/w/api.php	?action=mobileview&format=json&noheadings=true&page=Marvel%20Science%20Stories&prop=sections%7Ctext%7Clastmodified%7Clastmodifiedby%7Clanguagecount%7Cid%7Cprotection%7Ceditable%7Cdisplaytitle%7Cthumb%7Cdescription%7Cimage%7Crevision&sectionprop=toclevel%7Cline%7Canchor%7Clevel%7Cnumber%7Cfromtitle%7Cindex&sections=all&thumbwidth=640
1 row selected (40.373 seconds)

Update: I ran some further tests as above, but didn't notice any anomalies yet. Pageviews were registered as expected, including:

  • scrolling down the feed for one week (i.e. beyond the featured article from one week ago): no pageviews, as expected
  • entering a search term and tapping on a search result: one pageview for the opened article only
  • tap the related article link at the bottom of an article: one pageview for the linked article only
  • tapping "Randomizer" repeatedly: one pageview for each shown article only

@Tbayer @JMinor if saved articles are somehow deleted from the filesystem, duplicate requests can occur (I'm seeing 1-3 per article). So there's the potential that users had part of or all of their saved page list re-requested if either a filesystem migration was botched or somehow the OS is deleting files that we don't want to be deleted. @AMroczkowski is going to investigate further to see if anything caused saved articles to be deleted from the filesystem.

Do our users visit and/or save about 4-5 pages per session? The app may have been over-reporting even before it was severely over-reporting. For example, before the spike on Dec 11th, iTunes Connect shows about 430K sessions, whereas the page view count is about 2 million.

[Edit] Corey pointed out that iTunes Connect doesn't fully report sessions - only 26% of users of the app agreed to share data, so the actual sessions number is probably about 3-4x higher.
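As a rough sanity check of the figures above, the following sketch scales the iTunes Connect session count by the opt-in rate and divides the pageview count by the result. It assumes that opted-in users are representative of all users and that both figures cover the same time period, neither of which is confirmed in this thread; the variable names are illustrative.

```python
# Figures quoted in this thread; assumed to refer to the same time period.
itc_sessions = 430_000   # sessions reported by iTunes Connect (opted-in users only)
opt_in_rate = 0.26       # share of app users who agreed to share data
pageviews = 2_000_000    # pageviews recorded on the server side

# Assumption: opted-in users behave like everyone else, so the real
# session count is the reported one divided by the opt-in rate.
scale_factor = 1 / opt_in_rate
estimated_sessions = itc_sessions * scale_factor
pages_per_session = pageviews / estimated_sessions

print(f"scale factor:       {scale_factor:.2f}x")
print(f"estimated sessions: {estimated_sessions:,.0f}")
print(f"pages per session:  {pages_per_session:.2f}")
```

This is only a back-of-the-envelope estimate; the per-session metrics tracked in Hive are the more reliable source.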

From what we can tell, I believe the pages/session is more in the 2-3 per user range. This is based on using session counts from iTC vs. piwik article page views. I believe @Tbayer has official pages/user counts he tracks for both apps.

@Tbayer we have a beta that went out Friday with the version number 5.4.0 to a few hundred users. Is there any way we can query for those pageviews and get any indication of if this is fixed?

JMinor added a subscriber: AMroczkowski.

This obviously depends on the beta adoption rate, persuasive power of the beta invitation mail sent to Mobile-l etc. ;) - but we can at least compare with the previous beta rollout of 5.3.4 on January 3. That one peaked a bit higher (529/day compared to 350/day for 5.4), but that doesn't seem to indicate a significant difference. So I would expect that the new beta did not fix the issue.

Pageviews for 5.4 (announced as beta on March 3):

Wikimedia iOS app pageviews version 5.4 2017-02-14..2017-03-06.png (843×1 px, 38 KB)

(Source: Pivot)

Pageviews for 5.3.4 (announced as beta on January 3):

Wikimedia iOS app pageviews version 5.3.4 2016-12-25..2017-01-08.png (834×1 px, 37 KB)

(Source: Pivot)

Yes, we do measure pageviews per session (for opted-in readers; if you have access to Hive, you can find this in the wmf.mobile_apps_session_metrics_by_os table). The metrics derived from this that we usually track had not changed notably at the end of December: 10th, 50th and 90th percentile. But I just checked the maximum instead (i.e. the number of pages viewed during the longest session each week), and something clearly happened there:

Wikipedia iOS app - Pageviews of the largest session (per week).png (563×554 px, 47 KB)

So besides something going wrong in the app (or its communication with the servers), we should also consider the possibility of bots. We know that e.g. Googlebot emulates the Android app (T117631), although those should be filtered out in the overall pageview data seen in Pivot (I don't know offhand if the sessions data also excludes them, it needs to be better documented). In any case, I'll take a look soon at the ids that are generating the most pageviews.

Data source:

SELECT * FROM wmf.mobile_apps_session_metrics_by_os WHERE os_family = 'iOS' AND type = 'PageviewsPerSession';
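To illustrate why the percentiles stayed flat while the maximum jumped, here is a minimal sketch using made-up session data (not taken from the Hive table): a thousand ordinary sessions plus one runaway client. The `summarize` helper and the sample values are assumptions for illustration only.

```python
import statistics

# Hypothetical pageviews-per-session values for one week.
ordinary = [1] * 300 + [2] * 250 + [3] * 200 + [5] * 130 + [9] * 120
runaway = 89_220  # one client stuck re-requesting the same page

def summarize(values):
    """Return the 10th, 50th and 90th percentile and the maximum."""
    # quantiles(n=10) returns the 9 cut points between deciles.
    deciles = statistics.quantiles(values, n=10)
    return {
        "p10": deciles[0],
        "p50": deciles[4],
        "p90": deciles[8],
        "max": max(values),
    }

without_outlier = summarize(ordinary)
with_outlier = summarize(ordinary + [runaway])

print("without outlier:", without_outlier)
print("with outlier:   ", with_outlier)
```

In this toy data set a single extreme session leaves all three percentiles unchanged and only shows up in the maximum, which is why the usual dashboards would not have flagged it.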

@Tbayer great find, that's definitely interesting - I'll take another pass looking at what changed to see if there's anywhere the app could get in an infinite loop re-requesting the same article or list of articles.

Still digging deeper into this (and need to write up more results). But I took a look at the two app IDs generating the most pageviews yesterday. And for both, the high number was caused by an excessive amount of action=mobileview API requests for a Wikipedia main page (alongside some normal-looking action=query requests).

SELECT uri_query, COUNT(*) AS views
FROM wmf.webrequest
WHERE year = 2017 AND month = 3 AND day = 9
AND x_analytics_map['wmfuuid'] = '[ID with the most views that day]'
GROUP BY uri_query;

uri_query	views
?action=mobileview&format=json&noheadings=true&page=Main%20Page&prop=sections%7Ctext%7Clastmodified%7Clastmodifiedby%7Clanguagecount%7Cid%7Cprotection%7Ceditable%7Cdisplaytitle%7Cthumb%7Cdescription%7Cimage%7Crevision&sectionprop=toclevel%7Cline%7Canchor%7Clevel%7Cnumber%7Cfromtitle%7Cindex&sections=all&thumbwidth=640	89220
?action=query&continue=&exchars=525&exintro=1&exlimit=3&explaintext=&format=json&generator=search&gsrinfo=&gsrlimit=3&gsrnamespace=0&gsroffset=0&gsrprop=redirecttitle&gsrsearch=morelike%3A[actual enwiki article title]&gsrwhat=text&ns=ppprop&pilimit=3&piprop=thumbnail&pithumbsize=640&prop=pageterms%7Cpageimages%7Cpageprops%7Crevisions%7Cextracts&rrvlimit=1&rvprop=ids&wbptterms=description	2
?action=query&continue=&exchars=525&exintro=1&exlimit=1&explaintext=&format=json&ns=ppprop&pilimit=1&piprop=thumbnail&pithumbsize=640&prop=pageterms%7Cpageimages%7Cpageprops%7Crevisions%7Cextracts&rrvlimit=1&rvprop=ids&titles=Main%20Page&wbptterms=description	1
?action=query&format=json&meta=siteinfo&siprop=general	1
16 rows selected (342.338 seconds)

SELECT uri_query, COUNT(*) AS views
FROM wmf.webrequest
WHERE year = 2017 AND month = 3 AND day = 9
AND x_analytics_map['wmfuuid'] = '[ID with the second most views that day]'
GROUP BY uri_query;

uri_query	views
?action=mobileview&format=json&noheadings=true&page=Wikipedia%3AHauptseite&prop=sections%7Ctext%7Clastmodified%7Clastmodifiedby%7Clanguagecount%7Cid%7Cprotection%7Ceditable%7Cdisplaytitle%7Cthumb%7Cdescription%7Cimage%7Crevision&sectionprop=toclevel%7Cline%7Canchor%7Clevel%7Cnumber%7Cfromtitle%7Cindex&sections=all&thumbwidth=640	71429
?action=query&format=json&meta=siteinfo&siprop=general	3
?action=query&continue=&exchars=525&exintro=1&exlimit=3&explaintext=&format=json&generator=search&gsrinfo=&gsrlimit=3&gsrnamespace=0&gsroffset=0&gsrprop=redirecttitle&gsrsearch=morelike%3A[actual dewiki article title]&gsrwhat=text&ns=ppprop&pilimit=3&piprop=thumbnail&pithumbsize=640&prop=pageterms%7Cpageimages%7Cpageprops%7Crevisions%7Cextracts&rrvlimit=1&rvprop=ids&wbptterms=description	2
?action=query&continue=&exchars=525&exintro=1&exlimit=1&explaintext=&format=json&ns=ppprop&pilimit=1&piprop=thumbnail&pithumbsize=640&prop=pageterms%7Cpageimages%7Cpageprops%7Crevisions%7Cextracts&rrvlimit=1&rvprop=ids&titles=Wikipedia%3AHauptseite&wbptterms=description	2
?action=query&continue=&exchars=525&exintro=1&exlimit=20&explaintext=&format=json&generator=search&gsrinfo=&gsrlimit=20&gsrnamespace=0&gsroffset=0&gsrprop=redirecttitle&gsrsearch=morelike%3AWikipedia%3AHauptseite&gsrwhat=text&ns=ppprop&pilimit=20&piprop=thumbnail&pithumbsize=640&prop=pageterms%7Cpageimages%7Cpageprops%7Crevisions%7Cextracts&rrvlimit=1&rvprop=ids&wbptterms=description	2
10 rows selected (444.398 seconds)
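Result rows like the ones above can be summarized per API action and page title with a few lines of Python. This is a sketch only: the rows below are abbreviated stand-ins for the real query strings, and `views_by_action_and_page` is a hypothetical helper, not part of any Wikimedia tooling.

```python
from collections import Counter
from urllib.parse import parse_qs

# Abbreviated (uri_query, views) rows in the shape of the output above.
rows = [
    ("?action=mobileview&format=json&page=Main%20Page&sections=all", 89220),
    ("?action=query&format=json&titles=Main%20Page", 1),
    ("?action=query&format=json&meta=siteinfo&siprop=general", 1),
]

def views_by_action_and_page(rows):
    """Sum views per (action, page) pair; page is None when absent."""
    totals = Counter()
    for uri_query, views in rows:
        # parse_qs also decodes percent-escapes such as %20.
        params = parse_qs(uri_query.lstrip("?"))
        action = params.get("action", [None])[0]
        page = params.get("page", [None])[0]
        totals[(action, page)] += views
    return totals

totals = views_by_action_and_page(rows)
top_key, top_views = totals.most_common(1)[0]
share = top_views / sum(totals.values())

print(top_key, top_views, f"{share:.4%}")
```

Grouping this way makes it obvious when a single action/page combination accounts for nearly all of one client's requests.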


@Tbayer that info about the requests being for the main page helped me track down the issue - if the user has the main page in their saved pages list, the app gets stuck re-requesting it in an infinite loop.

@JMinor The problem is around logic that causes the main page to always be re-requested even if it's cached. Should users not be able to save the main page? Should the saved articles fetcher go around that logic and not allow it to be always re-requested?

The user should be able to "save" the main page, and it should be re-requested whenever the user views it, but should not otherwise be re-requested or kept up to date in the background.
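The failure mode described above can be sketched roughly as follows. This is not the app's actual code (which is not shown in this thread); the function, its parameters and the `max_requests` cap are all hypothetical, and the cap exists only so that the sketch terminates.

```python
def fetch_saved_articles(saved, cache, always_refresh, respect_cache,
                         max_requests=10):
    """Fetch saved articles until none is pending; return the request log.

    `always_refresh` stands in for the rule that the main page is never
    considered up to date. With respect_cache=False (the buggy behaviour)
    that rule also applies to the background fetcher.
    """
    requests = []

    def needs_fetch(title):
        if title in always_refresh and not respect_cache:
            return True  # buggy path: the cached copy is ignored
        return title not in cache

    while len(requests) < max_requests:
        pending = [t for t in saved if needs_fetch(t)]
        if not pending:
            break  # everything is cached, the fetcher can stop
        for title in pending:
            requests.append(title)  # one API request per pending title
            cache.add(title)
    return requests

saved = ["Main Page", "Tudor period"]
buggy = fetch_saved_articles(saved, set(), {"Main Page"}, respect_cache=False)
fixed = fetch_saved_articles(saved, set(), {"Main Page"}, respect_cache=True)

print("buggy:", buggy)
print("fixed:", fixed)
```

In the buggy variant the main page is always reported as pending, so the loop only stops at the artificial cap; in the fixed variant the background fetcher trusts the cache, while viewing the page could still trigger a fresh request elsewhere.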

Leaving this open and assigned to me for confirmation once users have updated to 5.4.

Per discussion on IRC earlier this week, I just looked into this again. The takeaway is that after the fix was deployed, pageviews indeed went *almost* back to normal levels, although among those affected by the bug earlier, there still seem to be some unlucky stragglers whose app hasn't yet updated to 5.4 - as of last week, the largest session still had 33k views.

iOS app pageviews, Nov 14, 2016 - Jan 5, 2017 (Pivot).png (742×1 px, 65 KB)
(Source: Pivot)

iOS app pageviews by version, Nov 14, 2016 - Apr 6, 2017 (Pivot).png (741×1 px, 124 KB)

(Source: Pivot)

Wikipedia iOS app - Pageviews of the largest session (per week) ..week until 2017-4-1.png (563×554 px, 50 KB)
(Source, for those with SWAP access)

Epilogue: I have tried to reconstruct, or rather approximate, the actual (human) pageviews by removing main pages of 12 popular Wikipedias (plus the redirect "Hauptseite" on dewiki, which for some reason a lot of 5.3.x clients had been accessing instead of "Wikipedia:Hauptseite"). The result of the query (below) is less than 1% below the true result when checking for the time in December before the bug (i.e. it does not remove a lot of actual views), but it appears to remove most (if not 100%) of the extraneous views.
It indicates that the currently recorded numbers are still 15-25% too high, because of clients which have not yet updated from the buggy versions. I'm going to use the adjusted query for the time being, which is good enough for correcting our cross-platform mobile and total traffic numbers, but still does not quite seem accurate enough to resume trend analysis for the iOS app's views at the moment.

iOS app pageviews, corrected vs. uncorrected.png (621×1 px, 42 KB)

SELECT year, month, day, CONCAT(year,'-',LPAD(month,2,'0'),'-',LPAD(day,2,'0')) as date,
    SUM(view_count) AS views  -- assumed aggregate (summing pageview_hourly.view_count)
FROM wmf.pageview_hourly
WHERE (year = 2017 OR (year = 2016 AND month = 12))
AND access_method = 'mobile app'
AND user_agent_map['os_family'] = 'iOS' AND agent_type = 'user'
AND NOT ( -- See
                 -- 10 most viewed projects on the iOS app (Jan 2017):
                        (   project = 'en.wikipedia' AND
                                page_title = 'Main_Page'  )
                     OR (   project = 'de.wikipedia'
                                AND page_title = 'Wikipedia:Hauptseite'  )
                        OR  -- for some strange reason a lot of 5.3.x clients access the redirect instead:
                        (   project = 'de.wikipedia'
                                AND page_title = 'Hauptseite'  )
                     OR (   project = 'fr.wikipedia'
                                AND page_title = 'Wikipédia:Accueil_principal'  )
                     OR (   project = 'ja.wikipedia' AND
                                page_title = 'メインページ'   )
                     OR (   project = 'nl.wikipedia'
                                AND page_title = 'Hoofdpagina'  )
                     OR (   project = 'es.wikipedia' AND
                                page_title = 'Wikipedia:Portada'  )
                     OR (   project = 'it.wikipedia'
                                AND page_title = 'Pagina_principale'  )
                     OR (   project = 'ru.wikipedia' AND
                                page_title = 'Заглавная_страница'  )
                     OR (   project = 'zh.wikipedia'
                                AND page_title = 'Wikipedia:首页'  )
                     OR (   project = 'sv.wikipedia' AND
                                page_title = 'Portal:Huvudsida'  )
                     OR (   project = 'fi.wikipedia' AND
                                page_title = 'Wikipedia:Etusivu'  )
                     OR (   project = 'pl.wikipedia' AND
                                page_title = 'Wikipedia:Strona_główna'  )
    )
GROUP BY year, month, day ORDER BY year, month, day LIMIT 1000;

Thanks @Tbayer for the thorough follow up.

Epilogue: Over a year later, there still seem to be some faulty clients out there, generating tens of thousands of spurious pageviews per day. (Concretely: In January 2018, the effect of the correction in T154735#3212581 was still around 2%, whereas on each day from Dec 1-Dec 18 2016 - right before the bug occurred - it had been between 0.7% and 0.9%, which serves as an estimate of the overcorrection involved.) But that's low enough now that I'm going to stop using this correction for our general pageview data from January on.
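One way to read these percentages: the "effect of the correction" is the share of views removed by the main-page filter, and subtracting the pre-bug baseline gives a rough range for the share that is actually spurious. The sketch below assumes that the share of genuine main-page views has stayed the same since December 2016; the function names and the example counts are illustrative, not taken from the data.

```python
def correction_share(all_views, corrected_views):
    """Share of views removed by the main-page filter."""
    if all_views <= 0:
        raise ValueError("all_views must be positive")
    return (all_views - corrected_views) / all_views

def excess_over_baseline(share_now, baseline_low=0.007, baseline_high=0.009):
    """Range of the removed share left after subtracting the pre-bug baseline."""
    return (share_now - baseline_high, share_now - baseline_low)

# Hypothetical daily counts that reproduce a 2% correction effect.
share_now = correction_share(1_000_000, 980_000)
low, high = excess_over_baseline(share_now)

print(f"correction effect: {share_now:.1%}")
print(f"estimated spurious share: {low:.1%} to {high:.1%}")
```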