
[SPIKE] As a user I'd like to see the top most read articles for my Wikipedia in the Explore feed
Closed, ResolvedPublic

Description

Use the Pageviews API to add a feed section showing the most popular current articles on your language's Wikipedia.

Top viewed pages will be timely and relevant, and give readers a way to see what other readers are into right now on Wikipedia. It's a natural fit for the purpose and format of the Explore feed.

Step one is to investigate the current suitability and capabilities of the Pageview API. We can then get into potential design and product details.
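For reference, the investigation centers on the `pageviews/top` endpoint of the Wikimedia REST API. A minimal sketch of building the request URL and reading the response envelope (the URL shape and field names match the public API; the sample payload and its numbers are illustrative only):

```python
# Sketch of the Pageviews API "top" endpoint.
# Real URL shape: /metrics/pageviews/top/{project}/{access}/{year}/{month}/{day}
def top_articles_url(project="en.wikipedia", access="all-access",
                     year=2016, month=1, day=15):
    base = "https://wikimedia.org/api/rest_v1/metrics/pageviews/top"
    return f"{base}/{project}/{access}/{year}/{month:02d}/{day:02d}"

# Illustrative payload mirroring the API's documented response shape:
sample_response = {
    "items": [{
        "project": "en.wikipedia",
        "access": "all-access",
        "articles": [
            {"article": "Main_Page", "views": 18793503, "rank": 1},
            {"article": "-", "views": 6564, "rank": 2},
            {"article": "David_Bowie", "views": 2073930, "rank": 3},
        ],
    }]
}

def extract_articles(payload):
    """Flatten the top-articles list out of the response envelope."""
    return payload["items"][0]["articles"]
```

Note that `Main_Page` and the mysterious `-` show up in real results, which is exactly the filtering problem discussed below.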

Event Timeline

JMinor raised the priority of this task to Needs Triage.
JMinor updated the task description. (Show Details)
JMinor renamed this task from As a user I'd like to see the top most read articles for my Wikipedia in the Explore feed to [SPIKE] As a user I'd like to see the top most read articles for my Wikipedia in the Explore feed.Jan 15 2016, 8:42 PM
JMinor set Security to None.

Grabbed this as a fun weekend ticket.

See the PR for a WIP screen recording animation:
https://github.com/wikimedia/wikipedia-ios/pull/388

Will investigate the other open questions around using this API probably (here and there) next week. Already spoke to Dario a bit and was going to chat with Kevin next.

So far, @Mhurd has found the following issues w/ integrating the API:

  • Not able to limit number of results
    • Additionally, will eventually want to paginate results
  • Need to perform aggregation of article extract, page image, wikidata desc. etc. (i.e. "card" data)
  • Need to (heuristically?) filter results to only specific namespaces and/or restrict certain pages (e.g. Main Page, and whatever "-" is)
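Pending server-side support, the first and third points can be worked around on the client by filtering and truncating the full result list. A hedged sketch (the blacklist entries `Main_Page` and `-` come straight from this task; everything else is an assumption):

```python
# Client-side workaround: drop known non-article rows, then truncate
# to the desired count. `articles` is the list of
# {"article": ..., "views": ..., "rank": ...} dicts from the API.
BLACKLIST = {"Main_Page", "-"}  # titles called out in this task

def top_n_filtered(articles, n=10, blacklist=BLACKLIST):
    kept = [a for a in articles if a["article"] not in blacklist]
    return kept[:n]

# Illustrative input with made-up view counts:
articles = [
    {"article": "Main_Page", "views": 18793503, "rank": 1},
    {"article": "-", "views": 6564, "rank": 2},
    {"article": "David_Bowie", "views": 2073930, "rank": 3},
    {"article": "Alan_Rickman", "views": 1399193, "rank": 4},
]
```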

@Milimetric are there any plans to address these? on a larger scale, how suitable is this API for end-user client consumption?

So far, @Mhurd has found the following issues w/ integrating the API:

@Milimetric are there any plans to address these? on a larger scale, how suitable is this API for end-user client consumption?

I'll re-order the bullet points below and answer, but please create a task for each thing you'd like us to work on, and assign it what you see as the priority. You can tag it Analytics and ping me if we ignore it for too long.

  • Not able to limit number of results

Easy to do

  • Additionally, will eventually want to paginate results

Also easy to do, but the server would just simulate this, since it's really fast for us to just get all 1000. We won't be able to get more than 1000, though.
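Since the full result set is capped at 1000 rows and cheap to fetch, the pagination described here can be simulated on either side by slicing. A minimal sketch:

```python
def paginate(results, page, per_page=50):
    """Simulated pagination over the full (max 1000-row) result set.
    Pages are 1-indexed; an out-of-range page yields an empty list."""
    start = (page - 1) * per_page
    return results[start:start + per_page]

rows = list(range(1000))  # stand-in for the 1000 top-article rows
```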

  • Need to perform aggregation of article extract, page image, wikidata desc. etc. (i.e. "card" data)

I personally feel like there should be a separate API for this, see more below

  • Need to (heuristically?) filter results to only specific namespaces and/or restrict certain pages (e.g. Main Page, and whatever "-" is)

This, like the aggregation above, is hard right now: we don't have a great way to join to MediaWiki data to get this information. A join on article title works for the happy cases, but article renames, weird characters, and other annoying corner cases make it an imperfect thing. We are working on getting page_id into the pageview data pipeline, and then all of this will be much easier.
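The fragility of the title join comes down to normalization: pageview rows use underscore-form titles, while other sources may carry display variants, and renames break the mapping entirely. A sketch of the happy-path normalization, which is exactly the part that works (renames and odd characters still fall through, which is the gap the page_id pipeline work is meant to close):

```python
def normalize_title(title):
    """Happy-path join key: spaces -> underscores, uppercase first letter.
    Renamed pages and unusual characters are NOT handled here; those are
    the corner cases that make a pure title join imperfect."""
    t = title.strip().replace(" ", "_")
    return t[:1].upper() + t[1:] if t else t
```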

@Milimetric thanks!

I personally feel like there should be a separate API for this, see more below

Agreed, this is more of a long-term aspiration, much like the pagination. In fact, we might not need to paginate the pageview API if the end-client doesn't use it directly.
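Until that separate API exists, the "card" data can be batched from the MediaWiki action API, one request per chunk of titles. A hedged sketch of building such a query URL (`extracts`, `pageimages`, and `pageterms` are real MediaWiki API prop modules; the exact prop set the app would want is an assumption):

```python
from urllib.parse import urlencode

def card_query_url(titles, lang="en"):
    """Build one MediaWiki action API request fetching the extract,
    lead image thumbnail, and Wikidata description for a batch of titles."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "extracts|pageimages|pageterms",
        "exintro": 1,                # extracts: intro section only
        "explaintext": 1,            # extracts: plain text, no HTML
        "piprop": "thumbnail",       # pageimages: thumbnail URL
        "wbptterms": "description",  # pageterms: Wikidata description
        "titles": "|".join(titles),
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)
```

Batching matters here: one round trip per chunk of top articles, rather than one per card.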

This, like the aggregation above, is hard right now...

Is it not feasible to include the page's namespace when querying the data, filtering at that point? Even exposing the namespace at the output so end/middleware clients can filter is better than nothing.
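Until the namespace is exposed in the output, one client-side heuristic is to drop titles whose prefix before a colon matches a known namespace name. A sketch (the namespace set here is a small illustrative subset, not the full list for any wiki, and colons can legitimately appear inside article titles):

```python
# Heuristic namespace filter: mainspace titles have no recognized
# "Namespace:" prefix. Only known prefixes are dropped, because titles
# like "Star_Wars:_Episode_IV" contain colons legitimately.
KNOWN_NAMESPACES = {"Special", "Talk", "User", "Wikipedia", "File",
                    "Template", "Category", "Portal", "Help"}  # subset

def is_mainspace(title):
    prefix, sep, _ = title.partition(":")
    return not (sep and prefix in KNOWN_NAMESPACES)
```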

@Milimetric on second thought, I'll just file tasks for some of this so we can discuss in detail there. The namespace/filtering issue is the biggest dilemma, as we can get by with client-side aggregation until an appropriate aggregation strategy is devised for the server side.

@Milimetric one question you didn't address was the stability of the API itself. the Swagger spec declares it as "experimental," would you caution us against using it?

@Milimetric one question you didn't address was the stability of the API itself. the Swagger spec declares it as "experimental," would you caution us against using it?

Sorry about that. I think it's stable enough to be used. The output is being cached, and we tried to make sure the cache doesn't get too fragmented by the parameters (one of the main reasons we went with RESTBase). So whereas the API itself can only be hit around 200 times per second, the caching should allow us to use it for user-facing features. If we're starting to implement these kinds of features on the sites and apps, though, we want to know so we can make some improvements we have prepared (like hosting a router for the API on each wiki so the client doesn't have to do two DNS lookups). But, for now, yes, this should be stable enough to use. The "experimental" label is kind of over-cautious copy-pasting at this point :)

Since this is the spike ticket, are we spiked enough? There's a follow-on ticket for the actual implementation:
T124716

JMinor triaged this task as Medium priority.Jan 28 2016, 9:50 PM

Resolving spike, the output of which was the task for the actual feature: T124716