Create a public REST endpoint for Trending Edits API
Closed, Resolved | Public | 5 Estimated Story Points

Description

As an API consumer I would like to request a list of articles from a public endpoint that have been edited the most over a certain time period.

A/C

  • I only see pages which have been updated within the configured time period (config.purge_strategy.max_age)
  • The pages are sorted by a score (algorithm will be based on Weekipedia).
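A minimal sketch of the two acceptance criteria above, assuming each candidate page already carries a numeric score and a last-updated timestamp. The field names `score` and `updated`, the helper `select_trending`, and the 24-hour window are hypothetical; the actual scoring algorithm (based on Weekipedia) is not reproduced here.

```python
from datetime import datetime, timedelta, timezone


def select_trending(pages, now, max_age):
    """Keep pages updated within max_age of now, highest score first."""
    cutoff = now - max_age
    fresh = [p for p in pages if p["updated"] >= cutoff]
    return sorted(fresh, key=lambda p: p["score"], reverse=True)


# Made-up candidates, for illustration only.
now = datetime(2016, 11, 4, 18, 0, tzinfo=timezone.utc)
candidates = [
    {"title": "A", "score": 10, "updated": now - timedelta(hours=1)},
    {"title": "B", "score": 93, "updated": now - timedelta(minutes=5)},
    {"title": "C", "score": 50, "updated": now - timedelta(days=2)},
]

# max_age stands in for config.purge_strategy.max_age; 24 hours is an assumed value.
result = select_trending(candidates, now, max_age=timedelta(hours=24))
```

Page C is dropped because it falls outside the window, and the remaining pages come back ordered by score.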

Route: /v1/feed/trending-edits

Response:

{
  "pages": [
    {
      "$merge": [ "https://en.wikipedia.org/api/rest_v1/page/summary/Statewide opinion polling for the United States presidential election, 2016" ],
      "totalEdits": 15, // TBC: Do apps want to show this or can this be removed?
      "trendiness": 93, // score
      "isNew": false,
      "trendedAt": "2016-11-04T16:19:38.737Z"
    }
  ],
  "ts": "2016-11-04T18:10:21.325Z"
}
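From a consumer's point of view, the response above could be read roughly as follows. This is only a sketch: the sample body is a trimmed, comment-free copy of the draft response, `parse_ts` is a hypothetical helper, and the timestamp format is inferred from the example values rather than from any published spec.

```python
import json
from datetime import datetime, timedelta, timezone

# Trimmed copy of the draft response, without the inline comments.
SAMPLE = """
{
  "pages": [
    {
      "totalEdits": 15,
      "trendiness": 93,
      "isNew": false,
      "trendedAt": "2016-11-04T16:19:38.737Z"
    }
  ],
  "ts": "2016-11-04T18:10:21.325Z"
}
"""


def parse_ts(value):
    """Parse a timestamp such as 2016-11-04T18:10:21.325Z into an aware datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


body = json.loads(SAMPLE)
generated = parse_ts(body["ts"])

# How long before the response was generated did each page start trending?
ages = [generated - parse_ts(page["trendedAt"]) for page in body["pages"]]
trended_today = [age < timedelta(hours=24) for age in ages]
```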

Event Timeline

Fjalapeno added subscribers: bearND, GWicke.

@GWicke @bearND Does this need to be under “feed” as well?

@Fjalapeno, I think that makes sense, yes. It fits the description of project-global, time-varying data we had in mind when we created this hierarchy.

I'd put it under /feed, yes. Will there be other trending types? If so, /feed/trending/edit sounds like a winner to me. If we will have only one trending feed, then /feed/trending ought to suffice.

Mind expounding on the difference between the top-level fields page and pages?

To avoid caching issues with vandalised content, we would want to implement the same content hydration feature here as we do for the featured feed endpoint. Content hydration means that the service only emits the title (URI) of the articles, and RESTBase populates it with summary content.

We've developed a generalised content hydration mechanism based on the $merge directive, so the service should emit it right away; see T148798 for details.
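To make the hydration idea concrete, here is a rough sketch of what a $merge expansion step could look like. It is not RESTBase's implementation: the `hydrate` function, the injected `fetch` callable, and the rule that the service's own fields win over fetched ones are all assumptions for illustration, and the summary content is made up. T148798 has the real semantics.

```python
def hydrate(node, fetch):
    """Recursively replace "$merge" directives with the content they point to.

    fetch is any callable mapping a URI to a dict (a stand-in for an HTTP lookup).
    """
    if isinstance(node, list):
        return [hydrate(item, fetch) for item in node]
    if not isinstance(node, dict):
        return node
    merged = {}
    # Pull in the referenced content first...
    for uri in node.get("$merge", []):
        merged.update(fetch(uri))
    # ...then let the object's own fields override it (an assumed precedence).
    for key, value in node.items():
        if key != "$merge":
            merged[key] = hydrate(value, fetch)
    return merged


# Stubbed summary store in place of real network calls.
uri = "https://en.wikipedia.org/api/rest_v1/page/summary/Example"
summaries = {uri: {"title": "Example", "extract": "An example article."}}

service_output = {"pages": [{"$merge": [uri], "totalEdits": 15}]}
hydrated = hydrate(service_output, summaries.__getitem__)
```

The service output stays small and cacheable, while the summary is filled in at request time from whatever the lookup returns.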

This comment is about internal service API, not about public API.

Will there be other trending types? If so, /feed/trending/edit sounds like a winner to me. If we will have only one trending feed, then /feed/trending ought to suffice.

+1

+1 to something like /feed/trending/edit.

I have a sense there could be more kinds of trending endpoints[1] over time but eventually they might converge/get aggregated.

Should we use plural for it? /feed/trending/edits

[1] based on edits, views, dynamic change. We already have trending based on views. Fortunately, we haven't exposed the most-read endpoint yet. So, we could easily move it under /feed/trending; let's say as /feed/trending/views.

A big question here is whether the plan is to expose it separately from the featured feed forever, or whether we're going to incorporate it into the featured feed at some point. Once we expose the public endpoint and release an app that uses it, we're stuck with it forever and cannot undeploy it.

@Pchelolo we would want to expose this publicly forever… while the apps may use it in the feed, the web will likely want to use this in another context. Similar to @Jdlrobson's work on Pushipedia

@mobrovac on the page vs pages…

My understanding of that is to provide a simple way to access the most trending article for clients uninterested in showing any kind of list.

We can sort the resulting array in order of "trendiness", and it's arguable whether body.pages[0] is significantly more complex than body.page.
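For illustration, a client that only wants the single most trending article could do something like the following, assuming the service returns pages already sorted by descending trendiness. `most_trending` is a hypothetical helper, not part of any API.

```python
def most_trending(body):
    """Return the first page of a response sorted by trendiness, or None if empty."""
    pages = body.get("pages", [])
    return pages[0] if pages else None


# Made-up response body with pages in descending trendiness order.
body = {"pages": [{"title": "B", "trendiness": 93}, {"title": "A", "trendiness": 10}]}
top = most_trending(body)
```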

Anyway, page vs pages is super not clear. If you still wanna have 'the most trending one', let's at least name the key differently, like most_trending.

The only reason page is in the prototype was for backwards compatibility with several old service workers which were using it and needed time to update. It's not going to be in the final output as it's duplicate information.

My 2¢:

I'm +1 on /feed/trending/edits. It does sound slightly more natural to me than /feed/trending/edit.

On the schema, let's make sure that we reuse the page summary information, possibly as a sub-object ("summary"). This can be filled in by RESTBase via the hydration mechanism as described in T148798.

@mobrovac I updated the response to remove the "page" key based on @Jdlrobson's comment.

In case it was not clear - the code in weekipedia and wikitrender is throwaway code, not meant for production or publication on npm. There will be ample opportunity to guide the format of the response as the logic there gets incorporated more cleanly into the new endpoint.

If you are interested in where the code is hosted, please give input here:
https://gerrit.wikimedia.org/r/#/c/319826/

If you are interested in what is contained in the page response please follow T145553.
(the current contents of page are all the internals needed - I suspect we actually may want to favour a far more simplified response)

I'm advocating for a simple format for the first couple of releases. We can add things to it later if needed, of course. Just removing fields is a lot harder. I'd like to hear from the iOS team what they plan to use.

I'm pretty sure we need summary data.
+1 to putting the results in a pages array.

Other fields that sound like interesting candidates to me are:

  • a timestamp or two which help clients sort trending edits. What's the difference between start and trendedAt?
  • Some measure of trendiness. Is it score, views, edits?

(Sounds like since the rewrite we cannot easily get reverts and isNew based on info from @Jdlrobson on IRC. Why is that?)

More questions:

  • Would you explain the score values?
  • Is the number of anonEdits included in edits?
  • What are the flags for and what are the possible values?
    • notabilityFlags
    • volatileFlags
    • safe
  • What's index and lastIndex?

What we can probably skip:

  • Info about contributors: contributors, anons, distribution
  • Info about the wiki since this is implied by the request domain, including lang
  • id seems to be a duplicate of title

As I mentioned before, don't get too hung up on this response. It's best to look at the trending API service now. Score, flags, index, safe etc. don't exist in that.

Then let's please keep the format in the task description up to date, since this needs to be settled ASAP (before the service is deployed in production).

I'm planning on working on T145572 over the next 2 weeks. This task was a useful discussion, but any concrete outcomes should go in the discussion on T145572 or code review against the patch I'll write for that task.

I think we've been looking at this the wrong way round. Rather than looking at what is available in Weekipedia, it would be more helpful to know how apps are going to use this (is there a mock?); from that we can work out what information they need to render that experience.

It sounds like at the very least we just need title and summary (T150186).
Adding new fields as needed will be trivial later on.

PS. What is the difference between this task and T145572?

Jdlrobson updated the task description.
MBinder_WMF set the point value for this task to 5. Dec 5 2016, 6:21 PM

Change 325847 had a related patch set uploaded (by Jdlrobson):
Score and sort pages before exposing

https://gerrit.wikimedia.org/r/325847

Change 325848 had a related patch set uploaded (by Jdlrobson):
Limit content sent in API response

https://gerrit.wikimedia.org/r/325848

Change 325847 merged by Ppchelko:
Score and sort pages before exposing

https://gerrit.wikimedia.org/r/325847

Change 325848 merged by Ppchelko:
Limit content sent in API response

https://gerrit.wikimedia.org/r/325848