
Sub-epic ⚡️ : Create an API service for InteractionTimeline
Closed, ResolvedPublic

Description

Per T182548, getting all the data for InteractionTimeline from the MediaWiki API isn't going to scale for users with lots of edits. We need to build our own API to query the replica databases directly and thus speed up getting results.

The API should be able to return all the revisions performed by two users on overlapping articles for a specific wiki within a specified time range. In other words, it should accept the following parameters: user1, user2, wiki, start, end (and perhaps limit). It should return results in whatever format will be easiest for the existing InteractionTimeline codebase to process.
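A minimal sketch of the query such an endpoint might run, written in Python against an in-memory SQLite database so it is self-contained. The table and column names (`revision`, `page_id`, `user_name`, `rev_timestamp`) are simplified stand-ins, not the real replica schema, and the function name `fetch_interaction` is hypothetical. The `wiki` parameter is assumed to select which database the connection points at, so it does not appear in the SQL. The sketch also assumes "overlapping" means both users edited the page within the requested time range.

```python
import sqlite3

# Hypothetical, simplified stand-in for a wiki replica; real tables differ.
SCHEMA = """
CREATE TABLE revision (
    rev_id        INTEGER PRIMARY KEY,
    page_id       INTEGER NOT NULL,
    user_name     TEXT NOT NULL,
    rev_timestamp TEXT NOT NULL  -- MediaWiki-style YYYYMMDDHHMMSS string
);
"""

# Revisions by either user, limited to pages that both users edited in range.
INTERACTION_SQL = """
SELECT rev_id, page_id, user_name, rev_timestamp
FROM revision
WHERE user_name IN (:user1, :user2)
  AND rev_timestamp BETWEEN :start_ts AND :end_ts
  AND page_id IN (
      SELECT page_id FROM revision
      WHERE user_name = :user1
        AND rev_timestamp BETWEEN :start_ts AND :end_ts
      INTERSECT
      SELECT page_id FROM revision
      WHERE user_name = :user2
        AND rev_timestamp BETWEEN :start_ts AND :end_ts
  )
ORDER BY rev_timestamp, rev_id
LIMIT :limit
"""


def fetch_interaction(conn, user1, user2, start, end, limit=50):
    """Return both users' revisions on pages they both edited in [start, end]."""
    params = {
        "user1": user1,
        "user2": user2,
        "start_ts": start,
        "end_ts": end,
        "limit": limit,
    }
    keys = ("rev_id", "page_id", "user", "timestamp")
    return [dict(zip(keys, row)) for row in conn.execute(INTERACTION_SQL, params)]


def make_demo_db():
    """Build a tiny in-memory database with made-up revisions for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO revision VALUES (?, ?, ?, ?)",
        [
            (1, 10, "Alice", "20180101000000"),
            (2, 10, "Bob", "20180102000000"),
            (3, 20, "Alice", "20180103000000"),  # only Alice edited page 20
            (4, 10, "Carol", "20180104000000"),  # third user, ignored
            (5, 30, "Alice", "20180105000000"),
            (6, 30, "Bob", "20190101000000"),    # outside the January range
        ],
    )
    return conn
```

Calling `fetch_interaction(make_demo_db(), "Alice", "Bob", "20180101000000", "20180131235959")` returns only revisions 1 and 2, since page 10 is the only page both users edited in that window. Returning a flat, time-ordered list of dictionaries is one option for a response shape that serializes directly to JSON; the format the frontend actually prefers is still open.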

See https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database for relevant documentation.

Notes from meeting on T184201:

  1. Stack to be used: PHP 5.6 / Slim 3
  2. API and JS App will live in the same repository (https://github.com/wikimedia/InteractionTimeline)
  3. Everything should live on Toolforge in the interaction-timeline project. TBD if this is doable
  4. Current project needs to be restructured to accommodate a client/server application

Event Timeline

Should this be an Epic instead of a Task?

The API should be able to return all the revisions performed by two users on overlapping articles for a specific wiki within a specified time range. In other words, it should accept the following parameters: user1, user2, wiki, start, end (and perhaps limit). It should return results in whatever format will be easiest for the existing InteractionTimeline codebase to process.

As an initial iteration, having the timeline interaction endpoint will be sufficient, but eventually we should move all calls from the MediaWiki API into this one. That means moving fetch wikis, fetch users and fetch diffs into this as well.

Should this be an Epic instead of a Task?

This is a child of T166807: Epic ⚡️ : Interaction Timeline. I see this as similar to T179607: Interaction Timeline V1. Do you think I should organize/label these as Sub-Epic ⚡️ :? I do not have a strong preference; whatever helps others!

@kaldari

The API should live on Toolforge in the interaction-timeline project.

I think you can only use one webservice at a time on Toolforge, no? If that is the case, then I think we'll have to create a new tool for the API. interaction-timeline-api?

Should this be an Epic instead of a Task?

If it can be completed in a single sprint, then it can be a task, otherwise it has to be an epic. Regardless, feel free to create subtasks.

As an initial iteration, having the timeline interaction endpoint will be sufficient, but eventually we should move all calls from the MediaWiki API into this one. That means moving fetch wikis, fetch users and fetch diffs into this as well.

Why? If it's not transforming anything, I don't see why it wouldn't just hit prod directly. Otherwise it will only be acting as a direct proxy for the existing API. I think we should use prod when possible and, if not, use our API. No sense maintaining something we don't have to. :)

I think you can only use one webservice at a time on Toolforge, no? If that is the case, then I think we'll have to create a new tool for the API. interaction-timeline-api?

Are you running a Docker image or a regular (non-Kubernetes) webservice? @MusikAnimal should be able to help with figuring out the logistics on this. He worked on the XTools API service.

As an initial iteration, having the timeline interaction endpoint will be sufficient, but eventually we should move all calls from the MediaWiki API into this one. That means moving fetch wikis, fetch users and fetch diffs into this as well.

Why? If it's not transforming anything, I don't see why it wouldn't just hit prod directly. Otherwise it will only be acting as a direct proxy for the existing API. I think we should use prod when possible and, if not, use our API. No sense maintaining something we don't have to. :)

We certainly don't have to. Maybe we should talk about the pros and cons of moving each individual endpoint over and see if the benefits outweigh the work/headache. The interaction and fetch wikis endpoints will benefit from the move for sure. It just bugs me to use two different APIs for the same FE, but that's just me.

I think you can only use one webservice at a time on Toolforge, no? If that is the case, then I think we'll have to create a new tool for the API. interaction-timeline-api?

Are you running a Docker image or a regular (non-Kubernetes) webservice? @MusikAnimal should be able to help with figuring out the logistics on this. He worked on the XTools API service.

We use a separate VPS instance for the API solely to reduce load on the main app server, since some endpoints (ArticleInfo API specifically) are hit very often -- every few seconds or so. You could do the same with a separate Toolforge tool, but the Interaction Timeline is all frontend, right? If you're querying with AJAX you should be able to use the same webservice.

If you do use a separate Toolforge account, you could add a cron job to run git fetch that checks for a new HEAD or tags, and restart the service as needed, etc. That way you don't need to do it manually. This is how we do it for XTools (both servers use this tactic, actually).
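The cron-driven update described above could be sketched roughly as follows. This is not the XTools script; the function names are hypothetical, the restart command is passed in by the caller (something like `webservice restart` is an assumption about the tool's setup), and the sketch only compares `HEAD` against the upstream branch rather than inspecting tags.

```python
import subprocess


def git(repo_dir, *args, run=subprocess.check_output):
    """Run a git command inside repo_dir and return its trimmed output."""
    return run(["git", "-C", repo_dir, *args], text=True).strip()


def update_and_restart(repo_dir, restart_cmd, run=subprocess.check_output):
    """Fetch upstream; if HEAD moved, fast-forward and restart the service.

    Returns True when a restart was triggered, False when already up to date.
    `run` is injectable so the logic can be exercised without a real repo.
    """
    git(repo_dir, "fetch", "--tags", run=run)
    local = git(repo_dir, "rev-parse", "HEAD", run=run)
    remote = git(repo_dir, "rev-parse", "@{u}", run=run)  # upstream branch tip
    if local == remote:
        return False
    # Only fast-forward, so local changes are never silently overwritten.
    git(repo_dir, "merge", "--ff-only", "@{u}", run=run)
    run(list(restart_cmd), text=True)
    return True
```

A scheduled job would then call something like `update_and_restart("/path/to/checkout", ["webservice", "restart"])`, where both the path and the command are placeholders. Restarting only when the commit hash changes keeps the job cheap enough to run every few minutes.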

I think you can only use one webservice at a time on Toolforge, no? If that is the case, then I think we'll have to create a new tool for the API. interaction-timeline-api?

Are you running a Docker image or a regular (non-Kubernetes) webservice? @MusikAnimal should be able to help with figuring out the logistics on this. He worked on the XTools API service.

Right now it's the static-web image.

The docs say to use Kubernetes instead of the grid:

Historically all webservices ran on the Grid. All webservices are now encouraged to run on the Kubernetes platform if possible.

At some point I tried to run multiple containers with a Kubernetes config file, but it didn't work right (maybe I just did it wrong). On IRC I was told the workaround was just to create another tool.

It just bugs me to use two different APIs for the same FE, but that's just me.

I see you are a fan of not invented here. :P

I do see the value in consistency, but imho, the "cost" outweighs the benefits.

I see you are a fan of not invented here. :P

Ha! There really is a term for everything. Not a fan, though.
The fetch wikis and interaction endpoints can definitely benefit from a new endpoint, since we need custom logic to display those. Users and diffs don't, and I'm fine with leaving those out.

TBolliger renamed this task from Create an API service for InteractionTimeline to Sub-epic ⚡️ : Create an API service for InteractionTimeline. Feb 13 2018, 11:23 PM
TBolliger claimed this task.