Spike: Investigate Possible Errors for WWT [4 hours]
Closed, Resolved · Public

Description

As a developer, I want to investigate the different types of errors that may occur in WWT (via WhoColor API or the WWT tool itself), so that we can determine the appropriate error messaging for different WWT errors.

Requirements:

  • Investigate the types of errors that may occur when fetching results in WWT, including malformed pages and local errors
  • Investigate all other potential errors that may be generated by WWT
  • Generate a list of potential errors that includes: 1) a description of the error behavior from the user perspective (i.e. what the users can see), and 2) how users can handle such scenarios (e.g. refresh the page, try again later, contact us, etc)

Event Timeline

ifried created this task. Aug 3 2019, 12:26 AM
ifried renamed this task from Add Messaging for API Errors That Can't Be Solved With Refresh for WWT to Spike: Investigate Messaging for API Errors That Can't Be Solved With Refresh for WWT. Aug 5 2019, 9:51 PM
ifried renamed this task from Spike: Investigate Messaging for API Errors That Can't Be Solved With Refresh for WWT to Spike: Investigate Errors from WhoColor API.
ifried renamed this task from Spike: Investigate Errors from WhoColor API to Spike: Investigate Possible Errors for WWT. Aug 5 2019, 9:54 PM
ifried renamed this task from Spike: Investigate Possible Errors for WWT to Spike: Investigate Possible Errors for WWT [4 hours]. Aug 6 2019, 11:53 PM
ifried moved this task from To be estimated/discussed to Estimated on the Community-Tech board.
MusikAnimal moved this task from Backlog to In progress on the Who-Wrote-That board.

200 OK

A 200 does not necessarily mean you got the data you need. You should also check success in the response body, which will be true if all went well. If it is false, info will contain a plain-English description of the error.

An example is brand new pages that aren't yet in the WikiWho database:

{
  "info": "Requested data is not currently available in WikiWho database. It will be available soon.",
  "success": false,
  "rev_id": 909957138,
  "page_title": "Fogama'a Crater"
}

"Try again later" would be an appropriate message to show to the user.

It seems this same response is given if the page is too long and times out, e.g.:

{
  "info": "Requested data is not currently available in WikiWho database. It will be available soon.",
  "success": false,
  "rev_id": 909848532,
  "page_title": "Timeline of Russian interference in the 2016 United States elections"
}

Unfortunately, I don't think the latter is predictable. Large, popular pages such as Barack Obama can cause this error, so it's likely our users will encounter it at some point. Attempts to reload the data are sometimes successful, so "refresh the page" or "try again later" still seem like appropriate messaging.

I don't know of other situations where you get a 200 with success: false, but we should be prepared for it.
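
A minimal sketch of that check, in TypeScript. The interface only lists fields seen in the sample responses above; the helper name checkOkBody and the "try-again-later" label are made up for illustration and are not part of WikiWho or WWT.

```typescript
// Fields observed in the sample responses in this task; anything else is unknown.
interface WikiWhoBody {
  success?: boolean;
  info?: string; // plain-English explanation seen on 200 responses with success: false
  rev_id?: number | string | null;
  page_title?: string;
}

interface BodyCheck {
  ok: boolean;
  userMessage?: string; // hypothetical message key for the UI
  detail?: string; // raw text from the API, useful for logging
}

// Hypothetical helper: decide whether a 200 response actually carries usable data.
function checkOkBody(body: WikiWhoBody): BodyCheck {
  if (body.success === true) {
    return { ok: true };
  }
  // Covers both known cases (page not yet in the database, page timed out)
  // and any other 200 with success: false that we have not seen yet.
  return {
    ok: false,
    userMessage: "try-again-later",
    detail: body.info ?? "no info provided",
  };
}
```

Treating every success: false on a 200 the same way keeps the client prepared for cases that have not been observed yet.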

400 Bad Request

Nonexistent pages, e.g.:

{
  "error": "The article (sadgdsagadsgdsag) you are trying to request does not exist in english Wikipedia.",
  "success": false,
  "rev_id": null,
  "page_title": "sadgdsagadsgdsag"
}

Invalid namespace (shouldn't happen in our case, since we'll make the Who Wrote That link available only in the mainspace):

{
  "error": "Only articles! Namespace 1 is not accepted.",
  "success": false,
  "rev_id": null,
  "page_title": "Talk:Hanksy"
}

Nonexistent revision (also should never happen to us):

{
  "error": "The revision (4324324) you are trying to request does not exist!",
  "success": false,
  "rev_id": "4324324",
  "page_title": "Hanksy"
}

All of these suggest we did something wrong, so a "contact us" message probably makes the most sense.

There may be other things that cause 400s.

429 Too Many Requests

We are currently making all requests from the browser (i.e. from each end user), so we're unlikely to ever hit the 60 requests/minute or 2000/day per-IP limit. We'll probably end up making a proxy for WikiWho on Cloud VPS (like we do for the Google APIs), meaning requests will be made from the IP of the VPS instance. However, when we get there, we can use a WikiWho account and get the necessary quota from the WikiWho maintainers. So basically we should never see 429s, but they are still probably worth handling properly. Showing a "try again later" message to the user would be accurate.

503 Service Unavailable

I've never encountered this so I don't know what the response looks like, but there'd be no data to show to the user, so you might as well treat it like an unknown error and show a "contact us" message.
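
The status codes discussed so far could be mapped to message categories roughly as follows. This is only a sketch of the suggestions above: the function name messageForStatus and the category labels are hypothetical, and any status not covered in this task is assumed to fall back to "contact us".

```typescript
type MessageCategory = "none" | "try-again-later" | "contact-us";

// Hypothetical mapping from HTTP status (plus the body's success flag)
// to the kind of message suggested in the notes above.
function messageForStatus(status: number, success?: boolean): MessageCategory {
  if (status === 200) {
    // A 200 with success: false means the data is not available (yet).
    return success === true ? "none" : "try-again-later";
  }
  if (status === 429) {
    // Rate limited: waiting should help.
    return "try-again-later";
  }
  if (status === 400) {
    // Nonexistent page, namespace or revision: we sent a bad request.
    return "contact-us";
  }
  // 503 and anything unrecognised (assumption): no data to show.
  return "contact-us";
}
```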

Other issues
  • Running out of memory -- I was able to replicate this by looking at older versions of Timeline of Russian interference in the 2016 United States elections (one of the longest pages on enwiki). Loading this revision made my browser tab freeze. Note that we can get the size of the article with the prop=info API. If it is crazy high, we might disable Who Wrote That on that page. Or we can just let them try to load the results, and if it freezes, it freezes. Everyone's memory capacity will be different, after all.
  • Clientside timeouts -- As noted above, WikiWho apparently returns a 200 with success: false if it times out. I don't know if this is reliable behaviour; we should set some sane timeout on the clientside as well. "Refresh" or "try again later" might be accurate messages for the user.
  • Server unreachable -- I unfortunately deleted the error reports of when this happened in the past, but it was probably some 500-level error. Any 5xx error means we have no data to show to the user, so "contact us" or the like is probably the best thing to say.
  • Something else we haven't thought of -- I would say if none of the above known errors happen, and the tokens hash is missing from the response, assume the worst and show a "contact us" message.
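
The remaining cases in this list could be sketched as below, assuming the known status codes have already been handled elsewhere. Everything named here is hypothetical: the FetchResult shape, the timedOut flag (which would be set by our own clientside timer), the MAX_PAGE_BYTES threshold (no number has been chosen in this task), and the assumption that tokens sits at the top level of the response body.

```typescript
type Outcome = "show-results" | "try-again-later" | "contact-us";

interface FetchResult {
  timedOut: boolean; // set by our own clientside timer (hypothetical)
  status?: number; // undefined if the server could not be reached
  body?: { success?: boolean; tokens?: unknown };
}

// Hypothetical threshold in bytes; the real cut-off would need testing.
const MAX_PAGE_BYTES = 1000000;

// lengthBytes is the page size reported by the MediaWiki prop=info query.
function shouldSkipPage(lengthBytes: number, maxBytes: number = MAX_PAGE_BYTES): boolean {
  return lengthBytes > maxBytes;
}

function classifyOtherIssues(result: FetchResult): Outcome {
  if (result.timedOut) {
    // Our own timeout fired before WikiWho answered.
    return "try-again-later";
  }
  if (result.status === undefined || result.status >= 500) {
    // Server unreachable or a 5xx error: no data to show.
    return "contact-us";
  }
  const tokens = result.body ? result.body.tokens : undefined;
  if (tokens === undefined || tokens === null) {
    // Something we haven't thought of: assume the worst.
    return "contact-us";
  }
  return "show-results";
}
```

Whether to call shouldSkipPage at all is the open question from the first bullet: the alternative is to let the user try and accept that some tabs will freeze.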

Thanks for all this information, @MusikAnimal 🤩

In the case of "Requested data is not currently available in WikiWho database. It will be available soon.", do we have any idea what "soon" might be (ms? hours?)? Based on that, we could have a few different versions of the "try again later" message.

I'm not sure, but from my own observations "hours" would be the safer assumption. It may be that smaller pages for instance are replicated more quickly.

MusikAnimal closed this task as Resolved. Tue, Aug 20, 1:11 AM

I think we're done here. Nothing to QA, and product/design follow-up is at T226760.