Since the collection and assembly of RESTBase page summaries for feed responses was moved into Wikifeeds in T263133, the average latency for the /feed/onthisday/selected endpoint has been extremely high, between 35 and 40 seconds at the time of writing. It's not clear to me why this endpoint in particular would be slow, although at a glance it does seem to collect a rather large number of summaries.
Even more interesting - when requested with curl the endpoint returns a snappy result. However, https://grafana.wikimedia.org/d/000000577/restbase-external-overview?viewPanel=17&orgId=1 indicates that RESTBase observes these latencies as well.
Ok, observing the logs I see quite a lot of timeouts for http://restbase.discovery.wmnet:7231/de.wikipedia.org/v1/page/summary/Datei%3AAdriana_Bisi_Fabbri_%E2%80%93_Aviatore.tiff
Since it's a file page, it's actually stored on Commons, so RESTBase returns a redirect to https://commons.wikimedia.org/api/rest_v1/page/summary/File%3AAdriana_Bisi_Fabbri_%E2%80%93_Aviatore.tiff
Which is an external URI. I would guess that wikifeeds is not allowed to reach the public internet, so the request times out. Since errors in summary fetching are ignored, this timeout still produces a 200 result, but it drives the latency up.
@Joe is my assumption correct - wikifeeds can't go to the public internet?
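To illustrate the effect described above, here is a minimal simulation, not the actual wikifeeds code. It assumes summaries are fetched concurrently and that a timed-out fetch is silently dropped. The names `fetch_summary` and `build_response`, the titles, and the delays are all hypothetical.

```python
import asyncio
import time


async def fetch_summary(title, delay, timeout):
    """Simulated summary fetch (hypothetical): returns None on timeout."""
    try:
        # asyncio.sleep stands in for the upstream HTTP request.
        await asyncio.wait_for(asyncio.sleep(delay), timeout)
        return {"title": title}
    except asyncio.TimeoutError:
        # The error is swallowed, so the caller never sees a failure.
        return None


async def build_response(timeout=0.2):
    """Fetch all summaries concurrently and report status, results, latency."""
    # (title, simulated upstream delay in seconds); the last never answers in time.
    requests = [("Article_A", 0.01), ("Article_B", 0.01), ("File:Unreachable", 10.0)]
    start = time.monotonic()
    results = await asyncio.gather(
        *(fetch_summary(title, delay, timeout) for title, delay in requests)
    )
    elapsed = time.monotonic() - start
    summaries = [r for r in results if r is not None]
    # Still a 200, but the whole response waited for the slowest fetch to time out.
    return 200, summaries, elapsed


if __name__ == "__main__":
    status, summaries, elapsed = asyncio.run(build_response())
    print(status, len(summaries), round(elapsed, 2))
```

In this sketch a single unreachable URI is enough to hold the whole response until the timeout expires, even though the other fetches finish almost immediately.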
There are a few options for fixing it:
- we can detect that the request is an internal request in RESTBase and return an internal URI.
- we can manually resolve these redirects in wikifeeds. This is easier and more straightforward. However, fixing it in RESTBase might be more general, since other services talking to it might have similar issues.
Please advise: which solution do you think is better?
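Either option needs a way to turn the public redirect target back into an internal URI. A minimal sketch of that rewrite follows. It assumes the public form is `https://{domain}/api/rest_v1/{path}` and the internal form is `http://restbase.discovery.wmnet:7231/{domain}/v1/{path}`, which is only inferred from the two URIs quoted above. The helper name `to_internal_uri` is hypothetical.

```python
from urllib.parse import urlsplit

# Internal RESTBase base URI, as it appears in the logs quoted above.
INTERNAL_BASE = "http://restbase.discovery.wmnet:7231"
# Public REST path prefix, as it appears in the redirect target quoted above.
PUBLIC_PREFIX = "/api/rest_v1/"


def to_internal_uri(location: str) -> str:
    """Rewrite a public REST redirect target to an internal RESTBase URI.

    Hypothetical helper. Returns the input unchanged if it does not look
    like a public REST URI. A real implementation would probably also
    check the domain against an allowlist (assumption).
    """
    parts = urlsplit(location)
    if not parts.netloc or not parts.path.startswith(PUBLIC_PREFIX):
        return location
    # urlsplit does not decode percent-escapes, so the title stays encoded.
    rest = parts.path[len(PUBLIC_PREFIX):]
    internal = f"{INTERNAL_BASE}/{parts.netloc}/v1/{rest}"
    if parts.query:
        internal += "?" + parts.query
    return internal
```

With this mapping, the Commons redirect above would be rewritten to a URI under `restbase.discovery.wmnet`, which the service can reach without leaving the internal network.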
This is borderline unbreak-now from the apps teams' perspective, as it's breaking a key component of the apps (the explore feed) for German users. The specific endpoint the apps use that is timing out is https://de.wikipedia.org/api/rest_v1/feed/featured/2020/09/24 cc @Charlotte @JMinor
The redirect links issue needed to be addressed anyway, but FWIW, I think it was a bug in the service code that was causing a Commons image link to be included in the onthisday selected stories links. I'll file a separate task to follow up on that.