
Requests originating from zhwiki wikifeeds caused parsoid outage
Open, Medium, Public

Description

From ops mailing list:

There was a restbase/parsoid outage due to a flood of requests for transforming pages on zhwiki.
This was eventually traced back to requests to the /api/rest_v1/feed/featured/yyyy/mm/dd URLs, which seem to cascade (via restbase, mobileapps, restbase again) into 10+ requests to parsoid for URLs like:

http://zh.wikipedia.org/w/rest.php/zh.wikipedia.org/v3/transform/pagebundle/to/pagebundle/<title>

These requests are expensive to parse, which caused the outage.

NOTE: This wasn't the first time we saw such an outage, but it was the first time we could identify the origin, thanks to moving wikifeeds out of restbase and onto the API gateway, with its superior observability stack.

Here is the Grafana graph for parsoid performance, where the spike in language conversion is visible:
https://grafana.wikimedia.org/goto/X77YIsmIz?orgId=1

Event Timeline

On my local restbase setup, I only end up querying pagebundle/to/pagebundle when I pass a specific locale to /page/html.
For example:

This doesn't hit the computationally intensive transformation:

curl -v 127.0.0.1:7233/zh.wikipedia.beta.wmflabs.org/v1/page/html/%E5%8D%97%E5%8C%97%E6%9C%9D

but this does:

curl -v 127.0.0.1:7233/zh.wikipedia.beta.wmflabs.org/v1/page/html/%E5%8D%97%E5%8C%97%E6%9C%9D -H 'accept-language: zh-cn'

When it comes to wikifeeds, the backend GETs the page/html endpoint with the locale passed by the client.
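
The difference between the two curl requests above comes down to the Accept-Language header. A minimal sketch of that distinction follows. It is not restbase's actual logic: it assumes the expensive conversion is triggered only when the first language tag names a variant such as zh-cn rather than the bare zh, and the helper name requests_variant is hypothetical.

```python
def requests_variant(accept_language, base="zh"):
    """Return True if the first language tag names a variant of `base`.

    Assumption (not confirmed by restbase/parsoid): only tags of the form
    "<base>-<variant>", e.g. "zh-cn", trigger language conversion. A missing
    header or a bare "zh" does not.
    """
    if not accept_language:
        return False
    # Take the first entry of the header and drop any ";q=..." weight.
    first = accept_language.split(",")[0].split(";")[0].strip().lower()
    return first.startswith(base + "-")
```

Under that assumption, the first curl call (no header) gives False and the second (accept-language: zh-cn) gives True.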

Change 958573 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/services/parsoid@master] Disable Zh Language converter

https://gerrit.wikimedia.org/r/958573

Change 958573 abandoned by Subramanya Sastry:

[mediawiki/services/parsoid@master] Disable Zh Language converter

Reason:

I will go with Scott's suggestion here.

https://gerrit.wikimedia.org/r/958573

Change 958593 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/services/parsoid@master] Disable Zh Language converter

https://gerrit.wikimedia.org/r/958593

Change 958995 had a related patch set uploaded (by Jgiannelos; author: Jgiannelos):

[mediawiki/services/wikifeeds@master] Strip zh variants when calling parsoid

https://gerrit.wikimedia.org/r/958995

Change 958995 merged by jenkins-bot:

[mediawiki/services/wikifeeds@master] Strip zh variants when calling parsoid

https://gerrit.wikimedia.org/r/958995

There is a workaround for wikifeeds that should fix the parsoid outage issue in the short term. Can we re-enable traffic for zhwiki on wikifeeds?
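
For readers who have not opened the patch, the idea behind the workaround can be sketched roughly as below. This is an illustration only, not the merged wikifeeds change: it assumes that reducing any zh-* tag in Accept-Language to plain zh is enough to avoid the variant conversion, and the function name strip_zh_variants is hypothetical.

```python
def strip_zh_variants(headers):
    """Return a copy of `headers` with zh-* Accept-Language tags reduced to "zh".

    Hypothetical helper: the real wikifeeds patch may work differently.
    """
    cleaned = dict(headers)  # do not mutate the caller's headers
    for name in list(cleaned):
        if name.lower() != "accept-language":
            continue
        tags = []
        for entry in cleaned[name].split(","):
            # Split "zh-TW;q=0.9" into the tag and its optional parameters.
            tag, sep, params = entry.strip().partition(";")
            if tag.strip().lower().startswith("zh-"):
                tag = "zh"
            tags.append(tag.strip() + sep + params)
        cleaned[name] = ", ".join(tags)
    return cleaned
```

With this sketch, a client header of zh-cn is forwarded as zh, while other languages and headers pass through unchanged.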

I've just disabled the rule. It's still present, but inactive. For other SREs who need to re-enable it in an emergency:

puppetmaster1001:$ sudo requestctl enable cache-text/wikifeeds_featured
puppetmaster1001:$ sudo requestctl commit

I've just re-enabled the filter, which rejects this traffic, because we are seeing high latencies and decreased availability in the parsoid cluster.

Change 959822 had a related patch set uploaded (by Jgiannelos; author: Jgiannelos):

[mediawiki/services/wikifeeds@master] Strip zh variants when calling summary

https://gerrit.wikimedia.org/r/959822