VisualEditor broken on wikitech when codfw is primary: "Error loading data from server: apierror-visualeditor-docserver-http: HTTP 500."
Closed, ResolvedPublic

Description

When trying to use VE on wikitech I get an error message suggesting it got a 500 response.

Normal editing works just fine, FTR.

The full error message I get is

Error loading data from server: apierror-visualeditor-docserver-http: HTTP 500. Would you like to retry?

I strongly suspect this has something to do with the fact all other wikis are currently migrated to codfw.

Event Timeline

Joe triaged this task as High priority. Apr 20 2017, 12:38 PM
Joe updated the task description.
Aklapper renamed this task from VisualEditor is broken on wikitech to VisualEditor broken on wikitech: "Error loading data from server: apierror-visualeditor-docserver-http: HTTP 500.". Apr 20 2017, 10:08 PM

I saw this too right after the DC switch. I imagine it has something to do with the restbase/parsoid setup. I would honestly consider this low to lowest priority. If it doesn't magically get better when eqiad is the primary DC again then we can dig into it.

Jdforrester-WMF lowered the priority of this task from High to Low. Apr 26 2017, 5:31 PM
Jdforrester-WMF added a subscriber: Jdforrester-WMF.

Almost all prod wikis access Parsoid via RESTbase (wgVisualEditorAccessRESTbaseDirectly); this is false in private wikis and wikitech. The former works fine, the latter is broken. Is the codfw RB/Parsoid config mirrored perfectly? Maybe there's a hack for wikitech that's not applied?

Jdforrester-WMF renamed this task from VisualEditor broken on wikitech: "Error loading data from server: apierror-visualeditor-docserver-http: HTTP 500." to VisualEditor broken on wikitech when codfw is primary: "Error loading data from server: apierror-visualeditor-docserver-http: HTTP 500.". May 1 2017, 8:14 PM
Jdforrester-WMF added a subscriber: Tgr.

Confirmed that VE editing works normally following switch back to eqiad as primary DC.

The issue is happening again with the switch over to codfw today:

{"error":{"code":"apierror-visualeditor-docserver-http","info":"HTTP 500","*":"See https://wikitech.wikimedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce> for notice of API deprecations and breaking changes."},"servedby":"labweb1002"}

wtp2013 /srv/log/parsoid/main.log:

{"name":"parsoid","hostname":"wtp2013","pid":12,"level":60,"levelPath":"fatal/request","msg":"Config Request failure for \"https://wikitech.wikimedia.org/w/api.php\": 404","time":"2018-09-11T21:04:56.556Z","v":0}

RB or Parsoid mis-configuration?

RESTBase is still not involved with wikitech, so you can rule RB out of the picture entirely.

I guess Parsoid, but it's probably subtle; AFAIR Parsoid fetches info about the wikis from siteinfo, which should still be fetched from eqiad.

Oh, yeah, that's probably why it breaks (the edge case we never test).

So the only config difference is that in codfw we call the MediaWiki API via HTTPS, while in eqiad we call it via HTTP. I think there are some subtle differences in how we do it that might explain why wikitech would fail via HTTPS: it's not hosted on the main cluster.

Why is Parsoid saying Config Request failure for "https://wikitech.wikimedia.org/w/api.php": 404 though? There has to be more to that request than it's logging.

Ok, I think I found the issue:

in eqiad, we define defaultAPIProxyURI and not mwApiServer, which means we follow this code path in ApiRequest:

https://github.com/wikimedia/parsoid/blob/597932be72082eb00999eb71565c2ac43f724205/lib/mw/ApiRequest.js#L294-L310

as I found out this morning, when we set up the proxy object referenced there during the initialization of the sitemaps, we explicitly set the proxy to null for private wikis, fishbowl wikis and - drumroll - 'labswiki' and 'labtestwiki':

https://github.com/wikimedia/parsoid/blob/994611e64cdd8409c74976657ee642ab4e7a17ec/lib/config/ParsoidConfig.js#L435-L446

This means that our requests to those wikis will not be proxied but will go through the open internet - this also explains why calls to wikitech from parsoid go through cache-text.

Of course this doesn't happen when we use mwApiServer and we just need to add that as a condition in ApiRequest in parsoid.
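
To make the two code paths above easier to follow, here is a minimal sketch of the routing decision, not Parsoid's actual code. The function name `resolveRequestTarget`, the shape of the `config` and `wiki` objects, and the returned object are all hypothetical; only the option names `mwApiServer` and `defaultAPIProxyURI` and the "proxy set to null" convention come from the discussion. It assumes the fix is simply to skip the `mwApiServer` funnel for wikis whose proxy was explicitly nulled.

```javascript
// Hypothetical illustration of the routing decision described above.
// This is NOT the code in lib/mw/ApiRequest.js; names and shapes are assumed.
function resolveRequestTarget(config, wiki) {
  // A proxy explicitly set to null marks wikis that must not be proxied
  // (private wikis, fishbowl wikis, labswiki, labtestwiki).
  const proxyDisabled = wiki.proxy === null;

  if (config.mwApiServer && !proxyDisabled) {
    // Funnel the request into the internal API server and carry the
    // real wiki hostname in the Host header.
    return {
      uri: config.mwApiServer,
      headers: { Host: new URL(wiki.apiURI).hostname },
      proxy: undefined,
    };
  }

  if (config.defaultAPIProxyURI && !proxyDisabled) {
    // Proxy path: keep the wiki's own URI, send it through the proxy.
    return { uri: wiki.apiURI, headers: {}, proxy: config.defaultAPIProxyURI };
  }

  // Proxy disabled (or nothing configured): call the wiki's public URI directly.
  return { uri: wiki.apiURI, headers: {}, proxy: undefined };
}
```

Under these assumptions, a wiki with `proxy: null` gets its own `apiURI` back regardless of whether `mwApiServer` is set, which is the behaviour wikitech needs; dropping the `!proxyDisabled` check on the first branch reproduces the situation described for codfw.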

Change 459912 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[mediawiki/services/parsoid@master] Respect the proxy overriding even when going through the mwApiServer

https://gerrit.wikimedia.org/r/459912

Good find!

Change 459912 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Respect the proxy overriding even when going through the mwApiServer

https://gerrit.wikimedia.org/r/459912

VE is functional on wikitech once more.

Change 463733 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[mediawiki/services/parsoid@master] Fix the logic (and comments) when skipping the funnel into mwApiServer

https://gerrit.wikimedia.org/r/463733

Change 463733 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Fix the logic (and comments) when skipping the funnel into mwApiServer

https://gerrit.wikimedia.org/r/463733

It appears to be malfunctioning again. Getting on each edit attempt:

Error loading data from server:
apierror-visualeditor-docserver-http: HTTP 500.

I was going to file a new task, but given this task was active just yesterday, re-opening for now.

Right, but it seems to have had the opposite effect. Wikitech editing was working until this patch merged yesterday. (Or maybe something else caused it.)

That patch hasn't been deployed yet.