HTTP 400 Error when trying to save an edit on English Wikipedia: Error contacting the Parsoid/RESTBase server
Closed, Resolved (Public)

Description

Reporting on behalf of three editors who were taking part in a (remote and appropriately socially distanced) editing event yesterday.

When trying to save an edit on en.wp they encountered the following error message:

Something went wrong
Error contacting the Parsoid/RESTBase server (HTTP 400)

Event Timeline

One editor reports that the problem is persisting. They have tried with Google Chrome and Microsoft Edge and are running Windows 10.

Aklapper renamed this task from Error when trying to save an edit on Wikipedia to HTTP 400 Error when trying to save an edit on English Wikipedia: Error contacting the Parsoid/RESTBase server. (Apr 21 2020, 3:23 PM)
Aklapper edited projects, added RESTBase, Parsoid; removed Editing QA.

I assume the editors used VisualEditor? (Removed Editing QA as it is up to teams what they plan to have on their workboard.)

Thanks for reporting! Do you know what page they were editing? It is likely Parsoid is crashing on that page. Knowing the page would help us debug and fix the problem.

It is a VisualEditor issue (thanks for removing it from Editing QA, I didn't realise it's up to the teams to sort that out) and seems to only apply to the editor's sandbox, rather than mainspace.

ssastry added a subscriber: Pchelolo.

@Pchelolo can you check if RESTBase has any errors logged for this page in that timeframe? I cannot reproduce this error now that the page has been initialized. I wonder if this is some edge case related to new page creation.

daniel lowered the priority of this task from High to Medium. (Apr 28 2020, 8:05 PM)

I've managed to reproduce this. It's caused by the following:

2020-04-28 20:40:01 [cc28578c-3fcd-4624-981b-024284b2a578] mw1375 enwiki 1.35.0-wmf.28 VisualEditor WARNING: ApiVisualEditor::requestRestbase: Received HTTP 400 from RESTBase {"code":400,"trace":"#0 /srv/mediawiki/php-1.35.0-wmf.28/extensions/VisualEditor/includes/ApiVisualEditorEdit.php(307): ApiVisualEditor->requestRestbase(Object(Title), 'POST', 'transform/html/...', Array, Array)
#1 /srv/mediawiki/php-1.35.0-wmf.28/extensions/VisualEditor/includes/ApiVisualEditorEdit.php(323): ApiVisualEditorEdit->postData('transform/html/...', Object(Title), Array, Array, NULL)
#2 /srv/mediawiki/php-1.35.0-wmf.28/extensions/VisualEditor/includes/ApiVisualEditorEdit.php(185): ApiVisualEditorEdit->postHTML(Object(Title), '<!doctype html>...', Array, NULL)
#3 /srv/mediawiki/php-1.35.0-wmf.28/extensions/VisualEditor/includes/ApiVisualEditorEdit.php(159): ApiVisualEditorEdit->getWikitextNoCache(Object(Title), Array, Array)
#4 /srv/mediawiki/php-1.35.0-wmf.28/extensions/VisualEditor/includes/ApiVisualEditorEdit.php(404): ApiVisualEditorEdit->getWikitext(Object(Title), Array, Array)
#5 /srv/mediawiki/php-1.35.0-wmf.28/includes/api/ApiMain.php(1580): ApiVisualEditorEdit->execute()
#6 /srv/mediawiki/php-1.35.0-wmf.28/includes/api/ApiMain.php(523): ApiMain->executeAction()
#7 /srv/mediawiki/php-1.35.0-wmf.28/includes/api/ApiMain.php(494): ApiMain->executeActionWithErrorHandling()
#8 /srv/mediawiki/php-1.35.0-wmf.28/api.php(84): ApiMain->execute()
#9 /srv/mediawiki/w/api.php(3): require('/srv/mediawiki/...')
#10 {main}","response":"{\"type\":\"https://mediawiki.org/wiki/HyperSwitch/errors/bad_request\",\"method\":\"post\",\"detail\":\"No or invalid If-Match header supplied, or missing mw:TimeUuid meta element in the supplied HTML.\",\"uri\":\"/en.wikipedia.org/v1/transform/html/to/wikitext/User%3APchelolo%2Fsandbox1/953739893\"}","requestPath":"transform/html/to/wikitext/User%3APchelolo%2Fsandbox1/953739893","requestIfMatch":""}
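
The context attached to this log entry is JSON whose `response` field is itself a JSON-encoded string, so it has to be decoded twice to reach RESTBase's `detail` message. A minimal Python sketch, using a trimmed copy of the fields above (the `trace` field is omitted, and the variable names are illustrative only):

```python
import json

# Trimmed copy of the logged context; the "trace" field is left out for brevity.
# The inner json.dumps reproduces the string-escaped "response" value from the log.
logged_context = json.dumps({
    "code": 400,
    "response": json.dumps({
        "type": "https://mediawiki.org/wiki/HyperSwitch/errors/bad_request",
        "method": "post",
        "detail": "No or invalid If-Match header supplied, or missing "
                  "mw:TimeUuid meta element in the supplied HTML.",
        "uri": "/en.wikipedia.org/v1/transform/html/to/wikitext/"
               "User%3APchelolo%2Fsandbox1/953739893",
    }),
    "requestPath": "transform/html/to/wikitext/User%3APchelolo%2Fsandbox1/953739893",
    "requestIfMatch": "",
})

# First decode: the log context. Second decode: the embedded RESTBase response.
context = json.loads(logged_context)
restbase_error = json.loads(context["response"])

# An empty requestIfMatch matches the "No or invalid If-Match header" detail.
if_match_missing = context["requestIfMatch"] == ""

print(context["code"], restbase_error["detail"])
print("If-Match sent empty:", if_match_missing)
```

The empty `requestIfMatch` value lines up with the first half of the `detail` message: the request reached RESTBase without a usable If-Match header.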

One of the issues is that right after creating a page, for 5 minutes the 404 response from RESTBase is cached by Varnish! This is definitely incorrect.

Pchelolo raised the priority of this task from Medium to High. (Edited Apr 28 2020, 8:59 PM)
Pchelolo added a project: Traffic.

RESTBase correctly responds for non-existing pages with 404 and the following cache-control:

curl -i 'http://localhost:7231/en.wikipedia.org/v1/page/html/Lblblbllblblb?redirect=false&stash=true' | grep cache-control

cache-control: private, max-age=0, s-maxage=0, must-revalidate

However, making the exact same request from outside:

curl -i 'https://en.wikipedia.org/api/rest_v1/page/html/Lblblbllblblb?redirect=false&stash=true' | grep cache-control

cache-control: s-maxage=600

It seems that ATS is adding a forced 10-minute caching TTL to these 404 responses from RESTBase. This makes any newly created page uneditable in VE for the first 10 minutes.
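
To make the difference between the two headers explicit, here is a rough Python sketch that works out how long a shared cache would be allowed to keep each response. `parse_cache_control` and `shared_cache_ttl` are hypothetical helpers written for this illustration, and the parsing is deliberately simplified (no quoted values, no heuristic freshness, not a full implementation of HTTP caching rules):

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control header into {directive: value or None}."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip()] = value.strip() or None
    return directives


def shared_cache_ttl(header: str) -> int:
    """Seconds a shared cache may keep the response; 0 means do not cache.

    Simplification: with no explicit lifetime this returns 0, whereas real
    caches may apply heuristics.
    """
    d = parse_cache_control(header)
    if "private" in d or "no-store" in d:
        return 0
    # s-maxage takes precedence over max-age for shared caches.
    for name in ("s-maxage", "max-age"):
        if d.get(name) is not None:
            return int(d[name])
    return 0


internal = "private, max-age=0, s-maxage=0, must-revalidate"  # from RESTBase directly
external = "s-maxage=600"                                     # as seen from outside

print(shared_cache_ttl(internal))  # 0
print(shared_cache_ttl(external))  # 600
```

Under these assumptions the internal response is not cacheable by a shared cache at all, while the external one is cacheable for 600 seconds, which is the 10 minutes described above.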

I can see that the code that sets the caching header was added quite a long time ago, so I'm not sure what has changed. RESTBase explicitly requests no caching with the headers it sets, but somehow that is no longer respected.

Traffic: this seems like a borderline UBN to me, would you mind having a look? All info relevant to you is in my previous comment.

Yeah, the linked Lua code is, I think, trying to emulate what our VCL has always traditionally done similarly as:

// Set a maximum cap on the TTL for 404s. Objects that don't exist now may
// be created later on, and we want to put a limit on the amount of time
// it takes for new resources to be visible.
elsif (beresp.status == 404 && beresp.ttl > <%= @vcl_config.fetch("ttl_cap_404", "10m") %>) {
        set beresp.ttl = <%= @vcl_config.fetch("ttl_cap_404", "10m") %>;
}

The old code only capped 404s downwards to 10 minutes, if they were already cacheable for longer, whereas the Lua unconditionally caches them for 10 minutes regardless of their cacheability or its natural duration, which is clearly not ideal. I'll defer to @ema though on fixing this properly - I think he was trying to avoid error-prone CC-header parsing, and I'm not sure if ATS at this stage has already parsed cacheability for itself and exposed that via other values. Either way, we may have been intentionally trying to force 404-cacheability to deal with some other problem as well, and that fix may need to be preserved in some limited cases?
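
The distinction between the two behaviours can be sketched in a few lines of Python. This is only a model of what the comment above describes, not the actual VCL or Lua: the function names are made up for illustration, and TTLs are plain integers in seconds.

```python
TTL_CAP_404 = 600  # 10 minutes, matching the "10m" default in the VCL above


def vcl_style_ttl(status: int, ttl: int) -> int:
    """Cap 404s downwards: only shorten a TTL that is already longer than the cap."""
    if status == 404 and ttl > TTL_CAP_404:
        return TTL_CAP_404
    return ttl


def unconditional_ttl(status: int, ttl: int) -> int:
    """Force every 404 to the cap, even if the origin asked for no caching."""
    if status == 404:
        return TTL_CAP_404
    return ttl


# A 404 that the origin marked as uncacheable (TTL 0), as RESTBase does here:
print(vcl_style_ttl(404, 0))      # 0   -> stays uncached
print(unconditional_ttl(404, 0))  # 600 -> cached for 10 minutes

# A 404 that was cacheable for an hour is capped the same way by both:
print(vcl_style_ttl(404, 3600))      # 600
print(unconditional_ttl(404, 3600))  # 600
```

The two only diverge when the origin's TTL is below the cap, which is exactly the RESTBase case in this task.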

One of the issues is that right after creating a page, for 5 minutes the 404 response from RESTBase is cached by Varnish! This is definitely incorrect.

Is this the same issue as T238716?

Change 593458 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: stop unconditionally cache 404s

https://gerrit.wikimedia.org/r/593458

Change 593458 merged by Ema:
[operations/puppet@production] ATS: stop unconditionally caching 404s

https://gerrit.wikimedia.org/r/593458

Currently there's no way in Lua to find out if the response is cacheable and what the calculated TTL is, so for now I've commented out the code that you rightly identified as being at fault while we work on making ATS core expose the information.

Pchelolo claimed this task.

Thank you. Verified that it fixed VE for newly created pages. I'm going to resolve this ticket since the CPT work here is done. If you need it for follow-ups in Traffic, please reopen.