RESTBase HTTP errors preventing VisualEditor from saving
Closed, Invalid (Public)

Description

Over the past two weeks or so, a few editors have come into #wikipedia-en-help unable to save their edits in VisualEditor because of an error contacting the Parsoid/RESTBase server. Those errors include:

The last two editors were advised to copy the text from VE to a text editor, refresh the page, and try again. They did not report back on whether it worked. No further information was available.

Event Timeline

matmarex added subscribers: Pchelolo, ssastry, Joe.

I don't think there's anything we can do about this in VisualEditor. It seems that folks already started investigating this problem when it was mentioned on T250620:

@Joe might this be in any way related to the envoy changes?

Probably not. https://logstash.wikimedia.org/goto/c9913945a883dafe1617ac23c65cd9a6 shows errors at least as far back as Jan 20 (3 months back, and logstash probably doesn't have data before that), but I do see a spike starting April 3. Not sure what happened there.

@Pchelolo do you have any insight here?

It looks like the HTTP 409s seen before April 3 changed to HTTP 404s. See https://logstash.wikimedia.org/goto/be58e75fdd5fbc6f33bc8644530b228a for the HTTP 409 graph.
But the overall error rate itself didn't change ( https://logstash.wikimedia.org/goto/39c21af1e325e470b33d30c7d425a644 is the error rate across all non-200 status codes ).
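
For anyone who wants to repeat that comparison outside the dashboards, here is a minimal sketch. It assumes the log entries can be exported as (date, status) pairs; the function name `tally_by_period` and the sample records are hypothetical and are not real logstash data.

```python
from collections import Counter
from datetime import date


def tally_by_period(records, cutoff):
    """Count non-200 responses per status code, split at the cutoff date.

    records: iterable of (date, http_status) pairs (assumed export format).
    Returns (before, after), two Counters keyed by status code.
    """
    before, after = Counter(), Counter()
    for day, status in records:
        if status == 200:
            continue  # only error responses are of interest here
        bucket = before if day < cutoff else after
        bucket[status] += 1
    return before, after


# Made-up sample records for illustration only.
sample = [
    (date(2020, 4, 1), 409),
    (date(2020, 4, 2), 409),
    (date(2020, 4, 2), 200),
    (date(2020, 4, 3), 404),
    (date(2020, 4, 4), 404),
]

before, after = tally_by_period(sample, date(2020, 4, 3))
# A shift from 409 to 404 with the same total would match the pattern described above.
same_total = sum(before.values()) == sum(after.values())
```

If the per-code counts shift while the totals stay roughly equal, that suggests the same failures are being reported under a different status code rather than a new class of errors.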

We are waiting on CPT to let us know whether they have this covered or whether there is an action for the Editing team.

This unfortunately doesn't give me a lot of info to dig into.

One thing that I have noticed is T250815: for new pages, the initial 404 is now cached by ATS/Varnish for 10 minutes regardless, which would break VE when trying to edit the page.

T235822 can also be a factor, but as I've indicated on that task, it's by design.

So, I think we should limit this one to 409s. The only reason for RESTBase to respond with a 409 is if the revision that's being edited was deleted while it was being edited. This event should be very rare in practice, so the logstash link posted by @ssastry indicates a problem. However, it's not happening anymore.

Honestly, I don't know what more I can do here, given that we don't have the page name for a 409 error. I'm going to close this one as invalid. Please reopen if a 409 error appears again, but we definitely need the name of the page to investigate properly.

I asked what page they were editing, but they didn't tell me; they only said that they were on a PC and that it was a new page. If it helps, the 409 was reported at 2020-02-23 21:25 UTC according to my IRC logs. The user was asked to refresh and try again at 21:29; they said they would, then they left. I did not find any new pages created around that time that looked to be related. There's been no substantial increase in reported VE/RESTBase errors, so yeah, nothing really to do.