Page MenuHomePhabricator

VisualEditor <-> RESTBase communication and ETags
Closed, ResolvedPublic

Description

There is something weird going on with the way VE sets the If-Match header when sending requests to RESTBase. We are facing two problems:

  • VE sends an ETag that is suffixed with -df (was T230272)
  • VE doesn't send the If-Match header at all (was T206029)

In both cases, it is not clear when/how/why the ETag is modified/removed. We should investigate this further and find the root cause for this. Currently, we have two work-arounds in RESTBase allowing normal operation despite these problems, but they are ugly hacks and should be removed as soon as possible.

Details

SubjectRepoBranchLines +/-
mediawiki/extensions/VisualEditorwmf/1.35.0-wmf.3+1 -1
mediawiki/extensions/VisualEditorwmf/1.35.0-wmf.3+9 -15
mediawiki/extensions/VisualEditormaster+1 -1
mediawiki/extensions/VisualEditorwmf/1.35.0-wmf.3+9 -15
mediawiki/extensions/VisualEditorwmf/1.35.0-wmf.3+2 -1
mediawiki/extensions/VisualEditormaster+9 -15
mediawiki/extensions/VisualEditormaster+2 -1
mediawiki/extensions/VisualEditorwmf/1.35.0-wmf.2+3 -1
mediawiki/extensions/VisualEditormaster+3 -1
mediawiki/extensions/VisualEditorwmf/1.35.0-wmf.2+30 -2
mediawiki/extensions/VisualEditorwmf/1.35.0-wmf.2+4 -4
mediawiki/extensions/VisualEditorwmf/1.35.0-wmf.1+30 -2
mediawiki/extensions/VisualEditorwmf/1.35.0-wmf.1+4 -4
mediawiki/extensions/VisualEditormaster+4 -4
mediawiki/extensions/VisualEditormaster+30 -2
mediawiki/extensions/VisualEditormaster+16 -0
mediawiki/extensions/VisualEditorwmf/1.34.0-wmf.24+16 -0
mediawiki/extensions/VisualEditorwmf/1.34.0-wmf.25+16 -0
Show related patches Customize query in gerrit

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

I checked and tested thoroughly what RESTBase emits, and it never emits an altered ETag under any circumstance. There must be some Etag manipulation going on on the client side (either in the PHP side of JS of the VE code) that alters it in some instances or removes it in other, given that we have evidence both cases mentioned in the description happen regularly (albeit with different frequencies).

There must be some Etag manipulation going on on the client side (either in the PHP side of JS of the VE code) that alters it in some instances or removes it in other, given that we have evidence both cases mentioned in the description happen regularly (albeit with different frequencies).

So, given that I've also ruled out the VE JS or PHP ever altering an ETag, what are we left with? Could this be the request caching layer (Varnish, from code-comments?) meddling with it in an unexpected manner?

There are also VirtualRESTServiceClient and MultiHttpClient between VE and RESTBase, both of which look surprisingly complicated and could have weird stuff lurking in them.

I experienced the HTTP 404 error again in an edit conflict scenario at Nedungayil Sankunni Narayanan Matriculation Higher Secondary School but my two-column edit conflict merge beta feature is disabled.

T233127 is about the HTTP 404 error you can see as a user.

This task is about investigating some other problems that would also result in a HTTP 404 error if we didn't have workarounds for them.

Is there any debugging tool that can be activated and can record all client/server swingers data in case of hitting the error?

There are also VirtualRESTServiceClient and MultiHttpClient between VE and RESTBase, both of which look surprisingly complicated and could have weird stuff lurking in them.

And there's also the server itself. For example Apache will mess with ETag headers like this unless DeflateAlterETag is disabled: http://httpd.apache.org/docs/trunk/mod/mod_deflate.html#deflatealteretag. I have no idea if RESTBase is even using Apache. I am hoping you folks can investigate that.

Is there any debugging tool that can be activated and can record all client/server swingers data in case of hitting the error?

Any debugging tool I can use next time I encounter this?

Is there any debugging tool that can be activated and can record all client/server swingers data in case of hitting the error?

Any debugging tool I can use next time I encounter this?

You can use your browser's developer tools to record network activity and export it as a HAR file. Note that the exported data will include your cookies for the site (which can be used to take over your account), your user-agent, and probably a lot of other personal information. This is very inconvenient and that's why I did not suggest it. (Also, you'd have to activate the recording before hitting the error, or even opening VE, for the recorded data to be potentially useful.)

@mobrovac - Can you see how many folks are currently getting 404 errors (per day), now that we have some work-arounds in place. Just wondering how severe the problem is at this point.

We don't log genuine 404s, as that is prohibitively expensive. We only log 404s that result from ETag mismatches, which happen when VE sends one ETag in the header, but a different ETag is contained in the document itself (present in the mw:TimeUuid meta tag). There were 0 such cases since we deployed the work-around from T230272.

As to what concerns the two issues from this task's description, here are the stats:

  • there are 500-600 occurrences of VE sending a suffixed ETag per day
  • there are ~20000 occurrences of VE not sending the ETag at all (and RB extracting it form the document) per day.

This is a clear indication that there is something rather problematic going on.

kaldari renamed this task from VE <-> RB communication and ETags to VisualEditor <-> RESTBase communication and ETags.Sep 27 2019, 2:55 PM

My current theory is that both VE and RB do everything right, and the etags are mangled/dropped by some proxy or browser extension. There are definitely extensions that will drop etags for "privacy", e.g. https://addons.mozilla.org/en-US/firefox/addon/etag-stoppa/ https://chrome.google.com/webstore/detail/cookiebro/lpmockibcakojclnfmhchibmdpmollgn (I haven't tried installing these, only read the descriptions).

I'm not sure how it's going to help, but I'll go ahead and add some logging for this too in VE API. Maybe we'll see a single user-agent responsible for this or something.

Change 540254 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@master] ApiVisualEditorEdit: Add logging for funny etags

https://gerrit.wikimedia.org/r/540254

Change 540428 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@wmf/1.34.0-wmf.24] ApiVisualEditorEdit: Add logging for funny etags

https://gerrit.wikimedia.org/r/540428

Change 540429 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@wmf/1.34.0-wmf.25] ApiVisualEditorEdit: Add logging for funny etags

https://gerrit.wikimedia.org/r/540429

Change 540429 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@wmf/1.34.0-wmf.25] ApiVisualEditorEdit: Add logging for funny etags

https://gerrit.wikimedia.org/r/540429

Change 540428 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@wmf/1.34.0-wmf.24] ApiVisualEditorEdit: Add logging for funny etags

https://gerrit.wikimedia.org/r/540428

Mentioned in SAL (#wikimedia-operations) [2019-10-02T16:42:42Z] <lucaswerkmeister-wmde@deploy1001> Synchronized php-1.34.0-wmf.24/extensions/VisualEditor/: SWAT: [[gerrit:540428|ApiVisualEditorEdit: Add logging for funny etags (T233320)]] (duration: 01m 03s)

Mentioned in SAL (#wikimedia-operations) [2019-10-02T16:46:25Z] <lucaswerkmeister-wmde@deploy1001> Synchronized php-1.34.0-wmf.25/extensions/VisualEditor/: SWAT: [[gerrit:540427|ApiVisualEditor: Add logging for RESTBase HTTP errors (T233127)]] + [[gerrit:540429|ApiVisualEditorEdit: Add logging for funny etags (T233320)]] (duration: 01m 04s)

My current theory is that both VE and RB do everything right, and the etags are mangled/dropped by some proxy or browser extension. There are definitely extensions that will drop etags for "privacy", e.g. https://addons.mozilla.org/en-US/firefox/addon/etag-stoppa/ https://chrome.google.com/webstore/detail/cookiebro/lpmockibcakojclnfmhchibmdpmollgn (I haven't tried installing these, only read the descriptions).

I'm not sure how it's going to help, but I'll go ahead and add some logging for this too in VE API. Maybe we'll see a single user-agent responsible for this or something.

I use Chrome and the only extension I use is AdBlock but it has wikipedia.org/* whitelisted and doesn't run on it. I will however delete it completely and let you know if this stops the error, albeit I haven't encountered it in a few days

Change 540254 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] ApiVisualEditorEdit: Add logging for funny etags

https://gerrit.wikimedia.org/r/540254

So the logging wasn't actually working, and I spent most of today figuring out why. (Turns out nothing gets logged if you try to log structured data.) We just deployed another set of patches (https://gerrit.wikimedia.org/r/q/Iae778f95774df2e24b30387221e39e097e25a4cf) and the logging should work now, with the results logged under 'channel:VisualEditor': https://logstash.wikimedia.org/goto/4d053e9de35d9fccb40fabcabd46ff00 (this has logs for this task and T233127)

I'm running Chrome with AdBlock and PrivacyBadger.

Change 543011 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@master] ApiVisualEditor: Always return 'etag' with 'content' (for private wikis)

https://gerrit.wikimedia.org/r/543011

Change 543012 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@master] Detect mangled etags from RESTBase and retry via MediaWiki API

https://gerrit.wikimedia.org/r/543012

With that patch VE will detect receiving invalid etags when querying RESTBase directly, and attempt querying RESTBase via the MediaWiki API instead. The API returns the etag as part of the normal payload instead of HTTP headers, which should prevent it from being messed with.

We already query RESTBase via the MediaWiki API on private wikis (e.g. Officewiki), so this should work fine. Also, I discovered that we previously actually did not send any etags when editing on private wikis (oops… this is a tiny minority of the logged problems though), which is fixed by the other patch.

We already query RESTBase via the MediaWiki API on private wikis (e.g. Officewiki), so this should work fine. Also, I discovered that we previously actually did not send any etags when editing on private wikis (oops… this is a tiny minority of the logged problems though), which is fixed by the other patch.

@mobrovac reminded me that this is wrong and VE talks directly to Parsoid on private wikis, rather than to RESTBase. However, talking to RESTBase via MediaWiki API on non-private wikis should still be fine (we do that for saving changes now).

Change 543011 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] ApiVisualEditor: Always return 'etag' with 'content'

https://gerrit.wikimedia.org/r/543011

Change 543012 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Detect mangled etags from RESTBase and retry via MediaWiki API

https://gerrit.wikimedia.org/r/543012

Change 544934 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.1] ApiVisualEditor: Always return 'etag' with 'content'

https://gerrit.wikimedia.org/r/544934

Change 544935 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.1] Detect mangled etags from RESTBase and retry via MediaWiki API

https://gerrit.wikimedia.org/r/544935

Change 544938 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.2] ApiVisualEditor: Always return 'etag' with 'content'

https://gerrit.wikimedia.org/r/544938

Change 544939 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.2] Detect mangled etags from RESTBase and retry via MediaWiki API

https://gerrit.wikimedia.org/r/544939

Change 544934 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.1] ApiVisualEditor: Always return 'etag' with 'content'

https://gerrit.wikimedia.org/r/544934

Change 544935 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.1] Detect mangled etags from RESTBase and retry via MediaWiki API

https://gerrit.wikimedia.org/r/544935

Mentioned in SAL (#wikimedia-operations) [2019-10-21T17:08:06Z] <jforrester@deploy1001> Synchronized php-1.35.0-wmf.1/extensions/VisualEditor/: Update VisualEditor for set of back-ports in wmf.1 T233320, T234564, T235959 (duration: 00m 56s)

Change 544938 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.2] ApiVisualEditor: Always return 'etag' with 'content'

https://gerrit.wikimedia.org/r/544938

Change 544939 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.2] Detect mangled etags from RESTBase and retry via MediaWiki API

https://gerrit.wikimedia.org/r/544939

Mentioned in SAL (#wikimedia-operations) [2019-10-21T18:41:29Z] <urbanecm@deploy1001> Synchronized php-1.35.0-wmf.2/extensions/VisualEditor: SWAT: a4ab456: TreeModifier: Ignore removed nodes properly when normalizing from a text node (T235959); rEVEDecb453238265: Update VE core submodule to a4ab456dc0 (T235959); rEVEDa850cee49807: ApiVisualEditor: Always return etag with content (T233320) (duration: 00m 55s)

Change 545015 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@master] Don't retry when etag is undefined because the page doesn't exist

https://gerrit.wikimedia.org/r/545015

Change 545015 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Don't retry when etag is undefined because the page doesn't exist

https://gerrit.wikimedia.org/r/545015

Change 545231 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.2] Don't retry when etag is undefined because the page doesn't exist

https://gerrit.wikimedia.org/r/545231

Change 545231 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.2] Don't retry when etag is undefined because the page doesn't exist

https://gerrit.wikimedia.org/r/545231

Change 545656 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@master] Don't log missing ETags when creating a new page, that's normal

https://gerrit.wikimedia.org/r/545656

Change 545657 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@master] ApiVisualEditor: Return 'etag' with 'content' for preloaded content

https://gerrit.wikimedia.org/r/545657

With the new logging, I discovered that we lose the etag when starting an edit with preloaded content, e.g. using the form at https://en.wikipedia.org/wiki/Wikipedia:Article_wizard/CreateDraft. However that should not be a problem for RESTBase, since we normally wouldn't send an etag when creating a new page, and preloaded content is only used for creating new pages.

Anyway, that's making a mess of the logs, so let's fix it first before I look at them again.

Change 545656 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Don't log missing ETags when creating a new page, that's normal

https://gerrit.wikimedia.org/r/545656

Change 545657 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] ApiVisualEditor: Return 'etag' with 'content' for preloaded content

https://gerrit.wikimedia.org/r/545657

Change 546444 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.3] Don't log missing ETags when creating a new page, that's normal

https://gerrit.wikimedia.org/r/546444

Change 546445 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.3] ApiVisualEditor: Return 'etag' with 'content' for preloaded content

https://gerrit.wikimedia.org/r/546445

Change 546444 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.3] Don't log missing ETags when creating a new page, that's normal

https://gerrit.wikimedia.org/r/546444

Change 546445 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.3] ApiVisualEditor: Return 'etag' with 'content' for preloaded content

https://gerrit.wikimedia.org/r/546445

Mentioned in SAL (#wikimedia-operations) [2019-10-28T11:33:33Z] <urbanecm@deploy1001> Synchronized php-1.35.0-wmf.3/extensions/VisualEditor/includes/ApiVisualEditorEdit.php: SWAT: rEVED8caf681a3675: Dont log missing ETags when creating a new page, thats normal (T233320) (duration: 00m 54s)

https://gerrit.wikimedia.org/r/546445 was reverted because it resulted in errors like "PHP Notice: Undefined index: etag".

Change 546463 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@master] ApiVisualEditor: Fix preload handling further

https://gerrit.wikimedia.org/r/546463

Change 546463 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] ApiVisualEditor: Fix preload handling further

https://gerrit.wikimedia.org/r/546463

Change 546515 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.3] Revert "Revert "ApiVisualEditor: Return 'etag' with 'content' for preloaded content""

https://gerrit.wikimedia.org/r/546515

Change 546517 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.3] ApiVisualEditor: Fix preload handling further

https://gerrit.wikimedia.org/r/546517

Change 546515 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.3] Revert "Revert "ApiVisualEditor: Return 'etag' with 'content' for preloaded content""

https://gerrit.wikimedia.org/r/546515

Change 546517 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@wmf/1.35.0-wmf.3] ApiVisualEditor: Fix preload handling further

https://gerrit.wikimedia.org/r/546517

Mentioned in SAL (#wikimedia-operations) [2019-10-28T18:24:08Z] <urbanecm@deploy1001> Synchronized php-1.35.0-wmf.3/extensions/VisualEditor/includes/ApiVisualEditor.php: SWAT: rEVEDb19ad5f3bbb3: Revert "Revert "ApiVisualEditor: Return etag with content for preloaded content""; rEVED4f3b724f951d: ApiVisualEditor: Fix preload handling further (T233320) (duration: 00m 53s)

Over the last four days, there are 44,717 hits in our logs (https://logstash.wikimedia.org/goto/eb1c2883431f493e9ac41665a906b5b9).

Out of those, 43,676 are the message "ApiVisualEditor::execute: Client reported bad ETag: {badetag}, expected: {goodetag}", out of which vast majority is an etag with the "-df" suffix (I don't know how to filter for that in Kibana to get exact numbers, but out of the maximum 500 search results it can display, all 500 had this problem).

This would indicate that my workaround is in fact working, and my guess was correct: something between the user's browser and RESTBase is mangling the headers (T233320#5540141). VE now recovers from this problem by querying RESTBase via the MediaWiki API instead of directly, which allows us to get the un-mangled etag (T233320#5573696), but this becomes a performance problem for us now – for affected users, we basically have to load the page for editing twice.

@mobrovac Given this, I don't think the ETag header is a good way of identifying individual parse results. Do you have any other ideas on how this could be handled? Can we identify the parse results by SHA1 of the content or something? (The <meta property="mw:TimeUuid" …> tag, which you're planning to remove in T128525, would also be a good way from my perspective ;) )


The other 1,041 hits include:

  • 638 hits: ApiVisualEditorEdit::postData: Received funny ETag from client: {etag}
    • 2 of these are from private wikis, because the ETag formats differ there, this is harmless
    • Most of the remaining have empty etags and are related to recently created pages; this may be some other issue with proxies messing with caching
  • 403 hits: ApiVisualEditor::requestRestbase: Received HTTP {code} from RESTBase
    • 255 hits: code=400. Many of these correspond to the empty etags from the previous point, others may be bugs in VE
    • 74 hits: code=404. Per earlier analysis on T233127, these are probably cases where the user kept VE open for more than 24 hours (or got outdated content from some proxy messing with caching), or where the page genuinely has been deleted while they were editing
    • 67 hits: code=500. These are probably genuine bugs in RESTBase or Parsoid
    • 6 hits: code=504
    • 1 hit: code=429

Thank you, @matmarex, for taking the time to look into this!

This would indicate that my workaround is in fact working, and my guess was correct: something between the user's browser and RESTBase is mangling the headers (T233320#5540141). VE now recovers from this problem by querying RESTBase via the MediaWiki API instead of directly, which allows us to get the un-mangled etag (T233320#5573696), but this becomes a performance problem for us now – for affected users, we basically have to load the page for editing twice.

I agree that's not at all desirable. For now, we can keep the -df work-around in RB, and I will try to see if something can be done on the Varnish level (as this is the only entity sitting in-between RB and client-side VE). Speaking of which, can you confirm that (a) all df-suffixed ETags are received by the client-side VE code when calling directly RB; and (b) this happens mainly for large pages that would benefit from compression (and that the client sends the Accept header allowing the server to compress it)?

@mobrovac Given this, I don't think the ETag header is a good way of identifying individual parse results. Do you have any other ideas on how this could be handled? Can we identify the parse results by SHA1 of the content or something? (The <meta property="mw:TimeUuid" …> tag, which you're planning to remove in T128525, would also be a good way from my perspective ;) )

The Etag is a fairly unique identifier as-is. Computing the SHA1 is much more expensive.


The other 1,041 hits include:

  • 638 hits: ApiVisualEditorEdit::postData: Received funny ETag from client: {etag}
    • 2 of these are from private wikis, because the ETag formats differ there, this is harmless
    • Most of the remaining have empty etags and are related to recently created pages; this may be some other issue with proxies messing with caching

We should focus on transforms here, because that's what interests us the most. Regardless of whether a page is new or not, VE asks RB for the render, and RB provides the ETag with it, which is then sent back by VE in the form of the If-Match header during transforms. So far so good. From what I can tell, the problem here arises when VE doesn't ask for the HTML, but instead uses a cached version apparently. This part is very vague to me. Where is it getting this cached version from? Why does this version not contain the Etag?

  • 403 hits: ApiVisualEditor::requestRestbase: Received HTTP {code} from RESTBase
    • 255 hits: code=400. Many of these correspond to the empty etags from the previous point, others may be bugs in VE

I agree that the most likely (and most common) cause here is the missing ETag in the request when even the mwTimeUuid meta property is not available.

  • 74 hits: code=404. Per earlier analysis on T233127, these are probably cases where the user kept VE open for more than 24 hours (or got outdated content from some proxy messing with caching), or where the page genuinely has been deleted while they were editing

+1

  • 67 hits: code=500. These are probably genuine bugs in RESTBase or Parsoid
  • 6 hits: code=504

Both of these usually happen when Parsoid can't parse the page at all for various reasons, but given their low volume I think we can put this aside for now.

  • 1 hit: code=429

Oops, so somebody managed to ask for the same page more than 5 times in a row too fast :) Likely a bot I'd say.


On the RESTBase side, the most numerous recorded cases were the ones where VE does not provide an If-Match header during transforms - there were 1122 such occurrences in the last 7 days, so I think we need to prioritise finding out how and why that happens. Curiously, there were also 615 occurrences of VE exceeding the rate limit for transforms. This may be due to retries. Next, there are 44 cases where VE calls transform/hmtl/to/wikitext/{title}/{revision} with an If-Match header that contains the revision 0 instead of {revision} (i.e. the one indicated in the URI). Finally, there are 64 50x statuses, all of which happen due to Parsoid timing out during parsing or transforms.

On the RESTBase side, the most numerous recorded cases were the ones where VE does not provide an If-Match header during transforms - there were 1122 such occurrences in the last 7 days, so I think we need to prioritise finding out how and why that happens.

In VE's source mode, if you paste HTML and tell it you want to convert it into wikitext, it doesn't send the etag. That'd be my guess as the source there.

matmarex claimed this task.

I think this is sufficiently resolved, my last comment is a summary: T233320#5638542. If you notice VE behaving wrong in some other way, please file another task and I'll investigate.


Sorry I didn't reply to your comment earlier, here are answers to your questions in case they're still relevant:

For now, we can keep the -df work-around in RB, and I will try to see if something can be done on the Varnish level (as this is the only entity sitting in-between RB and client-side VE). Speaking of which, can you confirm that (a) all df-suffixed ETags are received by the client-side VE code when calling directly RB; and (b) this happens mainly for large pages that would benefit from compression (and that the client sends the Accept header allowing the server to compress it)?

(a) Yes (b) No idea

From what I can tell, the problem here arises when VE doesn't ask for the HTML, but instead uses a cached version apparently. This part is very vague to me. Where is it getting this cached version from? Why does this version not contain the Etag?

It's vague because I'm only speculating. I'm assuming these users are behind some corporate/parental/antivirus/privacy/etc. proxy/filter/etc., and that software removes or changes the etag, presumably because it's buggy. I haven't seen one of those in person, but what else could possibly be happening?

On the RESTBase side, the most numerous recorded cases were the ones where VE does not provide an If-Match header during transforms - there were 1122 such occurrences in the last 7 days, so I think we need to prioritise finding out how and why that happens.

In VE's source mode, if you paste HTML and tell it you want to convert it into wikitext, it doesn't send the etag. That'd be my guess as the source there.

That's right, I had a quick look at the code and we don't send en etag there. This seems to me like the correct behavior.