Stashing: revid mismatch between URI and Etag
Closed, Resolved (Public)

Description

In certain cases (yet to be determined), transform requests that rely on stashed content have a revid mismatch between the ETag and the URI parameter. In all of these cases, the request is sent to /transform/html/to/wikitext/{title}/{revision}, but the ETag is 0/{timeuuid}/stash. This suggests that in a previous call RESTBase set the revision id to 0 when it should have set it to the actual revision, since these requests are not for new pages.

An example request is for /es.wikipedia.org/v1/transform/html/to/wikitext/Mar_de_amor/120200307 sent with the If-Match: W/"0/d14208f0-eec3-11e9-9a9b-fd292f60a221/stash" header. RESTBase responds with a 404 because it can't find the stashed content with revid 120200307:

cassandra@cqlsh:wikipedia_T_parsoidd3o5Dn1wcj_Xve2tXe4_rtmeWSU> select key,headers,tid from data where "_domain"='es.wikipedia.org' and key='Mar_de_amor:120200307:d14208f0-eec3-11e9-9a9b-fd292f60a221';

 key | headers | tid
-----+---------+-----

(0 rows)

Alas, the content is there, just with revid 0:

cassandra@cqlsh:wikipedia_T_parsoidd3o5Dn1wcj_Xve2tXe4_rtmeWSU> select key,headers,tid from data where "_domain"='es.wikipedia.org' and key='Mar_de_amor:0:d14208f0-eec3-11e9-9a9b-fd292f60a221';

 key                                                | headers                                                                                       | tid
----------------------------------------------------+-----------------------------------------------------------------------------------------------+--------------------------------------
 Mar_de_amor:0:d14208f0-eec3-11e9-9a9b-fd292f60a221 | {"etag":"\"0/d14208f0-eec3-11e9-9a9b-fd292f60a221/stash\"","content-type":"application/json"} | d283fed0-eec3-11e9-9a9b-fd292f60a221

(1 rows)
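
To make the mismatch concrete, here is a minimal Python sketch that rebuilds both lookup keys from the example request. It is not RESTBase code: the key layout {title}:{revision}:{timeuuid} is inferred from the two cqlsh queries above, the ETag layout from the If-Match header, and the names parse_etag and stash_key are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed ETag layout, taken from the example in this task:
#   W/"{revision}/{timeuuid}/{suffix}"   (the W/ prefix and suffix are optional here)
ETAG_RE = re.compile(r'^(?:W/)?"(\d+)/([0-9a-fA-F-]{36})(?:/([^"]+))?"$')


class ParsedEtag(NamedTuple):
    revision: int
    tid: str
    suffix: Optional[str]


def parse_etag(header: str) -> ParsedEtag:
    """Split an ETag / If-Match value into revision, timeuuid and suffix."""
    match = ETAG_RE.match(header.strip())
    if match is None:
        raise ValueError(f"unrecognised ETag: {header!r}")
    revision, tid, suffix = match.groups()
    return ParsedEtag(int(revision), tid, suffix)


def stash_key(title: str, revision: int, tid: str) -> str:
    """Build a storage key in the layout seen in the cqlsh queries (assumed)."""
    return f"{title}:{revision}:{tid}"


etag = parse_etag('W/"0/d14208f0-eec3-11e9-9a9b-fd292f60a221/stash"')

# Key built from the URI revision: the query that returns 0 rows.
key_from_uri = stash_key("Mar_de_amor", 120200307, etag.tid)

# Key built from the ETag revision: the query that returns the stashed row.
key_from_etag = stash_key("Mar_de_amor", etag.revision, etag.tid)
```

The two keys differ only in the revision segment, which is why the lookup for 120200307 misses the row stored under 0.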

Event Timeline

mobrovac triaged this task as High priority.

Yup, confirmed: the problem is RESTBase not honouring its own ETag. I did a thorough investigation, and the problem arises when clients call /transform/wikitext/to/html/{title} followed by /transform/html/to/wikitext/{title}/{revision}. Instead of trusting the ETag, RESTBase uses the {revision} parameter, which may or may not be correct. Because the chained transform wt2html -> html2wt should be idempotent, RESTBase should use the ETag rather than the provided revision, which should be used only as a fall-back mechanism.
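
A rough Python sketch of that precedence rule, for illustration only. This is not the deployed RESTBase change: the function name revision_for_stash_lookup is hypothetical, and the ETag layout "{revision}/{timeuuid}/stash" is assumed from the example in the task description.

```python
import re
from typing import Optional

# Assumed ETag layout: W/"{revision}/{rest}"; only the revision is needed here.
ETAG_REVISION_RE = re.compile(r'^(?:W/)?"(\d+)/[^"]+"$')


def revision_for_stash_lookup(if_match: Optional[str],
                              uri_revision: Optional[int]) -> int:
    """Pick the revision used to look up stashed content.

    The revision embedded in the ETag wins; the {revision} URI
    parameter is used only when no usable ETag was supplied.
    """
    if if_match:
        match = ETAG_REVISION_RE.match(if_match.strip())
        if match:
            return int(match.group(1))
    if uri_revision is not None:
        return uri_revision
    raise ValueError("neither a usable ETag nor a URI revision was provided")


# The failing request from the description: the ETag says 0, the URI says 120200307.
chosen = revision_for_stash_lookup(
    'W/"0/d14208f0-eec3-11e9-9a9b-fd292f60a221/stash"', 120200307)

# Without an If-Match header, the URI revision is the fall-back.
fallback = revision_for_stash_lookup(None, 120200307)
```

With this ordering the example request resolves to revision 0, matching the key under which the content was actually stashed.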

Mentioned in SAL (#wikimedia-operations) [2019-10-16T03:35:37Z] <mobrovac@deploy1001> Started deploy [restbase/deploy@320f3a5]: Parsoid: Use the ETag for retrieving stashed content - T235465

Mentioned in SAL (#wikimedia-operations) [2019-10-16T03:49:14Z] <mobrovac@deploy1001> Finished deploy [restbase/deploy@320f3a5]: Parsoid: Use the ETag for retrieving stashed content - T235465 (duration: 13m 37s)