Page MenuHomePhabricator

Batching Extension API: Extension tag handling totally broken
Closed, ResolvedPublic

Description

RT diff on the "enwiki:Panic! at the Disco" page without the batching API

RT diff on the "enwiki:Panic! at the Disco" page with the batching API

Some explanations in order. The rt-testing script does the following to identify diffs

  1. wt_orig -> HTML_orig
  2. HTML_orig -> wt_rt
  3. return if diff(wt_orig, wt_rt) is empty
  4. wt_rt -> HTML_2
  5. html_diff = diff(HTML_orig, HTML_2)
  6. if html_diff is empty, report syntactic diffs in rt-ing
  7. if not, report semantic diffs in rt-ing and report the html diff for the wikitext that differed (this html is identified using DSR offsets in the HTML DOMs).

See checkIfSignificant in https://github.com/wikimedia/parsoid/blob/master/tests/roundtrip-test.js for the code that does all this.

So, with the batching API, step 6 is reporting no html diffs, and without the batching API, step 6 is reporting the HTML diff shown in that link above. This is a false semantic diff since the difference is in the id attribute of the <map> element that the timeline extension generates.

Some possibilities for why this might be the case.

  1. Timeline extension is likely using some page-global state to compute the id attribute. In this case, it seems that the batching API doesn't use or doesn't have access to the page global state. It doesn't seem like a catastrophic thing in this example. Is this basically undefined behavior (a la T110239) that generates different results in different implementations? Or is this a bug?
  1. If however, the timeline extension is not using page global state, something is broken with the non-batching code, or with our rt-tesing code.

In any case, this mystery needs further investigation.

Event Timeline

ssastry assigned this task to tstarling.
ssastry raised the priority of this task from to Medium.
ssastry updated the task description. (Show Details)
ssastry added subscribers: ssastry, Aklapper.

If we can satisfactorily explain this mystery today, we can turn batching on in production tomorrow. If not, we will defer this to later this week (off-band deploy) or next week. I would like to turn this on in production, so it would be great to prioritize this investigation.

ssastry set Security to None.

This can reproduced locally.

Start your parsoid server on localhost (with / without batching) and run ( assuming your parsoid service runs on http://localhost:8000 )

$ node roundtrip-test.js 'Panic!_at_the_Disco' --parsoidURL http://localhost:8000
tstarling renamed this task from Batching Extension API: Unexplained change from false semantic diffs to syntactic diffs in rt-testing in some scenarios to Batching Extension API: Extension tag handling totally broken.Sep 9 2015, 4:51 AM

This was due to an error in mangleParserResponse() which caused the result of expanding any extension tag to always be an empty string. Parsoid would replace the empty string with a placeholder and reexpand it with perfect RT results. Thus the issue only showed up in RT testing in a case where an expected RT error was suppressed.

Change 237027 had a related patch set uploaded (by Tim Starling):
Fix totally broken interpretation of parse batch response

https://gerrit.wikimedia.org/r/237027

Change 237027 merged by jenkins-bot:
Fix totally broken interpretation of parse batch response

https://gerrit.wikimedia.org/r/237027