
Linter extension is currently incompatible with Parsoid/PHP
Closed, Resolved (Public)

Description

See T237326#5641753 and later.

Parsoid/JS and the Linter extension worked well together because the DSR offsets generated by Parsoid/JS were UCS2 offsets. Since the backend and the front-end were both JS, those offsets were directly compatible with the front-end JS code that highlights the lint section in wikitext.

However, Parsoid/PHP emits UTF-8 byte offsets, which the front-end JS code cannot interpret. So, before we can enable storage of lints with Parsoid/PHP code, we need to address this encoding incompatibility. There are a couple of different approaches.
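To make the mismatch concrete, here is a minimal sketch. The sample wikitext and the helper `utf8ByteLength` are hypothetical and are not taken from Parsoid or the Linter extension. It computes the position of the same tag once in UTF-16 code units (what JS string indexing uses) and once in UTF-8 bytes, then shows that treating the byte offset as a string index selects the wrong text.

```typescript
// Hypothetical helper: number of UTF-8 bytes needed to encode a JS string.
function utf8ByteLength(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    if (unit < 0x80) {
      bytes += 1;
    } else if (unit < 0x800) {
      bytes += 2;
    } else if (
      unit >= 0xd800 && unit <= 0xdbff &&
      i + 1 < s.length &&
      s.charCodeAt(i + 1) >= 0xdc00 && s.charCodeAt(i + 1) <= 0xdfff
    ) {
      // Surrogate pair: one code point outside the BMP, 4 bytes in UTF-8.
      bytes += 4;
      i++;
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Sample wikitext: "na" + U+00EF + "ve " + U+1F600 + " <b>bold"
const wikitext = "na\u00efve \ud83d\ude00 <b>bold";

// Offset of "<b>" in UTF-16 code units, as the front-end JS code would count it.
const ucs2Offset = wikitext.indexOf("<b>");

// Offset of the same position counted in UTF-8 bytes.
const utf8Offset = utf8ByteLength(wikitext.slice(0, ucs2Offset));

// Using the byte offset as a JS string index lands past the tag.
const wrongSlice = wikitext.slice(utf8Offset, utf8Offset + 3);
const rightSlice = wikitext.slice(ucs2Offset, ucs2Offset + 3);
```

For pure-ASCII wikitext the two offsets coincide, so the problem only shows up on pages that contain non-ASCII characters before the lint location.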

  1. Parsoid/PHP already has offset conversion code to convert between these encodings. So, we would have to enable the offset conversion code whenever a page has a non-zero number of lints. This can be done either during the parse or via the job queue. Doing this during the parse is the simplest option. We haven't benchmarked the cost of offset conversion; if it turns out to be a small fraction of total parse costs, we can do it during the parse.
  2. Implement offset conversion code in the front-end JS code. This is also doable, but will add overhead to every edit request.
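Either approach needs the same core operation: mapping UTF-8 byte offsets to UCS2 (UTF-16 code unit) offsets. The following is a minimal sketch of that mapping, not Parsoid/PHP's actual conversion code. The function name `utf8ToUcs2Offsets` and its behaviour of returning -1 for offsets that are out of range or fall inside a multi-byte character are assumptions made for illustration. It walks the text once, so the cost is linear in the length of the wikitext, plus a lookup for each requested offset.

```typescript
// Hypothetical helper: convert UTF-8 byte offsets into UTF-16 code unit offsets.
// Offsets that are out of range or not on a character boundary map to -1.
function utf8ToUcs2Offsets(text: string, byteOffsets: number[]): number[] {
  // Requested byte offsets, each initialised to "not found".
  const found: { [byteOffset: number]: number } = {};
  for (const b of byteOffsets) {
    found[b] = -1;
  }

  let bytePos = 0; // position in the UTF-8 encoding of `text`
  let i = 0;       // position in UTF-16 code units
  while (true) {
    if (bytePos in found) {
      found[bytePos] = i;
    }
    if (i >= text.length) {
      break;
    }
    const unit = text.charCodeAt(i);
    if (unit < 0x80) {
      bytePos += 1;
      i += 1;
    } else if (unit < 0x800) {
      bytePos += 2;
      i += 1;
    } else if (
      unit >= 0xd800 && unit <= 0xdbff &&
      i + 1 < text.length &&
      text.charCodeAt(i + 1) >= 0xdc00 && text.charCodeAt(i + 1) <= 0xdfff
    ) {
      // Surrogate pair: 4 bytes in UTF-8, 2 code units in UTF-16.
      bytePos += 4;
      i += 2;
    } else {
      bytePos += 3;
      i += 1;
    }
  }
  return byteOffsets.map((b) => found[b]);
}

// Sample wikitext: "na" + U+00EF + "ve " + U+1F600 + " <b>bold</b>"
const sampleText = "na\u00efve \ud83d\ude00 <b>bold</b>";
// Start and end of the <b>...</b> span as UTF-8 byte offsets.
const converted = utf8ToUcs2Offsets(sampleText, [12, 23]);
```

With the converted pair, `sampleText.slice(converted[0], converted[1])` yields the `<b>bold</b>` span that a front-end highlighter would need to select.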

In either case, we need to pause collection of new lints while we resolve this problem. If we implement solution 1 above, no changes are needed anywhere. If we implement solution 2, we will have to clear all stored lints before storing fresh ones.

Given that linting is non-critical functionality right now, I am proposing we don't make this a blocker on the Parsoid/PHP rollout (T229015: Tracking: Direct live production traffic at Parsoid/PHP). We might be able to study the feasibility of solution 1 and implement it quickly afterwards.

Event Timeline

ssastry triaged this task as Medium priority. Nov 6 2019, 7:28 PM
ssastry created this task.
ssastry moved this task from Backlog to Bugs, Notices, Crashers on the Parsoid-PHP board.
ssastry moved this task from Backlog to Parsoid on the MediaWiki-extensions-Linter board.

Change 549947 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/services/parsoid@master] Linting: Convert DSR offsets to 'ucs2' before saving them

https://gerrit.wikimedia.org/r/549947

Change 549947 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Linting: Convert DSR offsets to 'ucs2' before saving them

https://gerrit.wikimedia.org/r/549947

We went with option #1 here. We don't expect performance issues, but given our instrumentation, we'll know if something is off and can revisit this.