Preparing Flow for Parsoid-PHP switch
Closed, Resolved · Public · 0 Estimated Story Points

Description

This is a tracker task to figure out the testing / QA needs and any changes needed in Flow to switch over from Parsoid/JS to Parsoid/PHP.

T229015: Tracking: Direct live production traffic at Parsoid/PHP is the tracker task for deployment. At this time, we don't anticipate a deploy before mid-September 2019. Still, let us handle this as if we are going to switch Flow over to Parsoid/PHP around end of September / early October.

Since Flow does not talk to RESTBase, Flow would need a config change to update its endpoints. At this time, we are planning to switch Parsoid clients one at a time. Currently, we are considering switching over Flow as the last Parsoid client. We are assuming we will have all the bugs ironed out by that time and can do a simple switchover without needing traffic partitioning. But, if it becomes necessary, Flow will need to manage a new cookie.

Although Flow's stored HTML contains data-parsoid DSR offsets, Parsoid ignores them because Flow does not use selective serialization. Nevertheless, it might be appropriate to null out the old offsets (or add some suitable protection against future use), since the offsets generated by Parsoid/JS code are invalid in Parsoid/PHP land.
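As a rough illustration of what "nulling out the old offsets" could look like, here is a minimal Python sketch that removes the top-level `dsr` key from every `data-parsoid` attribute in a stored HTML string. It is not Flow or Parsoid code: the function names `drop_dsr` and `strip_dsr_offsets` are hypothetical, the regex-based attribute matching is a simplification, and a real migration would more likely use a proper DOM parser in PHP.

```python
import html
import json
import re

# Matches data-parsoid="..." or data-parsoid='...' and captures the raw value.
_DATA_PARSOID = re.compile(r"""data-parsoid=(["'])(.*?)\1""", re.DOTALL)


def drop_dsr(data_parsoid_json: str) -> str:
    """Return the data-parsoid JSON with its top-level "dsr" key removed."""
    data = json.loads(data_parsoid_json)
    if isinstance(data, dict):
        data.pop("dsr", None)  # nested structures are left untouched
    return json.dumps(data, separators=(",", ":"), ensure_ascii=False)


def strip_dsr_offsets(stored_html: str) -> str:
    """Rewrite every data-parsoid attribute in stored_html without DSR offsets."""

    def _rewrite(match: re.Match) -> str:
        raw = html.unescape(match.group(2))
        try:
            cleaned = drop_dsr(raw)
        except json.JSONDecodeError:
            return match.group(0)  # leave unparseable attributes as they are
        # Always emit a double-quoted attribute with entity-escaped JSON.
        return 'data-parsoid="' + html.escape(cleaned, quote=True) + '"'

    return _DATA_PARSOID.sub(_rewrite, stored_html)
```

For example, `strip_dsr_offsets` applied to `<p data-parsoid='{"dsr":[0,5,0,0],"stx":"html"}'>hi</p>` keeps the `stx` entry and drops only the offsets.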

We will also do an early deployment to the beta cluster so that clients can test there ahead of time. Even so, Flow will need an update / some code to pick its endpoint on the beta cluster.

Please comment on the ticket / edit the description to add any other requirements, so that we cover all our bases.

Event Timeline

Restricted Application added a subscriber: Aklapper.

If Parsoid doesn't make breaking changes that cause old HTML to fail to serialize back to wikitext (and Subbu told me that was not intended), then this should not cause any problems. We'd have to update the endpoint config but that's it, and we can test that in beta when the time comes.

ssastry triaged this task as High priority. Nov 4 2019, 4:02 AM

Parsoid-PHP is now in beta ( T231569#5540707 ). Can you please test Flow against it? We would ideally like to switch over all services to Parsoid-PHP before the week before Thanksgiving.

cc @Etonkovidova ( and @MMiller_WMF: given @ssastry's proposed deployment we probably need to put this QA task into current sprint.)

I just re-checked on betalabs and when posting to Flow-based pages got the error message (filed as T234242)

[XcNLYKwQBGoAADkzCXcAAAAE] Exception caught: Request to parsoid for "html" to "wikitext" conversion of content connected to title "Topic:U76p1teboaxb1a1b" failed: (curl error: 28) Timeout was reached

As soon as this is corrected, the testing should take ~2 hours.

Thanks! I don't know what is broken with T234242 (I asked there) ... but, in order to test against Parsoid-PHP, a Flow developer needs to write some code updating the configuration to issue requests to Parsoid/PHP. AFAIK, that hasn't happened yet. @kostajh or @Catrope will need to do that.

Change 549445 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[mediawiki/extensions/Flow@master] Implement X-Parsoid-Variant

https://gerrit.wikimedia.org/r/549445

Change 549448 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[operations/mediawiki-config@master] Flow: Configure enwiki beta to use Parsoid PHP

https://gerrit.wikimedia.org/r/549448

Change 549637 had a related patch set uploaded (by Catrope; owner: Catrope):
[operations/mediawiki-config@master] beta: Point Parsoid to parsoid-php instead of parsoid-js

https://gerrit.wikimedia.org/r/549637

Change 549445 abandoned by Catrope:
Implement X-Parsoid-Variant

Reason:
Superseded by https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/549637

https://gerrit.wikimedia.org/r/549445

Change 549448 abandoned by Catrope:
Flow: Configure enwiki beta to use Parsoid PHP

Reason:
Superseded by https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/549637

https://gerrit.wikimedia.org/r/549448

Change 549637 merged by jenkins-bot:
[operations/mediawiki-config@master] beta: Point Parsoid to parsoid-php instead of parsoid-js

https://gerrit.wikimedia.org/r/549637

Checked Flow pages on betalabs - with T234242 being resolved, all seem to be working now.

Change 549875 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[operations/mediawiki-config@master] [Beta] Flow: Use Parsoid/PHP

https://gerrit.wikimedia.org/r/549875

test.wikipedia.org and test2.wikipedia.org have been switched to serve Parsoid/PHP HTML only. Please test and report any issues you spot.

But, this probably doesn't affect Flow since Flow doesn't go through RESTBase, right? Flow still needs its own config patches.

Change 549875 merged by jenkins-bot:
[operations/mediawiki-config@master] [Beta] Use Parsoid/PHP for Flow

https://gerrit.wikimedia.org/r/549875

Mentioned in SAL (#wikimedia-operations) [2019-11-20T15:42:49Z] <mobrovac@deploy1001> Synchronized wmf-config/LabsServices.php: [BETA-ONLY] Switch Flow to use Parsoid/PHP - T229078 (duration: 00m 52s)

After that beta cluster sync, opening https://en.wikipedia.beta.wmflabs.org/wiki/Topic:Rjxs31tvmzxiisxu gives me this error:

An error has occurred while processing HTML/wikitext conversion.

Return to Main Page

[XdVgUqwQBHcAAG8WiAcAAAAQ] /wiki/Topic:Rjxs31tvmzxiisxu Flow\Exception\NoParserException from line 174 of /srv/mediawiki/php-master/extensions/Flow/includes/Conversion/Utils.php: Request to parsoid for "wikitext" to "html" conversion of content connected to title "Topic:Rjxs31tvmzxiisxu" failed: 404

Backtrace:

#0 /srv/mediawiki/php-master/extensions/Flow/includes/Conversion/Utils.php(66): Flow\Conversion\Utils::parsoid(string, string, string, Title)
#1 /srv/mediawiki/php-master/extensions/Flow/includes/Model/AbstractRevision.php(430): Flow\Conversion\Utils::convert(string, string, string, Title)
#2 /srv/mediawiki/php-master/extensions/Flow/includes/Parsoid/ContentFixer.php(41): Flow\Model\AbstractRevision->getContent(string)
#3 /srv/mediawiki/php-master/extensions/Flow/includes/Templating.php(158): Flow\Parsoid\ContentFixer->getContent(Flow\Model\PostRevision)
#4 /srv/mediawiki/php-master/extensions/Flow/includes/Formatter/RevisionFormatter.php(289): Flow\Templating->getContent(Flow\Model\PostRevision, string)
#5 /srv/mediawiki/php-master/extensions/Flow/includes/Formatter/TopicFormatter.php(45): Flow\Formatter\RevisionFormatter->formatApi(Flow\Formatter\TopicRow, Flow\View)
#6 /srv/mediawiki/php-master/extensions/Flow/includes/Block/Topic.php(648): Flow\Formatter\TopicFormatter->formatApi(Flow\Model\Workflow, array, Flow\View)
#7 /srv/mediawiki/php-master/extensions/Flow/includes/Block/Topic.php(553): Flow\Block\TopicBlock->renderTopicApi(array)
#8 /srv/mediawiki/php-master/extensions/Flow/includes/View.php(234): Flow\Block\TopicBlock->renderApi(array)
#9 /srv/mediawiki/php-master/extensions/Flow/includes/View.php(71): Flow\View->buildApiResponse(Flow\WorkflowLoader, array, string, array)
#10 /srv/mediawiki/php-master/extensions/Flow/includes/Actions/Action.php(112): Flow\View->show(Flow\WorkflowLoader, string)
#11 /srv/mediawiki/php-master/extensions/Flow/includes/Actions/ViewAction.php(20): Flow\Actions\FlowAction->showForAction(string, OutputPage)
#12 /srv/mediawiki/php-master/extensions/Flow/includes/Actions/Action.php(50): Flow\Actions\ViewAction->showForAction(string)
#13 /srv/mediawiki/php-master/includes/MediaWiki.php(514): Flow\Actions\FlowAction->show()
#14 /srv/mediawiki/php-master/includes/MediaWiki.php(304): MediaWiki->performAction(Article, Title)
#15 /srv/mediawiki/php-master/includes/MediaWiki.php(967): MediaWiki->performRequest()
#16 /srv/mediawiki/php-master/includes/MediaWiki.php(530): MediaWiki->main()
#17 /srv/mediawiki/php-master/index.php(46): MediaWiki->run()
#18 /srv/mediawiki/w/index.php(3): require(string)
#19 {main}

Change 552088 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/core@master] Parsoid VRS: Add the Host header

https://gerrit.wikimedia.org/r/552088

The above patch fixed the 404 issue.
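For readers unfamiliar with this kind of fix, here is a minimal Python sketch of building a Parsoid transform request that carries an explicit Host header. It is not the code from the patch above (that change is in PHP, in ParsoidVirtualRESTService.php). The function name `build_parsoid_request`, the base URL in the usage note, and the `/{domain}/v3/transform/wikitext/to/html/{title}` path layout are assumptions made for illustration, and the request is only constructed, never sent.

```python
import json
import urllib.parse
import urllib.request


def build_parsoid_request(
    base_url: str, domain: str, title: str, wikitext: str
) -> urllib.request.Request:
    """Build (but do not send) a wikitext-to-HTML request with an explicit Host."""
    quoted_title = urllib.parse.quote(title, safe="")
    url = f"{base_url}/{domain}/v3/transform/wikitext/to/html/{quoted_title}"
    body = json.dumps({"wikitext": wikitext}).encode("utf-8")

    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Content-Type", "application/json")
    # Assumption: the backend picks the target wiki from the Host header,
    # so it is set to the wiki's domain rather than the backend's hostname.
    request.add_header("Host", domain)
    return request
```

A call such as `build_parsoid_request("http://parsoid.example.internal", "en.wikipedia.beta.wmflabs.org", "Topic:Rjxs31tvmzxiisxu", "''hi''")` yields a POST request whose Host header is the wiki domain, independent of the hostname in the (hypothetical) base URL.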

Change 552088 merged by jenkins-bot:
[mediawiki/core@master] Parsoid VRS: Add the Host header

https://gerrit.wikimedia.org/r/552088

Change 552097 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/core@wmf/1.35.0-wmf.5] Parsoid VRS: Add the Host header

https://gerrit.wikimedia.org/r/552097

Change 552097 merged by jenkins-bot:
[mediawiki/core@wmf/1.35.0-wmf.5] Parsoid VRS: Add the Host header

https://gerrit.wikimedia.org/r/552097

Mentioned in SAL (#wikimedia-operations) [2019-11-20T18:17:25Z] <mobrovac@deploy1001> Synchronized php-1.35.0-wmf.5/includes/libs/virtualrest/ParsoidVirtualRESTService.php: Parsoid VRS: Add the Host header - T229015 T229078 T229074 (duration: 00m 52s)

@Etonkovidova Can you run your Flow tests on Beta cluster again? Basic tests as well as more complex tests to make sure all content use cases continue to work.

Note that Flow has not yet been switched to Parsoid/PHP on group 1 wikis, only on group 0 wikis (so, mw.org, officewiki, test and test2 wikis). T229015 has the deployment timeline.

Done on beta cluster and testwiki - all seems to be good. Will re-check after T229015.