- Name of tool/project: Parsoid-PHP
- Project home page: https://www.mediawiki.org/wiki/Parsing/Notes/Moving_Parsoid_Into_Core and https://www.mediawiki.org/wiki/Parsoid
- Name of team requesting review: Parsing
- Primary contact: Subbu Sastry
- Target date for deployment: Early September 2019
- Link to code repository / patchset: https://gerrit.wikimedia.org/r/q/project:mediawiki%252Fservices%252Fparsoid. I have cut a T227209 branch of this repository. Please check out that branch to start the security review.
Description of the tool/project
I will just point to https://www.mediawiki.org/wiki/Parsoid for now. I can follow up with any additional info as required.
Specifically, we are porting Parsoid to PHP. The port will be integrated into core as a Composer library and will run in-process on REST API requests made to MediaWiki (see T221738). We want to deploy this in July/August.
Parsoid/JS (currently deployed on the Wikimedia cluster) is not exposed directly to the internet. All requests to it go through the REST API that RESTBase exposes for wikis (e.g. https://en.wikipedia.org/api/rest_v1/). Although Parsoid/PHP will be integrated into core, we can similarly deploy it to an internal cluster that is not directly accessible from the internet, and disable it on the app cluster and anywhere else the MediaWiki API is exposed to the internet.
Description of how the tool will be used at WMF
This will replace Parsoid/JS, which is used by VisualEditor, Flow, ContentTranslation, the Android app, and Kiwix. Eventually, it will serve all page views and edit views.
List dependencies, or upstream projects that this project relies on.
The composer.json file in the repository is the authoritative source for this.
- In production mode, all but two of the libraries are already used by MediaWiki core. The two new libraries are wikipeg and zest. wikipeg is a Wikimedia fork of the pegjs project; zest is a Wikimedia port of the zest.js repository. Both libraries are developed and maintained by Wikimedia engineers.
- In developer mode, all but one of the libraries are already used by MediaWiki core. The exception is alea, a Wikimedia port of alea.js. This port was written by Wikimedia engineers and is also maintained by us.
Has this project been reviewed before?
Not that I know of
Working test environment
Please link or describe setup process for setting up a test environment.
php bin/parserTests.php runs tests in all modes except selser across all test files (~6K tests; finishes in ~55 seconds on my laptop).
php bin/parserTests.php --wt2html --wt2wt --html2wt --html2html --selser auto runs tests in the specified modes across all test files (~26K tests; takes ~5 minutes on my laptop).
Add the --quiet option to suppress verbose output.
All parser tests should be green.
PHP Unit Tests:
composer test will run phpcs, phpunit, and phan jobs.
Running in integrated / production mode
For both of the scenarios above (parser tests and unit tests), you don't need a working MediaWiki installation. You can run them standalone simply by checking out the Parsoid repository, installing the vendor modules, and running those commands.
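The standalone workflow described above could be scripted roughly as follows. This is only a sketch under stated assumptions, not an official setup script: the clone URL is assumed to be the GitHub mirror linked elsewhere in this task, "composer install" is assumed to be the way the vendor modules get installed, and the run and parsoid_standalone_checks helpers are hypothetical names introduced here. By default the script only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch of the standalone (no MediaWiki) test workflow.
# ASSUMPTION: the GitHub mirror is used as the clone URL.
REPO_URL="https://github.com/wikimedia/parsoid"
# Branch cut for this security review.
BRANCH="T227209"

# Hypothetical helper: print the command in dry-run mode (the default),
# otherwise execute it in the current shell.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: the steps in the order described in the text.
parsoid_standalone_checks() {
    run git clone --branch "$BRANCH" "$REPO_URL" parsoid
    run cd parsoid
    # ASSUMPTION: "installing the vendor modules" means composer install.
    run composer install
    # Parser tests in all modes except selser, with verbose output suppressed.
    run php bin/parserTests.php --quiet
    # phpcs, phpunit, and phan jobs.
    run composer test
}

parsoid_standalone_checks
```

Run as-is, the script lists the five commands without touching the network or the file system, which makes it easy to check the sequence before running it with DRY_RUN=0.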
But, to test and review the Parsoid REST API, you will need to run in MW-integration mode. https://github.com/wikimedia/parsoid/tree/T227209/extension has the instructions for running Parsoid in integrated mode.
The Parsoid/JS API is documented on-wiki. Parsoid/PHP supports all of those endpoints (the API tests still have some failures, and Arlo is working through those).
Parsoid/PHP in integrated mode is also set up on scandium. Assuming you have access to the server, you can hit the Parsoid API by logging in to that server or via an ssh tunnel. We'll send the curl commands to access those endpoints if you need them.
Running Parsoid/PHP against an external wiki
From a security review point of view, Parsoid run in this mode is just another MediaWiki API client. I am including it here only for completeness.
For development and debugging purposes, Parsoid also supports accessing an external MediaWiki installation via its action API. Parsoid/JS doesn't require the external wiki to have the ParsoidBatchAPI extension installed, but Parsoid/PHP currently does.
The bin/parse.js script lets you point Parsoid at an arbitrary MediaWiki installation. bin/parse.php also supports this. bin/parse.php --help should be informative if you wish to run this.
Name of team responsible for tool/project after deployment and primary contact.