[ This is still a work in progress. I am creating a stub with some information filled in and will remove the WIP tag once it is all filled out. ]
###Project Information
* Name of tool/project: Parsoid-PHP
* Project home page: https://www.mediawiki.org/wiki/Parsing/Notes/Moving_Parsoid_Into_Core and https://www.mediawiki.org/wiki/Parsoid
* Name of team requesting review: Parsing
* Primary contact: Subbu Sastry
* Target date for deployment: Early September 2019
* Link to code repository / patchset: https://gerrit.wikimedia.org/r/q/project:mediawiki%252Fservices%252Fparsoid. I have cut a `T227209` branch of this repository. Please check out that branch to start the security review.
###Description of the tool/project
I will just point to https://www.mediawiki.org/wiki/Parsoid for now. I can follow up with any additional info as required.
Specifically, the context for this review is that we are porting Parsoid to PHP. The port will be integrated into core as a composer library and will run in-process on REST API requests (see T221738) made to MediaWiki. We want to deploy this in July/August.
Parsoid/JS (currently deployed on the Wikimedia cluster) is not exposed directly to the internet. All requests to it go through the REST API that RESTBase exposes for wikis (Ex: https://en.wikipedia.org/api/rest_v1/ ). Although Parsoid/PHP will be integrated into core, we can similarly deploy it to an internal cluster that is not directly accessible from the internet, and disable it on the app cluster and anywhere else the MediaWiki API is exposed to the internet.
###Description of how the tool will be used at WMF
This will replace Parsoid/JS, which is used by VisualEditor, Flow, ContentTranslation, the Android app, and Kiwix. Eventually, this will serve all page views and edit views.
###Dependencies
//List dependencies, or upstream projects that this project relies on.//
The composer.json file in the repository is the authoritative source for this.
* In production mode, all except two libraries are already used by MediaWiki core. The two new libraries are `wikipeg` and `zest`. `wikipeg` is a Wikimedia fork of the pegjs project, and `zest` is a Wikimedia port of the zest.js repository. Both libraries are developed and maintained by Wikimedia engineers.
* In developer mode, all except one library are already used by MediaWiki core. The exception is `alea`, which is a Wikimedia port of alea.js. This port was done by Wikimedia engineers and is also maintained by us.
###Has this project been reviewed before?
Not that I know of
###Working test environment
//Please link or describe setup process for setting up a test environment.//
**Parser Tests**
`php bin/parserTests.php` runs tests in all but the selser mode across all test files (~6K tests and finishes in ~55 secs on my laptop).
`php bin/parserTests.php --wt2html --wt2wt --html2wt --html2html --selser auto` runs tests in the specified modes across all test files (~26K tests and takes ~5m on my laptop)
Add the `--quiet` option to suppress verbose output.
As of Aug 19, we have 99.4% of tests green.
**PHP Unit Tests:**
`composer php-test` will run phpcs, phpunit, and phan jobs.
**Running in integrated / production mode**
For both of the scenarios above (parser tests and unit tests), you don't need a working MediaWiki installation. You can run them standalone by checking out the Parsoid repository, installing the vendor modules, and then running those commands.
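The standalone steps above can be sketched as a small shell helper. This is only an illustrative sketch, not an official setup script: it assumes the GitHub mirror of the repository is used for cloning, that `composer install` is how the vendor modules get installed, and the names `run` and `parsoid_standalone_checks` are hypothetical helpers invented for this example.

```shell
# Sketch of the standalone (no MediaWiki) checks described above.
# Assumption: cloning from the GitHub mirror, which carries the T227209 branch.
REPO_URL="https://github.com/wikimedia/parsoid.git"
BRANCH="T227209"

# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: clone the review branch, install vendor modules,
# then run the parser tests and the phpcs/phpunit/phan jobs.
parsoid_standalone_checks() {
    run git clone --branch "$BRANCH" "$REPO_URL" parsoid &&
    run cd parsoid &&
    run composer install &&
    run php bin/parserTests.php --quiet &&
    run composer php-test
}
```

Calling `parsoid_standalone_checks` runs all five steps in order and stops at the first failure; `DRY_RUN=1 parsoid_standalone_checks` only prints the commands so they can be inspected first.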
To test and review the Parsoid REST API, however, you will need to run in MW-integration mode. `https://github.com/wikimedia/parsoid/tree/T227209/extension` has the instructions for running Parsoid in integrated mode.
Parsoid/JS API is documented [[https://www.mediawiki.org/wiki/Parsoid/API|on wiki]]. Parsoid/PHP supports all of those endpoints (* FIXME: pb2pb is in gerrit and hasn't been reviewed and merged yet. What else? *).
Parsoid/PHP in integrated mode also exists on scandium. Assuming you have access to the server, you can hit the Parsoid API by logging in to that server or via an ssh tunnel. We'll send the curl commands to access those endpoints if you need them.
**Running Parsoid/PHP against an external wiki**
From a security review point of view, Parsoid run in this mode is just another MediaWiki API client. I am including it here for completeness' sake.
For development and debugging purposes, Parsoid also supports accessing an external MediaWiki installation via its action API. Parsoid/JS doesn't require that external wiki to have installed the ParsoidBatchAPI extension, but Parsoid/PHP depends on that currently (* TODO: Verify *).
The `bin/parse.js` script lets you point Parsoid to an arbitrary mediawiki installation. The `bin/parse.php` script is still under development and doesn't yet support that. It defaults to the Wikimedia enwiki installation. `bin/parse.php --help` should be informative if you wish to run this.
###Post-deployment
//Name of team responsible for tool/project after deployment and primary contact.//