Besides converting wikitext to HTML, Parsoid can also convert HTML to wikitext. This opens up the possibility of taking HTML produced by other applications and converting it to wikitext.
Examples:
1. converting Phabricator output to wikitext
2. converting Markdown to wikitext
3. converting Google Docs output to wikitext
4. converting Word document output to wikitext
For #2 (Markdown), see the Gerrit change at https://gerrit.wikimedia.org/r/#/c/225253/.
For #3 (Google Docs), a quick prototype is at http://gwicke.github.io/paste2wiki/.
For #4 (Word), the CKEditor paste-from-Word plugin may be worth studying for tricks to normalize that HTML before converting it to wikitext.
Strictly speaking, Parsoid is not required for these conversions, but it could provide a simpler, unified interface for them.
So, the goals of this project would be to:
(a) develop a web service that provides these conversion utilities in a single place;
(b) have that service talk to Parsoid under the hood to do the necessary conversions;
(c) do any necessary HTML cleanup / normalization to nudge Parsoid into producing clean wikitext;
(d) tweak Parsoid's HTML-to-wikitext code to better enable these transformations.
An example of (d) is T127207.
Ideally, this code will be part of an npm package that can then be used in Parsoid (and anywhere else that might benefit from it).
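As a rough illustration of goals (b) and (c), here is a minimal Node.js sketch of a cleanup step followed by a call to Parsoid. Everything in it is an assumption for discussion rather than an existing implementation: the function names (`normalizeHtml`, `buildParsoidRequest`, `htmlToWikitext`), the cleanup rules for Word output, and the base URL are hypothetical, and the endpoint path is assumed to follow Parsoid's v3 HTTP API (`/{domain}/v3/transform/html/to/wikitext`). The regex-based cleanup is only a stand-in; a real implementation would more likely work on a parsed DOM.

```javascript
// Hypothetical per-source cleanup rules, keyed by the application that produced the HTML.
const normalizers = new Map([
  // Word-style HTML: drop conditional comments, <o:p> wrappers,
  // and presentational class/style attributes.
  ['word', (html) =>
    html
      .replace(/<!--[\s\S]*?-->/g, '')
      .replace(/<\/?o:p[^>]*>/gi, '')
      .replace(/\s+(?:class|style)=(?:"[^"]*"|'[^']*')/gi, '')],
]);

// Apply the cleanup for a known source; pass unknown sources through untouched.
function normalizeHtml(source, html) {
  const normalize = normalizers.get(source);
  return normalize ? normalize(html) : html;
}

// Build the HTTP request for Parsoid's HTML-to-wikitext transform.
// ASSUMPTION: the path follows Parsoid's v3 API; baseUrl is deployment-specific.
function buildParsoidRequest(baseUrl, domain, html) {
  const root = baseUrl.replace(/\/+$/, '');
  return {
    url: `${root}/${encodeURIComponent(domain)}/v3/transform/html/to/wikitext`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ html }),
    },
  };
}

// Normalize, then ask Parsoid for wikitext. Relies on the global fetch (Node.js 18+).
async function htmlToWikitext(source, html, { baseUrl, domain }) {
  const cleaned = normalizeHtml(source, html);
  const { url, options } = buildParsoidRequest(baseUrl, domain, cleaned);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`Parsoid returned HTTP ${res.status}`);
  }
  return res.text();
}
```

A caller might use it as `await htmlToWikitext('word', pastedHtml, { baseUrl: 'http://localhost:8000', domain: 'en.wikipedia.org' })`, where both option values are placeholders. Keeping the cleanup rules in a per-source table means support for another application can be added without touching the Parsoid call.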
Details:
- Primary mentor: <>
- Co-mentor: <>
- Other mentors: (optional, Phabricator username) <>
- Skills: Node.js, some familiarity with wikitext
- Estimated project time for a senior contributor: 2-4 weeks
- Microtasks: T129562