
Consider using Parsoid's scrubWikitext
Closed, Resolved, Public

Description

This is a parameter for html2wt (which we use anytime someone edits existing content with a wikitext editor) that applies certain normalizations. Per an email from @ssastry, the main goal is to help solve VE problems that are best addressed on the Parsoid side.

Currently, it is opt-in and doesn't do anything, but normalizations will soon be added (if they haven't been already).

Event Timeline

Mattflaschen-WMF raised the priority of this task to Needs Triage.
Mattflaschen-WMF updated the task description.
Restricted Application added a subscriber: Aklapper.
Mattflaschen-WMF renamed this task from Investigate Parsoid's scrubWikitext to Consider using Parsoid's scrubWikitext. Apr 23 2015, 3:07 AM
Mattflaschen-WMF set Security to None.

Change 206186 had a related patch set uploaded (by Catrope):
Pass scrubWikitext=true to Parsoid

https://gerrit.wikimedia.org/r/206186

How can this be tested? What is it supposed to do?

According to the description, "it is opt-in and doesn't do anything", so there isn't much to test yet. But setting the param seems fine.

Change 206186 merged by jenkins-bot:
Pass scrubWikitext=true to Parsoid

https://gerrit.wikimedia.org/r/206186

We have one normalization already deployed:

  • Strip empty headings: <h1></h1>, <h2></h2>, etc. are removed. Otherwise, they would serialize to =<nowiki/>=, ==<nowiki/>==, etc.
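
To make the normalization above concrete, here is a minimal Python sketch of the idea. It is not Parsoid's code: the function names `strip_empty_headings` and `empty_heading_wikitext` are hypothetical, and the sketch assumes headings with no attributes and strictly empty content. A regex is used only to keep the example short; a real implementation would presumably work on the DOM.

```python
import re

# Matches <h1></h1> .. <h6></h6> with nothing between the tags.
# The backreference \1 requires the closing tag to have the same level.
# Assumption: no attributes and no whitespace inside the heading.
EMPTY_HEADING = re.compile(r"<h([1-6])></h\1>")


def strip_empty_headings(html: str) -> str:
    """Remove strictly empty heading elements from an HTML string."""
    return EMPTY_HEADING.sub("", html)


def empty_heading_wikitext(level: int) -> str:
    """Wikitext an empty heading would serialize to if it were kept,
    as described in the comment above: =<nowiki/>=, ==<nowiki/>==, etc."""
    marker = "=" * level
    return f"{marker}<nowiki/>{marker}"


if __name__ == "__main__":
    edited = "<p>Intro</p><h2></h2><p>Body</p>"
    print(strip_empty_headings(edited))
    print(empty_heading_wikitext(2))
```

With the empty heading stripped before serialization, the `<nowiki/>` placeholder never reaches the saved wikitext. Non-empty headings are left untouched.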