Rewrite WikiTextContentCleaner to support nested templates

Authored by thiemowmde on Jun 22 2018, 4:58 PM.

Not On Permanent Ref: This commit is not an ancestor of any permanent ref.
This commit has been deleted in the repository: it is no longer reachable from any branch, tag, or ref.


The original implementation could not handle nested templates. New plan:
The regex only searches for the start of a template, because this is the
one bit of the wikitext that can easily be detected with a regex.

From this position, we need to run an actual tokenizer to find the
parameters that belong to the template, skipping over any templates nested
inside them. The tokenizer itself is left for a later patch.

My user script https://de.wikipedia.org/wiki/Benutzer:TMg/autoFormatter.js/Beta.js
contains such a parser already.
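
The two-step plan described above (a regex finds the template start, then
a scanner collects the parameters) could look roughly like the following.
This is only an illustrative sketch in Python, not the code of this patch
or of the user script. The names TEMPLATE_START, parse_template and
find_templates are hypothetical, and the sketch assumes that only "{{",
"}}", "[[", "]]" and "|" matter. It ignores triple-brace arguments,
<nowiki> and comments.

```python
import re

# Hypothetical: the regex does nothing but locate the start of a template.
TEMPLATE_START = re.compile(r"\{\{")


def parse_template(wikitext, start):
    """Scan from the "{{" at `start` to its matching "}}".

    Returns (name, params, end) or None if the braces are unbalanced.
    Pipes only split parameters at template depth 1 and outside links.
    """
    if not wikitext.startswith("{{", start):
        raise ValueError("start must point at '{{'")
    depth = 1        # nesting level of {{ ... }}
    link_depth = 0   # nesting level of [[ ... ]]
    parts = []
    part_start = start + 2
    i = part_start
    while i < len(wikitext):
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            if depth == 0:
                parts.append(wikitext[part_start:i].strip())
                return parts[0], parts[1:], i + 2
            i += 2
        elif pair == "[[":
            link_depth += 1
            i += 2
        elif pair == "]]":
            link_depth = max(0, link_depth - 1)
            i += 2
        elif wikitext[i] == "|" and depth == 1 and link_depth == 0:
            parts.append(wikitext[part_start:i].strip())
            part_start = i + 1
            i += 1
        else:
            i += 1
    return None


def find_templates(wikitext):
    """Return (name, params) for every top-level template in the text."""
    found = []
    pos = 0
    while True:
        match = TEMPLATE_START.search(wikitext, pos)
        if match is None:
            return found
        parsed = parse_template(wikitext, match.start())
        if parsed is None:
            pos = match.end()  # unbalanced, skip this "{{"
            continue
        name, params, pos = parsed
        found.append((name, params))
```

For example, find_templates("{{A|x}} and {{B|y={{C}}}}") reports the two
outer templates A and B, and the nested {{C}} stays inside the "y="
parameter of B instead of ending the template early.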

Bug: T194505
Change-Id: I33a9d21a346ace3bca1119b8a65f387db1707f86