Introduce NamespaceUnlocalizer

Authored by thiemowmde on May 7 2019, 4:07 PM.


A previous patch set used MediaWikiTitleCodec to parse the string, and
Title::getFullText() to reassemble it. Main issues with this:

  • The parser does way too many things we don't want here, e.g. silently stripping the leading colon, decoding encoded character references, and validating UTF-8.
  • It was not possible to get the canonical namespace name.

Because of the latter, we need to reassemble the string manually. But why
disassemble it first when we don't care about most of the elements and
features? This code really only cares about the very first prefix.

As a consequence of this decision, this parser intentionally accepts a few
more strings that look like links but will not be accepted by the actual
TitleParser later. As this code only accepts known namespace names and not
arbitrary strings, the assumption that these are meant to be links should
be safe.
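
The prefix-only approach described above can be illustrated with a rough
sketch. This is not the code in this patch (which is PHP); it is a minimal
Python illustration, and the names LOCALIZED_TO_CANONICAL, PREFIX and
unlocalize_namespace, as well as the example namespace mapping, are
assumptions made up for this example.

```python
import re

# Hypothetical mapping from localized namespace names to canonical ones.
# A real wiki would derive this from its language and site configuration.
LOCALIZED_TO_CANONICAL = {
    "datei": "File",
    "kategorie": "Category",
    "benutzer": "User",
}

# An optional leading colon, then the very first prefix up to the first colon.
PREFIX = re.compile(r"^(\s*:?\s*)([^:]+?)(\s*:)")


def unlocalize_namespace(link: str) -> str:
    """Replace a known localized namespace prefix with its canonical name.

    Only the first prefix is inspected. The leading colon and the rest of
    the string are passed through untouched: nothing is decoded, validated,
    or normalized.
    """
    match = PREFIX.match(link)
    if not match:
        return link
    lead, name, separator = match.groups()
    # Treat underscores and runs of whitespace alike, and ignore case.
    key = " ".join(name.replace("_", " ").split()).lower()
    canonical = LOCALIZED_TO_CANONICAL.get(key)
    if canonical is None:
        # Unknown prefix: not a namespace we know, so leave the string alone.
        return link
    return lead + canonical + separator + link[match.end():]
```

With this sketch, ":Kategorie:Foo" becomes ":Category:Foo" (the leading colon
survives), "Datei:a&amp;b" becomes "File:a&amp;b" (the character reference is
not decoded), and "Foo:Bar" is returned unchanged because "Foo" is not a known
namespace name.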

Bug: T213821
Change-Id: I02b32a699fff8d4afb072506c6683f9fff9ed29d