Implement TOC comparison in Linter extension
Open, Medium, Public

Description

This is a first step towards T310511: Metadata comparison testing between Parsoid and the legacy parser along the lines suggested in T310511#10956792:

This can be based on the code we added to the Linter to do real-time performance comparisons: T393399.

Same codepath as before: we have parallel ParserOutputs available, so we can record the percentage of pages with identical /non-html/ metadata.

  • add "ParserOutput::compareMetadata()" function
    • return value is array of differences
  • also emit stats for each component of the metadata that is not equal, so we can see whether it is categories, etc. that are the culprits
    • can we get stats of "% = except for indicators", e.g.? Maybe not.
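
To make the proposal above concrete, here is a minimal sketch of the comparison and the per-component stats. It is written in Python purely for illustration and is not the MediaWiki API: the metadata is modelled as a plain dict, and the names compare_metadata, record_stats, percent_identical and the "differs.<component>" counter keys are all hypothetical. The optional ignore parameter is one conceivable way to approach the "% equal except for indicators" question.

```python
from collections import Counter


def compare_metadata(legacy, parsoid, ignore=()):
    """Return {component: (legacy_value, parsoid_value)} for components that differ.

    `legacy` and `parsoid` are plain dicts standing in for the non-HTML
    metadata of two ParserOutputs (an assumption made for this sketch).
    Components listed in `ignore` are skipped.
    """
    differences = {}
    for component in sorted(set(legacy) | set(parsoid)):
        if component in ignore:
            continue
        a = legacy.get(component)
        b = parsoid.get(component)
        if a != b:
            differences[component] = (a, b)
    return differences


def record_stats(stats, differences):
    """Update a Counter with one page's comparison result."""
    stats["total"] += 1
    if not differences:
        stats["identical"] += 1
    # One counter per differing component, so the culprits are visible.
    for component in differences:
        stats["differs." + component] += 1


def percent_identical(stats):
    """Percentage of compared pages whose metadata was identical."""
    total = stats["total"]
    return 100.0 * stats["identical"] / total if total else 0.0
```

Calling compare_metadata a second time with ignore=("indicators",) and feeding the result into a separate Counter would give the "identical except for indicators" percentage, at the cost of a second comparison per page.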

We're going to compare TOC metadata specifically to start (rather than compare *all* metadata) in order to get metrics and identify remaining bugs for T331483: Resolve differences between Parsoid & legacy parser TOC metadata output for template, extension, and parser-function generated content.

We can start by emitting a "hidden" lint for any page where the TOCData differs between legacy and Parsoid. Investigating those will probably reveal more fine-grained lints we can apply (e.g., "missing entries", "wrong section ID", etc.).
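
As a rough illustration of how a coarse "TOC differs" lint might later be split into finer-grained ones, the sketch below classifies the difference between two lists of TOC entries. It is Python used for illustration only, not Linter extension code: the TocEntry fields are a simplification of the real section metadata, and the "extra entries" and "unclassified difference" labels are hypothetical additions to the two examples named above. Entries are matched by heading text, which would misbehave on pages with duplicate headings.

```python
from typing import NamedTuple


class TocEntry(NamedTuple):
    line: str    # heading text
    anchor: str  # section ID
    level: int   # nesting level in the TOC


def classify_toc_difference(legacy, parsoid):
    """Return a sorted list of lint labels; empty if the TOCs are identical."""
    if legacy == parsoid:
        return []
    lints = set()
    legacy_lines = [entry.line for entry in legacy]
    parsoid_lines = [entry.line for entry in parsoid]
    # Headings present in the legacy TOC but absent from Parsoid's.
    if any(line not in parsoid_lines for line in legacy_lines):
        lints.add("missing entries")
    # Headings present only in Parsoid's TOC.
    if any(line not in legacy_lines for line in parsoid_lines):
        lints.add("extra entries")
    # Same heading at the same position, but a different section ID.
    for a, b in zip(legacy, parsoid):
        if a.line == b.line and a.anchor != b.anchor:
            lints.add("wrong section ID")
    if not lints:
        lints.add("unclassified difference")
    return sorted(lints)
```

An empty result corresponds to "no lint"; any non-empty result corresponds to the initial hidden lint, with the labels hinting at which finer-grained lint might eventually apply.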

Related Objects

Status    Subtype     Assigned    Task
Open      Release     None
Open                  None
Open                  None
Open                  None
Open                  None
Open                  None
Open      Feature     None
Open                  None
Open                  None
Resolved              cscott
Open                  None
Open                  None
Open      Bug Report  None
Open                  None
Open                  None
Open                  None
Open                  None
Open                  Jgiannelos

Event Timeline

Change #1189524 had a related patch set uploaded (by Jgiannelos; author: Jgiannelos):

[mediawiki/extensions/Linter@master] Compare TOCData output between legacy and parsoid

https://gerrit.wikimedia.org/r/1189524

MSantos triaged this task as Medium priority. Fri, Nov 21, 10:26 AM