Metadata comparison testing between Parsoid and the legacy parser
Open, Medium (Public)

Description

We should add some code to verify that Parsoid and the legacy parser always generate exactly the same ParserOutput metadata (not just categories, but also the various flags).

Event Timeline

ssastry triaged this task as Medium priority. Jun 14 2022, 9:25 PM

This can be based on the code we added to the Linter to do real-time performance comparisons: T393399.

Use the same codepath as before: we have parallel ParserOutputs available, so record the percentage with identical /non-HTML/ metadata.

  • add a "ParserOutput::compareMetadata()" function
    • return value is an array of differences
  • also emit stats for each metadata component that is not equal, so we can see whether categories, etc. are the culprits
    • can we get stats like "% equal except for indicators", e.g.? maybe not.
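
A rough sketch of the shape this could take, written in Python for brevity; the real function would be a PHP method on ParserOutput. Everything here is an assumption for illustration: the component names, the dict-based stand-in for ParserOutput metadata, and the `MetadataStats` helper are all hypothetical. It also shows one possible way to answer the "except for indicators" question, by counting the exact set of differing components per page, at the cost of more counter keys.

```python
from collections import Counter

# Hypothetical component names; the real ParserOutput has many more fields.
COMPONENTS = ("categories", "flags", "indicators", "page_properties")


def compare_metadata(legacy: dict, parsoid: dict) -> dict:
    """Return {component: (legacy_value, parsoid_value)} for each differing component.

    An empty dict means the two metadata sets are identical.
    """
    diffs = {}
    for name in COMPONENTS:
        a, b = legacy.get(name), parsoid.get(name)
        if a != b:
            diffs[name] = (a, b)
    return diffs


class MetadataStats:
    """Hypothetical in-memory stand-in for whatever stats backend is used."""

    def __init__(self):
        self.total = 0
        self.identical = 0
        self.mismatch = Counter()       # per-component mismatch counts
        self.mismatch_sets = Counter()  # counts per exact set of differing components

    def record(self, diffs: dict) -> None:
        self.total += 1
        if not diffs:
            self.identical += 1
            return
        self.mismatch.update(diffs.keys())
        self.mismatch_sets[frozenset(diffs)] += 1

    def percent_identical(self) -> float:
        return 100.0 * self.identical / self.total if self.total else 0.0

    def percent_identical_except(self, component: str) -> float:
        # Pages that match fully, or whose only difference is `component`.
        ok = self.identical + self.mismatch_sets[frozenset([component])]
        return 100.0 * ok / self.total if self.total else 0.0
```

Usage would be along the lines of `stats.record(compare_metadata(legacy_meta, parsoid_meta))` in the codepath where both outputs are available, then reading `stats.percent_identical()` and `stats.mismatch` to see which components are the culprits.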