This is a first step towards T310511: Metadata comparison testing between Parsoid and the legacy parser along the lines suggested in T310511#10956792:
This can be based on the code we added to the Linter to do real-time performance comparisons: T393399.
Same code path as before: we have parallel ParserOutputs available, so record the percentage of pages with identical /non-html/ metadata.
- add "ParserOutput::compareMetadata()" function
- return value is array of differences
- also emit stats for each component of the metadata that is not equal, so we can see whether categories, etc. are the culprits
- can we get stats such as "% equal except for indicators"? Maybe not.
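
To make the list above concrete, here is a minimal sketch of the comparison and the per-component stats. It is written in Python purely for illustration and is not the MediaWiki implementation: the dict-based metadata, the component names ("categories", "indicators"), the counter key names, and both function names are assumptions.

```python
from collections import Counter


def compare_metadata(legacy, parsoid):
    """Return {component: (legacy_value, parsoid_value)} for each component that differs.

    Both arguments are plain dicts standing in for the non-HTML metadata
    of the two ParserOutputs (a hypothetical representation).
    """
    diffs = {}
    for component in sorted(set(legacy) | set(parsoid)):
        a, b = legacy.get(component), parsoid.get(component)
        if a != b:
            diffs[component] = (a, b)
    return diffs


def record_stats(counter, diffs):
    """Update counters: overall identical rate plus one counter per differing component."""
    counter["total"] += 1
    if not diffs:
        counter["identical"] += 1
    for component in diffs:
        counter["differs." + component] += 1
    # "Equal except for indicators": indicators is the only differing component.
    if set(diffs) == {"indicators"}:
        counter["identical_except.indicators"] += 1
```

The percentage of identical metadata is then `counter["identical"] / counter["total"]`. An "equal except for X" stat is only cheap if every component is compared independently, as it is here.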
We're going to compare TOC metadata specifically to start (rather than compare *all* metadata) in order to get metrics and identify remaining bugs for T331483: Resolve differences between Parsoid & legacy parser TOC metadata output for template, extension, and parser-function generated content.
We can start by emitting a "hidden" lint for any page where the TOCData differs between legacy and Parsoid. Investigating those will probably reveal more fine-grained lints we can apply (e.g., "missing entries", "wrong section ID", etc.).
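
As a rough illustration of how the coarse lint could later be split into finer ones, the sketch below classifies a TOC mismatch. It is in Python for illustration only. The lint names, the function name, and the section fields `line` (heading text) and `anchor` (section ID) are assumptions, and matching sections by heading text is a simplification that ignores duplicate headings.

```python
def classify_toc_diff(legacy, parsoid):
    """Return a sorted list of hypothetical lint names for a TOC mismatch.

    Each argument is a list of dicts, one per TOC entry, with at least
    the (assumed) keys "line" and "anchor". Returns [] when they are equal.
    """
    if legacy == parsoid:
        return []

    # Index entries by heading text; duplicate headings would collide here.
    legacy_by_line = {s["line"]: s for s in legacy}
    parsoid_by_line = {s["line"]: s for s in parsoid}

    lints = set()
    if set(legacy_by_line) != set(parsoid_by_line):
        lints.add("missing-entries")
    for line in set(legacy_by_line) & set(parsoid_by_line):
        if legacy_by_line[line]["anchor"] != parsoid_by_line[line]["anchor"]:
            lints.add("wrong-section-id")

    # Fall back to the coarse lint when no finer category applies.
    return sorted(lints) or ["toc-mismatch"]
```

Falling back to the generic lint keeps every differing page visible even before its specific cause has a category of its own.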