CommonsMetadata DataCollector::verifyAttributeMetadata should use ParserOutput::getRawText()
Open, Needs Triage, Public

Description

The DataCollector::verifyAttributeMetadata method should use ::getRawText(), not ::getText(), when examining parser output. It is invoked before the parser output is stored in the cache, and ::getText() mutates the ParserOutput in place.

If it really needs the post-processing of ::getText(), it should invoke the OutputPipeline in a way which clones the ParserOutput.
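
To illustrate the hazard outside MediaWiki, here is a minimal toy model in Python. It is not MediaWiki code: ToyParserOutput, get_raw_text, get_text and post_process_on_clone are hypothetical stand-ins, and the "post-processing" is just a string replacement. It only sketches why reading through a mutating getter before the object is cached changes what gets stored, and how a raw accessor or a clone avoids that.

```python
import copy


class ToyParserOutput:
    """Hypothetical stand-in for ParserOutput; not the MediaWiki class."""

    def __init__(self, raw_text):
        self.raw_text = raw_text

    def get_raw_text(self):
        # Read-only accessor: returns the stored text untouched.
        return self.raw_text

    def get_text(self):
        # Models a post-processing getter that mutates the object in place.
        self.raw_text = self.raw_text.replace("<!--marker-->", "")
        return self.raw_text


def post_process_on_clone(output):
    # Models running the post-processing on a copy, leaving the original intact.
    return copy.deepcopy(output).get_text()


original = "<p>license</p><!--marker-->"

# Case 1: inspecting via the mutating getter alters what would later be cached.
a = ToyParserOutput(original)
a.get_text()
mutated_before_cache = a.get_raw_text() != original

# Case 2: inspecting via the raw accessor leaves the object unchanged.
b = ToyParserOutput(original)
b.get_raw_text()
raw_safe = b.get_raw_text() == original

# Case 3: post-processing on a clone yields processed text without side effects.
c = ToyParserOutput(original)
processed = post_process_on_clone(c)
clone_safe = c.get_raw_text() == original
```

In this model, case 2 corresponds to the proposed switch to ::getRawText(), and case 3 to the fallback of running the pipeline on a cloned ParserOutput.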

See T365036: JSON serialization failures on media files and T365433: Cannot save VisualEditor content in File namespace.