CommonsMetadata DataCollector::verifyAttributionMetadata should use ParserOutput::getRawText()
Closed, Resolved · Public

Description

The DataCollector::verifyAttributionMetadata method should use ::getRawText(), not ::getText(), when examining parser output. It is invoked before the parser output is stored in the cache, and ::getText() mutates the ParserOutput in place.

If it really needs the post-processing of ::getText(), it should invoke the OutputPipeline in a way which clones the ParserOutput.
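To illustrate the distinction, here is a minimal toy model in Python. It is not MediaWiki code: `ToyParserOutput`, its methods, the `<!--marker-->` placeholder, and `post_process_on_clone` are all hypothetical names invented for this sketch, and the real ParserOutput and OutputPipeline APIs differ. It only shows why a getter that writes its post-processing back into the object is unsafe to call before the object is stored, and how reading the raw text or working on a clone avoids that.

```python
import copy


class ToyParserOutput:
    """Hypothetical stand-in for a parser output object; not the MediaWiki class."""

    def __init__(self, raw_text):
        self._raw_text = raw_text

    def get_raw_text(self):
        # Pure read: returns the stored text and changes nothing.
        return self._raw_text

    def get_text(self):
        # Mimics a getter with a side effect: the post-processed text is
        # written back into the object, so later readers see the altered text.
        self._raw_text = self._raw_text.replace("<!--marker-->", "")
        return self._raw_text


def post_process_on_clone(output):
    # Post-process a deep copy so the caller's object stays untouched.
    return copy.deepcopy(output).get_text()


original = ToyParserOutput("<p>Author: X</p><!--marker-->")

# Reading the raw text leaves the object as it was.
assert original.get_raw_text() == "<p>Author: X</p><!--marker-->"

# Post-processing a clone also leaves the original intact.
assert post_process_on_clone(original) == "<p>Author: X</p>"
assert original.get_raw_text() == "<p>Author: X</p><!--marker-->"

# Calling the mutating getter directly changes what would later be stored.
original.get_text()
assert original.get_raw_text() == "<p>Author: X</p>"
```

In this toy model, the first two access patterns correspond to the two options described above: read the raw text, or run the post-processing against a copy.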

See T365036: JSON serialization failures on media files and T365433: Cannot save VisualEditor content in File namespace.

Event Timeline

Tgr renamed this task from CommonsMetadata DataCollector::verifyAttributeMetadata should use ParserOutput::getRawText() to CommonsMetadata DataCollector::verifyAttributionMetadata should use ParserOutput::getRawText(). (Aug 4 2024, 3:25 PM)
Tgr updated the task description.

Just using getRawText() should be fine.

Change #1091275 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/extensions/CommonsMetadata@master] Replace uses of deprecated ParserOutput::getText()

https://gerrit.wikimedia.org/r/1091275

Change #1091275 merged by jenkins-bot:

[mediawiki/extensions/CommonsMetadata@master] Replace uses of deprecated ParserOutput::getText()

https://gerrit.wikimedia.org/r/1091275