
Flow stores invalid UTF8 sequences in ParserOutput::mExternalLinks
Closed, Resolved · Public

Description

When a link containing a broken URL-encoded character (e.g. %E8) is added, Flow appends the urldecode'd version of the URL to the ParserOutput.
For such characters, urldecode produces invalid UTF-8 sequences.
As a consequence, jobs that include the result of ParserOutput::getExternalLinks() in their content cannot be pushed, because json_encode refuses to encode such sequences.
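
The failure mode can be sketched outside of MediaWiki. The following is a minimal Python analogue, not Flow's PHP code: `unquote_to_bytes` stands in for PHP's byte-oriented `urldecode`, a strict UTF-8 decode stands in for the check `json_encode` performs, and the helper names and `example.org` URLs are hypothetical.

```python
import json
from urllib.parse import unquote_to_bytes


def decode_link(url: str) -> bytes:
    """Percent-decode a URL into raw bytes (analogue of PHP's urldecode)."""
    return unquote_to_bytes(url)


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string is a well-formed UTF-8 sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def can_serialize(raw: bytes) -> bool:
    """Return True if the decoded link can be embedded in a JSON payload."""
    try:
        json.dumps({"link": raw.decode("utf-8")})
        return True
    except UnicodeDecodeError:
        return False


# %E8 alone is a lone lead byte (0xE8) with no continuation bytes.
broken = decode_link("http://example.org/caf%E8")
# %C3%A8 is the complete two-byte UTF-8 encoding of the same letter.
valid = decode_link("http://example.org/caf%C3%A8")

for label, raw in (("broken", broken), ("valid", valid)):
    print(label, is_valid_utf8(raw), can_serialize(raw))
```

Under these assumptions, the first URL decodes to bytes that cannot be serialized, while the second decodes cleanly. This is why a decoded link should be validated or normalized before it is stored in a structure that is later JSON-encoded.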

Details

Related Gerrit Patches:
mediawiki/extensions/Flow : master : Use ParserOutput::addExternalLink

Event Timeline

dcausse created this task. Jul 2 2019, 3:02 PM
Restricted Application added a project: Growth-Team. Jul 2 2019, 3:02 PM
dcausse renamed this task from "Flow stores invalid UTF8 sequences in ParserOutput::mExternalLinks when parsing" to "Flow stores invalid UTF8 sequences in ParserOutput::mExternalLinks". Jul 2 2019, 3:03 PM

Change 520258 had a related patch set uploaded (by DCausse; owner: DCausse):
[mediawiki/extensions/Flow@master] Use ParserOutput::addExternalLink

https://gerrit.wikimedia.org/r/520258

dcausse triaged this task as Medium priority. Jul 2 2019, 3:08 PM
dcausse moved this task from In Progress to Needs review on the Discovery-Search (Current work) board.

Change 520258 merged by jenkins-bot:
[mediawiki/extensions/Flow@master] Use ParserOutput::addExternalLink

https://gerrit.wikimedia.org/r/520258

kostajh closed this task as Resolved. Jul 17 2019, 11:59 PM
kostajh added a subscriber: kostajh.

Marking resolved per https://phabricator.wikimedia.org/T227098#5304446, but please re-open if this needs more work.

Change 526731 had a related patch set uploaded (by DCausse; owner: DCausse):
[mediawiki/extensions/CirrusSearch@master] Normalize request param name

https://gerrit.wikimedia.org/r/526731

Tagged the wrong ticket in the patch above; please ignore.