
TextPassDumper.php messages on retry of fetch of content have extra junk in them
Open, Medium, Public

Description

Instead of, for example,

getting/checking text 1965 failed (Received text is unplausible for id 1965) (Will retry 4 more times)

in the dump output logs, we now see entries like

getting/checking text 1959 failed (Received text is unplausible for id 1959) for revision 1959
      3
      
       (Will retry 4 more times)

Event Timeline

ArielGlenn triaged this task as Medium priority.Apr 3 2020, 7:38 AM
ArielGlenn created this task.
ArielGlenn updated the task description.

I've pushed through https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/585682/ for now to keep my exception notifier happy.

A check of the prefetch content file for angwikibooks against log file entries shows that the extra item being stuffed onto the end of $thisRev is the parentid, along with various newlines.

Nope. It's the user id, it turns out.

It looks like the logging of $thisRev was added in https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/556346/, so the behavior of concatenating extra stuff onto the revision may have been like that forever. The cast to int in the invocation of $this->prefetch->prefetch() saves us from any problems there. I am not sure what the correct fix is: avoid stuffing extra stuff onto the end, clean up the value that ends up in the log message, or something else.
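To make the second option concrete, here is a rough sketch in Python (not the PHP in TextPassDumper.php) of cleaning the value before it reaches the log message. The helper names `php_style_int` and `format_retry_message` are hypothetical, the message layout is only modeled on the log lines in the description, and the sample value assumes the junk is whitespace plus a trailing user id, as observed above.

```python
import re


def php_style_int(value: str) -> int:
    """Roughly mimic a PHP (int) cast on a string: keep the leading integer, else 0.

    Hypothetical helper for illustration; not part of MediaWiki.
    """
    match = re.match(r"\s*([+-]?\d+)", value)
    return int(match.group(1)) if match else 0


def format_retry_message(text_id: int, error: str, this_rev: str, retries_left: int) -> str:
    """Build a single-line retry message from a possibly polluted revision value."""
    rev = php_style_int(this_rev)  # drops the trailing newlines and the extra id
    return (
        f"getting/checking text {text_id} failed ({error}) "
        f"for revision {rev} (Will retry {retries_left} more times)"
    )


# Assumed shape of the polluted value: revision id, then newlines and the user id.
this_rev = "1959\n      3\n      \n       "
message = format_retry_message(
    1959, "Received text is unplausible for id 1959", this_rev, 4
)
print(message)
```

The same cast that already protects the prefetch call would keep the log entry on one line. Whether that is preferable to not appending the extra data in the first place is still the open question.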

ArielGlenn renamed this task from TextPassDumper.php messages on retry of fetch of content have extra junk in tem to TextPassDumper.php messages on retry of fetch of content have extra junk in them.Apr 3 2020, 11:04 AM