Page MenuHomePhabricator

Revision::getContentInternal() is not handling text load failure properly
Closed, ResolvedPublic

Description

In that function, it calls $this->loadText() to load text. In loadText(), it calls self::getRevisionText() and it's said to "@return String: text the text requested or false on failure". When loading of text fails, loadText() returns false, then false is passed to $handler->unserializeContent(), causing exception "TextContent expects a string in the constructor."

TextContent expects a string in the constructor.
Backtrace:
#0 /home/liangent/code/gerrit/mediawiki/core/includes/content/WikitextContent.php(8): TextContent->construct(false, 'wikitext')
#1 /home/liangent/code/gerrit/mediawiki/core/includes/content/ContentHandler.php(1185): WikitextContent->
construct(false)
#2 /home/liangent/code/gerrit/mediawiki/core/includes/Revision.php(979): WikitextContentHandler->unserializeContent(false, 'text/x-wiki')
#3 /home/liangent/code/gerrit/mediawiki/core/includes/Revision.php(921): Revision->getContentInternal()
#4 /home/liangent/code/gerrit/mediawiki/core/includes/Revision.php(897): Revision->getContent(1, NULL)
#5 /home/liangent/code/gerrit/mediawiki/core/extensions/Wikipedia/classes/WikipediaHooks.php(273): Revision->getText()
...


Version: unspecified
Severity: normal

Details

Reference
bz41244

Event Timeline

bzimport raised the priority of this task from to Needs Triage.Nov 22 2014, 1:04 AM
bzimport set Reference to bz41244.
bzimport added a subscriber: Unknown Object (MLST).

sumanah wrote:

Liangent, about how often does this affect the end user, and under what circumstances? Thanks for the report.

Liangent: thanks for pointing this out.

This looks like something that should be fixed but is at the same time not high priority, as it is about the nice handling of a failure case which should not happen to begin with. (Unless I misunderstand the issue, which is quite possible, as I did not look at it that much.)

(In reply to comment #2)

Liangent: thanks for pointing this out.

This looks like something that should be fixed but is at the same time not high
priority, as it is about the nice handling of a failure case which should not
happen to begin with. (Unless I misunderstand the issue, which is quite
possible, as I did not look at it that much.)

I'm running a bot using MediaWiki as a framework on Toolserver with replicated DB. Since text is not available in DB, when $rev->getText() is called, text is loaded from my modified Revision::loadText() via network to WMF server which has a higher chance to fail. In past, $rev->getText() returns false and my bot code can ignore that. Now my bot process is just killed.

It's possible to catch the exception and check for the string "TextContent expects a string in the constructor." (because I don't want to ignore other kinds of exceptions) but determining error type by looking at the human-readable error message sounds nasty.

(In reply to comment #5)

https://gerrit.wikimedia.org/r/#/c/29597/

This makes the caller unable to distinguish loading failure or success with really blank content. See my use case above.

(In reply to comment #7)

https://gerrit.wikimedia.org/r/#/c/29597/ has been merged. Close bug as Fixed?

Not so satisfied with this resolution... see my comment above.

I'll let Daniel take a look at that and see how we can handle these cases in a nicer way. He will be back next week.

THIMC: As Liangent points out, the above patch does not restore the previous behavior of getText() to return false on failure. However, that was never documented behavior, and getText() itself is now deprecated. Because of this, I think it's fine not to try to restore that behavior (and return an empty string instead). The problem with returning fdalse on failure is that now, getText() calls getContent(), and getContent() will return null if the content failed to load or was not available to the desired audient - you can't tell which.

Closed older resolved bugs as verified.