Commons file: Exception encountered, of type "Exception"
Closed, Resolved (Public)

Description

Opening https://commons.wikimedia.org/wiki/File:Georg_August_Samuel_von_Nassau-Idstein.jpg results in error message: Exception encountered, of type "Exception"

https://de.wikipedia.org/wiki/Datei:Georg_August_Samuel_von_Nassau-Idstein.jpg shows the file.

What is wrong?

Event Timeline

Thgoiter raised the priority of this task to Needs Triage.
Thgoiter updated the task description.
Thgoiter subscribed.
Restricted Application added subscribers: Steinsplitter, Aklapper.
2015-10-29 19:58:17 mw1210 commonswiki exception ERROR: [bb4f3777] /wiki/File:Georg_August_Samuel_von_Nassau-Idstein.jpg   Exception from line 957 of /srv/mediawiki/php-1.27.0-wmf.4/includes/MagicWord.php: MagicWordArray::matchAndRemove: preg_match_all returned false {"exception_id":"bb4f3777"} 
[Exception Exception] (/srv/mediawiki/php-1.27.0-wmf.4/includes/MagicWord.php:957) MagicWordArray::matchAndRemove: preg_match_all returned false
  #0 /srv/mediawiki/php-1.27.0-wmf.4/includes/parser/Parser.php(4349): MagicWordArray->matchAndRemove(string)
  #1 /srv/mediawiki/php-1.27.0-wmf.4/includes/parser/Parser.php(1264): Parser->doDoubleUnderscore(string)
  #2 /srv/mediawiki/php-1.27.0-wmf.4/includes/parser/Parser.php(441): Parser->internalParse(string)
  #3 /srv/mediawiki/php-1.27.0-wmf.4/includes/OutputPage.php(1738): Parser->parse(string, Title, ParserOptions, boolean, boolean, integer)
  #4 /srv/mediawiki/php-1.27.0-wmf.4/includes/OutputPage.php(1680): OutputPage->addWikiTextTitle(string, Title, boolean, boolean, boolean)
  #5 /srv/mediawiki/php-1.27.0-wmf.4/includes/page/ImagePage.php(209): OutputPage->addWikiText(string)
  #6 /srv/mediawiki/php-1.27.0-wmf.4/includes/actions/ViewAction.php(44): ImagePage->view()
  #7 /srv/mediawiki/php-1.27.0-wmf.4/includes/MediaWiki.php(457): ViewAction->show()
  #8 /srv/mediawiki/php-1.27.0-wmf.4/includes/MediaWiki.php(254): MediaWiki->performAction(ImagePage, Title)
  #9 /srv/mediawiki/php-1.27.0-wmf.4/includes/MediaWiki.php(669): MediaWiki->performRequest()
  #10 /srv/mediawiki/php-1.27.0-wmf.4/includes/MediaWiki.php(474): MediaWiki->main()
  #11 /srv/mediawiki/php-1.27.0-wmf.4/index.php(41): MediaWiki->run()
  #12 /srv/mediawiki/w/index.php(3): include(string)
  #13 {main}

The direct cause is https://gerrit.wikimedia.org/r/#/c/246719/ (T115514). It seems to have something to do with __MAGIC__ words, but the page doesn't contain any.

tgr@terbium:~$ mwscript eval.php --wiki=commonswiki
> $mwa = MagicWord::getDoubleUnderscoreArray();
> var_dump($mwa->getRegex());
array(2) {
  [0]=>
  string(295) "/(?P<a_notoc>__NOTOC__)|(?P<a_nogallery>__NOGALLERY__)|(?P<a_forcetoc>__FORCETOC__)|(?P<a_toc>__TOC__)|(?P<a_noeditsection>__NOEDITSECTION__)|(?P<a_notitleconvert>__NOTITLECONVERT__)|(?P<b_notitleconvert>__NOTC__)|(?P<a_nocontentconvert>__NOCONTENTCONVERT__)|(?P<b_nocontentconvert>__NOCC__)/iuS"
  [1]=>
  string(245) "/(?P<a_newsectionlink>__NEWSECTIONLINK__)|(?P<a_nonewsectionlink>__NONEWSECTIONLINK__)|(?P<a_hiddencat>__HIDDENCAT__)|(?P<a_index>__INDEX__)|(?P<a_noindex>__NOINDEX__)|(?P<a_staticredirect>__STATICREDIRECT__)|(?P<a_disambiguation>__DISAMBIG__)/S"
}
> foreach ( $mwa->getRegex() as $regex ) { var_dump(preg_match_all($regex, '', $matches, PREG_SET_ORDER)); }
int(0)
int(0)

Change 249873 had a related patch set uploaded (by Gergő Tisza):
Include preg_last_error() in error message when preg_* fails

https://gerrit.wikimedia.org/r/249873

Change 249873 merged by jenkins-bot:
Include preg_last_error() in error message when preg_* fails

https://gerrit.wikimedia.org/r/249873

Change 249888 had a related patch set uploaded (by Gergő Tisza):
Include preg_last_error() in error message when preg_* fails

https://gerrit.wikimedia.org/r/249888

Change 249888 merged by jenkins-bot:
Include preg_last_error() in error message when preg_* fails

https://gerrit.wikimedia.org/r/249888

It's a PREG_BAD_UTF8_ERROR apparently.
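
For reference, the failure mode can be reproduced outside MediaWiki. This is a minimal sketch, assuming a hand-made invalid byte sequence (the actual bytes in the file's metadata are not shown in this task) and shortened stand-ins for the two regexes. With the `u` modifier, preg_match_all() returns false on a subject that is not valid UTF-8 and preg_last_error() reports PREG_BAD_UTF8_ERROR; without `u`, the same subject simply yields zero matches. That would fit the getRegex() output above, where only the first regex carries `u`.

```php
<?php
// Hypothetical subject: "\xC3\x28" is an invalid UTF-8 sequence
// (a lead byte followed by a byte that is not a continuation byte).
$subject = "caption \xC3\x28 text";

// Shortened stand-ins for the two regexes returned by getRegex().
$withU    = '/(?P<a_notoc>__NOTOC__)|(?P<a_toc>__TOC__)/iuS';
$withoutU = '/(?P<a_index>__INDEX__)|(?P<a_noindex>__NOINDEX__)/S';

// The /u pattern refuses the subject outright.
$resultWithU = preg_match_all( $withU, $subject, $matches, PREG_SET_ORDER );
$errorWithU  = preg_last_error();

// The byte-oriented pattern runs normally and finds nothing.
$resultWithoutU = preg_match_all( $withoutU, $subject, $matches, PREG_SET_ORDER );
$errorWithoutU  = preg_last_error();

var_dump( $resultWithU );                         // bool(false)
var_dump( $errorWithU === PREG_BAD_UTF8_ERROR );  // bool(true)
var_dump( $resultWithoutU );                      // int(0)
var_dump( $errorWithoutU === PREG_NO_ERROR );     // bool(true)
```

An empty string, as used in the eval.php test above, is valid UTF-8, which is why both calls returned int(0) there.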

That's interesting. It should be nearly impossible to put invalid UTF-8 into a wiki page. I wonder if that file was uploaded via GWToolset or the importImages.php script, which might do less UTF-8 checking.

/me also wonders what happened to our nice error handler that displayed prettily in the skin.

Change 249898 had a related patch set uploaded (by Gergő Tisza):
Revert throwing exceptions on preg_* failures

https://gerrit.wikimedia.org/r/249898

Change 249901 had a related patch set uploaded (by Gergő Tisza):
Revert throwing exceptions on preg_* failures

https://gerrit.wikimedia.org/r/249901

> That's interesting. It should be nearly impossible to put invalid UTF-8 into a wiki page. I wonder if that file was uploaded via GWToolset or the importImages.php script, which might do less UTF-8 checking.

It's actually erroring on the metadata table.

Why is Parser->doDoubleUnderscore even called on the metadata table?

For historical reasons, MW's Exif handling is extremely nutty.

The entire metadata table gets run through the parser.

Change 249898 merged by jenkins-bot:
Revert throwing exceptions on preg_* failures

https://gerrit.wikimedia.org/r/249898

Change 249901 merged by jenkins-bot:
Revert throwing exceptions on preg_* failures

https://gerrit.wikimedia.org/r/249901

Tgr claimed this task.

Exception ugliness is T117128.
Lack of UTF-8 sanitization of the metadata table is T117129.
The more generic issue of why the table is parsed is T117130.
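
To illustrate the kind of sanitization T117129 is about, here is a rough sketch that is not MediaWiki code. The helper name sanitizeMetadataValue() and the sample input are made up, and it assumes the mbstring extension is available. The idea is to replace invalid byte sequences in a metadata value before it reaches any `u`-modifier regex.

```php
<?php
// Hypothetical helper, not part of MediaWiki: make a metadata value
// safe to pass to /u regexes. Requires the mbstring extension.
function sanitizeMetadataValue( $value ) {
	if ( mb_check_encoding( $value, 'UTF-8' ) ) {
		// Already valid UTF-8, leave it untouched.
		return $value;
	}
	// Converting UTF-8 to UTF-8 makes mbstring replace invalid byte
	// sequences with its substitute character.
	return mb_convert_encoding( $value, 'UTF-8', 'UTF-8' );
}

// Made-up input containing an invalid UTF-8 sequence.
$dirty = "caption \xC3\x28 text";
$clean = sanitizeMetadataValue( $dirty );

// The cleaned value is valid UTF-8, so a /u regex no longer fails on it.
$isValid    = mb_check_encoding( $clean, 'UTF-8' );
$matchCount = preg_match_all( '/__TOC__/iu', $clean, $m );

var_dump( $isValid );
var_dump( $matchCount );
```

Whether to substitute, strip, or reject invalid bytes is a design choice; substitution keeps the rest of the value readable in the metadata table.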