
Commons file: Exception encountered, of type "Exception"
Closed, Resolved (Public)

Description

Opening https://commons.wikimedia.org/wiki/File:Georg_August_Samuel_von_Nassau-Idstein.jpg results in error message: Exception encountered, of type "Exception"

https://de.wikipedia.org/wiki/Datei:Georg_August_Samuel_von_Nassau-Idstein.jpg shows the file.

What is wrong?

Event Timeline

Thgoiter created this task. Oct 29 2015, 5:48 PM
Thgoiter raised the priority of this task from to Needs Triage.
Thgoiter updated the task description.
Thgoiter added a subscriber: Thgoiter.
Restricted Application added a project: Multimedia. Oct 29 2015, 5:48 PM
Restricted Application added subscribers: Steinsplitter, Aklapper.
Tgr added a subscriber: Tgr. Edited Oct 29 2015, 7:58 PM
2015-10-29 19:58:17 mw1210 commonswiki exception ERROR: [bb4f3777] /wiki/File:Georg_August_Samuel_von_Nassau-Idstein.jpg   Exception from line 957 of /srv/mediawiki/php-1.27.0-wmf.4/includes/MagicWord.php: MagicWordArray::matchAndRemove: preg_match_all returned false {"exception_id":"bb4f3777"} 
[Exception Exception] (/srv/mediawiki/php-1.27.0-wmf.4/includes/MagicWord.php:957) MagicWordArray::matchAndRemove: preg_match_all returned false
  #0 /srv/mediawiki/php-1.27.0-wmf.4/includes/parser/Parser.php(4349): MagicWordArray->matchAndRemove(string)
  #1 /srv/mediawiki/php-1.27.0-wmf.4/includes/parser/Parser.php(1264): Parser->doDoubleUnderscore(string)
  #2 /srv/mediawiki/php-1.27.0-wmf.4/includes/parser/Parser.php(441): Parser->internalParse(string)
  #3 /srv/mediawiki/php-1.27.0-wmf.4/includes/OutputPage.php(1738): Parser->parse(string, Title, ParserOptions, boolean, boolean, integer)
  #4 /srv/mediawiki/php-1.27.0-wmf.4/includes/OutputPage.php(1680): OutputPage->addWikiTextTitle(string, Title, boolean, boolean, boolean)
  #5 /srv/mediawiki/php-1.27.0-wmf.4/includes/page/ImagePage.php(209): OutputPage->addWikiText(string)
  #6 /srv/mediawiki/php-1.27.0-wmf.4/includes/actions/ViewAction.php(44): ImagePage->view()
  #7 /srv/mediawiki/php-1.27.0-wmf.4/includes/MediaWiki.php(457): ViewAction->show()
  #8 /srv/mediawiki/php-1.27.0-wmf.4/includes/MediaWiki.php(254): MediaWiki->performAction(ImagePage, Title)
  #9 /srv/mediawiki/php-1.27.0-wmf.4/includes/MediaWiki.php(669): MediaWiki->performRequest()
  #10 /srv/mediawiki/php-1.27.0-wmf.4/includes/MediaWiki.php(474): MediaWiki->main()
  #11 /srv/mediawiki/php-1.27.0-wmf.4/index.php(41): MediaWiki->run()
  #12 /srv/mediawiki/w/index.php(3): include(string)
  #13 {main}
Restricted Application added a subscriber: Matanya. Oct 29 2015, 7:58 PM
Tgr added a comment. Edited Oct 29 2015, 8:19 PM

The direct cause is https://gerrit.wikimedia.org/r/#/c/246719/ (T115514). The failure seems to be related to __MAGIC__ words (double-underscore behavior switches), but the page doesn't contain any.

Tgr added a comment. Oct 29 2015, 8:32 PM
tgr@terbium:~$ mwscript eval.php --wiki=commonswiki
> $mwa = MagicWord::getDoubleUnderscoreArray();
> var_dump($mwa->getRegex());
array(2) {
  [0]=>
  string(295) "/(?P<a_notoc>__NOTOC__)|(?P<a_nogallery>__NOGALLERY__)|(?P<a_forcetoc>__FORCETOC__)|(?P<a_toc>__TOC__)|(?P<a_noeditsection>__NOEDITSECTION__)|(?P<a_notitleconvert>__NOTITLECONVERT__)|(?P<b_notitleconvert>__NOTC__)|(?P<a_nocontentconvert>__NOCONTENTCONVERT__)|(?P<b_nocontentconvert>__NOCC__)/iuS"
  [1]=>
  string(245) "/(?P<a_newsectionlink>__NEWSECTIONLINK__)|(?P<a_nonewsectionlink>__NONEWSECTIONLINK__)|(?P<a_hiddencat>__HIDDENCAT__)|(?P<a_index>__INDEX__)|(?P<a_noindex>__NOINDEX__)|(?P<a_staticredirect>__STATICREDIRECT__)|(?P<a_disambiguation>__DISAMBIG__)/S"
}
> foreach ( $mwa->getRegex() as $regex ) { var_dump(preg_match_all($regex, '', $matches, PREG_SET_ORDER)); }
int(0)
int(0)

Change 249873 had a related patch set uploaded (by Gergő Tisza):
Include preg_last_error() in error message when preg_* fails

https://gerrit.wikimedia.org/r/249873

Change 249873 merged by jenkins-bot:
Include preg_last_error() in error message when preg_* fails

https://gerrit.wikimedia.org/r/249873

Change 249888 had a related patch set uploaded (by Gergő Tisza):
Include preg_last_error() in error message when preg_* fails

https://gerrit.wikimedia.org/r/249888

Change 249888 merged by jenkins-bot:
Include preg_last_error() in error message when preg_* fails

https://gerrit.wikimedia.org/r/249888
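
As a rough illustration of what "include preg_last_error() in the error message" can look like, here is a minimal sketch. It is not the code from the Gerrit changes above; `pregErrorName()` and `matchAllOrThrow()` are hypothetical helpers invented for this example, and the subject string is made up.

```php
<?php
// Hypothetical helper (not from the Gerrit change): map a preg error code to its constant name.
function pregErrorName(int $code): string {
    $names = [
        PREG_NO_ERROR              => 'PREG_NO_ERROR',
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
        PREG_BAD_UTF8_OFFSET_ERROR => 'PREG_BAD_UTF8_OFFSET_ERROR',
    ];
    return $names[$code] ?? "unknown error ($code)";
}

// Hypothetical wrapper: run preg_match_all and report *why* it failed, not just that it did.
function matchAllOrThrow(string $regex, string $subject): array {
    $result = preg_match_all($regex, $subject, $matches, PREG_SET_ORDER);
    if ($result === false) {
        throw new RuntimeException(
            __FUNCTION__ . ': preg_match_all returned false: ' . pregErrorName(preg_last_error())
        );
    }
    return $matches;
}

try {
    // "\xFF" can never appear in valid UTF-8, so the /u regex rejects the subject.
    matchAllOrThrow('/__TOC__/iu', "bad byte: \xFF");
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```

With the error name in the message, a log line like the one in the stack trace above would identify the failure class directly instead of requiring a follow-up investigation.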

Tgr added a comment. Oct 29 2015, 9:43 PM

It's a PREG_BAD_UTF8_ERROR apparently.
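
For readers unfamiliar with this error, a small sketch of how it arises: of the two regexes dumped in the eval.php session above, only the first carries the `u` modifier, so (assuming standard PHP PCRE behavior) that is the one that refuses a subject containing invalid UTF-8. The regexes below are shortened stand-ins and the subject string is made up for illustration.

```php
<?php
// Shortened stand-ins for the two regexes in the eval.php session; only the first uses /u.
$unicodeRegex = '/(?P<a_notoc>__NOTOC__)|(?P<a_toc>__TOC__)/iuS';
$byteRegex    = '/(?P<a_index>__INDEX__)|(?P<a_noindex>__NOINDEX__)/S';

// Made-up subject: "\xC3\x28" is not a valid UTF-8 sequence.
$subject = "Artist: \xC3\x28";

$withU     = preg_match_all($unicodeRegex, $subject, $m1, PREG_SET_ORDER);
$errorCode = preg_last_error();  // read immediately; the next preg_* call resets it
$withoutU  = preg_match_all($byteRegex, $subject, $m2, PREG_SET_ORDER);

var_dump($withU);                              // bool(false)
var_dump($errorCode === PREG_BAD_UTF8_ERROR);  // bool(true)
var_dump($withoutU);                           // int(0)
```

This also explains why the empty-string test in the eval.php session returned int(0) for both regexes: an empty subject is trivially valid UTF-8, so nothing fails.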

That's interesting. It should be pretty impossible to put invalid utf8 into a wikipage. I wonder if that file was uploaded via gwtoolset or importImages.php scripts, which might have less utf8 checking.

/me also wonders what happened to our nice error handler that displayed prettily in the skin.

Change 249898 had a related patch set uploaded (by Gergő Tisza):
Revert throwing exceptions on preg_* failures

https://gerrit.wikimedia.org/r/249898

Change 249901 had a related patch set uploaded (by Gergő Tisza):
Revert throwing exceptions on preg_* failures

https://gerrit.wikimedia.org/r/249901

> That's interesting. It should be pretty impossible to put invalid utf8 into a wikipage. I wonder if that file was uploaded via gwtoolset or importImages.php scripts, which might have less utf8 checking.

It's actually erroring on the metadata table.

Tgr added a comment. Oct 29 2015, 10:05 PM

Why is Parser->doDoubleUnderscore even called on the metadata table?

For historical reasons, MW's exif handling is extremely nutty.

The entire metadata table gets run through the parser.

Change 249898 merged by jenkins-bot:
Revert throwing exceptions on preg_* failures

https://gerrit.wikimedia.org/r/249898

Change 249901 merged by jenkins-bot:
Revert throwing exceptions on preg_* failures

https://gerrit.wikimedia.org/r/249901

Tgr closed this task as Resolved. Oct 29 2015, 11:31 PM
Tgr claimed this task.

Exception ugliness is T117128.
Lack of UTF sanitization of the metadata table is T117129.
The more generic issue of why the table is parsed is T117130.
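
To make the sanitization idea concrete, here is a minimal sketch of cleaning a metadata value before it reaches the parser. This is only an assumption about what such a step could look like, not the approach taken in T117129: `sanitizeUtf8()` is a hypothetical helper, the sample value is made up, and the code assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper; the actual fix tracked in T117129 may look quite different.
function sanitizeUtf8(string $text): string {
    // An empty pattern with /u matches any valid UTF-8 string and fails on invalid input.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }
    // Converting UTF-8 to UTF-8 makes mbstring replace invalid byte sequences
    // with its substitute character, leaving a valid UTF-8 string.
    return mb_convert_encoding($text, 'UTF-8', 'UTF-8');
}

// Made-up metadata value containing a broken byte sequence.
$metadataValue = "Nassau-Idstein \xC3\x28";
$clean = sanitizeUtf8($metadataValue);

// The cleaned value can now be matched with a /u regex without a PREG_BAD_UTF8_ERROR.
var_dump(preg_match_all('/__TOC__/iu', $clean, $matches, PREG_SET_ORDER) !== false);
```

Valid strings are returned untouched, so a step like this would only alter values that would otherwise make /u regexes fail.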

Restricted Application added a subscriber: StudiesWorld. Nov 3 2015, 10:45 PM