Page MenuHomePhabricator

PHP Notice: Uninitialized string offset: includes/libs/JavaScriptMinifier.php on line 571
Closed, ResolvedPublic

Description

-rakkaus:#mediawiki-i18n- [18-Nov-2014 14:18:47 UTC] PHP Notice:  Uninitialized string offset: 710 in /www/translatewiki.net/w/includes/libs/JavaScriptMinifier.php on line 571

Details

Reference
bz73556

Related Objects

View Standalone Graph
This task is connected to more than 200 other tasks. Only direct parents and subtasks are shown here. Use View Standalone Graph to show more of the graph.

Event Timeline

bzimport raised the priority of this task from to Normal.Nov 22 2014, 3:51 AM
bzimport set Reference to bz73556.
Nemo_bis created this task.Nov 18 2014, 2:50 PM

Excerpt:

class JavaScriptMinifier {
public static function minify( $s, $statementsOnOwnLine = false, $maxLineLength = 1000 ) {
  // ..

  $out = '';
  $pos = 0;
  $length = strlen( $s );
  $lineLength = 0;
  $last = ';';
  while( $pos < $length ) {
    // ..

    $out .= $token;
    $lineLength += $end - $pos;
    $last = $s[$end - 1];
    $pos = $end;

The while loop manipulates $end to reflect where we are in the code. In some cases it advanced artificially based on changes or assumptions it made. I guess it must be off in some cases.

Does this one happen often on translatewiki? Can you produce a stack trace (so that we know whether it's triggered by ResourceLoaderFileModule, WikiModule, something else, etc. And ideally also an excerpt of $s (e.g. the first 20 characters) which will probably identify what the input script was.

@Nemo: Can you help get a test case? By getting a stack trace we'll narrow down whehter this is from a file module, wiki module or other generated script. And by getting the argument we'll identify the script in question.

Krinkle updated the task description. (Show Details)Nov 24 2014, 3:29 PM
Krinkle set Security to None.

I finally got a backtrace and a URL. There must be some sort of cache as the notice is not reproducible by accessing the url.

It seems the reason is missing / from beginning of a comment: https://translatewiki.net/wiki/User:Arystanbek/common.js

2014-12-04 11:48:09 translatewiki.net translatewiki_net-bw_: [6d10d1d5] /w/load.php?debug=false&lang=kk-cyrl&modules=user&only=scripts&skin=vector&user=Arystanbek&version=20141204T114746Z&*   ErrorException from line 571 of /www/translatewiki.net/w/includes/libs/JavaScriptMinifier.php: PHP Notice: Uninitialized string offset: 710
#0 /www/translatewiki.net/w/includes/libs/JavaScriptMinifier.php(571): MWExceptionHandler::handleError(8, 'Uninitialized s...', '/www/translatew...', 571, Array)
#1 /www/translatewiki.net/w/includes/resourceloader/ResourceLoader.php(194): JavaScriptMinifier::minify('/*?User:Arystan...', false, 1000)
#2 /www/translatewiki.net/w/includes/resourceloader/ResourceLoader.php(1035): ResourceLoader->filter('minify-js', '/*?User:Arystan...')
#3 /www/translatewiki.net/w/includes/resourceloader/ResourceLoader.php(647): ResourceLoader->makeModuleResponse(Object(ResourceLoaderContext), Array, Array)
#4 /www/translatewiki.net/w/load.php(45): ResourceLoader->respond(Object(ResourceLoaderContext))
#5 {main}
Jdforrester-WMF removed Catrope as the assignee of this task.Apr 17 2015, 5:54 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 18 2015, 9:12 PM

I can still reproduce this:

> $a = json_decode("\" * \u0411\u0435\u0442 \u043c\u04d9\u0442\u0456\u043d\u0456\u043d \u041c\u0435\u0434\u0438\u0430\u0423\u0438\u043a\u0438 \u0445\u0430\u0431\u0430\u0440 \u0430\u0442\u0442\u0430\u0440\u044b\u043d\u0430 \u0430\u0443\u044b\u0441\u0442\u044b\u0440\u044b\u043f \u043a\u04e9\u0440\u0441\u0435\u0442\u0435\u0442\u0456\u043d \u0441\u0456\u043b\u0442\u0435\u043c\u0435\u043d\u0456 \u049b\u04b1\u0440\u0430\u043b\u0434\u0430\u0440 \u0431\u04e9\u043b\u0456\u043c\u0456\u043d\u0435 \u049b\u043e\u0441\u0443\n * Revision: 1.0\n * Author: Edokter\n * *Source code: [[:en:MediaWiki:Gadget-ShowMessageNames.js]]\n *\/\n \n$( document ).ready( function() {\n  mw.util.addPortletLink(\n    'p-tb',\n    location.href.replace( location.hash, '' ) + ( location.search ? '&' : '?' ) + 'uselang=qqx',\n    '\u0416\u04af\u0439\u0435 \u0445\u0430\u0431\u0430\u0440\u044b\u043d\u044b\u04a3 \u0430\u0442\u0430\u0443\u043b\u0430\u0440\u044b',\n    't-messagenames',\n    '\u0411\u0435\u0442 \u043c\u04d9\u0442\u0456\u043d\u0456\u043d \u041c\u0435\u0434\u0438\u0430\u0423\u0438\u043a\u0438 \u0445\u0430\u0431\u0430\u0440 \u0430\u0442\u0430\u0443\u043b\u0430\u0440\u044b\u043d\u0430 \u0430\u0443\u044b\u0441\u0442\u044b\u0440\u044b\u043f \u043a\u04e9\u0440\u0441\u0435\u0442\u0443'\n  );\n});\"");
> echo JavaScriptMinifier::minify($a);

PHP Notice:  Uninitialized string offset: 643 in /vagrant/mediawiki/includes/libs/JavaScriptMinifier.php on line 579
PHP Stack trace:
PHP   3. eval() /vagrant/mediawiki/maintenance/eval.php:78
PHP   4. JavaScriptMinifier::minify() /vagrant/mediawiki/maintenance/eval.php(78) : eval()'d code:1
Krinkle lowered the priority of this task from Normal to Low.May 12 2017, 5:55 PM
Od1n added a subscriber: Od1n.Dec 15 2017, 5:45 AM

Test case: JavaScriptMinifier::minify('*/');

The * is recognized as a punctuation character, and the / is recognized as a regex start, that's where there are increments of 2, $end reaches 3, then we hit the "break" in the loop and $end is staying overflown.

Restricted Application added a project: Performance-Team. · View Herald TranscriptDec 15 2017, 5:45 AM
Od1n added a comment.Dec 15 2017, 6:09 AM

Possible fix, inserting near line 484:

	// Regexp literal, search to the end, skipping over backslash escapes and
	// character classes
	for( ; ; ) {
		do{
			$end += strcspn( $s, '/[\\', $end ) + 2;
		} while( $end - 2 < $length && $s[$end - 2] === '\\' );
		$end--;
		if( $end - 1 >= $length || $s[$end - 1] === '/' ) {
			break;
		}
		do{
			$end += strcspn( $s, ']\\', $end ) + 2;
		} while( $end - 2 < $length && $s[$end - 2] === '\\' );
		$end--;
	};
+	if( $end > $length ) {
+		$end--;
+	}
	// Search past the regexp modifiers (gi)
	while( $end < $length && ctype_alpha( $s[$end] ) ) {
		$end++;
	}

Change 399861 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] JavaScriptMinifier: Improve docs around parsing of regexp literals

https://gerrit.wikimedia.org/r/399861

Krinkle claimed this task.Dec 22 2017, 6:04 PM

@Od1n Thank you for your patch. To our knowledge, the impact of this PHP notice is low because the fallback behaviour when the error happens seems to result in the correct outcome. As such, this has relatively low priority, and relatively high risk of breaking something. As such, I cannot promise that this will be implemented soon.

However, I will review the code thoroughly and consider your patch. As a starting point, I have improved the test coverage of this class, removed some unused code, and improved the documentation.

Od1n added a comment.Dec 23 2017, 2:37 AM

Indeed, the error happens only on invalid input. Still, there is a clear possibility of overflow here, fixing it is good to do. You should consider as luck it resulted in just a php notice :)

But please consider my above patch only as a starting point, a demonstration of concept. It probably could be improved.

Change 399861 merged by jenkins-bot:
[mediawiki/core@master] JavaScriptMinifier: Improve docs around parsing of regexp literals

https://gerrit.wikimedia.org/r/399861

Krinkle added a comment.EditedJan 2 2018, 10:15 PM

@Od1n Now that we have full code/test coverage, I'd be willing to consider your patch. I'll look at it this later this week.

If you are familiar using the command prompt (also known as Terminal), could you consider submitting the patch to Gerrit? See Developer access and Gerrit for details. Alternatively, if you Git already, you can use Patch uploader (no git-review, ssh or Gerrit account needed).

Change 403331 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] JavaScriptMinifier: Improve docs for parsing of string literals

https://gerrit.wikimedia.org/r/403331

Change 403332 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] JavaScriptMinifier: Fix "Uninitialized offset" in string and regexp parsing

https://gerrit.wikimedia.org/r/403332

@Od1n I've used your patch as starting point, but applied it slightly differently.

  1. Instead of checking outside the do-while loop, I've placed the check closer to the point where I believe the incorrect offset is created. I did so by changing an existing check, this helps avoid introducing a second check and should make the code easier to follow. However, if you believe there are cases where this problem can be reached for the other branches in the do-while loop, let me know and we can consider adding a second check there, and/or by placing it outside the loop like you had originally.
  1. I noticed the same problem exists with the parsing of string literals, and applied a similar fix there.

The added tests in https://gerrit.wikimedia.org/r/403332 consistently produce PHP Notice: Uninitialized string offset when run without the patch to JavaScriptMinifier.php.

Od1n added a comment.Jan 10 2018, 3:05 AM

Thank you for your work on this.

I trust you about it, you probably have better understood that code than myself.

I think the most important is to apply the "overflow fixes" only in code paths where such overflows can technically happen. If ever there are doubts, better fix a bit less than too much.

Od1n added a comment.Jan 13 2018, 8:42 AM

So, you don't think there is the same issue with the 2nd do/while of the for loop? I haven't checked it thoroughly, but it just looks very similar, with these n+2 steps. That's why I added the "overflow fix" after the for loop in my demonstration patch, to cover the two do/while's.

Would you be able to write a test that breaks the 2nd do/while?

Change 403331 merged by jenkins-bot:
[mediawiki/core@master] JavaScriptMinifier: Improve docs for parsing of string literals

https://gerrit.wikimedia.org/r/403331

Change 403332 merged by jenkins-bot:
[mediawiki/core@master] JavaScriptMinifier: Fix "Uninitialized offset" in string and regexp parsing

https://gerrit.wikimedia.org/r/403332

So, you don't think there is the same issue with the 2nd do/while of the for loop? I haven't checked it thoroughly, but it just looks very similar, with these n+2 steps. That's why I added the "overflow fix" after the for loop in my demonstration patch, to cover the two do/while's.
Would you be able to write a test that breaks the 2nd do/while?

Yep, looks like that case can be similarly broken as well.

Change 406972 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] JavaScriptMinifier: Fix "Uninitialized offset" in regexp char class parsing

https://gerrit.wikimedia.org/r/406972

Krinkle closed this task as Resolved.Feb 3 2018, 1:12 AM
Krinkle closed this task as Resolved.