Page MenuHomePhabricator

Micro-optimize ApiResult::isMetadataKey with str_starts_with once we support PHP8+
Open, Needs TriagePublic

Description

ApiResult::isMetadataKey is an extremely hot method that was already micro-optimized in the past, going from substr to string index access (r660062) and then from string index access to ord (r662076). Any optimization needed not only to be as fast as possible, but also support the string potentially being empty or it being an integer. While ord() is the fastest solution we could find in PHP <8, PHP 8.0 introduces str_starts_with, which seems roughly equally faster, and also gracefully handles the integer and empty string cases.

I wrote a quick and oversimplified test script:

<?php

$num = 100000000;
$str = 'abcdefghijklmno';

$s = hrtime(true);
for ( $i = 0; $i < $num; $i++ ) {
    $res1 = ord( $str ) === 95;
}
$time = (hrtime( true ) - $s)/10**6;
echo "ORD: $time\n";

$s = hrtime(true);
for ( $i = 0; $i < $num; $i++ ) {
    $res3 = str_starts_with( $str, '_' );
}
$time = (hrtime( true ) - $s)/10**6;
echo "START: $time\n";

which yields results like these on my machine (ignore PHP8.2 being much slower, likely some local issue):

$ php8.1 test.php 
ORD: 1588.489258
START: 1560.760053

$ php8.2 test.php 
ORD: 5365.255071
START: 5210.782069

Other test runs report str_starts_with as being slower than ord, but the values are always very close, so we will need some more accurate benchmarking. Also, and most importantly, str_starts_with is much more readable, so we should switch to it if possible once we support PHP 8+.

Related Objects

StatusSubtypeAssignedTask
OpenNone
OpenNone
StalledKrinkle
Resolvedtstarling
ResolvedJdforrester-WMF
ResolvedJdforrester-WMF
Resolvedtstarling
ResolvedReedy
ResolvedBUG REPORTtstarling
Resolvedtstarling
ResolvedDaimona
ResolvedDaimona
ResolvedNone
ResolvedJdforrester-WMF
ResolvedBUG REPORTNone
Resolvedtstarling
ResolvedJdforrester-WMF
Resolvedssastry
Resolvedkostajh
Resolvedkostajh
Resolvedthiemowmde
Resolvedtstarling
Resolvedtstarling
ResolvedBUG REPORTLucas_Werkmeister_WMDE
Resolvedhoo
Resolvedhoo
ResolvedJdforrester-WMF
Resolvedthiemowmde
Resolvedkostajh
ResolvedUmherirrender
ResolvedPRODUCTION ERROR brooke
ResolvedTheresNoTime
Resolvedtstarling
OpenJdforrester-WMF
Resolvedlarissagaulia
ResolvedJMeybohm
ResolvedMoritzMuehlenhoff
OpenNone
Resolvedjijiki
Resolvedaaron
Openjijiki
In ProgressClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
StalledClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedJoe
Resolvedcolewhite
ResolvedClement_Goubert
ResolvedClement_Goubert
In ProgressClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
InvalidClement_Goubert
ResolvedJoe
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedJoe
ResolvedJoe
ResolvedJoe
ResolvedJMeybohm
ResolvedJoe
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
DeclinedClement_Goubert
ResolvedClement_Goubert
Openelukey
StalledKrinkle
Resolvedjijiki
ResolvedJoe
ResolvedJoe
ResolvedClement_Goubert
ResolvedBUG REPORTClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedJoe
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedJclark-ctr
ResolvedJMeybohm
ResolvedJoe
ResolvedJoe
ResolvedNone
Resolvedjijiki
Resolvedjijiki
Resolveddancy
Resolveddancy
ResolvedJoe
ResolvedJoe
Resolvedjeena
ResolvedJoe
ResolvedJoe
Resolveddancy
ResolvedJoe
Resolved dpifke
Resolveddancy
ResolvedJoe
ResolvedClement_Goubert
Resolvedcolewhite
Resolvedjijiki
Resolved dpifke
ResolvedLegoktm
ResolvedClement_Goubert
ResolvedJMeybohm
ResolvedClement_Goubert
ResolvedClement_Goubert
OpenNone
OpenClement_Goubert
In ProgressClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
Resolvedhnowlan
Resolvedakosiaris
Openhnowlan
ResolvedClement_Goubert
ResolvedNone
ResolvedDreamy_Jazz
ResolvedPRODUCTION ERRORDreamy_Jazz
Resolvedkostajh
Resolvedjijiki
OpenNone
Resolvedkamila
ResolvedClement_Goubert
ResolvedClement_Goubert
Resolvedakosiaris
OpenNone
Resolvedakosiaris
Resolveddancy
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
ResolvedClement_Goubert
OpenClement_Goubert
Openjijiki
ResolvedJoe
Stalledjijiki
OpenNone
ResolvedJdforrester-WMF
Resolvedjijiki
ResolvedJdforrester-WMF

Event Timeline

We have/require symfony/polyfill-php80 and symfony/polyfill-php81...

Is there much (any?) performance loss to swapping to using it now, for those who aren't on PHP 8.0+ already?

We have/require symfony/polyfill-php80 and symfony/polyfill-php81...

Is there much (any?) performance loss to swapping to using it now, for those who aren't on PHP 8.0+ already?

Not just "much", but a massive lot unfortunately:

<?php

$num = 100000000;
$str = 'abcdefghijklmno';

$s = hrtime(true);
for ( $i = 0; $i < $num; $i++ ) {
    $res1 = ord( $str ) === 95;
}
$time = (hrtime( true ) - $s)/10**6;
echo "ORD: $time\n";

$s = hrtime(true);
for ( $i = 0; $i < $num; $i++ ) {
    $res3 = str_starts_with( $str, '_' );
}
$time = (hrtime( true ) - $s)/10**6;
echo "START: $time\n";

class SymfonyPhp80 {
	public static function str_starts_with2(string $haystack, string $needle): bool
    {
        return 0 === strncmp($haystack, $needle, \strlen($needle));
	}
}

function str_starts_with2(?string $haystack, ?string $needle): bool { return SymfonyPhp80::str_starts_with2($haystack ?? '', $needle ?? ''); }

$s = hrtime(true);
for ( $i = 0; $i < $num; $i++ ) {
    $res3 = str_starts_with2( $str, '_' );
}
$time = (hrtime( true ) - $s)/10**6;
echo "START SYM: $time\n";
$ php8.1 test.php 
ORD: 1573.309082
START: 1598.216216
START SYM: 6385.82479

(Also note that in this test run ord() is faster than str_starts_with())

Using the polyfill involves a userland function call, two null coalesces, a userland method call, and an underlying implementation that uses strncmp, strlen, and a comparison. This wouldn't matter in ordinary code, but since this code is (or at least was) extremely hot, any slowdown matters, particularly a 4x one.

I wonder if this method really is that hot—in the api.php-specific wall time flamegraph for May 8th, I get no hits at all for the method name, and ApiResult shows up as ~0.1%, which is so little that I have trouble finding it in the flamegraph. In the CPU time graph, ApiResult::applyTransformations shows up as responsible for ~0.4% of CPU time, but isMetadataKey is not visible there either.

I368cd3b526edeb247a6ebda7420897a51357407d appears to have been based on an XHGui profile, which tends to overstate the impact of small functions called relatively often.