This is an annex to T99740 and Iaa5e32830dc1bb710b9e0f1a681afe91e521ece9.

Here are some statistics for the three methods (CDB, current PHP arrays, proposed PHP arrays).

Summary for the proposed PHP arrays:
* generation time: about 5x quicker than CDB, about 2x quicker than current PHP arrays
* size: about 10% smaller than CDB, about 30% smaller than current PHP arrays

Stats across the 440 languages (414 cache files), given as CDB / current PHP arrays / proposed PHP arrays:
* time: mean: 0.78493354060433 / 0.29702293276787 / 0.14556954828176
* time: variance: 0.001585028865205 / 0.00016412325920965 / 9.3818872110887E-5
* time: stddev: 0.039812420991508 / 0.012811060034581 / 0.0096860142530809
* size: total: 472M / 611M / 426M
* size: mean: 1.14M / 1.48M / 1.03M

Methodological details:
* MW version: CDB and current PHP arrays: 757b54b, proposed PHP arrays: b481d81
* time is in seconds, averaged over 5 consecutive measurements for each language
* computed on my laptop (Debian) on an ext4 filesystem
* tested code (in maintenance/eval.php): `$lc = new LocalisationCache( $conf ); $lc->getItem( $code, 'namespaceNames' );` for each code in `MediaWiki\Languages\Data\Names::$names`
* variance is the unbiased sample variance (divided by N-1, not N)
* stddev is sqrt( unbiased sample variance ), hence it is biased
* size computed with `du -hs`
* the LocalisationCache object is recreated on each iteration, otherwise the computed language is kept in memory

Code to be executed with `php maintenance/eval.php`:
```
$names = MediaWiki\Languages\Data\Names::$names;
$conf = [ 'store' => 'array', 'storeDirectory' => '/tmp/files' ];
$internal_N = 5; # Number of consecutive measurements per language
$times = [];
foreach( $names as $code => $name ) {
    $time = microtime(true);
    for( $i=0; $i<$internal_N; $i++ ) {
        # Recreate the cache object each time, else the language is kept in memory
        $lc = new LocalisationCache( $conf );
        $lc->getItem( $code, 'namespaceNames' );
        # Delete the generated cache files so the next iteration regenerates them
        array_map('unlink', glob("/tmp/files/*.php"));
    }
    $time = microtime(true) - $time;
    $times[$code] = $time / $internal_N; # Mean time per generation for this language
    echo "$code - {$times[$code]} - $name\n";
}
$n = count( $times );
$mean = array_sum($times) / $n;
var_dump( $mean ); # Mean

$carry = 0.0; # Sum of squared deviations from the mean
foreach( $times as $time ) {
    $d = $time - $mean;
    $carry += $d * $d;
}
var_dump( $carry / ($n-1) ); # Unbiased sample variance
var_dump( sqrt( $carry / ($n-1) ) ); # (Biased) sample standard deviation
```
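
The rounded summary figures can be re-derived from the mean times and total sizes reported above. The following is a minimal standalone sketch, not part of the benchmark itself: the helper names `speedup` and `reductionPercent` are hypothetical, and the input numbers are copied from the statistics in this annex.

```php
<?php
// Mean generation times in seconds, copied from the statistics above.
$meanTime = [
    'CDB' => 0.78493354060433,
    'current PHP arrays' => 0.29702293276787,
    'proposed PHP arrays' => 0.14556954828176,
];
// Total sizes as reported by `du -hs` (all three in the same unit).
$totalSize = [
    'CDB' => 472,
    'current PHP arrays' => 611,
    'proposed PHP arrays' => 426,
];

// Hypothetical helper: how many times quicker the candidate is than the baseline.
function speedup( float $baseline, float $candidate ): float {
    return $baseline / $candidate;
}

// Hypothetical helper: how much smaller the candidate is than the baseline, in percent.
function reductionPercent( float $baseline, float $candidate ): float {
    return 100.0 * ( 1.0 - $candidate / $baseline );
}

foreach ( [ 'CDB', 'current PHP arrays' ] as $baseline ) {
    printf(
        "proposed vs %s: %.2fx quicker, %.1f%% smaller\n",
        $baseline,
        speedup( $meanTime[$baseline], $meanTime['proposed PHP arrays'] ),
        reductionPercent( $totalSize[$baseline], $totalSize['proposed PHP arrays'] )
    );
}
// Prints:
// proposed vs CDB: 5.39x quicker, 9.7% smaller
// proposed vs current PHP arrays: 2.04x quicker, 30.3% smaller
```

These ratios match the "5x / 2x" and "10% / 30%" figures in the summary once rounded.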