Since HHVM is missing the dba_* functions, the shebang of bin/refreshCdbJsonFiles had to be hacked to explicitly use #!/usr/bin/env php5. Rather than depend on php5, we should rewrite this script in Python, as it is part of scap.
Description
Revisions and Commits
rMSCA Scap
Restricted Differential Revision | rMSCAba8e257037e0 Rewrite refreshCdbJsonFiles in python
Related Objects
- Mentioned In
- T125477: refreshCdbJsonFiles in scap fails on mira due to missing dba_open function in hhvm
- Mentioned Here
- T103886: Translation cache exhaustion caused by changes to PHP code in file scope
- T119637: Update HHVM package to recent release
- T99740: Use static php array files for l10n cache at WMF (instead of CDB)
Event Timeline
I don't know how relevant it is, but there is a longer-term plan to use plain PHP files instead of CDB files (T99740). This use case has been specifically optimized by the HHVM team (http://hhvm.com/blog/9293/lockdown-results-and-hhvm-performance).
Using PHP files for l10n cache is blocked on T103886: Translation cache exhaustion caused by changes to PHP code in file scope which is in turn blocked on T119637: Update HHVM package to recent release.
The refreshCdbJsonFiles script is fairly simple at its heart. It generates JSON dumps and MD5 checksums for all the l10n cache CDB files by forking N child processes to work in parallel. The choice of PHP as the implementation language is mostly a historical accident: it was written by @aaron before scap was converted to Python. Since the script worked and was fairly sane code, I never bothered to reimplement it in Python. Scap implements the inverse operation, updating all the l10n CDB files from the JSON dumps, in the scap.tasks.merge_cdb_updates() function, which shows how to accomplish the parallel processing using multiprocessing.Pool.
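To make the shape of such a rewrite concrete, here is a rough sketch of the approach described above: one worker per CDB file, fanned out with multiprocessing.Pool. It is not the actual scap code. The helper names (iter_cdb_records, dump_one, refresh_all), the output file names (.json and .MD5 next to the CDB file), the choice to checksum the CDB file itself, and the assumption that keys and values are UTF-8 text are all illustrative guesses. The record walker relies only on the standard CDB layout (a 2048-byte header of 256 little-endian position/slot-count pairs, followed by length-prefixed records).

```python
import hashlib
import json
import multiprocessing
import struct


def iter_cdb_records(path):
    """Yield (key, value) byte pairs by walking the record section of a CDB file."""
    with open(path, "rb") as f:
        data = f.read()
    # Header: 256 (position, slot count) pairs of little-endian 32-bit ints.
    # The hash tables follow the records, so the smallest table position
    # marks the end of the record section.
    header = struct.unpack("<512L", data[:2048])
    end = min(header[0::2])
    pos = 2048
    while pos < end:
        klen, dlen = struct.unpack_from("<LL", data, pos)
        pos += 8
        yield data[pos:pos + klen], data[pos + klen:pos + klen + dlen]
        pos += klen + dlen


def dump_one(path):
    """Write a JSON dump and an MD5 file next to one CDB file (hypothetical naming)."""
    # Assumption: keys and values are UTF-8 encoded text.
    items = {k.decode("utf-8"): v.decode("utf-8") for k, v in iter_cdb_records(path)}
    with open(path + ".json", "w") as f:
        json.dump(items, f, sort_keys=True)
    # Assumption: the checksum covers the CDB file, not the JSON dump.
    with open(path, "rb") as f:
        digest = hashlib.md5(f.read()).hexdigest()
    with open(path + ".MD5", "w") as f:
        f.write(digest)
    return path, digest


def refresh_all(paths, processes=4):
    """Process all CDB files in parallel, one task per file."""
    with multiprocessing.Pool(processes) as pool:
        return pool.map(dump_one, paths)
```

A caller would collect the CDB paths (for example with glob.glob over the cache directory) and pass them to refresh_all() from under an `if __name__ == "__main__":` guard, so that worker processes can import the module safely on platforms that spawn rather than fork.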