There is some existing code which could be helpful here, though instead of parsing Special:Version, we'd probably want to parse programmatic config files instead, for a more accurate representation.
Description
Status | Subtype | Assigned | Task
---|---|---|---
In Progress | | sbassett | T343366 [EPIC] Production Risk Assessment Work - Phase 2
Open | | Mstyles | T348780 Integrate a risk factor related to how many production projects an extension or skin is deployed
Event Timeline
Per conversation with @sbassett, I will use the InitialiseSettings.php file in wmf-config to determine which extensions are deployed on a particular wiki.
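As a rough illustration of the risk factor itself, the sketch below counts how many wikis each feature flag is enabled on. The setting names, the wiki list, and the `default`-plus-override layout are all assumptions made for illustration; the real file may also key overrides on groups of wikis, which this sketch ignores.

```python
from typing import Dict, List

# Hypothetical excerpt, in the shape the config is assumed to use:
# setting name -> {"default": value, "<dbname>": per-wiki override, ...}
SETTINGS: Dict[str, Dict[str, bool]] = {
    "wmgUseExampleA": {"default": False, "enwiki": True, "dewiki": True},
    "wmgUseExampleB": {"default": True, "frwiki": False},
}

# Hypothetical list of production wikis.
WIKIS: List[str] = ["enwiki", "dewiki", "frwiki", "eswiki"]


def is_enabled(overrides: Dict[str, bool], wiki: str) -> bool:
    """Return the per-wiki override if present, else the default (False if absent)."""
    return overrides.get(wiki, overrides.get("default", False))


def deployment_counts(
    settings: Dict[str, Dict[str, bool]], wikis: List[str]
) -> Dict[str, int]:
    """Map each setting name to the number of wikis it is enabled on."""
    return {
        name: sum(is_enabled(overrides, wiki) for wiki in wikis)
        for name, overrides in settings.items()
    }


if __name__ == "__main__":
    for name, count in deployment_counts(SETTINGS, WIKIS).items():
        print(f"{name}: enabled on {count} of {len(WIKIS)} wikis")
```

The per-setting count could then feed directly into a risk score, with higher counts meaning wider production exposure.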
Just a very quick proof of concept, this seems to work for me to process the current Wikimedia config CS.php and IS.php files (and it's quite fast):
#!/usr/bin/env php
<?php
declare(strict_types=1);

// php-ast AST format version passed to the parser (not the PHP version).
define( "AST_PHP_VERSION", 80 );

if( PHP_SAPI !== 'cli' ) {
    exit( "Please run as a PHP CLI!" );
}

if( !isset( $argv[1] ) ) {
    exit( "Please provide a file name or string of PHP code as an argument!" );
}

if( !in_array( "ast", get_loaded_extensions() ) ) {
    exit( "Please ensure the php/ast module is installed!" );
}

$ast = null;

// Treat the argument as a path if it names a file, otherwise as PHP source.
if( is_file( $argv[1] ) ) {
    $ast = ast\parse_file( $argv[1], AST_PHP_VERSION );
} else if( is_string( $argv[1] ) ) {
    $ast = ast\parse_code( $argv[1], AST_PHP_VERSION );
} else {
    exit( "A file name or valid PHP code string were not provided as an argument!" );
}

if( $ast !== null ) {
    // ast\Node has no __toString(), so print the tree with print_r() rather than echo.
    print_r( $ast );
} else {
    exit( "AST unable to be generated!" );
}
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import subprocess
import sys

import requests

# Short names mapped to the published copies of the production config files.
CFG_FILES = {
    "CS.php": "https://noc.wikimedia.org/conf/CommonSettings.php.txt",
    "IS.php": "https://noc.wikimedia.org/conf/InitialiseSettings.php.txt",
}


def get_cfg_file():
    """Download the config file named on the command line; return its local name."""
    if len(sys.argv) != 2 or sys.argv[1] not in CFG_FILES:
        raise ValueError("Please specify either CS.php or IS.php as your argument.")

    cfg_file_name = sys.argv[1]
    http_response = requests.get(CFG_FILES[cfg_file_name], timeout=5)
    # Fail loudly on a bad response instead of handing PHP a missing file.
    http_response.raise_for_status()
    with open(cfg_file_name, "w", encoding="utf-8") as f:
        f.write(http_response.text)
    return cfg_file_name


def run_ast_cmd():
    """Run ast.php on the downloaded file and print the AST it emits."""
    cfg_file_name = get_cfg_file()
    try:
        ast_data = subprocess.run(
            ["php", "ast.php", cfg_file_name],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            encoding="utf-8",
            check=True,  # without this, CalledProcessError is never raised
        )
    except subprocess.CalledProcessError as err:
        print(str(err))
    else:
        # Only print when the run succeeded, so ast_data is always defined here.
        print(ast_data.stdout)


if __name__ == "__main__":
    run_ast_cmd()
@sbassett this looks really good! Glad it's fast, since the other methods were slower.
Yes, this definitely works and is very fast. Still, there may be more benefit in using PHP-Parser instead of php-ast; both are maintained by the same person. PHP-Parser is definitely slower, but it has better support for traversing the generated AST nodes and for converting in a couple of directions (php -> ast -> php and php -> ast -> json), which will likely be handy for our intended use case.
Works just as well (and seemingly as fast) with PHP-Parser. We just need to update ast.php and bring in the new dependency via Composer:
#!/usr/bin/env php
<?php
declare( strict_types = 1 );

require "vendor/autoload.php";

use PhpParser\ParserFactory;

if( PHP_SAPI !== 'cli' ) {
    exit( "Please run as a PHP CLI!" );
}

if( !isset( $argv[1] ) ) {
    exit( "Please provide a file name or string of PHP code as an argument!" );
}

$ast = null;
$php_code = "";

if( is_file( $argv[1] ) ) {
    $php_code = file_get_contents( $argv[1] );
} else if( is_string( $argv[1] ) ) {
    $php_code = $argv[1];
} else {
    exit( "A file name or valid PHP code string were not provided as an argument!" );
}

$parser = ( new ParserFactory() )->createForHostVersion();

try {
    $stmts = $parser->parse( $php_code );
    $ast = json_encode( $stmts, JSON_PRETTY_PRINT );
} catch ( PhpParser\Error $e ) {
    echo 'Parse Error: ', $e->getMessage();
}

if( $ast !== null ) {
    echo $ast;
} else {
    exit( "AST unable to be generated!" );
}
{
    "require": {
        "nikic/php-parser": "^5.0"
    }
}
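Since the PHP-Parser version of ast.php emits JSON, the Python side can walk the tree directly instead of scraping text. The following is a minimal sketch, not the approach used in the merge request: it relies only on each node carrying a `nodeType` field and string literals appearing as `Scalar_String` nodes with a `value` field. The `SAMPLE` tree is hand-written and heavily trimmed, and the `wmgUse` prefix filter and setting names are assumptions for illustration.

```python
import json
from typing import Any, Iterator, List


def walk(node: Any) -> Iterator[dict]:
    """Yield every dict in the decoded JSON that carries a "nodeType" key."""
    if isinstance(node, dict):
        if "nodeType" in node:
            yield node
        for child in node.values():
            yield from walk(child)
    elif isinstance(node, list):
        for child in node:
            yield from walk(child)


def string_literals(ast: Any, prefix: str = "") -> List[str]:
    """Collect string-literal values that start with the given prefix."""
    return [
        n["value"]
        for n in walk(ast)
        if n["nodeType"] == "Scalar_String"
        and str(n.get("value", "")).startswith(prefix)
    ]


# Hand-written, trimmed stand-in for ast.php output (hypothetical settings).
SAMPLE = json.loads("""
[{"nodeType": "Stmt_Return", "expr": {"nodeType": "Expr_Array", "items": [
  {"nodeType": "ArrayItem",
   "key": {"nodeType": "Scalar_String", "value": "wmgUseExampleA"},
   "value": {"nodeType": "Expr_Array", "items": []}},
  {"nodeType": "ArrayItem",
   "key": {"nodeType": "Scalar_String", "value": "wgSitename"},
   "value": {"nodeType": "Scalar_String", "value": "Example"}}
]}}]
""")

if __name__ == "__main__":
    print(string_literals(SAMPLE, prefix="wmgUse"))
```

In practice the input would be `json.loads(ast_data.stdout)` from the wrapper script rather than a literal, and a stricter version would match array keys specifically rather than every string literal.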
mstyles opened https://gitlab.wikimedia.org/repos/security/wikimedia-code-health-check/-/merge_requests/37
Extensions in WMF production