Integrate a risk factor related to how many production projects an extension or skin is deployed to
Open, Needs Triage · Public · 8 Estimated Story Points

Description

There is some existing code which could be helpful here. Rather than parsing Special:Version, though, we'd probably want to parse the programmatic config files, for a more accurate representation.

Event Timeline

Restricted Application added a subscriber: Aklapper.
Mstyles set the point value for this task to 2. · Oct 12 2023, 6:14 PM
Mstyles changed the point value for this task from 2 to 4.

Per a conversation with @sbassett, I will use the InitialiseSettings file in the WMF config to determine which extensions are deployed on a particular wiki.
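As a rough illustration of where this could lead, the sketch below counts how many wikis have a given setting enabled once the per-wiki values have been extracted. Everything in it is hypothetical: the setting names, the wiki list, and the simplified "wiki name or `default`" lookup are assumptions made for the example, and any group-level keys the real configuration uses are ignored.

```python
# Hypothetical, simplified stand-in for values extracted from the settings
# file: each setting maps a wiki database name (or "default") to a boolean.
SETTINGS = {
    "wmgUseExampleA": {"default": False, "enwiki": True, "dewiki": True},
    "wmgUseExampleB": {"default": True, "frwiki": False},
}

# Hypothetical list of production wikis to evaluate.
WIKIS = ["enwiki", "dewiki", "frwiki", "eswiki"]


def is_enabled(overrides, wiki):
    """Return the wiki-specific value, falling back to the "default" entry."""
    return bool(overrides.get(wiki, overrides.get("default", False)))


def count_deployments(settings, wikis):
    """Count, for each setting, the number of wikis on which it is enabled."""
    return {
        name: sum(is_enabled(overrides, wiki) for wiki in wikis)
        for name, overrides in settings.items()
    }


print(count_deployments(SETTINGS, WIKIS))
# → {'wmgUseExampleA': 2, 'wmgUseExampleB': 3}
```

The per-setting counts are the raw input a deployment-based risk factor would need; how those counts are weighted is left open here.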

Mstyles changed the point value for this task from 4 to 8. · Dec 21 2023, 8:11 PM

Just a very quick proof of concept: this seems to work for me to process the current Wikimedia config CommonSettings.php (CS.php) and InitialiseSettings.php (IS.php) files, and it's quite fast:

ast.php
#!/usr/bin/env php
<?php

declare( strict_types = 1 );

// AST format version requested from php-ast.
define( "AST_PHP_VERSION", 80 );

// Print an error to STDERR and exit with a non-zero status.
function fail( string $message ): void {
    fwrite( STDERR, $message . PHP_EOL );
    exit( 1 );
}

if( PHP_SAPI !== 'cli' ) {
    fail( "Please run as a PHP CLI!" );
}

if( !isset( $argv[1] ) ) {
    fail( "Please provide a file name or string of PHP code as an argument!" );
}

if( !extension_loaded( 'ast' ) ) {
    fail( "Please ensure the php/ast module is installed!" );
}

// Treat the argument as a file name if such a file exists, otherwise as PHP code.
// ($argv entries are always strings, so no further type check is needed.)
$ast = null;
try {
    $ast = is_file( $argv[1] )
        ? ast\parse_file( $argv[1], AST_PHP_VERSION )
        : ast\parse_code( $argv[1], AST_PHP_VERSION );
} catch ( Throwable $e ) {
    fail( "AST unable to be generated: " . $e->getMessage() );
}

// ast\Node objects cannot be cast to strings, so dump the tree instead of echoing it.
print_r( $ast );

run_ast.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-


import subprocess
import sys

import requests


def get_cfg_file():
    """Download the config file named on the command line and return its local name."""
    cfg_files = {
        "CS.php": "https://noc.wikimedia.org/conf/CommonSettings.php.txt",
        "IS.php": "https://noc.wikimedia.org/conf/InitialiseSettings.php.txt",
    }

    if len(sys.argv) != 2 or sys.argv[1] not in cfg_files:
        raise ValueError("Please specify either CS.php or IS.php as your argument.")

    cfg_file_name = sys.argv[1]
    http_response = requests.get(cfg_files[cfg_file_name], timeout=5)
    # Fail loudly on a bad response instead of handing a missing file to ast.php.
    http_response.raise_for_status()

    with open(cfg_file_name, "w", encoding="utf-8") as f:
        f.write(http_response.text)

    return cfg_file_name


def run_ast_cmd():
    cfg_file_name = get_cfg_file()

    try:
        ast_data = subprocess.run(
            ["php", "ast.php", cfg_file_name],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            encoding="utf-8",
            check=True,  # without this, CalledProcessError is never raised
        )
    except subprocess.CalledProcessError as err:
        print(str(err), file=sys.stderr)
        sys.exit(1)

    print(ast_data.stdout)


if __name__ == "__main__":
    run_ast_cmd()

@sbassett This looks really good! Glad it's fast, since the other methods were not as fast.

Yes, this definitely works and is very fast. There might be more benefits, though, to using PHP-Parser instead of php-ast; both are maintained by the same person. PHP-Parser is definitely slower, but it has better support for traversing the generated AST nodes and for converting in a couple of directions (php -> ast -> php and php -> ast -> json), which will likely be handy for our intended use case.

Works just as well (and seemingly as fast) with PHP-Parser. We just need to update ast.php and bring in the new dependency via Composer:

ast.php
#!/usr/bin/env php
<?php

declare( strict_types = 1 );

require "vendor/autoload.php";

use PhpParser\ParserFactory;

// Print an error to STDERR and exit with a non-zero status.
function fail( string $message ): void {
	fwrite( STDERR, $message . PHP_EOL );
	exit( 1 );
}

if( PHP_SAPI !== 'cli' ) {
	fail( "Please run as a PHP CLI!" );
}

if( !isset( $argv[1] ) ) {
	fail( "Please provide a file name or string of PHP code as an argument!" );
}

// Treat the argument as a file name if such a file exists, otherwise as PHP code.
// ($argv entries are always strings, so no further type check is needed.)
$php_code = is_file( $argv[1] ) ? file_get_contents( $argv[1] ) : $argv[1];
if( $php_code === false ) {
	fail( "Unable to read the provided file!" );
}

$parser = ( new ParserFactory() )->createForHostVersion();

$ast = null;
try {
	$stmts = $parser->parse( $php_code );
	$ast = json_encode( $stmts, JSON_PRETTY_PRINT );
} catch ( PhpParser\Error $e ) {
	fail( 'Parse Error: ' . $e->getMessage() );
}

// json_encode() returns false on failure, so only emit a real string.
if( is_string( $ast ) ) {
	echo $ast;
}
else {
	fail( "AST unable to be generated!" );
}

composer.json
{
    "require": {
        "nikic/php-parser": "^5.0"
    }
}
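
Once the AST is available as JSON, it can be post-processed on the Python side. The following is only a sketch of one possible approach: it walks a decoded JSON tree and collects string literals that start with a given prefix. The sample document is hand-written for illustration and is not real output; it assumes that nodes are serialised as objects carrying a `nodeType` field and that string literals appear as `Scalar_String` nodes with a `value` field. The `wmgUse` prefix and the example setting names are likewise assumptions.

```python
import json


def iter_nodes(tree):
    """Yield every JSON object found anywhere inside a decoded JSON structure."""
    stack = [tree]
    while stack:
        item = stack.pop()
        if isinstance(item, dict):
            yield item
            stack.extend(item.values())
        elif isinstance(item, list):
            stack.extend(item)


def find_string_literals(tree, prefix):
    """Return the sorted, de-duplicated string literals starting with prefix."""
    values = {
        node["value"]
        for node in iter_nodes(tree)
        # Assumed node shape: {"nodeType": "Scalar_String", "value": "..."}
        if node.get("nodeType") == "Scalar_String"
        and isinstance(node.get("value"), str)
        and node["value"].startswith(prefix)
    }
    return sorted(values)


# Hand-written, hypothetical sample standing in for the JSON printed by ast.php.
SAMPLE_JSON = """
[
  {"nodeType": "Stmt_Return", "expr": {"nodeType": "Expr_Array", "items": [
    {"nodeType": "ArrayItem",
     "key": {"nodeType": "Scalar_String", "value": "wmgUseExampleA"},
     "value": {"nodeType": "Expr_Array", "items": [
       {"nodeType": "ArrayItem",
        "key": {"nodeType": "Scalar_String", "value": "default"},
        "value": {"nodeType": "Expr_ConstFetch"}}
     ]}},
    {"nodeType": "ArrayItem",
     "key": {"nodeType": "Scalar_String", "value": "wmgUseExampleB"},
     "value": {"nodeType": "Expr_Array", "items": []}}
  ]}}
]
"""

SAMPLE = json.loads(SAMPLE_JSON)
print(find_string_literals(SAMPLE, "wmgUse"))
```

In practice the sample would be replaced by the decoded standard output of `php ast.php IS.php`. Walking every object generically, rather than following specific node paths, keeps the sketch tolerant of nesting it does not know about.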