Page MenuHomePhabricator

Change maintenance/findHooks.php to use YAML [Outreachy microtask]
Closed, DeclinedPublic

Description

The MediaWiki maintenance script maintenance/findHooks.php reads docs/hooks.txt and does stuff with it. T115388 will change hooks.txt to be a YAML file instead plain text. Update the maintenance script so it can handle that.

You probably don't want to wait until T115388 is finished, just hand-modify a few entries, delete the rest, and use that for testing,

Ensuring that MediaWiki core always has access to a YAML parser is a bit beyond the scope of a microtask; you can just assume that the yaml package is always available. This means that your patch cannot be merged; be sure to clarify that in the commit description. (The convention is to put something like [DO NOT MERGE] at the start of the commit description.)

Event Timeline

Tgr assigned this task to Shrutika719.
Tgr raised the priority of this task from to Needs Triage.
Tgr updated the task description. (Show Details)
Tgr added a subscriber: Tgr.
Restricted Application added a subscriber: Aklapper. · View Herald Transcript
Aklapper set Security to None.

[removed Outreachy-Round-11 form task as this is not an Outreachy project (proposal)]

@Tgr I am almost done with the microtask assigned to me. Is there any other microtask that needs to be done before submitting the proposal? If yes, kindly specify one.

No, although you are welcome to do more, but it's not required, and the proposal is more important.

@Tgr I am currently working on the proposal itself, if I get selected for this project I would like to continue the work throughout November so kindly look for some more tasks.

@Shrutika719: If you are also okay with going broader, see https://www.mediawiki.org/wiki/Annoying_little_bugs for more task ideas in Wikimedia in general.

@Tgr I am facing some problem in submitting gerrit patch that is why I am submitting the modified findHooks.php file here. Can it be reviewed from phabricator? If not, kindly help me out.

<?php
/**
 * Simple script that try to find documented hook and hooks actually
 * in the code and show what's missing.
 *
 * This script assumes that:
 * - hooks names in hooks.yml are at the beginning of a line and single quoted.
 * - hooks names in code are the first parameter of wfRunHooks.
 *
 * if --online option is passed, the script will compare the hooks in the code
 * with the ones at http://www.mediawiki.org/wiki/Manual:Hooks
 *
 * Any instance of wfRunHooks that doesn't meet these parameters will be noted.
 *
 * Copyright © Antoine Musso
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 * http://www.gnu.org/copyleft/gpl.html
 *
 * @file
 * @ingroup Maintenance
 * @author Antoine Musso <hashar at free dot fr>
 */

require_once "YamlParser.php";

class YamlParserTest extends PHPUnit_Framework_TestCase
{
    private $yamlParser;

    public function setup() {
        $this->yamlParser = new YamlParser();
    }

    public function testMainArrayKeys() {
        $parsedYaml    = $this->yamlParser->load("hooks.yml");
        $mainArrayKeys = array_keys($parsedYaml);
        $expectedKeys  = array("description", "argument");

        $this->assertEquals($expectedKeys, $mainArrayKeys);

    }

   
  
}

require_once "yaml/sfYaml.php";

class YamlParser
{
    public function load($filePath) {
        try {
            return sfYaml::load($filePath);
        }
        catch (Exception $e) {
            throw new YamlParserException(
                $e->getMessage(), $e->getCode(), $e);
        }
    }

    public function dump($array) {
        try {
            return sfYaml::dump($array);
        }
        catch (Exception $e) {
            throw new YamlParserException(
                $e->getMessage(), $e->getCode(), $e);
        }
    }
}

class YamlParserException extends Exception
{
    public function __construct($message = "", $code = 0, $previous = NULL) {
        if (version_compare(PHP_VERSION, "5.3.0") < 0) {
            parent::__construct($message, $code);
        }
        else {
            parent::__construct($message, $code, $previous);
        }
    }
}

require_once __DIR__ . '/Maintenance.php';

/**
 * Maintenance script that compares documented and actually present mismatches.
 *
 * @ingroup Maintenance
 */
class FindHooks extends Maintenance {
  /*
   * Hooks that are ignored
   */
  protected static $ignore = array( 'testRunLegacyHooks' );

  public function __construct() {
    parent::__construct();
    $this->mDescription = 'Find hooks that are undocumented, missing, or just plain wrong';
    $this->addOption( 'online', 'Check against MediaWiki.org hook documentation' );
  }

  public function getDbType() {
    return Maintenance::DB_NONE;
  }

  public function execute() {
    global $IP;

    $documented = $this->getHooksFromDoc( $IP . '/docs/hooks.yml' );
    $potential = array();
    $bad = array();

    // TODO: Don't hardcode the list of directories
    $pathinc = array(
      $IP . '/',
      $IP . '/includes/',
      $IP . '/includes/actions/',
      $IP . '/includes/api/',
      $IP . '/includes/cache/',
      $IP . '/includes/changes/',
      $IP . '/includes/changetags/',
      $IP . '/includes/clientpool/',
      $IP . '/includes/content/',
      $IP . '/includes/context/',
      $IP . '/includes/dao/',
      $IP . '/includes/db/',
      $IP . '/includes/debug/',
      $IP . '/includes/deferred/',
      $IP . '/includes/diff/',
      $IP . '/includes/exception/',
      $IP . '/includes/externalstore/',
      $IP . '/includes/filebackend/',
      $IP . '/includes/filerepo/',
      $IP . '/includes/filerepo/file/',
      $IP . '/includes/gallery/',
      $IP . '/includes/htmlform/',
      $IP . '/includes/installer/',
      $IP . '/includes/interwiki/',
      $IP . '/includes/jobqueue/',
      $IP . '/includes/json/',
      $IP . '/includes/logging/',
      $IP . '/includes/mail/',
      $IP . '/includes/media/',
      $IP . '/includes/page/',
      $IP . '/includes/parser/',
      $IP . '/includes/password/',
      $IP . '/includes/rcfeed/',
      $IP . '/includes/resourceloader/',
      $IP . '/includes/revisiondelete/',
      $IP . '/includes/search/',
      $IP . '/includes/site/',
      $IP . '/includes/skins/',
      $IP . '/includes/specialpage/',
      $IP . '/includes/specials/',
      $IP . '/includes/upload/',
      $IP . '/includes/utils/',
      $IP . '/languages/',
      $IP . '/maintenance/',
      $IP . '/maintenance/language/',
      $IP . '/tests/',
      $IP . '/tests/parser/',
      $IP . '/tests/phpunit/suites/',
    );

    foreach ( $pathinc as $dir ) {
      $potential = array_merge( $potential, $this->getHooksFromPath( $dir ) );
      $bad = array_merge( $bad, $this->getBadHooksFromPath( $dir ) );
    }

    $potential = array_unique( $potential );
    $bad = array_diff( array_unique( $bad ), self::$ignore );
    $todo = array_diff( $potential, $documented, self::$ignore );
    $deprecated = array_diff( $documented, $potential, self::$ignore );

    // let's show the results:
    $this->printArray( 'Undocumented', $todo );
    $this->printArray( 'Documented and not found', $deprecated );
    $this->printArray( 'Unclear hook calls', $bad );

    if ( count( $todo ) == 0 && count( $deprecated ) == 0 && count( $bad ) == 0 ) {
      $this->output( "Looks good!\n" );
    } else {
      $this->error( 'The script finished with errors.', 1 );
    }
  }

  /**
   * Get the hook documentation, either locally or from MediaWiki.org
   * @param string $doc
   * @return array Array of documented hooks
   */
  private function getHooksFromDoc( $doc ) {
    if ( $this->hasOption( 'online' ) ) {
      return $this->getHooksFromOnlineDoc();
    } else {
      return $this->getHooksFromLocalDoc( $doc );
    }
  }

  /**
   * Get hooks from a local file (for example docs/hooks.yml)
   * @param string $doc Filename to look in
   * @return array Array of documented hooks
   */
  private function getHooksFromLocalDoc( $doc ) {
    $m = array();
    $content = file_get_contents( $doc );
    preg_match_all( "/\n'(.*?)':/", $content, $m );

    return array_unique( $m[1] );
  }

  /**
   * Get hooks from www.mediawiki.org using the API
   * @return array Array of documented hooks
   */
  private function getHooksFromOnlineDoc() {
    $allhooks = $this->getHooksFromOnlineDocCategory( 'MediaWiki_hooks' );
    $removed = $this->getHooksFromOnlineDocCategory( 'Removed_hooks' );
    return array_diff( $allhooks, $removed );
  }

  /**
   * @param string $title
   * @return array
   */
  private function getHooksFromOnlineDocCategory( $title ) {
    $params = array(
      'action' => 'query',
      'list' => 'categorymembers',
      'cmtitle' => "Category:$title",
      'cmlimit' => 500,
      'format' => 'json',
      'continue' => '',
    );

    $retval = array();
    while ( true ) {
      $json = Http::get(
        wfAppendQuery( 'http://www.mediawiki.org/w/api.php', $params ),
        array(),
        __METHOD__
      );
      $data = FormatJson::decode( $json, true );
      foreach ( $data['query']['categorymembers'] as $page ) {
        if ( preg_match( '/Manual\:Hooks\/([a-zA-Z0-9- :]+)/', $page['title'], $m ) ) {
          $retval[] = str_replace( ' ', '_', $m[1] );
        }
      }
      if ( !isset( $data['continue'] ) ) {
        return $retval;
      }
      $params = array_replace( $params, $data['continue'] );
    }
  }

  /**
   * Get hooks from a PHP file
   * @param string $file Full filename to the PHP file.
   * @return array Array of hooks found
   */
  private function getHooksFromFile( $file ) {
    $content = file_get_contents( $file );
    $m = array();
    preg_match_all(
      '/(?:wfRunHooks|Hooks\:\:run|ContentHandler\:\:runLegacyHooks)\(\s*([\'"])(.*?)\1/',
      $content,
      $m
    );

    return $m[2];
  }

  /**
   * Get hooks from the source code.
   * @param string $path Directory where the include files can be found
   * @return array Array of hooks found
   */
  private function getHooksFromPath( $path ) {
    $hooks = array();
    $dh = opendir( $path );
    if ( $dh ) {
      while ( ( $file = readdir( $dh ) ) !== false ) {
        if ( filetype( $path . $file ) == 'file' ) {
          $hooks = array_merge( $hooks, $this->getHooksFromFile( $path . $file ) );
        }
      }
      closedir( $dh );
    }

    return $hooks;
  }

  /**
   * Get bad hooks (where the hook name could not be determined) from a PHP file
   * @param string $file Full filename to the PHP file.
   * @return array Array of bad wfRunHooks() lines
   */
  private function getBadHooksFromFile( $file ) {
    $content = file_get_contents( $file );
    $m = array();
    # We want to skip the "function wfRunHooks()" one.  :)
    preg_match_all( '/(?<!function )wfRunHooks\(\s*[^\s\'"].*/', $content, $m );
    $list = array();
    foreach ( $m[0] as $match ) {
      $list[] = $match . "(" . $file . ")";
    }

    return $list;
  }

  /**
   * Get bad hooks from the source code.
   * @param string $path Directory where the include files can be found
   * @return array Array of bad wfRunHooks() lines
   */
  private function getBadHooksFromPath( $path ) {
    $hooks = array();
    $dh = opendir( $path );
    if ( $dh ) {
      while ( ( $file = readdir( $dh ) ) !== false ) {
        # We don't want to read this file as it contains bad calls to wfRunHooks()
        if ( filetype( $path . $file ) == 'file' && !$path . $file == __FILE__ ) {
          $hooks = array_merge( $hooks, $this->getBadHooksFromFile( $path . $file ) );
        }
      }
      closedir( $dh );
    }

    return $hooks;
  }

  /**
   * Nicely output the array
   * @param string $msg A message to show before the value
   * @param array $arr
   * @param bool $sort Whether to sort the array (Default: true)
   */
  private function printArray( $msg, $arr, $sort = true ) {
    if ( $sort ) {
      asort( $arr );
    }

    foreach ( $arr as $v ) {
      $this->output( "$msg: $v\n" );
    }
  }
}

$maintClass = 'FindHooks';
require_once RUN_MAINTENANCE_IF_MAIN;

@Shrutika719: no, code review (and the eventual merging of new code) needs to happen on gerrit. Can you describe your difficulties?

@Ankitashukla, I am unable to solve the issue because my semester exams were going on, I have just one day left with me, can you help me out with this?

Aklapper added a subscriber: Shrutika719.

Removing assignee as there has been no progress lately, so someone else could claim this task in theory.

Not sure if it makes sense to keep this task open, though?