Page MenuHomePhabricator

Change maintenance/findHooks.php to use YAML [Outreachy microtask]
Open, Needs TriagePublic

Description

The MediaWiki maintenance script maintenance/findHooks.php reads docs/hooks.txt and does stuff with it. T115388 will change hooks.txt to be a YAML file instead plain text. Update the maintenance script so it can handle that.

You probably don't want to wait until T115388 is finished, just hand-modify a few entries, delete the rest, and use that for testing,

Ensuring that MediaWiki core always has access to a YAML parser is a bit beyond the scope of a microtask; you can just assume that the yaml package is always available. This means that your patch cannot be merged; be sure to clarify that in the commit description. (The convention is to put something like [DO NOT MERGE] at the start of the commit description.)

Event Timeline

Tgr created this task.Oct 20 2015, 12:20 AM
Tgr assigned this task to Shrutika719.
Tgr raised the priority of this task from to Needs Triage.
Tgr updated the task description. (Show Details)
Tgr added a subscriber: Tgr.
Restricted Application added a project: Documentation. · View Herald TranscriptOct 20 2015, 12:20 AM
Restricted Application added a subscriber: Aklapper. · View Herald Transcript
Aklapper set Security to None.

[removed Outreachy-Round-11 form task as this is not an Outreachy project (proposal)]

@Tgr I am almost done with the microtask assigned to me. Is there any other microtask that needs to be done before submitting the proposal? If yes, kindly specify one.

Tgr added a comment.Oct 22 2015, 8:17 PM

No, although you are welcome to do more, but it's not required, and the proposal is more important.

@Tgr I am currently working on the proposal itself, if I get selected for this project I would like to continue the work throughout November so kindly look for some more tasks.

@Shrutika719: If you are also okay with going broader, see https://www.mediawiki.org/wiki/Annoying_little_bugs for more task ideas in Wikimedia in general.

Shrutika719 added a comment.EditedOct 25 2015, 7:30 PM

@Tgr I am facing some problem in submitting gerrit patch that is why I am submitting the modified findHooks.php file here. Can it be reviewed from phabricator? If not, kindly help me out.

<?php
/**
 * Simple script that try to find documented hook and hooks actually
 * in the code and show what's missing.
 *
 * This script assumes that:
 * - hooks names in hooks.yml are at the beginning of a line and single quoted.
 * - hooks names in code are the first parameter of wfRunHooks.
 *
 * if --online option is passed, the script will compare the hooks in the code
 * with the ones at http://www.mediawiki.org/wiki/Manual:Hooks
 *
 * Any instance of wfRunHooks that doesn't meet these parameters will be noted.
 *
 * Copyright © Antoine Musso
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 * http://www.gnu.org/copyleft/gpl.html
 *
 * @file
 * @ingroup Maintenance
 * @author Antoine Musso <hashar at free dot fr>
 */

require_once "YamlParser.php";

class YamlParserTest extends PHPUnit_Framework_TestCase
{
    private $yamlParser;

    public function setup() {
        $this->yamlParser = new YamlParser();
    }

    public function testMainArrayKeys() {
        $parsedYaml    = $this->yamlParser->load("hooks.yml");
        $mainArrayKeys = array_keys($parsedYaml);
        $expectedKeys  = array("description", "argument");

        $this->assertEquals($expectedKeys, $mainArrayKeys);

    }

   
  
}

require_once "yaml/sfYaml.php";

class YamlParser
{
    public function load($filePath) {
        try {
            return sfYaml::load($filePath);
        }
        catch (Exception $e) {
            throw new YamlParserException(
                $e->getMessage(), $e->getCode(), $e);
        }
    }

    public function dump($array) {
        try {
            return sfYaml::dump($array);
        }
        catch (Exception $e) {
            throw new YamlParserException(
                $e->getMessage(), $e->getCode(), $e);
        }
    }
}

class YamlParserException extends Exception
{
    public function __construct($message = "", $code = 0, $previous = NULL) {
        if (version_compare(PHP_VERSION, "5.3.0") < 0) {
            parent::__construct($message, $code);
        }
        else {
            parent::__construct($message, $code, $previous);
        }
    }
}

require_once __DIR__ . '/Maintenance.php';

/**
 * Maintenance script that compares documented and actually present mismatches.
 *
 * @ingroup Maintenance
 */
class FindHooks extends Maintenance {
  /*
   * Hooks that are ignored
   */
  protected static $ignore = array( 'testRunLegacyHooks' );

  public function __construct() {
    parent::__construct();
    $this->mDescription = 'Find hooks that are undocumented, missing, or just plain wrong';
    $this->addOption( 'online', 'Check against MediaWiki.org hook documentation' );
  }

  public function getDbType() {
    return Maintenance::DB_NONE;
  }

  public function execute() {
    global $IP;

    $documented = $this->getHooksFromDoc( $IP . '/docs/hooks.yml' );
    $potential = array();
    $bad = array();

    // TODO: Don't hardcode the list of directories
    $pathinc = array(
      $IP . '/',
      $IP . '/includes/',
      $IP . '/includes/actions/',
      $IP . '/includes/api/',
      $IP . '/includes/cache/',
      $IP . '/includes/changes/',
      $IP . '/includes/changetags/',
      $IP . '/includes/clientpool/',
      $IP . '/includes/content/',
      $IP . '/includes/context/',
      $IP . '/includes/dao/',
      $IP . '/includes/db/',
      $IP . '/includes/debug/',
      $IP . '/includes/deferred/',
      $IP . '/includes/diff/',
      $IP . '/includes/exception/',
      $IP . '/includes/externalstore/',
      $IP . '/includes/filebackend/',
      $IP . '/includes/filerepo/',
      $IP . '/includes/filerepo/file/',
      $IP . '/includes/gallery/',
      $IP . '/includes/htmlform/',
      $IP . '/includes/installer/',
      $IP . '/includes/interwiki/',
      $IP . '/includes/jobqueue/',
      $IP . '/includes/json/',
      $IP . '/includes/logging/',
      $IP . '/includes/mail/',
      $IP . '/includes/media/',
      $IP . '/includes/page/',
      $IP . '/includes/parser/',
      $IP . '/includes/password/',
      $IP . '/includes/rcfeed/',
      $IP . '/includes/resourceloader/',
      $IP . '/includes/revisiondelete/',
      $IP . '/includes/search/',
      $IP . '/includes/site/',
      $IP . '/includes/skins/',
      $IP . '/includes/specialpage/',
      $IP . '/includes/specials/',
      $IP . '/includes/upload/',
      $IP . '/includes/utils/',
      $IP . '/languages/',
      $IP . '/maintenance/',
      $IP . '/maintenance/language/',
      $IP . '/tests/',
      $IP . '/tests/parser/',
      $IP . '/tests/phpunit/suites/',
    );

    foreach ( $pathinc as $dir ) {
      $potential = array_merge( $potential, $this->getHooksFromPath( $dir ) );
      $bad = array_merge( $bad, $this->getBadHooksFromPath( $dir ) );
    }

    $potential = array_unique( $potential );
    $bad = array_diff( array_unique( $bad ), self::$ignore );
    $todo = array_diff( $potential, $documented, self::$ignore );
    $deprecated = array_diff( $documented, $potential, self::$ignore );

    // let's show the results:
    $this->printArray( 'Undocumented', $todo );
    $this->printArray( 'Documented and not found', $deprecated );
    $this->printArray( 'Unclear hook calls', $bad );

    if ( count( $todo ) == 0 && count( $deprecated ) == 0 && count( $bad ) == 0 ) {
      $this->output( "Looks good!\n" );
    } else {
      $this->error( 'The script finished with errors.', 1 );
    }
  }

  /**
   * Get the hook documentation, either locally or from MediaWiki.org
   * @param string $doc
   * @return array Array of documented hooks
   */
  private function getHooksFromDoc( $doc ) {
    if ( $this->hasOption( 'online' ) ) {
      return $this->getHooksFromOnlineDoc();
    } else {
      return $this->getHooksFromLocalDoc( $doc );
    }
  }

  /**
   * Get hooks from a local file (for example docs/hooks.yml)
   * @param string $doc Filename to look in
   * @return array Array of documented hooks
   */
  private function getHooksFromLocalDoc( $doc ) {
    $m = array();
    $content = file_get_contents( $doc );
    preg_match_all( "/\n'(.*?)':/", $content, $m );

    return array_unique( $m[1] );
  }

  /**
   * Get hooks from www.mediawiki.org using the API
   * @return array Array of documented hooks
   */
  private function getHooksFromOnlineDoc() {
    $allhooks = $this->getHooksFromOnlineDocCategory( 'MediaWiki_hooks' );
    $removed = $this->getHooksFromOnlineDocCategory( 'Removed_hooks' );
    return array_diff( $allhooks, $removed );
  }

  /**
   * @param string $title
   * @return array
   */
  private function getHooksFromOnlineDocCategory( $title ) {
    $params = array(
      'action' => 'query',
      'list' => 'categorymembers',
      'cmtitle' => "Category:$title",
      'cmlimit' => 500,
      'format' => 'json',
      'continue' => '',
    );

    $retval = array();
    while ( true ) {
      $json = Http::get(
        wfAppendQuery( 'http://www.mediawiki.org/w/api.php', $params ),
        array(),
        __METHOD__
      );
      $data = FormatJson::decode( $json, true );
      foreach ( $data['query']['categorymembers'] as $page ) {
        if ( preg_match( '/Manual\:Hooks\/([a-zA-Z0-9- :]+)/', $page['title'], $m ) ) {
          $retval[] = str_replace( ' ', '_', $m[1] );
        }
      }
      if ( !isset( $data['continue'] ) ) {
        return $retval;
      }
      $params = array_replace( $params, $data['continue'] );
    }
  }

  /**
   * Get hooks from a PHP file
   * @param string $file Full filename to the PHP file.
   * @return array Array of hooks found
   */
  private function getHooksFromFile( $file ) {
    $content = file_get_contents( $file );
    $m = array();
    preg_match_all(
      '/(?:wfRunHooks|Hooks\:\:run|ContentHandler\:\:runLegacyHooks)\(\s*([\'"])(.*?)\1/',
      $content,
      $m
    );

    return $m[2];
  }

  /**
   * Get hooks from the source code.
   * @param string $path Directory where the include files can be found
   * @return array Array of hooks found
   */
  private function getHooksFromPath( $path ) {
    $hooks = array();
    $dh = opendir( $path );
    if ( $dh ) {
      while ( ( $file = readdir( $dh ) ) !== false ) {
        if ( filetype( $path . $file ) == 'file' ) {
          $hooks = array_merge( $hooks, $this->getHooksFromFile( $path . $file ) );
        }
      }
      closedir( $dh );
    }

    return $hooks;
  }

  /**
   * Get bad hooks (where the hook name could not be determined) from a PHP file
   * @param string $file Full filename to the PHP file.
   * @return array Array of bad wfRunHooks() lines
   */
  private function getBadHooksFromFile( $file ) {
    $content = file_get_contents( $file );
    $m = array();
    # We want to skip the "function wfRunHooks()" one.  :)
    preg_match_all( '/(?<!function )wfRunHooks\(\s*[^\s\'"].*/', $content, $m );
    $list = array();
    foreach ( $m[0] as $match ) {
      $list[] = $match . "(" . $file . ")";
    }

    return $list;
  }

  /**
   * Get bad hooks from the source code.
   * @param string $path Directory where the include files can be found
   * @return array Array of bad wfRunHooks() lines
   */
  private function getBadHooksFromPath( $path ) {
    $hooks = array();
    $dh = opendir( $path );
    if ( $dh ) {
      while ( ( $file = readdir( $dh ) ) !== false ) {
        # We don't want to read this file as it contains bad calls to wfRunHooks()
        if ( filetype( $path . $file ) == 'file' && !$path . $file == __FILE__ ) {
          $hooks = array_merge( $hooks, $this->getBadHooksFromFile( $path . $file ) );
        }
      }
      closedir( $dh );
    }

    return $hooks;
  }

  /**
   * Nicely output the array
   * @param string $msg A message to show before the value
   * @param array $arr
   * @param bool $sort Whether to sort the array (Default: true)
   */
  private function printArray( $msg, $arr, $sort = true ) {
    if ( $sort ) {
      asort( $arr );
    }

    foreach ( $arr as $v ) {
      $this->output( "$msg: $v\n" );
    }
  }
}

$maintClass = 'FindHooks';
require_once RUN_MAINTENANCE_IF_MAIN;
Tgr added a comment.Oct 25 2015, 9:05 PM

@Shrutika719: no, code review (and the eventual merging of new code) needs to happen on gerrit. Can you describe your difficulties?

Hi @Shrutika719 Were you able to resolve the issue?

@Ankitashukla, I am unable to solve the issue because my semester exams were going on, I have just one day left with me, can you help me out with this?