Paste P2218

Convert hooks.txt to YAML format for T115338

Authored by Akangupt on Oct 22 2015, 5:59 PM.
Referenced Files
F2953771: Convert hooks.txt to YAML format for T115338
Nov 11 2015, 9:25 AM
F2911921: Convert hooks.txt to YAML format for T115338
Nov 3 2015, 4:45 PM
F2906510: Convert hooks.txt to YAML format for T115338
Nov 2 2015, 11:14 AM
F2757586: Convert hooks.txt to YAML format for T115338
Oct 22 2015, 5:59 PM
<?php
/**
* Place this script in the doc/ directory of MediaWiki and run it.
*
* This script creates a new file, hooks.yaml, from hooks.txt to make it machine-readable.
* @author akangupt < akanksha2879@gmail.com >
*/
$hookFile = 'hooks.txt'; # complete path for file hooks.txt
# check the configured path rather than a second hard-coded file name
if ( !file_exists( $hookFile ) ) {
die( "Error: $hookFile not found\n" );
}
$hooksData = file_get_contents( $hookFile );
# convert every tab to four spaces because YAML forbids tabs for indentation.
$data = preg_replace( "/\t/", "    ", $hooksData );
$hookYamlFile = 'hooks.yaml';
# create hooks.yaml
$handle = fopen( $hookYamlFile, 'w' ) or die( "Cannot open file: $hookYamlFile\n" );
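# split the text into two parts:
# the explanatory text (everything before the first hook) and the rest.
$text = preg_split( "/(?=\'\w+\:*\w*\:*\w*\'\:)/", $data, 2 );
# preg_split() returns an array, so check its parts rather than comparing it to "".
if ( count( $text ) < 2 ) {
die( "Error: No hooks found in hooks.txt\n" );
}
if ( trim( $text[0] ) === "" ) {
echo "Warning: No explanatory text found in hooks.txt\n";
}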
# comment the explanatory text
$explanatoryText = preg_replace( '/^/m', '# ', $text[0] );
fwrite( $handle, $explanatoryText );
hooksInYaml( $handle, $text[1], $hookFile );
function hooksInYaml( $handle, $string, $hookFile ) {
# extract each hook from rest of the text
$hooks = preg_split( "/\r?\n\r?\n/", $string );
foreach ( $hooks as $hook ) {
# if the extracted chunk doesn't start with a hook name,
# it is explanatory text, so comment it out.
$isHook = preg_match( "/^\s*\'(\w+\:*\w*\:*\w*)\'\s*\:/msi", $hook );
if ( $isHook !== 1 ) {
$explanatoryText = preg_replace( '/^/m', '# ', $hook );
# restore the blank line removed by the split so the next hook starts on its own line
fwrite( $handle, $explanatoryText . "\n\n" );
} else {
$pattern = "/\s*\'(\w+\:*\w*\:*\w*)\'\s*:\s*(DEPRECATED)*!*(\r?\n)*\s*(.*?)(?:(?=\n\&\\$\w+\s*\:|\n\\$\w+\s*\:|\Z))\s*(?:(.*?)\Z)/msi";
# $spaces - indentation which should be added to each line of a block value
$spaces = "    ";
if ( preg_match( $pattern, $hook, $matches ) ) {
# $matches[1] has hook's name.
$string = $matches[1].":\n";
if ( $matches[1] === "" ) {
die( "Error: Couldn't recognize the hook\n" );
}
# $matches[2] tells whether the hook is deprecated.
if ( $matches[2] != '' ) {
$string .= "  DEPRECATED";
if ( $matches[3] === '' ) {
# if $matches[3] is empty,
# i.e. there is a suggestion to use some other hook,
# $matches[4] contains all the text between the 'DEPRECATED!' keyword
# and the explanation of the arguments.
# So, to get the suggestion, split $matches[4] into
# the suggestion of the other hook and the description of this hook.
$text = preg_split( "/\.\n/", $matches[4], 2 );
# $text[0] contains the suggestion of the other hook to be used.
$string .= multiline( $text[0].".", $spaces );
# $text[1] contains the description of the hook; it is missing if there was nothing to split.
$description = isset( $text[1] ) ? $text[1] : '';
$string .= "\n  Description".multiline( $description, $spaces );
} else {
# if $matches[3] is not empty, i.e. the hook is deprecated
# but no other hook is suggested in its place.
$string .= ":\n";
# $matches[4] contains the description of the hook.
$string .= "  Description".multiline( $matches[4], $spaces );
}
} else {
if ( $matches[4] != '' ) {
if ( preg_match( "/^(\&\\$\w+\s*\:|\\$\w+\s*\:)/", $matches[4] ) ) {
# This covers the case when a hook doesn't have a description,
# i.e. $matches[4] contains an argument.
$string .= "  Description:";
$matches[5] = $matches[4]."\n".$matches[5];
} else {
$string .= "  Description".multiline( $matches[4], $spaces );
}
}
}
fwrite( $handle, $string );
# $matches[5] has all the arguments.
splitArguments( $handle, $matches[5], $hookFile );
fwrite( $handle, "\n\n" );
}
}
}
}
# returns a string with literal block if
# a string is multiline or has a character that
# doesn't belong to A-Z a-z 0-9 _ . space
function multiline( $finalString, $spaces ) {
$string = preg_match( "/\n(?!$)|[^\w\s\.]/", $finalString ) ? (": |\n") : (": ");
if ( $string === ": " ) {
# don't add indentation if the string is a single line.
return $string.$finalString;
}
$pattern = "/\n\s\s|\n(?!$)/";
$replacement = "\n$spaces";
$finalString = preg_replace( $pattern, $replacement, $finalString );
return $string.$spaces.$finalString;
}
# function to split the arguments
function splitArguments( $handle, $string, $hookFile ) {
fwrite( $handle, "\n  arguments:" );
$pattern = "/\s*(\&\\$\w+|\\$\w+)\s*\:\s*(.*?(?:(?=\n\&\\$\w+\s*\:|\n\\$\w+\s*\:)|\Z))/msi";
if ( preg_match_all( $pattern, $string, $matches, PREG_SET_ORDER ) ) {
# argument values are nested one level deeper than the hook description
$spaces = "        ";
foreach ( $matches as $match ) {
$string = "\n    - \"".$match[1]."\"";
$string .= multiline( $match[2], $spaces );
fwrite( $handle, $string );
}
}
}
?>

Event Timeline

Akangupt changed the title of this paste from untitled to Convert hooks.txt to YAML format for T115338.
Akangupt updated the paste's language from autodetect to php.

Some nitpicks:

  • there should be some instructions in a comment at the beginning of the file ("place this script in the doc/ directory of MediaWiki and run it" or something like that).
  • there should be basic error handling (output a message if the file could not be opened, and more importantly, warn the user if a hook could not be converted, i.e. if something that's not at the beginning or the end of the file could not be converted)
  • instead of changing hooks.txt in-place, it should write the output to hooks.yaml.
  • IMO the preamble would be more readable if even empty lines were commented out.

Other than that, looks solid. Nice regexp-fu :)
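
A minimal sketch of the "warn the user if a hook could not be converted" point above. The helper name hookNameOrWarn() is hypothetical (it is not in the paste), and the pattern is a simplified stand-in for the script's own hook pattern; the idea is only that an entry which does not parse gets reported on stderr instead of being dropped silently.

```php
<?php
/**
 * Hypothetical helper (not part of the paste): return the hook name if the
 * chunk starts with a quoted hook name, otherwise warn and return null.
 */
function hookNameOrWarn( $chunk, $index ) {
	// Simplified stand-in for the script's pattern: 'Name': or 'Class::method':
	if ( preg_match( "/^\s*'(\w+(?:::?\w+)*)'\s*:/", $chunk, $m ) ) {
		return $m[1];
	}
	// Write to stderr so the warning never ends up inside hooks.yaml.
	$preview = substr( trim( $chunk ), 0, 40 );
	file_put_contents( 'php://stderr', "Warning: could not convert entry #$index: $preview\n" );
	return null;
}

// Two sample chunks: one real-looking hook entry and one stray line.
$chunks = array(
	"'SpecialResetTokensTokens': Called when building token list for\nSpecialResetTokens.",
	"Some stray text that is not a hook.",
);
$names = array();
foreach ( $chunks as $i => $chunk ) {
	$names[] = hookNameOrWarn( $chunk, $i );
}
```

In the script itself, the natural place for such a warning would be an else branch on the preg_match() call that currently skips non-matching hooks without comment.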

there should be some instructions in a comment at the beginning of the file ("place this script in the doc/ directory of MediaWiki and run it" or something like that).

So, are we planning to place the script in core?

IMO the preamble would be more readable if even empty lines were commented out.

I have edited the script to comment out the empty lines. But, as discussed here, tabs need to be converted to spaces in hooks.txt, so I will upload hooks.yaml once I have an updated hooks.txt.
Should I submit a patch for updated hooks.txt or wait for one?

So, are we planning to place the script in core?

We aren't. But the patch might not get merged for a while, and hooks.txt changes all the time, so the script will need to be re-run a few times. Someone else might get involved in that; in general it is good practice to always include usage documentation with code.

Should I submit a patch for updated hooks.txt or wait for one?

You should, or you can just handle tab -> space conversion in the script.

The script now handles tabs too: it searches for tabs and replaces each one with four spaces.

In hooks.txt :

'SpecialResetTokensTokens': Called when building token list for
SpecialResetTokens.
&$tokens: array of token information arrays in the format of
	array(
		'preference' => '<preference-name>',
		'label-message' => '<message-key>',
	)

In hooks.yaml :

  • Before:
SpecialResetTokensTokens:
  Description: |
    Called when building token list for
    SpecialResetTokens.
  arguments:
    - "&$tokens": |
        array of token information arrays in the format of
        	array(
        'preference' => '<preference-name>',
        'label-message' => '<message-key>',
        	)
  • After:
SpecialResetTokensTokens:
  Description: |
    Called when building token list for
    SpecialResetTokens.
  arguments:
    - "&$tokens": |
        array of token information arrays in the format of
          array(
              'preference' => '<preference-name>',
              'label-message' => '<message-key>',
          )
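
The difference between the two outputs comes down to tab handling. A minimal sketch, using a shortened copy of the snippet above as a literal string; the variable names are illustrative, and str_replace() is used here in place of the script's preg_replace(), which does the same job for a fixed single-character search.

```php
<?php
// A shortened hooks.txt fragment whose nested lines are indented with tabs.
$fragment = "&\$tokens: array of token information arrays in the format of\n"
	. "\tarray(\n"
	. "\t\t'preference' => '<preference-name>',\n"
	. "\t)";

// Replace every tab with four spaces before any further processing,
// because YAML does not allow tabs for indentation.
$converted = str_replace( "\t", '    ', $fragment );

echo $converted, "\n";
```

Doing the conversion once, on the whole input, means the later indentation logic only ever has to deal with spaces.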

The script works for me, a few minor comments:

$data = preg_replace( "@\t@", " ", $hooksData );
$explanatoryText = preg_replace( '~^(.*)$~m', '# $0', $text[0] );

Why '@' and '~'? Any delimiter works, but '/' is the standard.

$explanatoryText = preg_replace( '~^(.*)$~m', '# $0', $text[0] );

You're just changing the beginning of each line, so no need to capture the rest of it.

$explanatoryText = preg_replace( '/^/m', '# ', $text[0] );
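
The two forms can be checked against each other. A small sketch, with a made-up sample string standing in for the preamble:

```php
<?php
// Made-up preamble with an empty line in the middle.
$preamble = "Intro line\n\nSecond paragraph";

// Earlier version: capture the whole line and re-insert it after "# ".
$withCapture = preg_replace( '~^(.*)$~m', '# $0', $preamble );

// Simpler version: insert "# " at the start of every line.
// With the m modifier, ^ also matches at the start of the empty line.
$simple = preg_replace( '/^/m', '# ', $preamble );

echo $simple, "\n";
```

Both calls give the same result for this input, and the empty line comes out as "# ", which is what the earlier nitpick about commenting out empty lines asked for.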

function wfHooksInYaml( $handle, $string, $hookFile ) {

Why is this prefixed wf? That's a convention for global PHP functions in MediaWiki.