
Improve WikiProject template --> WikiProject mapping
Closed, Resolved (Public)

Description

Most WikiProject templates are straightforward matches to their WikiProject.

E.g. "Template:WikiProject Biography" is used by "Wikipedia:WikiProject Biography"

But in some cases (especially with old WikiProjects), redirects are used. E.g. the article on Army uses "Template:WPMILHIST" which redirects to "Template:WikiProject Military history".

So we need a strategy for identifying all of the redirect templates that match a particular WikiProject.

Essentially we want something that looks like this:

"WikiProject Military history" --> ["WikiProject Military history", "Mil Hist", "MILHIST", "MilHist", "Milhist", "Military history", "WikiProject Colditz", "WikiProject MILHIST", "WikiProject Military", "WikiProject Military History", "WikiProject War", "WP Mil", "WP MILHIST", "WP Military History", "WP ilitary history", "WPCAS", "WPMH", "WPMIL", "WPMILHIST", "WPMilhist"]

Script call signature

Generate a mapping between WikiProject templates and all of the templates that redirect to them.

Usage: 
    wikiproject_template_map  (-h | --help)
    wikiproject_template_map  <taxon>... 
                              [--ua-email=<address>] [--threads=<num>]
                              [--verbose] [--debug]

Options:
    -h --help     Prints this documentation
    <taxon>       A yaml file containing partial or whole taxonomy.  Multiple
                  files will be merged.
    --ua-email=<address>  An email address to be included as a user-agent
                          header for requests to the MediaWiki API.
    --threads=<num>  How many threads to run in parallel [default: 4]
    -d --debug  Print log information while running
    -v --verbose  Print log information while running
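
The `--threads` option suggests that lookups for different WikiProjects run in parallel. A minimal sketch of how that could work with a thread pool is below. The names `map_templates` and `fetch_redirects` are hypothetical and not taken from the actual script, and the stub lookup stands in for a real API request.

```python
from concurrent.futures import ThreadPoolExecutor


def map_templates(names, fetch_redirects, threads=4):
    """Call fetch_redirects(name) for each name using a pool of worker threads.

    fetch_redirects is a hypothetical callable that returns the list of
    redirect template names for one WikiProject template.
    """
    names = list(names)
    with ThreadPoolExecutor(max_workers=threads) as executor:
        # executor.map preserves input order, so results line up with names.
        results = list(executor.map(fetch_redirects, names))
    return dict(zip(names, results))


# Stub lookup used only for illustration; a real one would query the API.
def fake_fetch(name):
    return [name.replace("WikiProject ", "WP ")]


mapping = map_templates(
    ["WikiProject Biography", "WikiProject Military history"],
    fake_fetch,
    threads=2,
)
```

Threads fit here because each lookup spends most of its time waiting on a network response.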

Usage

wikitaxon.yaml:

STEM:
  Science:
   - WikiProject Women scientists

$ python wikiproject_template_map wikitaxon.yaml

Prints:

WikiProject Women scientists:
 - WikiProject Women scientist
 - WikiProject Women Scientists
 - Wikiproject Women Scientists
 - WP Women Scientists
 - WPWS
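
To look up each WikiProject, the script first has to pull the leaf names out of the nested taxonomy. The sketch below shows one way to do that, assuming the YAML has already been parsed into plain dicts and lists (for example with a YAML loader). `collect_wikiprojects` is a hypothetical helper name, not one from the actual script.

```python
def collect_wikiprojects(taxonomy):
    """Yield every leaf WikiProject name from a nested dict/list taxonomy."""
    if isinstance(taxonomy, dict):
        for subtree in taxonomy.values():
            yield from collect_wikiprojects(subtree)
    elif isinstance(taxonomy, list):
        for item in taxonomy:
            yield from collect_wikiprojects(item)
    elif isinstance(taxonomy, str):
        yield taxonomy


# Parsed equivalent of the wikitaxon.yaml example above.
taxonomy = {"STEM": {"Science": ["WikiProject Women scientists"]}}
names = list(collect_wikiprojects(taxonomy))
```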

A single API call can do both jobs: (1) check whether Template:<WikiProject Name> is itself a redirect and, if so, find the name of the canonical template, and (2) get all of the redirects that point to the canonical template.
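
A minimal sketch of how such a query URL could be assembled is below. It uses the same parameters as the example request in this task. The helper name `build_redirect_query` is hypothetical, and `format=json` is an added assumption so that the API returns raw JSON.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_redirect_query(wikiproject_name):
    """Build a linkshere query URL for Template:<wikiproject_name>."""
    params = {
        "format": "json",        # assumption: not part of the example URL
        "formatversion": 2,
        "action": "query",
        "prop": "linkshere",
        "lhshow": "redirect",    # only list pages that are redirects
        "lhnamespace": 10,       # namespace 10 is Template:
        "lhlimit": 500,
        "redirects": "true",     # resolve the queried title if it is a redirect
        "titles": "Template:" + wikiproject_name,
    }
    return API_URL + "?" + urlencode(params)


url = build_redirect_query("WikiProject Women Scientists")
```

`urlencode` percent-encodes the colon and turns spaces into `+`, which the API treats the same as the underscore form of the title.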

As an example, consider "WikiProject Women Scientists", whose capitalization differs slightly from the target WikiProject's template name, which makes it useful for demonstration purposes. The query

https://en.wikipedia.org/w/api.php?formatversion=2&action=query&prop=linkshere&lhshow=redirect&lhnamespace=10&lhlimit=500&redirects=true&titles=Template:WikiProject_Women_Scientists

returns:

{
    "batchcomplete": true,
    "query": {
        "redirects": [
            {
                "from": "Template:WikiProject Women Scientists",
                "to": "Template:WikiProject Women scientists"
            }
        ],
        "pages": [
            {
                "pageid": 37510649,
                "ns": 10,
                "title": "Template:WikiProject Women scientists",
                "linkshere": [
                    {
                        "pageid": 37985986,
                        "ns": 10,
                        "title": "Template:WPWS",
                        "redirect": true
                    },
                    {
                        "pageid": 38402696,
                        "ns": 10,
                        "title": "Template:WikiProject Women Scientists",
                        "redirect": true
                    },
                    {
                        "pageid": 50798492,
                        "ns": 10,
                        "title": "Template:WikiProject Women scientist",
                        "redirect": true
                    },
                    {
                        "pageid": 53977302,
                        "ns": 10,
                        "title": "Template:Wikiproject Women Scientists",
                        "redirect": true
                    },
                    {
                        "pageid": 53977303,
                        "ns": 10,
                        "title": "Template:WP Women Scientists",
                        "redirect": true
                    }
                ]
            }
        ]
    }
}

In the "redirects" block, we can see that our target name was resolved to "WikiProject Women scientists". We can also see that there are 5 relevant redirects including the mis-capitalization we originally queried with.
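
Turning a response like the one above into a mapping entry could look roughly like the sketch below. `parse_redirects` is a hypothetical helper name, and the sample response is an abbreviated copy of the JSON above. The sketch assumes a single queried title and does not handle continuation, which would matter if a template had more redirects than `lhlimit`.

```python
def strip_namespace(title):
    """Drop the leading 'Template:' prefix from a page title, if present."""
    prefix = "Template:"
    return title[len(prefix):] if title.startswith(prefix) else title


def parse_redirects(response):
    """Return (canonical_name, [redirect_names]) from a linkshere response."""
    page = response["query"]["pages"][0]
    canonical = strip_namespace(page["title"])
    redirects = [
        strip_namespace(link["title"])
        for link in page.get("linkshere", [])
        if link.get("redirect")
    ]
    return canonical, redirects


# Abbreviated version of the response shown above.
sample = {
    "batchcomplete": True,
    "query": {
        "redirects": [
            {"from": "Template:WikiProject Women Scientists",
             "to": "Template:WikiProject Women scientists"}
        ],
        "pages": [
            {
                "pageid": 37510649,
                "ns": 10,
                "title": "Template:WikiProject Women scientists",
                "linkshere": [
                    {"pageid": 37985986, "ns": 10,
                     "title": "Template:WPWS", "redirect": True},
                    {"pageid": 38402696, "ns": 10,
                     "title": "Template:WikiProject Women Scientists",
                     "redirect": True},
                ],
            }
        ],
    },
}

canonical, redirects = parse_redirects(sample)
```

Reading the canonical name from the page title rather than from the "redirects" block also covers the case where the queried name was already canonical and no "redirects" block is returned.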