plugin: get template field values for all sitelinks

Authored by dachary on Oct 15 2016, 9:26 AM.

Description

For a given item, extract the templates found in the pages its
sitelinks point to. Extract the value of a given field from each page
and return a map, keyed by language, that looks like:

{
  'fr': 'License Publique Générale GNU',
  'en': 'GNU General Public License',
}

Only consider the templates matching the pattern specified in
lang2pattern (for instance, Infobox). This limits the chance of
picking up a conflicting value from another template that happens to
use the same field.
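
The filtering step described above can be sketched as follows. This is a
minimal illustration, not the plugin's code: select_templates, the
(name, params) pair layout and the choice to skip languages that have no
pattern are all assumptions; only lang2pattern comes from this commit.

```python
import re

def select_templates(templates, lang, lang2pattern):
    """Keep the (name, params) pairs whose name matches the pattern for lang.

    Hypothetical helper: returning nothing for a language without a
    pattern is an assumption, not documented behaviour.
    """
    pattern = lang2pattern.get(lang)
    if pattern is None:
        return []
    regex = re.compile(pattern, re.IGNORECASE)
    return [(name, params) for name, params in templates if regex.search(name)]

# Illustrative data only.
templates = [
    ("Infobox software", {"license": "GNU General Public License"}),
    ("Citation", {"license": "unrelated"}),
]
lang2pattern = {"en": r"^Infobox"}
selected = select_templates(templates, "en", lang2pattern)
```

Here only the Infobox template survives, so the license field of the
Citation template cannot leak into the result.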

The value is extracted from the field named after the 'en' entry of
lang2field. For instance, if lang2field['en'] = 'License', the license
field will be extracted. If lang2field['fr'] does not exist, the French
translation of the 'en' entry, as returned by the translate_title
method, will be used. If lang2field['zh'] = 'license' exists, it is
used as is and no attempt is made to translate the English word. It is
not uncommon for some Wikipedias to use field names that are not in
their native language.
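
The lookup order described above can be sketched as follows. This is a
minimal illustration under assumptions: resolve_field and fake_translate
are hypothetical names, and fake_translate merely stands in for the
translate_title method, whose real signature is not shown in this
commit message.

```python
def resolve_field(lang, lang2field, translate):
    """Return the field name to look up for lang.

    An explicit lang2field entry wins; otherwise the 'en' entry is
    translated. translate is a stand-in callable (assumption).
    """
    if lang in lang2field:
        return lang2field[lang]
    return translate(lang2field["en"], lang)

def fake_translate(word, lang):
    # Hypothetical stand-in for translate_title, with canned data.
    return {("License", "fr"): "Licence"}.get((word, lang), word)

lang2field = {"en": "License", "zh": "license"}
fields = {
    lang: resolve_field(lang, lang2field, fake_translate)
    for lang in ("en", "fr", "zh")
}
```

With this data, 'fr' falls back to the translated word while 'zh' keeps
its explicit entry untouched.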

Change-Id: I43178b93a3e2e4445c2b73742bef6f072b65d3f2
Signed-off-by: Loic Dachary <loic@dachary.org>

Details

Committed
dachary on Oct 15 2016, 9:52 AM
Parents
rPBFB2119e4a4f19b: plugin: get the title of langlinks for a page
Branches
Unknown
Tags
Unknown
References
refs/changes/57/316057/1
ChangeId
I43178b93a3e2e4445c2b73742bef6f072b65d3f2