
Expose gender of gendered namespace aliases in siteinfo API
Open, Needs Triage, Public

Description

Currently, the MediaWiki API gives no indication of gender for namespace aliases that exist for gender support. I need this information to generate user page and user talk links with the correctly gendered namespace, based on each person's gender setting, in a Discord bot used by the Russian Wikipedia community (which has gendered namespace aliases).

Because namespace aliases currently carry no gender indication, I cannot pick the right one even if I wanted to add that support. I therefore propose that the siteinfo API indicate the gender of namespace aliases in the ‘User’ (2) and ‘User talk’ (3) namespaces.

Current output:
https://ru.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=namespaces|namespacealiases&formatversion=2

Proposed output:

{
    "id": 2,
    "alias": "Участница",
    "gender": "female"
},
{
    "id": 3,
    "alias": "UT"
},
{
    "id": 3,
    "alias": "ОУ"
},
{
    "id": 3,
    "alias": "Обсуждение участницы",
    "gender": "female"
},

Event Timeline

stjn created this task. Sep 17 2018, 9:58 PM
Anomie added a subscriber: Anomie.

As stated, this request has two issues.

First, MediaWiki doesn't have any straightforward method for determining whether an alias is supposed to be for a gender. We'd have to guess by looking at whether it's a gendered namespace and test the alias for equality with each possible gender's version, and that wouldn't flag any non-canonical gendered aliases. For example, if a language had "Male user talk" and "Female user talk" as canonical names for NS 3 and "MUT" and "FUT" as aliases, the latter wouldn't be flagged even though they're "supposed" to be gendered.
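
The guess described above can be sketched roughly as follows. This is a minimal Python model of the idea, not MediaWiki code: the function name `guess_alias_genders` and the input dictionary are hypothetical, and the namespace names are the made-up ones from the example.

```python
def guess_alias_genders(alias, gendered_names):
    """Return the genders whose namespace name equals `alias` exactly.

    `gendered_names` maps a gender key to that gender's namespace name
    (hypothetical input shape, for illustration only).
    """
    return [gender for gender, name in gendered_names.items() if name == alias]


# Made-up language from the example above (NS 3).
ns3_names = {
    "male": "Male user talk",
    "female": "Female user talk",
}

print(guess_alias_genders("Female user talk", ns3_names))  # ['female']
print(guess_alias_genders("FUT", ns3_names))  # [] (gendered in intent, not detected)
```

Returning a list rather than a single value sidesteps the case where two genders share the same name.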

Second, MediaWiki internally has three "genders": 'male', 'female', and 'unknown'. In most but not all languages, 'male' and 'unknown' use the same translation (aln and mwl are the exceptions). Should the single "gender" attribute indicate "male" or "unknown" for the languages where they're the same?

Possibly a better way to do it would be to have siprop=namespaces indicate the canonical gendered aliases rather than trying to do it per alias in siprop=namespacealiases. For https://mwl.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=namespaces&formatversion=2 that might look like,

"2": {
    "id": 2,
    "case": "first-letter",
    "name": "Outelizador(a)",
    "subpages": true,
    "canonical": "User",
    "content": false,
    "nonincludable": false,
    "gendered": {
        "male": "Outelizador",
        "female": "Outelizadora",
        "unknown": "Outelizador(a)"
    }
},
"3": {
    "id": 3,
    "case": "first-letter",
    "name": "Cumbersa outelizador(a)",
    "subpages": true,
    "canonical": "User talk",
    "content": false,
    "nonincludable": false,
    "gendered": {
        "male": "Cumbersa outelizador",
        "female": "Cumbersa outelizadora",
        "unknown": "Cumbersa outelizador(a)"
    }
},
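
For a client such as the bot mentioned in the description, consuming this shape could look roughly like the sketch below. It assumes the proposed "gendered" key exists (it is not part of the current API), that namespaces without gender distinction simply omit it, and the helper name `namespace_name_for` is hypothetical.

```python
def namespace_name_for(ns_info, gender):
    """Pick the namespace name for `gender` from one siprop=namespaces entry.

    Falls back to the plain "name" when the entry has no "gendered" map
    or the map lacks that gender.
    """
    return ns_info.get("gendered", {}).get(gender, ns_info["name"])


# Entry taken from the proposed mwl output above, trimmed to the fields used here.
user_ns = {
    "id": 2,
    "name": "Outelizador(a)",
    "gendered": {
        "male": "Outelizador",
        "female": "Outelizadora",
        "unknown": "Outelizador(a)",
    },
}

# Hypothetical entry for a namespace without gender distinction.
talk_ns = {"id": 1, "name": "Talk"}

print(namespace_name_for(user_ns, "female") + ":Example")  # Outelizadora:Example
print(namespace_name_for(talk_ns, "female") + ":Example")  # Talk:Example
```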

BTW, at the code level this should use $wgContLang->needsGenderDistinction() && MWNamespace::hasGenderDistinction( $ns ) to determine if a namespace has gender distinction, and then $wgContLang->getGenderNsText( $ns, $gender ) for each gender to fetch the gendered version. I don't know of any central list of genders in MediaWiki, although I suppose you could extract it from the 'gender' preference's HTMLFormField definition's options from PreferencesFactory.

stjn added a comment. Sep 18 2018, 3:33 PM

> Should the single "gender" attribute indicate "male" or "unknown" for those where they're the same?

I was thinking that the default state wouldn’t need to be specified at all and only the non-default ones would be marked with their gender, but I am personally fine with your proposal, too.
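
A rough sketch of that rule (omit the attribute for the default form, mark only the non-default ones), again in Python rather than MediaWiki code. The function name `alias_gender` is hypothetical, and the Russian names are used for illustration on the assumption that 'male' and 'unknown' share the form "Участник".

```python
def alias_gender(alias, gendered, default="unknown"):
    """Return the gender to report for `alias`, or None to omit the attribute.

    An alias equal to the default form gets no gender, even if it also
    equals another gender's form.
    """
    if alias == gendered.get(default):
        return None
    for gender, name in gendered.items():
        if gender != default and name == alias:
            return gender
    return None


# Assumed forms for NS 2 on Russian Wikipedia (illustrative input).
ru_user = {"male": "Участник", "female": "Участница", "unknown": "Участник"}

print(alias_gender("Участница", ru_user))  # female
print(alias_gender("Участник", ru_user))   # None
```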