
{{GENDER}} for bots
Open, Needs Triage, Public

Description

We currently have the magic word {{GENDER}}, which can take up to three unnamed parameters, and the output depends on what the mentioned user has set as their gender in the preferences. This is necessary for many languages (like Arabic or Spanish, for instance) that use separate forms depending on the gender. The syntax is like this:

{{GENDER:Arbitrary_username|he|she|they}}

However, Swahili (and I believe most Bantu languages, but Swahili is the only one I know) doesn't have gendered pronouns (or gendered subject/object prefixes, which are what's normally used) at all, so {{GENDER}} is seemingly superfluous. What it does have, however, are noun classes. Humans and animals belong to classes 1 and 2 (1 = singular, 2 = plural), while things belong to classes 9 and 10. And bots are not humans, they are things.

Therefore it would be great to have a way to make {{GENDER}} understand not only the gender of a user, but also the user's humanity (or lack thereof).

For instance, in [[Special:Log/move]] on the Swahili Wikipedia, we have this line (the verb is "alihamisha"):

  • 12:03, 25 Juni 2020 Kipala Majadiliano michango alihamisha ukurasa wa Mbuyuni (Chunya) hadi Mbuyuni (Songwe) (rejesha) (thank)

If User:Kipala had been a bot, the line should have read:

  • 12:03, 25 Juni 2020 Kipala Majadiliano michango ilihamisha ukurasa wa Mbuyuni (Chunya) hadi Mbuyuni (Songwe) (rejesha) (thank)

I.e. the subject prefix changes from a- to i-.

So the question is: Would it be possible to introduce some sort of mechanism to {{GENDER}} that could also look at whether or not a user has bot status? The solution I'm thinking of is introducing a named parameter, let's say bot, to the {{GENDER}} magic word, that takes effect if the user is in the bot user group. (But of course I'm open to other solutions as well.)

That way, for languages like Swahili, you could do {{GENDER:Arbitrary_username|he|she|they|bot=it}} when needed.
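To make the proposal concrete, here is a minimal sketch in Python of how the selection could work. It is not MediaWiki code: the function name `gender_form`, the `is_bot` flag, and the `bot_form` argument are all hypothetical, and the fallback for missing forms is an assumption about how {{GENDER}} behaves today.

```python
from typing import Optional, Sequence


def gender_form(
    gender: str,
    forms: Sequence[str],
    is_bot: bool = False,
    bot_form: Optional[str] = None,
) -> str:
    """Pick the output of a hypothetical {{GENDER:user|...|bot=...}}.

    gender:   "male", "female" or anything else for "unspecified"
              (assumed preference values).
    forms:    the unnamed parameters, in the order he | she | they.
    is_bot:   whether the user is in the bot group (hypothetical input).
    bot_form: value of the proposed named parameter bot=, if given.
    """
    # Proposed behaviour: the bot form wins, but only when the message
    # actually supplies one, so existing messages keep working unchanged.
    if is_bot and bot_form is not None:
        return bot_form
    if not forms:
        return ""
    if gender == "male":
        return forms[0]
    if gender == "female":
        # Assumed fallback: reuse the first form when no second one is given.
        return forms[1] if len(forms) > 1 else forms[0]
    # Unspecified gender: third form if present, otherwise the first.
    return forms[2] if len(forms) > 2 else forms[0]


# Swahili move-log verb: one form for all humans, a different one for bots.
print(gender_form("unknown", ["alihamisha"], is_bot=False, bot_form="ilihamisha"))
print(gender_form("unknown", ["alihamisha"], is_bot=True, bot_form="ilihamisha"))
```

Because the bot form is only used when the message supplies it, languages that do not care about the distinction would not need to change anything.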

Event Timeline

Potential problem: Bot status is not permanent; an account can gain or lose it at any time. But that's also the case for the gender preference, which a user can change at any time, so I don't know if it's a big issue.

Adding @Muddyb, who is a native Swahili speaker, and could potentially be interested and shed some light on this.

I'm wondering how Slavic languages handle this, as there is the concept of animate and inanimate declension for nouns with a masculine gender (and in East-Slavic also feminine gender). See items in https://en.wikipedia.org/wiki/List_of_languages_by_type_of_grammatical_genders#Masculine,_feminine,_and_neuter with an asterisk.

> Adding @Muddyb, who is a native Swahili speaker, and could potentially be interested and shed some light on this.

I think you got it covered already.
What can I do to help?
Whatever it takes!

> I'm wondering how Slavic languages handle this, as there is the concept of animate and inanimate declension for nouns with a masculine gender (and in East-Slavic also feminine gender). See items in https://en.wikipedia.org/wiki/List_of_languages_by_type_of_grammatical_genders#Masculine,_feminine,_and_neuter with an asterisk.

In Russian, it's more about noun and adjective case endings and not about verbs. It would perhaps be relevant with usernames, but I don't think that we ever put the usernames themselves in GRAMMAR or GENDER. More precisely, we do use usernames as a parameter to GENDER, but I cannot think of any instance of modifying the username using GRAMMAR or GENDER.

I can think of at least one website that does modify names of users according to grammar when the Russian-language user interface is used: the social network VK, which is popular in Russian-speaking countries. However, the names there are expected to be real personal names, so software can decline them mostly correctly, whereas on wikis the usernames are an unpredictable mix of personal names and pseudonyms, so it's too complicated to have a system that would decline them well.

More generally to the original question: it sounds like a valid use case, but it may be complicated to implement in practice for now. GENDER uses a user preference, while the state of being a bot is stored differently, as a user group. It's probably conceivable to modify the GENDER code to check for the bot group, but it's probably not trivial. Perhaps @Nikerabbit can think of more details.

We could add a fourth gender option to preferences, and then the bot account holders could decide how they want their bot to be talked about. I think it's also possible to set preference defaults per user group using a hook, but I haven't checked that.
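A rough sketch of that idea in Python, just to show the lookup order. Nothing here is an actual MediaWiki API: `effective_gender`, `GROUP_GENDER_DEFAULTS`, and the preference value "bot" are made up for illustration, and whether a hook can really supply per-group defaults is still unchecked.

```python
from typing import Iterable, Mapping, Optional

# Value used when a user has never set the gender preference (assumed).
DEFAULT_GENDER = "unknown"

# Hypothetical per-group defaults: accounts in the bot group get the
# hypothetical fourth option "bot" unless the owner chose something else.
GROUP_GENDER_DEFAULTS: Mapping[str, str] = {"bot": "bot"}


def effective_gender(
    saved_preference: Optional[str],
    groups: Iterable[str],
    group_defaults: Mapping[str, str] = GROUP_GENDER_DEFAULTS,
) -> str:
    """Resolve the value that GENDER would see for one user."""
    # An explicit choice by the account holder always wins.
    if saved_preference is not None:
        return saved_preference
    # Otherwise use the first group that defines a default.
    for group in groups:
        if group in group_defaults:
            return group_defaults[group]
    # No explicit choice and no group default.
    return DEFAULT_GENDER


print(effective_gender(None, ["bot"]))      # bot account, nothing saved
print(effective_gender("female", ["bot"]))  # bot owner picked a value
print(effective_gender(None, ["sysop"]))    # ordinary account
```

With this ordering GENDER would keep reading a single preference value, and the bot group would only change what that value defaults to.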

> I can think of at least one website that does modify names of users according to grammar when the Russian-language user interface is used

Facebook also does this for Ukrainian names, sometimes incorrectly, see https://www.facebook.com/groups/263372487032233/permalink/3153883047981148/ for example.

Perhaps something we could do once Lexicographical data on Wikidata rocks :)