
Investigation: Where can we automatically get the female form of a category?
Open, Needs Triage · Public · Spike

Description

Motivation
Wikidata does not yet have the female form for most categories, so let's find out which categories we could actually find the female form for.

Acceptance Criteria

  • Write down all the cases where we can find the female form (e.g. occupations)
  • Write down all the cases where we will have problems or will not be able to do anything about it (e.g. nationalities before nouns, and even worse instances)
  • Explain how we can override edge cases with manual entry.
  • If it is unclear whether we can get to that data, try it out.

Event Timeline

Lea_WMDE created this task. · Jul 24 2019, 1:05 PM
Restricted Application added a subscriber: Aklapper. · Jul 24 2019, 1:05 PM
awight updated the task description. · Jul 24 2019, 2:57 PM
awight added a subscriber: awight.

I've added an acceptance criterion to document how edge cases are corrected. We need to be able to prevent our software from adding certain incorrect categories to its list, override bad guesses, and provide additional edge cases manually.
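
The three needs named above (blocking categories, overriding bad guesses, adding edge cases by hand) could be covered by a small manual layer applied on top of whatever the automatic lookup returns. The following is only a sketch under assumptions: `ManualOverrides`, `resolve_female_form`, and the category names are hypothetical and do not come from any existing code.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class ManualOverrides:
    # Categories for which no female form must ever be produced.
    blocked: Set[str] = field(default_factory=set)
    # Hand-entered female forms. These win over the automatic guess and
    # also cover categories for which there is no guess at all.
    replacements: Dict[str, str] = field(default_factory=dict)


def resolve_female_form(
    category: str, guess: Optional[str], overrides: ManualOverrides
) -> Optional[str]:
    """Apply manual corrections to an automatically guessed female form."""
    if category in overrides.blocked:
        return None
    if category in overrides.replacements:
        return overrides.replacements[category]
    return guess


# Hypothetical entries, for illustration only.
overrides = ManualOverrides(
    blocked={"Example blocked category"},
    replacements={"Example category": "Example category (female form)"},
)
```

Checking the block list first means a manual block always wins, even if a replacement for the same category is entered later by mistake.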

awight added a comment. · Aug 1 2019, 8:10 AM

Initial guesses, found in T227874 (Investigation: Check Wikidata data on gendered category names), were:

  • P2521, female form of label - There is precedent to apply this property to categories, occupations, and more.
  • P3321, male form of label
  • P2959, permanent duplicated item - This is a persistent alias for another item, not a duplicate scheduled for eventual merge.

There doesn't seem to be a "form of label" property for other genders, see https://www.wikidata.org/wiki/Special:Search?search=form+of+label&ns120=1
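
As a rough illustration of how the gendered-label properties above could be read, here is a minimal sketch. It assumes the item is available in the standard Wikibase entity JSON layout (as served by Special:EntityData or `wbgetentities`), where P2521 and P3321 hold monolingual text values. The function name `gendered_label` and the sample entity are made up for illustration and were not fetched from Wikidata.

```python
from typing import Optional

FEMALE_FORM = "P2521"  # female form of label
MALE_FORM = "P3321"    # male form of label


def gendered_label(entity: dict, prop: str, language: str) -> Optional[str]:
    """Return the first value of `prop` in `language`, or None if absent."""
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # skip "no value" and "unknown value" snaks
        value = snak["datavalue"]["value"]  # {"text": ..., "language": ...}
        if value.get("language") == language:
            return value["text"]
    return None


# Hand-written sample in Wikibase JSON shape; the ID and text are placeholders.
sample_entity = {
    "id": "Q0",
    "claims": {
        "P2521": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P2521",
                    "datavalue": {
                        "type": "monolingualtext",
                        "value": {"text": "Schauspielerin", "language": "de"},
                    },
                }
            }
        ]
    },
}
```

A `None` result would be the signal to fall back to a manual entry, since most categories do not carry this statement yet.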

Another property that is being used for aliasing, but doesn't carry gender information:

  • P301, category's main topic - This has been used to link categories with alternative spellings, as the biographical subjects themselves would use when describing their profession: British theater directors and American theater directors.

Adding an active project tag so this task is shown on some workboard.

Adding WMDE-TechWish to this open task as there are no active project tags on this task since the archival of WMDE-TechWish, hence nobody could find this task on some workboard.

Restricted Application changed the subtype of this task from "Task" to "Spike". · Nov 2 2020, 10:04 AM