Wikidata Humans and Gender Data Tools
Open, Needs Triage · Public

Description

Tools like Wikidata Human Gender Indicators[1], Denelezh[2], and the Wikipedia Cultural Diversity Observatory[3] display data about Wikidata's coverage of humans, but do they provide exactly what anti-bias communities want? What new features, tools, or maintenance could we build to help these "countering systemic bias" projects succeed?

[1] http://whgi.wmflabs.org/
[2] https://www.denelezh.org/gender-gap/?project=enwiki
[3] https://meta.wikimedia.org/wiki/Wikipedia_Cultural_Diversity_Observatory
[4] http://wmdeanalytics.wmflabs.org/WDCM_BiasesDashboard/

Event Timeline

Restricted Application added a subscriber: Aklapper. · Aug 9 2019, 7:11 AM
Envlh added a subscriber: Envlh. · Aug 18 2019, 10:12 AM

@Envlh and I decided to make a roadmap for a "new" tool that would combine the best features of our 2 tools, and make a common platform to de-duplicate engineering effort.

We found that our tools have 3 stages in common:

  • Wikidata --> WDTK --> Intermediate-Human-Focused Database
  • Transformations and computations
  • Website or UX/UI.
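To make the three stages concrete, here is a minimal sketch, not the actual design. It swaps WDTK for plain Python dicts shaped like Wikidata JSON dump entities, keeps the "intermediate database" in memory, and renders the "website" as text lines. All function names (`extract_humans`, `count_by_gender`, `render_report`, `make_entity`) are hypothetical; only the property and item IDs (P31, P21, Q5) come from Wikidata.

```python
from collections import Counter

INSTANCE_OF = "P31"      # Wikidata property: instance of
SEX_OR_GENDER = "P21"    # Wikidata property: sex or gender
HUMAN = "Q5"             # Wikidata item: human


def claim_ids(entity, prop):
    """Return the item IDs that an entity states for one property."""
    ids = []
    for claim in entity.get("claims", {}).get(prop, []):
        value = claim.get("mainsnak", {}).get("datavalue", {}).get("value")
        if isinstance(value, dict) and "id" in value:
            ids.append(value["id"])
    return ids


def extract_humans(entities):
    """Stage 1: keep only humans, reduced to the fields later stages need."""
    for entity in entities:
        if HUMAN in claim_ids(entity, INSTANCE_OF):
            yield {"qid": entity["id"],
                   "genders": claim_ids(entity, SEX_OR_GENDER)}


def count_by_gender(humans):
    """Stage 2: one example computation, a count of humans per gender value."""
    counts = Counter()
    for human in humans:
        for gender in human["genders"] or ["unknown"]:
            counts[gender] += 1
    return counts


def render_report(counts):
    """Stage 3: stand-in for the website, one text line per gender value."""
    total = sum(counts.values())
    return [f"{gender}: {n} ({n / total:.0%})"
            for gender, n in counts.most_common()]


def make_entity(qid, instance_of, gender=None):
    """Hypothetical helper that builds a tiny dump-shaped entity for testing."""
    def claim(item_id):
        return {"mainsnak": {"datavalue": {"value": {"id": item_id}}}}
    claims = {INSTANCE_OF: [claim(instance_of)]}
    if gender:
        claims[SEX_OR_GENDER] = [claim(gender)]
    return {"id": qid, "claims": claims}
```

Keeping each stage behind its own function is what would let both tools share stage 1 while adding reports in stage 2 independently.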
notconfusing added a comment. · Edited · Aug 18 2019, 10:26 AM

TODO:
@Envlh

  • create draft domain schema

@notconfusing

  • add you to whgi.wmflabs.org access
  • create mysql database on whgi.wmflabs.org

other:

  • determine exactly which reports we want in the new tool
  • [make the architecture easy for new reports]
  • find a way to smart-check when there is a new dump.
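One possible shape for the new-dump check, as a sketch only: compare the newest available dump date against the last one we processed. It assumes dumps are identified by `YYYYMMDD` strings (which sort correctly as text); how the list of available dates is obtained and where `last_processed` is stored are left open. The function name is hypothetical.

```python
def newest_unprocessed_dump(available, last_processed):
    """Return the newest dump date that has not been processed yet, else None.

    available: iterable of dump dates as 'YYYYMMDD' strings (assumed format).
    last_processed: the date of the last dump we ingested, or None if none.
    """
    newest = max(available, default=None)
    if newest is None:
        return None  # nothing published yet
    if last_processed is None or newest > last_processed:
        return newest
    return None  # already up to date
```

A scheduled job could call this and start the ingest only when the result is not `None`.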

As far as reports go, it would be nice for the portrait part of the Sum of all Paintings if we could easily see the missing women in the genealogy chains from circa 1450 onwards. Often we already have the portrait items of the women, but are missing the human items for the women. Meanwhile the men seem to spawn from generation to generation 100% from the male chromosomes and only half of pendant portrait pairs are ingested as data. If we can identify missing wives/mothers/daughters we can interlink the female portraits to the sitters. Sometimes it goes the other way: we have the portrait as an image illustrating the human item, but no portrait item yet. We want both!
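A report like this could start from something very simple, for example listing humans who have a father (P22) recorded but no mother (P25). The sketch below assumes a simplified record shape (a dict from QID to a dict of property to list of values), which is not a decided schema, and the function name is hypothetical.

```python
FATHER = "P22"  # Wikidata property: father
MOTHER = "P25"  # Wikidata property: mother


def missing_mothers(humans):
    """Return QIDs of humans with a father statement but no mother statement.

    humans: {qid: {property_id: [value_qid, ...]}} (assumed, simplified shape).
    """
    return sorted(
        qid for qid, statements in humans.items()
        if statements.get(FATHER) and not statements.get(MOTHER)
    )
```

The same pattern would extend to spouse (P26) and child (P40) to surface missing wives and daughters.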

Hey all, so I wrote a requirement for something that could be a rewrite of the Monumental software by @Yarl and @Slaporte. I have been shopping around the idea, and there is a fair bit of interest for a "missing wiki women" type list-building tool: https://docs.google.com/document/d/1LtBujK6kARbwxUzDyF6hpv445rJwk7Ka7rAlf7e9USg/edit -- I would be happy to talk with folks about it.

Magnus added a subscriber: Magnus. · Sep 12 2019, 1:18 PM

This interests me a lot and I'd like to stay in the loop with conversations. Thanks.

Hi, this open task is tagged only with Wikimania-Hackathon-2019 which is in the past.
If this task was being worked on and resolved at the Wikimania 2019 Hackathon: Please change the task status to "resolved" via the Add Action...Change Status dropdown.
If nobody plans to ever work on this task anymore: Feel free to set the task status to "declined".
If this task is still valid and should stay open: Please add an active project tag to this task, so others can find this task when searching for open tasks under that active project or when looking at that project's workboard.
Thank you for helping clean up a bit! :)

Maximilianklein added a subscriber: Maximilianklein.

@Aklapper thanks. I want to keep it open, but don't know what project to assign it to. It should either be part of a community data/statistics tools project, or perhaps be its own project. How can I find a good new tag for it?

Envlh added a comment. · Oct 14 2019, 7:46 PM

@Maximilianklein I sent you an email about this topic.

@Astinson what's the status of that project? It seems like an output of this project could at least fulfil the Women in Red portion of your requirement.

Cool, thanks for noting this extra use case @Jane023 . We don't currently ingest Portraits, but it would be great to identify frequently occurring property pairs beyond the obvious birth/death/nationalities like you mention.

Women in Red members have uploaded thousands of portrait photos of notable women into Commons without also creating a Wikidata item for them. If this could be automated, there is the potential for thousands of additional names of missing notable women to appear on the Women in Red redlists. Conversely, some of the portrait photos uploaded to Commons years ago are for women who now have a Wikipedia article, but the editor didn't check Commons after they created the Wikipedia article. If the Commons image is linked to a Wikidata item, and the Wikidata item is linked to the Wikipedia article, how can we close the loop with getting the portrait photo into the Wikipedia article?

@Maximilianklein we are building a prototype with @Slaporte at the WikiConference North America Hackathon and would be happy to workshop it there with you all. Are you going?

@Astinson that prototype sounds interesting. I won't be at WikiConference North America, regrettably. I would be happy to Skype in and chat, as I think your campaigns tool would be a good candidate as a consumer of the data output we're planning. I'll comment on your gdoc.

Having spoken briefly with Envel on Sunday in Berlin after WikidataCon, I think it would be good to break this up into various tasks. We have a largish problem with articles coming in to Wikidata semi-automatically from various Women-in-Red editathons (all languages including English) that are not tagged as human or female. There are various people who work on these in various semi-automated projects. It would be nice to be able to measure these specifically somehow as a group so we can do better at catching them on the day, at the source. Once they are tagged as human and female, they come into "the grand bit bucket that cannot be queried due to time-outs". I propose setting up various measurements to tackle the 2-statement items that need further sorting, and then the 3+ statement items can be set up in various other visualisations. Once you have occupations set up, the same visualisations can be applied per occupation. The more specific the data, the more volunteers there are who are willing to help with improvement tasks. For example it is much easier to improve the item of a woman who is a professional tennis player once you know that is her occupation. I think if we set it up correctly, the totals can be generated from "lists of lists" that possibly drill down to item level instead of a grand database query.
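As a rough illustration of the "lists of lists" idea, the sketch below sorts items into three buckets (not yet tagged as human with a gender, 2-statement, 3+ statement), splits each bucket by occupation (P106), and derives totals from the lists rather than from one big query. The item shape, bucket names, and function names are assumptions for illustration only.

```python
from collections import defaultdict


def statement_count(item):
    """Total number of statements on an item (simplified record shape)."""
    return sum(len(values) for values in item["statements"].values())


def triage(item):
    """Assign an item to one of three hypothetical work buckets."""
    statements = item["statements"]
    if "Q5" not in statements.get("P31", []) or not statements.get("P21"):
        return "untagged"        # not yet marked as human with a gender
    if statement_count(item) <= 2:
        return "two_statement"   # only human + gender, needs sorting
    return "three_plus"          # enough data for richer visualisations


def build_lists(items):
    """Return {bucket: {occupation: [qid, ...]}}, the 'lists of lists'."""
    lists = defaultdict(lambda: defaultdict(list))
    for item in items:
        occupations = item["statements"].get("P106") or ["no occupation"]
        for occupation in occupations:
            lists[triage(item)][occupation].append(item["qid"])
    return lists


def totals(lists):
    """Derive per-bucket totals from the lists, counting each item once."""
    return {
        bucket: len({qid for qids in by_occupation.values() for qid in qids})
        for bucket, by_occupation in lists.items()
    }
```

Because every total is backed by an explicit list of QIDs, a volunteer can drill down from a number to the items behind it.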

Thanks @Jane023 for the great "User Story", that really helps shape the concept for how we can make the next version "usable", rather than just browsable.

I was looking at @Envlh 's site and remarked that the color is green for the masculine side and yellow for the feminine side. I just realized that it's the same for @Maximilianklein's tool. Green is usually associated with a positive, something completed and good, although in this case, the green part is supposed to be the overwhelming part and showing that something is wrong. So I'm just adding a note here to remind you to think about connotations of colors and what message they send when revamping those tools :) Thanks for your work!

Envlh added a comment. · Mar 17 2020, 8:47 PM

A project grant was opened to achieve this project. Feedback is of course welcome :)

@Aklapper this project is funded now, and has a new name, 'humaniki'. Can you help me set up a new tag or board to start tracking its progress? I'm not sure what level of organization is appropriate. I want to model out the technical development, as well as take user feedback and bug reports. Thanks.

@notconfusing: Congratulations! :) Please follow https://www.mediawiki.org/wiki/Phabricator/Creating_and_renaming_projects to request a project tag - thanks!

Sek2016 added a subscriber: Sek2016. · Wed, Sep 9, 5:07 AM
This comment was removed by Sek2016.

@notafish For humaniki, we are thinking of using green to represent men, purple to represent women, and gold to represent 'other genders'.

I understand your concerns about green being associated with positive, and your idea of symbolizing the men's stats as the 'overwhelming part and showing that something is wrong'. But I am thinking of this in a different manner, and your comments/feedback on this are welcome.

  1. Symbolizing men's stats as 'negative/overwhelming' would require highlighting them with a bold color, which would divert users' focus to men's stats over women's. I am instead planning on using a vibrant purple for women and a cool green for men, to bring women's stats into focus and draw users' attention to the low numbers of female articles. (Purple is also the color used to symbolize women internationally.)
  2. I was influenced by a Telegraph designer who used 'purple for women' and 'green for men'. Here's a quote from her: "Against white, purple registers with far greater contrast and so should attract more attention when putting alongside the green, not by much but just enough to tip the scales. In a lot of the visualisations men largely outnumber women, so it was a fairly simple method of bringing them back into focus."
  3. One other factor that contributed to this decision is the use of the neutral color gold for 'other genders', which complements green and purple when using color palettes with triadic chromatic relationships.
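The contrast claim in point 2 can be checked for any candidate palette with the WCAG contrast-ratio formula. The sketch below is only a way to compare options; the hex values in the usage note are placeholders, not humaniki's chosen colors.

```python
def _linear(channel):
    """Convert one 0-255 sRGB channel to linear light (WCAG definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """WCAG relative luminance of a '#rrggbb' color."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio between two colors, from 1 (same) up to 21."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)
```

For example, comparing `contrast_ratio("#800080", "#ffffff")` with `contrast_ratio("#008000", "#ffffff")` shows which of a placeholder purple and green stands out more against a white background.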

Another option is to use blue for men, dark yellow for women, and green for other genders (a combination selected using triadic chromatic relationships).
Pros: This color palette would not require users to learn three new color mappings as blue is traditionally used to represent Men.
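For reference, a triadic palette is three hues spaced 120° apart on the color wheel. A small sketch using Python's standard `colorsys` module shows how such candidates can be generated from any base color; the function name is hypothetical and the base color is up to the designer.

```python
import colorsys


def triadic(hex_color):
    """Return the base color plus the two colors 120 and 240 degrees away."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))
    hue, sat, val = colorsys.rgb_to_hsv(r, g, b)
    palette = []
    for step in (0, 1, 2):
        # Rotate the hue by thirds of the wheel, keeping saturation and value.
        rotated = colorsys.hsv_to_rgb((hue + step / 3) % 1.0, sat, val)
        palette.append("#" + "".join(f"{round(c * 255):02x}" for c in rotated))
    return palette
```

Starting from pure red, this yields red, green, and blue; starting from a green or a blue gives the kinds of three-color sets discussed above, which can then be adjusted by hand for contrast.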

References:
https://www.internationalwomensday.com/about
https://blog.datawrapper.de/gendercolor/
https://www.canva.com/colors/color-wheel/