Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | Rileych | T240517 [EPIC] Growth: Newcomer tasks 1.1.1 (ORES topics)
Open | | None | T246909 Follow-up cleanup to topic models
Open | | None | T246910 Filter out disambiguation pages in topic labels
Resolved | | Isaac | T246912 Clean up History and Society.Society in the topic taxonomy.
Open | | None | T248042 Newcomer tasks: SE for kowiki not scored correctly (investigate)
Event Timeline
Thanks for creating this task, @Halfak. I'll respond to T245368#5941808 here (also @Tgr and @Isaac, who were participating).
I see what you mean about women-related topics, and why they might be okay in "Biography (women)". You're saying we could instead retitle that topic like "Women biographies and organizations"? Is there an easy way to use Wikidata to check what percent of articles with high scores on that topic are actually biographies?
And another question -- I see that you made subtasks for disambiguation pages and for "Society". Do you also intend to address the other issues we listed, like the weaker topics, especially "Central Africa"?
> Thanks for creating this task, @Halfak.
Agreed!
> Is there an easy way to use Wikidata to check what percent of articles with high scores on that topic are actually biographies?
If you can grab a sample of even a few hundred articles with a high predicted probability for Biography (women), then it's very simple to write a script to check whether they are biographies of women per Wikidata. I'm happy to do the check part, though I would need help extracting the list of predicted Biography (women) articles.
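For illustration, here is a rough sketch of what the "check part" could look like. It is not the script anyone on this task actually ran. It assumes the Wikidata entity JSON for each sampled article has already been fetched (for example through the `wbgetentities` API), and it counts an item as a biography of a woman only when it has "instance of" (P31) = human (Q5) and "sex or gender" (P21) = female (Q6581072). The function names and the choice to accept only Q6581072 are assumptions made for this example.

```python
# Sketch only: function names are hypothetical, and the input is assumed to be
# Wikidata entity JSON that was fetched separately for each sampled article.

HUMAN = "Q5"          # Wikidata item for "human"
FEMALE = "Q6581072"   # Wikidata item for "female"


def claim_ids(entity, prop):
    """Return the set of item IDs used as values of `prop` on this entity."""
    ids = set()
    for statement in entity.get("claims", {}).get(prop, []):
        # "novalue"/"somevalue" snaks carry no datavalue, so default to {}.
        value = statement.get("mainsnak", {}).get("datavalue", {}).get("value")
        if isinstance(value, dict) and "id" in value:
            ids.add(value["id"])
    return ids


def is_woman_biography(entity):
    """True if the item is an instance of human (P31) with sex or gender female (P21)."""
    return HUMAN in claim_ids(entity, "P31") and FEMALE in claim_ids(entity, "P21")


def share_of_women_biographies(entities):
    """Fraction of the given entities that look like biographies of women."""
    entities = list(entities)
    if not entities:
        return 0.0
    return sum(is_woman_biography(e) for e in entities) / len(entities)
```

Calling `share_of_women_biographies` on the entities for a few hundred high-scoring articles would give the percentage asked about above. Articles with no Wikidata item would have to be counted separately, and the list of high-scoring articles still has to come from the topic model's predictions.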
@Halfak -- I wanted to ping you about the questions in my previous comment.
Also, I added a subtask in which @Etonkovidova details the performance of the "Central Africa" topic.
Moving this to "Medium" because it seems getting more topic models has higher priority than this.