Rate article importance. Importance can take many forms, and any given rating likely depends on the context in which it is asked (e.g., the importance of an article to a topic, a wiki, a region, or the diversity of a wiki's coverage at a given point in time). This work aims to identify these contextual factors and build tools that support automatic ranking of articles according to their importance in a given context.
Background
Article importance is a nebulous but important component of prioritizing wiki work (see this very comprehensive lit review). Various applications of importance can be found across the wikis -- e.g., Vital Articles lists, WikiProject importance assessments, inclusion in offline wikis, Identifying Topics for Impact in Movement Strategy.
In previous work, article importance is most often represented via demand (as measured by pageviews), centrality (as measured by number of inlinks, PageRank, or other network measures), or language coverage (number of sitelinks). While these factors have been shown to be strong indicators of importance as judged by external assessments, they tend to reinforce existing notions of importance. They do not necessarily move us towards a more diverse and inclusive Wikipedia, given the large gaps in reader populations, existing content biases, and biases in the available external sources and in the types of people and ideas that history has centered.
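To make the centrality proxy concrete, here is a minimal sketch of PageRank computed by power iteration next to a plain inlink count. The link graph is a hypothetical toy example, not real Wikipedia data, and the function name, damping factor, and iteration count are illustrative assumptions rather than anything taken from the work cited above.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of page -> list of outlinked pages."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outlinks is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for source, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


# Hypothetical toy link graph (assumption for illustration only).
toy_links = {
    "Breakfast": ["Waffle", "Pancake"],
    "Waffle": ["Breakfast"],
    "Pancake": ["Breakfast", "Waffle"],
    "Syrup": ["Waffle"],
}

ranks = pagerank(toy_links)

# Plain inlink counts, the simpler centrality proxy.
inlinks = {page: 0 for page in ranks}
for targets in toy_links.values():
    for target in targets:
        inlinks[target] += 1

for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{page}: pagerank={ranks[page]:.3f}, inlinks={inlinks[page]}")
```

Both measures reward pages that are already well linked, which is one way to see why centrality tends to reinforce existing notions of importance.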
Wiki things it could help with
- Recommender Systems:
  - SuggestBot could better help editors find the most impactful things to work on.
  - FixmeBot
  - Content Translation
  - GapFinder
  - Newcomer tasks
- List-building -- e.g., helping WikiProjects / campaigns to build ranked lists of articles to work on
- Metrics / Research:
  - More comprehensive measurements of productivity in Wikipedia: https://meta.wikimedia.org/wiki/Research:Measuring_value-added
  - Misalignment of quality and importance within a language edition akin to Warncke-Wang et al. (potentially as a measure of knowledge gaps)
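As a rough illustration of the misalignment idea, the sketch below places quality and importance classes on a common 0 to 1 scale and takes the difference. This is an assumed simplification, not the method of Warncke-Wang et al.; the class orderings follow the English Wikipedia WikiProject assessment labels (with the A class omitted for brevity), and the article names and ratings are hypothetical.

```python
# Ordinal scales, lowest to highest (simplified; an assumption for this sketch).
QUALITY_ORDER = ["Stub", "Start", "C", "B", "GA", "FA"]
IMPORTANCE_ORDER = ["Low", "Mid", "High", "Top"]


def scaled_rank(label, order):
    """Map an ordinal label to [0, 1] by its position in the scale."""
    return order.index(label) / (len(order) - 1)


def misalignment(quality, importance):
    """Positive when importance outstrips quality, negative when quality leads."""
    return scaled_rank(importance, IMPORTANCE_ORDER) - scaled_rank(quality, QUALITY_ORDER)


# Hypothetical (quality, importance) assessments, not real ratings.
assessments = {
    "Article A": ("Start", "Top"),
    "Article B": ("GA", "Mid"),
    "Article C": ("Stub", "Low"),
}

gaps = {title: misalignment(q, i) for title, (q, i) in assessments.items()}
work_list = sorted(gaps, key=gaps.get, reverse=True)

for title in work_list:
    print(f"{title}: misalignment={gaps[title]:+.2f}")
```

Articles at the top of `work_list` are those whose importance most exceeds their quality, which is the kind of signal that could feed a knowledge-gap metric.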
Potential components of importance
- Topical relevance -- either along a pre-defined taxonomy or via more ad-hoc semantic relatedness of a given keyword / article to other Wikipedia articles (e.g., via MoreLike). For instance, en:Waffle is top-importance for WikiProject Breakfast, but only high-importance for WikiProject Food and Drink.
- Global relevance -- e.g., geographic (what countries are mentioned), language (sitelinks)
- Reader demand -- e.g., pageviews, reader sessions, language switching
- Centrality -- e.g., inlinks, PageRank
- Diversity -- e.g., to what degree does an article match characteristics of existing articles vs. provide new content
- ...
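One way these components might be combined is a weighted sum of normalized signals, with the weights chosen per context. The sketch below is only an assumption about how that could look: the component names, weights, article labels, and values are all hypothetical, and the diversity score is taken as given rather than computed.

```python
import math


def min_max(values):
    """Rescale a dict of article -> raw value to the [0, 1] range."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {article: 0.0 for article in values}
    return {article: (v - lo) / (hi - lo) for article, v in values.items()}


def importance_scores(features, weights):
    """Weighted average of normalized components.

    features: component -> {article: raw value}, same articles in every component.
    weights:  component -> non-negative weight for the context being ranked.
    """
    total = sum(weights.values())
    normalized = {c: min_max(features[c]) for c in weights}
    articles = next(iter(features.values())).keys()
    return {
        a: sum(weights[c] * normalized[c][a] for c in weights) / total
        for a in articles
    }


def ranked(features, weights):
    scores = importance_scores(features, weights)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical raw signals; counts are log-scaled because they are heavy-tailed.
features = {
    "reader_demand": {"A": math.log1p(120000), "B": math.log1p(900), "C": math.log1p(15000)},
    "centrality": {"A": math.log1p(2400), "B": math.log1p(35), "C": math.log1p(410)},
    "diversity": {"A": 0.1, "B": 0.9, "C": 0.4},
}

# Two hypothetical contexts with different priorities.
demand_first = {"reader_demand": 0.6, "centrality": 0.4, "diversity": 0.0}
diversity_first = {"reader_demand": 0.2, "centrality": 0.2, "diversity": 0.6}

print("demand-first ranking:   ", ranked(features, demand_first))
print("diversity-first ranking:", ranked(features, diversity_first))
```

The same three articles are ordered differently under the two weightings, which mirrors the point that a rating of importance depends on the context in which it is asked.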
Past work
In the past, work has largely focused on modeling the article importance assessments produced by WikiProjects (akin to ORES articlequality models):
- https://meta.wikimedia.org/wiki/Research:Measuring_article_importance
- https://meta.wikimedia.org/wiki/Research:Automated_classification_of_article_importance
- Quality and Importance of Wikipedia Articles in Different Languages
Other related projects:
- Growing Wikipedia Across Languages via Recommendation: https://arxiv.org/abs/1604.03235
- Cultural relevance: https://meta.wikimedia.org/wiki/Wikipedia_Cultural_Diversity_Observatory