Goal
Incorporate vital-article gap into Knowledge Gap pipelines.
Details
Caroline put together code under T383925 (Develop Metrics for the Language Gap: Explore vital article coverage across Wikipedia language editions) that calculates, for every language edition of Wikipedia, coverage of the List of articles every Wikipedia should have. The output should largely correspond to the schema used by the other language gap metrics. The notebook currently uses a SPARQL query to gather the data, but that could easily be converted into a query against the item_page_link table if that's deemed a more sustainable approach.
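As a rough illustration of the coverage calculation only (not the notebook's actual code), here is a minimal Python sketch. It assumes the vital-article list is available as a set of Wikidata item IDs and that sitelink data arrives as (item_id, wiki_db) pairs, as a query against item_page_link might return; the function name, the column names, and the toy data are all hypothetical.

```python
from collections import defaultdict


def vital_article_coverage(vital_items, item_page_links):
    """Return {wiki_db: fraction of vital items that have an article there}.

    vital_items: iterable of Wikidata item IDs on the vital-articles list.
    item_page_links: iterable of (item_id, wiki_db) pairs (assumed shape).
    """
    vital = set(vital_items)
    if not vital:
        return {}

    # Collect the distinct vital items seen per wiki; sets ignore duplicate rows.
    covered = defaultdict(set)
    for item_id, wiki_db in item_page_links:
        if item_id in vital:
            covered[wiki_db].add(item_id)

    return {wiki: len(items) / len(vital) for wiki, items in covered.items()}


# Toy data, invented purely for illustration.
vital = ["Q1", "Q2", "Q3", "Q4"]
links = [
    ("Q1", "enwiki"),
    ("Q2", "enwiki"),
    ("Q1", "enwiki"),   # duplicate row, counted once
    ("Q3", "frwiki"),
    ("Q99", "frwiki"),  # not a vital item, ignored
]
coverage = vital_article_coverage(vital, links)
print(coverage)
```

In this sketch, wikis with no vital articles at all do not appear in the result; a real pipeline would likely join against the full list of language editions so that they show up with a coverage of zero.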
Motivation
The existing content gap metrics already cover several facets, but we have not yet provided coverage of the Language Gap. This vital articles gap arose from exploration of what metrics might look like in that space. Ultimately it was determined that the Language Gap was multi-faceted and closely related to the Topics for Impact gap as well. The vital articles component captures a globally-important set of articles that Wikimedians have determined should exist on all Wikipedia language projects. Measuring progress against this list helps in assessing one aspect of these gaps. Other aspects of these gaps are also under consideration -- e.g., language coverage, more locally-impactful topics -- but this vital-article component is well-defined and worth formalizing while we wait for a clearer scope with respect to the broader Language and Topics for Impact space.