
Develop Metrics for the Language Gap: Develop metrics for scripts (writing systems)
Closed, Resolved, Public

Description

A proposed metric facet for the State of Languages Metrics is script coverage, to include the following metrics:

  • Number of the world's languages with script(s) supported on Wikimedia projects
  • Number of scripts represented across hosted content projects
  • Number of scripts represented across hosted and pre-hosted (i.e. test) content projects
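
A minimal sketch of how the three metrics above could be computed once script data is linked to projects. The sample records, the `PROJECTS` and `WORLD_LANGUAGES` names, and the rule that a language counts as supported when at least one of its scripts appears on a hosted project are all assumptions for illustration, not the task's actual data or definitions.

```python
# Hypothetical sample records: (language code, ISO 15924 script code, project status).
PROJECTS = [
    ("en", "Latn", "hosted"),
    ("ru", "Cyrl", "hosted"),
    ("sr", "Cyrl", "hosted"),
    ("hi", "Deva", "hosted"),
    ("chr", "Cher", "test"),
]

# Hypothetical reference list: language code -> set of scripts it is written in.
WORLD_LANGUAGES = {
    "en": {"Latn"},
    "sr": {"Cyrl", "Latn"},
    "chr": {"Cher"},
    "iu": {"Cans"},
}


def scripts_in(projects, statuses):
    """Return the distinct script codes used by projects with the given statuses."""
    return {script for _, script, status in projects if status in statuses}


def script_metrics(projects, world_languages):
    """Compute the three script-coverage counts as a dict."""
    hosted = scripts_in(projects, {"hosted"})
    hosted_and_test = scripts_in(projects, {"hosted", "test"})
    # Assumption: a language is "supported" if any of its scripts
    # is represented on at least one hosted project.
    supported = [
        lang for lang, scripts in world_languages.items() if scripts & hosted
    ]
    return {
        "languages_with_supported_script": len(supported),
        "scripts_hosted": len(hosted),
        "scripts_hosted_and_test": len(hosted_and_test),
    }


metrics = script_metrics(PROJECTS, WORLD_LANGUAGES)
print(metrics)
```

Using sets keeps each script counted once no matter how many projects use it, which is what "number of scripts represented" calls for.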

Currently, no structured data exists connecting our public scripts data with Wikimedia projects.

Tasks:

  • Build notebook to
  • Build a notebook for metrics calculation and visualization

Event Timeline

CMyrick-WMF changed the task status from Open to In Progress. Aug 14 2024, 7:49 PM
CMyrick-WMF updated the task description.

scripts_scraper.ipynb: this notebook scrapes Wikimedia script data from langdb.yaml and joins it with script names from the ISO 15924 wiki table.
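
A rough sketch of the join step described above, not the notebook's actual code. It assumes the parsed langdb.yaml exposes a `languages` mapping of language code to a `[script, regions, autonym]` list, and it uses a small hand-written `ISO_15924_NAMES` table in place of the scraped wiki table. The `x-demo` entry and the `join_scripts` helper are hypothetical.

```python
# Stand-in for the parsed langdb.yaml (layout is an assumption):
# language code -> [script code, region codes, autonym].
LANGDB = {
    "languages": {
        "en": ["Latn", ["EU", "AM"], "English"],
        "ru": ["Cyrl", ["EU", "AS"], "русский"],
        "hi": ["Deva", ["AS"], "हिन्दी"],
        # Hypothetical entry using a private-use script code with no name below.
        "x-demo": ["Qaaa", [], "demo"],
    }
}

# Stand-in for the ISO 15924 code-to-name table.
ISO_15924_NAMES = {
    "Latn": "Latin",
    "Cyrl": "Cyrillic",
    "Deva": "Devanagari",
}


def join_scripts(langdb, script_names):
    """Left-join each language's script code with its ISO 15924 name.

    Scripts missing from the name table keep script_name=None so that
    gaps stay visible instead of being silently dropped.
    """
    rows = []
    for code, entry in sorted(langdb["languages"].items()):
        script = entry[0]
        rows.append(
            {
                "language": code,
                "script": script,
                "script_name": script_names.get(script),
            }
        )
    return rows


rows = join_scripts(LANGDB, ISO_15924_NAMES)
for row in rows:
    print(row)
```

A left join is used here so that languages whose script code has no match in the name table remain in the output for review.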

Weekly update:

  • Finalized scripts_scraper.ipynb notebook which
    • scrapes public scripts data
    • joins with public ISO 15924 data
    • wrangles for use in metrics calculations:
This comment was removed by CMyrick-WMF.
CMyrick-WMF updated the task description.