Per T385970, we'll need a SQLite DB dependency for the article-country model on LiftWing. Isaac produced the initial database but, per discussion with Fabian, we should build an Airflow DAG to make updating this dependency easier in the future. The current code is custom, but we should be able to build directly on top of the geography and cultural data in the content gaps metrics. The increased accuracy of, e.g., the coordinate-based data in the article-country model is not necessary for this DB, so we can use the more efficient approach taken with the content gaps.
Steps:
- @Isaac produce notebook with initial code for how to generate this SQLite DB. Ideally follow the logic used by reference-quality (code). Done: https://gitlab.wikimedia.org/isaacj/miscellaneous-wikimedia/-/blob/master/article-country/article-country-sqlite-dependency.ipynb?ref_type=heads
- REng update country properties for cultural gap to include "P9714": "taxon range".
- REng incorporate geographic/cultural gaps pipeline into research-datasets (T377267) and add in SQLite DB pipeline, presumably as code that can be manually run when an update is desired on LiftWing.
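
For reference, a rough sketch of what the manually run SQLite step could look like, assuming the gaps pipeline can hand us (Wikidata item, country) pairs. The table name `item_countries`, its columns, and the function names `build_db` and `lookup_countries` are all hypothetical placeholders; the actual schema should follow the notebook linked above and may differ.

```python
import sqlite3
from typing import Iterable, List, Tuple

# Hypothetical schema: one row per (Wikidata item, country) pair.
# The real table layout should be taken from the notebook, not from here.
SCHEMA = """
CREATE TABLE item_countries (
    qid TEXT NOT NULL,
    country TEXT NOT NULL,
    PRIMARY KEY (qid, country)
) WITHOUT ROWID
"""


def build_db(path: str, rows: Iterable[Tuple[str, str]]) -> None:
    """Rebuild the SQLite file at `path` from (qid, country) pairs."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DROP TABLE IF EXISTS item_countries")
            conn.execute(SCHEMA)
            # INSERT OR IGNORE silently skips duplicate pairs.
            conn.executemany(
                "INSERT OR IGNORE INTO item_countries (qid, country) VALUES (?, ?)",
                rows,
            )
    finally:
        conn.close()


def lookup_countries(path: str, qid: str) -> List[str]:
    """Return the countries stored for one item, sorted alphabetically."""
    conn = sqlite3.connect(path)
    try:
        cur = conn.execute(
            "SELECT country FROM item_countries WHERE qid = ? ORDER BY country",
            (qid,),
        )
        return [row[0] for row in cur]
    finally:
        conn.close()
```

Dropping and recreating the table on each run keeps an update idempotent, which seems to fit the "manually run when an update is desired" approach; an incremental update would need a different design.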