Decide on a strategy for evaluating the article-country model. The evaluation is not straightforward for a few reasons:
- Some of the countries are pulled directly from Wikidata, so by definition we're essentially treating these as 100% precision (though we could verify this at some point). A sketch of that lookup follows this list.
- The countries that are "missing" from Wikidata are not missing at random. Certain topics have very high country coverage on Wikidata (e.g., people via country of citizenship) while others have very low coverage (e.g., the countries where a given species is endemic). The intent is for the model to use the links in the article to discover these missing countries, but if we test it against the existing groundtruth, it will only tell us how well it works for articles where we don't need it, not how well it works for articles where we do.
- There is also the language component to consider: the link-based approach might be more or less effective in different languages depending on their article lengths and linking norms.
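
To make the Wikidata-based side concrete, here is a minimal sketch of pulling country values directly from an item's claims. It assumes the public Wikidata API and the `requests` library; the property list (`P17`, `P27`, `P495`) is illustrative and not necessarily the set the model actually uses (a real pipeline would more likely read from the dumps).

```python
# Minimal sketch: fetch country-valued claims for one Wikidata item.
# COUNTRY_PROPS is an assumed, illustrative list of country-valued properties.
import requests

WD_API = "https://www.wikidata.org/w/api.php"
COUNTRY_PROPS = ["P17", "P27", "P495"]  # country, country of citizenship, country of origin

def wikidata_countries(qid: str) -> set[str]:
    """Return the QIDs of country values asserted directly on a Wikidata item."""
    countries = set()
    for prop in COUNTRY_PROPS:
        resp = requests.get(
            WD_API,
            params={"action": "wbgetclaims", "entity": qid, "property": prop, "format": "json"},
            timeout=30,
        )
        for claim in resp.json().get("claims", {}).get(prop, []):
            value = claim["mainsnak"].get("datavalue", {}).get("value", {})
            if isinstance(value, dict) and "id" in value:
                countries.add(value["id"])
    return countries

# e.g. wikidata_countries("Q42") -> {"Q145"}  (Douglas Adams -> United Kingdom)
```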
In reality, I'll probably do a few things:
- Test the linking approach on the available groundtruth, just to get a sense of its effectiveness and because it can be done easily (see the first sketch after this list).
- Split articles into: 1) those with countries from Wikidata, 2) those without Wikidata-based countries but with link-based predictions, and 3) those without Wikidata-based countries and with no link-based predictions. Verifying the results for groups 1 and 3 should be pretty quick but is arguably less important, so I can probably get by with a reasonable random sample. Group 2 will take the most time to validate because, for these articles, there may be a connection between the subject of the article and a country, but it might not be direct or explicit. For these, I'll build a sample more carefully and have folks evaluate it for me (see the second sketch below).
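
For the first step, a simple way to frame it is micro-averaged precision/recall of the link-based predictions against the countries already on Wikidata. The sketch below assumes both are available as sets of country QIDs; `predict_countries_from_links` is a placeholder for the link-based model, and the metric choice is my assumption, not a settled decision.

```python
# Evaluation sketch: link-based predictions vs. existing Wikidata groundtruth.
from collections import Counter

def evaluate(articles, predict_countries_from_links, groundtruth):
    """Micro-averaged precision/recall over country sets, restricted to labeled articles."""
    counts = Counter()
    for article in articles:
        gold = groundtruth.get(article, set())        # countries already on Wikidata
        if not gold:
            continue                                  # skip articles with no labels
        pred = predict_countries_from_links(article)  # set of predicted country QIDs
        counts["tp"] += len(pred & gold)
        counts["fp"] += len(pred - gold)
        counts["fn"] += len(gold - pred)
    precision = counts["tp"] / (counts["tp"] + counts["fp"]) if counts["tp"] + counts["fp"] else 0.0
    recall = counts["tp"] / (counts["tp"] + counts["fn"]) if counts["tp"] + counts["fn"] else 0.0
    return {"precision": precision, "recall": recall, "support": counts["tp"] + counts["fn"]}
```

Note that, per the caveat above, this only measures performance on articles where Wikidata coverage already exists, so it is a sanity check rather than a measure of the model's value.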
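
And for the second step, a sketch of the three-way split and sampling. The sample sizes, the function name, and the input formats are all placeholders; in practice the group-2 sample would probably also be stratified by wiki (and possibly topic) to cover the language differences mentioned above.

```python
# Sketch: bucket articles into the three evaluation groups and draw samples.
import random

def split_and_sample(articles, wikidata_countries, link_predictions,
                     n_quick=100, n_careful=500, seed=0):
    """Groups:
    1. has Wikidata countries           -> small spot-check sample
    2. no Wikidata, has link prediction -> larger sample for human raters
    3. no Wikidata, no link prediction  -> small spot-check sample
    """
    groups = {1: [], 2: [], 3: []}
    for a in articles:
        if wikidata_countries.get(a):
            groups[1].append(a)
        elif link_predictions.get(a):
            groups[2].append(a)
        else:
            groups[3].append(a)
    rng = random.Random(seed)
    return {
        1: rng.sample(groups[1], min(n_quick, len(groups[1]))),
        2: rng.sample(groups[2], min(n_careful, len(groups[2]))),
        3: rng.sample(groups[3], min(n_quick, len(groups[3]))),
    }
```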