The analysis of test results for image recommendations for unillustrated articles from mediasearch is here: https://phabricator.wikimedia.org/T272710#7119669
We want to estimate how many unillustrated articles on each of the relevant wikis will have an image recommended by mediasearch, at different levels of likelihood that the image is good. We need these estimates to decide which confidence score cutoff to use when making recommendations. In general, we want the highest confidence score possible, but if there aren't enough recommendations at a high score, we will consider using a lower one.
The wikis are:
pt
de
ru
nl
vi
+1 TBD
The likelihood-that-an-image-is-good levels we want to measure are 0.89 (equivalent to the API's "high" confidence level), 0.75, 0.66, and 0.58 (equivalent to the API's "medium" confidence level). We can start by measuring only 0.89 and evaluating whether it yields enough recommendations. If the number of recommendations at the 0.89 confidence level is too low, we will need to measure again at 0.75.
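As a rough illustration of the measurement described above, here is a minimal Python sketch that counts, per wiki, how many unillustrated articles have a recommendation at or above each likelihood level. It assumes we already have, for each unillustrated article, the likelihood score of its best recommended image; the input shape, the function name `count_at_thresholds`, and the sample scores are all hypothetical and not taken from the mediasearch API or real data.

```python
# Likelihood levels from this task, highest first.
THRESHOLDS = [0.89, 0.75, 0.66, 0.58]


def count_at_thresholds(best_scores, thresholds=THRESHOLDS):
    """Count articles per wiki whose best recommendation meets each threshold.

    best_scores: dict mapping a wiki code to a list of floats, one per
    unillustrated article, holding the likelihood of that article's best
    recommended image (hypothetical input format).
    """
    result = {}
    for wiki, scores in best_scores.items():
        result[wiki] = {
            t: sum(1 for s in scores if s >= t) for t in thresholds
        }
    return result


# Made-up scores, for illustration only.
sample = {
    "pt": [0.91, 0.80, 0.60, 0.40],
    "de": [0.89, 0.70],
}

counts = count_at_thresholds(sample)
for wiki, by_threshold in counts.items():
    for threshold, n in by_threshold.items():
        print(f"{wiki}: {n} articles at >= {threshold}")
```

Because the counts are cumulative, each lower level includes every article counted at the higher levels, so the numbers can only grow as the cutoff is relaxed.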
Acceptance criteria:
[ ] Document the number of suggestions for unillustrated articles in the above wikis at the 0.89 confidence level
[ ] Work with product management to evaluate whether that number is sufficient
[ ] If not, measure again at the 0.75 level, etc.