As with previous editions, the 2019 [CE Insights survey](https://meta.wikimedia.org/wiki/Community_Engagement_Insights/Annual_surveys) will need help selecting a sample of currently active contributors.
We will need to do the following:
[] Update [last year's code](https://github.com/wikimedia-research/Community-Engagement-Insights-sampling) with the following changes:
[] Combine the 1000-9999 and 10000+ edit buckets into a single 1000+ bucket.
[] Pull 450 users from a new South Asian language stratum (containing Hindi, Punjabi, Malayalam, Maithili, Gujarati, Tamil, Urdu)
[] Pull 150 users from a new Vietnamese language stratum
[] Pull 150 users from a new Malay language stratum (containing Malay and Indonesian)
[] Pull 150 users from a new Korean language stratum
[] Use the "2019 contributor opt-outs" tab from [this spreadsheet](https://docs.google.com/spreadsheets/d/1Ig_CKwjTw6_APQP3gwbvIk_SrEEE7Kt25m6ODgc8J7U/edit#gid=116779925) as our opt-out list
[] Add the Programs and Events Dashboard users (from the "Dashboard Leaders by home wiki 2018-2019" tab of [this spreadsheet](https://docs.google.com/spreadsheets/d/1k_H7XNC9GvwG9PRguXH5E6HY8o1wKdKoG84SygYd6mQ/edit#gid=1370015503)) as an additional stratum so they can be run through the next step along with the active editors
[] Use a fixed seed value for the sampling (e.g. the `random_state` parameter of `pandas.DataFrame.sample()`) so that, if necessary, we can re-run it with changes while still selecting the same users
[] Run the code to provide the users sampled for each stratum
[] Look up each user's registered email address (this will require using the MediaWiki replicas; email addresses are available both centrally in the `centralauth.globaluser` table and locally in each wiki's `user` table, but it's not clear whether one of those sources is more reliable or more up to date than the other)
[] Provide Learning and Evaluation with a list of the sampled users, with the following data for each:
[] user name
[] home wiki
[] registered email address if found (note this makes the dataset `[sensitive]`)
[] email verification date if found
[] edit count stratum
[] wiki stratum
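
For illustration only, here is a minimal sketch of how the bucket merge, opt-out filtering, and fixed-seed sampling described above could fit together. It is not last year's code: the column names (`user_name`, `wiki_stratum`, `edit_bucket`), the helper functions, the seed value, and the toy data and quotas are all assumptions made up for this example. The only call taken from the task itself is `pandas.DataFrame.sample()` with its `random_state` parameter.

```python
import pandas as pd

SEED = 2019  # hypothetical value; any constant works as long as it is recorded


def combine_edit_buckets(bucket: str) -> str:
    """Merge the two highest edit-count buckets into a single 1000+ bucket."""
    return "1000+" if bucket in ("1000-9999", "10000+") else bucket


def sample_strata(users, quotas, opt_outs, seed=SEED):
    """Draw up to quotas[stratum] users from each wiki stratum, reproducibly."""
    # Drop opted-out users before sampling so they can never be selected.
    eligible = users[~users["user_name"].isin(list(opt_outs))]
    # Sort so the draw does not depend on the row order of the input.
    eligible = eligible.sort_values(["wiki_stratum", "user_name"])
    samples = []
    for stratum, group in eligible.groupby("wiki_stratum"):
        # Never ask for more users than the stratum contains.
        n = min(quotas.get(stratum, 0), len(group))
        samples.append(group.sample(n=n, random_state=seed))
    return pd.concat(samples, ignore_index=True)


# Toy data standing in for the real list of active contributors.
users = pd.DataFrame({
    "user_name": [f"User{i}" for i in range(20)],
    "wiki_stratum": ["Korean"] * 10 + ["Vietnamese"] * 10,
    "edit_bucket": ["1000-9999", "10000+"] * 10,
})
users["edit_bucket"] = users["edit_bucket"].map(combine_edit_buckets)

quotas = {"Korean": 3, "Vietnamese": 3}  # the real quotas are 150 or 450
opt_outs = {"User0", "User11"}           # stand-in for the opt-out tab

first = sample_strata(users, quotas, opt_outs)
second = sample_strata(users, quotas, opt_outs)
```

With identical inputs and the same seed, the two runs select the same users. Note that in this sketch a change to the eligible pool (for example, a newly added opt-out) can still shift which users are drawn, so the seed alone does not guarantee a stable sample across such changes.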