General issue:
Description
The MaxMind database has recently updated the names of some countries, for instance Netherlands -> The Netherlands.
On December 14th, Turkey was renamed to Türkiye in our MaxMind database.
The traffic anomaly detection job, which is configured to track Turkey (among other countries),
has since found no traffic counts for Turkey and has assumed they were 0.
As a result, it has raised false positive anomaly alerts, interpreting the missing counts as a sudden traffic drop for that country.
We should fix that, so that false positive alerts stop, the monitored metric history is restored, and the job can continue to monitor the country properly.
This should be a quick fix. The underlying problem, which affects all pipelines that use MaxMind country names, is tackled in T353959.
A possible approach is (two steps):
- Change the query that generates the traffic anomaly metrics to group by MaxMind country code instead of country name. Then join the resulting metrics to the canonical_data.countries table to retrieve our canonical country name for each code. Test, review, merge and deploy.
- Correct the data: copy all production anomaly detection metric data to a temporary location, with the Türkiye country name normalized to Turkey. Then replace the production dataset with the temporary one.
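
The two steps above can be sketched as follows. This is only an illustration using an in-memory SQLite database, not the actual job query: the `traffic`, `countries`, and `metrics` tables, their column names, and all values are made-up stand-ins (the real schema of canonical_data.countries and of the metrics dataset may differ).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical traffic source carrying MaxMind codes and names (made-up values).
    CREATE TABLE traffic (day TEXT, country_code TEXT, maxmind_name TEXT, views INTEGER);
    INSERT INTO traffic VALUES
        ('12-13', 'TR', 'Turkey', 100),
        ('12-14', 'TR', 'Türkiye', 110),
        ('12-14', 'NL', 'The Netherlands', 50);

    -- Hypothetical stand-in for canonical_data.countries.
    CREATE TABLE countries (iso_code TEXT PRIMARY KEY, name TEXT);
    INSERT INTO countries VALUES ('TR', 'Turkey'), ('NL', 'Netherlands');

    -- Hypothetical existing metrics, keyed by name, already affected by the rename.
    CREATE TABLE metrics (day TEXT, country TEXT, views INTEGER);
    INSERT INTO metrics VALUES ('12-13', 'Turkey', 100), ('12-14', 'Türkiye', 110);
""")

# Step 1: group by country code, then join to the canonical table for the name.
rows = conn.execute("""
    WITH per_code AS (
        SELECT day, country_code, SUM(views) AS views
        FROM traffic
        GROUP BY day, country_code
    )
    SELECT p.day, c.name AS country, p.views
    FROM per_code AS p
    JOIN countries AS c ON c.iso_code = p.country_code
    ORDER BY p.day, c.name
""").fetchall()

# Step 2: copy the metrics to a temporary table with the name normalized,
# then swap the temporary table in place of the original.
conn.executescript("""
    CREATE TABLE metrics_tmp AS
        SELECT day,
               CASE country WHEN 'Türkiye' THEN 'Turkey' ELSE country END AS country,
               views
        FROM metrics;
    DROP TABLE metrics;
    ALTER TABLE metrics_tmp RENAME TO metrics;
""")
fixed = conn.execute("SELECT day, country, views FROM metrics ORDER BY day").fetchall()
```

Because the grouping key is the country code, a future rename in MaxMind would change only the name column of the source and leave the metric series intact.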
Acceptance Criteria
- All traffic anomaly detection metrics have a single country name across all time.
- The Airflow job for traffic anomaly detection outputs metrics with consistent country names.
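
One way to check these criteria, sketched under the assumption that the metric rows can be exported as (country code, country name) pairs: the helper below is hypothetical and simply reports any code that appears with more than one name.

```python
from collections import defaultdict


def inconsistent_codes(rows):
    """Return {code: sorted names} for codes seen with more than one name."""
    names_by_code = defaultdict(set)
    for code, name in rows:
        names_by_code[code].add(name)
    return {
        code: sorted(names)
        for code, names in names_by_code.items()
        if len(names) > 1
    }


# Made-up sample rows: TR appears under two names, NL under one.
sample = [("TR", "Turkey"), ("TR", "Türkiye"), ("NL", "Netherlands")]
problems = inconsistent_codes(sample)
```

An empty result would indicate that every country code maps to exactly one name over the period checked.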
Required
- Modify traffic anomaly detection query
- Test it in Airflow's development instance
- Review, merge and deploy to Airflow analytics
- Re-run the corrupted dates in Airflow