Reported by Laurence "GreenReaper" Parry from the Wikibase Community User Group.
There was a previous outage that was tracked and resolved in T279443.
$ tail -n 1 external_traffic/referer_*
==> external_traffic/referer_data.tsv <==
2021-04-25 TRUE external (search engine) Ecosia mobile web 498982

==> external_traffic/referer_data.tsv.tmp <==
Scaling row group sizes to 93.16% for 1 writers access_method date is_search pageviews referer_class search_engine

==> external_traffic/referer_nonbot_data.tsv <==
2021-04-25 TRUE external (search engine) Ecosia mobile web 498872

==> external_traffic/referer_nonbot_data.tsv.tmp <==
Scaling row group sizes to 93.16% for 1 writers access_method date is_search pageviews referer_class search_engine
$ tail -n 1 wdqs/basic_usage*
==> wdqs/basic_usage.tsv <==
2021-04-25 /bigdata/namespace/wdq/sparql TRUE FALSE 950887

==> wdqs/basic_usage.tsv.tmp <==
Scaling row group sizes to 92.77% for 1 writers date events http_success is_automata path
It seems that Reportupdater was updated in April to enable hive as a report type. It might be useful to switch to that, but it still wouldn't explain the error here.
Unfortunately, as of right now 90 days ago is already April 27th, so there's going to be a gap in the data no matter what. It's just a question of how big a gap at this point.
Looking at those .tmp files, it appears I probably need to add some more greps to https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia/discovery/golden/+/refs/heads/master/modules/metrics/external_traffic/referer_data and https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia/discovery/golden/+/refs/heads/master/modules/metrics/wdqs/basic_usage
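
For illustration only, here is a rough sketch of the kind of filter that might work; it is not the actual change to the golden scripts. In the .tmp output above, the "Scaling row group sizes" message appears to share a line with the TSV header, so a plain `grep -v` on that pattern would probably drop the header too. The sketch therefore uses `sed` to strip the message instead, and deletes the line only if nothing else is left on it. The function name `strip_hive_noise` and the sample input are hypothetical, and the exact message format is assumed from the two examples quoted above.

```shell
#!/bin/sh
# Hypothetical helper: remove the "Scaling row group sizes" log message from
# query output, whether it sits on its own line or is prepended to the TSV
# header line. The pattern is an assumption based on the .tmp files above.
strip_hive_noise() {
  sed -E \
    -e 's/^Scaling row group sizes to [0-9.]+% for [0-9]+ writers[[:space:]]*//' \
    -e '/^$/d'
}

# Made-up sample input mimicking the wdqs/basic_usage.tsv.tmp contents.
printf 'Scaling row group sizes to 92.77%% for 1 writers\tdate\tevents\thttp_success\tis_automata\tpath\n' \
  | strip_hive_noise
```

In a real query script, the filter would presumably sit in the pipeline between the hive invocation and the redirect to the output file, next to whatever greps are already there.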