For consistency, we rename last_access_uniques to unique_devices/per-domain (to match the naming used project-wide).
Things to do to deploy in production (order is somewhat important):
- Stop oozie jobs
  - last_access_uniques_daily and monthly
  - cassandra_unique_devices_daily and monthly loading jobs
  - last_access_uniques_daily druid loading jobs
- Change hive datasets
  - create the new hive tables unique_devices_per_domain_daily and unique_devices_per_domain_monthly
  - copy from last_access_uniques_daily and monthly tables to unique_devices_per_domain_daily and monthly, using uri_host to populate domain
- Change archive folder and filenames in HDFS
  - create new HDFS archive folder /wmf/data/archive/unique_devices/per_domain
  - copy with renaming the HDFS files /wmf/data/archive/unique_devices/YYYY/YY-MM/unique_devices_daily-YYYY-MM-DD.gz and /wmf/data/archive/unique_devices/YYYY/YY-MM/unique_devices_monthly-YYYY-MM.gz to /wmf/data/archive/unique_devices/per_domain/YYYY/YYYY-MM/unique_devices_per_domain_...
- Restart oozie jobs
  - unique_devices_per_domain_daily and monthly jobs (starting from the last last_access_uniques job)
  - Cassandra unique_devices_loading_daily and monthly as coordinators (starting from the last unique_devices loading job; keep a calendar reminder to restart the bundle at the beginning of next month)
  - unique_devices_per_domain_daily-druid, from the beginning of uniques (2015-12-17): a full reload is needed because of the schema change
- Drop previous druid data (datasource and field names change) (see our druid doc)
  - Disable the unique_devices_daily datasource (from druid coordinator UI)
  - Ask druid to delete deep storage data
- Update documentation for names
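
For reference, the hive copy step (using uri_host to populate domain) amounts to a field rename on every row. A minimal Python sketch of that mapping on plain dicts follows; it only illustrates the rename and is not the actual hive query. The function name to_per_domain_row and the uniques_estimate field in the usage example are hypothetical and only there for illustration.

```python
from typing import Any, Dict


def to_per_domain_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of an old-style row with uri_host renamed to domain.

    Assumption: every other field is carried over unchanged.
    """
    # Copy all fields except the one being renamed.
    new_row = {key: value for key, value in row.items() if key != "uri_host"}
    # The new tables use "domain" where the old ones used "uri_host".
    new_row["domain"] = row["uri_host"]
    return new_row


if __name__ == "__main__":
    # "uniques_estimate" is a hypothetical field name used for illustration.
    old_row = {"uri_host": "en.wikipedia.org", "uniques_estimate": 10}
    print(to_per_domain_row(old_row))
```

The input row is left untouched, so the old tables can be kept until the new ones have been checked.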