Description

The copy happens from hdfs://analytics-hadoop (the 60-node analytics cluster) to hdfs://analytics-backup-hadoop (a 16-node cluster).
The backup cluster stores data with a replication factor of 2 instead of the 3 used on the regular analytics cluster.
The copy is done in chunks, path by path, to keep individual jobs from running too long (see the sketch below).
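The task does not record the exact commands used; below is a minimal sketch of how such a chunked copy could be driven with hadoop distcp. The -Ddfs.replication=2 override, the Python wrapper, and the per-path loop are assumptions based on this task's description and event timeline, not the actual tooling.

```python
#!/usr/bin/env python3
"""Hypothetical sketch: chunked DistCp from the analytics cluster to the backup cluster."""
import subprocess

SRC = "hdfs://analytics-hadoop"
DST = "hdfs://analytics-backup-hadoop"

# One DistCp job per top-level path keeps each job reasonably short;
# the paths below are the ones logged in the event timeline.
CHUNKS = [
    "/wmf/data/archive",
    "/user",
    "/wmf/camus",
    "/wmf/discovery",
    "/wmf/refinery",
    "/wmf/data/discovery",
    "/wmf/data/event",
    "/wmf/data/event_sanitized",
]

for path in CHUNKS:
    # -Ddfs.replication=2 asks for replication factor 2 on the backup
    # cluster (the analytics cluster default is 3); -update skips files
    # that are already present, so an interrupted chunk can be re-run.
    subprocess.run(
        ["hadoop", "distcp", "-Ddfs.replication=2", "-update",
         f"{SRC}{path}", f"{DST}{path}"],
        check=True,
    )
```

Running one DistCp job per directory, rather than one job over the whole namespace, also means a failure only invalidates that chunk's copy.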
| Status | Assigned | Task |
| --- | --- | --- |
| Resolved | JAllemandou | T168554 Default hive table creation to parquet - needs hive 2.3.0 |
| Resolved | elukey | T203498 Upgrade Hive to ≥ 2.0 |
| Resolved | elukey | T203693 Update to CDH 6 or other up-to-date Hadoop distribution |
| Resolved | elukey | T273711 Upgrade the Analytics Hadoop cluster to Apache Bigtop |
| Resolved | JAllemandou | T272846 Backup HDFS data before BigTop upgrade |
Event Timeline
Mentioned in SAL (#wikimedia-analytics) [2021-01-25T12:25:17Z] <joal> Copy /wmf/data/archive to backup cluster (32Tb) - T272846
Mentioned in SAL (#wikimedia-analytics) [2021-01-25T17:13:34Z] <joal> Copy /user to backup cluster (92Tb) - T272846
Mentioned in SAL (#wikimedia-analytics) [2021-01-26T08:42:42Z] <joal> Copy /wmf/camus to backup cluster (120Gb) - T272846
Mentioned in SAL (#wikimedia-analytics) [2021-01-26T09:01:44Z] <joal> Copy /wmf/discovery to backup cluster (120Gb) - T272846
Mentioned in SAL (#wikimedia-analytics) [2021-01-26T09:07:33Z] <joal> Copy /wmf/refinery to backup cluster (1.1Tb) - T272846
Mentioned in SAL (#wikimedia-analytics) [2021-01-26T09:35:40Z] <joal> Copy /wmf/data/discovery to backup cluster (21Tb) - T272846
Mentioned in SAL (#wikimedia-analytics) [2021-01-27T13:02:39Z] <joal> Copy /wmf/data/event to backup cluster (30Tb) - T272846
Mentioned in SAL (#wikimedia-analytics) [2021-01-29T07:44:20Z] <joal> Copy /wmf/data/event_sanitized to backup cluster (T272846)
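For reference, per-path sizes like those quoted in the log entries above can be checked with a disk-usage summary on each cluster; a minimal sketch, with /wmf/data/archive as an illustrative path:

```python
import subprocess

# hdfs dfs -du -s -h prints the logical size and the raw space consumed
# (size x replication factor), so the same path should show roughly 2/3
# of the source's raw usage on the backup cluster (replication 2 vs 3).
for cluster in ("hdfs://analytics-hadoop", "hdfs://analytics-backup-hadoop"):
    subprocess.run(
        ["hdfs", "dfs", "-du", "-s", "-h", f"{cluster}/wmf/data/archive"],
        check=True,
    )
```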