Page MenuHomePhabricator

Rename event_sanitized partition directories to lowercase
Closed, ResolvedPublic

Description

Work done in T273789 refactors Refine code to always normalize output partition paths, meaning they are always lowercased. This is at odds with the existent partitions in the event_sanitized database. We already (semi-accidentally) re-refined and normalized partition paths in the event database. To avoid all that extra work when we start to use the new RefineSanitize job, we should manually move and rename partition paths in event_sanitized now.

Event Timeline

Huh wow! Apparently Hive table and partition locations are also case insensitive!

I just did a test where I moved lowercased the external table's directory name, and all partitions and queries still worked as is!

I also altered the table LOCATION and did a MSCK REPAIR TABLE, and partitions needed to be added.

So, I believe this will not require any extensive DROP and ADD PARTITION statements, we'll only need to rename the HDFS paths for good measure alter the table LOCATION to use the lowercased directory names and we're good.

Mentioned in SAL (#wikimedia-analytics) [2021-04-21T20:04:05Z] <ottomata> renaming event_santized hive table directories to lower case and repairing table partition paths - T280813

Huh, my test must not have been representative. Marcel and I just paired to do this and a MSCK REPAIR TABLE was needed. Here's our procedure:


# On an-launcher1002 in /home/otto
mkdir event_sanitized_table_rename_T280813
cd event_sanitized_table_rename_T280813

mv_cmds_file=./event_sanitized_mv_cmds.sh
truncate -s 0 $mv_cmds_file
echo "#!/bin/bash" >> $mv_cmds_file
chmod 755 $mv_cmds_file

hive_drop_old_table_file=./event_sanitized_hive_drop_old_table.hql
truncate -s 0 $hive_drop_old_table_file

mkdir ./done;
# list out all paths in event_sanitized that need to be renamed.
for path_orig in $(sudo -u hdfs kerberos-run-command hdfs hdfs dfs -ls /wmf/data/event_sanitized | grep -vE '^Found ' | grep -vE 'netflow' | awk '{print $NF}'); do
    # lowercase the path_orig
    path_new=$(echo $path_orig | awk '{print tolower($0)}')
    hive_table=$(echo $path_new | awk -F '/' '{print $NF}')

    # Build a shell script of hdfs dfs -mv commands
    echo "hdfs dfs -mv $path_orig $path_new" >> $mv_cmds_file

    # Build Hive scripts to recreate the tables with new lowercased partitions and then repair the tables to recreate the partitions.
    hive_alter_sql_file=./event_sanitized_${hive_table}_alter.hql
    truncate -s 0 $hive_alter_sql_file

    # Build Hive scripts to recreate the tables with new lowercased partitions and then repair the tables to recreate the partitions.
    echo "SET mapred.child.java.opts=-Xmx4G -XX:+UseConcMarkSweepGC  -XX:-UseGCOverheadLimit;" >> $hive_alter_sql_file
    echo "ALTER TABLE event_sanitized.${hive_table} RENAME TO event_sanitized.${hive_table}_T280813;" >> $hive_alter_sql_file
    echo "CREATE EXTERNAL TABLE event_sanitized.${hive_table} LIKE event_sanitized.${hive_table}_T280813;" >> $hive_alter_sql_file
    echo "ALTER TABLE event_sanitized.${hive_table} SET LOCATION 'hdfs://analytics-hadoop/$path_new';" >> $hive_alter_sql_file
    echo "MSCK REPAIR TABLE event_sanitized.${hive_table};" >> $hive_alter_sql_file

    # Build a Hive scipt that drops the old upper cased partition location tables.
    echo "DROP TABLE event_sanitized.${hive_table}_T280813;" >> $hive_drop_old_table_file
done

# Run the mv commands
time sudo -u hdfs kerberos-run-command hdfs $mv_cmds_file && mv $mv_cmds_file ./done/

# Run alter and repair Hive statements for each table
for f in ./event_sanitized_*_alter.hql; do
    echo "---- $f ----"
    sudo -u analytics kerberos-run-command analytics hive -f $f && mv -v $f ./done/
done


# If all looks 100% good, drop the old no longer needed external tables.
time sudo -u analytics kerberos-run-command analytics hive -f $hive_drop_old_table_file

Unfortunately, not all completed successfully the first time due to OOM or GC errors. Will try to increase mem for those and try again.

Change 681784 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/puppet@production] sanitize_eventlogging_analytics_immediate - ensure absent during switchover

https://gerrit.wikimedia.org/r/681784

Change 681784 merged by Ottomata:

[operations/puppet@production] sanitize_eventlogging_analytics_immediate - ensure absent during switchover

https://gerrit.wikimedia.org/r/681784

Mentioned in SAL (#wikimedia-analytics) [2021-04-21T21:30:03Z] <ottomata> temporariliy disabling sanitize_eventlogging_analytics_delayed jobs until T280813 is completed (probably tomorrow)

ah ha, right. The partition changes needed to be done in batches. https://issues.apache.org/jira/browse/HIVE-12077

Setting SET hive.msck.repair.batch.size=5000; before running MSCK REPAIR TABLE seems to do the trick!

Setting SET hive.msck.repair.batch.size=5000; before running MSCK REPAIR TABLE seems to do the trick!

👏

Ah, found another step.

Last week, when I upgraded the Refine job, I inadvertently re-refined a lot of old data, which caused the input to event_sanitize to have mtimes changed. This means that RefineSanitize now sees all that data as new, and wants to re-refine. Additionally, since I upgraded Refine last week, the event table paths are now lowercase, which the old EventLoggingSanitization job did not see, so no sanitization has been happening since last Thursday!!!

The refactor to use RefineSanitize should catch this now that it uses RefineSanitizeMonitor (and it does).

I'm going to do 2 things.

  1. update the _REFINED done flag mtimes for all existent event_sanitized partitions between 2021-04-12T14:00:00Z and 2021-04-15T14:00:00Z. This is the period that was re-refined last week.
  2. Backfill refine event_sanitized from 2021-04-15T14:00:00Z until now using the new RefineSanitize job. This should catch event_sanitized back up.

Did:

import org.wikimedia.analytics.refinery.job.refine._
import com.github.nscala_time.time.Imports._
import org.joda.time.format.ISODateTimeFormat
import org.apache.hadoop.fs.Path

val isoDateTimeFormatter = ISODateTimeFormat.dateTimeParser()

# Had to actually split this into smaller chunks; default non yarn spark shell couldn't handle all partitions in this range at once.
val since = DateTime.parse("2021-04-12T14:00:00Z", isoDateTimeFormatter)
val until = DateTime.parse("2021-04-15T14:00:00Z", isoDateTimeFormatter)

val rts = RefineTarget.find(
        spark,
        "event",
        new Path("/wmf/data/event_sanitized"),
        "event_sanitized",
        since,
        until,
        None,
        None
    )

val doneRts = rts.filter(_.doneFlagExists).par
doneRts.foreach(_.writeDoneFlag)

Backfilling:

sudo -u analytics kerberos-run-command analytics refine_event_sanitized_analytics_immediate --since='2021-04-15T14:00:00Z' --until=2
21/04/22 21:42:14 INFO Refine: Successfully refined 20 of 20 dataset partitions into table `event_sanitized`.`netflow` (total # refined records: 213548205)
21/04/22 21:42:14 INFO Refine: Successfully refined 7 of 7 dataset partitions into table `event_sanitized`.`readingdepth` (total # refined records: 10)
21/04/22 21:42:14 INFO Refine: Successfully refined 10 of 10 dataset partitions into table `event_sanitized`.`mobilewikiapplangselect` (total # refined records: 26)
21/04/22 21:42:14 INFO Refine: Successfully refined 65 of 65 dataset partitions into table `event_sanitized`.`mobilewikiappnavmenu` (total # refined records: 202)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`prefupdate` (total # refined records: 35910)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappiosuserhistory` (total # refined records: 72268)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewebuiactionstracking` (total # refined records: 384950)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`visualeditorfeatureuse` (total # refined records: 1982496)
21/04/22 21:42:14 INFO Refine: Successfully refined 150 of 150 dataset partitions into table `event_sanitized`.`mobilewikiappprotectededitattempt` (total # refined records: 482)
21/04/22 21:42:14 INFO Refine: Successfully refined 127 of 127 dataset partitions into table `event_sanitized`.`uploadwizardexceptionflowevent` (total # refined records: 632)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappshareafact` (total # refined records: 2358068)
21/04/22 21:42:14 INFO Refine: Successfully refined 91 of 91 dataset partitions into table `event_sanitized`.`guidedtourexternallinkactivation` (total # refined records: 708)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`guidedtourguiderimpression` (total # refined records: 46651)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`cpubenchmark` (total # refined records: 6517250)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`editattemptstep` (total # refined records: 6018963)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`referencepreviewscite` (total # refined records: 202982)
21/04/22 21:42:14 INFO Refine: Successfully refined 168 of 168 dataset partitions into table `event_sanitized`.`specialmutesubmit` (total # refined records: 19)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`changeslistfiltergrouping` (total # refined records: 169285)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiapplanguagesettings` (total # refined records: 94037)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`guidedtourexited` (total # refined records: 16836)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappsuggestededits` (total # refined records: 27307)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiapppagescroll` (total # refined records: 1512259)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappiosfeed` (total # refined records: 1068464)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`gettingstartedredirectimpression` (total # refined records: 23203)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`quicksurveyinitiation` (total # refined records: 6155872)
21/04/22 21:42:14 INFO Refine: Successfully refined 132 of 132 dataset partitions into table `event_sanitized`.`flowreplies` (total # refined records: 637)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewebsearch` (total # refined records: 203165)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`quicksurveysresponses` (total # refined records: 136399)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappreadinglists` (total # refined records: 620674)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`resourcetiming` (total # refined records: 6837898)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`referencepreviewspopups` (total # refined records: 2074749)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`firstinputtiming` (total # refined records: 3756467)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiapplinkpreview` (total # refined records: 56586409)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`layoutshift` (total # refined records: 12055375)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappinstallreferrer` (total # refined records: 115086)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`uploadwizardflowevent` (total # refined records: 4407)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappdailystats` (total # refined records: 5553271)
21/04/22 21:42:14 INFO Refine: Successfully refined 5 of 5 dataset partitions into table `event_sanitized`.`mobilewikiapponboarding` (total # refined records: 7)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`echointeraction` (total # refined records: 482057)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappedit` (total # refined records: 233668)
21/04/22 21:42:14 INFO Refine: Successfully refined 166 of 166 dataset partitions into table `event_sanitized`.`mobilewikiappwiktionarypopup` (total # refined records: 1030)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`navigationtiming` (total # refined records: 9719649)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappfeed` (total # refined records: 322456)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiapptabs` (total # refined records: 124754)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`wikipediaportal` (total # refined records: 114547)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`guidedtourbuttonclick` (total # refined records: 29118)
21/04/22 21:42:14 INFO Refine: Successfully refined 70 of 70 dataset partitions into table `event_sanitized`.`mobilewikiappnotificationpreferences` (total # refined records: 110)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`rumspeedindex` (total # refined records: 6861673)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`twocolconflictconflict` (total # refined records: 13265)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiapplanguagesearching` (total # refined records: 49694)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`echomail` (total # refined records: 46725)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappfindinpage` (total # refined records: 205694)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiapprandomizer` (total # refined records: 16845)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`guidedtourguiderhidden` (total # refined records: 5727)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`kaiosappfirstrun` (total # refined records: 26514)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiapptocinteraction` (total # refined records: 2249575)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`uploadwizarderrorflowevent` (total # refined records: 89573)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`savetiming` (total # refined records: 1721484)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`centralnoticetiming` (total # refined records: 321336)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiapplogin` (total # refined records: 100917)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`editoractivation` (total # refined records: 17906)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`uploadwizarduploadflowevent` (total # refined records: 139639)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappmediagallery` (total # refined records: 70018)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappnotificationinteraction` (total # refined records: 10639)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappsearch` (total # refined records: 421115)
21/04/22 21:42:14 INFO Refine: Successfully refined 14 of 14 dataset partitions into table `event_sanitized`.`guidedtourinternallinkactivation` (total # refined records: 22)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mediawikipingback` (total # refined records: 2003154)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`externalguidance` (total # refined records: 2456)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`uploadwizardtutorialactions` (total # refined records: 72035)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiapparticlesuggestions` (total # refined records: 3662)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappsessions` (total # refined records: 92039579)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`twocolconflictexit` (total # refined records: 1498)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappsavedpages` (total # refined records: 20709)
21/04/22 21:42:14 INFO Refine: Successfully refined 43 of 43 dataset partitions into table `event_sanitized`.`popups` (total # refined records: 47)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappcreateaccount` (total # refined records: 38914)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`serversideaccountcreation` (total # refined records: 68599)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`templatedataeditor` (total # refined records: 2290)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappintents` (total # refined records: 188205)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`painttiming` (total # refined records: 14440049)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`desktopwebuiactionstracking` (total # refined records: 4533772)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`uploadwizardstep` (total # refined records: 169265)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappfeedconfigure` (total # refined records: 18951)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`mobilewikiappappearancesettings` (total # refined records: 221517)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`centralnoticebannerhistory` (total # refined records: 811963)
21/04/22 21:42:14 INFO Refine: Successfully refined 2 of 2 dataset partitions into table `event_sanitized`.`mobilewikiappwidgets` (total # refined records: 3)
21/04/22 21:42:14 INFO Refine: Successfully refined 168 of 168 dataset partitions into table `event_sanitized`.`specialinvestigate` (total # refined records: 238)
21/04/22 21:42:14 INFO Refine: Successfully refined 169 of 169 dataset partitions into table `event_sanitized`.`referencepreviewsbaseline` (total # refined records: 978295)

Ok, should be good here. Will check up on the RefineSanitize jobs in the morning.

Mentioned in SAL (#wikimedia-analytics) [2021-05-03T13:59:40Z] <ottomata> dropped all obselete (upper cased location) event_santizied.*_T280813 tables created for T280813