mylvmbackup on an-coord1001 not working
Closed, Resolved | Public | 3 Estimated Story Points

Description

20190713 08:00:01 Info: Running: lvcreate -s --size=5G --name=mysql_snapshot /dev/an-coord1001-vg/mysql
File descriptor 3 (/run/lock/mylvmbackup-analytics-meta) leaked on lvcreate invocation. Parent PID 152942: /usr/bin/perl
  Using default stripesize 64.00 KiB.
  Volume group "an-coord1001-vg" has insufficient free space (242 extents): 1280 required.
20190713 08:00:01 Error: FAILED: taking LVM snapshot (exit status 5)
20190713 08:00:01 Info: Unlocking tables...
20190713 08:00:01 Info: Disconnecting from database...
20190713 08:00:01 Error: Could not create snapshot volume mysql_snapshot
20190713 08:00:01 Info: Cleaning up...
elukey@an-coord1001:~$ sudo pvs
  PV         VG              Fmt  Attr PSize   PFree
  /dev/md2   an-coord1001-vg lvm2 a--  175.95g 968.00m

elukey@an-coord1001:~$ df -h
Filesystem                           Size  Used Avail Use% Mounted on
udev                                  16G     0   16G   0% /dev
tmpfs                                3.2G  296M  2.9G  10% /run
/dev/md0                              46G   24G   21G  54% /
tmpfs                                 16G     0   16G   0% /dev/shm
tmpfs                                5.0M     0  5.0M   0% /run/lock
tmpfs                                 16G     0   16G   0% /sys/fs/cgroup
fuse_dfs                             2.3P  1.6P  711T  70% /mnt/hdfs
/dev/mapper/an--coord1001--vg-mysql   59G   21G   39G  35% /var/lib/mysql
/dev/mapper/an--coord1001--vg-srv    113G   16G   98G  14% /srv
tmpfs                                3.2G     0  3.2G   0% /run/user/124
tmpfs                                3.2G     0  3.2G   0% /run/user/119
tmpfs                                3.2G     0  3.2G   0% /run/user/13926
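
The numbers in the log and in the pvs output line up if the volume group uses 4 MiB physical extents (the LVM default; this is an assumption, since the extent size is not shown above): a 5G snapshot is 5120 MiB / 4 = 1280 extents, while 968 MiB free is only 242 extents. A minimal sketch of that arithmetic, using hypothetical helper names:

```shell
#!/bin/bash
# Hypothetical helpers, not part of mylvmbackup or LVM.
# The 4 MiB default extent size is an assumption (LVM default).

# Print the number of extents needed for SIZE_MIB, rounded up.
extents_needed() {
    local size_mib=$1 extent_mib=${2:-4}
    echo $(( (size_mib + extent_mib - 1) / extent_mib ))
}

# Succeed only if FREE_EXTENTS can hold a volume of SIZE_MIB.
snapshot_fits() {
    local free_extents=$1 size_mib=$2 extent_mib=${3:-4}
    [ "$free_extents" -ge "$(extents_needed "$size_mib" "$extent_mib")" ]
}

extents_needed 5120   # 5G snapshot  -> 1280
extents_needed 968    # 968 MiB free -> 242
if snapshot_fits 242 5120; then
    echo "snapshot fits"
else
    echo "insufficient free space"
fi
```

With the values from this task the last check prints "insufficient free space", matching the lvcreate error.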

Proposed fix:

e2fsck -fy /dev/mapper/an--coord1001--vg-srv (seems needed before proceeding)
resize2fs /dev/mapper/an--coord1001--vg-srv 103G
lvreduce -L -10G /dev/mapper/an--coord1001--vg-srv
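
As a rough sanity check on the proposal, shrinking the srv volume by 10G should free far more than the snapshot needs. The sketch below again assumes 4 MiB extents, and the function name is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper; the 4 MiB extent size is an assumption (LVM default).

# Print the free extents expected after reducing an LV by REDUCE_MIB.
free_after_reduce() {
    local free_extents=$1 reduce_mib=$2 extent_mib=${3:-4}
    echo $(( free_extents + reduce_mib / extent_mib ))
}

required=1280                              # from the lvcreate error
after=$(free_after_reduce 242 10240)       # 242 free now, minus 10G from srv
echo "free extents after lvreduce: $after"
if [ "$after" -ge "$required" ]; then
    echo "enough room for the 5G snapshot"
fi
```

Under that assumption the volume group would end up with 2802 free extents against the 1280 required.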

Event Timeline

elukey triaged this task as High priority. Jul 13 2019, 9:01 AM
elukey created this task.

To keep archives happy:

  • /srv needed to be unmounted, but mariadb was holding inodes on it (/srv/tmp is its tmp directory)
  • mariadb was stopped, its tmp dir was temporarily switched to /tmp, and it was restarted. Oozie, Hive, etc. were stopped as well.
  • /srv was unmounted, then: e2fsck -fy /dev/mapper/an--coord1001--vg-srv, resize2fs /dev/mapper/an--coord1001--vg-srv 103G, lvreduce -L -10G /dev/mapper/an--coord1001--vg-srv
  • mariadb was stopped again, reconfigured and restarted together with oozie/hive.
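
The order of the storage commands matters: the filesystem is checked and shrunk first, and only then is the logical volume reduced, so the volume never ends up smaller than the filesystem on it. A dry-run sketch of that sequence follows. The function name is hypothetical, it only prints the commands rather than running them, and stopping or reconfiguring mariadb, oozie and hive is deliberately left out:

```shell
#!/bin/bash
# Dry-run sketch: prints the commands in order, executes nothing.
# shrink_plan is a hypothetical helper, not an existing tool.

shrink_plan() {
    local dev=$1 mnt=$2 fs_size=$3 reduce_by=$4
    echo "umount $mnt"                    # filesystem must be offline to shrink
    echo "e2fsck -fy $dev"                # forced check before resizing
    echo "resize2fs $dev $fs_size"        # shrink the filesystem first
    echo "lvreduce -L -$reduce_by $dev"   # then shrink the logical volume
}

shrink_plan /dev/mapper/an--coord1001--vg-srv /srv 103G 10G
```

Printing the plan first makes it easy to review the device path and sizes before anything destructive is run by hand.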

I forced a manual run of mylvmbackup and it worked :)

elukey changed the point value for this task from 0 to 3.
elukey moved this task from Next Up to Done on the Analytics-Kanban board.
Milimetric moved this task from Incoming to Operational Excellence on the Analytics board.