Page MenuHomePhabricator

toolforge: apt errors cronspam
Closed, ResolvedPublic

Description

As part of T181647 in modules/apt/manifests/unattendedupgrades.pp we are installing apt-show-versions. The package is part of our build for Jessie and Stretch but not so for Trusty (most of Toolforge still) and it keeps tripping on other apt operations intermitently.

/etc/cron.daily/apt-show-versions

Jessie

tools-worker-1010

#!/bin/sh

[ -f /usr/bin/apt-show-versions ] || exit 0

apt-show-versions -i

Now here is where it gets interesting.

  • Trusty**

tools-exec-1405

#!/bin/sh
#
# Note: This file is managed by Puppet
#
[ -f /usr/bin/apt-show-versions ] || exit 0

apt-show-versions -i > /dev/null 2>&1

This file is managed by Puppet? Not currently it seems but at some point?

We already have Puppet updating the Apt metadata periodically, and apt-show-versions trying to do it once a day doesn't seem useful to me.

I'm thinking we should ensure => 'absent' for /etc/cron.daily/apt-show-versions in modules/apt/manifests/unattendedupgrades.pp.

Event Timeline

chasemp triaged this task as Medium priority.Feb 1 2018, 3:54 PM

Cron <root@tools-webgrid-lighttpd-1409> test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )

/etc/cron.daily/apt:
Traceback (most recent call last):
  File "/usr/bin/unattended-upgrade", line 1200, in <module>
    main(options)
  File "/usr/bin/unattended-upgrade", line 938, in main
    allowed_origins=allowed_origins)
  File "/usr/bin/unattended-upgrade", line 84, in __init__
    apt.Cache.__init__(self, rootdir=rootdir)
  File "/usr/lib/python3/dist-packages/apt/cache.py", line 107, in __init__
    self.open(progress)
  File "/usr/lib/python3/dist-packages/apt/cache.py", line 153, in open
    self._records = apt_pkg.PackageRecords(self._cache)
SystemError: E:Could not open file /var/lib/apt/lists/nova.clouds.archive.ubuntu.com_ubuntu_dists_trusty_main_i18n_Translation-en - open (2: No such file or directory), E:Could not open file /var/lib/apt/lists/nova.clouds.archive.ubuntu.com_ubuntu_dists_trusty_universe_binary-amd64_Packages - open (2: No such file or directory), E:Could not open file /var/lib/apt/lists/nova.clouds.archive.ubuntu.com_ubuntu_dists_trusty_main_binary-amd64_Packages - open (2: No such file or directory)

Cron <root@tools-checker-01> test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
run-parts: /etc/cron.daily/apt-show-versions exited with return code 255

Change 407456 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] apt: disable daily cron job from apt-show-versions

https://gerrit.wikimedia.org/r/407456

Change 407456 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] apt: disable daily cron job from apt-show-versions

https://gerrit.wikimedia.org/r/407456

Mentioned in SAL (#wikimedia-cloud) [2018-02-05T13:06:18Z] <arturo> deploying fix for T186230 using clush

We still have some issues, cronspam:

tools-exec-1401

/etc/cron.daily/apt:
Traceback (most recent call last):
  File "/usr/bin/unattended-upgrade", line 1200, in <module>
    main(options)
  File "/usr/bin/unattended-upgrade", line 938, in main
    allowed_origins=allowed_origins)
  File "/usr/bin/unattended-upgrade", line 84, in __init__
    apt.Cache.__init__(self, rootdir=rootdir)
  File "/usr/lib/python3/dist-packages/apt/cache.py", line 107, in __init__
    self.open(progress)
  File "/usr/lib/python3/dist-packages/apt/cache.py", line 153, in open
    self._records = apt_pkg.PackageRecords(self._cache)
SystemError: E:Could not open file /var/lib/apt/lists/security.ubuntu.com_ubuntu_dists_trusty-security_universe_i18n_Translation-en - open (2: No such file or directory), E:Could not open file /var/lib/apt/lists/security.ubuntu.com_ubuntu_dists_trusty-security_main_i18n_Translation-en - open (2: No such file or directory), E:Could not open file /var/lib/apt/lists/nova.clouds.archive.ubuntu.com_ubuntu_dists_trusty-backports_main_binary-amd64_Packages - open (2: No such file or directory)

tools-exec-1442

/etc/cron.daily/apt:
Traceback (most recent call last):
  File "/usr/bin/unattended-upgrade", line 1255, in <module>
    main(options)
  File "/usr/bin/unattended-upgrade", line 979, in main
    allowed_origins=allowed_origins)
  File "/usr/bin/unattended-upgrade", line 84, in __init__
    apt.Cache.__init__(self, rootdir=rootdir)
  File "/usr/lib/python3/dist-packages/apt/cache.py", line 107, in __init__
    self.open(progress)
  File "/usr/lib/python3/dist-packages/apt/cache.py", line 151, in open
    self._cache = apt_pkg.Cache(progress)
SystemError: E:Problem renaming the file /var/cache/apt/srcpkgcache.bin.QskC72 to /var/cache/apt/srcpkgcache.bin - rename (2: No such file or directory), E:Problem renaming the file /var/cache/apt/pkgcache.bin.qTsVeg to /var/cache/apt/pkgcache.bin - rename (2: No such file or directory), E:The package lists or status file could not be parsed or opened.

tools-services-02

/etc/cron.daily/apt:
Traceback (most recent call last):
  File "/usr/bin/unattended-upgrade", line 1200, in <module>
    main(options)
  File "/usr/bin/unattended-upgrade", line 938, in main
    allowed_origins=allowed_origins)
  File "/usr/bin/unattended-upgrade", line 84, in __init__
    apt.Cache.__init__(self, rootdir=rootdir)
  File "/usr/lib/python3/dist-packages/apt/cache.py", line 107, in __init__
    self.open(progress)
  File "/usr/lib/python3/dist-packages/apt/cache.py", line 153, in open
    self._records = apt_pkg.PackageRecords(self._cache)
SystemError: E:Could not open file /var/lib/apt/lists/security.ubuntu.com_ubuntu_dists_trusty-security_universe_i18n_Translation-en - open (2: No such file or directory), E:Could not open file /var/lib/apt/lists/security.ubuntu.com_ubuntu_dists_trusty-security_main_i18n_Translation-en - open (2: No such file or directory), E:Could not open file /var/lib/apt/lists/security.ubuntu.com_ubuntu_dists_trusty-security_universe_binary-amd64_Packages - open (2: No such file or directory), E:Could not open file /var/lib/apt/lists/security.ubuntu.com_ubuntu_dists_trusty-security_main_binary-amd64_Packages - open (2: No such file or directory)

According to google, there seem to be some kind of race condition

Could this be puppet's apt cache update and apt own cron run clashing?

The ls command don't show the proper date of file creation:

aborrero@tools-exec-1401:~$ ls --full-time /var/lib/apt/lists/nova.clouds.archive.ubuntu.com_ubuntu_dists_trusty-backports_main_binary-amd64_Packages
-rw-r--r-- 1 root root 59079 2017-12-17 09:45:00.000000000 +0000 /var/lib/apt/lists/nova.clouds.archive.ubuntu.com_ubuntu_dists_trusty-backports_main_binary-amd64_Packages

That's 'content modification' or something like that, which is the same date included in the Debian archive file.

Using stat we can see that files are actually overwritten every time the cache is updated:

aborrero@tools-exec-1401:~$ stat -c '%z' /var/lib/apt/lists/*
2018-02-08 12:44:41.898117231 +0000
2018-02-08 12:44:41.946117630 +0000
2018-02-08 12:44:41.926117464 +0000
2018-02-08 12:44:41.970117830 +0000
2018-02-08 12:44:41.938117563 +0000
2018-02-08 12:44:41.958117730 +0000
2018-02-08 12:44:41.938117563 +0000
2018-02-08 11:39:46.517743179 +0000
2018-02-08 12:44:41.970117830 +0000
2018-02-08 12:44:42.502122255 +0000
2018-02-08 12:44:42.534122521 +0000
2018-02-08 12:44:42.498122222 +0000
2018-02-08 12:44:42.534122521 +0000
2018-02-08 12:44:42.534122521 +0000
2018-02-08 12:44:42.502122255 +0000
2018-02-08 12:44:42.502122255 +0000
2018-02-08 12:44:42.534122521 +0000
2018-02-08 12:44:42.498122222 +0000
2018-02-08 12:44:42.526122455 +0000
2018-02-08 12:44:42.538122554 +0000
2018-02-08 12:44:42.502122255 +0000
2018-02-08 12:44:43.686132103 +0000
2018-02-08 12:44:44.518139024 +0000
2018-02-08 12:44:42.658123552 +0000
2018-02-08 12:44:41.990117996 +0000
2018-02-08 12:44:41.990117996 +0000
2018-02-08 12:44:44.426138259 +0000
2018-02-08 12:44:44.966142751 +0000
2018-02-08 12:44:43.462130240 +0000
2018-02-08 12:44:41.934117531 +0000
2018-02-08 12:44:42.238120059 +0000
2018-02-08 12:44:42.466121956 +0000
2018-02-08 12:44:42.042118429 +0000
2018-02-08 12:44:42.298120558 +0000
2018-02-08 12:44:42.494122188 +0000
2018-02-08 12:44:42.066118629 +0000
2018-02-08 12:44:45.118144015 +0000
2018-02-08 12:44:42.042118429 +0000
2018-02-08 12:44:42.778124551 +0000
2018-02-08 12:44:45.106143915 +0000
2018-02-08 12:44:42.522122421 +0000
2018-02-08 12:44:44.550139290 +0000
2018-02-08 12:44:45.118144015 +0000
2018-02-08 12:44:42.534122521 +0000
2018-02-08 12:44:41.906117298 +0000
2018-02-08 12:44:41.878117065 +0000

Which may confirm the race condition if we run stat after the cronspam email and the date is the same.

This is my guess:

  • puppet runs every 30 minutes or so. This run updates the apt cache by means of something I ignore right now.
  • apt's cron runs daily at a random time (there is a random sleep inside the cron script)
  • if both collide (hey, it's random), then these error happens.

A possible fix, if we confirm this race condition, is to simply disable the apt daily cron, which I'm not sure of the benefit it produces anyway. And let puppet update our apt cache.
If we follow this approach, we should evaluate how this could affect the ability of unattended-upgrades to actually upgrade packages (i.e, AFAIK it's run from the apt daily cron).

Also, I'm not sure if it worth digging more into the issue, given the old versions of both apt and unattended-upgrades we are running in trusty.

aborrero renamed this task from apt-show-versions daily cron often failing with conflict to toolforge: apt errors cronspam.Feb 8 2018, 1:05 PM

The only other thing I can think of is to make the normal apt cron fail gracefully if it is in conflict with another apt operation.

Change 415843 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] apt: unattended-upgrades: don't install apt-show-upgrades

https://gerrit.wikimedia.org/r/415843

Change 415843 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] apt: unattended-upgrades: don't install apt-show-upgrades

https://gerrit.wikimedia.org/r/415843

We didn't receive more cronspam. Also, with the patch above, we no longer require apt-show-upgrades being installed with unattended-upgrades.

I think we can close this task.