Cleanup debconf handling in mailman puppet setup
Open, Medium, Public

Description

Earlier today I tried to install the mailman security update, which triggered a debconf dialogue warning about an existing installation. To avoid potentially overwriting an unpuppetised local configuration, I selected the offered "abort" to investigate further, which quit the preinst. That left no apparent changes to the installation.

Later on there was an Icinga alert for qrunner not running, so I restarted mailman. When qrunner disappeared again later on, I noticed that the puppet run had triggered a reload of the mailman service, which led to qrunner not running until I restarted mailman manually:

Notice: /Stage[main]/Ssh::Client/File[/etc/ssh/ssh_known_hosts]/content: content changed '{md5}318b42d6506e93f9cb45adbd3d2e0e5b' to '{md5}1c4de80271ec0c284007948a7d37daab'
Notice: /Stage[main]/Mailman::Listserve/Exec[dpkg-reconfigure mailman]: Triggered 'refresh' from 1 events
Notice: /Stage[main]/Mailman::Listserve/File[/etc/mailman/mm_cfg.py]/mode: mode changed '0644' to '0444'
Info: /Stage[main]/Mailman::Listserve/File[/etc/mailman/mm_cfg.py]: Scheduling refresh of Service[mailman]
Notice: /Stage[main]/Mailman::Listserve/Service[mailman]: Triggered 'refresh' from 1 events
Notice: Finished catalog run in 191.11 seconds
Info: Sleeping for 37 seconds (splay is enabled)
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for fermium.wikimedia.org
Info: Applying configuration version '1473251029'
Notice: /Stage[main]/Mailman::Listserve/Debconf::Set[mailman/site_languages]/Exec[debconf-communicate set mailman/site_languages]/returns: executed successfully
Info: Debconf::Set[mailman/site_languages]: Scheduling refresh of Exec[dpkg-reconfigure mailman]
Notice: /Stage[main]/Ssh::Client/File[/etc/ssh/ssh_known_hosts]/content:

This is likely somehow triggered by the debconf setting mailman/site_languages, so I've stopped puppet on fermium until this is debugged further.

Event Timeline

Restricted Application added a subscriber: Aklapper. Sep 7 2016, 2:50 PM

It's a problem in the puppet module: it hardcodes a number of debconf choices, but e.g. Asturian is not available when running "dpkg-reconfigure mailman" manually. This possibly depends on the installed locales and needs more investigation.

debconf::set { 'mailman/site_languages':
    value  => 'ar, ast, ca, cs, da, de, en, es, et, eu, fi, fr, gl, he, hr, hu, ia, it, ja, ko, lt, nl, no, pl, pt, pt_BR, ro, ru, sk, sl, sr, sv, tr, uk, vi, zh_CN, zh_TW',
    notify => Exec['dpkg-reconfigure mailman'],
}
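
To see which hardcoded entries the package does not actually offer, the two lists can be diffed. The following is only a minimal Python sketch: the function name `unavailable_choices` is made up for illustration, and the `offered` value is a shortened, hypothetical example, not the list fermium really presents.

```python
def unavailable_choices(configured: str, offered: str) -> list[str]:
    """Return configured entries missing from the offered choices, in configured order."""
    offered_set = {c.strip() for c in offered.split(",") if c.strip()}
    return [
        c.strip()
        for c in configured.split(",")
        if c.strip() and c.strip() not in offered_set
    ]


# Shortened, hypothetical inputs: "configured" stands in for the puppet value,
# "offered" for whatever choices dpkg-reconfigure presents on the host.
configured = "ar, ast, ca"
offered = "ar, ca, cs"
print(unavailable_choices(configured, offered))
```

Any entry this reports would be a candidate for removal from the puppet value, or a hint that a locale is missing on the host.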

Change 310746 had a related patch set uploaded (by Muehlenhoff):
Update list of mailman site languages

https://gerrit.wikimedia.org/r/310746

Change 310746 merged by Dzahn:
Update list of mailman site languages

https://gerrit.wikimedia.org/r/310746

Dzahn added a comment. Sep 16 2016, 6:16 PM

I merged that and re-enabled puppet on fermium.

Notice: /Stage[main]/Mailman::Listserve/Debconf::Set[mailman/default_server_language]/Exec[debconf-communicate set mailman/default_server_language]/returns: executed successfully
Info: Debconf::Set[mailman/default_server_language]: Scheduling refresh of Exec[dpkg-reconfigure mailman]

It took a while at this step, then continued:

Info: /Stage[main]/Mailman::Listserve/File[/etc/mailman/mm_cfg.py]: Scheduling refresh of Service[mailman]
Notice: /Stage[main]/Mailman::Listserve/Service[mailman]: Triggered 'refresh' from 1 events

qrunner is running:

list 20892 0.1 0.1 78608 16276 ? S 18:15 0:00 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=ArchRunner:0:1 -s
list 20893 0.0 0.2 79548 17256 ? S 18:15 0:00 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=BounceRunner:0:1 -s
... etc..

Dzahn closed this task as Resolved. Sep 16 2016, 6:21 PM
Dzahn reopened this task as Open. Sep 16 2016, 7:29 PM

ehhh... uhmm.. a little while later Icinga tells me:

PROCS CRITICAL: 0 processes with UID = 38 (list), regex args '/mailman/bin/qrunner'

PROCS CRITICAL: 0 processes with UID = 38 (list), regex args '/mailman/bin/mailmanctl'

:/

Mentioned in SAL (#wikimedia-operations) [2016-09-16T19:30:59Z] <mutante> fermium starting mailman qrunner (T144933)

Dzahn added a comment. Sep 16 2016, 7:34 PM

(disabled puppet again)

Joe added a subscriber: Joe. Sep 21 2016, 9:44 AM

So no one looked into this in the last few days?

I am going to need to have puppet running for the puppetdb migration, so looking into this now.

Joe added a comment. Sep 21 2016, 10:00 AM

The issue here is that debconf::set is very, very primitive.

Specifically, the languages weren't listed in the same order in debconf and in puppet:

root@fermium:~# echo get mailman/site_languages | debconf-communicate 
0 sk, gl, fa, ast, ar, ca, cs, da, de, en, es, et, eu, fi, fr, he, hr, hu, ia, it, ja, ko, lt, nl, no, pl, pt, pt_BR, ro, ru, sl, sr, sv, tr, uk, vi, zh_CN, zh_TW

As a quick patch, I'm just reproducing this order in puppet, but debconf::set probably needs to be rewritten as a proper type/resource.
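
To illustrate the kind of comparison a rewritten resource might do, here is a minimal Python sketch that treats the language selection as a set, so ordering alone never counts as a change. The helper names are hypothetical, and this is not how debconf::set currently works; the only thing taken from the output above is the reply format of debconf-communicate (a status code, a space, then the value).

```python
def parse_choices(value: str) -> list[str]:
    """Split a debconf multiselect value such as 'sk, gl, fa' into its entries."""
    return [entry.strip() for entry in value.split(",") if entry.strip()]


def reply_value(reply: str) -> str:
    """Extract the value from a debconf-communicate reply like '0 sk, gl, fa'."""
    status, _, value = reply.partition(" ")
    if status != "0":
        raise ValueError(f"debconf returned status {status!r}")
    return value


def needs_update(desired: str, reply: str) -> bool:
    """True only if the desired and stored selections differ as sets."""
    return set(parse_choices(desired)) != set(parse_choices(reply_value(reply)))


# Same languages in a different order: no change, so nothing to refresh.
print(needs_update("ar, ast, sk", "0 sk, ast, ar"))      # False
# A language present on only one side is a real difference.
print(needs_update("ar, ast, sk", "0 sk, fa, ast, ar"))  # True
```

With a check like this, the set command and the dpkg-reconfigure refresh would only fire when the selection really differs, not on every run.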

Change 311950 had a related patch set uploaded (by Giuseppe Lavagetto):
mailman::listserve: reproduce the debconf order from fermium

https://gerrit.wikimedia.org/r/311950

Change 311950 merged by Giuseppe Lavagetto:
mailman::listserve: reproduce the debconf order from fermium

https://gerrit.wikimedia.org/r/311950

Joe added a comment. Sep 21 2016, 10:07 AM

Now puppet runs fine on fermium and doesn't stop/start qrunner at each iteration, but I'll leave the ticket open because this is in need of some serious reengineering.

MoritzMuehlenhoff renamed this task from "puppet run stopping qrunner on fermium" to "Cleanup debconf handling in mailman puppet setup". Nov 11 2016, 12:46 PM
MoritzMuehlenhoff removed MoritzMuehlenhoff as the assignee of this task.
Volans triaged this task as Medium priority. Nov 23 2016, 9:01 AM
Peachey88 updated the task description.