mcelog is deprecated in kernel >= 4.12
Closed, Resolved (Public)

Description

https://github.com/torvalds/linux/commit/5de97c9f6d85fd83af76e09e338b18e7adb1ae60

x86/mce: Factor out and deprecate the /dev/mcelog driver

Move all code relating to /dev/mcelog to a separate source file.
/dev/mcelog driver can now operate from the machine check notifier with
lowest prio.

Signed-off-by: Tony Luck <tony.luck@intel.com>
[ Move the mce_helper and trigger functionality behind CONFIG_X86_MCELOG_LEGACY. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170327093304.10683-6-bp@alien8.de
[ Renamed CONFIG_X86_MCELOG to CONFIG_X86_MCELOG_LEGACY. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Event Timeline

Change 462613 had a related patch set uploaded (by GTirloni; owner: GTirloni):
[operations/puppet@production] Do not enable mcelog if kernel >= 4.12

https://gerrit.wikimedia.org/r/462613
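
For illustration only, the condition named in the patch title can be sketched in Python. This is not the content of the Gerrit change (which targets the Puppet repository); the function names `kernel_version` and `mcelog_device_deprecated` are hypothetical, and the check assumes the release string starts with `major.minor`. A version comparison is only a heuristic, since a kernel built with CONFIG_X86_MCELOG_LEGACY may still provide /dev/mcelog.

```python
import platform
import re


def kernel_version(release):
    """Extract (major, minor) from a release string such as '4.14.0-0.bpo.3-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def mcelog_device_deprecated(release=None):
    """Return True if the kernel is 4.12 or newer (hypothetical helper)."""
    if release is None:
        # Fall back to the running kernel, e.g. the output of `uname -r`.
        release = platform.release()
    # Tuple comparison orders (4, 9) < (4, 12) correctly, unlike string comparison.
    return kernel_version(release) >= (4, 12)


if __name__ == "__main__":
    print(mcelog_device_deprecated())
```

Comparing integer tuples instead of raw strings matters here: as strings, "4.9" would sort after "4.12".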

Side-effect of trying to run mcelog on kernel >= 4.12:

---------- Forwarded message ----------
From: Cron Daemon <root@backup2001.codfw.wmnet>
Date: Mon, Sep 24, 2018 at 6:23 PM
Subject: Cron <root@backup2001> /usr/local/sbin/wmf-auto-restart -s mcelog
To: root@backup2001.codfw.wmnet


Traceback (most recent call last):
  File "/usr/local/sbin/wmf-auto-restart", line 142, in <module>
    sys.exit(main())
  File "/usr/local/sbin/wmf-auto-restart", line 138, in main
    return check_restart(args.servicename, args.dryrun)
  File "/usr/local/sbin/wmf-auto-restart", line 59, in check_restart
    pid_query = subprocess.check_output(["/bin/pidof", service_name], universal_newlines=True)
  File "/usr/lib/python3.5/subprocess.py", line 316, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.5/subprocess.py", line 398, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/bin/pidof', 'mcelog']' returned non-zero exit status 1

This is due to the kernel upgrade in T196477.
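
The traceback above comes from `subprocess.check_output` raising `CalledProcessError` because `/bin/pidof` exits with status 1 when no matching process exists. A minimal sketch of tolerating that case follows; it is not the actual wmf-auto-restart code, and the helper name `find_pids` and its `pidof` parameter are hypothetical.

```python
import subprocess


def find_pids(service_name, pidof="/bin/pidof"):
    """Return the PIDs of service_name, or an empty list if it is not running."""
    try:
        output = subprocess.check_output(
            [pidof, service_name], universal_newlines=True
        )
    except subprocess.CalledProcessError as error:
        # pidof exits with status 1 when no program with that name was found.
        if error.returncode == 1:
            return []
        # Any other failure is unexpected, so let it propagate.
        raise
    return [int(pid) for pid in output.split()]
```

With a helper like this, a caller could treat an empty list as "service not running" and report that cleanly instead of mailing a traceback from cron.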

MoritzMuehlenhoff claimed this task.

I have created https://phabricator.wikimedia.org/T205366 to migrate away from mcelog.

In principle we could make the installation of mcelog conditional on the kernel version, but backup2001 was installed with 4.14 mostly for some hardware tests, and we won't use that kernel in production. Once the replacement parts are available, it will be reimaged with stretch and the regular 4.9 kernel, so I'll close this bug. The report has been very useful, though, as it uncovered the migration issue for mcelog. Thanks!

Thanks Moritz, that makes sense.

Change 462613 abandoned by GTirloni:
Do not enable mcelog if kernel >= 4.12

Reason:
The default kernel, even in Stretch, is going to be 4.9, so this is really not needed right now. It was a nice exercise, though.

https://gerrit.wikimedia.org/r/462613