
mysql.php reacts to signals intended for mysql
Closed, Resolved · Public

Description

The mysql command-line tool interprets Ctrl+C as a request to interrupt the running query. But the new wrapper (mysql.php) also receives the SIGINT and exits back to the shell, leaving both mysql and the shell trying to use the terminal.
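
To make the Ctrl+C problem concrete, here is a minimal Python sketch of a wrapper that ignores SIGINT while its child runs, so that only the child reacts to Ctrl+C. It is an illustration of the general idea, not the mysql.php patch itself; the function name `run_ignoring_sigint` is hypothetical.

```python
import signal
import subprocess


def run_ignoring_sigint(argv):
    """Run argv as a child process; the wrapper ignores SIGINT until it exits."""
    previous = signal.signal(signal.SIGINT, signal.SIG_IGN)
    try:
        child = subprocess.Popen(
            argv,
            # An ignored signal stays ignored across exec, so put the
            # default disposition back in the child before it starts.
            preexec_fn=lambda: signal.signal(signal.SIGINT, signal.SIG_DFL),
        )
        return child.wait()
    finally:
        # Restore whatever SIGINT handling the wrapper had before.
        signal.signal(signal.SIGINT, previous)
```

Called as, for example, `run_ignoring_sigint(["mysql"])`, the wrapper stays alive when Ctrl+C is pressed and returns the child's exit status once the child finishes.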

A lesser issue is that Ctrl+Z properly suspends both mysql and the wrapper, but executing fg does not seem to send SIGCONT through to mysql.

Details

Related Gerrit Patches:
mediawiki/core : master · In mysql.php ignore SIGINT
operations/puppet : production · Make "sql wikishared" work again

Event Timeline

Anomie created this task. Jul 9 2018, 8:52 PM

Change 446528 had a related patch set uploaded (by Tim Starling; owner: Tim Starling):
[mediawiki/core@master] In mysql.php ignore SIGINT

https://gerrit.wikimedia.org/r/446528

tstarling added a subscriber: tstarling. Edited · Jul 18 2018, 6:40 AM

A lesser issue is that Ctrl+Z properly suspends both mysql and the wrapper, but executing fg does not seem to send SIGCONT through to mysql.

Could not reproduce. After Ctrl+Z I see mysql stopped (T state), after fg it goes back to the S state.

On my local machine, with bash as the shell, before Ctrl+Z:

brad     10280  0.9  1.2 230104 45680 pts/7    S+   10:12   0:00 php maintenance
brad     10281  0.0  0.0   2368   752 pts/7    S+   10:12   0:00 sh -c 'mysql' '
brad     10282  0.2  0.1  23288  7432 pts/7    S+   10:12   0:00 mysql --default

After Ctrl+Z:

brad     10280  0.4  1.2 230104 45680 pts/7    T    10:12   0:00 php maintenance
brad     10281  0.0  0.0   2368   752 pts/7    T    10:12   0:00 sh -c 'mysql' '
brad     10282  0.1  0.1  23288  7432 pts/7    T    10:12   0:00 mysql --default

After fg:

brad     10280  0.2  1.2 230104 45680 pts/7    S+   10:12   0:00 php maintenance
brad     10281  0.0  0.0   2368   752 pts/7    S+   10:12   0:00 sh -c 'mysql' '
brad     10282  0.0  0.1  23288  7432 pts/7    T+   10:12   0:00 mysql --default

On terbium mwmaint1001 the situation is even worse: mysql doesn't even get suspended. After Ctrl+Z:

anomie    7843  0.0  0.0  11192  3124 pts/0    T    14:10   0:00 /bin/bash /usr/local/bin/mwscript mysql.php --wiki=metawiki --wikidb=enwiki
root      7847  0.0  0.0  49256  3668 pts/0    T    14:10   0:00 sudo -u www-data php /srv/mediawiki-staging/multiversion/MWScript.php mysql.php --wiki=metawiki --wikidb=enwiki
www-data  7848  0.6  0.1 699072 110276 pts/0   T    14:10   0:00 php /srv/mediawiki-staging/multiversion/MWScript.php mysql.php --wiki=metawiki --wikidb=enwiki
www-data  7849  0.0  0.0 362916 24464 ?        Ss   14:10   0:00 php /srv/mediawiki-staging/multiversion/MWScript.php mysql.php --wiki=metawiki --wikidb=enwiki
www-data  7850  0.0  0.0 362916 21088 ?        Ss   14:10   0:00 php /srv/mediawiki-staging/multiversion/MWScript.php mysql.php --wiki=metawiki --wikidb=enwiki
www-data  7855  0.0  0.0   4288   708 ?        S    14:10   0:00 sh -c 'mysql' '--defaults-extra-file=/tmp/redacted.ini' '--user=wikiadmin' '--database=enwiki' '--host=10.64.32.115'
www-data  7856  0.0  0.0  40468  7860 ?        S    14:10   0:00 mysql --defaults-extra-file=/tmp/redacted.ini --user=wikiadmin --database=enwiki --host=10.64.32.115

Anomie added a comment. Edited · Jul 18 2018, 2:46 PM

Hmm. If I use PHP 7.0 on mwmaint1001, I get different behavior. Sometimes it works right, sometimes the fg unsuspends all the processes but doesn't correctly make them the foreground process, and sometimes it manages to not unsuspend the mysql process.

HHVM's behavior is apparently because it runs the mysql process in a different process group, so it must be forwarding signals manually or something similar.
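
The terminal delivers Ctrl+C and Ctrl+Z only to the foreground process group, so a child in another group or session (like the mysql processes showing `?` as their TTY above) never receives them directly. The following Python sketch shows how to compare process groups. It uses `start_new_session=True` as a stand-in for whatever HHVM does internally, which is an assumption, and `spawn_and_report` is a hypothetical helper.

```python
import os
import subprocess
import sys


def spawn_and_report(new_session):
    """Spawn a short-lived child and return (parent_pgid, child_pgid)."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        start_new_session=new_session,  # True: the child calls setsid() before exec
    )
    try:
        return os.getpgid(0), os.getpgid(child.pid)
    finally:
        child.kill()
        child.wait()


same = spawn_and_report(new_session=False)
detached = spawn_and_report(new_session=True)
print("plain child shares the parent's group:", same[0] == same[1])
print("detached child shares the parent's group:", detached[0] == detached[1])
```

A plain child inherits the parent's process group, while the detached child becomes the leader of its own group and session, so terminal-generated signals no longer reach it.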

tstarling added a comment. Edited · Jul 19 2018, 5:10 AM

In production, mwscript/sudo is causing its own set of problems with ctrl-Z even when running eval.php. To test it locally I skipped my own wrappers and ran php directly, since I figured the (evident) problems with the wrappers were out of scope.

Indeed HHVM on mwmaint1001 is creating a process group, which creates very special problems, like background mysql and the controlling shell simultaneously reading from a single terminal:

[0400][tstarling@mwmaint1001:~]$ ls /etc/hhvm
-bash: l: command not found

With hhvm.server.light_process_count=0, it doesn't do that. But even with hhvm.server.light_process_count=0, I can still reproduce the problem of mysql remaining stopped after "fg". With strace, I was able to isolate the problem. mysql is installing SIGTSTP and SIGTTOU handlers. The intended function of these handlers is to do something, disable the handler, then resend the signal, stopping itself with kill(). But the actual sequence is as follows:

  • I press ctrl-Z.
  • The kernel sends SIGTTOU and SIGTSTP to mysql simultaneously, and the two signals are queued.
  • The SIGTTOU handler runs first and stops mysql by sending itself SIGTTOU.
  • I run "fg".
  • mysql receives SIGCONT and continues. It now runs the SIGTSTP handler for the signal it received some seconds ago.
  • The SIGTSTP handler sends itself SIGTSTP, so mysql stops again.

This is how it goes down with strace attached only to mysql. I'm not sure why mysql is receiving SIGTTOU. If I attach strace to hhvm or sh, the problem is not reproducible: there is no SIGTTOU signal.
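
The intended handler behaviour described above (do something, disable the handler, then resend the signal) can be sketched in Python as follows. This illustrates the general pattern only and is not mysql's actual code; `install_stop_handler` and `before_stop` are hypothetical names.

```python
import os
import signal


def install_stop_handler(signum, before_stop):
    """Install a handler that runs before_stop(), then really stops the process."""

    def handler(received, frame):
        before_stop()                            # e.g. restore terminal settings
        signal.signal(received, signal.SIG_DFL)  # disable this handler
        os.kill(os.getpid(), received)           # resend; the default action stops us
        # Execution resumes here after SIGCONT; re-arm for the next stop request.
        signal.signal(received, handler)

    signal.signal(signum, handler)
```

With one such handler each for SIGTSTP and SIGTTOU, two signals queued at the same moment produce two separate stops, one before and one after SIGCONT, which matches the sequence in the list above.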

SIGTTIN/SIGTTOU being blocked would explain how we can end up with background processes reading from the terminal, since these signals are meant to stop that.
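
One way to check that hypothesis is to inspect the signal mask, which is inherited across fork and exec. A small Python sketch, assuming it runs in the process of interest (for another process on Linux, the `SigBlk` field in `/proc/<pid>/status` carries the same information); `blocked_tty_signals` is a hypothetical helper.

```python
import signal

TTY_JOB_SIGNALS = {signal.SIGTTIN, signal.SIGTTOU}


def blocked_tty_signals():
    """Return which of SIGTTIN/SIGTTOU are blocked in the calling thread."""
    # Blocking an empty set changes nothing and returns the current mask.
    current_mask = signal.pthread_sigmask(signal.SIG_BLOCK, [])
    return current_mask & TTY_JOB_SIGNALS


print(blocked_tty_signals() or "neither SIGTTIN nor SIGTTOU is blocked")
```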

I think the workaround is to not press ctrl-Z in this script and leave it at that. This is a really stupid problem to spend half a day on.

Change 446530 had a related patch set uploaded (by Tim Starling; owner: Tim Starling):
[operations/puppet@production] Make "sql wikishared" work again

https://gerrit.wikimedia.org/r/446530

Change 446530 merged by Tim Starling:
[operations/puppet@production] Make "sql wikishared" work again

https://gerrit.wikimedia.org/r/446530

Change 446528 merged by jenkins-bot:
[mediawiki/core@master] In mysql.php ignore SIGINT

https://gerrit.wikimedia.org/r/446528

Anomie closed this task as Resolved. Jul 19 2018, 5:43 PM
Anomie claimed this task.

Ok, let's call this fixed. The Ctrl+Z thing seems way too complicated to worry about.