
When a user creates a tool via toolsadmin, replica.my.cnf is not created
Closed, Resolved (Public)

Description

Yesterday I created the tool discordwiki, and no replica.my.cnf file was created for it. The tool's home directory contained only an empty logs directory.

A few minutes ago I created the vagrant2 tool, and the same thing happened. I don't know what is causing this, but it is not good.

Event Timeline

Kizule triaged this task as High priority. Sep 22 2019, 5:28 PM
Kizule created this task.
Kizule edited projects, added Toolforge; removed Striker.

Triaged as High because without replica.my.cnf, tools don't have access to the database.

I don't know which team handles this; I think it is for SRE.

I reported this on IRC yesterday and talked with @Krenair (I think).

Yeah, unfortunately the script that does this runs on the NFS server, which lives in eqiad.wmnet.

Linux tools-sgebastion-07 4.9.0-8-amd64 #1 SMP Debian 4.9.144-3 (2019-02-02) x86_64
Debian GNU/Linux 9.8 (stretch)
tools-sgebastion-07 is a Toolforge bastion (role::wmcs::toolforge::bastion)
======================================================================
_______  _____   _____         _______  _____   ______  ______ _______
   |    |     | |     | |      |______ |     | |_____/ |  ____ |______
   |    |_____| |_____| |_____ |       |_____| |    \_ |_____| |______
======================================================================
This is a server of the tools Cloud VPS project, the home of community
managed bots, webservices, and tools supporting the Wikimedia movement.

Use of this system is subject to the Toolforge Terms of Use,
Code of Conduct, and Privacy Policies:
- https://tools.wmflabs.org/?Rules

General guidance and help can be found at:
- https://tools.wmflabs.org/?Help

The last Puppet run was at Sun Sep 22 17:22:54 UTC 2019 (3 minutes ago). 
Last login: Sun Sep 22 17:23:30 2019 from 109.245.159.66
zoranzoki21@tools-sgebastion-07:~$ become vagrant2
tools.vagrant2@tools-sgebastion-07:~$ ls
logs
tools.vagrant2@tools-sgebastion-07:~$ ls
logs
tools.vagrant2@tools-sgebastion-07:~$ logout
zoranzoki21@tools-sgebastion-07:~$ become discordwiki
tools.discordwiki@tools-sgebastion-07:~$ ls
logs  public_html  service.manifest
tools.discordwiki@tools-sgebastion-07:~$

Mentioned in SAL (#wikimedia-cloud) [2019-09-23T06:01:19Z] <bd808> Restarted maintain-dbusers process on labstore1004. (T233530)

bd808 lowered the priority of this task from High to Medium.
bd808 edited projects, added cloud-services-team (Kanban); removed cloud-services-team.
bd808 subscribed.

Log messages for the maintain-dbusers process on labstore1004 had stopped at 2019-09-20T16:03:39Z. When I restarted the service, credentials for 12 tools were provisioned. A number of these tools look like they were frustrated attempts by folks to get a replica.my.cnf to be generated.
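As a rough illustration of how a silent stall like this could be noticed sooner, here is a minimal sketch that compares the last log timestamp with the current time. The function names and the one-hour threshold are assumptions made for illustration; they are not part of the actual service or its monitoring.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(timestamp):
    """Parse a timestamp such as 2019-09-20T16:03:39Z into an aware datetime."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_stalled(last_log_time, now, max_silence=timedelta(hours=1)):
    """Return True when the process has logged nothing for longer than max_silence.

    The one-hour default is a hypothetical threshold, not a documented value.
    """
    return now - last_log_time > max_silence


# Timestamps taken from this task: the last log message and the SAL restart entry.
last_log = parse_utc("2019-09-20T16:03:39Z")
restarted = parse_utc("2019-09-23T06:01:19Z")

silence = restarted - last_log
print(silence)                          # 2 days, 13:57:40
print(is_stalled(last_log, restarted))  # True
```

A check along these lines could run from cron or a monitoring probe, so a hung process is flagged before users start creating extra tools to work around it.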

I verified that new tools are getting replica.my.cnf files created and the associated credentials provisioned. I have a hunch the service got stuck as a result of a network interruption that happened around 2019-09-21T00:00. This Python process has a step that queries the LDAP directory, and we know from other similar scripts that the Python LDAP library we are currently using can hang when reading responses from the directory. I will poke around a little more before I call this done.
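One generic way to keep a single blocking read from hanging a whole loop is to run the call in a worker thread and give up after a deadline. The sketch below uses only the standard library. `call_with_timeout` and `slow_directory_query` are hypothetical names, and the latter is a stand-in for the real LDAP query. This is not how the actual script is written.

```python
import queue
import threading
import time


def call_with_timeout(func, timeout_seconds, *args, **kwargs):
    """Run func in a daemon thread and raise TimeoutError if it takes too long.

    The worker thread is not killed on timeout; it is only abandoned, so this
    protects the caller's loop rather than freeing the stuck resource.
    """
    results = queue.Queue(maxsize=1)

    def worker():
        try:
            results.put((True, func(*args, **kwargs)))
        except Exception as exc:  # hand the error back to the caller
            results.put((False, exc))

    threading.Thread(target=worker, daemon=True).start()
    try:
        succeeded, value = results.get(timeout=timeout_seconds)
    except queue.Empty:
        raise TimeoutError(
            f"call did not finish within {timeout_seconds} seconds"
        ) from None
    if succeeded:
        return value
    raise value


def slow_directory_query():
    """Hypothetical stand-in for a directory read that never gets a response."""
    time.sleep(1)
    return ["never", "returned", "in", "time"]


try:
    call_with_timeout(slow_directory_query, 0.05)
    timed_out = False
except TimeoutError:
    timed_out = True  # the loop can log the failure and retry later
```

With a guard like this the service could log the timeout and retry on its next pass instead of silently stopping. If the LDAP client library offers its own timeout settings, those would usually be preferable to abandoning threads.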

The last tool that was properly provisioned before the service restart was created at 2019-09-20T16:01:26Z. The first tool provisioned after the restart was created at 2019-09-21T00:27:42Z.

It's resolved; tools now have replica.my.cnf.

tools.discordwiki@tools-sgebastion-07:~/public_html$ cd ..
tools.discordwiki@tools-sgebastion-07:~$ ls
logs  public_html  replica.my.cnf  service.manifest
tools.discordwiki@tools-sgebastion-07:~$