Cleanup remaining WikipediaMobileFirefoxOS references
Closed, Resolved, Public

Description

Deleting submodules is weird. They don't like being deleted. I had to manually fix silver. The labsdb* machines started complaining. As did a bunch of beta instances.

So need to find all these and fix 'em

(Putting it under Apache because it's in the docroots and I can't think of a better project)

  • labsdb1009.eqiad.wmnet
  • labsdb1010.eqiad.wmnet
  • labsdb1011.eqiad.wmnet
  • db1095.eqiad.wmnet (no longer needed: this host was reimaged and is now a spare, see T196376)
  • db1102.eqiad.wmnet (no longer needed: this host was reimaged and is now a spare, see T196376)

Event Timeline

demon triaged this task as High priority. Feb 21 2018, 1:53 AM
demon created this task.
Restricted Application added a subscriber: Aklapper. Feb 21 2018, 1:53 AM

Mentioned in SAL (#wikimedia-operations) [2018-02-21T01:54:47Z] <no_justification> WikipediaMobileFirefoxOS submodule references caused labsdb* (and related) puppet failures. They should recover now (self reverted my docroot changes). Filed T187850

$ ssh labsdb1009.eqiad.wmnet
$ ls -ld /usr/local/lib/mediawiki-config/docroot/wikimedia.org/WikipediaMobileFirefoxOS
drwxr-sr-x 6 root staff 4096 Dec 19  2016 /usr/local/lib/mediawiki-config/docroot/wikimedia.org/WikipediaMobileFirefoxOS/
$ ls -ld /usr/local/lib/mediawiki-config/.git/modules/docroot/wikimedia.org/WikipediaMobileFirefoxOS
drwxr-sr-x 8 root staff 4096 Jan 31 20:01 /usr/local/lib/mediawiki-config/.git/modules/docroot/wikimedia.org/WikipediaMobileFirefoxOS/
$ cat /usr/local/lib/mediawiki-config/.gitmodules
[submodule "portals"]
        path = portals
        url = https://gerrit.wikimedia.org/r/wikimedia/portals
[submodule "wmf-config/event-schemas"]
        path = wmf-config/event-schemas
        url = https://gerrit.wikimedia.org/r/mediawiki/event-schemas
[submodule "fonts"]
        path = fonts
        url = https://gerrit.wikimedia.org/r/p/operations/mediawiki-config/fonts

This is fundamentally the same on labsdb1010 and labsdb1011 as well. I think the manual fix would be something like:

$ cd /usr/local/lib/mediawiki-config
$ rm -rf docroot/wikimedia.org/WikipediaMobileFirefoxOS
$ rm -rf .git/modules/docroot/wikimedia.org/WikipediaMobileFirefoxOS
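
Before removing anything by hand, it may help to list which git dirs under .git/modules no longer have a matching entry in .gitmodules. The following is only a minimal sketch, assuming the clone layout shown above and that each submodule's name equals its path (the usual default). The function name list_stale_submodules is hypothetical and not part of any existing tooling.

```shell
# Hypothetical helper (not existing tooling): print submodule git dirs under
# .git/modules that are no longer declared in .gitmodules.
# Usage: list_stale_submodules /path/to/clone
list_stale_submodules() {
    repo=$1
    [ -d "$repo/.git/modules" ] || return 0
    find "$repo/.git/modules" -type f -name HEAD | while IFS= read -r head; do
        moddir=${head%/HEAD}
        # Skip HEAD files that are refs or reflogs rather than a git dir root.
        [ -d "$moddir/objects" ] && [ -d "$moddir/refs" ] || continue
        name=${moddir#"$repo/.git/modules/"}
        # Assumption: submodule name == path, so compare against 'path = ...'.
        if ! sed -n 's/^[[:space:]]*path[[:space:]]*=[[:space:]]*//p' \
                "$repo/.gitmodules" 2>/dev/null | grep -qxF "$name"; then
            printf '%s\n' "$name"
        fi
    done
}
```

Run as list_stale_submodules /usr/local/lib/mediawiki-config, this should print one line per leftover module dir. It only reports and deletes nothing, so the rm commands above remain a separate, deliberate step.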

Change 413095 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[operations/puppet@production] labsdb: Remove obsolete mediawiki-config submodule

https://gerrit.wikimedia.org/r/413095

bd808 updated the task description. Feb 21 2018, 6:23 PM

@chasemp wrote on https://gerrit.wikimedia.org/r/413095:

This worked seemingly fine for all 3 labsdb10[09|10|11]

root@labsdb1011:~# rm -fR  /usr/local/lib/mediawiki-config && puppet agent --test
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for labsdb1011.eqiad.wmnet
Notice: /Stage[main]/Base::Environment/Tidy[/var/tmp/core]: Tidying 0 files
Info: Applying configuration version '1519238041'
Notice: /Stage[main]/Role::Labs::Db::Common/Git::Clone[operations/mediawiki-config]/File[/usr/local/lib/mediawiki-config]/ensure: created
Notice: /Stage[main]/Role::Labs::Db::Common/Git::Clone[operations/mediawiki-config]/Exec[git_clone_operations/mediawiki-config]/returns: executed successfully
Notice: /Stage[main]/Role::Labs::Db::Common/Git::Clone[operations/mediawiki-config]/Exec[git_pull_operations/mediawiki-config]/returns: executed successfully
Info: /Stage[main]/Role::Labs::Db::Common/Git::Clone[operations/mediawiki-config]/Exec[git_pull_operations/mediawiki-config]: Scheduling refresh of Exec[git_submodule_update_operations/mediawiki-config]
Notice: /Stage[main]/Role::Labs::Db::Common/Git::Clone[operations/mediawiki-config]/Exec[git_submodule_update_operations/mediawiki-config]: Triggered 'refresh' from 1 events
Notice: Applied catalog in 23.32 seconds
bd808 updated the task description. Feb 21 2018, 6:39 PM

Change 413095 abandoned by BryanDavis:
labsdb: Remove obsolete mediawiki-config submodule

Reason:
Will be handled by manual removal of the existing mediawiki-config clone and subsequent puppet run.

https://gerrit.wikimedia.org/r/413095

demon added a comment. Mar 23 2018, 6:55 PM

Bump. Can we get this resolved for the remaining nodes?

bd808 removed demon as the assignee of this task. Mar 23 2018, 7:20 PM
bd808 added a subscriber: Marostegui.

The cleanup command @chasemp used was rm -fR /usr/local/lib/mediawiki-config && puppet agent --test. We need to find a willing root to do this on db1095.eqiad.wmnet and db1102.eqiad.wmnet. @Marostegui does this sound like a scary thing for the db servers or can I just get one of the cloud folks to take care of it?

@bd808 I don't think we use mediawiki-config for anything on those sanitarium hosts, but we should probably not remove the directory initially. Instead, move it aside and see what puppet does.
I am off till 2nd April, so I won't be able to check this myself.

The puppet run will re-clone /usr/local/lib/mediawiki-config. The new clone will not have the WikipediaMobileFirefoxOS submodule which has already been removed in the master branch of the upstream repo. The clone would be missing only for the duration of a puppet run. If this directory is used by scripts on the sanitarium hosts it is likely done to get access to the dblist files.

Ah right, if it will get the directory back but just without the module, I'm sure that won't break anything.
I would advise to do the rm on Monday though :)
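
The move-first approach suggested earlier in this thread could be sketched roughly as follows. This is an illustration only: the function name move_aside and the .bak.<timestamp> suffix are assumptions, not an established workflow on these hosts.

```shell
# Hypothetical helper: rename a directory to <dir>.bak.<timestamp> instead of
# deleting it, and print the backup path so it can be removed later.
move_aside() {
    dir=${1%/}
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    backup="$dir.bak.$(date +%Y%m%d%H%M%S)"
    mv -- "$dir" "$backup" && printf '%s\n' "$backup"
}
```

A root could then run move_aside /usr/local/lib/mediawiki-config && puppet agent --test, confirm that puppet re-clones the repository cleanly, and delete the printed backup directory afterwards.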

I have already fixed this kind of breakage on these hosts (not this one in particular, but one with another module). We should probably have a better workflow if this is going to happen more than twice.

demon added a comment. Mar 26 2018, 4:31 PM

I can't speak for puppet, but the wmf-config repo doesn't have any other submodules slated for removal.

I don't think we use mediawiki-config for anything

We use it to run check_private_data.py, which needs a not-too-stale configuration of private wikis, deleted wikis, etc.

The cleanup command @chasemp used was rm -fR /usr/local/lib/mediawiki-config && puppet agent --test. We need to find a willing root to do this on db1095.eqiad.wmnet and db1102.eqiad.wmnet. @Marostegui does this sound like a scary thing for the db servers or can I just get one of the cloud folks to take care of it?

rm -fR /usr/local/lib/mediawiki-config/docroot/wikimedia.org/WikipediaMobileFirefoxOS would suffice

greg added a project: DBA. Jul 5 2018, 6:31 PM
greg added a subscriber: greg.

Adding DBA explicitly for their action on the db hosts.

Those two hosts are no longer used as sanitariums (they are now spares), so they no longer contain anything.
So it can be considered done from the DBA side.

Marostegui closed this task as Resolved. Jul 6 2018, 4:51 AM
Marostegui assigned this task to chasemp.
Marostegui updated the task description.

If the only hosts pending were db1095 and db1102, this can be considered fixed as per T187850#4401079.
I am resolving this; if anything else is pending and not in this ticket, please reopen.