codesearch's systemd timeouts might accidentally kill git processes
Closed, Resolved (Public)

Description

It appears TimedMediaHandler is missing.

The search across all repos (https://codesearch.wmflabs.org/search/?q=ResizeObserver&i=nope&files=&repos=) returns results from TimedMediaHandler (TMH), but the WMF-deployed search does not. The advanced filter for WMF-deployed confirms that TMH is missing from the list of repos.

Event Timeline

legoktm@codesearch4:/srv/hound/hound-deployed$ ack Timed config.json 
		"mediawiki/extensions/TimedMediaHandler": {
				"base-url": "https://gerrit.wikimedia.org/g/mediawiki/extensions/TimedMediaHandler/+/{rev}/{path}{anchor}"
			"url": "https://gerrit.wikimedia.org/r/mediawiki/extensions/TimedMediaHandler"

...

I also see this in the logs:

Feb 07 07:23:41 codesearch4 docker[9286]: 2019/02/07 07:23:41 Failed to git fetch /data/data/vcs-1a58778e8f1a8bb707a0050aab89f0c507eb9f37, see output below
Feb 07 07:23:41 codesearch4 docker[9286]: fatal: Unable to create '/data/data/vcs-1a58778e8f1a8bb707a0050aab89f0c507eb9f37/.git/shallow.lock': File exists.
Feb 07 07:23:41 codesearch4 docker[9286]: Another git process seems to be running in this repository, e.g.
Feb 07 07:23:41 codesearch4 docker[9286]: an editor opened by 'git commit'. Please make sure all processes
Feb 07 07:23:41 codesearch4 docker[9286]: are terminated then try again. If it still fails, a git process
Feb 07 07:23:41 codesearch4 docker[9286]: may have crashed in this repository earlier:
Feb 07 07:23:41 codesearch4 docker[9286]: remove the file manually to continue.
Feb 07 07:23:41 codesearch4 docker[9286]: Continuing...

plus

Feb 07 07:23:46 codesearch4 docker[9286]: 2019/02/07 07:23:46 Failed to git reset /data/data/vcs-64d4e55c0df08822adb545d4445192b2751cb5fc, see output below
Feb 07 07:23:46 codesearch4 docker[9286]: fatal: Unable to create '/data/data/vcs-64d4e55c0df08822adb545d4445192b2751cb5fc/.git/index.lock': File exists.
Feb 07 07:23:46 codesearch4 docker[9286]: Another git process seems to be running in this repository, e.g.
Feb 07 07:23:46 codesearch4 docker[9286]: an editor opened by 'git commit'. Please make sure all processes
Feb 07 07:23:46 codesearch4 docker[9286]: are terminated then try again. If it still fails, a git process
Feb 07 07:23:46 codesearch4 docker[9286]: may have crashed in this repository earlier:
Feb 07 07:23:46 codesearch4 docker[9286]: remove the file manually to continue.
Feb 07 07:23:46 codesearch4 docker[9286]: Continuing...
Feb 07 07:23:46 codesearch4 docker[9286]: 2019/02/07 07:23:46 exit status 128

and one more wow

Feb 07 07:23:54 codesearch4 docker[9286]: 2019/02/07 07:23:54 Failed to git fetch /data/data/vcs-e298da1f2d84eee12c1f2aaf338af617bb5f0883, see output below
Feb 07 07:23:54 codesearch4 docker[9286]: fatal: Unable to create '/data/data/vcs-e298da1f2d84eee12c1f2aaf338af617bb5f0883/.git/shallow.lock': File exists.
Feb 07 07:23:54 codesearch4 docker[9286]: Another git process seems to be running in this repository, e.g.
Feb 07 07:23:54 codesearch4 docker[9286]: an editor opened by 'git commit'. Please make sure all processes
Feb 07 07:23:54 codesearch4 docker[9286]: are terminated then try again. If it still fails, a git process
Feb 07 07:23:54 codesearch4 docker[9286]: may have crashed in this repository earlier:
Feb 07 07:23:54 codesearch4 docker[9286]: remove the file manually to continue.
Feb 07 07:23:54 codesearch4 docker[9286]: Continuing...
Feb 07 07:23:54 codesearch4 docker[9286]: 2019/02/07 07:23:54 exit status 128

Clearing the /data/ folder now and restarting...
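
As a less drastic alternative to clearing the whole folder, the leftover lock files named in the logs could be listed first. This is only a minimal sketch, not part of codesearch or Hound: the function name `find_stale_locks` and the 10-minute age threshold are assumptions, and `/data/data` is the path as seen from inside the container, so the location on the host may differ.

```python
import time
from pathlib import Path


def find_stale_locks(data_root, min_age_seconds=600):
    """Return .git/*.lock files under data_root/vcs-* older than min_age_seconds."""
    now = time.time()
    stale = []
    for lock in Path(data_root).glob("vcs-*/.git/*.lock"):
        # A lock that has not been modified for a while is unlikely to
        # belong to a git process that is still running.
        if now - lock.stat().st_mtime >= min_age_seconds:
            stale.append(lock)
    return sorted(stale)


if __name__ == "__main__":
    # Path taken from the log lines above; hypothetical outside the container.
    for lock in find_stale_locks("/data/data"):
        print(lock)
```

The script only prints candidates. Whether a listed file is safe to remove still depends on confirming that no git process is running in that repository.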

Fixed now, though I'm still wondering what caused this in the first place. Could it be the systemd timeouts that force automatic restarts, leaving git processes stuck midway through?
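
The suspected mechanism can be illustrated in isolation. The sketch below does not involve codesearch, Hound, systemd, or git itself, and it does not confirm that the timeouts were the cause here. It only shows that a process which creates a lock file exclusively and is then killed with SIGKILL has no chance to remove it. The `CHILD` script and the `lock_survives_kill` helper are hypothetical names made up for this example.

```python
import os
import subprocess
import sys
import tempfile

# Child process: create the lock file exclusively (fails if it already
# exists), report readiness, then pretend to run a long fetch.
CHILD = """
import os, sys, time
fd = os.open(sys.argv[1], os.O_CREAT | os.O_EXCL | os.O_WRONLY)
print("locked", flush=True)
time.sleep(60)
os.close(fd)
os.remove(sys.argv[1])  # normal cleanup, never reached when killed
"""


def lock_survives_kill(lock_path):
    """Start a lock holder, kill it mid-run, and report whether the lock remains."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD, lock_path],
        stdout=subprocess.PIPE,
        text=True,
    )
    proc.stdout.readline()  # wait until the child has created the lock
    proc.kill()             # SIGKILL on POSIX: the child cannot clean up
    proc.wait()
    proc.stdout.close()
    return os.path.exists(lock_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(lock_survives_kill(os.path.join(tmp, "index.lock")))
```

A second process trying the same exclusive create on the leftover file would fail, which matches the "File exists" errors in the logs above.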

Can this be closed then?

Legoktm renamed this task from "Add TimedMediaHandler to WMF-deployed search preset" to "codesearch's systemd timeouts might accidentally kill git processes". Jun 28 2020, 9:10 AM

I don't believe anything has changed since this was originally filed; it may just be luck that we haven't hit it recently.

This is no longer an issue because we got rid of these timeouts.