
Workspaces for mwgate-php55lint / mwgate-php70lint are getting huge
Closed, Resolved | Public | PRODUCTION ERROR

Description

legoktm@integration-slave-jessie-1003:/srv/jenkins-workspace/workspace$ du -hs *
311M	analytics-refinery-release
4.0K	analytics-refinery-release@tmp
873M	apps-android-wikipedia-publish
913M	apps-android-wikipedia-test
114M	commit-message-validator
227M	composer-package-validate
42M	debian-glue
472K	debian-glue-non-voting
4.0K	fail-archived-repositories
88M	integration-zuul-layoutdiff
16M	integration-zuul-layoutvalidation-gate
876K	labs-tools-ZppixBot-php55lint
788M	mediawiki-core-code-coverage
1.3G	mediawiki-core-jsduck
3.1G	mediawiki-core-php55lint
82M	mediawiki-vendor-composer-security
96M	mwext-CirrusSearch-whitespaces
163M	mwext-VisualEditor-jsduck
696K	mwgate-composer-validate
50M	mwgate-jsduck
5.4G	mwgate-php55lint
24M	mwgate-php56lint
15M	operations-dns-lint
1.6M	operations-dns-tabs
192M	operations-mw-config-php55lint
97M	operations-mw-config-typos
169M	phabricator-jessie-commits
3.7M	phabricator-jessie-debs
170M	phabricator-jessie-diffs
4.8M	php55lint
144M	php56lint
14M	php-compile-hhvm
15M	php-compile-php55
15M	php-compile-php70
88M	selenium-Wikibase-T167432
100K	test-csteipp-sensiolabs-securityadvisorieschecker
502M	wikimedia-fundraising-civicrm

High prio because this is filling up /srv

Event Timeline


Just had to clean this up on integration-slave-jessie-1003.

Mentioned in SAL (#wikimedia-releng) [2017-12-22T22:38:59Z] <thcipriani> integration-slave-jessie-1004 removed mediawiki-core-jsduck, mwgate-php55lint, mediawikicore-php55lint as /srv mount was full T179963

That is happening once per week or so. The workaround is to clean out the workspaces manually but that is terrible.
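To make the manual cleanup less of a guessing game, the `du -hs *` check above could be scripted. The following is only a minimal sketch: the function names and the size limit are hypothetical and not part of any existing tooling, and it sums apparent file sizes rather than allocated disk blocks, so the numbers will differ slightly from `du`.

```python
import os


def directory_size(path):
    """Return the total apparent size in bytes of regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so a link target is never counted.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def oversized_workspaces(workspace_root, limit_bytes):
    """Return (name, size) pairs for workspaces above limit_bytes, largest first."""
    report = []
    for entry in os.scandir(workspace_root):
        if entry.is_dir(follow_symlinks=False):
            size = directory_size(entry.path)
            if size > limit_bytes:
                report.append((entry.name, size))
    return sorted(report, key=lambda item: item[1], reverse=True)
```

On a CI node this might be called as `oversized_workspaces("/srv/jenkins-workspace/workspace", 1024 ** 3)` to list every workspace above 1 GiB; the threshold is an arbitrary illustration.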

We apparently have two different templates to generate the PHP lint jobs. Comparing php55lint and mwgate-php55lint:

--- php55lint	2017-12-04 09:26:58.000000000 +0100
+++ mwgate-php55lint	2017-12-04 09:26:58.000000000 +0100
@@ -4,3 +4,4 @@
   <description>&lt;p&gt;Job is managed by &lt;a href=&quot;https://www.mediawiki.org/wiki/CI/JJB&quot;&gt;Jenkins Job Builder&lt;/a&gt;.&lt;/p&gt;
-&lt;p&gt;This job is triggered by Zuul&lt;/p&gt;
+&lt;p&gt;This job is triggered by Zuul.&lt;/p&gt;
+&lt;p&gt;Git submodules are NOT processed.&lt;/p&gt;
 &lt;!-- Managed by Jenkins Job Builder --&gt;</description>
@@ -15,3 +16,3 @@
       <strategy class="hudson.tasks.LogRotator">
-        <daysToKeep>30</daysToKeep>
+        <daysToKeep>15</daysToKeep>
         <numToKeep>-1</numToKeep>
@@ -113,5 +114,3 @@
     <extensions>
-      <hudson.plugins.git.extensions.impl.CloneOption>
-        <shallow>true</shallow>
-      </hudson.plugins.git.extensions.impl.CloneOption>
+      <hudson.plugins.git.extensions.impl.CleanCheckout/>
       <hudson.plugins.git.extensions.impl.SubmoduleOption>
@@ -124,3 +123,2 @@
       </hudson.plugins.git.extensions.impl.SubmoduleOption>
-      <hudson.plugins.git.extensions.impl.WipeWorkspace/>
     </extensions>

EDIT: mixed it up

php55lint

Wipes the workspace and does a shallow clone. That is the version we would want to use everywhere. I am not sure how git-changed-in-head.sh behaves with a shallow clone, but hopefully it will be fine.

mwgate-php55lint

Apparently does a full clone and a clean checkout. It does NOT wipe the workspace, hence the .git directory keeps growing as different repositories trigger that job (see my earlier comment T179963#3742837).
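The difference between the two templates can also be checked mechanically from a job's generated XML. The sketch below is an illustration only: the extension class names are taken from the diff above, while the function name and the two trimmed-down `<extensions>` fragments (SubmoduleOption omitted) are made up for the example.

```python
import xml.etree.ElementTree as ET

GIT_EXT = "hudson.plugins.git.extensions.impl."

# Trimmed fragments modelled on the diff above (illustrative, not full job configs).
PHP55LINT_EXTENSIONS = """
<extensions>
  <hudson.plugins.git.extensions.impl.CloneOption>
    <shallow>true</shallow>
  </hudson.plugins.git.extensions.impl.CloneOption>
  <hudson.plugins.git.extensions.impl.WipeWorkspace/>
</extensions>
"""

MWGATE_PHP55LINT_EXTENSIONS = """
<extensions>
  <hudson.plugins.git.extensions.impl.CleanCheckout/>
</extensions>
"""


def checkout_strategy(config_xml):
    """Report which git checkout extensions a job configuration enables."""
    root = ET.fromstring(config_xml.strip())

    def first(tag):
        # iter() matches the exact tag name anywhere in the tree.
        return next(root.iter(GIT_EXT + tag), None)

    clone = first("CloneOption")
    shallow = clone.findtext("shallow") if clone is not None else None
    return {
        "shallow_clone": (shallow or "").strip() == "true",
        "wipe_workspace": first("WipeWorkspace") is not None,
        "clean_checkout": first("CleanCheckout") is not None,
    }
```

A job for which `wipe_workspace` is false and which is shared by many repositories is the kind that would accumulate git objects over time, as described above.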

Change 405722 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Use shallow clone for phplint jobs

https://gerrit.wikimedia.org/r/405722

This issue keeps happening (well, running out of space on the Jenkins executors at least), but this patch has been open since right before All Hands :). Adding it officially to our (too big of a) short-term-ish backlog.

And again...

gjg@integration-slave-jessie-1001:/srv/jenkins-workspace/workspace$ du -sh *
511M	analytics-refinery-release
4.0K	analytics-refinery-release@tmp
422M	analytics-refinery-update-jars
4.0K	analytics-refinery-update-jars@tmp
822M	apps-android-wikipedia-publish
4.6G	apps-android-wikipedia-test
120M	commit-message-validator
236M	composer-package-validate
9.0M	debian-glue
1.8M	debian-glue-non-voting
81M	integration-zuul-layoutdiff
14M	integration-zuul-layoutvalidation-gate
481M	mediawiki-core-code-coverage
2.9G	mediawiki-core-php55lint
410M	mediawiki-core-php70lint
80M	mediawiki-vendor-composer-security
70M	mwext-CirrusSearch-whitespaces
764K	mwgate-composer-validate
1.5G	mwgate-php55lint
24M	mwgate-php56lint
1.5G	mwgate-php70lint
21M	operations-dns-lint
1.6M	operations-dns-tabs
236M	operations-mw-config-php55lint
99M	operations-mw-config-typos
236M	operations-puppet-wmf-style-guide
210M	phabricator-jessie-commits
3.4M	phabricator-jessie-debs
210M	phabricator-jessie-diffs
1.3M	php55lint
52M	php56lint
140M	selenium-Wikibase-chrome
541M	wikimedia-fundraising-civicrm
gjg@integration-slave-jessie-1001:/srv/jenkins-workspace/workspace$ df -h
Filesystem                          Size  Used Avail Use% Mounted on
udev                                 10M     0   10M   0% /dev
tmpfs                               792M   81M  711M  11% /run
/dev/vda3                            19G   15G  3.1G  83% /
tmpfs                               2.0G     0  2.0G   0% /dev/shm
tmpfs                               5.0M     0  5.0M   0% /run/lock
tmpfs                               2.0G     0  2.0G   0% /sys/fs/cgroup
none                                256M  110M  147M  43% /var/lib/mysql
/dev/mapper/vd-second--local--disk   21G   20G     0 100% /srv
none                                256M     0  256M   0% /srv/home/jenkins-deploy/tmpfs
tmpfs                               396M     0  396M   0% /run/user/2947
tmpfs                               396M     0  396M   0% /run/user/11634
tmpfs                               396M     0  396M   0% /run/user/2890

Could this possibly be the cause of fetch/lock failures I'm seeing?

https://integration.wikimedia.org/ci/job/mwgate-php70lint/1140/console

Building remotely on integration-slave-jessie-1001 (DebianGlue contintLabsSlave DebianJessie) in workspace /srv/jenkins-workspace/workspace/mwgate-php70lint
[..] Fetching changes from the remote Git repository
 > git config remote.origin.url git://contint1001.wikimedia.org/mediawiki/extensions/ReplaceText # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from git://contint1001.wikimedia.org/mediawiki/extensions/ReplaceText  [..]
Caused by: hudson.plugins.git.GitException: Command "git config remote.origin.url git://contint1001.wikimedia.org/mediawiki/extensions/ReplaceText" returned status code 4:
stderr: error: failed to write new configuration file .git/config.lock
Krinkle renamed this task from "mwgate-php55lint workspaces are getting huge" to "Workspaces for mwgate-php55lint / mwgate-php70lint are getting huge". May 2 2018, 5:29 PM
gjg@integration-slave-jessie-1002:/srv/jenkins-workspace/workspace$ du -sh
15G	.
gjg@integration-slave-jessie-1002:/srv/jenkins-workspace/workspace$ du -sh *
422M	analytics-refinery-update-jars
4.0K	analytics-refinery-update-jars@tmp
902M	apps-android-wikipedia-publish
5.0G	apps-android-wikipedia-test
100M	commit-message-validator
236M	composer-package-validate
1.6M	debian-glue-non-voting
4.0K	fail-archived-repositories
82M	integration-zuul-layoutdiff
11M	integration-zuul-layoutvalidation-gate
452M	mediawiki-core-code-coverage-php7
1.9G	mediawiki-core-php55lint
792M	mediawiki-core-php70lint
79M	mediawiki-vendor-composer-security
54M	mwext-CirrusSearch-whitespaces
776K	mwgate-composer-validate
1.4G	mwgate-php55lint
24M	mwgate-php56lint
2.0G	mwgate-php70lint
12M	operations-dns-lint
1.6M	operations-dns-tabs
211M	operations-mw-config-php55lint
100M	operations-mw-config-typos
238M	operations-puppet-wmf-style-guide
210M	phabricator-jessie-commits
4.1M	phabricator-jessie-debs
1.2M	php55lint
157M	php56lint
147M	selenium-Wikibase-chrome
524M	wikimedia-fundraising-civicrm

rm -rf'd again...

Mentioned in SAL (#wikimedia-releng) [2018-07-14T03:27:01Z] <Krinkle> Clearing various workspaces on integration-slave-jessie-1001 to fix operations-mw-config-php55lint Jenkins builds - T179963

Mentioned in SAL (#wikimedia-releng) [2018-08-27T13:47:38Z] <hashar> updating phplint jobs to use shallow clone AND wipe the workspace | https://gerrit.wikimedia.org/r/#/c/integration/config/+/405722/ | T179963

Change 405722 merged by jenkins-bot:
[integration/config@master] Use shallow clone for phplint jobs

https://gerrit.wikimedia.org/r/405722

hashar claimed this task.

Should be good now. I simply forgot about this task and its patch https://gerrit.wikimedia.org/r/405722

Shallow clones break the git-changed-in-head script, so it thinks that literally every file was changed.

Created a revert of this change as https://gerrit.wikimedia.org/r/#/c/integration/config/+/456515/

git-changed-in-head works with almost-shallow clones: git clone --depth 2 https://gerrit.wikimedia.org/...

Yup, that works most of the time, but --depth 2 is not sufficient when a chain of patchsets is being tested. I guess that is what prompted the revert.
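A toy model may help show why the clone depth matters here. It does not run git and is not how git-changed-in-head is implemented; it only assumes that the script compares HEAD against the commit just below the change(s) under test, and that a commit missing from a shallow clone looks like an empty tree. All names are hypothetical.

```python
def changed_files(history, depth, commits_under_test=1):
    """Toy model of 'which files changed in HEAD' on a shallow clone.

    history: list of snapshots (dict of path -> content), oldest first.
    depth: number of most recent commits present in the clone.
    commits_under_test: length of the chain of changes being tested.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    visible = history[-depth:]
    head = visible[-1]
    base_index = len(visible) - 1 - commits_under_test
    # If the base commit was cut off by the shallow clone, model it as an
    # empty tree: every file in HEAD then appears to be new.
    base = visible[base_index] if base_index >= 0 else {}
    paths = set(head) | set(base)
    return sorted(p for p in paths if head.get(p) != base.get(p))


history = [
    {"a.php": "1", "b.php": "1", "c.php": "1"},
    {"a.php": "2", "b.php": "1", "c.php": "1"},
    {"a.php": "2", "b.php": "2", "c.php": "1"},
]
print(changed_files(history, depth=1))                        # every file
print(changed_files(history, depth=2))                        # only b.php
print(changed_files(history, depth=2, commits_under_test=2))  # every file again
print(changed_files(history, depth=3, commits_under_test=2))  # a.php and b.php
```

Under these assumptions, a chain of N changes needs a depth of at least N + 1, which is one plausible reading of why --depth 2 only works most of the time.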

I think I will just phase out those jobs entirely; they date from before we had all MediaWiki extensions and skins normalized to use composer test as an entry point with jakub-onderka/php-parallel-lint.

The git-changed-in-head part is an optimization to make it faster when it was used on huge repositories such as mediawiki/core or Wikibase. Nowadays that is handled directly by Quibble.

The only remaining use case is to lint PHP files for untrusted users. Eventually we will drop that use case as well; see T192217: Remove the "check" pipeline and Zuul's user-filter.

Krinkle lowered the priority of this task from High to Medium. Mar 12 2019, 6:06 PM

Moving out of active list of issues causing build failures. Might still be an issue long-term, but we seem to be able to manage it mostly through manual checks and depools.
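The manual check could be approximated with a small helper that mirrors the Use% column of `df -h`. This is a sketch only: the function names and the 90% threshold are hypothetical, and nothing here is wired into Jenkins or the actual depooling process.

```python
import shutil


def use_fraction(used, free):
    """Fraction of usable space consumed, like df's Use% (used / (used + avail))."""
    usable = used + free
    return used / usable if usable else 0.0


def mount_use_fraction(path):
    """Return the use fraction of the filesystem that contains path."""
    usage = shutil.disk_usage(path)
    return use_fraction(usage.used, usage.free)


def should_depool(path, threshold=0.9):
    """Suggest depooling once the mount is at or above the (assumed) threshold."""
    return mount_use_fraction(path) >= threshold
```

For example, `should_depool("/srv")` would return True for the `/srv` line in the `df -h` output quoted earlier, where no space is available.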

I am just going to phase out those jobs starting now. There is no point in keeping them nowadays and they are better replaced by using composer test.

Change 498823 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] qrpedia: rm obsolete php linting jobs

https://gerrit.wikimedia.org/r/498823

Change 498824 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] phabricator/*: rm obsolete php linting jobs

https://gerrit.wikimedia.org/r/498824

Change 498823 merged by jenkins-bot:
[integration/config@master] qrpedia: rm obsolete php linting jobs

https://gerrit.wikimedia.org/r/498823

Change 498824 merged by jenkins-bot:
[integration/config@master] phabricator/*: rm obsolete php linting jobs

https://gerrit.wikimedia.org/r/498824

Change 498832 had a related patch set uploaded (by Hashar; owner: Hashar):
[analytics/wmde/scripts@master] build: add php-parallel-lint

https://gerrit.wikimedia.org/r/498832

Change 498832 merged by jenkins-bot:
[analytics/wmde/scripts@master] build: add php-parallel-lint

https://gerrit.wikimedia.org/r/498832

Change 498836 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] labs/toollabs rm php lint jobs

https://gerrit.wikimedia.org/r/498836

Change 498836 merged by jenkins-bot:
[integration/config@master] labs/toollabs rm php lint jobs

https://gerrit.wikimedia.org/r/498836

Change 498838 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] mediawiki/tools/release rm php lint jobs

https://gerrit.wikimedia.org/r/498838

Change 498838 merged by jenkins-bot:
[integration/config@master] mediawiki/tools/release rm php lint jobs

https://gerrit.wikimedia.org/r/498838

Change 498841 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] labs/tools/Luke081515IRCBot switch to composer test

https://gerrit.wikimedia.org/r/498841

Change 498841 merged by jenkins-bot:
[integration/config@master] labs/tools/Luke081515IRCBot switch to composer test

https://gerrit.wikimedia.org/r/498841

Change 498847 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Remove mediawiki/tools/code-utils

https://gerrit.wikimedia.org/r/498847

Change 498847 merged by jenkins-bot:
[integration/config@master] Remove mediawiki/tools/code-utils

https://gerrit.wikimedia.org/r/498847

Change 498855 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] DonationInterface: drop php 5.6 linting job

https://gerrit.wikimedia.org/r/498855

Change 498855 merged by jenkins-bot:
[integration/config@master] DonationInterface: drop php 5.6 linting job

https://gerrit.wikimedia.org/r/498855

Change 498857 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Remove PHP lint jobs from MediaWiki repos

https://gerrit.wikimedia.org/r/498857

Change 498857 merged by jenkins-bot:
[integration/config@master] Remove PHP lint jobs from MediaWiki repos

https://gerrit.wikimedia.org/r/498857

Change 498871 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] mediawiki-config: replace legacy php lint jobs

https://gerrit.wikimedia.org/r/498871

Change 498871 merged by jenkins-bot:
[integration/config@master] mediawiki-config: replace legacy php lint jobs

https://gerrit.wikimedia.org/r/498871

Change 498894 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Drop legacy php56lint from civicrm

https://gerrit.wikimedia.org/r/498894

Change 498897 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Remove php lint templates/macro etc

https://gerrit.wikimedia.org/r/498897

Change 498894 merged by jenkins-bot:
[integration/config@master] Drop legacy php56lint from civicrm

https://gerrit.wikimedia.org/r/498894

hashar claimed this task.

The jobs are gone!

Change 498897 merged by jenkins-bot:
[integration/config@master] Remove php lint templates/macro etc

https://gerrit.wikimedia.org/r/498897

Change 506627 had a related patch set uploaded (by Ladsgroup; owner: Hashar):
[analytics/wmde/scripts@production] build: add php-parallel-lint

https://gerrit.wikimedia.org/r/506627

Change 506627 merged by jenkins-bot:
[analytics/wmde/scripts@production] build: add php-parallel-lint

https://gerrit.wikimedia.org/r/506627

mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:09 PM