doc1001 permission problems for doc.wikimedia.org deploy
Closed, Resolved, Public

Description

I've not been able to successfully deploy changes here for several months. There is always some permission problem or another.

Latest version:

$ ssh doc1001.eqiad.wmnet  git -C /srv/docroot pull
From https://gerrit.wikimedia.org/r/integration/docroot
   72d72d4..fb27369  master     -> origin/master
Updating 72d72d4..fb27369
warning: unable to unlink org/wikimedia/doc/favicon.php: Permission denied
error: unable to create file org/wikimedia/doc/favicon.ico: Permission denied

As such the following changes are (partly) undeployed, and the git status is dirty on the server:

Event Timeline

The long story is at T235715#5588924

The issue originates from the migration of doc.wikimedia.org to a new server, tracked in T137890: Relocate CI generated docs and coverage reports. The fix is pending on a couple of patches I have attached to T137890:

Change 484304 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] rsync: readd incoming and outgoing chmod

https://gerrit.wikimedia.org/r/484304

Change 484304 merged by Dzahn:
[operations/puppet@production] rsync: readd incoming and outgoing chmod

https://gerrit.wikimedia.org/r/484304

Both patches are merged now. Git status of /srv/docroot is clean.

I ran git pull as user krinkle:

krinkle@doc1001:/srv/docroot$ git -C /srv/docroot pull
Already up-to-date.

Looks like it. git pull works fine now, but it already worked when there were no changes to pull, so I'll only find out for sure next time.

Next time was today, and it's not resolved.

Direct command:

$ ssh doc1001.eqiad.wmnet git -C /srv/docroot pull
From https://gerrit.wikimedia.org/r/integration/docroot
   de29f03..77493ca  master     -> origin/master
Updating de29f03..77493ca
error: unable to create symlink org/wikimedia/doc/favicon.ico: Permission denied

Remote shell:

krinkle@doc1001:/srv/docroot$ sudo -u doc-uploader git pull
error: cannot open .git/FETCH_HEAD: Permission denied

Krinkle renamed this task from doc1001 permission problems to doc1001 permission problems for doc.wikimedia.org deploy. Jan 29 2020, 3:33 PM
Krinkle added a subscriber: Addshore.

Change 568600 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[integration/config@master] integration/docroot: Change post-merge message to recommend sudo

https://gerrit.wikimedia.org/r/568600

Change 568600 merged by jenkins-bot:
[integration/config@master] integration/docroot: Change post-merge message to recommend sudo

https://gerrit.wikimedia.org/r/568600

Mentioned in SAL (#wikimedia-operations) [2020-02-03T18:58:12Z] <mutante> < bblack> !log doc1001: chown -R nobody:wikidev /srv/docroot | < mutante> !doc1001 sudo -u doc-uploader chmod g+w /srv/docroot/org/wikimedia/doc | https://gerrit.wikimedia.org/r/c/operations/puppet/+/484304 | (T237707)

Mentioned in SAL (#wikimedia-operations) [2020-02-03T19:19:42Z] <mutante> doc1001 - chown -R doc-uploader:doc-uploader /srv/docroot ; temp. disabled puppet (T237707)

Change 569620 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] doc: use doc-uploader group for docroot privs, stop using shared=>true

https://gerrit.wikimedia.org/r/569620

Change 569620 merged by Dzahn:
[operations/puppet@production] doc: use doc-uploader group for docroot privs, stop using shared=>true

https://gerrit.wikimedia.org/r/569620

Mentioned in SAL (#wikimedia-operations) [2020-02-03T20:13:56Z] <mutante> doc1001 - re-enabled puppet after merging gerrit:569620 - Git::Clone[integration/docroot]/File[/srv/docroot]/mode: mode changed '2775' to '0755' - Profile::Doc/File[/srv/docroot/org/wikimedia/doc]/group: group changed 'doc-uploader' to 'wikidev', mode changed '0775' to '0755'. needs another follow-up (T237707)

Change 569637 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] doc: stop using wikidev group, use doc-uploader group

https://gerrit.wikimedia.org/r/569637

Change 569637 merged by Dzahn:
[operations/puppet@production] doc: stop using wikidev group, use doc-uploader group

https://gerrit.wikimedia.org/r/569637

We ran into permission problems again today. The latest attempt to fix is:

  • stop using the 'wikidev' group in general (default and used too widely)
  • stop setting "shared => true" with git::clone (not needed if we all (and puppet!) use the same user to git clone)
  • stop including umask change for wikidev user (reduce complexity, not needed anymore)
  • doc-uploader is also a group, so doc-uploader:doc-uploader is now used as owner for everything below /srv/docroot
  • users are expected to deploy with "sudo -u doc-uploader", as was done today
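
The ownership rule in the list above could be checked with something like the following. This is only a sketch: the function name find_foreign_owned is made up for illustration and is not part of any existing tooling. It lists every path under a directory whose owner or group differs from the expected one, so empty output means the tree is consistent.

```shell
#!/bin/sh
# Hypothetical helper (illustrative name, not part of the real deployment).
# Prints each path under DIR whose owner is not USER or whose group is not GROUP.
find_foreign_owned() {
    dir=$1
    user=$2
    group=$3
    find "$dir" \( ! -user "$user" -o ! -group "$group" \) -print
}

# Intended use on the host discussed in this task (names taken from the task):
#   find_foreign_owned /srv/docroot doc-uploader doc-uploader
```

Any path it prints is a candidate for the "Permission denied" errors seen earlier, since a pull run as doc-uploader may not be able to replace it.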

Before, it was possible to deploy either as doc-uploader or as a regular user, and differences in docs kept causing issues.

So please all deploy with "sudo -u doc-uploader git -C /srv/docroot/ pull" and hopefully that resolves the ticket for real now.
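
To avoid accidentally pulling as a personal user, the command above could be wrapped in a small function. This is a sketch under assumptions: deploy_docroot, DEPLOY_USER, DOCROOT and DRY_RUN are illustrative names, not existing tooling. With DRY_RUN=1 it only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical wrapper; all variable and function names here are illustrative.
DEPLOY_USER=${DEPLOY_USER:-doc-uploader}
DOCROOT=${DOCROOT:-/srv/docroot}

deploy_docroot() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Dry run: show the command that would be executed, then stop.
        echo "sudo -u $DEPLOY_USER git -C $DOCROOT pull"
        return 0
    fi
    # Always pull as the shared deploy user so file ownership stays uniform.
    sudo -u "$DEPLOY_USER" git -C "$DOCROOT" pull
}
```

Calling deploy_docroot with no arguments then runs the same pull recommended here, always as doc-uploader.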

All members of contint-admins have privileges to run any command as doc-uploader.

The permission madness has finally been addressed by moving deployment of integration/docroot.git to scap and moving the published docs to a disconnected directory (/srv/doc). There are thus no more permission clashes.