Steps to replicate the issue:
- Install the Toolforge CLIs inside a Build Service-created container on the Toolforge cluster
- There are a number of ways to do this, but likely the easiest is to put git+https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-cli and git+https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli in a requirements.txt file and then build the image
- Run `toolforge jobs list` inside the resulting container
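For reference, a minimal requirements.txt that pulls in both CLIs could look like this (the build is then started against the repository containing this file; the exact build command depends on your Toolforge CLI version):

```
git+https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-cli
git+https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli
```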
What happens?:
$ toolforge jobs list
Traceback (most recent call last):
  File "/layers/heroku_python/dependencies/bin/toolforge-jobs", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/tjf_cli/cli.py", line 1120, in main
    user = getpass.getuser()
           ^^^^^^^^^^^^^^^^^
  File "/layers/heroku_python/python/lib/python3.12/getpass.py", line 169, in getuser
    return pwd.getpwuid(os.getuid())[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'getpwuid(): uid not found: 55419'
What should have happened instead?:
$ toolforge jobs list
$
Software version: 16.0.12
The assumptions that the process runs as the tool user and that NSS passwd data can be fetched for that user were reasonable on the Toolforge bastions, and even in our legacy containers with NSS LDAP support, but the recommended modern container solution provides no passwd entry for the running uid, which makes this method of detecting the executing tool impossible.
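One way tjf_cli (or similar tools) could tolerate this is to treat the passwd lookup as fallible and fall back to something derivable inside any container. This is only a sketch under that assumption; `get_current_user` is a hypothetical helper, not the actual fix, and the fallback naming scheme is illustrative:

```python
import getpass
import os


def get_current_user() -> str:
    """Resolve the executing user without assuming an NSS passwd entry.

    getpass.getuser() first checks the LOGNAME/USER/LNAME/USERNAME
    environment variables and only then falls back to
    pwd.getpwuid(os.getuid()), which raises KeyError on Python 3.12
    (OSError on newer versions) when the uid has no passwd entry --
    exactly the situation inside a Build Service container.
    """
    try:
        return getpass.getuser()
    except (KeyError, OSError):
        # No env vars set and no passwd entry for this uid: fall back
        # to a synthetic name derived from the numeric uid so callers
        # still get a usable string instead of an unhandled traceback.
        return f"uid-{os.getuid()}"
```

A cleaner long-term fix would be to stop inferring the tool from the process's user entirely, but catching the lookup failure at least avoids the crash shown above.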
See also: T369569: `webservice` requires effective user to be the tool user and listed in NSS passwd data