Unable to SSH into instances in project 'wikilink'
Closed, Resolved (Public)

Description

One of our project members (username: Jsn.sherman, shell name: jsn) cannot SSH into instances created in the project 'wikilink', but can SSH into instances in the project 'twl'. We're not sure why.

Event Timeline

Project membership missing from LDAP:

11:15:43 0 ✓ zhuyifei1999@tools-sgebastion-08: ~$ id jsn
uid=16502(jsn) gid=500(wikidev) groups=500(wikidev),52777(project-twl),50062(project-bastion)
11:15:47 0 ✓ zhuyifei1999@tools-sgebastion-08: ~$ getent group project-wikilink
project-wikilink:*:54031:samwalton9,novaadmin,crucio

... but it's on https://openstack-browser.toolforge.org/project/wikilink ?!
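For anyone wanting to script the same check, here is a minimal sketch that parses one line of `getent group` output and tests whether a shell name is listed as a member. The function names `parse_getent_group` and `is_member` are made up for illustration; only the standard `name:password:gid:members` group line format is assumed.

```python
def parse_getent_group(line):
    """Split one `getent group` line of the form name:password:gid:member1,member2,..."""
    name, _password, gid, members = line.strip().split(":", 3)
    # An empty member field yields [''], so filter out empty strings.
    member_list = [m for m in members.split(",") if m]
    return {"name": name, "gid": int(gid), "members": member_list}


def is_member(line, shell_name):
    """Return True if shell_name appears in the group's member list."""
    return shell_name in parse_getent_group(line)["members"]


# The line as returned on tools-sgebastion-08 in the output above.
ldap_line = "project-wikilink:*:54031:samwalton9,novaadmin,crucio"
print(is_member(ldap_line, "jsn"))
```

This only inspects the LDAP side; it says nothing about what Keystone believes, which is the other half of the mismatch.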

Sounds like more Keystone Vs LDAP problems. @Andrew ?

Mentioned in SAL (#wikimedia-cloud) [2020-04-16T12:27:21Z] <arturo> removing user/projectadmin jsn from the project and add it again (T250365)

aborrero claimed this task.
aborrero added a subscriber: aborrero.

Might be fixed now:

$ groups jsn
jsn : wikidev project-wikilink project-bastion project-twl
$ getent group project-wikilink
project-wikilink:*:54031:crucio,jsn,samwalton9,novaadmin

However, when removing/adding the user from the project I saw the following keystone warnings:

(keystone.common.wsgi): 2020-04-16 12:27:38,563 WARNING Could not find role: projectadmin.
(py.warnings): 2020-04-16 12:28:25,487 WARNING /usr/lib/python3/dist-packages/oslo_policy/policy.py:865: UserWarning: Policy identity:list_roles failed scope check. The token used to make the request was project scoped but the policy requires ['system'] scope. This behavior may change in the future where using the intended scope is required
  warnings.warn(msg)

Those warnings deserve another phab task I think.

For the record, I did this:

aborrero@cloudcontrol1004:~ $ sudo wmcs-openstack role remove --user jsn --project wikilink user
aborrero@cloudcontrol1004:~ $ sudo wmcs-openstack role remove --user jsn --project wikilink projectadmin
aborrero@cloudcontrol1004:~ $ sudo wmcs-openstack role add --user jsn --project wikilink projectadmin
aborrero@cloudcontrol1004:~ $ sudo wmcs-openstack role add --user jsn --project wikilink user
aborrero@cloudcontrol1004:~ $ sudo wmcs-openstack role assignment list --project wikilink  --names
+--------------+----------------------+-------+------------------+--------+-----------+
| Role         | User                 | Group | Project          | Domain | Inherited |
+--------------+----------------------+-------+------------------+--------+-----------+
| projectadmin | UY Scuti@Default     |       | wikilink@Default |        | False     |
| user         | UY Scuti@Default     |       | wikilink@Default |        | False     |
| projectadmin | Jsn.sherman@Default  |       | wikilink@Default |        | False     |
| user         | Jsn.sherman@Default  |       | wikilink@Default |        | False     |
| projectadmin | Novaadmin@Default    |       | wikilink@Default |        | False     |
| user         | Novaadmin@Default    |       | wikilink@Default |        | False     |
| observer     | Novaobserver@Default |       | wikilink@Default |        | False     |
| projectadmin | Samwalton9@Default   |       | wikilink@Default |        | False     |
| user         | Samwalton9@Default   |       | wikilink@Default |        | False     |
+--------------+----------------------+-------+------------------+--------+-----------+
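If someone wants to check such a table programmatically, here is a rough sketch that groups the roles by user. It assumes the ASCII table layout shown above, with Role in the first column and User in the second; `roles_by_user` is a hypothetical helper, not part of wmcs-openstack, and the sample is a shortened excerpt of the table.

```python
from collections import defaultdict


def roles_by_user(table_text):
    """Map each user (without the @Domain suffix) to the set of roles listed for them."""
    roles = defaultdict(set)
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ border lines and blanks
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) < 2 or cells[0] == "Role":
            continue  # skip the header row
        role, user = cells[0], cells[1]
        roles[user.rsplit("@", 1)[0]].add(role)
    return dict(roles)


# Shortened excerpt of the table above.
sample = """\
+--------------+----------------------+-------+------------------+--------+-----------+
| Role         | User                 | Group | Project          | Domain | Inherited |
+--------------+----------------------+-------+------------------+--------+-----------+
| projectadmin | Jsn.sherman@Default  |       | wikilink@Default |        | False     |
| user         | Jsn.sherman@Default  |       | wikilink@Default |        | False     |
| observer     | Novaobserver@Default |       | wikilink@Default |        | False     |
+--------------+----------------------+-------+------------------+--------+-----------+
"""

for user, roles in sorted(roles_by_user(sample).items()):
    print(user, sorted(roles))
```

Note that Keystone lists the wiki username (Jsn.sherman) while LDAP groups list the shell name (jsn), so comparing the two sides needs a separate username mapping that this sketch does not attempt.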

Closing ticket now, please reopen if required.

(keystone.common.wsgi): 2020-04-16 12:27:38,563 WARNING Could not find role: projectadmin.

I am pretty sure that this is a red herring. It appears when roles are referred to by name rather than by UUID; I suspect that Keystone does a UUID lookup first (which fails) and then tries again with the name. I'm digging into the code now to see if that's right.

Does anyone know when (date/time) jsn was added to the project? This sounds similar to the bugs that I opened T249636 to look into (T249636: Audit Toolforge account approvals between 2020-03-30 and 2020-04-07 to ensure that database and LDAP state agree).

FWIW this is fixed now. I'm now able to shell in. Thanks all!