default security group for new project missing ssh rules for bastion
Closed, Resolved (Public)

Description

@RoySmith's new spi-tools project started life with this default security group (3a66c94c-1b29-4d5b-804c-165298f25ef6):

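| Direction | Ether Type | IP Protocol | Port Range | Remote IP Prefix | Remote Security Group | Description |
| --------- | ---------- | ----------- | ---------- | ---------------- | --------------------- | ----------- |
| Egress    | IPv4       | Any         | Any        | 0.0.0.0/0        | -                     | -           |
| Egress    | IPv6       | Any         | Any        | ::/0             | -                     | -           |
| Ingress   | IPv4       | Any         | Any        | -                | default               | -           |
| Ingress   | IPv6       | Any         | Any        | -                | default               | -           |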

This is missing the expected rules to allow ssh via the bastions.

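| Direction | Ether Type | IP Protocol | Port Range | Remote IP Prefix | Remote Security Group | Description |
| --------- | ---------- | ----------- | ---------- | ---------------- | --------------------- | ----------- |
| Ingress   | IPv4       | TCP         | 22 (SSH)   | 172.16.0.0/21    | -                     | -           |
| Ingress   | IPv4       | TCP         | 22 (SSH)   | -                | default               | -           |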
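
For anyone wanting to check other projects for the same gap, here is a rough sketch of the comparison. It works on plain dicts rather than live API results. The field names loosely follow Neutron's security group rule attributes, but `port` and `remote_group` are simplifications chosen for this example, and the `rule` and `missing_rules` helpers are hypothetical.

```python
# Fields compared when matching rules. "port" and "remote_group" are
# simplified stand-ins for illustration, not real API field names.
FIELDS = ("direction", "ethertype", "protocol", "port",
          "remote_ip_prefix", "remote_group")


def rule(direction, ethertype, protocol=None, port=None,
         remote_ip_prefix=None, remote_group=None):
    """Build a rule dict; None stands for 'Any' or '-' in the tables above."""
    values = (direction, ethertype, protocol, port,
              remote_ip_prefix, remote_group)
    return dict(zip(FIELDS, values))


# The two SSH rules this task says a new project should start with.
EXPECTED_SSH_RULES = [
    rule("ingress", "IPv4", "tcp", 22, remote_ip_prefix="172.16.0.0/21"),
    rule("ingress", "IPv4", "tcp", 22, remote_group="default"),
]


def missing_rules(actual, expected=EXPECTED_SSH_RULES):
    """Return the expected rules that have no exact match in `actual`."""
    present = {tuple(r.get(f) for f in FIELDS) for r in actual}
    return [r for r in expected
            if tuple(r.get(f) for f in FIELDS) not in present]


# The default group that spi-tools started with, per the first table.
spi_tools_default = [
    rule("egress", "IPv4", remote_ip_prefix="0.0.0.0/0"),
    rule("egress", "IPv6", remote_ip_prefix="::/0"),
    rule("ingress", "IPv4", remote_group="default"),
    rule("ingress", "IPv6", remote_group="default"),
]

for r in missing_rules(spi_tools_default):
    print("missing:", r)
```

The match is deliberately exact: the broad "Any" ingress rule from the default group is not treated as a substitute for the explicit port 22 rules.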

Event Timeline

I have manually added the missing rules to unblock @RoySmith

Reedy renamed this task from "[Regression] default security group for new project missing ssh rules for bastion" to "default security group for new project missing ssh rules for bastion". Apr 7 2021, 12:08 AM
Reedy added a project: Regression.

Oddly, I don't think this is a result of the Ussuri upgrade. The project creation hook is failing before it gets to the part that creates security group rules; it fails during the LDAP group creation, like this:

Failed to create group cn=project-ussuriprojecttest1,ou=groups,dc=wikimedia,dc=org, attempt number 2: {'desc': 'Object class violation', 'info': "object class 'groupOfNames' requires attribute 'member'"} [('objectClass', [b'groupOfNames', b'posixGroup', b'top']), ('cn', [b'project-ussuriprojecttest1']), ('gidNumber', [b'52920'])]: ldap.OBJECT_CLASS_VIOLATION: {'desc': 'Object class violation', 'info': "object class 'groupOfNames' requires attribute 'member'"}

And, indeed, we are trying to create a group with no members, thanks to https://gerrit.wikimedia.org/r/c/operations/puppet/+/667423 which removed the default service users (e.g. novaadmin) from new projects.

So... probably this can be fixed by just skipping the group sync since there's nothing to sync. The ldap group should get created later on, when the first member is added to the project.
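
A minimal sketch of that idea, not the actual patch: build the LDAP add attributes only when there is at least one member, and otherwise signal the caller to skip the sync. The `build_group_attrs` helper and the member DN in the usage lines are hypothetical. The attribute list mirrors the one in the error above, plus the `member` attribute that `groupOfNames` requires.

```python
def build_group_attrs(cn, gid_number, member_dns):
    """Return attributes for an LDAP group add, or None if there are no members.

    groupOfNames requires at least one 'member' value, so an empty group
    cannot be created; returning None tells the caller to skip the sync.
    """
    if not member_dns:
        return None
    return [
        ("objectClass", [b"groupOfNames", b"posixGroup", b"top"]),
        ("cn", [cn.encode("utf-8")]),
        ("gidNumber", [str(gid_number).encode("utf-8")]),
        ("member", [dn.encode("utf-8") for dn in member_dns]),
    ]


# Brand-new project with no members yet: nothing to create.
print(build_group_attrs("project-ussuriprojecttest1", 52920, []))

# Once a first member exists (hypothetical DN), the group can be created.
attrs = build_group_attrs(
    "project-ussuriprojecttest1",
    52920,
    ["uid=exampleuser,ou=people,dc=wikimedia,dc=org"],
)
print(attrs)
```

In a real hook the returned list would be handed to whatever LDAP add call the module already uses; that wiring is left out here.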

Change 677419 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] wmfkeystonehooks ldap groups: Handle groups with no members

https://gerrit.wikimedia.org/r/677419

There is also something strange about security groups in general. The wmfkeystonehooks module failed on the empty LDAP group, but it retried and created it successfully (which I believe it has always done; probably not great, but OK). However, I combed through a lot of logs about the security group rules, and those are very strange.

An example: "Port 22 security rule already exists.: neutronclient.common.exceptions.Conflict: Security group rule already exists. Rule id is acd02d97-900d-4735-b409-864728593fae." The log is full of those. I went digging and discovered that no matter how you filter the requests, the API returns ALL projects' rules and security groups for novaadmin. This does not happen in Horizon, and Horizon works well, but I think novaadmin cannot interact with security groups by project.

There is quite a lot of backlog in IRC about what I found regarding that, including getting the actual requests with -vv args in the CLI. The requests included what I'd imagine to be the appropriate project IDs and names, but it acted like I'd asked for --all-projects.

An easy example:

[bstorm@cloudcontrol1004]:log $ sudo wmcs-openstack security group rule list --os-project-id spi-tools default
More than one SecurityGroup exists with the name 'default'.

If you search the OpenStack logs, you'll find that "Security group rule already exists" is the only error that directly relates, which is why I focused on the filtering problem.
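
Until the server-side filtering is understood, one possible workaround is to filter client-side. The sketch below assumes each returned security group is a dict carrying `name` and `project_id` keys. The `find_group` helper is hypothetical, and the sample data is made up apart from the spi-tools group ID quoted in the description.

```python
def find_group(groups, project_id, name="default"):
    """Pick the single group with this name in this project.

    `groups` is whatever the API returned, possibly for ALL projects.
    Raises LookupError when there is no match or more than one.
    """
    matches = [g for g in groups
               if g.get("project_id") == project_id and g.get("name") == name]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one {name!r} group in {project_id!r}, "
            f"found {len(matches)}"
        )
    return matches[0]


# Unfiltered listing: every project has its own 'default' group.
# The second entry (project and ID) is invented for illustration.
all_groups = [
    {"id": "3a66c94c-1b29-4d5b-804c-165298f25ef6",
     "name": "default", "project_id": "spi-tools"},
    {"id": "00000000-0000-0000-0000-000000000001",
     "name": "default", "project_id": "some-other-project"},
]

print(find_group(all_groups, "spi-tools")["id"])
```

Looking the group up by ID after this step avoids the ambiguous name lookup that produces the "More than one SecurityGroup exists" error.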

Change 677419 merged by Andrew Bogott:

[operations/puppet@production] wmfkeystonehooks ldap groups: Handle groups with no members

https://gerrit.wikimedia.org/r/677419

Andrew claimed this task.