Modernize openstack rbac
Closed, Resolved | Public

Description

codfw1-dev:

  • replace 'observer' roles with 'reader' roles
  • replace 'user' roles with 'reader' roles
  • replace 'projectadmin' role with 'member' role
  • remove old 'user' and 'projectadmin' roles

eqiad1:

  • replace 'observer' roles with 'reader' roles
  • replace 'user' roles with 'reader' roles
  • replace 'projectadmin' role with 'member' role (?)
  • rename old 'user' and 'projectadmin' roles (preserves a revert option)
  • remove old 'user' and 'projectadmin' roles

all together:

  • enforce_scope = true
  • enforce_new_defaults = true
  • remove projectadmin-specific policy rules
  • remove observer-specific policy rules
  • review and remove as many custom policy rules as possible
  • replace most or all uses of 'novaadmin' with service roles with global admin

Details

Related Changes in Gerrit:
Repo | Branch | Lines +/-
operations/puppet | production | +8 -8
operations/puppet | production | +5 -5
labs/private | master | +2 -0
operations/puppet | production | +6 -6
operations/puppet | production | +5 -2
operations/puppet | production | +0 -1
operations/puppet | production | +2 -2
operations/puppet | production | +0 -7
operations/puppet | production | +15 -39
operations/puppet | production | +3 -33
operations/puppet | production | +0 -25
operations/puppet | production | +4 -0
operations/puppet | production | +0 -8
operations/puppet | production | +0 -31
operations/puppet | production | +40 -51
operations/puppet | production | +8 -14
operations/puppet | production | +369 -579
operations/puppet | production | +1 -1
operations/puppet | production | +1 -1
operations/puppet | production | +1 -1
operations/puppet | production | +1 -1
operations/puppet | production | +1 -1
operations/puppet | production | +68 -8
operations/puppet | production | +15 -65
operations/puppet | production | +2 -0
operations/puppet | production | +0 -19
operations/puppet | production | +1 -1
operations/puppet | production | +1 -1
operations/puppet | production | +1 -1
operations/puppet | production | +3 -0
operations/puppet | production | +2 -1
operations/puppet | production | +19 -2
operations/puppet | production | +0 -1
operations/puppet | production | +2 -2
operations/puppet | production | +26 -22
operations/puppet | production | +2 -2
operations/puppet | production | +4 -0
operations/puppet | production | +1 -1
operations/puppet | production | +6 -45
operations/puppet | production | +25 -73
operations/puppet | production | +3 -3
operations/puppet | production | +2 -21
operations/puppet | production | +4 -4
operations/puppet | production | +5 -5
operations/puppet | production | +8 -8
operations/puppet | production | +6 -6
operations/puppet | production | +28 -23
operations/puppet | production | +10 -0
operations/puppet | production | +12 -0
operations/puppet | production | +24 -25
operations/puppet | production | +2 -2
operations/puppet | production | +36 -0
operations/puppet | production | +4 -4
operations/puppet | production | +5 -1
operations/puppet | production | +4 -2
labs/private | master | +2 -0
operations/puppet | production | +7 -2
operations/puppet | production | +77 -61
operations/puppet | production | +2 -0
operations/puppet | production | +512 -360
cloud/wmcs-cookbooks | main | +8 -8
operations/puppet | production | +2 -2
operations/puppet | production | +2 -2
operations/puppet | production | +2 -2
labs/striker | master | +32 -20
openstack/horizon/wmf-sudo-dashboard | main | +2 -2
openstack/horizon/wmf-member-dashboard | main | +3 -3
openstack/horizon/wmf-member-dashboard | main | +16 -15
operations/puppet | production | +136 -137
operations/puppet | production | +45 -42
operations/puppet | production | +10 -10
operations/puppet | production | +5 -0

Event Timeline

Change 916876 merged by Andrew Bogott:

[operations/puppet@production] mwopenstackclient: better support projectless auth

https://gerrit.wikimedia.org/r/916876

Change 917390 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] codfw1dev: enforce scope and default policies

https://gerrit.wikimedia.org/r/917390

Change 917390 merged by Andrew Bogott:

[operations/puppet@production] codfw1dev: enforce scope and default policies

https://gerrit.wikimedia.org/r/917390

Change 917396 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] neutron policy.yaml: don't override network_device policy

https://gerrit.wikimedia.org/r/917396

Change 917396 merged by Andrew Bogott:

[operations/puppet@production] neutron policy.yaml: don't override network_device policy

https://gerrit.wikimedia.org/r/917396

Change 917986 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] clouds.yaml: specify system_scope for special system sections

https://gerrit.wikimedia.org/r/917986

Change 917986 merged by Andrew Bogott:

[operations/puppet@production] clouds.yaml: specify system_scope for special system sections

https://gerrit.wikimedia.org/r/917986

Change 917993 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] clouds.yaml: further fixes to support system scope sections

https://gerrit.wikimedia.org/r/917993

Change 917993 merged by Andrew Bogott:

[operations/puppet@production] clouds.yaml: further fixes to support system scope sections

https://gerrit.wikimedia.org/r/917993

Change 917994 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] clouds.yaml: yet further fixes to support system scope sections

https://gerrit.wikimedia.org/r/917994

Change 917994 merged by Andrew Bogott:

[operations/puppet@production] clouds.yaml: yet further fixes to support system scope sections

https://gerrit.wikimedia.org/r/917994

Change 917995 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] clouds.yaml: and yet still further fixes to support system scope sections

https://gerrit.wikimedia.org/r/917995

Change 917995 merged by Andrew Bogott:

[operations/puppet@production] clouds.yaml: and yet still further fixes to support system scope sections

https://gerrit.wikimedia.org/r/917995

https://governance.openstack.org/tc/goals/selected/consistent-and-secure-rbac.html#the-issues-we-are-facing-with-scope-concept <- implies that enabling scope now is premature and may never be necessary. I'm torn between mourning the lost effort and cheering the simpler model

Change 922899 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] enforce_policy_scope: false in codfw1dev

https://gerrit.wikimedia.org/r/922899

Change 922899 merged by Andrew Bogott:

[operations/puppet@production] enforce_policy_scope: false in codfw1dev

https://gerrit.wikimedia.org/r/922899

Change 922904 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] enforce_new_policy_defaults: false in codfw1dev

https://gerrit.wikimedia.org/r/922904

Change 922904 merged by Andrew Bogott:

[operations/puppet@production] enforce_new_policy_defaults: false in codfw1dev

https://gerrit.wikimedia.org/r/922904

The openstack rbac work that I've been doing[0] has hit some serious roadbumps, but I'm swiftly approaching a stopping point. Y'all are long overdue for an update, so here's a summary of where I'm at.

I've been trying to catch up with the official upstream recommendations about roles and scope[1]. Those docs suggest a new two-dimensional auth model, where any given token has a 'role' (reader/member/admin) and a 'scope' (project/domain/system). Roles are progressive (e.g. a member role has all the rights of a reader), but scope is not, so a system token can only do system things and NOT domain or project things.

That latter fact (that a system token can't do project-scoped work) has been making me (and anyone using codfw1dev) miserable as we can no longer have an auth model that allows one account and one token to do all the things we're used to doing.
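The role/scope distinction described above can be sketched as a toy model. This is not oslo.policy code; the `Token` class, the `allowed` function and the numeric ranks are hypothetical names used only to illustrate why a system-scoped admin token fails a project-scoped check once scope is enforced.

```python
from dataclasses import dataclass

# Roles are ordered: each role implies everything below it.
ROLE_RANK = {"reader": 0, "member": 1, "admin": 2}
SCOPES = {"project", "domain", "system"}


@dataclass(frozen=True)
class Token:
    role: str
    scope: str


def allowed(token: Token, required_role: str, required_scope: str) -> bool:
    """Return True if the token satisfies a rule needing this role and scope."""
    if token.scope not in SCOPES or required_scope not in SCOPES:
        raise ValueError("unknown scope")
    # Roles are progressive: admin >= member >= reader.
    role_ok = ROLE_RANK[token.role] >= ROLE_RANK[required_role]
    # Scopes are not: a system token does not match a project rule.
    scope_ok = token.scope == required_scope
    return role_ok and scope_ok


system_admin = Token(role="admin", scope="system")
project_member = Token(role="member", scope="project")

print(allowed(project_member, "reader", "project"))  # True
print(allowed(system_admin, "reader", "project"))    # False
```

The second call is the painful case: the strongest role still fails because the scope does not match.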

I've recently been made aware of a different work-in-progress document on this topic [2] that reiterates that last point. EVERYONE turns out to hate the scope model because it breaks all existing 'one account to rule them all' workflows.

So! There are four basic topic areas here, which I'll take one at a time:

  • Role renames
  • clouds.yaml
  • Supporting token scope
  • Supporting the 'new policy defaults'

role renames

Upstream recommends that we have three basic user roles: reader, member, and admin. Mercifully, these roles mapped fairly closely to our old observer/user/admin roles. I've renamed everything to conform with upstream, so now our cloud uses reader/member/admin just like upstream. That has gone fairly well, and Horizon already reflects the new names. There are likely obsolete docs on Wikitech using the old names, but I've updated the ones I could find.

TODO:

  • keep an eye out for doc references to the old role names (user/projectadmin) and search/replace (with reader/member)

clouds.yaml

We rely heavily on environment settings for OpenStack config and auth, mostly via the novaenv.sh script which sets up the environment.

The new/right way to set up config and auth is clouds.yaml. I've set up clouds.yaml in most places to duplicate the use of the existing env scripts. So, for instance, on a VM we have /etc/openstack/clouds.yaml with an 'observer' section that provides the same creds as observerenv.sh. On cloudcontrols we have ~root/.config/openstack/clouds.yaml with a 'novaobserver' section.

Going forward:

  • it's best not to 'source novaenv.sh'; instead, either add "--os-cloud novaadmin" to your command lines or 'export OS_CLOUD=novaobserver'.
  • setting things in your environment (e.g. OS_PROJECT_ID=xxx) won't work, as clouds.yaml takes precedence. Train your fingers (and rewrite code) to use openstack flags instead ("--project xxx").
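
For scripts that still shell out to the CLI, the two points above might look roughly like the sketch below. The helper name `openstack_invocation` is hypothetical, and removing leftover `OS_*` variables is a design choice of this sketch rather than something the CLI requires. Whether a given subcommand accepts `--project` depends on the subcommand. Nothing is executed here; the function only builds the arguments and environment.

```python
import os


def openstack_invocation(cloud, subcommand, project=None, base_env=None):
    """Build (argv, env) for an openstack CLI call driven by clouds.yaml.

    Nothing is executed; pass the result to subprocess.run() if wanted.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Drop leftover OS_* variables from novaenv.sh-style scripts so the
    # named clouds.yaml section is the only source of auth settings.
    for name in [k for k in env if k.startswith("OS_")]:
        del env[name]
    env["OS_CLOUD"] = cloud

    argv = ["openstack", *subcommand]
    if project is not None:
        # Pass the project as a flag rather than via OS_PROJECT_ID.
        argv += ["--project", project]
    return argv, env


argv, env = openstack_invocation(
    "novaobserver",
    ["server", "list"],
    project="xxx",
    base_env={"PATH": "/usr/bin", "OS_PROJECT_ID": "stale"},
)
print(argv)
print(env)
```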

I've already updated many of our workflows (mwopenstackclients, cookbooks, etc) to use clouds.yaml.

TODO:

  • train our fingers to use OS_CLOUD rather than novaenv.sh
  • Document new cli workflows T336670

token scope

It turns out that essentially all cloud admins worldwide hate the new scope model. Upstream devs are responding to this by leaving the scope implementation in place (for edge cases that require it, most significantly OpenStack Ironic, which we don't use) and otherwise declaring every API to work with a project token.

That means that in the long run this change will be largely moot for us. I'm going to roll back whatever work I've done to try to adopt the new model and we'll all just have to suffer through the deprecation warnings (which are now themselves deprecated) until release B or C when things stabilize.

TODO:

  • leave oslo_policy->enforce_scope = false for now (and possibly forever)
  • ignore policy deprecation warnings about scope

new policy defaults

The current policy defaults seem to assume enforce_scope = true; when I set oslo_policy->enforce_new_defaults = True without enforce_scope turned on, I see a fair number of incoherent failures.

TODO:

  • leave oslo_policy->enforce_new_defaults = false for now, revisit after the scope rework finishes in B or C.
  • ignore policy deprecation warnings about new defaults, for now
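
The two settings discussed in the last two sections live in the `[oslo_policy]` section of each service's config file. The sketch below reads them from an INI-style string with the standard library; `SAMPLE_CONF` is a hypothetical excerpt, not a copy of any real file, and `policy_flags` is an illustrative helper. Treating a missing option as false is an assumption of this sketch and may not match the upstream default in a given release.

```python
import configparser

# Hypothetical excerpt of a service config file (e.g. a nova.conf).
SAMPLE_CONF = """
[oslo_policy]
enforce_scope = false
enforce_new_defaults = false
"""


def policy_flags(conf_text):
    """Return the two oslo_policy booleans from an INI-style config string."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    section = "oslo_policy"
    return {
        # fallback=False is this sketch's assumption for a missing option.
        "enforce_scope": parser.getboolean(
            section, "enforce_scope", fallback=False
        ),
        "enforce_new_defaults": parser.getboolean(
            section, "enforce_new_defaults", fallback=False
        ),
    }


print(policy_flags(SAMPLE_CONF))
```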

Thank you all for your patience with the ever-shifting ground on this topic.

[0] https://phabricator.wikimedia.org/T330759

[1] https://docs.openstack.org/nova/latest/configuration/policy-concepts.html

[2] https://governance.openstack.org/tc/goals/selected/consistent-and-secure-rbac.html#the-issues-we-are-facing-with-scope-concept

Change 923696 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] clouds.yaml: remove keystoneadmin section

https://gerrit.wikimedia.org/r/923696

Change 923696 merged by Andrew Bogott:

[operations/puppet@production] clouds.yaml: remove keystoneadmin section

https://gerrit.wikimedia.org/r/923696

Change 916590 merged by Andrew Bogott:

[operations/puppet@production] wmcs prometheus: include 'OPENSTACK->CLOUD' in prometheus config

https://gerrit.wikimedia.org/r/916590

Change 916588 merged by Andrew Bogott:

[operations/puppet@production] grid_configurator: use mwopenstackclients library

https://gerrit.wikimedia.org/r/916588

Andrew changed the task status from Open to Stalled. Jul 27 2023, 4:11 PM

This is stalled pending some decisions from upstream.

Change #1119715 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Openstack codfw1dev: enforce_policy_scope: true

https://gerrit.wikimedia.org/r/1119715

Change #1119715 merged by Andrew Bogott:

[operations/puppet@production] Openstack codfw1dev: enforce_policy_scope: true

https://gerrit.wikimedia.org/r/1119715

The above patch didn't break any of my known tests. I guess now we wait a bit and see if anyone runs into codfw1dev access surprises.

Change #1135819 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] eqiad1 openstack: enforce_policy_scope=True

https://gerrit.wikimedia.org/r/1135819

Change #1135820 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] codfw1dev: enforce_new_policy_defaults=true

https://gerrit.wikimedia.org/r/1135820

Change #1135819 merged by Andrew Bogott:

[operations/puppet@production] eqiad1 openstack: enforce_policy_scope=True

https://gerrit.wikimedia.org/r/1135819

Change #1135820 merged by Andrew Bogott:

[operations/puppet@production] codfw1dev: enforce_new_policy_defaults=true

https://gerrit.wikimedia.org/r/1135820

Change #1140767 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] codfw1dev backups: enforce_policy_scope: true

https://gerrit.wikimedia.org/r/1140767

Change #1140768 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] eqiad1: enforce_new_policy_defaults: True

https://gerrit.wikimedia.org/r/1140768

Change #1140767 merged by Andrew Bogott:

[operations/puppet@production] codfw1dev backups: enforce_policy_scope: true

https://gerrit.wikimedia.org/r/1140767

Change #1140773 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] cloud-vps: clean up a couple of hiera settings

https://gerrit.wikimedia.org/r/1140773

Change #1140768 merged by Andrew Bogott:

[operations/puppet@production] eqiad1: enforce_new_policy_defaults: True

https://gerrit.wikimedia.org/r/1140768

Change #1140773 merged by Andrew Bogott:

[operations/puppet@production] cloud-vps: clean up a couple of hiera settings

https://gerrit.wikimedia.org/r/1140773

Change #1141977 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] keystone: update policy.yaml files

https://gerrit.wikimedia.org/r/1141977

Change #1141978 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] nova policy.yaml: update with advice from oslopolicy-validator

https://gerrit.wikimedia.org/r/1141978

Change #1141979 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] nova policy.json: remove a bunch of redundant rules

https://gerrit.wikimedia.org/r/1141979

Change #1141980 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] glance: update policy.yaml

https://gerrit.wikimedia.org/r/1141980

Change #1141981 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Cinder: explicitly use new policy rules

https://gerrit.wikimedia.org/r/1141981

Change #1141982 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] cinder policy.yaml: update, remove redundant rules

https://gerrit.wikimedia.org/r/1141982

Change #1141983 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Neutron: update policy rules

https://gerrit.wikimedia.org/r/1141983

Change #1141984 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Designate: update policy.yaml

https://gerrit.wikimedia.org/r/1141984

taavi changed the task status from Stalled to Open. May 6 2025, 9:18 AM

Change #1141977 merged by Andrew Bogott:

[operations/puppet@production] keystone: update policy.yaml files

https://gerrit.wikimedia.org/r/1141977

Change #1141978 merged by Andrew Bogott:

[operations/puppet@production] nova policy.yaml: update with advice from oslopolicy-validator

https://gerrit.wikimedia.org/r/1141978

Change #1141979 merged by Andrew Bogott:

[operations/puppet@production] nova policy.json: remove a bunch of redundant rules

https://gerrit.wikimedia.org/r/1141979

Change #1141980 merged by Andrew Bogott:

[operations/puppet@production] glance: update policy.yaml

https://gerrit.wikimedia.org/r/1141980

Change #1141981 merged by Andrew Bogott:

[operations/puppet@production] Cinder: explicitly use new policy rules

https://gerrit.wikimedia.org/r/1141981

Change #1141982 merged by Andrew Bogott:

[operations/puppet@production] cinder policy.yaml: update, remove redundant rules

https://gerrit.wikimedia.org/r/1141982

Change #1141983 merged by Andrew Bogott:

[operations/puppet@production] Neutron: update policy rules

https://gerrit.wikimedia.org/r/1141983

Change #1141984 merged by Andrew Bogott:

[operations/puppet@production] Designate: update policy.yaml

https://gerrit.wikimedia.org/r/1141984

Change #1142683 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Remove oslo_policy section

https://gerrit.wikimedia.org/r/1142683

Change #1142683 merged by Andrew Bogott:

[operations/puppet@production] Remove oslo_policy section

https://gerrit.wikimedia.org/r/1142683

Change #1143612 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] cinder: use 'cinder' service user rather than 'novaadmin'

https://gerrit.wikimedia.org/r/1143612

Change #1143612 abandoned by Andrew Bogott:

[operations/puppet@production] cinder: use 'cinder' service user rather than 'novaadmin'

Reason:

I don't think we're ready for this yet

https://gerrit.wikimedia.org/r/1143612

Change #1161113 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Openstack [service_user] config: use internal endpoint for service users

https://gerrit.wikimedia.org/r/1161113

Change #1161114 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Openstack [keystone_authtoken]: remove auth_url setting

https://gerrit.wikimedia.org/r/1161114

Change #1161115 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] cinder: use 'cinder' service user rather than 'novaadmin'

https://gerrit.wikimedia.org/r/1161115

Change #1161113 merged by Andrew Bogott:

[operations/puppet@production] Openstack [service_user] config: use internal endpoint for service users

https://gerrit.wikimedia.org/r/1161113

Change #1161114 merged by Andrew Bogott:

[operations/puppet@production] Openstack [keystone_authtoken]: remove auth_url setting

https://gerrit.wikimedia.org/r/1161114

Change #1161115 merged by Andrew Bogott:

[operations/puppet@production] cinder: use 'cinder' service user rather than 'novaadmin'

https://gerrit.wikimedia.org/r/1161115

Change #1162060 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[labs/private@master] Added stand-in passwords for nova service user

https://gerrit.wikimedia.org/r/1162060

Change #1162063 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Openstack Nova: use 'novaservice' service user rather than novaadmin

https://gerrit.wikimedia.org/r/1162063

Change #1162060 merged by Andrew Bogott:

[labs/private@master] Added stand-in passwords for nova service user

https://gerrit.wikimedia.org/r/1162060

Change #1162063 merged by Andrew Bogott:

[operations/puppet@production] Openstack Nova: use 'novaservice' service user rather than novaadmin

https://gerrit.wikimedia.org/r/1162063

Change #1162077 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Openstack Nova: use 'novaservice' service user rather than novaadmin

https://gerrit.wikimedia.org/r/1162077

Change #1162077 merged by Andrew Bogott:

[operations/puppet@production] Openstack Nova: use 'novaservice' service user rather than novaadmin

https://gerrit.wikimedia.org/r/1162077

Keystone logs are still fairly full of warnings like

"DeprecationWarning: Policy enforcement is depending on the value of token."

however, if I delete /all/ keystone custom policies and let it just rely on code defaults, it still throws those warnings. So I guess they'll always be with us.

The Keystone test suite silences these via warnings.filterwarnings: https://github.com/openstack/keystone/blob/70d34239e2cc86c363fb69808559bf7dea8f3432/keystone/tests/unit/ksfixtures/warnings.py#L39
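
The same mechanism can be shown in isolation. This is a minimal sketch, not a copy of the Keystone fixture: the helper names are hypothetical, and the message prefix is taken from the log line quoted above. It ignores only that one DeprecationWarning and lets every other one through.

```python
import warnings

TOKEN_WARNING_PREFIX = "Policy enforcement is depending on the value of token"


def silence_token_policy_warning():
    """Ignore only the 'value of token' DeprecationWarning."""
    warnings.filterwarnings(
        "ignore",
        # Regex matched against the start of the warning message.
        message=TOKEN_WARNING_PREFIX,
        category=DeprecationWarning,
    )


def emitted_after_filter(messages):
    """Emit each message as a DeprecationWarning; return those that got through."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        silence_token_policy_warning()
        for text in messages:
            warnings.warn(text, DeprecationWarning)
    return [str(w.message) for w in caught]


print(emitted_after_filter([
    "Policy enforcement is depending on the value of token.",
    "some other deprecation",
]))
```

Because the filter is installed inside `catch_warnings()`, it is undone when the block exits; a long-running service would call `silence_token_policy_warning()` once at startup instead.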

The indirection of oslo.context makes it difficult to see exactly where this is actually being triggered in the runtime system: https://github.com/openstack/oslo.context/blob/0184e52950cf59e6300f28473eb956adbd43d266/oslo_context/context.py#L101-L106

The token key that is being complained about is set up in https://github.com/openstack/keystone/blob/70d34239e2cc86c363fb69808559bf7dea8f3432/keystone/common/context.py#L41-L66. The way this object works, anything that uses token in a policy lookup through a keystone.common.context.RequestContext object will trigger the warning.

Climbing out of this rabbit hole for now. We might be able to figure out more in a controlled runtime where we set PYTHONWARNINGS=error::DeprecationWarning in the environment to turn the warnings into exceptions and get a stack trace.
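
A rough in-process counterpart of that environment variable, for experimenting outside a full Keystone deployment, could look like the sketch below. `trace_deprecation` and `policy_lookup` are hypothetical names; `policy_lookup` is only a stand-in for whatever code path really touches the token key.

```python
import traceback
import warnings


def trace_deprecation(func):
    """Run func with DeprecationWarning promoted to an exception.

    Returns the formatted traceback if a DeprecationWarning was raised,
    or None if func completed without one.
    """
    with warnings.catch_warnings():
        # Roughly what PYTHONWARNINGS=error::DeprecationWarning does.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func()
        except DeprecationWarning:
            return traceback.format_exc()
    return None


def policy_lookup():
    # Stand-in for the code path that reads 'token' during a policy check.
    warnings.warn(
        "Policy enforcement is depending on the value of token.",
        DeprecationWarning,
    )


print(trace_deprecation(policy_lookup))
```

The returned traceback names the calling frames, which is the information the log line alone does not give.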

I think I care about deprecation warnings when they apply to our custom policies, but don't care when keystone is issuing warnings about policies that shipped directly from keystone upstream. I'm happy assuming they're approximately a 'note to self' from the keystone team and ignoring them unless you think I'm missing something.