Cloud VPS project creation cookbook times out really often
Closed, Resolved, Public

Description

The last few times we've created a Cloud VPS project via the cookbook, the creation has failed and left Tofu's state in need of manual fixing. This raises several questions:

  • Why is creating projects this slow?
  • Can we make it faster?
  • If not, is there a timeout that needs to be raised?
  • Can we gracefully handle the failure and retry in Tofu (or automatically handle the required state import)?
  • Is Tofu really the smartest way to handle these? (T398285)

Event Timeline

I'm looking for a few examples of this on logstash.

On July 4, the keystone hooks for the project 'wikidata-deleted' took 2:09, with the majority of the time spent in DNS domain creation.
On July 2, 'dumpstorrents' took 2:23, again with most of the time (more than 2 minutes) spent in domain creation.
On June 12, 'zuul' took 1:32.
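
To make the timings above easier to compare against a timeout, here is a minimal Python sketch that converts them to seconds. The 120-second limit in the usage note is purely an assumed figure for illustration; the actual proxy timeout is not stated in this task.

```python
def to_seconds(duration: str) -> int:
    """Convert an 'M:SS' duration string into whole seconds."""
    minutes, seconds = duration.split(":")
    return int(minutes) * 60 + int(seconds)


# Hook durations quoted above, keyed by project name.
observed = {"wikidata-deleted": "2:09", "dumpstorrents": "2:23", "zuul": "1:32"}
seconds = {name: to_seconds(d) for name, d in observed.items()}


def exceeding(limit_s: int) -> list[str]:
    """Return the projects whose hook run took longer than limit_s seconds."""
    return sorted(name for name, s in seconds.items() if s > limit_s)
```

With an assumed limit of 120 seconds, `exceeding(120)` lists 'dumpstorrents' and 'wikidata-deleted' but not 'zuul', so a timeout in that range would fail some creations and not others.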

Meanwhile, projects created by Magnum (which don't get DNS entries) take less than 2 seconds to create.

So, if the slowdown is happening during the actual project creation stage, domain creation (or more likely, domain transfer between projects) is the slow bit.

It's also possible that what looks like a timeout during creation is actually a timeout during the addition of an initial user to a project (which would be quite slow if the user is being added to the bastion project for the first time.)

@taavi, are you seeing delays of more than 2 or so minutes? And do you happen to know which stage of the cookbook is failing? And is the timeout happening in the API call, or happening because tofu is impatient?

> @taavi, are you seeing delays of more than 2 or so minutes?

Yes.

> And do you happen to know which stage of the cookbook is failing? And is the timeout happening in the API call, or happening because tofu is impatient?

The Tofu run fails to create the project because HAProxy returns an HTTP 504 error (iirc) for the keystone create-project API call.
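
If the 504 comes from the proxy giving up while the keystone hooks are still running, the project may still appear a little later. One way a client could handle that is to treat the 504 as "outcome unknown" and poll for the project instead of failing straight away. The Python sketch below only illustrates that idea and is not how the cookbook or Tofu currently behaves: `create` and `lookup` are hypothetical caller-supplied callables, `GatewayTimeout` is a hypothetical exception, and the default wait and interval values are arbitrary assumptions.

```python
import time
from typing import Callable, Optional


class GatewayTimeout(Exception):
    """Hypothetical error the caller's create function raises on an HTTP 504."""


def create_or_wait(
    create: Callable[[], str],
    lookup: Callable[[], Optional[str]],
    wait_s: float = 300.0,      # assumed upper bound on how long to keep polling
    interval_s: float = 10.0,   # assumed delay between lookups
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Create a project, tolerating a gateway timeout on the create call.

    Returns the project ID from create(), or, if create() raised
    GatewayTimeout, the ID found by polling lookup() until it appears.
    """
    try:
        return create()
    except GatewayTimeout:
        # The backend may still be finishing; do not re-send the create call.
        pass

    waited = 0.0
    while True:
        project_id = lookup()
        if project_id is not None:
            return project_id
        if waited >= wait_s:
            raise TimeoutError("project did not appear after gateway timeout")
        sleep(interval_s)
        waited += interval_s
```

Polling rather than re-sending the create request avoids racing a creation that is still in progress. Once the ID is known, the existing project could in principle be brought under management with `tofu import` rather than fixed by hand, though whether that fits the actual repository layout is not something this task establishes.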

Change #1182188 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Keystone hooks: speed up domain creation

https://gerrit.wikimedia.org/r/1182188

Change #1182188 merged by Andrew Bogott:

[operations/puppet@production] Keystone hooks: speed up domain creation

https://gerrit.wikimedia.org/r/1182188

I believe this to be fixed.