
Captchas are broken in the beta cluster
Closed, Resolved (Public)

Description

Captchas are currently broken in the beta cluster. Viewing https://en.wikipedia.beta.wmflabs.org/w/index.php?title=Special:CreateAccount&returnto=Main+Page doesn't display any captcha. Special:Captcha/image complains that a CAPTCHA png file is not available in the global-swift-eqiad cluster: https://en.wikipedia.beta.wmflabs.org/w/index.php?title=Special:Captcha/image&wpCaptchaId=109323179:

image.png (236×2 px, 49 KB)

This prevents new accounts from being created on Beta and thus hinders Growth-Team's work on workflows involving user registration.

Event Timeline

I've spent some time investigating this problem. The issue is that beta's swift server no longer has the right swift key in its /etc/swift/proxy-server.conf file. Currently, the account definition section looks like this (note that the key is missing from every user_ line):

[filter:tempauth]
use = egg:swift#tempauth
token_life = 604800
user_mw_media =  .admin http://deployment-ms-fe04.deployment-prep.eqiad1.wikimedia.cloud/v1/AUTH_mw
user_mw_thumbor =   http://deployment-ms-fe04.deployment-prep.eqiad1.wikimedia.cloud/v1/AUTH_mw
user_netbox_attachments =  .admin http://deployment-ms-fe04.deployment-prep.eqiad1.wikimedia.cloud/v1/AUTH_netbox
user_pagecompilation_zim =  .admin http://deployment-ms-fe04.deployment-prep.eqiad1.wikimedia.cloud/v1/AUTH_pagecompilation
user_performance_arclamp =  .admin http://deployment-ms-fe04.deployment-prep.eqiad1.wikimedia.cloud/v1/AUTH_performance
user_phabricator_files =  .admin http://deployment-ms-fe04.deployment-prep.eqiad1.wikimedia.cloud/v1/AUTH_phab
user_swift_dispersion =  .admin http://deployment-ms-fe04.deployment-prep.eqiad1.wikimedia.cloud/v1/AUTH_dispersion

When I took beta's swift key from deployment-deploy03:/srv/mediawiki-staging/private/PrivateSettings.php ($wmgSwiftConfig), added it to the user_mw_media line (before .admin) and restarted the swift-proxy service, CAPTCHA started working.
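For anyone repeating this check, here is a minimal sketch of how the missing keys could be spotted automatically. It assumes the usual tempauth layout, where each user_ line is "key, then groups, then an optional storage URL", and treats a line as broken when its first token looks like a group (leading dot) or a URL. The function name, the sample text, and the example.invalid hostnames are hypothetical; the heuristic would misfire on a real key that happens to start with a dot.

```python
import configparser

# Hypothetical sample mirroring the shape of the broken section above.
SAMPLE = """[filter:tempauth]
use = egg:swift#tempauth
token_life = 604800
user_mw_media =  .admin http://example.invalid/v1/AUTH_mw
user_mw_thumbor =   http://example.invalid/v1/AUTH_mw
user_ok_user = s3cret .admin http://example.invalid/v1/AUTH_ok
"""


def find_users_missing_keys(conf_text, section="filter:tempauth"):
    """Return the user_* options whose value does not start with a key."""
    # Interpolation is disabled so that '%' in real config values is harmless.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)

    missing = []
    for option, value in parser.items(section):
        if not option.startswith("user_"):
            continue
        tokens = value.split()
        # Heuristic (an assumption): a first token that is a group name
        # (".admin") or a URL means the key itself is absent.
        if not tokens or tokens[0].startswith(".") or "://" in tokens[0]:
            missing.append(option)
    return missing
```

On a real host, the text of /etc/swift/proxy-server.conf would be passed in instead of SAMPLE.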

I've run puppet to clear my manual changes. CAPTCHAs still work for now, presumably because ConfirmEdit has them cached.

I'm not sure how to fix this problem in Puppet, or why it happened. In particular, I can't find any sign of the password in puppet on labs. I'd expect it under profile::swift::global_account_keys, but I can't find it there.

Hopefully this is enough troubleshooting info for someone with more Puppet knowledge to be able to fix this in a more permanent way.

According to the template, these come from the swift::proxy::accounts and swift::proxy::credentials arrays, which in turn come from profile::swift::accounts and profile::swift::global_account_keys. Accounts are set here in production and here for beta; keys are set in the private puppet repo for production and in cloud/instance-puppet for beta, in _.yaml and deployment-ms-fe.yaml. I think the problem is that the latter is taking priority over the former, and has no keys at all? The relevant commits are f18acee850 and 35eb65395 by @Andrew.
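To illustrate the suspected precedence problem, here is a rough sketch of a first-match lookup across two layers of settings. It is not Hiera itself, and the data shapes are assumptions: the real contents of deployment-ms-fe.yaml and _.yaml are not shown in this task, and the helper lookup_first is hypothetical. The point is only that a more specific file defining the key with an empty value would hide the keys defined in the less specific one.

```python
def lookup_first(key, layers):
    """Return (layer_name, value) from the first layer that defines key.

    `layers` is ordered from highest to lowest priority, mimicking a
    first-match lookup strategy.
    """
    for name, data in layers:
        if key in data:
            return name, data[key]
    raise KeyError(key)


# Hypothetical data: the prefix-specific file defines the key but holds no
# credentials, while the project-wide file holds a (redacted) entry.
layers = [
    ("deployment-ms-fe.yaml", {"profile::swift::global_account_keys": {}}),
    ("_.yaml", {"profile::swift::global_account_keys": {"mw_media": "<redacted>"}}),
]

winner, value = lookup_first("profile::swift::global_account_keys", layers)
# Here the empty mapping from the more specific file wins, so no keys are
# rendered into the config; removing that layer lets _.yaml take effect.
```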

Thanks for the info, @Tgr! I fixed the credentials structure in project puppet, removed the deployment-ms-fe.yaml entry and ran puppet on deployment-ms-fe04. The credentials then appeared in the expected place. After a service swift-proxy restart, CAPTCHAs started showing again.

Urbanecm_WMF claimed this task.
Urbanecm_WMF triaged this task as High priority.

Boldly resolving.

Ladsgroup subscribed.

SRE (to be exact, the data persistence team) doesn't maintain swift in the beta cluster. I try to spend some of my 10% time on it, but it's explicitly excluded from the team's responsibilities.

@Ladsgroup: Is there anyone specifically responsible or is this a similar situation to beta as a whole?

How swift credentials are done in prod was changed as part of T162123 to make it possible for our swiftrepl replacement to have access to the necessary credentials from puppet (rather than by hand deployment).