
502 Bad Gateway on Beta Cluster
Closed, Resolved (Public)

Event Timeline

Paladox triaged this task as High priority. (Edited) May 1 2017, 4:29 PM
Paladox updated the task description.
Paladox subscribed.

This is what shinken reported on #wikimedia-releng:

<shinken-wm> PROBLEM - English Wikipedia Mobile Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - string 'Wikipedia' not found on 'https://en.m.wikipedia.beta.wmflabs.org:443/wiki/Main_Page?debug=true' - 325 bytes in 0.015 second response time
[17:15:19] <shinken-wm> PROBLEM - English Wikipedia Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - string 'Wikipedia' not found on 'https://en.wikipedia.beta.wmflabs.org:443/wiki/Main_Page?debug=true' - 325 bytes in 0.033 second response time
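
To make the two alerts above easier to read, here is a minimal sketch of the kind of check they describe: a response is flagged when the status is an error or when the expected string is missing from the body. This is not shinken's or the monitoring plugin's actual implementation. The names `CheckResult` and `evaluate_response` are hypothetical, and treating every status of 400 or above as critical is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    state: str    # "OK" or "CRITICAL"
    message: str  # human-readable summary, shaped like the alert text


def evaluate_response(status_code: int, reason: str, body: str, expected: str) -> CheckResult:
    """Classify one HTTP response (hypothetical helper, not the real plugin)."""
    problems = []
    # Assumption: any 4xx/5xx status counts as a failure.
    if status_code >= 400:
        problems.append(f"HTTP/1.1 {status_code} {reason}")
    # The alerts look for the literal string 'Wikipedia' in the page body.
    if expected not in body:
        problems.append(f"string '{expected}' not found")
    if problems:
        return CheckResult("CRITICAL", "HTTP CRITICAL: " + " - ".join(problems))
    return CheckResult("OK", f"HTTP OK: HTTP/1.1 {status_code} {reason}")


if __name__ == "__main__":
    # A 502 error page does not contain 'Wikipedia', so both conditions fire.
    result = evaluate_response(502, "Bad Gateway", "<html>502 Bad Gateway</html>", "Wikipedia")
    print(result.state, "-", result.message)
```

Both conditions appear in the alerts because the 502 error page returned by the cache host is only 325 bytes and does not contain the expected string.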

EddieGP raised the priority of this task from High to Unbreak Now!. May 1 2017, 4:38 PM
EddieGP subscribed.

The whole beta cluster is down. I know that's not production, but it's still something to look into ASAP.

thcipriani claimed this task.
thcipriani subscribed.

For some reason https://gerrit.wikimedia.org/r/#/c/350505/3/modules/varnish/templates/text-common.inc.vcl.erb caused reload-vcl to fail on deployment-cache-text04 during a puppet run.

Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: Command failed with error code 106   
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: Message from VCC-compiler:           
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: Unused probe varnish, defined:       
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: ('/etc/varnish/wikimedia-common_text-backend.inc.vcl' Line 59 Pos 7)
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: probe varnish {                      
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: ------#######--                      
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns:                                      
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: (That was just a warning)            
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: Unused acl wikimedia_nets, defined:  
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: ('/etc/varnish/wikimedia-common_text-backend.inc.vcl' Line 28 Pos 5)
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: acl wikimedia_nets {                 
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: ----##############--                 
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns:                                      
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: (That was just a warning)            
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: VCL compiled.                        
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: CLI communication error (hdr)        
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]/returns: Error: vcl.load 9fa3e2eb-51d3-41a4-a982-9bd5ed820af8 /etc/varnish/wikimedia_text-backend.vcl failed/etc/varnish/wikimedia_text-backend.vcl 9fa3e2eb-51d3-41a4-a982-9bd5ed820af8 reload failed
Error: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]: Failed to call refresh: /usr/share/varnish/reload-vcl  || (touch /var/tmp/reload-vcl-failed; false) returned 1 instead of one of [0]
Error: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]: /usr/share/varnish/reload-vcl  || (touch /var/tmp/reload-vcl-failed; false) returned 1 instead of one of [0]
[...]
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-frontend]/Exec[load-new-vcl-file-frontend]/returns: Command failed with error code 106
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-frontend]/Exec[load-new-vcl-file-frontend]/returns: VCL compiled.
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-frontend]/Exec[load-new-vcl-file-frontend]/returns: CLI communication error (hdr)
Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-frontend]/Exec[load-new-vcl-file-frontend]/returns: Error: vcl.load ea4f0acb-0547-4858-8610-a2a2c76bad48 /etc/varnish/wikimedia_text-frontend.vcl failed/etc/varnish/wikimedia_text-frontend.vcl ea4f0acb-0547-4858-8610-a2a2c76bad48 reload failed
Error: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-frontend]/Exec[load-new-vcl-file-frontend]: Failed to call refresh: /usr/share/varnish/reload-vcl -n frontend || (touch /var/tmp/reload-vcl-failed-frontend; false) returned 1 instead of one of [0]
Error: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-frontend]/Exec[load-new-vcl-file-frontend]: /usr/share/varnish/reload-vcl -n frontend || (touch /var/tmp/reload-vcl-failed-frontend; false) returned 1 instead of one of [0]

I'm still unclear why that failure happened.

A restart of the varnish-frontend and varnish services fixed it.

Subsequent puppet runs have been fine.
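
The Exec commands in the log above touch a marker file when `reload-vcl` fails (`/var/tmp/reload-vcl-failed` for the text-backend instance, `/var/tmp/reload-vcl-failed-frontend` for text-frontend). As a rough sketch, and assuming those markers are still present on the host, something like the following could report which instance last failed to reload. The function name `failed_reloads` and the `MARKERS` table are hypothetical. Whether a later successful puppet run removes the markers is not shown in the log, so a leftover marker does not prove the failure is current.

```python
from pathlib import Path

# Marker file names are taken from the puppet log; the instance labels
# are the Varnish::Instance titles that appear in the same log lines.
MARKERS = {
    "text-backend": "reload-vcl-failed",
    "text-frontend": "reload-vcl-failed-frontend",
}


def failed_reloads(marker_dir="/var/tmp"):
    """Return the instances whose failure marker exists in marker_dir."""
    base = Path(marker_dir)
    return sorted(
        instance
        for instance, filename in MARKERS.items()
        if (base / filename).exists()
    )


if __name__ == "__main__":
    failed = failed_reloads()
    if failed:
        print("reload-vcl failure markers found for:", ", ".join(failed))
    else:
        print("no reload-vcl failure markers found")
```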