Upgrade hosts to haproxy 2.8.10
Closed, Resolved, Public

Description

The rfc5424 bug is fixed in HAProxy 2.8.10, and that version is already imported into our apt repositories, so it's time to switch the cp hosts to it and try the rfc5424 log format again.

Event Timeline

Change #1046674 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] hiera: install haproxy 2.8.10 on cp4037

https://gerrit.wikimedia.org/r/1046674

Change #1046674 merged by Fabfur:

[operations/puppet@production] hiera: install haproxy 2.8.10 on cp4037

https://gerrit.wikimedia.org/r/1046674

Mentioned in SAL (#wikimedia-operations) [2024-06-17T15:28:36Z] <fabfur> upgrading haproxy to 2.8.10 on cp4037 (T367756)

Mentioned in SAL (#wikimedia-operations) [2024-06-18T08:43:05Z] <fabfur> cp4037 currently depooled and puppet disabled for T367756

Fabfur renamed this task from "Upgrade ulsfo hosts to haproxy 2.8.10" to "Upgrade hosts to haproxy 2.8.10". Jun 18 2024, 10:04 AM
Fabfur updated the task description.

Mentioned in SAL (#wikimedia-operations) [2024-06-18T10:05:19Z] <fabfur> cp3066 currently depooled and puppet disabled for T367756

Change #1047029 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] hiera: enable benthos on cp3066

https://gerrit.wikimedia.org/r/1047029

Mentioned in SAL (#wikimedia-operations) [2024-06-18T10:58:14Z] <fabfur> cp3066 repooled and puppet enabled (T367756)

Change #1047039 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] hiera: upgrade haproxy to 2.8 on ulsfo

https://gerrit.wikimedia.org/r/1047039

Change #1047039 merged by Fabfur:

[operations/puppet@production] hiera: upgrade haproxy to 2.8 on ulsfo

https://gerrit.wikimedia.org/r/1047039

Mentioned in SAL (#wikimedia-operations) [2024-06-18T12:47:24Z] <fabfur> upgrade haproxy to v2.8.10 on all ulsfo cp hosts (T367756)

Mentioned in SAL (#wikimedia-operations) [2024-06-18T15:36:08Z] <fabfur> upgrade haproxy to v2.8.10 on cp3066 (T367756)

Mentioned in SAL (#wikimedia-operations) [2024-06-18T15:39:28Z] <fabfur> upgrade haproxy to v2.8.10 on cp5030,cp5032 (T367756)

Change #1047436 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] hiera: upgrade haproxy to 2.8 on eqsin

https://gerrit.wikimedia.org/r/1047436

Change #1047442 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] benthos:cache: switch to rfc5424 format

https://gerrit.wikimedia.org/r/1047442

Change #1047436 merged by Fabfur:

[operations/puppet@production] hiera: upgrade haproxy to 2.8 on eqsin

https://gerrit.wikimedia.org/r/1047436

Change #1047483 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] hiera: test downgrading haproxy on cp5017

https://gerrit.wikimedia.org/r/1047483

Change #1047483 merged by Fabfur:

[operations/puppet@production] hiera: test downgrading haproxy on cp5017

https://gerrit.wikimedia.org/r/1047483

Change #1047442 merged by Fabfur:

[operations/puppet@production] benthos:cache: switch to rfc5424 format

https://gerrit.wikimedia.org/r/1047442

Change #1047536 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] benthos:cache: fixed typo in field name

https://gerrit.wikimedia.org/r/1047536

Change #1047536 merged by Fabfur:

[operations/puppet@production] benthos:cache: fixed typo in field name

https://gerrit.wikimedia.org/r/1047536

Change #1047545 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] benthos:cache: delete message if sequence number is missing

https://gerrit.wikimedia.org/r/1047545

Change #1047545 merged by Fabfur:

[operations/puppet@production] benthos:cache: delete message if sequence number is missing

https://gerrit.wikimedia.org/r/1047545
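
The idea behind the last change (drop a message when its sequence number is missing) can be illustrated with a short Python sketch. This is not the Benthos configuration from the puppet patch, which is not shown in this task. The function name `keep_if_sequenced` and the regular expression are hypothetical, and the sketch assumes the sequence number appears as a `sequence="<digits>"` field, as in the log lines quoted in this task.

```python
import re
from typing import Optional

# Assumption: the sequence number is carried as sequence="<digits>"
# inside the structured-data element of the log line.
SEQUENCE_RE = re.compile(r'\bsequence="(\d+)"')


def keep_if_sequenced(message: str) -> Optional[str]:
    """Return the message unchanged, or None when no sequence number is found.

    Returning None stands in for "delete the message" in a real pipeline.
    """
    if SEQUENCE_RE.search(message) is None:
        return None
    return message


if __name__ == "__main__":
    ok = '<134>1 - - haproxy 1 - [benthoslog sequence="1582219" http_status="200"]'
    bad = '<134>1 - - haproxy 1 - [benthoslog http_status="200"]'
    print(keep_if_sequenced(ok) is not None)   # message kept
    print(keep_if_sequenced(bad) is None)      # message dropped
```

A filter like this only hides records that cannot be processed; it does not explain why the field is missing.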

After upgrading HAProxy to 2.8.10 on all of ulsfo, we still see some errors in the Kafka DLQ, such as:

<134>1 2024-06-19T15:42:31.440224+00:00 - haproxy 1227962 - [benthoslog haproxy_pid="1227962" ip="<REDACTED>" sequence="1582219" accept_date="19/Jun/2024:15:42:30.993" time_backend_response="287" http_status="200" response_size="301073" termination_state="--" uri_host="be-tarask.wikipedia.org" referer="https://be-tarask.wikipedia.org/wiki/%D0%92%D1%96%D0%BA%D1%96%D0%BF%D1%8D%D0%B4%D1%8B%D1%8F:%D0%9F%D1%80%D0%B0%D0%B5%D0%BA%D1%82:%D0%A2%D1%8D%D0%BC%D0%B0%D1%82%D1%8B%D1%87%D0%BD%D1%8B_%D1%82%D1%8B%D0%B4%D0%B7%D0%B5%D0%BD%D1%8C/%D0%90%D1%80%D1%88%D0%B0%D0%BD%D1%81%D0%BA%D0%B0%D1%8F_%D0%B1%D1%96%D1%82%D0%B2%D0%B0" user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15 (Applebot/0.1; +http://www.apple.com/go/applebot)" accept_language="en-US" range="-" accept="*/*" tls="vers=TLSv1.3;keyx=UNKNOWN;auth=ECDSA;ciph=AES-256-GCM-SHA384;prot=h1;sess=new" cache_status="miss" content_type="text/javascript; charset=" x_analytics="WMF-Last-Access=19-Jun-2024;WMF-Last-Access-Global=19-Jun-2024;https=1;client_port=35204" x_cache="cp4044 miss, cp4044 miss" backend="mw-web.codfw.main-7666f67487-nw9fq" http_method="GET" uri_path="/w/load.php" 
uri_query="?lang=be-tarask&modules=ext.categoryTree%2CeventLogging%2CnavigationTiming%2Cpopups%2CwikimediaEvents%7Cext.centralNotice.choiceData%2Cdisplay%2CgeoIP%2CimpressionDiet%2CkvStore%2CstartUp%7Cext.centralauth.centralautologin%7Cext.checkUser.clientHints%7Cext.cx.eventlogging.campaigns%7Cext.discussionTools.init%2Cminervaicons%7Cext.echo.centralauth%7Cext.uls.interface%2Cpreferences%2Cwebfonts%7Cext.urlShortener.toolbar%7Cjquery%2Cmoment%2Coojs%2Coojs-ui-core%2Coojs-ui-windows%2Crangefix%2Csite%7Cjquery.client%2CtextSelection%7Cmediawiki.String%2CTitle%2CUri%2Capi%2Cbase%2Ccldr%2Ccookie%2Cexperiments%2CjqueryMsg%2Clanguage%2Crouter%2Cstorage%2Ctoc%2Cuser%2Cutil%2CvisibleTimeout%7Cmediawiki.editfont.styles%7Cmediawiki.libs.pluralruleparser%7Cmediawiki.page.media%2Cready%7Cmediawiki.page.watch.ajax%7Cmmv.bootstrap%2Ccodex%2Chead%7Cmmv.bootstrap.autostart%7Coojs-ui-windows.icons%7Cskins.vector.clientPreferences%2Cjs%7Cskins.vector.icons.js&skin=vector-2022&versio<134>1 2024-06-19T15:42:31.441664+00:00 - haproxy 1227962 - [benthoslog haproxy_pid="1227962" ip="<REDACTED>" sequence="1582448" accept_date="19/Jun/2024:15:42:31.440" time_backend_response="1" http_status="200" response_size="17243" termination_state="--" uri_host="en.m.wikipedia.org" referer="https://en.m.wikipedia.org/wiki/13th_Amendment_to_the_United_States_Constitution" user_agent="Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Mobile Safari/537.36" accept_language="en-US,en;q=0.9" range="-" accept="text/css,*/*;q=0.1" tls="vers=TLSv1.3;keyx=UNKNOWN;auth=ECDSA;ciph=AES-256-GCM-SHA384;prot=h2;sess=new" cache_status="hit-front" content_type="text/css; charset=utf-8" x_analytics="WMF-Last-Access=19-Jun-2024;WMF-Last-Access-Global=19-Jun-2024;https=1;client_port=39466" x_cache="cp4044 hit, cp4044 hit/1766" backend="ATS/9.1.4" http_method="GET" uri_path="/w/load.php" 
uri_query="?lang=en&modules=ext.cite.styles%7Cext.relatedArticles.styles%7Cext.wikimediaBadges%7Cext.wikimediamessages.styles%7Cmediawiki.hlist%7Cmobile.init.styles%7Cskins.minerva.amc.styles%7Cskins.minerva.base.styles%7Cskins.minerva.codex.styles%7Cskins.minerva.content.styles.images%7Cskins.minerva.icons.wikimedia%7Cskins.minerva.mainMenu.icons%2Cstyles%7Cwikibase.client.init&only=styles&skin=minerva"] 1582448 1 0 0 200 {en.m.wikipedia.org} {hit-front} -

This needs to be reported upstream.
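
In the DLQ sample above, a second `<134>1 ...` header starts in the middle of the first record's `uri_query` field, so two log lines ended up in one message. A minimal Python sketch for flagging such records is shown here. It is only a heuristic: it assumes every record starts with `<PRI>1 ` followed by an ISO 8601 timestamp, as in the sample, and the names `HEADER_RE`, `header_offsets` and `is_merged` are hypothetical.

```python
import re
from typing import List

# Assumption: each record starts with "<PRI>1 " followed by an
# ISO 8601 timestamp, as in the DLQ sample quoted in this task.
HEADER_RE = re.compile(r"<\d{1,3}>1 \d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def header_offsets(record: str) -> List[int]:
    """Return the offset of every RFC 5424-style header found in the record."""
    return [m.start() for m in HEADER_RE.finditer(record)]


def is_merged(record: str) -> bool:
    """A record holding more than one header is treated as merged."""
    return len(header_offsets(record)) > 1


if __name__ == "__main__":
    # Shortened version of the merged record from the DLQ.
    sample = (
        '<134>1 2024-06-19T15:42:31.440224+00:00 - haproxy 1227962 - '
        '[benthoslog sequence="1582219" uri_query="?lang=be-tarask&versio'
        '<134>1 2024-06-19T15:42:31.441664+00:00 - haproxy 1227962 - '
        '[benthoslog sequence="1582448" uri_query="?lang=en"] 1582448'
    )
    print(is_merged(sample))
    print(header_offsets(sample))
```

The offsets show where the first record was cut off, which helps when comparing DLQ entries against what HAProxy actually sent.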

Change #1049104 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] hiera: upgrade haproxy to 2.8 on eqsin

https://gerrit.wikimedia.org/r/1049104

Mentioned in SAL (#wikimedia-operations) [2024-06-25T13:07:26Z] <fabfur> temporary disabled puppet on cp4037 to test benthos configuration (T367756)

Update on this investigation: capturing the frames sent from HAProxy to Benthos with tcpdump doesn't show the "log merging". The frame data looks correctly encoded, so the problem is probably on the Benthos side (parsing).

Change #1049104 merged by Fabfur:

[operations/puppet@production] hiera: upgrade haproxy to 2.8 on drmrs

https://gerrit.wikimedia.org/r/1049104

Mentioned in SAL (#wikimedia-operations) [2024-07-01T10:23:38Z] <fabfur> upgrading A:cp-drmrs to haproxy 2.8.10 (T367756)

Change #1051110 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] hiera: upgrade haproxy to 2.8 on magru

https://gerrit.wikimedia.org/r/1051110

Change #1051130 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] hiera: removed unused haproxy28 overrides

https://gerrit.wikimedia.org/r/1051130

Change #1051110 merged by Fabfur:

[operations/puppet@production] hiera: upgrade haproxy to 2.8 on magru

https://gerrit.wikimedia.org/r/1051110

Mentioned in SAL (#wikimedia-operations) [2024-07-01T12:49:18Z] <fabfur> upgrading A:cp-magru to haproxy 2.8.10 (T367756)

Change #1051130 merged by Fabfur:

[operations/puppet@production] hiera: removed unused haproxy28 overrides

https://gerrit.wikimedia.org/r/1051130

Change #1051143 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] hiera: upgrade haproxy to 2.8 on codfw

https://gerrit.wikimedia.org/r/1051143

Mentioned in SAL (#wikimedia-operations) [2024-07-01T14:35:40Z] <fabfur> upgrading A:cp-codfw to haproxy 2.8.10 (T367756)

Change #1051143 merged by Fabfur:

[operations/puppet@production] hiera: upgrade haproxy to 2.8 on codfw

https://gerrit.wikimedia.org/r/1051143

Change #1051292 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] hiera: upgrade haproxy to 2.8 on eqiad

https://gerrit.wikimedia.org/r/1051292

Change #1051292 merged by Fabfur:

[operations/puppet@production] hiera: upgrade haproxy to 2.8 on eqiad

https://gerrit.wikimedia.org/r/1051292

Mentioned in SAL (#wikimedia-operations) [2024-07-02T10:28:30Z] <fabfur> upgrading A:cp-eqiad to haproxy 2.8.10 (T367756)

Change #1051348 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] hiera: upgrade haproxy to 2.8 on esams

https://gerrit.wikimedia.org/r/1051348

Change #1051348 merged by Fabfur:

[operations/puppet@production] hiera: upgrade haproxy to 2.8 on esams

https://gerrit.wikimedia.org/r/1051348

Mentioned in SAL (#wikimedia-operations) [2024-07-02T14:55:46Z] <fabfur> upgrading A:cp-esams to haproxy 2.8.10 (T367756)

Change #1051394 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] hiera: consolidate haproxy version to 2.8

https://gerrit.wikimedia.org/r/1051394

Change #1051394 merged by Fabfur:

[operations/puppet@production] hiera: consolidate haproxy version to 2.8

https://gerrit.wikimedia.org/r/1051394

All cp hosts have been upgraded to 2.8.10.

Change #1047029 abandoned by Fabfur:

[operations/puppet@production] hiera: enable benthos on cp3066

Reason:

See T370741

https://gerrit.wikimedia.org/r/1047029