LSobanski (Lukasz Sobanski)
Woo$

User Details

User Since
Aug 31 2020, 5:40 PM (169 w, 6 d)
Availability
Available
LDAP User
LSobanski
MediaWiki User
LSobanski (WMF)

Recent Activity

Thu, Nov 30

LSobanski added a comment to T349595: Clarify if NDAs (to access #WMF-NDA protected Phab tasks) are on paper or in Legalpad's L2 or both.

I wonder if https://phabricator.wikimedia.org/L3 would also be in scope for this question, as moving it elsewhere would allow us to decommission Legalpad?

Thu, Nov 30, 4:56 PM · WMF-Legal, Legalpad, User-AKlapper, Phabricator, WMF-NDA-Requests

Wed, Nov 29

LSobanski added a comment to T341489: Create OTRS Database Snapshot.

We're on the second-to-last upgrade. We ran into some performance issues, the cause of which is not fully clear yet. Arnold is out this week; looking at the calendar, there is a chance we'll be done before the holiday break.

Wed, Nov 29, 12:25 PM · collaboration-services, DBA
LSobanski closed T343898: Alert triage: overdue alert [warning] puppet fails on idp-test1002 as Resolved.

Resolving as the alert is no longer active.

Wed, Nov 29, 8:50 AM · Infrastructure-Foundations, sre-alert-triage
LSobanski closed T342757: Alert triage: overdue warning alert as Resolved.

Resolving as the alert is no longer active.

Wed, Nov 29, 8:48 AM · sre-alert-triage, cloud-services-team

Tue, Nov 28

LSobanski awarded T351928: git over ssh is not working on GitLab test instance a Like token.
Tue, Nov 28, 12:33 PM · collaboration-services, GitLab (Infrastructure)
LSobanski created T352168: Alert in need of triage: SmartNotHealthy (instance an-worker1086:9100).
Tue, Nov 28, 12:31 PM · Data-Platform-SRE, sre-alert-triage

Mon, Nov 27

LSobanski added a comment to T352036: PuppetZeroResources - miscweb2003.

Related to T347355: Create alerts for https://query.wikidata.org/bigdata/ldf

Mon, Nov 27, 4:25 PM · collaboration-services
LSobanski renamed T352036: PuppetZeroResources - miscweb2003 from PuppetZeroResources to PuppetZeroResources - miscweb2003.
Mon, Nov 27, 4:23 PM · collaboration-services
LSobanski closed T351941: ProbeDown - vrts1001 as Resolved.
Mon, Nov 27, 4:23 PM · collaboration-services
LSobanski moved T352003: Create a dedicated image for Debian package builds from Incoming to Backlog on the collaboration-services board.
Mon, Nov 27, 4:22 PM · Patch-For-Review, collaboration-services
LSobanski triaged T352003: Create a dedicated image for Debian package builds as Medium priority.
Mon, Nov 27, 4:22 PM · Patch-For-Review, collaboration-services
LSobanski changed the status of T352003: Create a dedicated image for Debian package builds from Open to Stalled.

Stalled until we have clarity on Dockerfile-based builds in GitLab.

Mon, Nov 27, 4:22 PM · Patch-For-Review, collaboration-services
LSobanski moved T351928: git over ssh is not working on GitLab test instance from Incoming to Backlog on the collaboration-services board.
Mon, Nov 27, 4:17 PM · collaboration-services, GitLab (Infrastructure)
LSobanski triaged T351928: git over ssh is not working on GitLab test instance as Medium priority.
Mon, Nov 27, 4:16 PM · collaboration-services, GitLab (Infrastructure)
LSobanski added a comment to T347355: Create alerts for https://query.wikidata.org/bigdata/ldf.

@bking, this change is causing Puppet failures for miscweb1003 because of the existence of duplicate blackbox checks.

Mon, Nov 27, 4:02 PM · Wikidata, Wikidata-Query-Service, Data-Platform-SRE
LSobanski created T352003: Create a dedicated image for Debian package builds.
Mon, Nov 27, 9:45 AM · Patch-For-Review, collaboration-services

Fri, Nov 24

LSobanski closed T351725: Daily backup job not running for gerrit1003 as Resolved.

As of midday 2023-11-22 the backup size is down to around 8GB, in line with what it was before the recent increase. Resolving.

Fri, Nov 24, 6:40 PM · collaboration-services, bacula, Data-Persistence-Backup
LSobanski triaged T351941: ProbeDown - vrts1001 as Medium priority.
Fri, Nov 24, 6:36 PM · collaboration-services
LSobanski updated subscribers of T351941: ProbeDown - vrts1001.
Nov 24 17:37:01 vrts1001 systemd[1]: apache2.service: A process of this unit has been killed by the OOM killer.
Fri, Nov 24, 6:36 PM · collaboration-services
LSobanski renamed T351941: ProbeDown - vrts1001 from ProbeDown to ProbeDown - vrts1001.
Fri, Nov 24, 6:33 PM · collaboration-services

Thu, Nov 23

LSobanski lowered the priority of T351725: Daily backup job not running for gerrit1003 from Medium to Low.

I'll leave this open for another week or two to see if the backup size changes after the reboots.

Thu, Nov 23, 3:03 PM · collaboration-services, bacula, Data-Persistence-Backup
LSobanski triaged T351725: Daily backup job not running for gerrit1003 as Medium priority.
Thu, Nov 23, 2:59 PM · collaboration-services, bacula, Data-Persistence-Backup

Wed, Nov 22

LSobanski added a comment to T351764: Preserve existing JSDuck documentation on docs.wikimedia.org.

If we end up copying a static snapshot that will no longer be generated afterwards to a folder on a host, let's make sure a backup of it exists as well.

Wed, Nov 22, 1:34 PM · collaboration-services, Tech-Docs-Team, Technical-Debt (RW-Tech-Debt), Front-end-Standards-Group, Documentation

Tue, Nov 21

Dzahn awarded T351725: Daily backup job not running for gerrit1003 a Like token.
Tue, Nov 21, 7:09 PM · collaboration-services, bacula, Data-Persistence-Backup
LSobanski added a comment to T351725: Daily backup job not running for gerrit1003.

"Job ... is waiting. Cannot find any appendable volumes"

That would point to the Bacula storage daemon (somewhere on backup1*), not the gerrit/client hosts. Are you sure this is related to T351658? It can be, if it made the client fail, but the error points to backup storage instead.

Tue, Nov 21, 5:52 PM · collaboration-services, bacula, Data-Persistence-Backup
LSobanski added a project to T351725: Daily backup job not running for gerrit1003: collaboration-services.
Tue, Nov 21, 2:51 PM · collaboration-services, bacula, Data-Persistence-Backup
LSobanski created T351725: Daily backup job not running for gerrit1003.
Tue, Nov 21, 2:50 PM · collaboration-services, bacula, Data-Persistence-Backup

Mon, Nov 20

LSobanski removed a project from T351624: Probes for centrallog hosts fail to validate with "x509: issuer name does not match subject from issuing certificate": collaboration-services.

Removing collaboration-services as I don't see any clear activity for us here.

Mon, Nov 20, 4:24 PM · Patch-For-Review, User-fgiunchedi, Observability-Logging, SRE
LSobanski moved T351507: VMs in Cloud VPS share the same machine-id from Incoming to Consultation on the collaboration-services board.
Mon, Nov 20, 4:21 PM · cloud-services-team, Cloud-VPS, collaboration-services
LSobanski updated the task description for T341991: Migrate SRE repositories to GitLab - operations/debs.
Mon, Nov 20, 9:49 AM · GitLab (Project Migration), collaboration-services

Fri, Nov 17

LSobanski updated the task description for T341504: Migrate SRE repositories to GitLab - operations/software.
Fri, Nov 17, 1:36 PM · GitLab (Project Migration), collaboration-services
LSobanski added a comment to T343707: Migrate SRE repositories to GitLab - Archiving unused Gerrit repositories.

operations/software/wmfbackups

Fri, Nov 17, 12:46 PM · Projects-Cleanup, Release-Engineering-Team (Priority Backlog 📥), collaboration-services
LSobanski renamed T351469: ProbeDown - contint2002 from ProbeDown to ProbeDown - contint2002.
Fri, Nov 17, 11:10 AM · collaboration-services

Wed, Nov 15

LSobanski updated the task description for T341504: Migrate SRE repositories to GitLab - operations/software.
Wed, Nov 15, 8:43 AM · GitLab (Project Migration), collaboration-services

Tue, Nov 14

LSobanski updated the task description for T341504: Migrate SRE repositories to GitLab - operations/software.
Tue, Nov 14, 3:12 PM · GitLab (Project Migration), collaboration-services
LSobanski updated the task description for T341504: Migrate SRE repositories to GitLab - operations/software.
Tue, Nov 14, 3:10 PM · GitLab (Project Migration), collaboration-services
LSobanski added a project to T343896: Alert triage: overdue alert [critical] The following units failed: wikidatardf-lexemes-dumps.service: Data-Platform-SRE.

The alert has since recovered, but looking at the names in the linked change, I'm adding Data Platform SRE to review.

Tue, Nov 14, 11:38 AM · Data-Platform-SRE, sre-alert-triage
LSobanski moved T351162: PuppetZeroResources - stewards1001 from Incoming to Work in Progress on the collaboration-services board.
Tue, Nov 14, 10:54 AM · collaboration-services
LSobanski triaged T351162: PuppetZeroResources - stewards1001 as Medium priority.
Tue, Nov 14, 10:54 AM · collaboration-services
LSobanski renamed T351162: PuppetZeroResources - stewards1001 from PuppetZeroResources to PuppetZeroResources - stewards1001.
Tue, Nov 14, 10:54 AM · collaboration-services
LSobanski awarded T351084: Alert in need of triage: PuppetConstantChange (instance pybal-test2003:9100) a Love token.
Tue, Nov 14, 10:53 AM · Traffic
LSobanski created T351191: Invalid certificate for Alertmanager silence links.
Tue, Nov 14, 10:31 AM · SRE Observability, observability

Mon, Nov 13

ssingh awarded T351084: Alert in need of triage: PuppetConstantChange (instance pybal-test2003:9100) an Orange Medal token.
Mon, Nov 13, 4:50 PM · Traffic
LSobanski closed T350257: ProbeDown - VRTS high CPU usage as Resolved.
Mon, Nov 13, 4:30 PM · collaboration-services
LSobanski moved T350795: Add linting of research landing-page to gitlab CI from Incoming to Backlog on the collaboration-services board.
Mon, Nov 13, 4:26 PM · Research, collaboration-services
LSobanski triaged T350795: Add linting of research landing-page to gitlab CI as High priority.
Mon, Nov 13, 4:26 PM · Research, collaboration-services
LSobanski moved T350803: Import open Gerrit patches as branches in GitLab from Incoming to Backlog on the collaboration-services board.
Mon, Nov 13, 4:24 PM · collaboration-services
LSobanski triaged T350803: Import open Gerrit patches as branches in GitLab as Low priority.
Mon, Nov 13, 4:24 PM · collaboration-services
LSobanski moved T350791: move design.wikimedia.org to kubernetes from Incoming to Backlog on the collaboration-services board.
Mon, Nov 13, 4:15 PM · Design, Wikimedia-Design, GitLab (Pipeline Services Migration🐤), collaboration-services
LSobanski moved T350796: move security.wikimedia.org to kubernetes from Incoming to Backlog on the collaboration-services board.
Mon, Nov 13, 4:15 PM · GitLab (Pipeline Services Migration🐤), collaboration-services
LSobanski moved T350794: move os-reports.wikimedia.org to kubernetes from Incoming to Backlog on the collaboration-services board.
Mon, Nov 13, 4:15 PM · GitLab (Pipeline Services Migration🐤), collaboration-services
LSobanski moved T350793: move commons-query.wikimedia.org and query.wikidata.org to kubernetes from Incoming to Backlog on the collaboration-services board.
Mon, Nov 13, 4:15 PM · GitLab (Pipeline Services Migration🐤), collaboration-services
LSobanski triaged T350796: move security.wikimedia.org to kubernetes as Medium priority.
Mon, Nov 13, 4:14 PM · GitLab (Pipeline Services Migration🐤), collaboration-services
LSobanski triaged T350793: move commons-query.wikimedia.org and query.wikidata.org to kubernetes as Medium priority.
Mon, Nov 13, 4:14 PM · GitLab (Pipeline Services Migration🐤), collaboration-services
LSobanski triaged T350794: move os-reports.wikimedia.org to kubernetes as Medium priority.
Mon, Nov 13, 4:14 PM · GitLab (Pipeline Services Migration🐤), collaboration-services
LSobanski triaged T350791: move design.wikimedia.org to kubernetes as Medium priority.
Mon, Nov 13, 4:11 PM · Design, Wikimedia-Design, GitLab (Pipeline Services Migration🐤), collaboration-services
LSobanski created T351084: Alert in need of triage: PuppetConstantChange (instance pybal-test2003:9100).
Mon, Nov 13, 1:32 PM · Traffic
LSobanski created T351083: Alert in need of triage: BGP status (instance cr2-eqdfw).
Mon, Nov 13, 1:31 PM · netops, sre-alert-triage, Infrastructure-Foundations

Fri, Nov 10

LSobanski added a comment to T347623: Migrate Traffic repositories from Gerrit to Gitlab.

@BCornwall For operations/software/varnish, looks like it should just be archived and not migrated? Let me know if that's the case.

Fri, Nov 10, 12:28 PM · Patch-For-Review, GitLab (Project Migration), Traffic

Thu, Nov 9

LSobanski assigned T350257: ProbeDown - VRTS high CPU usage to Arnoldokoth.

The one thing that remains is to revert the config change and restart the daemon to see if this brings the problem back.

Thu, Nov 9, 5:38 PM · collaboration-services

Wed, Nov 8

LSobanski created T350803: Import open Gerrit patches as branches in GitLab.
Wed, Nov 8, 3:47 PM · collaboration-services

Tue, Nov 7

LSobanski assigned T350658: Dependencies from backports in wmf-debci to MatthewVernon.
Tue, Nov 7, 10:15 AM · collaboration-services, SRE
LSobanski moved T350658: Dependencies from backports in wmf-debci from Incoming to Consultation on the collaboration-services board.
Tue, Nov 7, 10:15 AM · collaboration-services, SRE

Mon, Nov 6

LSobanski lowered the priority of T350257: ProbeDown - VRTS high CPU usage from High to Medium.
Mon, Nov 6, 4:20 PM · collaboration-services
LSobanski moved T349833: Document process for switching remote refs for gerrit->gitlab switch from Incoming to Backlog on the collaboration-services board.
Mon, Nov 6, 4:19 PM · collaboration-services
LSobanski triaged T349833: Document process for switching remote refs for gerrit->gitlab switch as Low priority.
Mon, Nov 6, 4:19 PM · collaboration-services
LSobanski updated the task description for T349833: Document process for switching remote refs for gerrit->gitlab switch.
Mon, Nov 6, 4:18 PM · collaboration-services
LSobanski moved T304491: Standardize Debian package builds on GitLab CI from Work in Progress to Work in Progress (Tracking tasks) on the collaboration-services board.
Mon, Nov 6, 12:00 PM · collaboration-services, GitLab (CI & Job Runners), serviceops
LSobanski assigned T350478: Investigate docker-gc.service failures on GitLab runners to Jelto.
Mon, Nov 6, 12:00 PM · collaboration-services
LSobanski closed T349990: ProbeDown - VRTS as Resolved.

I think this can be closed in light of later troubleshooting.

Mon, Nov 6, 11:56 AM · collaboration-services
LSobanski moved T350478: Investigate docker-gc.service failures on GitLab runners from Incoming to Work in Progress on the collaboration-services board.
Mon, Nov 6, 11:56 AM · collaboration-services
LSobanski updated the task description for T341991: Migrate SRE repositories to GitLab - operations/debs.
Mon, Nov 6, 11:25 AM · GitLab (Project Migration), collaboration-services
LSobanski updated the task description for T341504: Migrate SRE repositories to GitLab - operations/software.
Mon, Nov 6, 11:15 AM · GitLab (Project Migration), collaboration-services
LSobanski added a comment to T343707: Migrate SRE repositories to GitLab - Archiving unused Gerrit repositories.

All good recommendations. So far I have confirmed:

  • operations/docker-images/debian
  • operations/software/certpy
  • operations/software/hhvm_exporter
Mon, Nov 6, 11:14 AM · Projects-Cleanup, Release-Engineering-Team (Priority Backlog 📥), collaboration-services
LSobanski updated the task description for T341468: Migrate SRE repositories to GitLab.
Mon, Nov 6, 11:10 AM · GitLab (Project Migration), collaboration-services
LSobanski updated the task description for T341504: Migrate SRE repositories to GitLab - operations/software.
Mon, Nov 6, 11:04 AM · GitLab (Project Migration), collaboration-services
LSobanski updated the task description for T341504: Migrate SRE repositories to GitLab - operations/software.
Mon, Nov 6, 11:00 AM · GitLab (Project Migration), collaboration-services
LSobanski added a comment to T347623: Migrate Traffic repositories from Gerrit to Gitlab.

Would operations/software/knead-wikidough and operations/software/liberica fit here as well?

Mon, Nov 6, 11:00 AM · Patch-For-Review, GitLab (Project Migration), Traffic
LSobanski updated the task description for T341504: Migrate SRE repositories to GitLab - operations/software.
Mon, Nov 6, 10:56 AM · GitLab (Project Migration), collaboration-services

Nov 3 2023

Dzahn awarded T343707: Migrate SRE repositories to GitLab - Archiving unused Gerrit repositories a Like token.
Nov 3 2023, 5:45 PM · Projects-Cleanup, Release-Engineering-Team (Priority Backlog 📥), collaboration-services
LSobanski added a comment to T343707: Migrate SRE repositories to GitLab - Archiving unused Gerrit repositories.

operations/debs/wikistats

Nov 3 2023, 5:14 PM · Projects-Cleanup, Release-Engineering-Team (Priority Backlog 📥), collaboration-services
LSobanski added a comment to T350478: Investigate docker-gc.service failures on GitLab runners.

No, it fails on different runners, I just used 1004 as an example.

Nov 3 2023, 3:14 PM · collaboration-services
LSobanski created T350478: Investigate docker-gc.service failures on GitLab runners.
Nov 3 2023, 12:22 PM · collaboration-services

Nov 2 2023

Dzahn awarded T334292: Define a common layout for Collaboration service dashboards and apply it to the existing dashboards a Mountain of Wealth token.
Nov 2 2023, 10:49 PM · collaboration-services

Nov 1 2023

Dzahn awarded T345618: Switchover people.wikimedia.org - September 2023 a 100 token.
Nov 1 2023, 7:56 PM · collaboration-services
LSobanski renamed T350257: ProbeDown - VRTS high CPU usage from ProbeDown - VRTS to ProbeDown - VRTS high CPU usage.
Nov 1 2023, 7:03 PM · collaboration-services
LSobanski raised the priority of T350257: ProbeDown - VRTS high CPU usage from Medium to High.
Nov 1 2023, 3:54 PM · collaboration-services
LSobanski added a comment to T350257: ProbeDown - VRTS high CPU usage.

Both Apache and ClamAV were oom-killed.

Nov 1 2023, 1:10 PM · collaboration-services
LSobanski added a comment to T349990: ProbeDown - VRTS.

Looks like Apache restarted as all its metrics are missing during this time: https://grafana.wikimedia.org/d/000000371/vrts?orgId=1&from=1698612983731&to=1698644084645

Nov 1 2023, 12:58 PM · collaboration-services
LSobanski closed T350107: PuppetFailure - Contint as Resolved.
Nov 1 2023, 12:53 PM · collaboration-services
LSobanski moved T350257: ProbeDown - VRTS high CPU usage from Incoming to Work in Progress on the collaboration-services board.
Nov 1 2023, 12:53 PM · collaboration-services
LSobanski triaged T350257: ProbeDown - VRTS high CPU usage as Medium priority.
Nov 1 2023, 12:53 PM · collaboration-services
LSobanski added a comment to T350257: ProbeDown - VRTS high CPU usage.

The service is up after a Puppet run and needs to be monitored. The resource usage could be related to the Envoy timeout changes made in T349471: Error when accessing a specific VRTS ticket: "upstream request timeout".

Nov 1 2023, 12:52 PM · collaboration-services
LSobanski added a comment to T350257: ProbeDown - VRTS high CPU usage.

100% CPU and ~100% memory usage since yesterday: https://grafana.wikimedia.org/d/000000371/vrts?orgId=1&from=1698832908164&to=1698842575883

Nov 1 2023, 12:46 PM · collaboration-services
LSobanski added a comment to T350257: ProbeDown - VRTS high CPU usage.

The error on ticket.wikimedia.org is:

Nov 1 2023, 12:39 PM · collaboration-services
LSobanski renamed T350257: ProbeDown - VRTS high CPU usage from ProbeDown to ProbeDown - VRTS.
Nov 1 2023, 12:38 PM · collaboration-services
LSobanski added a comment to T350107: PuppetFailure - Contint.

Most likely related to T350118: Investigate PKI errors

Nov 1 2023, 12:34 PM · collaboration-services
LSobanski closed T350251: ProbeDown - GitLab as Resolved.

Related to GitLab security update - {T350215}.

Nov 1 2023, 12:32 PM · collaboration-services
LSobanski renamed T350251: ProbeDown - GitLab from ProbeDown to ProbeDown - GitLab.
Nov 1 2023, 12:31 PM · collaboration-services
LSobanski awarded T349474: security@ mailing list membership for LSobanski a Love token.
Nov 1 2023, 9:10 AM · SecTeam-Processed, Security-Team