
hashar (Antoine Musso)
LogisticsAdministrator

User Details

User Since
Oct 3 2014, 2:31 PM (398 w, 3 d)
Roles
Administrator
Availability
Available
IRC Nick
hashar
LDAP User
Hashar
MediaWiki User
Unknown

https://www.mediawiki.org/wiki/User:Hashar

Based in Nantes, France CET/CEST (UTC+1, UTC+2)


Recent Activity

Yesterday

hashar updated subscribers of T309047: Coverage and patch-performance pipelines appear stuck.

After a chat with @TheresNoTime @dancy and @Legoktm on IRC.

Mon, May 23, 10:10 PM · Zuul, Continuous-Integration-Infrastructure
hashar renamed T309047: Coverage and patch-performance pipelines appear stuck from Coverage pipeline appears stuck to Coverage and patch-performance pipelines appear stuck.
Mon, May 23, 10:07 PM · Zuul, Continuous-Integration-Infrastructure
hashar edited projects for T309047: Coverage and patch-performance pipelines appear stuck, added: Continuous-Integration-Infrastructure, Zuul; removed Continuous-Integration-Config.

Changes in the patch-performance and coverage pipelines have low precedence and are only triggered after everything else.

Mon, May 23, 8:24 PM · Zuul, Continuous-Integration-Infrastructure

Sun, May 22

hashar added a comment to T308405: Diff colours change between red/green and yellow/blue depending on what's viewed.

To be fair after all those years using Gerrit it is probably the first time I encounter the yellow / blue diff. Looking at the DOM, the elements are marked with the CSS class dueToRebase which eventually leads me to:

Sun, May 22, 11:03 AM · Gerrit
thcipriani awarded T308943: CI fails with 'This change or one of its cross-repo dependencies was unable to be automatically merged' for a lot of repos a Barnstar token.
Sun, May 22, 1:39 AM · Release-Engineering-Team, Continuous-Integration-Infrastructure

Sat, May 21

hashar added a comment to T308943: CI fails with 'This change or one of its cross-repo dependencies was unable to be automatically merged' for a lot of repos.

Thanks for the confirmation @TheresNoTime !

Sat, May 21, 11:02 PM · Release-Engineering-Team, Continuous-Integration-Infrastructure
hashar closed T308943: CI fails with 'This change or one of its cross-repo dependencies was unable to be automatically merged' for a lot of repos as Resolved.

Should be good now after I have restarted Zuul (ssh contint2001.wikimedia.org sudo systemctl restart zuul) which cleared the idling ssh connection.

Sat, May 21, 10:16 PM · Release-Engineering-Team, Continuous-Integration-Infrastructure
hashar added a comment to T308943: CI fails with 'This change or one of its cross-repo dependencies was unable to be automatically merged' for a lot of repos.

From the Gerrit log https://logstash.wikimedia.org/app/dashboards#/view/AW1f-0k0ZKA7RpirlnKV

Max connection count for user jenkins-bot exceeded, rejecting new connection. currentSessionCount = 4, maxSessionCount = 4
Sat, May 21, 10:06 PM · Release-Engineering-Team, Continuous-Integration-Infrastructure
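
To spot this condition in a log export, the rejection line quoted above can be matched with a regular expression. This is a minimal sketch: the pattern is derived only from that single quoted line, and the function name `parse_rejection` is hypothetical, not part of Gerrit or any Wikimedia tooling.

```python
import re

# Pattern built from the one log line quoted in the comment above; it is an
# assumption that other rejection messages follow the same wording.
PATTERN = re.compile(
    r"Max connection count for user (?P<user>\S+) exceeded, "
    r"rejecting new connection\. "
    r"currentSessionCount = (?P<current>\d+), maxSessionCount = (?P<max>\d+)"
)


def parse_rejection(line):
    """Return (user, current, max) for a rejection line, or None otherwise."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return match["user"], int(match["current"]), int(match["max"])


line = (
    "Max connection count for user jenkins-bot exceeded, rejecting new connection. "
    "currentSessionCount = 4, maxSessionCount = 4"
)
print(parse_rejection(line))
```

A result where the current count equals the maximum indicates the account has used up all of its allowed sessions, which matches the idle connections described in the resolution.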
hashar claimed T308943: CI fails with 'This change or one of its cross-repo dependencies was unable to be automatically merged' for a lot of repos.

o/ checking

Sat, May 21, 10:00 PM · Release-Engineering-Team, Continuous-Integration-Infrastructure
hashar closed T308927: quibble-vendor-mysql-php72-selenium-docker: "cannot create directory ‘log’: Permission denied" as Resolved.

Somehow the build workspaces are owned by root:root:

drwxr-xr-x  3 root           root    4096 May 21 12:19 quibble-composer-mysql-php72-selenium-docker
Sat, May 21, 3:53 PM · Release-Engineering-Team, ci-test-error (WMF-deployed Build Failure), Continuous-Integration-Infrastructure, Continuous-Integration-Config
hashar added a comment to T308397: Carry out an admin election of zhwiki on votewiki (May 2022).

https://vote.wikimedia.org/ has been switched to language zh. The only problem I found is that https://vote.wikimedia.org/ redirects to https://vote.wikimedia.org/wiki/%E9%A6%96%E9%A1%B5 ( 首页 ), which does not exist.

I just added a redirect. We could have a Main Page/zh with a translation but I don't think it's necessary :)

Sat, May 21, 3:46 PM · Wikimedia-Site-requests, Elections, Trust-and-Safety, Chinese-Sites
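
The percent-encoded path in the redirect target is the UTF-8 encoding of the page title shown in parentheses. A short sketch using Python's standard library confirms the correspondence; only the encoded string from the comment is used.

```python
from urllib.parse import quote, unquote

encoded = "%E9%A6%96%E9%A1%B5"

# Decode the percent-encoded UTF-8 bytes back into the page title.
title = unquote(encoded)
print(title)

# Encoding the title again yields the same path segment.
print(quote(title))
```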
hashar reassigned T308692: Phan 3.2.6 crashed on composer-php81-docker test from hashar to TheresNoTime.

Reassigning to @TheresNoTime who found out the reason above (T308692#7939339). I have merely pushed the button ;)

Sat, May 21, 9:13 AM · Continuous-Integration-Config, PHP 8.1 support, phan
hashar created T308908: Icinga Check SSL might have a time based race condition.
Sat, May 21, 9:11 AM · SRE, Gerrit

Fri, May 20

hashar claimed T304947: Investigate sending Gerrit events to our data lake.
Fri, May 20, 3:25 PM · Patch-For-Review, Gerrit, Data³
hashar added a comment to T304947: Investigate sending Gerrit events to our data lake.

I believe I have fixed the above issues. I have published the generated schemas at https://people.wikimedia.org/~hashar/T304947/schemas/

Fri, May 20, 3:25 PM · Patch-For-Review, Gerrit, Data³

Thu, May 19

hashar added a comment to T308691: Fatal exception of type "CannotCreateActorException" when trying to export file from zhwikibooks to commons.

Thank you so much @thiemowmde for your assistance and for taking some extra time to explain the last log messages I posted. The follow-up T308753 is an excellent idea to improve user reporting. It confused us since no message is logged on the backend at error level, so I was a bit clueless as to where the error was. Turns out it is "normal" behavior and does not deserve a backend error log ;)

Thu, May 19, 5:26 PM · MW-1.39-notes (1.39.0-wmf.13; 2022-05-23), Patch-For-Review, WMDE-TechWish-Maintenance, Unplanned-Sprint-Work, WMDE-TechWish-Sprint-2022-05-11, Platform Engineering, Move-Files-To-Commons, Wikimedia-production-error
hashar added a comment to T308753: FileImporter: Generic error "failed to commit operations" doesn't mention actual reason.

Thanks @thiemowmde !

Thu, May 19, 2:19 PM · Wikimedia-Hackathon-2022, good first task, Patch-For-Review, WMDE-TechWish-Maintenance, Move-Files-To-Commons
hashar added a comment to T308397: Carry out an admin election of zhwiki on votewiki (May 2022).

https://vote.wikimedia.org/ has been switched to language zh. The only problem I found is that https://vote.wikimedia.org/ redirects to https://vote.wikimedia.org/wiki/%E9%A6%96%E9%A1%B5 ( 首页 ), which does not exist.

Thu, May 19, 2:01 PM · Wikimedia-Site-requests, Elections, Trust-and-Safety, Chinese-Sites
hashar added a comment to T308691: Fatal exception of type "CannotCreateActorException" when trying to export file from zhwikibooks to commons.

And for the failed to commit message I believe it is:

| May 19, 2022 @ 13:26:58.182 | FileImporter | mw1369 | commonswiki | FileImporter\Services\Importer::commitImportOperations Failed to commit operations.
| May 19, 2022 @ 13:26:58.168 | FileImporter | mw1369 | commonswiki | FileImporter\Operations\FileRevisionFromRemoteUrl::commit failed to commit.
| May 19, 2022 @ 13:26:53.640 | FileImporter | mw1369 | commonswiki | Performing submit on ImportPlan for URL: https://zh.wikibooks.org/wiki/File:Wiki.png
| May 19, 2022 @ 13:26:53.640 | FileImporter | mw1369 | commonswiki | FileImporter\Services\Importer::import started
| May 19, 2022 @ 13:26:53.122 | FileImporter | mw1369 | commonswiki | Calculated two-hop interwiki prefix b:zh to zh.wikibooks.org
| May 19, 2022 @ 13:26:52.780 | FileImporter | mw1369 | commonswiki | Getting ImportPlan for URL: https://zh.wikibooks.org/wiki/File:Wiki.png
Thu, May 19, 1:51 PM · MW-1.39-notes (1.39.0-wmf.13; 2022-05-23), Patch-For-Review, WMDE-TechWish-Maintenance, Unplanned-Sprint-Work, WMDE-TechWish-Sprint-2022-05-11, Platform Engineering, Move-Files-To-Commons, Wikimedia-production-error
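
The log rows above are listed newest first. To read them in the order the events happened, they can be parsed and sorted by timestamp. The sketch below assumes the pipe-delimited layout of the excerpt (timestamp, channel, host, wiki, message); the `LogRow` type and `parse_rows` helper are hypothetical names introduced for illustration.

```python
from datetime import datetime
from typing import NamedTuple

# Four of the rows quoted above, copied as they appear (newest first).
RAW = r"""
| May 19, 2022 @ 13:26:58.168 | FileImporter | mw1369 | commonswiki | FileImporter\Operations\FileRevisionFromRemoteUrl::commit failed to commit.
| May 19, 2022 @ 13:26:53.640 | FileImporter | mw1369 | commonswiki | Performing submit on ImportPlan for URL: https://zh.wikibooks.org/wiki/File:Wiki.png
| May 19, 2022 @ 13:26:53.640 | FileImporter | mw1369 | commonswiki | FileImporter\Services\Importer::import started
| May 19, 2022 @ 13:26:52.780 | FileImporter | mw1369 | commonswiki | Getting ImportPlan for URL: https://zh.wikibooks.org/wiki/File:Wiki.png
"""


class LogRow(NamedTuple):
    timestamp: datetime
    channel: str
    host: str
    wiki: str
    message: str


def parse_rows(text):
    """Parse pipe-delimited rows; the message may itself contain '|'."""
    rows = []
    for line in text.strip().splitlines():
        # Leading '|' produces an empty first field, hence 6 parts.
        _, ts, channel, host, wiki, message = line.split("|", 5)
        stamp = datetime.strptime(ts.strip(), "%b %d, %Y @ %H:%M:%S.%f")
        rows.append(LogRow(stamp, channel.strip(), host.strip(), wiki.strip(), message.strip()))
    return rows


# Reverse before the stable sort so rows sharing a timestamp also end up
# in oldest-first order.
ordered = sorted(reversed(parse_rows(RAW)), key=lambda row: row.timestamp)
for row in ordered:
    print(row.timestamp.time(), row.message)
```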
hashar added a comment to T308691: Fatal exception of type "CannotCreateActorException" when trying to export file from zhwikibooks to commons.

Looking at the logs for channel:FileImporter, I see these messages:

Getting ImportPlan for URL: https://zh.wikibooks.org/wiki/File:Wiki.png
Calculated two-hop interwiki prefix b:zh to zh.wikibooks.org
ImportException: This page has been protected to prevent editing or other actions.

And

Getting ImportPlan for URL: https://zh.wikibooks.org/wiki/File:Wiki.png
Calculated two-hop interwiki prefix b:zh to zh.wikibooks.org
ImportException: File already on wiki
Thu, May 19, 1:43 PM · MW-1.39-notes (1.39.0-wmf.13; 2022-05-23), Patch-For-Review, WMDE-TechWish-Maintenance, Unplanned-Sprint-Work, WMDE-TechWish-Sprint-2022-05-11, Platform Engineering, Move-Files-To-Commons, Wikimedia-production-error
hashar added a comment to T308691: Fatal exception of type "CannotCreateActorException" when trying to export file from zhwikibooks to commons.

Sorry, I was confused: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/FileImporter/+/793415 is for master and is the proper long-term fix.

Thu, May 19, 12:54 PM · MW-1.39-notes (1.39.0-wmf.13; 2022-05-23), Patch-For-Review, WMDE-TechWish-Maintenance, Unplanned-Sprint-Work, WMDE-TechWish-Sprint-2022-05-11, Platform Engineering, Move-Files-To-Commons, Wikimedia-production-error
hashar added a comment to T308691: Fatal exception of type "CannotCreateActorException" when trying to export file from zhwikibooks to commons.

Thank you Thiemo! Given that @Tgr looked at https://gerrit.wikimedia.org/r/c/mediawiki/extensions/FileImporter/+/793415 and that WMDE people are in a meeting this afternoon, I have settled on deploying the change you proposed. The worst-case scenario is that FileImporter stays as broken as it is currently.

Thu, May 19, 12:47 PM · MW-1.39-notes (1.39.0-wmf.13; 2022-05-23), Patch-For-Review, WMDE-TechWish-Maintenance, Unplanned-Sprint-Work, WMDE-TechWish-Sprint-2022-05-11, Platform Engineering, Move-Files-To-Commons, Wikimedia-production-error
hashar added a comment to T308691: Fatal exception of type "CannotCreateActorException" when trying to export file from zhwikibooks to commons.

Side track: \MediaWiki\User\UserNameUtils::isUsable() has been changed by https://gerrit.wikimedia.org/r/c/mediawiki/core/+/786345, but that got deployed with wmf.10 and it merely replaces a string with a constant.

Thu, May 19, 8:07 AM · MW-1.39-notes (1.39.0-wmf.13; 2022-05-23), Patch-For-Review, WMDE-TechWish-Maintenance, Unplanned-Sprint-Work, WMDE-TechWish-Sprint-2022-05-11, Platform Engineering, Move-Files-To-Commons, Wikimedia-production-error
hashar updated subscribers of T308691: Fatal exception of type "CannotCreateActorException" when trying to export file from zhwikibooks to commons.

+ @thiemowmde who has proposed the change to FileImporter

Thu, May 19, 8:04 AM · MW-1.39-notes (1.39.0-wmf.13; 2022-05-23), Patch-For-Review, WMDE-TechWish-Maintenance, Unplanned-Sprint-Work, WMDE-TechWish-Sprint-2022-05-11, Platform Engineering, Move-Files-To-Commons, Wikimedia-production-error
hashar closed T307405: Broken dashboard links on Zuul Status page as Resolved.

Should be good now

Thu, May 19, 7:27 AM · Graphite, Upstream, Release-Engineering-Team, Continuous-Integration-Infrastructure, Zuul

Wed, May 18

hashar assigned T307405: Broken dashboard links on Zuul Status page to TheresNoTime.

@TheresNoTime proposed a patch and fixed a few other links. It is blocked on an unrelated Phan build failure on the repo when run under PHP 8.1, which is T308692 and has a fix :)

Wed, May 18, 5:55 PM · Graphite, Upstream, Release-Engineering-Team, Continuous-Integration-Infrastructure, Zuul
hashar created T308693: phan is apparently not run on CI for integration/docroot.
Wed, May 18, 5:15 PM · Continuous-Integration-Infrastructure, phan
hashar added a comment to T308478: Add Antoine Musso to Phabricator hosts.

Confirmed. Thank you very much @Marostegui

Wed, May 18, 3:17 PM · Phabricator, Release-Engineering-Team, SRE, SRE-Access-Requests
hashar closed T307740: contint/releases/hosts with helm installed: puppet - Could not find group deployment as Resolved.

With https://gerrit.wikimedia.org/r/791565 deployed, the CI servers have /var/cache/helm owned by helm:contint-admins and Puppet runs fine now :] Thank you @jbond

Wed, May 18, 3:16 PM · Continuous-Integration-Infrastructure, serviceops, SRE
hashar closed T307339: Upgrade Jenkins to the next LTS, 2.332.2 as Resolved.
Wed, May 18, 8:37 AM · Patch-For-Review, Release-Engineering-Team (Next), Jenkins, Continuous-Integration-Infrastructure
hashar created T308637: Use Jenkins upstream systemd unit instead of our own.
Wed, May 18, 8:34 AM · Release-Engineering-Team (Next), Jenkins, Continuous-Integration-Infrastructure
hashar added a comment to T307339: Upgrade Jenkins to the next LTS, 2.332.2.

The agents could not connect:

10:12:07 <hashar> [05/18/22 08:06:32] [SSH] Copying latest remoting.jar...
10:12:07 <hashar> java.io.IOException: Could not copy remoting.jar into '/srv/jenkins/workspace' on agent
Wed, May 18, 8:18 AM · Patch-For-Review, Release-Engineering-Team (Next), Jenkins, Continuous-Integration-Infrastructure
hashar added a comment to T307339: Upgrade Jenkins to the next LTS, 2.332.2.

The package ships an /etc/default/jenkins, which our Puppet erases, and it also comes with a systemd unit which we might consider using instead of our Puppet one.

Wed, May 18, 8:02 AM · Patch-For-Review, Release-Engineering-Team (Next), Jenkins, Continuous-Integration-Infrastructure
hashar updated the task description for T307339: Upgrade Jenkins to the next LTS, 2.332.2.
Wed, May 18, 7:36 AM · Patch-For-Review, Release-Engineering-Team (Next), Jenkins, Continuous-Integration-Infrastructure
hashar added a parent task for T307339: Upgrade Jenkins to the next LTS, 2.332.2: Unknown Object (Task).
Wed, May 18, 7:35 AM · Patch-For-Review, Release-Engineering-Team (Next), Jenkins, Continuous-Integration-Infrastructure

Tue, May 17

hashar added a comment to T304947: Investigate sending Gerrit events to our data lake.

My pull request has been merged and the maintainer addressed all the concerns directly via https://github.com/victools/jsonschema-generator/pull/255/ (merged as well).

Tue, May 17, 9:40 AM · Patch-For-Review, Gerrit, Data³
hashar added a comment to T300303: DBQueryError: Deadlock found when trying to get lock; try restarting transaction (db1138)Function: WikiPage::updateCategoryCountsQuery: UPDATE `category` SET cat_pages = cat_pages - 1,cat_files = cat_files - 1 WHERE cat_title = '[title]'.

One occurrence today with 1.39.0-wmf.10, there are two messages shown for the same reqId:

Tue, May 17, 7:38 AM · User-brennen, Platform Engineering, Wikimedia-production-error

Mon, May 16

RhinosF1 awarded T308478: Add Antoine Musso to Phabricator hosts a Like token.
Mon, May 16, 7:15 PM · Phabricator, Release-Engineering-Team, SRE, SRE-Access-Requests
hashar updated the task description for T308478: Add Antoine Musso to Phabricator hosts.
Mon, May 16, 6:54 PM · Phabricator, Release-Engineering-Team, SRE, SRE-Access-Requests
hashar added a comment to T306828: Open technical questions about DDD dashboards .

Not sure there is much I can do to help @Mhurd on that front, but I can surely finally read the python script which generates the metrics.db file and I have filed a task to get access to the Phabricator hosts T308478

Mon, May 16, 6:50 PM · Data³
brennen awarded T308478: Add Antoine Musso to Phabricator hosts a Like token.
Mon, May 16, 6:50 PM · Phabricator, Release-Engineering-Team, SRE, SRE-Access-Requests
hashar created T308478: Add Antoine Musso to Phabricator hosts.
Mon, May 16, 6:49 PM · Phabricator, Release-Engineering-Team, SRE, SRE-Access-Requests
hashar added a comment to T304947: Investigate sending Gerrit events to our data lake.

I have made a bit more progress today and managed to get a Json Schema which validates a comment added event from the Gerrit Java class!

Mon, May 16, 6:34 PM · Patch-For-Review, Gerrit, Data³
hashar moved T308290: train-dev's Gerrit zuul plugin returns a different object than production Gerrit from INBOX to Doing on the Release-Engineering-Team board.
Mon, May 16, 11:11 AM · Release-Engineering-Team (Doing), MediaWiki Train Development Environment, Scap
hashar added a comment to T308290: train-dev's Gerrit zuul plugin returns a different object than production Gerrit.

That is due to T307621 . The upstream repo was lagging behind and its master branch did not work with Gerrit 3.

Mon, May 16, 11:10 AM · Release-Engineering-Team (Doing), MediaWiki Train Development Environment, Scap
hashar updated the task description for T308290: train-dev's Gerrit zuul plugin returns a different object than production Gerrit.
Mon, May 16, 11:07 AM · Release-Engineering-Team (Doing), MediaWiki Train Development Environment, Scap
hashar closed T308020: Run scap CI against Stretch, Buster, Bullseye as Resolved.

We now ensure the Scap test suite works against Debian Stretch, Buster and Bullseye. I even caught a Python 3.9 incompatibility while doing so: https://gerrit.wikimedia.org/r/c/mediawiki/tools/scap/+/775857 :)

Mon, May 16, 8:23 AM · Release-Engineering-Team (Doing), Patch-For-Review, Scap

Sat, May 14

hashar triaged T308382: Error 500 (Server Error): Internal server error: Endpoint: /changes/*~*/comments as Low priority.

Thank you for taking the time to file this report. I have looked at the Gerrit server and added the stacktrace to this task description.

Sat, May 14, 11:20 AM · Gerrit
hashar updated the task description for T308382: Error 500 (Server Error): Internal server error: Endpoint: /changes/*~*/comments.
Sat, May 14, 11:15 AM · Gerrit
hashar updated the task description for T308382: Error 500 (Server Error): Internal server error: Endpoint: /changes/*~*/comments.
Sat, May 14, 11:15 AM · Gerrit
hashar closed T307137: Gerrit replication after a restart takes roughly 5 hours as Resolved.

The latency is at ~ 100 ms which is good. I don't think there is any specific improvement to be made.

Sat, May 14, 11:13 AM · Release-Engineering-Team (Doing), Gerrit
hashar added a comment to T307137: Gerrit replication after a restart takes roughly 5 hours.

After the deployment of 4 threads for replication to codfw, we can see it is faster:

Sat, May 14, 11:12 AM · Release-Engineering-Team (Doing), Gerrit

Fri, May 13

hashar added a comment to T304947: Investigate sending Gerrit events to our data lake.

P27828 is a CommentAdded json event

Fri, May 13, 10:08 PM · Patch-For-Review, Gerrit, Data³
hashar added a comment to P27829 Gerrit CommentAdded generated Json schema.

Automatically generated schema for Gerrit CommentAdded event for T304947

Fri, May 13, 10:06 PM
hashar updated the language for P27828 Gerrit CommentAdded json event from autodetect to json.
Fri, May 13, 10:05 PM
hashar created P27829 Gerrit CommentAdded generated Json schema.
Fri, May 13, 10:05 PM
hashar updated the title for P27828 Gerrit CommentAdded json event from Gerrit comment added json event to Gerrit CommentAdded json event.
Fri, May 13, 10:04 PM
hashar updated the title for P27828 Gerrit CommentAdded json event from Gerrit comment added to Gerrit comment added json event.
Fri, May 13, 10:03 PM
hashar created P27828 Gerrit CommentAdded json event.
Fri, May 13, 10:02 PM
hashar updated subscribers of T307538: Write a GitLab "Migrating a Project" runbook / manual based on Blubber migration.

Archival of a Gerrit repository is done manually via Projects-Cleanup, which has a form listing all the required steps. We have a long-standing task to automate that process, which is T175499.

Fri, May 13, 9:18 AM · Release-Engineering-Team (GitLab-a-thon 🦊), User-brennen, User-dduvall, GitLab (Project Migration)
hashar added a comment to T305729: Kubernetes credentials on deployment servers should be available to deployers, not all users.

Puppet fails on the contint* hosts because /var/cache/helm went from being owned by wikidev to deployment, a group which does not exist on those hosts. Filed as T307740

Fri, May 13, 8:52 AM · Release-Engineering-Team (Radar), Patch-For-Review, Kubernetes, MW-on-K8s, serviceops
hashar updated subscribers of T307740: contint/releases/hosts with helm installed: puppet - Could not find group deployment.

From the contint1001 /var/log/puppet.log* files, the last good run was:

Apr 27 15:32:34 contint1001 puppet-agent[14311]: Caching catalog for contint1001.wikimedia.org
Apr 27 15:32:34 contint1001 puppet-agent[14311]: Applying configuration version '(891b0a4c36) Giuseppe Lavagetto - varnish: switch to using new-style request filters'
Apr 27 15:32:56 contint1001 puppet-agent[14311]: Applied catalog in 22.20 seconds
Fri, May 13, 8:45 AM · Continuous-Integration-Infrastructure, serviceops, SRE

Thu, May 12

hashar added a project to T308194: Make visual regression tests run in CI (non-blocking) for the Vector repo: Release-Engineering-Team.

For each Gerrit patch set made to Vector, CI runs a command that takes a visual diff of master against that change/changes on top of master.

Thu, May 12, 8:58 PM · Release-Engineering-Team, Readers-Web-Backlog (Needs Prioritization (Tech)), Web Team Visual Regression Framework
hashar added a comment to T307137: Gerrit replication after a restart takes roughly 5 hours.

I first tried a manual replication to gerrit2001, but only one thread was processing. I guess the plugin had to be reloaded somehow, so I chose to restart Gerrit instead. And:

$ gerrit show-queue -w|grep -v waiting
+ ssh -p 29418 hashar@gerrit.wikimedia.org gerrit show-queue -w
Task     State        StartTime         Command
------------------------------------------------------------------------------
e8eb46b9              19:58:37.654      [a80eae9a] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/pywikibot/pycolorname.git [..all..]
481132bd              19:58:37.653      [08001a65] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/operations/software/bernard.git [..all..]
28fb9e88              19:58:37.652      [e8002665] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/blubber-doc/example/calculator-service.git [..all..]
88340a4d              19:58:37.652      [482a92e5] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/operations/debs/wikimedia-search-qa.git [..all..]
3dd0eb0f              19:59:07.419      [1dd3a71b] push git@github.com:wikimedia/analytics-log2udp2 [..all..]
...
Thu, May 12, 8:12 PM · Release-Engineering-Team (Doing), Gerrit
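
To check how many replication pushes run concurrently per destination, the queue listing above can be tallied by host. This is a rough sketch that assumes the `push user@host:path` layout seen in the excerpt; `pushes_per_host` is a hypothetical helper, and the sample text is limited to the five task rows quoted above.

```python
import re
from collections import Counter

QUEUE_OUTPUT = """\
e8eb46b9              19:58:37.654      [a80eae9a] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/pywikibot/pycolorname.git [..all..]
481132bd              19:58:37.653      [08001a65] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/operations/software/bernard.git [..all..]
28fb9e88              19:58:37.652      [e8002665] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/blubber-doc/example/calculator-service.git [..all..]
88340a4d              19:58:37.652      [482a92e5] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/operations/debs/wikimedia-search-qa.git [..all..]
3dd0eb0f              19:59:07.419      [1dd3a71b] push git@github.com:wikimedia/analytics-log2udp2 [..all..]
"""

# Capture the host between '@' and the ':' that starts the remote path.
PUSH = re.compile(r"\bpush [^@\s]+@(?P<host>[^:\s]+):")


def pushes_per_host(text):
    """Count running push tasks per destination host."""
    matches = (PUSH.search(line) for line in text.splitlines())
    return Counter(match["host"] for match in matches if match)


counts = pushes_per_host(QUEUE_OUTPUT)
for host, count in counts.most_common():
    print(host, count)
```

In the excerpt, four tasks target gerrit2001 at once, consistent with the four replication threads mentioned elsewhere in the task.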
hashar added a comment to T307137: Gerrit replication after a restart takes roughly 5 hours.
19:57:12 <hashar> !log Restarting Gerrit
Thu, May 12, 8:10 PM · Release-Engineering-Team (Doing), Gerrit
hashar updated the task description for T307599: Investigate alternatives to docker-in-docker for container image creation in GitLab.
Thu, May 12, 4:28 PM · Release-Engineering-Team (GitLab-a-thon 🦊), GitLab (CI & Job Runners)
hashar added a comment to T303857: Need a service account on deploy servers for automated train pre-sync operations.

I am not sure what happened but the deployment group does not exist on contint2001 / contint1001 (which has Helm)

Error: Could not find group deployment
Error: /Stage[main]/Helm/File[/var/cache/helm]/group: change from 'wikidev' to 'deployment' failed: Could not find group deployment
Thu, May 12, 3:53 PM · Release-Engineering-Team (Radar), SRE-Access-Requests, serviceops, SRE, Infrastructure-Foundations
hashar added a comment to T226869: Run browser tests in parallel.

Although PHPUnit integration tests and QUnit are centrally set up in mediawiki/core, webdriver.io tests are split across the repositories. That is the model also used for the linters (which are run via composer test and npm test), and it lets us use different versions of webdriver.io. It is too challenging, if not impossible, to force-migrate all repositories at the same time.

Thu, May 12, 3:31 PM · WMDE-TechWish-Maintenance, MW-1.38-notes (1.38.0-wmf.17; 2022-01-10), User-zeljkofilipin, Quibble, Patch-For-Review, MediaWiki-Core-Tests, Browser-Tests
hashar closed T299492: Quibble ci-fullrun jobs should use Apache backend as Resolved.

Indeed looks done.

Thu, May 12, 9:10 AM · Patch-For-Review, Quibble
hashar closed T299492: Quibble ci-fullrun jobs should use Apache backend, a subtask of T299491: Switch QUnit tests to use Apache backend, as Resolved.
Thu, May 12, 9:10 AM · WMDE-TechWish-Maintenance, WMDE-TechWish-Sprint-2022-01-19, Unplanned-Sprint-Work, Quibble

Wed, May 11

hashar added a comment to T307405: Broken dashboard links on Zuul Status page.

I guess they can be replaced by:
https://grafana.wikimedia.org/d/000000321/zuul
https://grafana.wikimedia.org/d/000000284/continuous-integration

Wed, May 11, 5:52 PM · Graphite, Upstream, Release-Engineering-Team, Continuous-Integration-Infrastructure, Zuul
hashar claimed T307620: Delete .git/logs on zuul-merger git repo to prevent reflog from being created.
Wed, May 11, 2:23 PM · Release-Engineering-Team (Doing), Patch-For-Review, Zuul, Continuous-Integration-Infrastructure
hashar moved T307620: Delete .git/logs on zuul-merger git repo to prevent reflog from being created from INBOX to Doing on the Release-Engineering-Team board.
Wed, May 11, 2:23 PM · Release-Engineering-Team (Doing), Patch-For-Review, Zuul, Continuous-Integration-Infrastructure
hashar moved T308020: Run scap CI against Stretch, Buster, Bullseye from INBOX to Doing on the Release-Engineering-Team board.
Wed, May 11, 2:22 PM · Release-Engineering-Team (Doing), Patch-For-Review, Scap
hashar edited projects for T307538: Write a GitLab "Migrating a Project" runbook / manual based on Blubber migration, added: Release-Engineering-Team (GitLab-a-thon 🦊); removed Release-Engineering-Team (Doing).
Wed, May 11, 2:21 PM · Release-Engineering-Team (GitLab-a-thon 🦊), User-brennen, User-dduvall, GitLab (Project Migration)
hashar edited projects for T307599: Investigate alternatives to docker-in-docker for container image creation in GitLab, added: Release-Engineering-Team (GitLab-a-thon 🦊); removed Release-Engineering-Team (Doing).
Wed, May 11, 2:21 PM · Release-Engineering-Team (GitLab-a-thon 🦊), GitLab (CI & Job Runners)
hashar edited projects for T307810: Investigate buildkitd instances as image builders for GitLab, added: Release-Engineering-Team (GitLab-a-thon 🦊); removed Release-Engineering-Team (Doing).
Wed, May 11, 2:20 PM · GitLab (CI & Job Runners), Release-Engineering-Team (GitLab-a-thon 🦊)
hashar moved T307620: Delete .git/logs on zuul-merger git repo to prevent reflog from being created from Untriaged to In-progress on the Continuous-Integration-Infrastructure board.
Wed, May 11, 2:18 PM · Release-Engineering-Team (Doing), Patch-For-Review, Zuul, Continuous-Integration-Infrastructure
hashar moved T307655: Replacement needed for obsolete Diamond/Graphite monitoring of integration instances from Untriaged to In-progress on the Continuous-Integration-Infrastructure board.
Wed, May 11, 2:18 PM · cloud-services-team (Kanban), Cloud-VPS, observability, Release-Engineering-Team, Continuous-Integration-Infrastructure
hashar added a comment to T307655: Replacement needed for obsolete Diamond/Graphite monitoring of integration instances.

I have been using the Cloud VPS Project Board dashboard, but it retrieves metrics from Graphite, which is how I found out the integration instances have vanished.

Wed, May 11, 2:18 PM · cloud-services-team (Kanban), Cloud-VPS, observability, Release-Engineering-Team, Continuous-Integration-Infrastructure
hashar moved T307810: Investigate buildkitd instances as image builders for GitLab from GitLab-a-thon 🦊 to Doing on the Release-Engineering-Team board.
Wed, May 11, 2:10 PM · GitLab (CI & Job Runners), Release-Engineering-Team (GitLab-a-thon 🦊)
hashar moved T307599: Investigate alternatives to docker-in-docker for container image creation in GitLab from GitLab-a-thon 🦊 to Doing on the Release-Engineering-Team board.
Wed, May 11, 2:10 PM · Release-Engineering-Team (GitLab-a-thon 🦊), GitLab (CI & Job Runners)
hashar moved T307538: Write a GitLab "Migrating a Project" runbook / manual based on Blubber migration from GitLab-a-thon 🦊 to Doing on the Release-Engineering-Team board.
Wed, May 11, 2:10 PM · Release-Engineering-Team (GitLab-a-thon 🦊), User-brennen, User-dduvall, GitLab (Project Migration)
hashar added a comment to T308129: Avoid "_XSERVTransmkdir" warning noise when starting Xvfb in fresh-node.

This is an exact duplicate of T202710. It is due to /tmp being owned by root, which does not match the UID of the Xvfb process (nobody), so it emits an error, completely missing the fact that it can actually write to /tmp.

Wed, May 11, 1:49 PM · User-zeljkofilipin, patch-welcome, Fresh, Performance-Team
hashar updated the task description for T308129: Avoid "_XSERVTransmkdir" warning noise when starting Xvfb in fresh-node.
Wed, May 11, 1:40 PM · User-zeljkofilipin, patch-welcome, Fresh, Performance-Team
hashar updated subscribers of T307178: CiviCRM CI jobs fails when migrating from Stretch to Bullseye.

I have deployed an experimental job which uses a Bullseye image and triggered it on https://gerrit.wikimedia.org/r/c/wikimedia/fundraising/crm/+/787694 by commenting check experimental. That leads to:

+ /src/wikimedia/fundraising/crm/bin/ci-create-dbs.sh
ERROR 1238 (HY000) at line 1: Variable 'innodb_file_format' is a read only variable
Wed, May 11, 1:31 PM · Patch-For-Review, Wikimedia-Fundraising-CiviCRM, Continuous-Integration-Infrastructure
hashar added a comment to T308013: Assign SPDX headers to puppet.git.

Before October 1st 2012, the code is my own and per my contract at the time: "source code contributed as part of this contract relationship will be licensed under an applicable open source license" and I hereby place it under Apache License 2 with copyright Antoine Musso <hashar@free.fr>. Then I don't know whether there is a lot left from this era beside the Rake puppet-lint, contint and jenkins modules.

Wed, May 11, 12:13 PM · Patch-For-Review, Infrastructure-Foundations, SRE
hashar added a comment to T225730: Reduce runtime of MW shared gate Jenkins jobs to 5 min.

This is really getting frustrating for the wmf branches. E.g. gerrit 785944 spent an hour in CI, then errored out with Build timed out (after 60 minutes). Marking the build as failed.. gerrit 785941 took 92 minutes to merge. It's basically getting impossible to do an extension backport within the one-hour deploy window; not to mention multiple backports.

Wed, May 11, 8:44 AM · MW-1.39-notes (1.39.0-wmf.8; 2022-04-18), MW-1.38-notes (1.38.0-wmf.16; 2022-01-03), Release-Engineering-Team (Next), MW-1.36-notes (1.36.0-wmf.36; 2021-03-23), MW-1.35-notes (1.35.0-wmf.27; 2020-04-07), Patch-For-Review, Developer Productivity, Code-Health, Performance-Team (Radar), Epic, MediaWiki-Core-Tests, Continuous-Integration-Config
hashar added a comment to T287582: Move some Wikibase selenium tests to a standalone job.

When trying the jobs yesterday they failed again. The reason is that the extension dependencies are not injected by Zuul since the change is in progress and not deployed. I have been testing them by manually triggering them on Jenkins, I have added a temp change https://gerrit.wikimedia.org/r/c/integration/config/+/790984 to inject EXT_DEPENDENCIES=mediawiki/skins/MinervaNeue\nmediawiki/extensions/MobileFrontend\nmediawiki/extensions/UniversalLanguageSelector.

Wed, May 11, 8:35 AM · Release-Engineering-Team (Next), Wikidata, Patch-For-Review, Continuous-Integration-Config, Wikidata-Campsite, Wikibase (3rd party installations), wdwb-tech
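
In the comment above, the EXT_DEPENDENCIES value separates repository names with the two-character sequence `\n` rather than real newlines. The sketch below shows one way to split such a value. It assumes the separator is literal as written; how the CI job itself decodes the variable is not shown in the source, and `split_ext_dependencies` is a hypothetical helper.

```python
def split_ext_dependencies(value):
    """Split on a literal backslash followed by 'n', dropping empty entries."""
    return [name for name in value.split("\\n") if name]


# Raw string so the backslash-n pairs stay literal, as in the comment.
value = (
    r"mediawiki/skins/MinervaNeue\n"
    r"mediawiki/extensions/MobileFrontend\n"
    r"mediawiki/extensions/UniversalLanguageSelector"
)

for repo in split_ext_dependencies(value):
    print(repo)
```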
hashar reopened T304860: Blubber must not use easy_install to install pip, but python3-pip as "Open".

Reopening since https://gerrit.wikimedia.org/r/c/mediawiki/services/wikispeech/mishkal/+/774770 still has to be reviewed and merged. I have reached out by email to @Sebastian_Berlin-WMSE and @kalle from Wikimedia Sweden.

Wed, May 11, 7:52 AM · Release-Engineering-Team (Next), Patch-For-Review, User-dduvall, Release Pipeline (Blubber)
hashar added a parent task for T308039: cxserver CI fails with missing file: T307507: Fully deprecate service-pipeline-test and service-pipeline-test-and-publish jobs.
Wed, May 11, 7:38 AM · Language-Team (Language-2022-April-June), Unplanned-Sprint-Work, ci-test-error, CX-cxserver
hashar added a subtask for T307507: Fully deprecate service-pipeline-test and service-pipeline-test-and-publish jobs: T308039: cxserver CI fails with missing file.
Wed, May 11, 7:38 AM · Patch-For-Review, User-dduvall, Release-Engineering-Team (Doing), Continuous-Integration-Config, Release Pipeline

Tue, May 10

hashar updated the task description for T308020: Run scap CI against Stretch, Buster, Bullseye.
Tue, May 10, 7:58 PM · Release-Engineering-Team (Doing), Patch-For-Review, Scap
hashar claimed T308020: Run scap CI against Stretch, Buster, Bullseye.
Tue, May 10, 7:57 PM · Release-Engineering-Team (Doing), Patch-For-Review, Scap
hashar created T308020: Run scap CI against Stretch, Buster, Bullseye.
Tue, May 10, 2:08 PM · Release-Engineering-Team (Doing), Patch-For-Review, Scap
hashar closed T300340: Use Memcached with Quibble as Resolved.

It is finally fully deployed

Tue, May 10, 12:24 PM · Patch-For-Review, Quibble
hashar closed T300340: Use Memcached with Quibble, a subtask of T294260: Upgrade dockerfiles to use composer 2.1.9 per CVE-2021-41116, as Resolved.
Tue, May 10, 12:23 PM · Continuous-Integration-Infrastructure, Composer
hashar added a comment to T307990: Changes to cu_changes.sql causing jenkins tests failure.

I guess we want to:

Tue, May 10, 10:05 AM · MW-1.39-notes (1.39.0-wmf.12; 2022-05-16), ci-test-error, CheckUser, Regression
hashar added a comment to T199544: Make AbuseFilter work on PostgreSQL and SQLite (epic).

The CI change https://gerrit.wikimedia.org/r/c/integration/config/+/653121 enforces Sqlite for AbuseFilter, but since CI injects other extensions they should be checked against Sqlite as well. T307990 is about dropping Sqlite support in CheckUser, which ends up breaking AbuseFilter CI.

Tue, May 10, 10:04 AM · Epic, PostgreSQL, SQLite, AbuseFilter
hashar updated subscribers of T307990: Changes to cu_changes.sql causing jenkins tests failure.

I have looked at the CI configuration (integration/config), AbuseFilter is apparently the sole extension running Sqlite based jobs on CI. That was done by @Jdforrester-WMF via https://gerrit.wikimedia.org/r/c/integration/config/+/653121 for T251967: quibble-vendor-sqlite-php72-docker is broken by AbuseFilter. The reason is the task about adding Sqlite and PostgreSQL support to AbuseFilter T199544.

Tue, May 10, 10:02 AM · MW-1.39-notes (1.39.0-wmf.12; 2022-05-16), ci-test-error, CheckUser, Regression