hashar (Antoine "hashar" Musso)
User, Administrator

User Details

User Since
Oct 3 2014, 2:31 PM (344 w, 12 h)
Roles
Administrator
Availability
Available
IRC Nick
hashar
LDAP User
Hashar
MediaWiki User
Unknown

https://www.mediawiki.org/wiki/User:Hashar

Based in Nantes, France CET/CEST (UTC+1, UTC+2)

Recent Activity

Thu, May 6

hashar added a comment to T281737: Zuul can't stop jobs or set the build description.

https://github.com/jenkinsci/gearman-plugin/pull/13 updates the code to a bit more modern interface :]

Thu, May 6, 9:40 PM · Release-Engineering-Team (Doing), Patch-For-Review, Zuul, Jenkins, Continuous-Integration-Infrastructure
hashar updated subscribers of T280197: Gerritbot turns "+" into space, thus breaking most Gerrit URLs.

I have sent a tentative patch upstream to URL-encode the POST requests made to Conduit: https://gerrit-review.googlesource.com/c/plugins/its-phabricator/+/305522 But I don't know how to write tests in Java, and I have no idea how to test the code I wrote :-\

Thu, May 6, 8:24 PM · Release-Engineering-Team (Doing), Developer Productivity, Regression, GerritBot
hashar added a comment to T280197: Gerritbot turns "+" into space, thus breaking most Gerrit URLs.

The its-phabricator code that sends to Conduit is:

conduit/ConduitConnection.java
/**
 * Calls a conduit method with some parameters
 *
 * @param method The name of the method that should get called
 * @param params A map of parameters to pass to the call
 * @return The call's result, if there has been no error
 * @throws ConduitException
 */
JsonElement call(String method, Map<String, Object> params, String token)
    throws ConduitException {
  String methodUrl = apiUrlBase + method;
Thu, May 6, 3:44 PM · Release-Engineering-Team (Doing), Developer Productivity, Regression, GerritBot
hashar closed T267172: integration-quibble-fullrun should use a tmpfs for the mysql database as Resolved.

Done in January via a721b6c0b7468df793b03d7e42e8c09a02884f52 which referenced the parent task T266777: integration instances suffer from high IO latency due to Ceph

Thu, May 6, 12:37 PM · Release-Engineering-Team (Seen), Continuous-Integration-Config

Wed, May 5

hashar added a subtask for T271863: Quibble runs core *integration* tests against Parsoid-as-an-extension, not *unit* tests: T281607: Release quibble 0.0.47 so we can build quibble stretch images again (py3.5 dependency).
Wed, May 5, 3:28 PM · Patch-For-Review, Quibble, Parsoid
hashar added a subtask for T199403: `composer test` in MediaWiki core doesn't work like it does in other repositories: T281607: Release quibble 0.0.47 so we can build quibble stretch images again (py3.5 dependency).
Wed, May 5, 3:28 PM · MW-1.35-notes, MW-1.31-release-notes, MW-1.36-notes (1.36.0-wmf.31; 2021-02-16), Patch-For-Review, Composer, MediaWiki-Core-Tests
hashar added parent tasks for T281607: Release quibble 0.0.47 so we can build quibble stretch images again (py3.5 dependency): T271863: Quibble runs core *integration* tests against Parsoid-as-an-extension, not *unit* tests, T199403: `composer test` in MediaWiki core doesn't work like it does in other repositories.
Wed, May 5, 3:28 PM · Release-Engineering-Team, Continuous-Integration-Infrastructure, Quibble
hashar claimed T281607: Release quibble 0.0.47 so we can build quibble stretch images again (py3.5 dependency).
Wed, May 5, 1:54 PM · Release-Engineering-Team, Continuous-Integration-Infrastructure, Quibble
hashar triaged T281981: Special:RecentChanges with userExpLevel=newcomer causes Fatal exception of type "Wikimedia\Rdbms\DBQueryError": Unknown column 'actor_user' as Unbreak Now! priority.
Wed, May 5, 1:38 PM · MW-1.37-notes (1.37.0-wmf.4; 2021-05-04), Platform Engineering, Growth-Team, MediaWiki-Recent-changes, Wikimedia-production-error
hashar added a parent task for T281981: Special:RecentChanges with userExpLevel=newcomer causes Fatal exception of type "Wikimedia\Rdbms\DBQueryError": Unknown column 'actor_user': T281145: 1.37.0-wmf.4 deployment blockers.
Wed, May 5, 1:38 PM · MW-1.37-notes (1.37.0-wmf.4; 2021-05-04), Platform Engineering, Growth-Team, MediaWiki-Recent-changes, Wikimedia-production-error
hashar added a subtask for T281145: 1.37.0-wmf.4 deployment blockers: T281981: Special:RecentChanges with userExpLevel=newcomer causes Fatal exception of type "Wikimedia\Rdbms\DBQueryError": Unknown column 'actor_user'.
Wed, May 5, 1:38 PM · User-brennen, Release-Engineering-Team (Doing), Release, Train Deployments
hashar added a comment to T281607: Release quibble 0.0.47 so we can build quibble stretch images again (py3.5 dependency).

Argh. So I looked at pending open patches:

Wed, May 5, 1:34 PM · Release-Engineering-Team, Continuous-Integration-Infrastructure, Quibble
hashar removed a project from T257378: EventLogging dev image should have verbose output enabled: Release-Engineering-Team.

It seems to be solely for the Analytics team.

Wed, May 5, 12:53 PM · Patch-For-Review, Analytics-EventLogging, Analytics, dev-images
hashar removed a project from T256131: Migrate push-notifications service to latest node LTS version: Release-Engineering-Team.
Wed, May 5, 12:50 PM · serviceops, Push-Notification-Service, Product-Infrastructure-Team-Backlog
hashar removed a project from T255981: Persistant error 500 getting category members: Release-Engineering-Team.
Wed, May 5, 12:50 PM · Platform Team Workboards (Clinic Duty Team), Upstream, Commons, ApiFeatureUsage, Pywikibot
hashar closed T252069: Gerrit New UI marks comments from Old UI as "Resolved" as Declined.

We have indeed upgraded to Gerrit 3.2 (June 2020) which comes solely with the new UI. So I guess that makes this task obsolete ;)

Wed, May 5, 12:49 PM · Release-Engineering-Team, Gerrit
hashar closed T248842: Last MW Core Security Release tags haven't pushed through to github as Resolved.

As per my previous comment, I could not find why the tag would not have replicated. Tags created afterwards got replicated properly. We have also since upgraded from Gerrit 2.15 to Gerrit 3.2, so I don't think this task is relevant nowadays.

Wed, May 5, 12:48 PM · Release-Engineering-Team, Wikimedia-GitHub, Gerrit
hashar removed a project from T245841: mcrouter proxies and scap proxies: Release-Engineering-Team.
Wed, May 5, 12:45 PM · SRE, serviceops
hashar removed a project from T243977: Expose flags for controlling API tests from the action API : Release-Engineering-Team.
Wed, May 5, 12:43 PM · MediaWiki-API, Platform Team Initiatives (API Integration Tests), Code-Health
hashar closed T243540: Exception: No PHID found for slug #mwwmf_deploy! as Resolved.

Assuming that got solved by the last patch.

Wed, May 5, 12:42 PM · Release-Engineering-Team, ReleaseTaggerBot
hashar removed a project from T239271: Upgrade Sinon.JS: Release-Engineering-Team.
Wed, May 5, 12:41 PM · Readers-Web-Backlog (Tracking), MediaWiki-Core-Tests, Technical-Debt
hashar added a project to T65744: Zuul: Implement support for customizing status_url to include the change.id: Technical-Debt.
Wed, May 5, 12:38 PM · Technical-Debt, Release-Engineering-Team, Zuul, Upstream, Continuous-Integration-Infrastructure
hashar closed T281674: Archive the search/xgboot repo in gerrit as Resolved.

Should be good. It is gone from CI and read-only in Gerrit.

Wed, May 5, 12:38 PM · Release-Engineering-Team (Doing), Discovery-Search (Current work), Projects-Cleanup
hashar updated the task description for T281674: Archive the search/xgboot repo in gerrit.
Wed, May 5, 12:37 PM · Release-Engineering-Team (Doing), Discovery-Search (Current work), Projects-Cleanup
hashar added a comment to T281737: Zuul can't stop jobs or set the build description.

The authentication-related calls switch the thread to the Jenkins instance privileges (ACL.SYSTEM instead of some user's privileges). The StopJobWorker and SetDescriptionWorker threads both call GearmanPluginUtil.findBuild(), which does invoke the authentication check:

src/main/java/hudson/plugins/gearman/GearmanPluginUtil.java
/**
 * Function to find the build with the unique build id.
 *
 * @param jobName
 *      The jenkins job or project name without folder name
 * @param buildNumber
 *      The jenkins build number
 * @return
 *      the build Run if found, otherwise return null
 */
public static Run<?,?> findBuild(String jobName, int buildNumber) {
Wed, May 5, 8:18 AM · Release-Engineering-Team (Doing), Patch-For-Review, Zuul, Jenkins, Continuous-Integration-Infrastructure
hashar added a comment to T281737: Zuul can't stop jobs or set the build description.

And that happened again yesterday:

May 04, 2021 7:13:03 PM SEVERE hudson.init.impl.InstallUncaughtExceptionHandler$DefaultUncaughtExceptionHandler uncaughtException
Wed, May 5, 7:52 AM · Release-Engineering-Team (Doing), Patch-For-Review, Zuul, Jenkins, Continuous-Integration-Infrastructure

Tue, May 4

hashar added a comment to T281674: Archive the search/xgboot repo in gerrit.

Great, thank you @EBernhardson . Will do the archival with @Gehel :]

Tue, May 4, 7:55 PM · Release-Engineering-Team (Doing), Discovery-Search (Current work), Projects-Cleanup
hashar added a comment to T280197: Gerritbot turns "+" into space, thus breaking most Gerrit URLs.

If I abandon a change in Gerrit with the content %28%2B%29, it is shown as is in Gerrit, but in Phabricator the comment has somehow been decoded and shows up as (+). So the comment got URL-decoded: either by the Gerrit its-phabricator / its-base plugin, due to the Soy templates being too smart, or because Phabricator Conduit expects the payload to be URL-encoded (that sounds weird).
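
The behaviour described above can be reproduced outside of Gerrit and Phabricator. The following is a minimal Python sketch, assuming the receiving side applies standard form decoding (`application/x-www-form-urlencoded`) to the payload. The helper name `simulate_form_decode` is hypothetical, and this is not the actual its-phabricator or Conduit code.

```python
from urllib.parse import quote_plus, unquote_plus


def simulate_form_decode(payload: str) -> str:
    """Decode a payload the way a form-decoding receiver would (assumption)."""
    return unquote_plus(payload)


# A Gerrit change URL containing the "/+/" path segment.
url = "https://gerrit.wikimedia.org/r/c/operations/puppet/+/683810"

# Sent without encoding: the "+" is read as an encoded space.
broken = simulate_form_decode(url)

# Sent percent-encoded: the receiver recovers the original text.
intact = simulate_form_decode(quote_plus(url))

# A comment that already looks URL-encoded gets decoded once more.
double_decoded = simulate_form_decode("%28%2B%29")
```

If the sender percent-encodes the payload before posting, a single decode on the receiving side restores the original text, "+" included.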

Tue, May 4, 4:25 PM · Release-Engineering-Team (Doing), Developer Productivity, Regression, GerritBot
hashar added a comment to T281737: Zuul can't stop jobs or set the build description.

Taking the Jenkins code for getHostName() from core/src/main/java/hudson/model/Computer.java and running it in https://integration.wikimedia.org/ci/script:

import java.net.NetworkInterface
Tue, May 4, 2:57 PM · Release-Engineering-Team (Doing), Patch-For-Review, Zuul, Jenkins, Continuous-Integration-Infrastructure

Mon, May 3

hashar added a comment to T281737: Zuul can't stop jobs or set the build description.

The code:

src/main/java/hudson/plugins/gearman/GearmanProxy.java
// constructor
private GearmanProxy() {
    gewtHandles = Collections.synchronizedList(new ArrayList<ExecutorWorkerThread>());
    gmwtHandles = Collections.synchronizedList(new ArrayList<ManagementWorkerThread>());
Mon, May 3, 5:37 PM · Release-Engineering-Team (Doing), Patch-For-Review, Zuul, Jenkins, Continuous-Integration-Infrastructure
hashar added a comment to T281737: Zuul can't stop jobs or set the build description.

The graph will remain high until the Gearman server is restarted to flush the backlog. That has to be done by restarting the Zuul scheduler which I will do tomorrow morning when CI is less busy.

Mon, May 3, 5:31 PM · Release-Engineering-Team (Doing), Patch-For-Review, Zuul, Jenkins, Continuous-Integration-Infrastructure
hashar lowered the priority of T281737: Zuul can't stop jobs or set the build description from Unbreak Now! to Medium.

After restarting, the manager is:

$ zuul-gearman.py workers|cut -b-90|grep manager
21 ::ffff:127.0.0.1 172.17.0.1_manager : set_description:172.17.0.1 stop:172.17.0.1
Mon, May 3, 5:27 PM · Release-Engineering-Team (Doing), Patch-For-Review, Zuul, Jenkins, Continuous-Integration-Infrastructure
hashar claimed T281737: Zuul can't stop jobs or set the build description.

The stop* and set_description* functions are handled by ManagementWorkerThread in the Gearman plugin. It has a single worker:

$ zuul-gearman.py workers|grep stop:contint2001
30 ::ffff:127.0.0.1 contint2001.wikimedia.org_manager : stop:contint2001.wikimedia.org set_description:contint2001.wikimedia.org
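
To check which functions a worker has registered, a line like the one above can be split into its fields. This is a small Python sketch, assuming the output follows the Gearman administrative protocol's `workers` layout (file descriptor, IP address, client id, a lone `:`, then the function names). The `GearmanWorker` class and `parse_worker_line` helper are hypothetical names and not part of zuul-gearman.py.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GearmanWorker:
    fd: int
    ip: str
    client_id: str
    functions: List[str]


def parse_worker_line(line: str) -> GearmanWorker:
    """Parse one line of `workers` output (layout is an assumption)."""
    tokens = line.split()
    # The separator is a standalone ":" token; IPv6 addresses such as
    # "::ffff:127.0.0.1" contain colons but never equal ":" on their own.
    sep = tokens.index(":")
    fd, ip, client_id = tokens[:sep]
    return GearmanWorker(int(fd), ip, client_id, tokens[sep + 1:])


worker = parse_worker_line(
    "30 ::ffff:127.0.0.1 contint2001.wikimedia.org_manager : "
    "stop:contint2001.wikimedia.org set_description:contint2001.wikimedia.org"
)
```
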
Mon, May 3, 5:11 PM · Release-Engineering-Team (Doing), Patch-For-Review, Zuul, Jenkins, Continuous-Integration-Infrastructure
hashar created T281737: Zuul can't stop jobs or set the build description.
Mon, May 3, 4:52 PM · Release-Engineering-Team (Doing), Patch-For-Review, Zuul, Jenkins, Continuous-Integration-Infrastructure
hashar added a comment to T281674: Archive the search/xgboot repo in gerrit.

The JJB config for search/mjolnir has a comment referring to search/xgboost, so maybe it is still used. Though most probably it is an artifact from the past and the code has since been updated to no longer rely on xgboost.

Mon, May 3, 8:22 AM · Release-Engineering-Team (Doing), Discovery-Search (Current work), Projects-Cleanup
hashar updated the task description for T281674: Archive the search/xgboot repo in gerrit.
Mon, May 3, 8:08 AM · Release-Engineering-Team (Doing), Discovery-Search (Current work), Projects-Cleanup

Fri, Apr 30

hashar added a comment to T280197: Gerritbot turns "+" into space, thus breaking most Gerrit URLs.

My message had Abandonning ... , the template thus has %%%Abandonning and I guess that %A is interpreted, as well as other %. I am not quite sure I understand what is going on, but I suspect it might be that its-phabricator does not URL-encode the payload when sending to Phabricator Conduit. I am wondering what would happen if one abandons a change with some URL-encoded string.
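
The suspicion about `%%%Abandonning` can be illustrated with a generic percent-decoder. The sketch below uses Python's `urllib.parse.unquote` as a stand-in; whether Conduit decodes the same way is an assumption, and `percent_decode` is a hypothetical helper name.

```python
from urllib.parse import unquote


def percent_decode(text: str) -> str:
    """Percent-decode text the way a generic URL decoder would (assumption)."""
    return unquote(text)


# The third "%" is followed by "Ab", which is a valid hex pair (0xAB),
# so the decoder consumes "%Ab" and leaves only two "%" characters.
mangled = percent_decode("%%%Abandonning")

# Escaping each "%" as "%25" before sending keeps the marker intact.
preserved = percent_decode("%25%25%25Abandonning")
```

With this decoder the literal block marker loses its third `%` and the first two letters of the message, which matches the guess that `%A...` is being interpreted.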

Fri, Apr 30, 2:41 PM · Release-Engineering-Team (Doing), Developer Productivity, Regression, GerritBot
hashar added a comment to T280197: Gerritbot turns "+" into space, thus breaking most Gerrit URLs.

Tried again and there is a server side error:

TRACE com.googlesource.gerrit.plugins.its.base.workflow.AddSoyComment
Rendered template /var/lib/gerrit2/review_site/etc/its/templates/PatchSetAbandoned.soy to:
Change 683803 **abandoned** by Hashar:
%%%[test/gerrit-ping@master] __version__%20%7C%20https%3A%2F%2Fexample.org%2F%2B%2Ffoo.html%%%
Reason:
%%%Abandonning%20after%20https%3A%2F%2Fgerrit.wikimedia.org%2Fr%2Fc%2Foperations%2Fpuppet%2F%2B%2F683810%20got%20deployed%%%
https://gerrit.wikimedia.org/r/683803
Fri, Apr 30, 2:22 PM · Release-Engineering-Team (Doing), Developer Productivity, Regression, GerritBot
hashar claimed T280197: Gerritbot turns "+" into space, thus breaking most Gerrit URLs.
Fri, Apr 30, 8:20 AM · Release-Engineering-Team (Doing), Developer Productivity, Regression, GerritBot
hashar updated subscribers of T280197: Gerritbot turns "+" into space, thus breaking most Gerrit URLs.

Fun thing, @Paladox raised that on the Google Closure Templates mailing list at https://groups.google.com/g/closure-templates-discuss/c/lXOqqmkO7fE :)

Fri, Apr 30, 8:20 AM · Release-Engineering-Team (Doing), Developer Productivity, Regression, GerritBot
hashar added a comment to T280197: Gerritbot turns "+" into space, thus breaking most Gerrit URLs.

I went on to raise the logging level of a bunch of loggers in Gerrit. Loggers can be found using:

gerrit logging ls | grep \\.its\\.
com.googlesource.gerrit.plugins.its.base.its.ItsConfig: INFO
com.googlesource.gerrit.plugins.its.base.util.CommitMessageFetcher: INFO
com.googlesource.gerrit.plugins.its.base.util.IssueExtractor: INFO
com.googlesource.gerrit.plugins.its.base.validation.ItsValidateComment: INFO
com.googlesource.gerrit.plugins.its.base.workflow.ActionController: INFO
com.googlesource.gerrit.plugins.its.base.workflow.ActionExecutor: INFO
com.googlesource.gerrit.plugins.its.base.workflow.AddSoyComment: INFO
com.googlesource.gerrit.plugins.its.base.workflow.ItsRulesProjectCacheImpl: INFO
com.googlesource.gerrit.plugins.its.base.workflow.ItsRulesProjectCacheRefresher: INFO
com.googlesource.gerrit.plugins.its.base.workflow.RuleBase: INFO
com.googlesource.gerrit.plugins.its.phabricator.PhabricatorItsFacade: INFO
com.googlesource.gerrit.plugins.its.phabricator.PhabricatorModule: INFO
com.googlesource.gerrit.plugins.its.phabricator.conduit.ConduitConnection: INFO
Fri, Apr 30, 8:11 AM · Release-Engineering-Team (Doing), Developer Productivity, Regression, GerritBot
hashar created T281552: test task for Gerrit its-phabricator plugin.
Fri, Apr 30, 7:45 AM · Patch-For-Review
hashar added a comment to T280197: Gerritbot turns "+" into space, thus breaking most Gerrit URLs.

The change for T93331 is to wrap the abandoning reason with a Phabricator literal block marker: %%% . The code at https://gerrit.wikimedia.org/r/c/operations/puppet/+/675479/2/modules/gerrit/templates/its/PatchSetAbandoned.soy.erb can be summarized as:

Reason:{\n}
- {$reason}
+ %%%{$reason}%%%{\n}
Fri, Apr 30, 7:39 AM · Release-Engineering-Team (Doing), Developer Productivity, Regression, GerritBot
hashar added a hashtag to GerritBot: #its-phabricator.
Fri, Apr 30, 7:31 AM

Thu, Apr 29

hashar added a comment to T263008: Gerrit out of heap.

I wrote a blog post summarizing this adventure: Blog Post: Tracking memory issue in a Java application

Thu, Apr 29, 12:18 PM · Security, Upstream, Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)), Release-Engineering-Team (Development services), Gerrit
hashar added a comment to T281347: tox-docker CI test doesn't pick up overrides for pylint.

I have nuked the cache, but it still picks up the "wrong" flake8:

Collecting flake8
  Downloading flake8-3.9.1-py2.py3-none-any.whl (73 kB)
Thu, Apr 29, 7:52 AM · ci-test-error, SRE
hashar updated subscribers of T281347: tox-docker CI test doesn't pick up overrides for pylint.

+ @Volans due to overall knowledge about python linters / tox etc.

Thu, Apr 29, 7:33 AM · ci-test-error, SRE

Wed, Apr 28

hashar edited projects for T281347: tox-docker CI test doesn't pick up overrides for pylint, added: ci-test-error; removed Jenkins.
Wed, Apr 28, 6:25 PM · ci-test-error, SRE
hashar awarded T278994: Automatically tag the release task in the commit message of the weekly branch cut a Love token.
Wed, Apr 28, 5:22 PM · Release-Engineering-Team (Yak Shaving 🐃🪒)
hashar added a comment to T234020: Switch mediawiki code coverage from xdebug to pcov.

I note that the MediaWiki core code coverage run went from a few hours down to just 13 minutes! https://integration.wikimedia.org/ci/job/mwcore-phpunit-coverage-master/ Well done @Daimona

Wed, Apr 28, 8:46 AM · Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Patch-For-Review, Continuous-Integration-Config, Test-Coverage
hashar merged T272908: MediaWiki core phpunit coverage job is taking 6+ hours into T234020: Switch mediawiki code coverage from xdebug to pcov.
Wed, Apr 28, 8:46 AM · Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Patch-For-Review, Continuous-Integration-Config, Test-Coverage
hashar merged task T272908: MediaWiki core phpunit coverage job is taking 6+ hours into T234020: Switch mediawiki code coverage from xdebug to pcov.
Wed, Apr 28, 8:46 AM · Test-Coverage, MediaWiki-Core-Tests
hashar closed T272908: MediaWiki core phpunit coverage job is taking 6+ hours as Resolved.

The job is https://integration.wikimedia.org/ci/job/mwcore-phpunit-coverage-master/ , it indeed used to take a few hours to complete and now completes in 13 minutes! The reason is that @Daimona switched from xdebug to pcov: T234020 :]

Wed, Apr 28, 8:45 AM · Test-Coverage, MediaWiki-Core-Tests
hashar closed T69295: Fatal error: Call to a member function getMessage() on a non-object in /mediawiki/extensions/Translate/tag/TranslatablePage.php on line 253, a subtask of T69216: Have unit tests of all wmf deployed extensions pass when installed together, in both PHP-Zend and HHVM (tracking), as Declined.
Wed, Apr 28, 8:32 AM · Tracking-Neverending, Continuous-Integration-Infrastructure
hashar closed T69295: Fatal error: Call to a member function getMessage() on a non-object in /mediawiki/extensions/Translate/tag/TranslatablePage.php on line 253 as Declined.

I have no idea whether that is still an issue, nor how to even reproduce it. I think the original intent was to run the PHPUnit tests with all Wikimedia extensions installed. Then again, that was like 6 years ago... ;)

Wed, Apr 28, 8:32 AM · MediaWiki-extensions-Translate
hashar closed T267150: CI job mw-tools-codesniffer-mwcore-testrun fails on clean run as Resolved.

The job passes now, I apparently just forgot to mark this resolved.

Wed, Apr 28, 8:11 AM · Release-Engineering-Team (Doing), Continuous-Integration-Config
hashar closed T281291: Disable Gerrit user Mdhollo as Resolved.

The theory is we can disable the account in wikitech/ldap. There is an account in wikitech https://wikitech.wikimedia.org/wiki/Special:ListUsers?username=Mdhollo :

Wed, Apr 28, 7:45 AM · SRE, Gerrit-Privilege-Requests, LDAP-Access-Requests, Release-Engineering-Team, Gerrit
hashar closed T249268: Reduce size of artifacts stored on the CI Jenkins master as Resolved.

And that should be good now! :]

Wed, Apr 28, 7:06 AM · Release-Engineering-Team (Doing), Patch-For-Review, Technical-Debt, Continuous-Integration-Infrastructure

Tue, Apr 27

hashar added a project to T275468: Apache on doc1001 does not see updated PHP files for hours/days after deployment: Regression.
Tue, Apr 27, 9:02 PM · Regression, Release-Engineering-Team-TODO, Continuous-Integration-Infrastructure, serviceops
hashar added a project to T28699: Exclusion certain methods from doxygen call graphs: Doxygen.

Revisiting a decade-old bug.

Tue, Apr 27, 5:34 PM · Doxygen, Documentation, Technical-Debt, MediaWiki-Documentation
hashar closed T240430: debian-glue jobs ignored error messages about libeatmydata.so in LD_PRELOAD as Resolved.

I have confirmed that eatmydata is now included in all cowbuilder images. When jenkins debian glue invokes the build, it does try to include eatmydata and shows it is already available:

Tue, Apr 27, 5:24 PM · Release-Engineering-Team (Doing), Technical-Debt, Patch-For-Review, SRE, Continuous-Integration-Infrastructure
hashar added a comment to T281122: Wikibase selenium tests timeout, seemingly due to "memory compaction" events on CI VMs.

@Addshore could it be that the MediaWiki API is slow to respond? We have some debug logs available, attached to the build as mw-debug-www.log.gz. Also, the jobs use the PHP built-in server, which processes requests serially, so that might cause issues. There is an Apache-based flavor of the job which I think can be triggered via the experimental pipeline, though that surfaces some race conditions in the tests :-/

Tue, Apr 27, 5:21 PM · Patch-For-Review, cloud-services-team (Kanban), Cloud-VPS, User-Addshore, wdwb-tech, Wikibase
hashar added a comment to T281122: Wikibase selenium tests timeout, seemingly due to "memory compaction" events on CI VMs.

TLDR: looks like there is not enough memory, at least on integration-agent-docker-1006

Log 2 and Log 3 fail at the same time on the same integration node (02:44:07 vs 02:44:15)

Grafana for the node running the job https://grafana-labs.wikimedia.org/d/000000590/instance-details?orgId=1&from=1619053200000&to=1619056800000&var-project=integration&var-job=node&var-node=integration-agent-docker-1006

From that dashboard under Memory Detail Meminfo we can see the memory committed exceeded the limit:

Maybe that in turn triggers the memory compaction system, which seems to be a Linux feature to "defrag" the memory.

I note that integration-agent-docker-1006 seems to only have 8G of RAM!!!!!!!!!

$ free -m
              total        used        free      shared  buff/cache   available
Mem:           7721        2310        3353         572        2057        4562
Swap:          7947         452        7495

I am pretty sure we had them with 32G since they can run up to 4 jobs concurrently. I can't reach out to Horizon right now to confirm the VM memory.
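
The "memory committed exceeded the limit" reading from the dashboard can also be checked directly on a host. Here is a minimal Python sketch, assuming the dashboard plots the `Committed_AS` and `CommitLimit` fields of `/proc/meminfo`. The sample values are hypothetical and not taken from integration-agent-docker-1006, and both helper names are made up for illustration.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo style text into {field: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[name.strip()] = int(fields[0])
    return values


def commit_overshoot_kb(meminfo: dict) -> int:
    """Return how far committed memory exceeds the commit limit, in kB."""
    return meminfo["Committed_AS"] - meminfo["CommitLimit"]


# Hypothetical sample, for illustration only.
SAMPLE = """\
MemTotal:        7906304 kB
CommitLimit:    12090880 kB
Committed_AS:   13500000 kB
"""

overshoot = commit_overshoot_kb(parse_meminfo(SAMPLE))
```

On a real host the text would come from reading `/proc/meminfo`; a positive result means more memory has been promised to processes than the limit allows for.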

Tue, Apr 27, 5:19 PM · Patch-For-Review, cloud-services-team (Kanban), Cloud-VPS, User-Addshore, wdwb-tech, Wikibase
hashar added a comment to T281122: Wikibase selenium tests timeout, seemingly due to "memory compaction" events on CI VMs.

TLDR: looks like there is not enough memory, at least on integration-agent-docker-1006

Tue, Apr 27, 3:40 PM · Patch-For-Review, cloud-services-team (Kanban), Cloud-VPS, User-Addshore, wdwb-tech, Wikibase
hashar closed T279033: Upgrade Jenkins to 2.277.x as Resolved.

Great. The only issue I have encountered was with the Token Macro plugin on the CI Jenkins, which I had somehow forgotten to upgrade.

Tue, Apr 27, 3:24 PM · Release-Engineering-Team (Doing), Jenkins, Continuous-Integration-Infrastructure
hashar edited projects for T240430: debian-glue jobs ignored error messages about libeatmydata.so in LD_PRELOAD, added: Release-Engineering-Team (Doing); removed Release-Engineering-Team (Radar).
Tue, Apr 27, 2:56 PM · Release-Engineering-Team (Doing), Technical-Debt, Patch-For-Review, SRE, Continuous-Integration-Infrastructure
hashar added a comment to T279033: Upgrade Jenkins to 2.277.x.

The CI Jenkins upgrade apparently went fine \o/

Tue, Apr 27, 1:37 PM · Release-Engineering-Team (Doing), Jenkins, Continuous-Integration-Infrastructure
hashar claimed T279033: Upgrade Jenkins to 2.277.x.
Tue, Apr 27, 12:29 PM · Release-Engineering-Team (Doing), Jenkins, Continuous-Integration-Infrastructure
hashar added a comment to T279033: Upgrade Jenkins to 2.277.x.

The note about the LDAP plugin is that we must have 1.26 installed which is the latest still compatible with the 2.263.3 Jenkins we are running and has:

Tue, Apr 27, 12:28 PM · Release-Engineering-Team (Doing), Jenkins, Continuous-Integration-Infrastructure
Jakob_WMDE awarded T279068: Wikibase data-client bridge selenium failure on buster but not stretch a Meh! token.
Tue, Apr 27, 8:32 AM · Patch-For-Review, MW-1.37-notes (1.37.0-wmf.1; 2021-04-13), User-Ladsgroup, Wikidata, Wikidata-Campsite, Wikibase

Mon, Apr 26

hashar closed T277127: Gerrit Apache out of workers as Resolved.

Perfect, thank you @Dzahn and @elukey ;)

Mon, Apr 26, 8:07 PM · Release-Engineering-Team (Doing), Gerrit
hashar added a comment to T280901: MinervaNeue legacy ruby browser tests need to be updated to work on buster/ruby2.5.

That is because the job's Docker image moved from Stretch to Buster and thus it upgrades Ruby from 2.3 to 2.5. In Ruby 2.4, Fixnum and Bignum were unified into Integer, hence why we get:

generator.c:861:25: error: ‘rb_cFixnum’ undeclared (first use in this function);
did you mean ‘mFixnum’?
     } else if (klass == rb_cFixnum) {
                         ^~~~~~~~~~
Mon, Apr 26, 8:03 PM · Patch-For-Review, Readers-Web-Backlog (Kanbanana-FY-2020-21), MinervaNeue, Continuous-Integration-Infrastructure
hashar updated the task description for T280901: MinervaNeue legacy ruby browser tests need to be updated to work on buster/ruby2.5.
Mon, Apr 26, 7:58 PM · Patch-For-Review, Readers-Web-Backlog (Kanbanana-FY-2020-21), MinervaNeue, Continuous-Integration-Infrastructure
hashar added a comment to T279068: Wikibase data-client bridge selenium failure on buster but not stretch.

Danke schön @Jakob_WMDE !

Mon, Apr 26, 7:55 PM · Patch-For-Review, MW-1.37-notes (1.37.0-wmf.1; 2021-04-13), User-Ladsgroup, Wikidata, Wikidata-Campsite, Wikibase
hashar added a comment to T279033: Upgrade Jenkins to 2.277.x.

Neat! Looks like that requires Winstone-Jetty to be configured to handle SSL/TLS connections, but on our setup we use plain HTTP (and envoy / the cache infra for encryption). So I am guessing we are not affected.

Mon, Apr 26, 1:04 PM · Release-Engineering-Team (Doing), Jenkins, Continuous-Integration-Infrastructure
hashar added a comment to T279068: Wikibase data-client bridge selenium failure on buster but not stretch.

@Ladsgroup thank you so much for fixing this test!

Mon, Apr 26, 11:59 AM · Patch-For-Review, MW-1.37-notes (1.37.0-wmf.1; 2021-04-13), User-Ladsgroup, Wikidata, Wikidata-Campsite, Wikibase
hashar closed T60094: ParserTests::testParserTest cannot be run in parallel due to /tmp/Foobar.svg, a subtask of T49063: Jenkins: Investigate using concurrent builds of the same job, as Resolved.
Mon, Apr 26, 9:02 AM · Continuous-Integration-Infrastructure
hashar closed T60094: ParserTests::testParserTest cannot be run in parallel due to /tmp/Foobar.svg as Resolved.

It is not an issue with Jenkins. The parser test suite creates a few files for testing purposes and deletes them on completion. The problem in this task is that the test suite always referred to /tmp/Foobar.svg, so when one runs tests in parallel there is an "opportunity" for the file to be deleted by one of the suites while the other has not completed yet and might then fail because the file got deleted.
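
A common way to avoid this kind of race is to give every run its own private directory instead of a fixed path. The sketch below shows the idea in Python, whereas the parser tests themselves are PHP. The `create_fixture` and `remove_fixture` helpers are hypothetical, and this is not necessarily how the task was resolved.

```python
import os
import tempfile


def create_fixture(name: str = "Foobar.svg") -> str:
    """Create the fixture inside a fresh private directory and return its path."""
    directory = tempfile.mkdtemp(prefix="parsertests-")
    path = os.path.join(directory, name)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("<svg xmlns='http://www.w3.org/2000/svg'/>")
    return path


def remove_fixture(path: str) -> None:
    """Delete the fixture and the private directory that holds it."""
    os.remove(path)
    os.rmdir(os.path.dirname(path))


# Two "parallel" runs get distinct paths, so cleanup in one
# cannot delete the file the other is still using.
first = create_fixture()
second = create_fixture()
remove_fixture(first)
second_survives = os.path.exists(second)
remove_fixture(second)
```
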

Mon, Apr 26, 9:02 AM · MediaWiki-Parser
hashar reopened T277127: Gerrit Apache out of workers as "Open".

I have looked at the Apache scoreboard and Gerrit thread pool and they seem fine now. Gerrit is at less than 40 threads out of a total pool of 60. Apache does have some spikes, but they are below the 150 workers limit. So that part is solved indeed.

Mon, Apr 26, 8:43 AM · Release-Engineering-Team (Doing), Gerrit
hashar added a comment to T252273: Archive the SmoothGallery extension.

SmoothGallery no longer shows up at https://gerrit.wikimedia.org/r/plugins/gitiles/translatewiki/+/refs/heads/master/groups/MediaWiki/mediawiki-extensions.txt , maybe that is the only thing that needs to be done to remove it from Translatewiki?

Mon, Apr 26, 7:40 AM · User-Kizule, MediaWiki-extensions-SmoothGallery, translatewiki.net, Wikimedia-GitHub, Diffusion-Repository-Administrators, Projects-Cleanup
hashar added a comment to T280185: Archive the DumpHTML extension.

It seems this task is now completed :]

Mon, Apr 26, 7:40 AM · MediaWiki-extensions-DumpHTML, User-Zabe, translatewiki.net, Wikimedia-GitHub, Diffusion-Repository-Administrators, Projects-Cleanup
hashar updated the task description for T280185: Archive the DumpHTML extension.
Mon, Apr 26, 7:39 AM · MediaWiki-extensions-DumpHTML, User-Zabe, translatewiki.net, Wikimedia-GitHub, Diffusion-Repository-Administrators, Projects-Cleanup
hashar updated the task description for T252273: Archive the SmoothGallery extension.
Mon, Apr 26, 7:39 AM · User-Kizule, MediaWiki-extensions-SmoothGallery, translatewiki.net, Wikimedia-GitHub, Diffusion-Repository-Administrators, Projects-Cleanup
hashar updated the task description for T280185: Archive the DumpHTML extension.
Mon, Apr 26, 7:38 AM · MediaWiki-extensions-DumpHTML, User-Zabe, translatewiki.net, Wikimedia-GitHub, Diffusion-Repository-Administrators, Projects-Cleanup
hashar updated the task description for T252273: Archive the SmoothGallery extension.
Mon, Apr 26, 7:38 AM · User-Kizule, MediaWiki-extensions-SmoothGallery, translatewiki.net, Wikimedia-GitHub, Diffusion-Repository-Administrators, Projects-Cleanup

Sat, Apr 17

Ladsgroup awarded T252434: Test MW code in buster as well as stretch a Love token.
Sat, Apr 17, 5:36 AM · Patch-For-Review, Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Continuous-Integration-Infrastructure, Quibble, Release-Engineering-Team (CI & Testing services)

Fri, Apr 16

hashar changed the start date for E1368: COVID lockdown / Homeschooling in France from Apr 4 2021, 11:00 PM to Apr 4 2021, 10:00 PM.
Fri, Apr 16, 3:22 PM · Release-Engineering-Team, events
hashar created E1368: COVID lockdown / Homeschooling in France.
Fri, Apr 16, 3:22 PM · Release-Engineering-Team, events
hashar closed T279817: TRAVIS jobs cannot be restarted anymore as Declined.

I guess that was a glitch on Travis's side. If it happens again, I guess the best is to reach out to them directly, though I have no idea whether they offer support for the Free plan.

Fri, Apr 16, 2:06 PM · Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Pywikibot

Wed, Apr 14

Krinkle awarded T228838: Consider enabling all MW log channels by default for WMF an Orange Medal token.
Wed, Apr 14, 8:06 PM · Release-Engineering-Team (Radar), observability, Platform Engineering (Icebox), Developer Productivity, MediaWiki-Debug-Logger
hashar closed T275946: Can't delete weird ref using git in Gerrit as Resolved.

We deleted the faulty refs/master directly on disk and that was not noticed by Gerrit. The fix is to request a full replication:

ssh -p 29418 gerrit.wikimedia.org replication start --now mediawiki/extensions/AutoCreateCategoryPages
Wed, Apr 14, 2:02 PM · Release-Engineering-Team (Development services), Upstream, Release-Engineering-Team-TODO, Gerrit
hashar closed T275946: Can't delete weird ref using git in Gerrit, a subtask of T255802: PHPCS dashboard shows that AutoCreateCategoryPages extension haven't phpcs but it have, as Resolved.
Wed, Apr 14, 2:02 PM · Tools
hashar closed T275946: Can't delete weird ref using git in Gerrit, a subtask of T275292: LibUp hasn't successfully run on AutoCreateCategoryPages for 7 weeks, as Resolved.
Wed, Apr 14, 2:02 PM · LibUp

Tue, Apr 13

hashar closed T277645: Clean GerritSite.css from GWT styling as Resolved.

Thank you @Paladox !

Tue, Apr 13, 7:13 PM · Release-Engineering-Team (Development services), Gerrit
hashar added a comment to T268225: Switch Gerrit from Java 8 to Java 11.

The Puppet bits are preparation work to let us "trivially" switch from Java 8 to Java 11.

Tue, Apr 13, 7:01 PM · Release-Engineering-Team (Seen), Gerrit (Gerrit 3.3)

Mon, Apr 12

hashar added a comment to T279817: TRAVIS jobs cannot be restarted anymore.

K-and-R looks like an organization and is unrelated. If I look at https://travis-ci.com/organizations/wikimedia/plan it states:

Mon, Apr 12, 10:46 AM · Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Pywikibot

Fri, Apr 9

hashar renamed T263293: Can't `git pull` mediawiki/core from Gerrit: "fatal: the remote end hung up unexpectedly" SSH_MSG_CHANNEL_WINDOW_ADJUST from Can't `git pull` mediawiki/core from Gerrit: "fatal: the remote end hung up unexpectedly" to Can't `git pull` mediawiki/core from Gerrit: "fatal: the remote end hung up unexpectedly" SSH_MSG_CHANNEL_WINDOW_ADJUST.
Fri, Apr 9, 9:26 AM · Release-Engineering-Team (Seen), Gerrit
hashar added projects to T263293: Can't `git pull` mediawiki/core from Gerrit: "fatal: the remote end hung up unexpectedly" SSH_MSG_CHANNEL_WINDOW_ADJUST: Release-Engineering-Team (Development services), Release-Engineering-Team-TODO.

Not much we can do ourselves but that has been discussed this week on the Gerrit mailing list ( https://groups.google.com/g/repo-discuss/c/UUvDYv2qWqY ).

Fri, Apr 9, 9:26 AM · Release-Engineering-Team (Seen), Gerrit
hashar updated the task description for T263293: Can't `git pull` mediawiki/core from Gerrit: "fatal: the remote end hung up unexpectedly" SSH_MSG_CHANNEL_WINDOW_ADJUST.
Fri, Apr 9, 9:21 AM · Release-Engineering-Team (Seen), Gerrit

Thu, Apr 8

hashar updated the task description for T263293: Can't `git pull` mediawiki/core from Gerrit: "fatal: the remote end hung up unexpectedly" SSH_MSG_CHANNEL_WINDOW_ADJUST.
Thu, Apr 8, 7:44 PM · Release-Engineering-Team (Seen), Gerrit