hashar (Antoine "hashar" Musso (WMF))
WMF Software developer - Release Engineering

User Details

User Since
Oct 3 2014, 2:31 PM (249 w, 6 d)
Availability
Available
IRC Nick
hashar
LDAP User
Hashar
MediaWiki User
Unknown

https://www.mediawiki.org/wiki/User:Hashar

Based in Nantes, France CET/CEST (UTC+1, UTC+2)

Main IRC channel is #wikimedia-releng

Recent Activity

Today

hashar added a comment to T220739: 1.34.0-wmf.14 deployment blockers.

I have rolled back due to T228436.

Thu, Jul 18, 3:54 PM · Patch-For-Review, Release-Engineering-Team-TODO (201907), Release-Engineering-Team (Deployment services), Release, Train Deployments
hashar added a comment to T228436: web request timeout after 200 seconds due to Wikimedia\Rdbms\LBFactory->__destruct() > Wikimedia\Rdbms\LBFactory->commitMasterChanges().

Indeed the rollback got rid of the spam of errors. What remains to be investigated is which kind of db transactions are taking a while in Wikimedia\Rdbms\LBFactory->commitMasterChanges() :-\ There might be information in the logstash logs if MediaWiki logs anything for databases.

Thu, Jul 18, 3:53 PM · Performance-Team, MediaWiki-Database, Wikimedia-production-error
hashar added a comment to T228436: web request timeout after 200 seconds due to Wikimedia\Rdbms\LBFactory->__destruct() > Wikimedia\Rdbms\LBFactory->commitMasterChanges().

That happens mostly on enwiki: over one hour, 4k occurrences out of 4400 total matches.

Thu, Jul 18, 3:29 PM · Performance-Team, MediaWiki-Database, Wikimedia-production-error
hashar updated subscribers of T228436: web request timeout after 200 seconds due to Wikimedia\Rdbms\LBFactory->__destruct() > Wikimedia\Rdbms\LBFactory->commitMasterChanges().
Thu, Jul 18, 3:24 PM · Performance-Team, MediaWiki-Database, Wikimedia-production-error
hashar added a parent task for T228436: web request timeout after 200 seconds due to Wikimedia\Rdbms\LBFactory->__destruct() > Wikimedia\Rdbms\LBFactory->commitMasterChanges(): T220739: 1.34.0-wmf.14 deployment blockers.
Thu, Jul 18, 3:24 PM · Performance-Team, MediaWiki-Database, Wikimedia-production-error
hashar added a subtask for T220739: 1.34.0-wmf.14 deployment blockers: T228436: web request timeout after 200 seconds due to Wikimedia\Rdbms\LBFactory->__destruct() > Wikimedia\Rdbms\LBFactory->commitMasterChanges().
Thu, Jul 18, 3:24 PM · Patch-For-Review, Release-Engineering-Team-TODO (201907), Release-Engineering-Team (Deployment services), Release, Train Deployments
hashar triaged T228436: web request timeout after 200 seconds due to Wikimedia\Rdbms\LBFactory->__destruct() > Wikimedia\Rdbms\LBFactory->commitMasterChanges() as Unbreak Now! priority.
Thu, Jul 18, 3:24 PM · Performance-Team, MediaWiki-Database, Wikimedia-production-error
hashar created T228436: web request timeout after 200 seconds due to Wikimedia\Rdbms\LBFactory->__destruct() > Wikimedia\Rdbms\LBFactory->commitMasterChanges().
Thu, Jul 18, 3:23 PM · Performance-Team, MediaWiki-Database, Wikimedia-production-error
hashar added a comment to T228425: User.php: Cannot create a user with no name, no ID, and no actor ID.

@SBisson definitely :-]

Thu, Jul 18, 2:39 PM · Core Platform Team Workboards (Clinic Duty Team), Patch-For-Review, MediaWiki-Recent-changes, MediaWiki-API, Wikimedia-production-error
hashar added a comment to T228425: User.php: Cannot create a user with no name, no ID, and no actor ID.

Same on enwiki https://en.wikipedia.org/w/api.php?format=json&action=query&list=recentchanges&rctoken=patrol

Thu, Jul 18, 2:32 PM · Core Platform Team Workboards (Clinic Duty Team), Patch-For-Review, MediaWiki-Recent-changes, MediaWiki-API, Wikimedia-production-error
hashar lowered the priority of T228425: User.php: Cannot create a user with no name, no ID, and no actor ID from Unbreak Now! to Normal.

Can be reproduced using logged in or not with:

Thu, Jul 18, 2:30 PM · Core Platform Team Workboards (Clinic Duty Team), Patch-For-Review, MediaWiki-Recent-changes, MediaWiki-API, Wikimedia-production-error
hashar updated the task description for T228425: User.php: Cannot create a user with no name, no ID, and no actor ID.
Thu, Jul 18, 2:27 PM · Core Platform Team Workboards (Clinic Duty Team), Patch-For-Review, MediaWiki-Recent-changes, MediaWiki-API, Wikimedia-production-error
hashar reopened T226236: Upload docker-ce 18.06.3 upstream package for Stretch, a subtask of T224591: Migrate contint* hosts to Stretch/Buster, as Open.
Thu, Jul 18, 1:16 PM · Continuous-Integration-Infrastructure (phase-out-jessie), Operations
hashar reopened T226236: Upload docker-ce 18.06.3 upstream package for Stretch, a subtask of T226233: Rebuild integration-slave-docker-* instances to use less RAM, new name and Stretch, as Open.
Thu, Jul 18, 1:16 PM · Release-Engineering-Team-TODO (201907), Release-Engineering-Team (CI & Testing services), Continuous-Integration-Infrastructure (phase-out-jessie)
hashar reopened T226236: Upload docker-ce 18.06.3 upstream package for Stretch as "Open".

Thanks, I can confirm the component is around and it addresses the concern of mixing up upgrades with Toolforge. However that imports 18.09.7 but we need the previous version 18.06.x for now :-\

Thu, Jul 18, 1:16 PM · serviceops, Operations, Continuous-Integration-Infrastructure (phase-out-jessie)
hashar closed T227067: ReleaseNotesTest:testReleaseNotesFilesExistAndAreNotMalformed takes ~ 4 seconds as Resolved.

Too many changes to backport for REL1_32 or REL1_31 so I just skip those. It is not a huge win anyway :-]

Thu, Jul 18, 1:08 PM · Release-Engineering-Team-TODO (201907), MW-1.33-notes, MW-1.34-notes (1.34.0-wmf.13; 2019-07-09), MediaWiki-Core-Testing
hashar closed T227159: Enable sandbox branches in gerrit as Resolved.

Done on All-Projects.git via https://gerrit.wikimedia.org/r/#/c/All-Projects/+/524206/

Thu, Jul 18, 1:07 PM · Release-Engineering-Team (Development services), Release-Engineering-Team-TODO (201907), Gerrit
hashar triaged T228381: zuul systemd service: [/lib/systemd/system/zuul.service:15] Failed to parse usec_t value, ignoring: infinity as Normal priority.
Thu, Jul 18, 12:49 PM · Patch-For-Review, Zuul, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201907), Continuous-Integration-Infrastructure
hashar moved T228381: zuul systemd service: [/lib/systemd/system/zuul.service:15] Failed to parse usec_t value, ignoring: infinity from INBOX to Blocked externally on the Release-Engineering-Team-TODO (201907) board.
Thu, Jul 18, 12:49 PM · Patch-For-Review, Zuul, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201907), Continuous-Integration-Infrastructure
hashar added a comment to T228250: PHP Notice: Undefined property: stdClass::$module in OATHAuth/src/OATHUserRepository.php on line 193.

Hurrah :]

Thu, Jul 18, 9:08 AM · MediaWiki-extensions-OATHAuth
hashar closed T227605: contint1001 spurious disk space alarms as Resolved.

Yes that looks good now. Thank you!

Thu, Jul 18, 9:00 AM · Release-Engineering-Team-TODO (201907), Release-Engineering-Team (CI & Testing services), Continuous-Integration-Infrastructure
hashar added a comment to T228381: zuul systemd service: [/lib/systemd/system/zuul.service:15] Failed to parse usec_t value, ignoring: infinity.

That is with systemd 215-17+deb8u13 from Jessie. systemd.service (5) states:

Thu, Jul 18, 8:54 AM · Patch-For-Review, Zuul, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201907), Continuous-Integration-Infrastructure
hashar claimed T228381: zuul systemd service: [/lib/systemd/system/zuul.service:15] Failed to parse usec_t value, ignoring: infinity.
Thu, Jul 18, 8:44 AM · Patch-For-Review, Zuul, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201907), Continuous-Integration-Infrastructure
hashar created T228381: zuul systemd service: [/lib/systemd/system/zuul.service:15] Failed to parse usec_t value, ignoring: infinity.
Thu, Jul 18, 8:43 AM · Patch-For-Review, Zuul, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201907), Continuous-Integration-Infrastructure
hashar added a comment to T228155: Add phan to PageForms.

Given PageForms is supposedly usable without SMW, would it make sense to run TWO phan analyses? One with SMW and another one without it?

Thu, Jul 18, 7:58 AM · Patch-For-Review, MediaWiki-extensions-Page_Forms, Continuous-Integration-Config
hashar added projects to T228376: Switch translatewiki repo from composer-test-hhvm to composer-test: Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201907).
Thu, Jul 18, 7:55 AM · Release-Engineering-Team-TODO (201907), Release-Engineering-Team (CI & Testing services), Continuous-Integration-Config
hashar added a comment to T228376: Switch translatewiki repo from composer-test-hhvm to composer-test.

integration/config.git has:

zuul/layout.yaml
- name: translatewiki
  test:
    - translatewiki-rake-docker
    - translatewiki-composer-hhvm-docker
  gate-and-submit:
    - translatewiki-rake-docker
    - translatewiki-composer-hhvm-docker
Thu, Jul 18, 7:54 AM · Release-Engineering-Team-TODO (201907), Release-Engineering-Team (CI & Testing services), Continuous-Integration-Config
hashar added a comment to T228288: debmonitor send status update before the package actually got upgraded.

@Volans sounds good. I guess my concern was to potentially have a wrong state, but the daily crontab would indeed align the debmonitor database with the reality. So that addresses my concern :-]

Thu, Jul 18, 7:48 AM · SRE-tools

Yesterday

hashar added a comment to T228158: Increase TTL of failed builds.

The XML configuration for a job looks like:

<project>
  <properties>
    <jenkins.model.BuildDiscarderProperty>
      <strategy class="hudson.tasks.LogRotator">
        <daysToKeep>30</daysToKeep>
        <numToKeep>-1</numToKeep>
        <artifactDaysToKeep>-1</artifactDaysToKeep>
        <artifactNumToKeep>-1</artifactNumToKeep>
      </strategy>
    </jenkins.model.BuildDiscarderProperty>
...
Wed, Jul 17, 9:35 PM · Jenkins, Continuous-Integration-Infrastructure
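
The retention settings quoted in the comment above can be read programmatically. The following is a minimal sketch, not the tooling actually used on the CI hosts: the function name `retention_settings` is hypothetical, and the closing `</properties>` and `</project>` tags are added here only so that the truncated fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# The elements quoted in the comment above. The two closing tags at the end
# are an assumption added so the truncated fragment is well-formed.
JOB_CONFIG = """\
<project>
  <properties>
    <jenkins.model.BuildDiscarderProperty>
      <strategy class="hudson.tasks.LogRotator">
        <daysToKeep>30</daysToKeep>
        <numToKeep>-1</numToKeep>
        <artifactDaysToKeep>-1</artifactDaysToKeep>
        <artifactNumToKeep>-1</artifactNumToKeep>
      </strategy>
    </jenkins.model.BuildDiscarderProperty>
  </properties>
</project>
"""


def retention_settings(config_xml):
    """Return the LogRotator settings of a job config as a dict of ints.

    Hypothetical helper: returns an empty dict when no LogRotator
    strategy element is present.
    """
    root = ET.fromstring(config_xml)
    for strategy in root.iter("strategy"):
        if strategy.get("class") == "hudson.tasks.LogRotator":
            # Each child element holds one integer setting.
            return {child.tag: int(child.text) for child in strategy}
    return {}


settings = retention_settings(JOB_CONFIG)
print(settings["daysToKeep"])
```

With the values quoted above, only `daysToKeep` carries a positive limit; the other three settings are `-1`, which in this config appears to mean that no limit is set.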
hashar added a comment to T227818: WDQS GUI deploy build fails.

Hurrah! Thank you for having reminded us about this task :-]

Wed, Jul 17, 4:43 PM · Continuous-Integration-Infrastructure, Jenkins, Release-Engineering-Team-TODO (201907), Release-Engineering-Team (CI & Testing services), Wikidata Query UI, Wikimedia-production-error (Shared Build Failure), Wikidata
hashar added a comment to T184086: Add prometheus exporter to Gerrit.

I have quickly talked with @Paladox about it. He has tried the metrics-reporter-prometheus plugin and it does expose all metrics on another endpoint. So we should add that plugin to our deployment, configure it, add a prometheus server, and potentially we would have a bunch of new metrics to build a dashboard with.

Wed, Jul 17, 4:31 PM · Release-Engineering-Team (Development services), Release-Engineering-Team-TODO, Patch-For-Review, Gerrit, Operations
hashar closed T227818: WDQS GUI deploy build fails as Resolved.

The job has produced https://gerrit.wikimedia.org/r/#/c/wikidata/query/gui-deploy/+/523962

Wed, Jul 17, 4:19 PM · Continuous-Integration-Infrastructure, Jenkins, Release-Engineering-Team-TODO (201907), Release-Engineering-Team (CI & Testing services), Wikidata Query UI, Wikimedia-production-error (Shared Build Failure), Wikidata
hashar added a comment to T227818: WDQS GUI deploy build fails.

The job happens after a change has been merged (Zuul pipeline: postmerge).

Wed, Jul 17, 4:17 PM · Continuous-Integration-Infrastructure, Jenkins, Release-Engineering-Team-TODO (201907), Release-Engineering-Team (CI & Testing services), Wikidata Query UI, Wikimedia-production-error (Shared Build Failure), Wikidata
hashar claimed T225735: Cleanup CI puppet manifests.
Wed, Jul 17, 3:43 PM · Release-Engineering-Team-TODO (201907), Patch-For-Review, Release-Engineering-Team (CI & Testing services), Technical-Debt, Continuous-Integration-Infrastructure
hashar claimed T226233: Rebuild integration-slave-docker-* instances to use less RAM, new name and Stretch.
Wed, Jul 17, 3:39 PM · Release-Engineering-Team-TODO (201907), Release-Engineering-Team (CI & Testing services), Continuous-Integration-Infrastructure (phase-out-jessie)
hashar moved T189560: mediawiki/vendor REL1_* no longer ship dependencies for wmf extensions that are not in the mediawiki tarball from Doing to Ready on the Release-Engineering-Team-TODO (201907) board.
Wed, Jul 17, 3:38 PM · Release-Engineering-Team-TODO (201907), Patch-For-Review, Wikimedia-production-error (Shared Build Failure), Continuous-Integration-Config, CX-deployments, AbuseFilter
hashar claimed T227818: WDQS GUI deploy build fails.

That job got copied from the one that builds wikimedia/portals assets. It suffered from the exact same issue: T227448

Wed, Jul 17, 3:35 PM · Continuous-Integration-Infrastructure, Jenkins, Release-Engineering-Team-TODO (201907), Release-Engineering-Team (CI & Testing services), Wikidata Query UI, Wikimedia-production-error (Shared Build Failure), Wikidata
hashar created T228288: debmonitor send status update before the package actually got upgraded.
Wed, Jul 17, 3:20 PM · SRE-tools
hashar committed rECIR4e4991e0ad0d: Do not serialize ResultsType instance (authored by dcausse).
Do not serialize ResultsType instance
Wed, Jul 17, 1:50 PM
hashar added a comment to T228276: PHP Warning: Attempted to serialize unserializable builtin class Closure$CirrusSearch\Profile\CompletionSearchProfileRepository::__construct;2912.

Another related one Serialization of 'Closure' is not allowed. reqId XS8cxQpAIDAAAF2FuqQAAABB

Wed, Jul 17, 1:17 PM · MW-1.34-notes (1.34.0-wmf.14; 2019-07-16), Discovery-Search, CirrusSearch, Wikimedia-production-error
hashar added a comment to T228250: PHP Notice: Undefined property: stdClass::$module in OATHAuth/src/OATHUserRepository.php on line 193.

Seems WMF production database had some schema changes recently for the oauth_users database table:

Wed, Jul 17, 12:53 PM · MediaWiki-extensions-OATHAuth
hashar added a comment to T220739: 1.34.0-wmf.14 deployment blockers.
Wed, Jul 17, 12:50 PM · Patch-For-Review, Release-Engineering-Team-TODO (201907), Release-Engineering-Team (Deployment services), Release, Train Deployments
hashar added a comment to T184086: Add prometheus exporter to Gerrit.

There is now a Javamelody prometheus exporter at https://gerrit.wikimedia.org/r/monitoring?format=prometheus . It reports JVM related metrics which are rendered on Grafana at https://grafana.wikimedia.org/d/Bw2mQ3iWz/gerrit-javamelody . It solely reports Java internal metrics.

Wed, Jul 17, 12:45 PM · Release-Engineering-Team (Development services), Release-Engineering-Team-TODO, Patch-For-Review, Gerrit, Operations
hashar triaged T228084: Support posting screenshots in Gerrit as Normal priority.
Wed, Jul 17, 7:13 AM · Upstream, Gerrit
hashar merged task T227909: Document how to add a new development dependency for an extension in Quibble into T220723: Install extension require-dev dependencies in wmf-quibble-vendor-mysql-hhvm-docker.
Wed, Jul 17, 7:09 AM · Documentation, Release-Engineering-Team-TODO (201907), Release-Engineering-Team (CI & Testing services), Quibble
hashar merged T227909: Document how to add a new development dependency for an extension in Quibble into T220723: Install extension require-dev dependencies in wmf-quibble-vendor-mysql-hhvm-docker.
Wed, Jul 17, 7:09 AM · Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Librarization, Quibble
hashar added a comment to T227909: Document how to add a new development dependency for an extension in Quibble.

Erik Bernhardson has hit the same issue to add symfony/yaml for the CirrusSearch extension. T220723

Wed, Jul 17, 7:09 AM · Documentation, Release-Engineering-Team-TODO (201907), Release-Engineering-Team (CI & Testing services), Quibble

Tue, Jul 16

hashar triaged T228171: InvalidArgumentException: Invalid sort: last_edit_asc=1 as High priority.

Train blockers are usually Unbreak Now!, but I am not sure how bad this one is. Over 7 days there have been just 6 occurrences and they seem to come from end users altering the query manually. Hardly qualifies as a blocker to me?

Tue, Jul 16, 4:50 PM · Discovery-Search (Current work), CirrusSearch, MediaWiki-Search, Wikimedia-production-error
hashar added a project to T228158: Increase TTL of failed builds: Jenkins.

The retention policy for a job shows up as:

Tue, Jul 16, 3:25 PM · Jenkins, Continuous-Integration-Infrastructure
hashar updated the task description for T228171: InvalidArgumentException: Invalid sort: last_edit_asc=1.
Tue, Jul 16, 2:58 PM · Discovery-Search (Current work), CirrusSearch, MediaWiki-Search, Wikimedia-production-error
hashar added a project to T228171: InvalidArgumentException: Invalid sort: last_edit_asc=1: CirrusSearch.
Tue, Jul 16, 2:55 PM · Discovery-Search (Current work), CirrusSearch, MediaWiki-Search, Wikimedia-production-error
hashar awarded T99740: Use static php array files for l10n cache instead of CDB a 100 token.
Tue, Jul 16, 1:41 PM · Performance-Team (Radar), Deployments, MediaWiki-Internationalization
hashar updated subscribers of T228056: Puppet times out on newly created instance in the puppet-diffs project.

So I guess let's just unset base::resolving::labs_additional_domains ?

Tue, Jul 16, 11:45 AM · Cloud-VPS, cloud-services-team
hashar added a comment to T228056: Puppet times out on newly created instance in the puppet-diffs project.

So I guess let's just unset base::resolving::labs_additional_domains ?

Tue, Jul 16, 10:26 AM · Cloud-VPS, cloud-services-team
hashar added a comment to T228056: Puppet times out on newly created instance in the puppet-diffs project.

Looking at compiler1004, it seems the metadata retrieval done by puppet has a 1 minute timeout of some sort:

Tue, Jul 16, 10:05 AM · Cloud-VPS, cloud-services-team
hashar added a comment to T228056: Puppet times out on newly created instance in the puppet-diffs project.

Ah indeed on compiler1002.puppet-diffs.eqiad.wmflabs:

Tue, Jul 16, 9:21 AM · Cloud-VPS, cloud-services-team
hashar added a comment to T221969: Puppet catalog compiler - increasing max concurrent jobs.

https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler-test/ now has NUM_THREADS=4.

Tue, Jul 16, 9:12 AM · Release-Engineering-Team-TODO (201907), puppet-compiler, Continuous-Integration-Infrastructure
hashar merged Restricted Task into T226240: Create mirror of Gerrit repositories for consumption by various tools.
Tue, Jul 16, 9:08 AM · Release-Engineering-Team (Development services), Release-Engineering-Team-TODO, Gerrit

Mon, Jul 15

hashar added a comment to T228047: puppet compiler fails on releases1001.eqiad.wmnet due to lack of Service[bacula-director].

*facepalm* I was debugging using the production branch catalog instead of the change catalog. Thank you !!!!! :-]

Mon, Jul 15, 4:36 PM · Operations, puppet-compiler, Continuous-Integration-Infrastructure
hashar added a comment to T225713: CPU scaling governor audit.

Have you planned the cloudvirt yet? I guess that is a bit more challenging since instances would have to be moved ahead of time, but I am genuinely interested in seeing whether that improves the bad CPU experience I have noticed.

Mon, Jul 15, 4:27 PM · User-fgiunchedi, Operations
hashar added a comment to T227992: Create #ci-test-error tag for tracking Gerrit repos failing tests.

I would guess that the new tag would be kind of a superset of ci-test-error, wouldn't it? Regardless yes please be bold :-]

Mon, Jul 15, 4:11 PM · Project-Admins
hashar added a comment to T221969: Puppet catalog compiler - increasing max concurrent jobs.

Puppet fails due to /etc/puppetdb being a directory when it tries to make it a symlink. Eventually I found we use the default puppetdb:

# apt-cache policy puppetdb
puppetdb:
  Installed: 2.3.8-1~wmf1+stretch
  Candidate: 4.4.0-1~wmf2
  Version table:
     4.4.0-1~wmf2 1001
       1001 http://apt.wikimedia.org/wikimedia stretch-wikimedia/component/puppetdb4 amd64 Packages
 *** 2.3.8-1~wmf1+stretch 1001
       1001 http://apt.wikimedia.org/wikimedia stretch-wikimedia/main amd64 Packages
        100 /var/lib/dpkg/status
Mon, Jul 15, 3:20 PM · Release-Engineering-Team-TODO (201907), puppet-compiler, Continuous-Integration-Infrastructure
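
The `apt-cache policy` output quoted above shows the installed version lagging behind the candidate. As an illustration only, the two versions can be extracted from such output with a few lines of Python. The function name `parse_policy` is hypothetical and the sketch only handles the `Installed:` and `Candidate:` lines, nothing else from the version table.

```python
# Output copied from the comment above.
POLICY_OUTPUT = """\
# apt-cache policy puppetdb
puppetdb:
  Installed: 2.3.8-1~wmf1+stretch
  Candidate: 4.4.0-1~wmf2
  Version table:
     4.4.0-1~wmf2 1001
       1001 http://apt.wikimedia.org/wikimedia stretch-wikimedia/component/puppetdb4 amd64 Packages
 *** 2.3.8-1~wmf1+stretch 1001
       1001 http://apt.wikimedia.org/wikimedia stretch-wikimedia/main amd64 Packages
        100 /var/lib/dpkg/status
"""


def parse_policy(output):
    """Return {'installed': ..., 'candidate': ...} from apt-cache policy text.

    Hypothetical helper: keys are missing when the matching line is absent.
    """
    versions = {}
    for line in output.splitlines():
        stripped = line.strip()
        for key in ("Installed", "Candidate"):
            prefix = key + ":"
            if stripped.startswith(prefix):
                versions[key.lower()] = stripped[len(prefix):].strip()
    return versions


versions = parse_policy(POLICY_OUTPUT)
# True when the installed package is not the version apt would pick.
differs = versions["installed"] != versions["candidate"]
print(versions, differs)
```

Here the installed 2.3.8 package comes from the `main` component while the 4.4.0 candidate comes from `component/puppetdb4`, which matches the observation that the default puppetdb is the one in use.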
hashar removed a parent task for T228056: Puppet times out on newly created instance in the puppet-diffs project: T221969: Puppet catalog compiler - increasing max concurrent jobs.
Mon, Jul 15, 2:57 PM · Cloud-VPS, cloud-services-team
hashar removed a subtask for T221969: Puppet catalog compiler - increasing max concurrent jobs: T228056: Puppet times out on newly created instance in the puppet-diffs project.
Mon, Jul 15, 2:57 PM · Release-Engineering-Team-TODO (201907), puppet-compiler, Continuous-Integration-Infrastructure
hashar added a comment to T221969: Puppet catalog compiler - increasing max concurrent jobs.

In hiera config:

- profile::puppetdb::master: compiler1002.puppet-diffs.eqiad.wmflabs
+ profile::puppetdb::master: compiler1003.puppet-diffs.eqiad.wmflabs
Mon, Jul 15, 2:57 PM · Release-Engineering-Team-TODO (201907), puppet-compiler, Continuous-Integration-Infrastructure
hashar added a comment to T221969: Puppet catalog compiler - increasing max concurrent jobs.

On compiler1003 I have applied role::puppet_compiler and the hiera config. PostgreSQL fails though:

Notice: /Stage[main]/Postgresql::Slave/Exec[pg_basebackup-compiler1002.puppet-diffs.eqiad.wmflabs]/returns: pg_basebackup: could not connect to server: FATAL:  no pg_hba.conf entry for replication connection from host "172.16.2.57", user "replication", SSL on
Notice: /Stage[main]/Postgresql::Slave/Exec[pg_basebackup-compiler1002.puppet-diffs.eqiad.wmflabs]/returns: FATAL:  no pg_hba.conf entry for replication connection from host "172.16.2.57", user "replication", SSL off
Error: /usr/bin/pg_basebackup -X stream -D /srv/postgres/9.6/main -h compiler1002.puppet-diffs.eqiad.wmflabs -U replication -w returned 1 instead of one of [0]
Error: /Stage[main]/Postgresql::Slave/Exec[pg_basebackup-compiler1002.puppet-diffs.eqiad.wmflabs]/returns: change from notrun to 0 failed: /usr/bin/pg_basebackup -X stream -D /srv/postgres/9.6/main -h compiler1002.puppet-diffs.eqiad.wmflabs -U replication -w returned 1 instead of one of [0]
Notice: /Stage[main]/Postgresql::Slave/File[/srv/postgres/9.6/main/recovery.conf]: Dependency Exec[pg_basebackup-compiler1002.puppet-diffs.eqiad.wmflabs] has failures: true
Warning: /Stage[main]/Postgresql::Slave/File[/srv/postgres/9.6/main/recovery.conf]: Skipping because of failed dependencies
Notice: /Stage[main]/Puppetdb::App/File[/etc/puppetdb]: Not removing directory; use 'force' to override
Notice: /Stage[main]/Puppetdb::App/File[/etc/puppetdb]: Not removing directory; use 'force' to override
Error: Could not remove existing file
Error: /Stage[main]/Puppetdb::App/File[/etc/puppetdb]/ensure: change from directory to link failed: Could not remove existing file
...
Mon, Jul 15, 2:49 PM · Release-Engineering-Team-TODO (201907), puppet-compiler, Continuous-Integration-Infrastructure
hashar created P8747 ZUUL_BRANCH=REL1_31 quibble --git-cache /home/hashar/projects --db sqlite --run phpunit-unit.
Mon, Jul 15, 2:39 PM
hashar added a comment to T228056: Puppet times out on newly created instance in the puppet-diffs project.

Eventually the instance console has shown the login prompt after 2300 seconds (38 minutes). Running puppet again:

Notice: Applied catalog in 7.76 seconds
Mon, Jul 15, 2:33 PM · Cloud-VPS, cloud-services-team
hashar changed the status of T221969: Puppet catalog compiler - increasing max concurrent jobs from Open to Stalled.

I tried provisioning a new instance but puppet takes ages / fails retrieving files. I have filed T228056 for that.

Mon, Jul 15, 1:46 PM · Release-Engineering-Team-TODO (201907), puppet-compiler, Continuous-Integration-Infrastructure
hashar added a parent task for T228056: Puppet times out on newly created instance in the puppet-diffs project: T221969: Puppet catalog compiler - increasing max concurrent jobs.
Mon, Jul 15, 1:45 PM · Cloud-VPS, cloud-services-team
hashar added a subtask for T221969: Puppet catalog compiler - increasing max concurrent jobs: T228056: Puppet times out on newly created instance in the puppet-diffs project.
Mon, Jul 15, 1:45 PM · Release-Engineering-Team-TODO (201907), puppet-compiler, Continuous-Integration-Infrastructure
hashar triaged T228056: Puppet times out on newly created instance in the puppet-diffs project as High priority.
Mon, Jul 15, 1:44 PM · Cloud-VPS, cloud-services-team
hashar created T228056: Puppet times out on newly created instance in the puppet-diffs project.
Mon, Jul 15, 1:43 PM · Cloud-VPS, cloud-services-team
hashar renamed T228047: puppet compiler fails on releases1001.eqiad.wmnet due to lack of Service[bacula-director] from puppet compiler fails on releases1001.eqiad.wmnet to puppet compiler fails on releases1001.eqiad.wmnet due to lack of Service[bacula-director].
Mon, Jul 15, 1:14 PM · Operations, puppet-compiler, Continuous-Integration-Infrastructure
hashar added a comment to T228047: puppet compiler fails on releases1001.eqiad.wmnet due to lack of Service[bacula-director].

On releases1001.eqiad.wmnet in /var/lib/puppet/state/state.yaml there is a Service[bacula-fd] but no Service[bacula]. Then puppet seems to work just fine on the host.

Mon, Jul 15, 1:14 PM · Operations, puppet-compiler, Continuous-Integration-Infrastructure
hashar created T228047: puppet compiler fails on releases1001.eqiad.wmnet due to lack of Service[bacula-director].
Mon, Jul 15, 1:04 PM · Operations, puppet-compiler, Continuous-Integration-Infrastructure
hashar closed T196347: Quibble may need to rebuild localization cache before running tests as Resolved.

I have made Quibble populate the localization cache, so that is good enough for now.

Mon, Jul 15, 8:55 AM · Release-Engineering-Team-TODO (201907), Continuous-Integration-Config, Patch-For-Review, Quibble
hashar triaged T227859: Debian package for operations/software/service-checker FTBS due to missing tag upstream/0.1.5 as Normal priority.
Mon, Jul 15, 8:52 AM · Continuous-Integration-Config, Release-Engineering-Team-TODO (201907), Services
hashar added a comment to T227859: Debian package for operations/software/service-checker FTBS due to missing tag upstream/0.1.5.

Seems we would want to tag 0068b08e120d43b09a41115aa08b1043deb58f4b

commit 0068b08e120d43b09a41115aa08b1043deb58f4b
Author: Marko Obrovac <mobrovac@wikimedia.org>
Date:   Tue Apr 30 17:36:30 2019 -0700
Mon, Jul 15, 8:52 AM · Continuous-Integration-Config, Release-Engineering-Team-TODO (201907), Services

Fri, Jul 12

hashar moved T227859: Debian package for operations/software/service-checker FTBS due to missing tag upstream/0.1.5 from Backlog to Repo setup on the Continuous-Integration-Config board.
Fri, Jul 12, 2:06 PM · Continuous-Integration-Config, Release-Engineering-Team-TODO (201907), Services
hashar added projects to T227859: Debian package for operations/software/service-checker FTBS due to missing tag upstream/0.1.5: Release-Engineering-Team-TODO (201907), Continuous-Integration-Config.
Fri, Jul 12, 2:06 PM · Continuous-Integration-Config, Release-Engineering-Team-TODO (201907), Services
hashar updated the task description for T227859: Debian package for operations/software/service-checker FTBS due to missing tag upstream/0.1.5.
Fri, Jul 12, 2:06 PM · Continuous-Integration-Config, Release-Engineering-Team-TODO (201907), Services
hashar added a comment to T227159: Enable sandbox branches in gerrit.

@Hexmode on test/gerrit-ping.git can you try creating a reference in a sandbox and then deleting it? https://gerrit.wikimedia.org/r/522462 should enable that. Then I guess we can enable the feature by default via All-Projects.git.

Fri, Jul 12, 1:19 PM · Release-Engineering-Team (Development services), Release-Engineering-Team-TODO (201907), Gerrit
hashar committed rODHFcf946506cee8: dpkg-source to ignore .gitreview file (authored by hashar).
dpkg-source to ignore .gitreview file
Fri, Jul 12, 10:45 AM
hashar committed rODHF7a429a19af12: Add uploader full name (authored by hashar).
Add uploader full name
Fri, Jul 12, 10:45 AM
hashar closed T160990: deployment-ms-be03.deployment-prep and deployment-ms-be04.deployment-prep have high load / system CPU as Resolved.

Hurrah. Thank you @CDanis & @godog :]

Fri, Jul 12, 9:48 AM · Release-Engineering-Team-TODO (201907), RelEng-Archive-FY201718-Q1, Patch-For-Review, media-storage, Beta-Cluster-Infrastructure
hashar created T227859: Debian package for operations/software/service-checker FTBS due to missing tag upstream/0.1.5.
Fri, Jul 12, 9:43 AM · Continuous-Integration-Config, Release-Engineering-Team-TODO (201907), Services
Krinkle awarded T225735: Cleanup CI puppet manifests an Orange Medal token.
Fri, Jul 12, 12:36 AM · Release-Engineering-Team-TODO (201907), Patch-For-Review, Release-Engineering-Team (CI & Testing services), Technical-Debt, Continuous-Integration-Infrastructure

Thu, Jul 11

hashar closed T99955: Write browser tests for DonationInterface, a subtask of T86247: More and easier testing for DonationInterface, as Declined.
Thu, Jul 11, 12:33 PM · Fundraising-Backlog, MediaWiki-extensions-DonationInterface, Fundraising-Backlog-Old
hashar closed T99955: Write browser tests for DonationInterface, a subtask of T89188: Hackathon idea: Make the DonationInterface extension as friendly as possible, as Declined.
Thu, Jul 11, 12:33 PM · Fundraising-Backlog, Epic, Fundraising Sprint Kraftwerk, MediaWiki-extensions-DonationInterface, Wikimedia-Hackathon-2015, Fundraising-Backlog-Old
hashar closed T99955: Write browser tests for DonationInterface as Declined.

No one is working on it, but it can be reopened or filed again if there is interest later on.

Thu, Jul 11, 12:33 PM · Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO, Fundraising-Backlog, Browser-Tests, MediaWiki-extensions-DonationInterface
hashar added a comment to T72597: Jenkins Gearman plugin has deadlock on executor threads (was: Beta Cluster stopped receiving code updates (beta-update-databases-eqiad hung).

The Jenkins logger ( https://integration.wikimedia.org/ci/log/Jenkins%20Queue/ ) missed the FINEST log level. The reasons are the same as shown in the web gui, for example:

Thu, Jul 11, 10:13 AM · Release-Engineering-Team, Release-Engineering-Team-TODO, Patch-For-Review, Upstream, Jenkins, Continuous-Integration-Infrastructure
hashar added a comment to T72597: Jenkins Gearman plugin has deadlock on executor threads (was: Beta Cluster stopped receiving code updates (beta-update-databases-eqiad hung).
core/src/main/java/hudson/model/Queue.java
MappingWorksheet ws = new MappingWorksheet(p, candidates);
Mapping m = loadBalancer.map(p.task, ws);
if (m == null) {
    // if we couldn't find the executor that fits,
    // just leave it in the buildables list and
    // check if we can execute other projects
    LOGGER.log(Level.FINER, "Failed to map {0} to executors. candidates={1} parked={2}",
            new Object[]{p, candidates, parked.values()});
    p.transientCausesOfBlockage = reasons.isEmpty() ? null : reasons;
    continue;
}
Thu, Jul 11, 9:47 AM · Release-Engineering-Team, Release-Engineering-Team-TODO, Patch-For-Review, Upstream, Jenkins, Continuous-Integration-Infrastructure
hashar added a comment to T72597: Jenkins Gearman plugin has deadlock on executor threads (was: Beta Cluster stopped receiving code updates (beta-update-databases-eqiad hung).

In the Jenkins log bucket hudson.model.Queue.

Thu, Jul 11, 9:43 AM · Release-Engineering-Team, Release-Engineering-Team-TODO, Patch-For-Review, Upstream, Jenkins, Continuous-Integration-Infrastructure
hashar added a comment to T72597: Jenkins Gearman plugin has deadlock on executor threads (was: Beta Cluster stopped receiving code updates (beta-update-databases-eqiad hung).

Another threaddump P8736
https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMTkvMDcvMTEvLS10aHJlYWRkdW1wLnR4dC0tOS0zMC0zMg==

Thu, Jul 11, 9:34 AM · Release-Engineering-Team, Release-Engineering-Team-TODO, Patch-For-Review, Upstream, Jenkins, Continuous-Integration-Infrastructure
hashar added a comment to P8736 Jenkins gearman/beta deadlock thread dump.

For T72597

Thu, Jul 11, 9:32 AM
hashar updated the task description for T72597: Jenkins Gearman plugin has deadlock on executor threads (was: Beta Cluster stopped receiving code updates (beta-update-databases-eqiad hung).
Thu, Jul 11, 9:32 AM · Release-Engineering-Team, Release-Engineering-Team-TODO, Patch-For-Review, Upstream, Jenkins, Continuous-Integration-Infrastructure
hashar created P8736 Jenkins gearman/beta deadlock thread dump.
Thu, Jul 11, 9:31 AM
hashar added a comment to T153859: discernatron 'composer install' fails on two dependencies.

I am not sure discernatron is still used / maintained. Eventually there is an open change to migrate to a different library:

Thu, Jul 11, 8:22 AM · Patch-For-Review, Discovery
hashar added a comment to T167432: Run Wikibase daily browser tests on Jenkins.

The selenium-wikibase-chrome job was based on ruby mediawiki_selenium. This task was to run that test suite on patch submission but it has never been done since the suite was so long. The job eventually got deleted a month ago:

Thu, Jul 11, 7:33 AM · User-Addshore, Wikidata-Campsite, Wikidata-Turtles-Tech-Debt, Wikidata-Ministry-Of-Magic-Tech-Debt, MW-1.31-release-notes (WMF-deploy-2018-02-13 (1.31.0-wmf.21)), Wikidata-Sprint-2017-12-20, Wikidata