Volans (Riccardo Coccioli)
Operations Software Engineer

User Details

User Since
Feb 10 2016, 11:25 AM (58 w, 5 d)
Availability
Available
IRC Nick
volans
LDAP User
Volans
MediaWiki User
RCoccioli (WMF)

Recent Activity

Today

Volans created T161545: Cumin: PuppetDB backend, allow to specify boolean values for resource parameters.
Mon, Mar 27, 6:00 PM · Operations-Software-Development
Volans added a comment to T161007: Decouple Mariadb semi-sync replication from $::mw_primary.

Right, I was forgetting the details of the implementation. I agree that it might affect only the misc shards where we just have one slave in the same DC. No objections.

Mon, Mar 27, 9:31 AM · Patch-For-Review, DBA, DC-Switchover-Prep-Q3-2016-17, Wikimedia-Multiple-active-datacenters, Operations
Volans added a comment to T161007: Decouple Mariadb semi-sync replication from $::mw_primary.

I'm not saying it will not work; I just suggested monitoring it, because the ideal rpl_semi_sync_master_timeout (the one that minimizes the waiting while keeping semi-sync active almost always) might differ between the slaves in the same DC and the cross-DC replication, due to the higher latency.

Mon, Mar 27, 9:01 AM · Patch-For-Review, DBA, DC-Switchover-Prep-Q3-2016-17, Wikimedia-Multiple-active-datacenters, Operations
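
The trade-off described in the comment above can be sketched in Python. This is only an illustration, not the method used on the task: it assumes that per-transaction acknowledgement latencies have been collected somehow, and picks the smallest timeout (in milliseconds, the unit of rpl_semi_sync_master_timeout) that would let a chosen fraction of acknowledgements arrive in time. The function name, the keep_fraction parameter, and the sample values are all hypothetical.

```python
import math


def smallest_timeout_ms(ack_latencies_ms, keep_fraction=0.99):
    """Smallest timeout that lets keep_fraction of the acks arrive in time.

    Uses the nearest-rank percentile of the observed latencies.
    """
    if not ack_latencies_ms:
        raise ValueError("need at least one latency sample")
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    ordered = sorted(ack_latencies_ms)
    rank = math.ceil(keep_fraction * len(ordered))
    return ordered[rank - 1]


# Made-up numbers for illustration only, NOT measurements from T131753.
same_dc = [1, 1, 2, 2, 3, 2, 1, 4, 2, 3]
cross_dc = [35, 36, 38, 40, 37, 36, 90, 39, 41, 38]

print("same DC :", smallest_timeout_ms(same_dc, 0.9), "ms")
print("cross DC:", smallest_timeout_ms(cross_dc, 0.9), "ms")
```

With samples like these, the two replication paths end up with different candidate timeouts, which is the reason for monitoring them separately.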

Thu, Mar 23

Volans added a comment to T161007: Decouple Mariadb semi-sync replication from $::mw_primary.

@jcrespo if I understand the patch correctly, this means that we'll activate semi-sync also for the cross-DC replication?
If so, I would consider not having it for the cross-DC replication, or gathering some data to ensure that the threshold is not usually reached, to avoid flapping between async and semi-sync.
See also the data I gathered ~1 year ago in T131753.

Thu, Mar 23, 7:02 PM · Patch-For-Review, DBA, DC-Switchover-Prep-Q3-2016-17, Wikimedia-Multiple-active-datacenters, Operations
Volans added a comment to T161145: Fix the general problem of randomly-bad puppet agent cron timings within redundant clusters.

I agree with the principle, but we should also take into account the total distribution against the puppetmasters to avoid congestion, and be careful with the per-DC basis.

Thu, Mar 23, 6:32 PM · Operations
Volans added a comment to T160994: Create the failoid service as fallback for the DNS discovery.

Given that there are a lot of services on non-standard ports, that the lvs_services configuration had multiple instances for each discovery entry with different ports (http/https), and that the mapping will just be a hieradata structure convention, we agreed to instead reject all TCP traffic on failoid as the last iptables rule.

Thu, Mar 23, 3:53 PM · Patch-For-Review, discovery-system, Operations
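
A client-side check of the behaviour described above could look like the following Python sketch. It is an assumption-laden illustration, not part of the failoid role: the helper name is hypothetical, and it only distinguishes an active refusal from every other outcome (accepted connection, timeout, unreachable host).

```python
import socket


def is_connection_refused(host, port, timeout=2.0):
    """Return True only if a TCP connect to host:port is actively refused."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return False  # something accepted the connection
    except ConnectionRefusedError:
        return True  # the connect attempt was rejected
    except OSError:
        # Timeouts, DNS failures, unreachable hosts: not an active refusal.
        return False
```

Calling it for ports 80 and 443 on a failoid host would be expected to return True if the reject rule is in place; a timeout would instead suggest that traffic is being dropped rather than rejected.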
Volans closed T160994: Create the failoid service as fallback for the DNS discovery as "Resolved".

Service up and running on roentgenium and tureis with puppet role failoid, refusing connections to ports 80 and 443.

Thu, Mar 23, 2:05 PM · Patch-For-Review, discovery-system, Operations
Volans closed T160994: Create the failoid service as fallback for the DNS discovery, a subtask of T156100: DNS: dynamically generate entries for service discovery, as "Resolved".
Thu, Mar 23, 2:05 PM · Patch-For-Review, Wikimedia-Multiple-active-datacenters, Services (watching), Performance-Team, discovery-system, User-Joe, User-mobrovac, MediaWiki-Configuration, Operations, Wikimedia-Developer-Summit (2017)
Volans edited the description of T160994: Create the failoid service as fallback for the DNS discovery.
Thu, Mar 23, 2:04 PM · Patch-For-Review, discovery-system, Operations

Wed, Mar 22

Volans created P5106 gdnsd stop/start timing.
Wed, Mar 22, 2:46 PM

Tue, Mar 21

Volans added a comment to T160833: DBReplication logs are too verbose.

+1, as soon as one DB is slightly delayed (~10s) thousands of warnings are logged.

Tue, Mar 21, 7:03 PM · MediaWiki-Debug-Logger, Performance
Volans edited the description of T160994: Create the failoid service as fallback for the DNS discovery.
Tue, Mar 21, 6:37 PM · Patch-For-Review, discovery-system, Operations
Volans created T161007: Decouple Mariadb semi-sync replication from $::mw_primary.
Tue, Mar 21, 4:38 PM · Patch-For-Review, DBA, DC-Switchover-Prep-Q3-2016-17, Wikimedia-Multiple-active-datacenters, Operations
Volans renamed T160994: Create the failoid service as fallback for the DNS discovery from "Create the nulloid service as fallback for the DNS discovery" to "Create the failoid service as fallback for the DNS discovery".
Tue, Mar 21, 2:33 PM · Patch-For-Review, discovery-system, Operations
Volans created T160994: Create the failoid service as fallback for the DNS discovery.
Tue, Mar 21, 1:52 PM · Patch-For-Review, discovery-system, Operations

Fri, Mar 17

Volans edited the description of T160178: MediaWiki Datacenter Switchover automation.
Fri, Mar 17, 11:06 AM · Patch-For-Review, DC-Switchover-Prep-Q3-2016-17, Epic, Wikimedia-Multiple-active-datacenters, Operations

Thu, Mar 16

Volans closed T160621: Cumin: upgrade to v0.0.2 in prod as "Resolved".
Thu, Mar 16, 10:45 AM · Patch-For-Review, Operations-Software-Development
Volans moved T160621: Cumin: upgrade to v0.0.2 in prod from In Code Review to Done on the Operations-Software-Development board.
Thu, Mar 16, 10:45 AM · Patch-For-Review, Operations-Software-Development
Volans moved T160621: Cumin: upgrade to v0.0.2 in prod from In Progress to In Code Review on the Operations-Software-Development board.
Thu, Mar 16, 10:45 AM · Patch-For-Review, Operations-Software-Development
Volans moved T160621: Cumin: upgrade to v0.0.2 in prod from Backlog to In Progress on the Operations-Software-Development board.
Thu, Mar 16, 10:01 AM · Patch-For-Review, Operations-Software-Development
Volans created T160621: Cumin: upgrade to v0.0.2 in prod.
Thu, Mar 16, 10:01 AM · Patch-For-Review, Operations-Software-Development

Wed, Mar 15

Volans closed T159969: Cumin: add integration tests as "Resolved".
Wed, Mar 15, 10:22 PM · Patch-For-Review, Operations-Software-Development

Mon, Mar 13

Volans updated subscribers of T160349: Degraded RAID on ms-be2028.
Mon, Mar 13, 2:36 PM · Operations, ops-codfw
Volans closed T159968: Cumin: add support for batch execution as "Resolved".
Mon, Mar 13, 12:25 PM · Patch-For-Review, Operations-Software-Development
Volans added a comment to T160178: MediaWiki Datacenter Switchover automation.

My proposal is to have a Python file for each task (where feasible), all with the same external interface, so that it will be easy to import and call them from a centralized script with a simple menu. The centralized script will not interfere with stdout/stderr.

Mon, Mar 13, 11:55 AM · Patch-For-Review, DC-Switchover-Prep-Q3-2016-17, Epic, Wikimedia-Multiple-active-datacenters, Operations
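
The proposal in the T160178 comment above can be sketched as follows. This is a minimal illustration under stated assumptions, not the switchover code: the shared interface (a callable taking dry_run and returning an exit code), the task names, and run_menu are all hypothetical, and the tasks are shown in one file only to keep the example self-contained.

```python
from typing import Callable, Dict

# Assumed shared interface: every task is run(dry_run) -> exit code.
Task = Callable[[bool], int]


def warmup_caches(dry_run: bool) -> int:
    """Hypothetical task; in the proposal it would live in its own file."""
    print("warmup_caches:", "dry run" if dry_run else "running")
    return 0


def set_readonly(dry_run: bool) -> int:
    """Hypothetical task; in the proposal it would live in its own file."""
    print("set_readonly:", "dry run" if dry_run else "running")
    return 0


TASKS: Dict[str, Task] = {
    "set_readonly": set_readonly,
    "warmup_caches": warmup_caches,
}


def run_menu(tasks: Dict[str, Task], dry_run: bool = True, read=input) -> int:
    """Show a numbered menu, run the chosen task, return its exit code.

    The task is called directly, so whatever it writes to stdout/stderr
    reaches the terminal untouched.
    """
    names = sorted(tasks)
    for index, name in enumerate(names, start=1):
        print(f"{index}) {name}")
    choice = read("Task number: ").strip()
    try:
        selected = int(choice)
    except ValueError:
        selected = 0
    if not 1 <= selected <= len(names):
        print(f"Invalid choice: {choice!r}")
        return 2
    return tasks[names[selected - 1]](dry_run)
```

Calling run_menu(TASKS) prompts for a task number and runs the selected task in dry-run mode. Because every task exposes the same signature, adding one only requires registering it in the dictionary.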
Volans closed T159970: Cumin: auto ucfirst puppetdb resources as "Resolved".
Mon, Mar 13, 10:19 AM · Patch-For-Review, Operations-Software-Development
Volans moved T159970: Cumin: auto ucfirst puppetdb resources from In Code Review to Done on the Operations-Software-Development board.
Mon, Mar 13, 10:19 AM · Patch-For-Review, Operations-Software-Development
Volans updated subscribers of T160312: Degraded RAID on ms-be2008.

@fgiunchedi I've manually updated the task description because NRPE timed out (it took me ~1 minute to get the output).
As usual, puppet is broken due to mkfs and is alarming on Icinga.

Mon, Mar 13, 10:03 AM · Operations, ops-codfw
Volans edited the description of T160312: Degraded RAID on ms-be2008.
Mon, Mar 13, 10:02 AM · Operations, ops-codfw

Fri, Mar 10

Volans moved T159969: Cumin: add integration tests from In Progress to In Code Review on the Operations-Software-Development board.
Fri, Mar 10, 5:29 PM · Patch-For-Review, Operations-Software-Development
Volans moved T159970: Cumin: auto ucfirst puppetdb resources from In Progress to In Code Review on the Operations-Software-Development board.
Fri, Mar 10, 2:55 PM · Patch-For-Review, Operations-Software-Development
Volans moved T159970: Cumin: auto ucfirst puppetdb resources from Backlog to In Progress on the Operations-Software-Development board.
Fri, Mar 10, 2:55 PM · Patch-For-Review, Operations-Software-Development

Wed, Mar 8

Volans created T159970: Cumin: auto ucfirst puppetdb resources.
Wed, Mar 8, 6:53 PM · Patch-For-Review, Operations-Software-Development
Volans moved T159969: Cumin: add integration tests from Backlog to In Progress on the Operations-Software-Development board.
Wed, Mar 8, 6:49 PM · Patch-For-Review, Operations-Software-Development
Volans created T159969: Cumin: add integration tests.
Wed, Mar 8, 6:49 PM · Patch-For-Review, Operations-Software-Development
Volans moved T159968: Cumin: add support for batch execution from In Progress to In Code Review on the Operations-Software-Development board.
Wed, Mar 8, 6:46 PM · Patch-For-Review, Operations-Software-Development
Volans moved T159968: Cumin: add support for batch execution from Backlog to In Progress on the Operations-Software-Development board.
Wed, Mar 8, 6:44 PM · Patch-For-Review, Operations-Software-Development
Volans created T159968: Cumin: add support for batch execution.
Wed, Mar 8, 6:44 PM · Patch-For-Review, Operations-Software-Development

Fri, Mar 3

Volans added a project to T159410: Degraded RAID on db1056: DBA.
Fri, Mar 3, 10:42 AM · DBA, ops-eqiad, Operations

Mon, Feb 27

Volans created T159163: PuppetDB is auto-deactivating hosts.
Mon, Feb 27, 4:57 PM · Puppet, Operations
Volans moved T159127: Cumin: fine tuning configuration from Backlog to In Progress on the Operations-Software-Development board.
Mon, Feb 27, 11:20 AM · Patch-For-Review, Operations-Software-Development
Volans created T159127: Cumin: fine tuning configuration.
Mon, Feb 27, 11:20 AM · Patch-For-Review, Operations-Software-Development

Sun, Feb 26

Volans created T159070: MW OpenStackManager: add support for ED25519 SSH keys.
Sun, Feb 26, 12:26 PM · MW-1.29-release (WMF-deploy-2017-02-28_(1.29.0-wmf.14)), Patch-For-Review, MediaWiki-extensions-OpenStackManager

Feb 25 2017

Volans removed a project from T158854: Review the recent Varnishkafka patches: Operations-Software-Development.
Feb 25 2017, 6:40 PM · User-Elukey, Analytics-Kanban
Volans closed T158967: Cumin: fix first batch of potential issues reported by codacy as "Resolved".
Feb 25 2017, 6:39 PM · Patch-For-Review, Operations-Software-Development
Volans moved T158967: Cumin: fix first batch of potential issues reported by codacy from In Progress to In Code Review on the Operations-Software-Development board.
Feb 25 2017, 6:28 PM · Patch-For-Review, Operations-Software-Development
Volans renamed T159045: Update Puppet repo code that uses maniphest.update and maniphest.createtask conduit api from "Update wmf_auto_reimage.py file to use maniphest.edit conduit api" to "Update Puppet repo code that uses maniphest.edit conduit api".
Feb 25 2017, 6:25 PM · Operations-Software-Development, Technical-Debt, Operations, Phabricator
Volans triaged T159045: Update Puppet repo code that uses maniphest.update and maniphest.createtask conduit api as "Normal" priority.
Feb 25 2017, 6:17 PM · Operations-Software-Development, Technical-Debt, Operations, Phabricator
Volans claimed T159045: Update Puppet repo code that uses maniphest.update and maniphest.createtask conduit api.
Feb 25 2017, 6:11 PM · Operations-Software-Development, Technical-Debt, Operations, Phabricator

Feb 24 2017

Volans closed T158798: Ferm: leftovers on hosts were it was enabled and then removed as "Resolved".

Cleanup completed and all looks good so far. Resolving.

Feb 24 2017, 5:20 PM · Operations
Volans updated subscribers of T158798: Ferm: leftovers on hosts were it was enabled and then removed.

@jcrespo @Marostegui are you ok with the manual removal of any ferm rule and manual restore of a clean iptables table on dbproxy1011?

Feb 24 2017, 4:44 PM · Operations
Volans closed T158753: Cumin: authorize also IPv6 on the targets as "Resolved".
Feb 24 2017, 4:28 PM · Patch-For-Review, Operations-Software-Development
Volans moved T158967: Cumin: fix first batch of potential issues reported by codacy from Backlog to In Progress on the Operations-Software-Development board.
Feb 24 2017, 3:39 PM · Patch-For-Review, Operations-Software-Development
Volans created T158967: Cumin: fix first batch of potential issues reported by codacy.
Feb 24 2017, 3:39 PM · Patch-For-Review, Operations-Software-Development
Volans closed T154588: Automation framework first version as "Resolved".
Feb 24 2017, 2:32 PM · Patch-For-Review, Operations-Software-Development
Volans edited the description of T154588: Automation framework first version.
Feb 24 2017, 2:32 PM · Patch-For-Review, Operations-Software-Development
Volans created T158964: Cumin: fill wikitech page with documentation.
Feb 24 2017, 2:31 PM · Operations-Software-Development
Volans closed T158660: Keyholder accept passwordless keys as "Resolved".
Feb 24 2017, 2:28 PM · Patch-For-Review, Operations, Operations-Software-Development
Volans closed T158659: Keyholder: add support for ED25519 keys as "Resolved".
Feb 24 2017, 2:27 PM · Patch-For-Review, Operations-Software-Development
Volans closed T158758: Cumin: ignore urllib3 SubjectAltNameWarning in PuppetDB calls as "Resolved".
Feb 24 2017, 9:21 AM · Patch-For-Review, Operations-Software-Development
Volans closed T158748: Cumin: add support for 'not' operator for hosts as "Resolved".
Feb 24 2017, 9:11 AM · Patch-For-Review, Operations-Software-Development
Volans closed T158746: Cumin: use safer default for hosts regex matching as "Resolved".
Feb 24 2017, 9:11 AM · Patch-For-Review, Operations-Software-Development
Volans closed T149913: Icinga raid_handler: add option to handle frack instances as "Resolved".
Feb 24 2017, 9:11 AM · Patch-For-Review, Operations-Software-Development
Volans moved T149913: Icinga raid_handler: add option to handle frack instances from In Code Review to Done on the Operations-Software-Development board.
Feb 24 2017, 9:11 AM · Patch-For-Review, Operations-Software-Development
Volans moved T158746: Cumin: use safer default for hosts regex matching from In Code Review to Done on the Operations-Software-Development board.
Feb 24 2017, 9:10 AM · Patch-For-Review, Operations-Software-Development
Volans moved T158748: Cumin: add support for 'not' operator for hosts from In Code Review to Done on the Operations-Software-Development board.
Feb 24 2017, 9:10 AM · Patch-For-Review, Operations-Software-Development

Feb 22 2017

Volans created T158798: Ferm: leftovers on hosts were it was enabled and then removed.
Feb 22 2017, 7:41 PM · Operations
Volans moved T158748: Cumin: add support for 'not' operator for hosts from In Progress to In Code Review on the Operations-Software-Development board.
Feb 22 2017, 4:56 PM · Patch-For-Review, Operations-Software-Development
Volans closed T158773: Cumin: skip target installation on labs realm as "Resolved".

Solved for now. I'll follow up with Labs folks on when/how to include Cumin in Labs too.

Feb 22 2017, 4:51 PM · Patch-For-Review, Operations-Software-Development
Volans moved T158773: Cumin: skip target installation on labs realm from In Progress to In Code Review on the Operations-Software-Development board.
Feb 22 2017, 4:05 PM · Patch-For-Review, Operations-Software-Development
Volans moved T158773: Cumin: skip target installation on labs realm from Backlog to In Progress on the Operations-Software-Development board.
Feb 22 2017, 4:00 PM · Patch-For-Review, Operations-Software-Development
Volans created T158773: Cumin: skip target installation on labs realm.
Feb 22 2017, 3:59 PM · Patch-For-Review, Operations-Software-Development
Volans moved T158753: Cumin: authorize also IPv6 on the targets from In Progress to In Code Review on the Operations-Software-Development board.
Feb 22 2017, 3:44 PM · Patch-For-Review, Operations-Software-Development
Volans moved T158758: Cumin: ignore urllib3 SubjectAltNameWarning in PuppetDB calls from In Progress to In Code Review on the Operations-Software-Development board.
Feb 22 2017, 2:28 PM · Patch-For-Review, Operations-Software-Development
Volans moved T158746: Cumin: use safer default for hosts regex matching from In Progress to In Code Review on the Operations-Software-Development board.
Feb 22 2017, 2:10 PM · Patch-For-Review, Operations-Software-Development
Volans added a comment to T156232: confctl SubjectAltNameWarning after python-urllib3 upgrade.

The main issue is tracked in T158757. For conftool the temporary solution is to ignore the warning:

Feb 22 2017, 12:52 PM · discovery-system, Operations
Volans moved T158758: Cumin: ignore urllib3 SubjectAltNameWarning in PuppetDB calls from Backlog to In Progress on the Operations-Software-Development board.
Feb 22 2017, 12:39 PM · Patch-For-Review, Operations-Software-Development
Volans created T158758: Cumin: ignore urllib3 SubjectAltNameWarning in PuppetDB calls.
Feb 22 2017, 12:39 PM · Patch-For-Review, Operations-Software-Development
Volans added a comment to T150822: Internal PKI for secure communication - Barcelona Ops offsite 2016.

Related issue with the current Puppet certificates: T158757

Feb 22 2017, 12:38 PM · Operations
Volans created T158757: Puppet certificate missing subjectAltName.
Feb 22 2017, 12:37 PM · Patch-For-Review, Puppet, Operations
Volans moved T158753: Cumin: authorize also IPv6 on the targets from Backlog to In Progress on the Operations-Software-Development board.
Feb 22 2017, 11:13 AM · Patch-For-Review, Operations-Software-Development
Volans created T158753: Cumin: authorize also IPv6 on the targets.
Feb 22 2017, 11:13 AM · Patch-For-Review, Operations-Software-Development
Volans moved T158748: Cumin: add support for 'not' operator for hosts from Backlog to In Progress on the Operations-Software-Development board.
Feb 22 2017, 9:33 AM · Patch-For-Review, Operations-Software-Development
Volans created T158748: Cumin: add support for 'not' operator for hosts.
Feb 22 2017, 9:33 AM · Patch-For-Review, Operations-Software-Development
Volans created T158747: Cumin: better error message if no config file is available.
Feb 22 2017, 9:31 AM · Operations-Software-Development
Volans moved T158746: Cumin: use safer default for hosts regex matching from Backlog to In Progress on the Operations-Software-Development board.
Feb 22 2017, 9:29 AM · Patch-For-Review, Operations-Software-Development
Volans created T158746: Cumin: use safer default for hosts regex matching.
Feb 22 2017, 9:29 AM · Patch-For-Review, Operations-Software-Development

Feb 21 2017

Volans added a comment to T158660: Keyholder accept passwordless keys.

@mmodell I'm not sure what the status is of the https://phabricator.wikimedia.org/source/keyholder/ repository that was recently created.

Feb 21 2017, 6:28 PM · Patch-For-Review, Operations, Operations-Software-Development
Volans moved T158659: Keyholder: add support for ED25519 keys from In Progress to In Code Review on the Operations-Software-Development board.
Feb 21 2017, 6:25 PM · Patch-For-Review, Operations-Software-Development
Volans moved T158659: Keyholder: add support for ED25519 keys from Backlog to In Progress on the Operations-Software-Development board.
Feb 21 2017, 6:25 PM · Patch-For-Review, Operations-Software-Development
Volans renamed T158659: Keyholder: add support for ED25519 keys from "Keyholder: " to "Keyholder: add support for ED25519 keys".
Feb 21 2017, 6:22 PM · Patch-For-Review, Operations-Software-Development
Volans moved T158660: Keyholder accept passwordless keys from In Progress to In Code Review on the Operations-Software-Development board.
Feb 21 2017, 3:38 PM · Patch-For-Review, Operations, Operations-Software-Development
Volans moved T158660: Keyholder accept passwordless keys from Backlog to In Progress on the Operations-Software-Development board.
Feb 21 2017, 3:38 PM · Patch-For-Review, Operations, Operations-Software-Development
Volans renamed T158660: Keyholder accept passwordless keys from "Keyholder: " to "Keyholder accept passwordless keys".
Feb 21 2017, 3:37 PM · Patch-For-Review, Operations, Operations-Software-Development
Volans created T158660: Keyholder accept passwordless keys.
Feb 21 2017, 3:33 PM · Patch-For-Review, Operations, Operations-Software-Development
Volans created T158659: Keyholder: add support for ED25519 keys.
Feb 21 2017, 3:33 PM · Patch-For-Review, Operations-Software-Development
Volans edited the description of T154588: Automation framework first version.
Feb 21 2017, 11:56 AM · Patch-For-Review, Operations-Software-Development

Feb 20 2017

Volans added a comment to T158551: operations/software/cumin should not run debian-glue job on master branch.

@hashar I'm wondering if it could be easier to do the opposite and run only if it matches a specific branch, debian in my case, but YMMV.

Feb 20 2017, 12:10 PM · Patch-For-Review, Continuous-Integration-Config
Volans added a comment to T158553: Enhance debian-glue job packages validation.

Another couple of enhancements that could be made for lintian, looking at the source code at https://github.com/mika/jenkins-debian-glue/blob/16f0ba5565435e12cb211c686bd5a49cb073252e/scripts/lintian-junit-report are:

Feb 20 2017, 11:55 AM · Patch-For-Review, Continuous-Integration-Config

Feb 16 2017

Volans added a comment to T143536: Upgrade all mw* servers to debian jessie.

What is the status of terbium? From the summary it appears to have been upgraded, but the host is still running trusty.

Feb 16 2017, 6:47 PM · Operations-Software-Development, Patch-For-Review, HHVM, Operations