Volans (Riccardo Coccioli)
Operations Software Engineer

User Details

User Since
Feb 10 2016, 11:25 AM (75 w, 3 d)
Availability
Available
IRC Nick
volans
LDAP User
Volans
MediaWiki User
RCoccioli (WMF)

Recent Activity

Fri, Jul 21

Volans moved T170394: Cumin: add multi-query support from In Progress to In Code Review on the Operations-Software-Development board.
Fri, Jul 21, 8:22 AM · Patch-For-Review, Operations-Software-Development

Thu, Jul 20

Volans added a comment to T154588: Automation framework first version.

@greg kinda...:-P
I might have abused this task a bit for things related to making it a proper release: PyPI-ready, plus some metadata/testing polishing.

Thu, Jul 20, 11:12 PM · Patch-For-Review, Operations-Software-Development
Volans added a comment to T170394: Cumin: add multi-query support.

The implemented proposal (CR will be sent later today) is pretty much what is listed as (2) in the task description:

  • have an optional default_backend parameter in the configuration that will cause:
    • if set: try to parse the query with the default backend first; if parsing fails, try with the multi-query global grammar
    • if not set: run directly with the multi-query global grammar
  • have aliases only in the global grammar; they need to use the global multi-query grammar and can be specified as A:alias_name
  • aliases are recursively replaced with their value, so an alias can be a composition of other aliases
  • each query block can be aggregated with the others using the boolean operators: and, or, and not, xor
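
To make the two mechanisms above concrete, here is a minimal Python sketch, not the actual Cumin code. The names `expand_aliases`, `parse_query` and `ParseError`, the parser callables, and the `X{...}` query strings used to exercise it are all hypothetical; only the `A:alias_name` syntax and the "default backend first, then global grammar" order come from the description above.

```python
import re

# Matches the A:alias_name form described above (alias-name charset is an assumption).
ALIAS_RE = re.compile(r"\bA:(\w+)")


class ParseError(Exception):
    """Hypothetical error raised by a grammar that cannot parse a query."""


def expand_aliases(query, aliases, _seen=()):
    """Recursively replace every A:name with its parenthesised value."""
    def _replace(match):
        name = match.group(1)
        if name in _seen:
            raise ValueError("alias loop detected: " + " -> ".join(_seen + (name,)))
        if name not in aliases:
            raise KeyError("unknown alias: " + name)
        # An alias value may itself contain aliases, hence the recursion.
        return "(" + expand_aliases(aliases[name], aliases, _seen + (name,)) + ")"

    return ALIAS_RE.sub(_replace, query)


def parse_query(query, global_parse, default_parse=None):
    """Try the default backend first (if configured), else the global grammar."""
    if default_parse is not None:
        try:
            return default_parse(query)
        except ParseError:
            pass  # fall back to the multi-query global grammar
    return global_parse(query)
```

Wrapping each expanded alias in parentheses keeps the boolean operators of the outer query from binding to parts of the alias value.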
Thu, Jul 20, 5:16 PM · Patch-For-Review, Operations-Software-Development

Wed, Jul 19

Volans awarded T129139: Deploy statsv with scap3 a Like token.
Wed, Jul 19, 5:26 PM · User-fgiunchedi, Patch-For-Review, monitoring, Deployment-Systems, Scap (Scap3-Adoption-Phase1), scap2
Volans merged T169993: Degraded RAID on lvs3001 into T168619: Degraded RAID on lvs3001.
Wed, Jul 19, 9:02 AM · ops-esams, Operations
Volans merged T170539: Degraded RAID on lvs3001 into T168619: Degraded RAID on lvs3001.
Wed, Jul 19, 9:02 AM · ops-esams, Operations
Volans merged T170538: Degraded RAID on lvs3001 into T168619: Degraded RAID on lvs3001.
Wed, Jul 19, 9:02 AM · ops-esams, Operations

Mon, Jul 17

Volans added a project to T170847: Icinga check for pybal HTTP connections to etcd: monitoring.
Mon, Jul 17, 4:40 PM · monitoring, Pybal, Traffic, Operations
Volans closed T164206: Icinga loses downtime entries, causing alert and page spam as Resolved.

After a few days without incidents, it seems that we can call it resolved! \o/

Mon, Jul 17, 3:54 PM · Icinga, Operations, monitoring
Volans closed T169640: Support aliases in cumin as Resolved.
Mon, Jul 17, 12:48 PM · Operations-Software-Development

Sat, Jul 15

Liuxinyu970226 awarded T168142: Cleanup phabricator.wikimedia.org uploaded files, WP zero abuse a Yellow Medal token.
Sat, Jul 15, 11:21 PM · Patch-For-Review, Wikimedia-Incident, Wikimedia-Site-requests, Phabricator

Thu, Jul 13

Volans added a comment to T169011: reimage maps-test servers.

@Pnorman the issue on Icinga was solved around 12:10 UTC but, as you can see in the detailed explanation in T164206#3435220, there was a loss of downtimes and acknowledgments, and the cleanup was done a bit later after the fix. From the logs I can see that at 13:43 UTC the maps-test alarms were acknowledged again.

Thu, Jul 13, 8:28 PM · Patch-For-Review, Maps, Discovery, Interactive-Sprint
Volans added a comment to T169011: reimage maps-test servers.

@Pnorman when was that?
I'm asking because the Icinga issue is T164206 and since today around 12:10 UTC it should be fixed, see T164206#3435220 for the full explanation.
And the alarm I mentioned above was already re-acked after the fix and I can see the ack is still there.

Thu, Jul 13, 8:01 PM · Patch-For-Review, Maps, Discovery, Interactive-Sprint
Volans lowered the priority of T164206: Icinga loses downtime entries, causing alert and page spam from Unbreak Now! to High.

So after a lot of digging and live debugging between @akosiaris and me, we *think* we have an explanation and have fixed the issue.

Thu, Jul 13, 1:19 PM · Icinga, Operations, monitoring

Wed, Jul 12

Volans triaged T170433: Degraded RAID on db1066 as Normal priority.
Wed, Jul 12, 4:06 PM · DBA, ops-eqiad, Operations
Volans added a comment to T170394: Cumin: add multi-query support.

For option 2, how do you resolve ambiguities? Would letters be unique to backends (i.e. F/R are claimed by PuppetDB and can't be claimed by another backend, period) or would they be claimed on a first-come first-served basis, in which case the order of backend inclusion matters? I'd strongly prefer the former, but this means that letters need to be reserved across/above backends, right?

Wed, Jul 12, 12:29 PM · Patch-For-Review, Operations-Software-Development
Volans moved T170394: Cumin: add multi-query support from Backlog to In Progress on the Operations-Software-Development board.
Wed, Jul 12, 10:53 AM · Patch-For-Review, Operations-Software-Development
Volans created T170394: Cumin: add multi-query support.
Wed, Jul 12, 10:53 AM · Patch-For-Review, Operations-Software-Development

Tue, Jul 11

greg awarded T170353: Icinga: timeseries checks should have the link to a graph with the data a Cup of Joe token.
Tue, Jul 11, 10:18 PM · Operations, monitoring
Volans created T170353: Icinga: timeseries checks should have the link to a graph with the data.
Tue, Jul 11, 9:59 PM · Operations, monitoring
Volans added a watcher for monitoring: Volans.
Tue, Jul 11, 9:25 PM
Volans added a member for monitoring: Volans.
Tue, Jul 11, 9:24 PM
Volans added a comment to T169011: reimage maps-test servers.

@Gehel I've ACK'ed the Icinga alert for kartotherian endpoints health on maps-test* hosts. Given that they have been in alarm for 4 days, my guess is that they were downtimed, but today it seems that Icinga lost the downtimes again. Just FYI.

Tue, Jul 11, 7:38 PM · Patch-For-Review, Maps, Discovery, Interactive-Sprint
Volans added a comment to T150160: Remote IPMI doesn't work for ~2% of the fleet.

db1053.mgmt.eqiad.wmnet seems to work now, I can both ssh and get a chassis status from neodymium. Transient issue?

Tue, Jul 11, 11:25 AM · monitoring, Operations

Mon, Jul 10

Volans added a comment to T150160: Remote IPMI doesn't work for ~2% of the fleet.

I've tried to fix the 5 hosts with the wrong remote config:

sudo cumin -b 1 "bast3002.wikimedia.org,cp4021.ulsfo.wmnet,db2082.codfw.wmnet,gerrit2001.wikimedia.org,naos.codfw.wmnet" "ipmi-config --category=core --key-pair="Lan_Channel:Volatile_Access_Mode=Always_Available" --key-pair="Lan_Channel:Non_Volatile_Access_Mode=Always_Available" --commit"
Mon, Jul 10, 10:44 PM · monitoring, Operations
Volans added a comment to T150160: Remote IPMI doesn't work for ~2% of the fleet.

I've run the audit again with a small script on neodymium in my home, using cumin to grab the list of hostnames. The only requirement is to have exported IPMI_PASSWORD with the right password in the current environment (I prefix the command with a space when setting it so that it doesn't get saved at all into the bash history, because I'm using HISTCONTROL=ignoreboth).
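
The script itself is not shown here, so the following is only a rough Python sketch of what such an audit loop could look like. It assumes `ipmitool` is the client (its `-E` option reads the password from the `IPMI_PASSWORD` environment variable); the `root` user name, the function names and the 30-second timeout are hypothetical.

```python
import os
import subprocess


def build_chassis_status_command(mgmt_host, user="root"):
    """Return the ipmitool argv for a remote chassis status check.

    -E makes ipmitool read the password from IPMI_PASSWORD, so the
    password never shows up in the process arguments.
    """
    return ["ipmitool", "-I", "lanplus", "-H", mgmt_host,
            "-U", user, "-E", "chassis", "status"]


def audit(mgmt_hosts, run=subprocess.run, environ=os.environ):
    """Return the management hosts for which the remote check fails."""
    if not environ.get("IPMI_PASSWORD"):
        raise RuntimeError("IPMI_PASSWORD must be exported in the environment")

    failed = []
    for host in mgmt_hosts:
        try:
            result = run(build_chassis_status_command(host),
                         stdout=subprocess.DEVNULL,
                         stderr=subprocess.DEVNULL,
                         timeout=30)
            ok = result.returncode == 0
        except subprocess.TimeoutExpired:
            ok = False  # an unresponsive BMC counts as a failure
        if not ok:
            failed.append(host)
    return failed
```

The `run` and `environ` parameters exist only so the loop can be exercised without real hardware; the list of management hostnames would come from cumin, as described above.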

Mon, Jul 10, 10:38 PM · monitoring, Operations
Volans added a comment to T169360: Unresponsive/misconfigured iDRACs over the host-BMC interface.

Regarding sodium, it seems to me that puppet runs get stuck because it tries to execute ipmi-config while loading facts:

Mon, Jul 10, 10:36 PM · monitoring, Operations, ops-codfw, ops-eqiad

Fri, Jul 7

Volans moved T169640: Support aliases in cumin from In Progress to In Code Review on the Operations-Software-Development board.
Fri, Jul 7, 8:07 AM · Operations-Software-Development
Volans added a comment to T169959: bast3002 didn't come up after reboot.

@MoritzMuehlenhoff the broken disk was known: T169959

Fri, Jul 7, 7:39 AM · Operations, ops-esams

Thu, Jul 6

Volans merged T169906: Degraded RAID on db2044 into T169693: db2044: Disk on predictive failure.
Thu, Jul 6, 5:04 PM · ops-codfw, DBA, Operations
Volans merged task T169906: Degraded RAID on db2044 into T169693: db2044: Disk on predictive failure.
Thu, Jul 6, 5:04 PM · Operations, ops-codfw
Volans added a comment to T169906: Degraded RAID on db2044.

Closing as duplicate of T169693 . The disk went to a failed state when removed and is now rebuilding.

Thu, Jul 6, 5:04 PM · Operations, ops-codfw
Volans added a comment to T169765: pybal should automatically reconnect to etcd.

Another thing that would be nice is the possibility to specify more than one conf host in profile::pybal::config_host: conf2001.codfw.wmnet, and allow pybal to connect to more hosts in case of connection failures.

Thu, Jul 6, 7:26 AM · Pybal, Traffic, Operations

Wed, Jul 5

mmodell awarded T168142: Cleanup phabricator.wikimedia.org uploaded files, WP zero abuse a Yellow Medal token.
Wed, Jul 5, 5:05 PM · Patch-For-Review, Wikimedia-Incident, Wikimedia-Site-requests, Phabricator
Volans closed T169619: Degraded RAID on ms-be2024 as Resolved.
Wed, Jul 5, 10:38 AM · Operations-Software-Development, Operations
Volans moved T169619: Degraded RAID on ms-be2024 from In Code Review to Done on the Operations-Software-Development board.
Wed, Jul 5, 10:38 AM · Operations-Software-Development, Operations
Volans moved T169619: Degraded RAID on ms-be2024 from In Progress to In Code Review on the Operations-Software-Development board.
Wed, Jul 5, 9:18 AM · Operations-Software-Development, Operations
Volans moved T169619: Degraded RAID on ms-be2024 from Backlog to In Progress on the Operations-Software-Development board.
Wed, Jul 5, 9:18 AM · Operations-Software-Development, Operations
Volans claimed T169619: Degraded RAID on ms-be2024.

False positive, I'll update the list of patterns to skip.

Wed, Jul 5, 8:32 AM · Operations-Software-Development, Operations
Volans updated subscribers of T169680: NFS on dataset1001 overloaded, high load on the hosts that mount it.

@fgiunchedi FYI Doing some cleanup with @ArielGlenn we discovered that most of the load was generated by the prometheus-node-exporter:

3007 ?        Dsl   44:58 /usr/bin/prometheus-node-exporter -collector.diskstats.ignored-devices=^(ram|loop|fd)\\d+$ -collector.textfile.directory=/var/lib/prometheus/node.d -collectors.enabled=diskstats,filefd,filesystem,hwmon,loadavg,mdadm,meminfo,netdev,netstat,sockstat,stat,tcpstat,textfile,time,uname
Wed, Jul 5, 8:29 AM · Patch-For-Review, Datasets-General-or-Unknown, Operations
Volans added a comment to T169680: NFS on dataset1001 overloaded, high load on the hosts that mount it.

You could try to force unmount the NFS partition from the hosts that mount it; it should abort the outstanding I/O they are waiting for. Also, processes that are in TASK_KILLABLE state should be killable.

Wed, Jul 5, 7:27 AM · Patch-For-Review, Datasets-General-or-Unknown, Operations

Tue, Jul 4

Volans moved T169640: Support aliases in cumin from Backlog to In Progress on the Operations-Software-Development board.
Tue, Jul 4, 9:47 PM · Operations-Software-Development
Volans triaged T169680: NFS on dataset1001 overloaded, high load on the hosts that mount it as High priority.
Tue, Jul 4, 9:38 PM · Patch-For-Review, Datasets-General-or-Unknown, Operations
Volans created T169680: NFS on dataset1001 overloaded, high load on the hosts that mount it.
Tue, Jul 4, 9:38 PM · Patch-For-Review, Datasets-General-or-Unknown, Operations
Volans claimed T169640: Support aliases in cumin.
Tue, Jul 4, 2:18 PM · Operations-Software-Development
Volans created T169602: Upgrade tox on CI instances.
Tue, Jul 4, 8:11 AM · Release-Engineering-Team (Kanban), Continuous-Integration-Infrastructure

Mon, Jul 3

Volans closed T158747: Cumin: better error message if no config file is available as Resolved.
Mon, Jul 3, 8:44 PM · Operations-Software-Development
Volans closed T164838: Cumin: allow to specify a timeout per command as Resolved.
Mon, Jul 3, 8:43 PM · Operations-Software-Development
Volans moved T158747: Cumin: better error message if no config file is available from In Code Review to Done on the Operations-Software-Development board.
Mon, Jul 3, 8:43 PM · Operations-Software-Development
Volans moved T164838: Cumin: allow to specify a timeout per command from In Code Review to Done on the Operations-Software-Development board.
Mon, Jul 3, 8:43 PM · Operations-Software-Development
Volans added a comment to T46443: Jenkins: install tox on Precise labs instances.

@hashar is there still any reason to keep tox pinned to this old version? Any chance we could upgrade it to a newer one? Jessie backports has 2.5.0 and latest from pip is 2.7.0.

Mon, Jul 3, 6:36 PM · Continuous-Integration-Infrastructure
Volans added a comment to T169035: bast3002 sdb broken.

Opened T169564 for the mdadm configuration.

Mon, Jul 3, 6:17 PM · Operations, ops-esams
Dzahn awarded T169564: MD RAID: remove mdadm daily check a Like token.
Mon, Jul 3, 5:43 PM · Operations
Volans triaged T169564: MD RAID: remove mdadm daily check as Normal priority.
Mon, Jul 3, 5:39 PM · Operations
Volans created T169564: MD RAID: remove mdadm daily check.
Mon, Jul 3, 5:39 PM · Operations
Volans updated the task description for T164780: Sunset our use of Salt.
Mon, Jul 3, 1:28 PM · Goal, Technical-Debt, Operations-Software-Development, Operations
Volans added a project to T164780: Sunset our use of Salt: Goal.
Mon, Jul 3, 1:25 PM · Goal, Technical-Debt, Operations-Software-Development, Operations
Volans added a comment to T168142: Cleanup phabricator.wikimedia.org uploaded files, WP zero abuse.

@mmodell thanks for cleaning those too, but I don't think that this method can be applied in general. There are ex-legit users that might be suspended (we have some of those) and their uploads would be deleted too although legit.

Mon, Jul 3, 10:52 AM · Patch-For-Review, Wikimedia-Incident, Wikimedia-Site-requests, Phabricator
Volans added a comment to T168142: Cleanup phabricator.wikimedia.org uploaded files, WP zero abuse.

@Aklapper @mmodell the cleaning effort is clearly not working! After my last full cleaning of recent files, there are already 134 files uploaded during the last few days by users that are disabled now. And the ~3k previously reported ones are still there.

Mon, Jul 3, 10:02 AM · Patch-For-Review, Wikimedia-Incident, Wikimedia-Site-requests, Phabricator
Volans added a comment to T168142: Cleanup phabricator.wikimedia.org uploaded files, WP zero abuse.

@Framawiki @Mainframe98 thanks for letting us know. I've disabled the two users and removed their files. I didn't touch the task T169502; as it's harmless at this point, I'll leave it to Phabricator admins.

Mon, Jul 3, 9:44 AM · Patch-For-Review, Wikimedia-Incident, Wikimedia-Site-requests, Phabricator
Volans added a comment to T169035: bast3002 sdb broken.

And of course that was not enough: I also had to add an exit 0 to /etc/cron.daily/mdadm to prevent it from running, because without the MAILADDR setting the report check refuses to run and generates cronspam.

Mon, Jul 3, 7:33 AM · Operations, ops-esams
Volans added a comment to T166965: Degraded RAID on lvs3001.

And of course that was not enough: I also had to add an exit 0 to /etc/cron.daily/mdadm to prevent it from running, because without the MAILADDR setting the report check refuses to run and generates cronspam.

Mon, Jul 3, 7:33 AM · Traffic, ops-esams, Operations
Volans added a comment to T169035: bast3002 sdb broken.

I've commented out the MAILADDR line to avoid getting one email per day. Given that we also have the Icinga check, we could consider commenting it out broadly across the fleet. The file is currently not managed by puppet.

Mon, Jul 3, 7:16 AM · Operations, ops-esams
Volans added a comment to T166965: Degraded RAID on lvs3001.

I've commented out the MAILADDR line to avoid getting one email per day. Given that we also have the Icinga check, we could consider commenting it out broadly across the fleet. The file is currently not managed by puppet.

Mon, Jul 3, 7:15 AM · Traffic, ops-esams, Operations

Fri, Jun 30

Volans added a comment to T169355: Degraded RAID on db1052.

Rebuild completed, RAID back to optimal. There are 2 disks with predictive failure that might fail sooner or later.

Fri, Jun 30, 9:47 PM · DBA, ops-eqiad, Operations
Volans added a project to T169355: Degraded RAID on db1052: DBA.

FYI: This is s1 master! Adding @Marostegui @jcrespo directly too for visibility.
At least the other 2 disks with predictive failure are in different spans.

Fri, Jun 30, 5:39 PM · DBA, ops-eqiad, Operations
Volans moved T169304: Cumin masters: simplify usage in case of emergency from In Progress to In Code Review on the Operations-Software-Development board.
Fri, Jun 30, 11:11 AM · Patch-For-Review, Operations-Software-Development
Volans moved T169304: Cumin masters: simplify usage in case of emergency from Backlog to In Progress on the Operations-Software-Development board.
Fri, Jun 30, 8:39 AM · Patch-For-Review, Operations-Software-Development
Volans created T169304: Cumin masters: simplify usage in case of emergency.
Fri, Jun 30, 8:38 AM · Patch-For-Review, Operations-Software-Development

Tue, Jun 27

Volans added a comment to T143175: Configure phabricator clustering for daemons and repositories.

@mmodell thanks for the additional info.

Tue, Jun 27, 5:48 PM · Release-Engineering-Team (Backlog), WorkType-NewFunctionality, Availability, Phabricator
Volans added a comment to T168881: Rename mw2148 / mw2149 / mw2259 / mw2260 to thumbor200[1234].

@fgiunchedi the old names need to be cleaned from puppetdb too; I guess we need to add this step to the related documentation as well.

Tue, Jun 27, 6:58 AM · ops-codfw, User-fgiunchedi, Operations, Performance-Team, Thumbor

Mon, Jun 26

Volans added a comment to T168142: Cleanup phabricator.wikimedia.org uploaded files, WP zero abuse.

@Aklapper, yes, AFAIK that field in the DB corresponds to the Upload Source query filter in the file query page. But from what I'm seeing, it determines only where the file was uploaded from (I guess whether from the file upload page or indirectly from a drag and drop in a task comment, a Paste, etc.).

Mon, Jun 26, 3:00 PM · Patch-For-Review, Wikimedia-Incident, Wikimedia-Site-requests, Phabricator
Volans added a comment to T168142: Cleanup phabricator.wikimedia.org uploaded files, WP zero abuse.

@Aklapper not really, same order of magnitude, see below. Also, what exactly is the difference? I've opened some of the isExplicitUpload = 0 ones and they are usually images that could be spam as well.

Mon, Jun 26, 2:15 PM · Patch-For-Review, Wikimedia-Incident, Wikimedia-Site-requests, Phabricator
Volans added a comment to T164206: Icinga loses downtime entries, causing alert and page spam.

512 sounds reasonable and seems the only pertinent one in that list, +1!

Mon, Jun 26, 8:48 AM · Icinga, Operations, monitoring
Volans added a comment to T168619: Degraded RAID on lvs3001.

Related to T166965; please do not close until the parent is fixed as well, because it will be re-opened if Icinga loses the downtimes/acknowledgements or the alarm flaps for any reason.

Mon, Jun 26, 7:40 AM · ops-esams, Operations
Volans added a parent task for T168619: Degraded RAID on lvs3001: T166965: Degraded RAID on lvs3001.
Mon, Jun 26, 7:39 AM · ops-esams, Operations
Volans added a subtask for T166965: Degraded RAID on lvs3001: T168619: Degraded RAID on lvs3001.
Mon, Jun 26, 7:39 AM · Traffic, ops-esams, Operations

Fri, Jun 23

Volans updated subscribers of T168142: Cleanup phabricator.wikimedia.org uploaded files, WP zero abuse.

Today @MaxSem got another report of an abuse file in Phabricator. After checking it I found that it was uploaded last week during the incident; I couldn't find it in the Phabricator UI file search, but I could confirm it was there in the DB.

Fri, Jun 23, 9:11 PM · Patch-For-Review, Wikimedia-Incident, Wikimedia-Site-requests, Phabricator
Volans added a comment to T156933: Improve purging for analytics-slave data on Eventlogging.

@elukey thanks for the update. Which one is the final change to be reviewed then?

Fri, Jun 23, 5:30 PM · Patch-For-Review, User-Elukey, Analytics-Kanban
Volans updated subscribers of T143175: Configure phabricator clustering for daemons and repositories.

As per the obvious concerns about www-data being able to ssh as the Phabricator daemon_user, that unfortunately seems to be a hard requirement on the Phabricator side in order to enable the clustering. I'd like @MoritzMuehlenhoff to comment on this too; in the end it is a trade-off between security and high availability.
It is not fully clear to me right now what kind of access www-data has to the repositories data in the single host configuration. The sudo line seems limited to git-upload-pack and git-receive-pack.

Fri, Jun 23, 9:33 AM · Release-Engineering-Team (Backlog), WorkType-NewFunctionality, Availability, Phabricator
Volans added a comment to T143175: Configure phabricator clustering for daemons and repositories.

Following @Dzahn's last message on https://gerrit.wikimedia.org/r/#/c/324841/3 , continuing here.

Fri, Jun 23, 9:18 AM · Release-Engineering-Team (Backlog), WorkType-NewFunctionality, Availability, Phabricator

Jun 21 2017

Volans added a comment to T167504: New tool to track package updates/status for hosts and images (debmonitor).

We should also investigate other available tools in the container space, for example one recently released is https://github.com/puppetlabs/lumogon or from CoreOS https://github.com/coreos/clair (thanks @Joe for this one). Disclaimer: I've not yet done an extensive search for other available tools ;)

Jun 21 2017, 9:43 AM · Operations-Software-Development, Operations

Jun 19 2017

Volans added a comment to T156933: Improve purging for analytics-slave data on Eventlogging.

What @jcrespo said, see also my comment on https://gerrit.wikimedia.org/r/#/c/356383/12/modules/role/files/mariadb/eventlogging_cleaner.py@206 regarding the addition of an ORDER BY.

Jun 19 2017, 11:07 AM · Patch-For-Review, User-Elukey, Analytics-Kanban

Jun 17 2017

Volans added a comment to T168142: Cleanup phabricator.wikimedia.org uploaded files, WP zero abuse.

Sorry for the late reply, partially because I was too busy cleaning stuff around to reply here (thanks Reedy for the help) and partially to avoid giving too much realtime feedback to the abusers.
Thanks to everyone here who helped by notifying us and limiting the impact whenever possible.

Jun 17 2017, 9:38 PM · Patch-For-Review, Wikimedia-Incident, Wikimedia-Site-requests, Phabricator
Volans added a comment to T159922: pdfrender fails to serve requests since Mar 8 00:30:32 UTC on scb1003.

Is there an ETA for a permanent fix? It seems to me that we've already delayed this too much given the frequency at which it's happening lately.

Jun 17 2017, 4:55 PM · Services (blocked), Reading-Web-Backlog (Tracking), Patch-For-Review, Operations, Electron-PDFs

Jun 16 2017

Volans moved T164838: Cumin: allow to specify a timeout per command from In Progress to In Code Review on the Operations-Software-Development board.
Jun 16 2017, 4:42 PM · Operations-Software-Development
Volans added a hashtag to Operations-Software-Development: #cumin.
Jun 16 2017, 10:40 AM

Jun 14 2017

Volans added a comment to T156120: Update gerrit to 2.14.2.

@Paladox FYI I'm still getting Invalid SSH Key when trying to add my key

Jun 14 2017, 1:30 PM · Release-Engineering-Team (Backlog), Patch-For-Review, Gerrit

Jun 13 2017

Volans moved T166371: Monitoring: create an alert for daemonized puppet from Backlog to Done on the Operations-Software-Development board.
Jun 13 2017, 8:27 AM · Patch-For-Review, Operations-Software-Development, Operations, monitoring

Jun 12 2017

Volans added a comment to T167504: New tool to track package updates/status for hosts and images (debmonitor).

@akosiaris yes, we were aware of it and I spoke with @Joe last week about the requirements for the Docker part; sorry for not having mentioned/referenced it here too. The idea at this point is to have a single tool that can work for both physical hosts and Docker images, so it should overlap fully with the requirements of T167269.

Jun 12 2017, 3:00 PM · Operations-Software-Development, Operations
Volans closed T167394: Cumin: fix ok_codes when set to empty list as Resolved.
Jun 12 2017, 1:43 PM · Operations-Software-Development
Volans closed T167392: Cumin: fix --success-percentage 0 as Resolved.
Jun 12 2017, 1:42 PM · Operations-Software-Development

Jun 9 2017

Volans updated subscribers of T167504: New tool to track package updates/status for hosts and images (debmonitor).
Jun 9 2017, 1:34 PM · Operations-Software-Development, Operations
Volans updated the task description for T167504: New tool to track package updates/status for hosts and images (debmonitor).
Jun 9 2017, 1:21 PM · Operations-Software-Development, Operations

Jun 8 2017

Volans created T167422: Monitoring: add link to graph for Icinga timeseries alarms.
Jun 8 2017, 2:47 PM · Operations, monitoring
Volans moved T167392: Cumin: fix --success-percentage 0 from In Progress to In Code Review on the Operations-Software-Development board.
Jun 8 2017, 10:20 AM · Operations-Software-Development
Volans moved T167394: Cumin: fix ok_codes when set to empty list from In Progress to In Code Review on the Operations-Software-Development board.
Jun 8 2017, 10:20 AM · Operations-Software-Development