
Update diamond to latest upstream version
Closed, ResolvedPublic

Description

diamond 4.0 was released in December, but we're still running 3.5.

Event Timeline

fgiunchedi raised the priority of this task from to Medium.
fgiunchedi updated the task description.
fgiunchedi added a project: observability.
fgiunchedi added a subscriber: fgiunchedi.

We run 4.0 on stretch systems nowadays. Would it be worthwhile to backport it to jessie and trusty? Anything that we're missing from 3.5?

There is a lot of development history between the two releases (https://github.com/python-diamond/Diamond/compare/v3.5...v4.0.515). I'd say it is essentially updated/improved collectors, nothing we're badly missing AFAICT.

I tried a quick backport: on jessie it works as is, on trusty it doesn't (debhelper >= 10 is required, and even after lowering that to >= 9 it still FTBFS, so more fiddling is needed there). If we really wanted to, we could go with the jessie backport and leave trusty behind. Once trusty is out of the door we can deprecate our diamond package altogether and use Debian's.

If you've backported it already, yeah, we can go forward I'd say :) We can leave trusty behind too, I don't see this as a big deal at all.

Mentioned in SAL (#wikimedia-operations) [2017-07-20T14:23:37Z] <godog> upload diamond 4.0.515-4~bpo8+1 to jessie-wikimedia - T97635

Mentioned in SAL (#wikimedia-operations) [2017-07-20T14:41:20Z] <godog> upload diamond 4.0.515-4~bpo8+2 to jessie-wikimedia - T97635

I tried it on cp1008 and a couple of thumbor machines and diamond seems to work just fine. The package is uploaded and pending rollout to jessie machines.

Mentioned in SAL (#wikimedia-operations) [2017-07-25T09:14:19Z] <godog> upgrade diamond to 4.0.515 in ulsfo and esams - T97635

Mentioned in SAL (#wikimedia-operations) [2017-07-25T09:20:57Z] <godog> upgrade diamond to 4.0.515 in codfw - T97635

Mentioned in SAL (#wikimedia-operations) [2017-07-25T12:09:05Z] <godog> upgrade diamond to 4.0.515 in eqiad - T97635

fgiunchedi renamed this task from update diamond to latest upstream version to Update diamond to latest upstream version. Jul 25 2017, 12:24 PM

It looks like diamond still takes a long time to stop. This was reported by @faidon at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=854842 and https://github.com/python-diamond/Diamond/issues/595, though it doesn't seem to be fixed (~40s on jessie, ~20s on stretch):

root@lithium:/srv/syslog# time systemctl stop diamond

real	0m40.043s
user	0m0.000s
sys	0m0.000s

root@ms-be2020:~# time systemctl stop diamond

real	0m20.070s
user	0m0.000s
sys	0m0.000s

Applying https://github.com/Ssawa/Diamond/commit/8b58d7a7dd2a1249731b0642b35e7d7cbdcf611f from the GitHub issue fixes it and stop is fast again. The patch hasn't been applied upstream yet, though.

# time systemctl start diamond && sleep 30 && time systemctl stop diamond

real	0m0.007s
user	0m0.000s
sys	0m0.000s

real	0m0.037s
user	0m0.000s
sys	0m0.000s
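
For anyone wanting to repeat the before/after comparison above across more hosts, here is a minimal sketch of the same measurement in Python. It is not part of our tooling: the helper names `time_command` and `is_slow` and the 5-second threshold are assumptions made up for illustration.

```python
import subprocess
import time


def time_command(argv):
    """Run argv and return (wall-clock seconds, exit code).

    Equivalent in spirit to the 'real' figure printed by `time`.
    """
    start = time.monotonic()
    result = subprocess.run(argv, check=False)
    return time.monotonic() - start, result.returncode


def is_slow(elapsed, threshold=5.0):
    """Return True when a stop took longer than threshold seconds.

    The 5.0 default is an arbitrary, hypothetical cut-off.
    """
    return elapsed > threshold


# Hypothetical usage (as root, on a host running diamond):
#   elapsed, code = time_command(["systemctl", "stop", "diamond"])
#   print(f"stop took {elapsed:.3f}s, slow={is_slow(elapsed)}")
```

With that threshold, the ~40s and ~20s stops measured before the patch would count as slow and the patched stop would not.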

fgiunchedi changed the task status from Open to Stalled. Jul 28 2017, 9:28 AM

The --log-stdout issue has been filed as https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=869970

As for the slow shutdown I've reopened https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=854842

Both issues (debug log and slow stop) have been band-aided in our puppet in the meantime.

Fixed for our purposes, we can follow-up on upstream's/Debian's bug reports for the long-term fixes.