colors for icinga-wm as well. wikibugs has them. so yay
Oct 21 2015
In T84163#1647247, @Dzahn wrote: I scheduled a downtime for this service of 1 month with a link to this ticket.
on neon:
@faidon this must have been a long time ago, right? Can you confirm this should be closed?
In T94896#1567576, @BBlack wrote: Well this basically got solved along the way while doing other things. ... I think we can go ahead and close this ticket in favor of a possible new one about eventually looking at the specific upload problem.
this sounds like something for check_graphite, right? YuvOri?
Thank you!
can we first see with cipher_list which are available?
In T111654#1737846, @jcrespo wrote:
- Recommended cipher and key length (I suppose 2048), that we use for other production services (I assume ssl_cipher=TLSv1.2,
or anyone, can you fix https://gerrit.wikimedia.org/r/#/c/247760/2/modules/icinga/manifests/gsbmonitoring.pp with different check_http options? I tried -f follow and -f sticky, among other things, but did not find a solution.
i'll still say priority normal since this is broken monitoring (due to Google changing things on their side), not actual alarms that our sites have a problem
"The API key format has changed. API keys are now managed in the Google Developers Console,"
12:19 < mutante> papaul: are all the cisco servers shut down?
12:20 < papaul> no
12:20 < papaul> there are stay up
12:20 < papaul> doing the wipe
12:20 < mutante> but you dont need mgmt to do that?
12:20 < papaul> no
12:20 < papaul> i don't
12:20 < mutante> alright,ok
should this use the real API ?
In T34796#1737605, @hashar wrote: Do we really care of having status.wikimedia.org to be served over TLS?
still not really working after switch to https, won't find the string
@EBernhardson Thank you for checking that. Ok, this was also confirmed by Otto on the gerrit change.
Change 247480 merged by Dzahn:
Removed mgmt DNS for virt20[0-1][1-9], pc200[1-3], labsdb200[1-3] and WMF5709
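For anyone who wants the concrete host names behind the bracket notation in that commit message, here is a minimal sketch. It assumes each `[a-b]` group is a single-digit character range read literally; `expand_hosts` is a hypothetical helper and not part of any Wikimedia tooling.

```python
import itertools
import re


def expand_hosts(pattern):
    """Expand single-digit bracket ranges, e.g. 'pc200[1-3]', into host names."""
    # re.split keeps the two captured digits, so tokens look like
    # [literal, lo, hi, literal, lo, hi, ..., literal].
    tokens = re.split(r"\[(\d)-(\d)\]", pattern)
    choices = []
    for i in range(0, len(tokens), 3):
        choices.append([tokens[i]])  # literal text between ranges
        if i + 2 < len(tokens):
            lo, hi = int(tokens[i + 1]), int(tokens[i + 2])
            choices.append([str(d) for d in range(lo, hi + 1)])
    return ["".join(combo) for combo in itertools.product(*choices)]


if __name__ == "__main__":
    for p in ["virt20[0-1][1-9]", "pc200[1-3]", "labsdb200[1-3]", "WMF5709"]:
        print(p, "->", expand_hosts(p))
```

Read literally, `virt20[0-1][1-9]` skips names ending in 0 (such as virt2010), so check against the actual DNS change if that matters.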
Oct 20 2015
Change 247760 merged by Dzahn:
Switch safe browsing checks to HTTPS
Change 247760 had a related patch set uploaded (by MaxSem):
Switch safe browsing checks to HTTPS
Racked, cabled and ILO setup. DNS completed
dumps: https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=dataset1001&service=HTTPS
OTRS: https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=iodine&service=HTTPS
lists: https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=fermium&service=HTTPS
icinga: https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=neon&service=HTTPS
gerrit: https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=ytterbium&service=HTTPS
RT: https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=magnesium&service=HTTPS
planet: https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=planet1001&service=HTTPS (BROKEN, this needs to be on the cp boxes, special case!)
librenms: https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=netmon1001&service=HTTPS
Change 247754 merged by Dzahn:
Update safe browsing checks
Change 247754 had a related patch set uploaded (by MaxSem):
Update safe browsing checks
16:00 < icinga-wm> CUSTOM - Host google is UP: PING OK - Packet loss = 0%, RTA = 9.61 ms
Definitely not varnish!
Change 247744 merged by Dzahn:
planet: add ssl cert expiry check
Change 247744 had a related patch set uploaded (by Dzahn):
planet: add ssl cert expiry check
Change 244617 merged by Dzahn:
dumps: add cert expiry check
In T106517#1722718, @Tgr wrote: See [[ https://github.com/wikimedia/mediawiki/blob/a2d6ecc4539e60501803155990ec36575bdb4332/includes/filerepo/FileRepo.php#L1764 | FileRepo::nameForThumb() ]] for how the thumbnail file name (the part after the /) is generated. IIRC abbrvThreshold is 200 for Wikimedia sites.
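A rough Python sketch of the abbreviation rule as I understand it from the linked PHP: names longer than the threshold are replaced by `thumbnail.<ext>`. The function name `name_for_thumb` and the default of 200 (taken from the "IIRC" above) are assumptions; the linked FileRepo.php is the authoritative version.

```python
def name_for_thumb(name, abbrv_threshold=200):
    """Approximate the thumbnail-name abbreviation rule (illustrative only)."""
    # PHP's strlen() counts bytes, so measure the UTF-8 encoding, not characters.
    if len(name.encode("utf-8")) > abbrv_threshold:
        dot = name.rfind(".")
        # Assumption: the extension is whatever follows the last dot, lowercased.
        ext = name[dot + 1:].lower() if dot > 0 else ""
        return "thumbnail." + ext if ext else "thumbnail"
    return name


if __name__ == "__main__":
    print(name_for_thumb("Example.jpg"))
    print(name_for_thumb("A" * 250 + ".JPG"))
```

The second call shows why very long file names end up with a thumbnail path that no longer contains the original name.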
Change 247721 had a related patch set uploaded (by Thcipriani):
Remove trebuchet user from wikidev group
progress tracking on etherpad now:
In T81030#1739822, @demon wrote: Apache syslog error rate, MW debug log error rates, HHVM error rates and OOMs all tracked via this dashboard now.
In T115760#1739925, @thcipriani wrote: It seems like the Right Thing™ would be to make wikidev the primary group for the trebuchet user.
So, currently, it doesn't matter whether the trebuchet user is in the wikidev group; this has only been the case since commit acfeeefb landed.
and let's also have meta monitoring. icinga itself should have a working cert :)
Apache syslog error rate, MW debug log error rates, HHVM error rates and OOMs all tracked via this dashboard now.
Also, for the record we are now talking about beaconImpressions files, not bannerImpressions. E.g. /archive/banner_logs/2015/beaconImpressions-sampled10.tsv-20151020-184501.log.gz
Confirmed that the campaign is intact. All the pipeline does is store URLs in a file; the banner impression loader job is what's responsible for parsing these URLs and importing them into the database.
Change 244436 had a related patch set uploaded (by Yurik):
maps: Add tileratorui service
Change 244614 merged by Dzahn:
icinga: add cert expiry check for icinga itself
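To illustrate what a cert expiry check does in principle, here is a minimal Python sketch. It is not the Icinga/check_http configuration used in these changes; the function names and the 30/14 day thresholds are assumptions for illustration.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now):
    """Days between `now` (epoch seconds) and a cert's notAfter string."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def status(days_left, warn_days=30, crit_days=14):
    """Map remaining days to a Nagios-style state (thresholds are hypothetical)."""
    if days_left < crit_days:
        return "CRITICAL"
    if days_left < warn_days:
        return "WARNING"
    return "OK"


def fetch_not_after(host, port=443, timeout=10.0):
    """Fetch the notAfter field of the certificate presented by host:port."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


if __name__ == "__main__":
    host = "icinga.wikimedia.org"
    left = days_until_expiry(fetch_not_after(host), time.time())
    print(f"{status(left)} - certificate for {host} expires in {left:.0f} days")
```

Note that with the default context an already-expired certificate fails the handshake instead of returning a date, so a real check would need to treat that error as CRITICAL too.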
Change 247613 merged by Andrew Bogott:
Logstash: track apache2 syslog error rate in statsd
pc200X is not in use and is pending replacement. I didn't even know labsdb200X existed.
Change 247615 had a related patch set uploaded (by Jcrespo):
Enabling performance schema experimentally on db1018
fe-0/0/5 up up Transit: <! Equinix OOB {#?} [100Mbps Cu]
Change 247613 had a related patch set uploaded (by Chad):
Logstash: track apache2 syslog error rate in statsd