Gehel (Guillaume Lederrey)
Operations Engineer - Discovery

User Details

User Since
Nov 9 2015, 9:18 PM (114 w, 2 d)
Availability
Available
IRC Nick
gehel
LDAP User
Gehel
MediaWiki User
GLederrey (WMF)

Recent Activity

Yesterday

Gehel assigned T182840: kartotherian package repo fails to build to mobrovac.

I tried to set up Docker on a labs VM to run the build process as root (neither @Pnorman nor I am very keen on launching a big pile of code that we don't understand with a sudo in front). The build still fails with the same error.

Wed, Jan 17, 8:21 PM · Services (next), service-runner, Maps-Sprint, Maps (Maps-data)

Tue, Jan 16

Gehel added a comment to T184083: Define the constraints of the new WDQS cluster.

Since there are no objections here, I'll move this on wiki and then we can close this task.

Tue, Jan 16, 6:12 PM · Discovery-Search (Current work), Discovery, Wikidata, Structured-Data-Commons, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service

Fri, Jan 12

Gehel added a comment to T166246: WDQS service on Maven does not include node_modules.

Mixing maven and npm projects from 2 different repositories in the same source tree by using git submodules is somewhat ugly, and we can't seem to agree on the right way to do it. So a different idea:

Fri, Jan 12, 8:53 AM · Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, Discovery, Wikidata, Wikidata-Query-Service

Wed, Jan 10

Gehel added a comment to T182840: kartotherian package repo fails to build.

Yep, same issue with docker 17.12.0-ce

Wed, Jan 10, 5:05 PM · Services (next), service-runner, Maps-Sprint, Maps (Maps-data)
Gehel added a comment to T182840: kartotherian package repo fails to build.

I'm on Ubuntu 17.10, with docker 1.5 (the one provided by Ubuntu, fairly old). I can try to install latest. @mobrovac seemed to have reproduced the issue and I expect him to be running something less ancient than me. Which leads me to think that upgrading might not be sufficient.

Wed, Jan 10, 4:56 PM · Services (next), service-runner, Maps-Sprint, Maps (Maps-data)
Gehel created T184617: Give access to S4 (procurement tasks) to Erika Bjune.
Wed, Jan 10, 4:30 PM · Operations, procurement

Tue, Jan 9

Gehel added a comment to T184083: Define the constraints of the new WDQS cluster.

So the technical limitations are:

Tue, Jan 9, 6:21 PM · Discovery-Search (Current work), Discovery, Wikidata, Structured-Data-Commons, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service
Gehel claimed T184083: Define the constraints of the new WDQS cluster.
Tue, Jan 9, 6:11 PM · Discovery-Search (Current work), Discovery, Wikidata, Structured-Data-Commons, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service
Gehel moved T184083: Define the constraints of the new WDQS cluster from Backlog to In progress on the Discovery-Search (Current work) board.
Tue, Jan 9, 6:10 PM · Discovery-Search (Current work), Discovery, Wikidata, Structured-Data-Commons, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service
Gehel claimed T181627: Port elasticsearch metrics to Prometheus.
Tue, Jan 9, 6:04 PM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel added a comment to T181627: Port elasticsearch metrics to Prometheus.

Upstream has released version 1.0.2 with the additional elasticsearch metrics that we need: https://github.com/justwatchcom/elasticsearch_exporter/tree/v1.0.2

Tue, Jan 9, 2:10 PM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel added a comment to T183306: wmf-elasticsearch-search-plugin not available for Stretch.

I've copied the plugins to stretch, so we should be good. I have not rebuilt the package, as it is basically only a .zip with pre-compiled .jar files. Let me know if this actually works in stretch!

Tue, Jan 9, 10:43 AM · Packaging, MediaWiki-Vagrant

Mon, Jan 8

Gehel added a comment to T182991: New WDQS clusters eqiad + codfw.

Can you perhaps briefly explain how the specs compare to the existing WDQS clusters? Because I would assume that the internal clusters will see much less traffic than the external ones.

Mon, Jan 8, 6:53 PM · hardware-requests, Discovery, Wikidata, Discovery-Wikidata-Query-Service-Sprint, Operations, Wikidata-Query-Service
Gehel added a comment to T181627: Port elasticsearch metrics to Prometheus.

Playing with jmx_exporter and elasticsearch, it looks like the metrics exposed through the elasticsearch API are already sufficient, and the naming is consistent between jmx_exporter and elasticsearch_exporter. So we can probably drop the jmx_exporter for elasticsearch.

Mon, Jan 8, 5:05 PM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel added a comment to T184434: prometheus-blazegraph-exporter failing to start after reboot.

Extract from the logs:

Mon, Jan 8, 2:28 PM · Patch-For-Review, Discovery, Wikidata, Operations, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service
Gehel created T184434: prometheus-blazegraph-exporter failing to start after reboot.
Mon, Jan 8, 2:15 PM · Patch-For-Review, Discovery, Wikidata, Operations, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service

Fri, Jan 5

Gehel committed rDPOMf5993fd70215: [maven-release-plugin] prepare for next development iteration (authored by Gehel).
Fri, Jan 5, 8:26 PM
Gehel committed rDPOM0a8c4bbcb94b: [maven-release-plugin] prepare release discovery-parent-pom-1.9 (authored by Gehel).
Fri, Jan 5, 8:26 PM
Gehel committed rDPOM6db053220a69: Update dependencies and plugins to latest stable versions (authored by Gehel).
Fri, Jan 5, 8:26 PM

Thu, Jan 4

Gehel closed T162240: Move kartotherian / tilerator configuration to scap3 as Declined.

We are now managing sources with puppet. The most important part was to get them out of the kartotherian / tilerator repo, so this is solved.

Thu, Jan 4, 7:18 PM · Maps (Kartotherian), Discovery
Gehel closed T162240: Move kartotherian / tilerator configuration to scap3, a subtask of T156682: Deploying new vector tiles to production, as Declined.
Thu, Jan 4, 7:18 PM · Maps-Sprint, Maps (Maps-data)
Gehel moved T184083: Define the constraints of the new WDQS cluster from Backlog to In progress on the Discovery-Wikidata-Query-Service-Sprint board.
Thu, Jan 4, 6:34 PM · Discovery-Search (Current work), Discovery, Wikidata, Structured-Data-Commons, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service
Gehel added a comment to T181627: Port elasticsearch metrics to Prometheus.

A .deb of prometheus jmx_exporter is now available. I started to experiment on deployment-elastic06. Elasticsearch installs a fairly strict security manager that prevents the jmx_exporter agent from working properly. Adding a /home/elasticsearch/.java.policy file with the content below solves the issue:

Thu, Jan 4, 4:54 PM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations

Wed, Jan 3

Gehel moved T178721: WDQS tests unstable with some thread leak errors from In progress to Done on the Discovery-Wikidata-Query-Service-Sprint board.

I'm not entirely confident that the above patch will solve all our issues, but it should at least solve some. We can close this task and re-open if we find other occurrences of failed integration tests.

Wed, Jan 3, 8:39 PM · Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, Discovery, Wikidata-Query-Service, Wikidata
Gehel added a comment to T166246: WDQS service on Maven does not include node_modules.

As far as I can see, at the moment, the Maven build only adds the gui files to the package, without any build step. We could probably move the gui to a maven module and use the frontend-maven-plugin to execute the required build steps, then add the resulting artifact to the packaging.

Wed, Jan 3, 5:24 PM · Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, Discovery, Wikidata, Wikidata-Query-Service
Gehel added a comment to T184083: Define the constraints of the new WDQS cluster.

Without getting into the specific numbers (which we will tune based on experience), I agree that we could (and probably should) have a short timeout, since we expect requests to be short... I'd say that in the context of this task, the important point is to specify that we expect the requests to be cheap. Whether or not we need to put a hard constraint on this is at this point an implementation detail.

Wed, Jan 3, 5:22 PM · Discovery-Search (Current work), Discovery, Wikidata, Structured-Data-Commons, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service
Gehel created T184083: Define the constraints of the new WDQS cluster.
Wed, Jan 3, 5:06 PM · Discovery-Search (Current work), Discovery, Wikidata, Structured-Data-Commons, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service
Gehel moved T178721: WDQS tests unstable with some thread leak errors from Backlog to In progress on the Discovery-Wikidata-Query-Service-Sprint board.

I did not really expect this patch to help in any way. But... digging into https://gerrit.wikimedia.org/r/#/c/399864/ I see that the HC client in WikibaseRepository isn't closed properly. This is fixed in that CR, but since it includes a bunch of other stuff as well, we might want to extract it. I'll have a look into it.

Wed, Jan 3, 9:32 AM · Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, Discovery, Wikidata-Query-Service, Wikidata
Gehel moved T181989: Icinga check for WDQS should do an actual query from Backlog to Done on the Discovery-Wikidata-Query-Service-Sprint board.

Yep, this is actually done and deployed.

Wed, Jan 3, 9:28 AM · Patch-For-Review, Operations, Discovery, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata

Sun, Dec 24

Gehel updated subscribers of T183661: Enable <mapframe> on Latvian Wikipedia.

@Fjalapeno just so you know this is going on...

Sun, Dec 24, 6:18 PM · Maps-Sprint, Patch-For-Review, Maps (Kartographer), User-Urbanecm, Wikimedia-Site-requests

Fri, Dec 22

Gehel created P6502 SonarEmptyLabelServiceOptimizer.
Fri, Dec 22, 1:34 PM
Gehel created P6501 Analysing 'SonarEmptyLabelServiceOptimizer.java'....
Fri, Dec 22, 1:34 PM

Thu, Dec 21

Gehel moved T182304: Cleanup multiple definitions of logstash endpoint in puppet / hiera from Backlog to In progress on the Discovery-Search (Current work) board.
Thu, Dec 21, 1:41 PM · Patch-For-Review, Discovery-Search (Current work), Wikimedia-Logstash, Operations
Gehel created T183455: Error while ordering a payment slip from Switzerland to do a donation.
Thu, Dec 21, 10:49 AM · Fundraising-Backlog
Gehel added a comment to T181627: Port elasticsearch metrics to Prometheus.

Short summary of the status of this task (since quite a lot of discussion happened):

Thu, Dec 21, 10:00 AM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel created T183451: Collect per node latency percentiles on our elasticsearch cirrus clusters.
Thu, Dec 21, 10:00 AM · Discovery-Search (Current work)

Tue, Dec 19

Gehel added a comment to T182857: Create Prometheus exporter for Blazegraph.

I updated the grafana dashboard as well: https://grafana-admin.wikimedia.org/dashboard/db/wikidata-query-service-prometheus?refresh=1m&orgId=1

Tue, Dec 19, 9:02 PM · Patch-For-Review, cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel added a comment to T182773: Create Prometheus exporter for wdqs-updater.

I updated the grafana dashboard as well: https://grafana-admin.wikimedia.org/dashboard/db/wikidata-query-service-prometheus?refresh=1m&orgId=1

Tue, Dec 19, 9:02 PM · Patch-For-Review, cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel added a comment to T183071: Import kibana package from jessie into stretch.

The elasticsearch / kibana / logstash packages have already been uploaded to our stretch repo (under the thirdparty/elastic55 component). This should fix the issue reported here. As I understand, @Paladox is only using vagrant to do some testing around logstash, so he probably doesn't need all our custom packages.

Tue, Dec 19, 4:19 PM · Patch-For-Review, MediaWiki-Vagrant, Operations
Gehel added a comment to T181627: Port elasticsearch metrics to Prometheus.

I'm probably doing something wrong with the ~jessie1 and ~stretch1 versions. It's JVM, right? So the same build should be good for both distros? Not sure. I've just always done that and didn't bother to check if I shouldn't.

Tue, Dec 19, 3:59 PM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel added a comment to T181627: Port elasticsearch metrics to Prometheus.

Issue opened upstream to include those metrics: https://github.com/justwatchcom/elasticsearch_exporter/issues/115

Tue, Dec 19, 3:46 PM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel added a comment to T181627: Port elasticsearch metrics to Prometheus.

Additional missing metrics:

Tue, Dec 19, 3:38 PM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel added a project to T183254: Zoom buttons not present: Maps-Sprint.

@MaxSem: That's well outside my expertise... could you have a look if you have time? Thanks!

Tue, Dec 19, 3:09 PM · Maps-Sprint, Maps (Kartographer)

Dec 18 2017

Gehel added a comment to T177225: Uninstall ganglia from the fleet.

Alright, Ganglia is purged from everything across the board, except 17 hosts now! :) They are:

4 x maps codfw (osm/postgres)
4 x maps eqiad (osm/postgres)
3 x maps-test codfw (osm/postgres)

Dec 18 2017, 5:53 PM · Patch-For-Review, Operations, monitoring
Gehel updated subscribers of T181627: Port elasticsearch metrics to Prometheus.

While migrating existing grafana dashboards, it looks like some dashboards are broken and most probably unused. We should delete them instead of taking time to migrate:

Dec 18 2017, 1:59 PM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel added a comment to T23582: Transliteration of Crimean Wiki.

I had a look at the patch above from @TJones. A few notes:

Dec 18 2017, 1:21 PM · Patch-For-Review, Wikimedia-Hackathon-2017, I18n, MediaWiki-Language-converter
Gehel moved T182583: maps-test2001 is low on disk space from In progress to Done on the Maps-Sprint board.

Reducing cassandra replication factor frees enough space that we don't have an immediate issue anymore (compaction is running without issue). The goal being to move the maps test environment to WMCS and reduce the dataset size, we should not invest more time here.

Dec 18 2017, 9:30 AM · Discovery, Maps, Operations, Maps-Sprint

Dec 15 2017

Gehel added a comment to T181627: Port elasticsearch metrics to Prometheus.

Looking at the prometheus-jmx-exporter .deb, it seems to depend on default-jre, which is openjdk-7-jre on Jessie. We use OpenJDK 8 for elasticsearch (and wdqs, and hopefully for most of our Java-based stack). Having 2 different JREs installed is a source of unnecessary pain. My understanding of Debian packaging is low enough that I don't really know how this should be fixed. Maybe depend on java7-runtime instead of default-jre, since both openjdk-7 and openjdk-8 provide it? Or remove the JRE dependency completely, since the application being monitored is responsible for depending on the appropriate JRE (this is true for the agent part of the exporter, not for the HTTP server part).

Dec 15 2017, 3:44 PM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel created T182991: New WDQS clusters eqiad + codfw.
Dec 15 2017, 2:49 PM · hardware-requests, Discovery, Wikidata, Discovery-Wikidata-Query-Service-Sprint, Operations, Wikidata-Query-Service

Dec 14 2017

Gehel added a comment to T182840: kartotherian package repo fails to build.

The source repo used is at https://gerrit.wikimedia.org/r/#/admin/projects/maps/kartotherian/package the deploy repo is https://gerrit.wikimedia.org/r/#/admin/projects/maps/kartotherian/deploy (those are not yet synced with diffusion / github, see T182848).

Dec 14 2017, 2:14 PM · Services (next), service-runner, Maps-Sprint, Maps (Maps-data)
Gehel added a comment to T181627: Port elasticsearch metrics to Prometheus.

The sum of all shard states does make some kind of sense: the sum of all states should give the total number of shards, but the total is also exposed... so not sure.
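
A minimal sketch of that consistency check, assuming the exporter serves the plain Prometheus text format without timestamps. The metric names below (es_shards_active, es_shards_initializing, es_shards_unassigned, es_shards_total) are hypothetical placeholders, not the names actually exposed by elasticsearch_exporter:

```python
# Hypothetical metric names: one gauge per shard state, plus a separate total.
STATE_METRICS = (
    "es_shards_active",
    "es_shards_initializing",
    "es_shards_unassigned",
)
TOTAL_METRIC = "es_shards_total"


def parse_samples(text):
    """Parse Prometheus text exposition into {metric_name: summed_value}.

    Assumes no trailing timestamps; labels are ignored and samples that
    share a metric name are added together.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_and_labels, _, value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]
        samples[name] = samples.get(name, 0.0) + float(value)
    return samples


def shard_totals(text):
    """Return (sum of per-state gauges, exposed total or None)."""
    samples = parse_samples(text)
    by_state = sum(samples.get(name, 0.0) for name in STATE_METRICS)
    return by_state, samples.get(TOTAL_METRIC)
```

If the two values returned by shard_totals always agree, the separately exposed total is redundant for graphing; if they differ, the states are probably not mutually exclusive.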

Dec 14 2017, 2:03 PM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel moved T182583: maps-test2001 is low on disk space from Backlog to In progress on the Maps-Sprint board.
Dec 14 2017, 12:18 PM · Discovery, Maps, Operations, Maps-Sprint
Gehel added a comment to T181627: Port elasticsearch metrics to Prometheus.

While updating the Shards graph, I found that the way shards are exposed through prometheus is not optimal. We have different metric names for different shard states:

Dec 14 2017, 10:13 AM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel added a comment to T181627: Port elasticsearch metrics to Prometheus.

Now that I understand a bit better how prometheus works, the jmx_exporter starts to be scary. If I understand correctly, it exposes all MBeans and filtering is done after the fact. We should really never use the jmx_exporter without a whitelist.
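
To illustrate the whitelist idea only, here is a small sketch that selects MBeans by name before anything is collected. It is not the jmx_exporter implementation; select_mbeans and the example object names are hypothetical, and shell-style wildcards stand in for JMX ObjectName patterns:

```python
from fnmatch import fnmatchcase


def select_mbeans(object_names, whitelist):
    """Keep only the MBean names that match at least one whitelist pattern.

    Patterns use shell-style wildcards (an assumption made for this sketch),
    so anything not explicitly listed is never queried.
    """
    return [
        name
        for name in object_names
        if any(fnmatchcase(name, pattern) for pattern in whitelist)
    ]


# Hypothetical example: only memory and GC beans are allowed through.
WHITELIST = [
    "java.lang:type=Memory",
    "java.lang:type=GarbageCollector,*",
]
```

With an empty whitelist this selects nothing, which is the safe default the comment argues for.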

Dec 14 2017, 10:06 AM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel updated subscribers of T182840: kartotherian package repo fails to build.

@MaxSem would you have any idea what's going on there?

Dec 14 2017, 8:16 AM · Services (next), service-runner, Maps-Sprint, Maps (Maps-data)
Gehel created T182848: Setup diffusion and github sync for kartotherian and tilerator package repositories.
Dec 14 2017, 7:23 AM · Repository-Admins, Maps (Tilerator), Maps-Sprint, Release-Engineering-Team

Dec 13 2017

Gehel added a comment to T181627: Port elasticsearch metrics to Prometheus.

elasticsearch_exporter is deployed on all elasticsearch nodes. Still to do:

Dec 13 2017, 6:09 PM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations

Dec 11 2017

Gehel moved T182066: Requesting access to deploy-service for pnorman from Needs review to Done on the Maps-Sprint board.
Dec 11 2017, 5:27 PM · Maps-Sprint, Patch-For-Review, Operations, Ops-Access-Requests
Gehel added a comment to T182066: Requesting access to deploy-service for pnorman.

This has been approved in Ops meeting as well.

Dec 11 2017, 5:24 PM · Maps-Sprint, Patch-For-Review, Operations, Ops-Access-Requests
Gehel closed T175799: port elasticsearch diamond collector to prometheus as Declined.

This is a duplicate of T181627

Dec 11 2017, 5:09 PM · Discovery-Search (Current work), monitoring, Operations
Gehel closed T175799: port elasticsearch diamond collector to prometheus, a subtask of T177196: Port non-deprecated Diamond collectors to Prometheus, as Declined.
Dec 11 2017, 5:09 PM · cloud-services-team (Kanban), Patch-For-Review, User-fgiunchedi, Goal, Operations
Gehel added a project to T181627: Port elasticsearch metrics to Prometheus: Discovery-Search (Current work).
Dec 11 2017, 5:08 PM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel added a comment to T181627: Port elasticsearch metrics to Prometheus.

I tried jmx_exporter on deployment-logstash2 with the results below. A few notes: the exporter config needs to be somewhere accessible by elasticsearch (e.g. /srv/elasticsearch) or asking for metrics fails with something like [...]

Dec 11 2017, 2:53 PM · Patch-For-Review, Discovery-Search (Current work), cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
Gehel added a comment to T182583: maps-test2001 is low on disk space.

It looks like Cassandra does not have enough space to do compaction:

Dec 11 2017, 2:18 PM · Discovery, Maps, Operations, Maps-Sprint
Gehel added a comment to T182583: maps-test2001 is low on disk space.

@MaxSem good point! I'll check

Dec 11 2017, 2:07 PM · Discovery, Maps, Operations, Maps-Sprint
Gehel created T182583: maps-test2001 is low on disk space.
Dec 11 2017, 2:01 PM · Discovery, Maps, Operations, Maps-Sprint
Gehel added a comment to T182564: npm warnings when installing Kartotherian dependencies.

Oh yes, they could use some maintenance! Note that maps/tilerator is just as bad. Also note that changes happen upstream on github (http://github.com/kartotherian/), so it might make sense to open issues there.

Dec 11 2017, 10:37 AM · Discovery, Maps (Kartotherian)

Dec 7 2017

Gehel moved T108435: Add proper expiry headers to kartotherian's responses from Prioritized to Needs review on the Maps-Sprint board.
Dec 7 2017, 7:58 PM · Maps (Kartotherian), Discovery, Maps-Sprint
Gehel created T182304: Cleanup multiple definitions of logstash endpoint in puppet / hiera.
Dec 7 2017, 10:34 AM · Patch-For-Review, Discovery-Search (Current work), Wikimedia-Logstash, Operations
Gehel added a comment to T181531: Upgrade Toolforge Elasticsearch to 5.5.x.

@bd808 all other elasticsearch clusters have been upgraded. Let me know if you need any help on that one. Once you are done, feel free to close T174662 as well.

Dec 7 2017, 10:04 AM · cloud-services-team (Kanban), Toolforge, Discovery-Search, Elasticsearch, Discovery
Gehel moved T178412: Upgrade logstash cluster to elastic 5.5.x from In progress to Done on the Discovery-Search (Current work) board.
Dec 7 2017, 10:02 AM · Patch-For-Review, Discovery-Search (Current work), CirrusSearch, Elasticsearch, Discovery

Dec 6 2017

Gehel removed projects from T175830: decommission logstash100[1-3]: Patch-For-Review, Discovery-Search (Current work).
Dec 6 2017, 5:23 PM · hardware-requests, Wikimedia-Logstash, Operations
Gehel reassigned T175830: decommission logstash100[1-3] from Gehel to RobH.

My steps for decommissioning are done (see checklist in the task description). Assigning to @RobH to continue.

Dec 6 2017, 9:56 AM · hardware-requests, Wikimedia-Logstash, Operations
Gehel updated the task description for T175830: decommission logstash100[1-3].
Dec 6 2017, 9:55 AM · hardware-requests, Wikimedia-Logstash, Operations

Dec 5 2017

Gehel moved T160639: Package ClearTables to deploy it on maps servers from Prioritized to Done on the Maps-Sprint board.

In the end, the solution chosen is to deploy ClearTables using npm. Work is tracked on T160639. We can close this as a duplicate.

Dec 5 2017, 5:26 PM · Maps-Sprint, Maps (Maps-data)
Gehel renamed T162241: Deploy meddo / cleartables as part of tilerator / kartotherian from Deploy meddo as part of tilerator / kartotherian to Deploy meddo / cleartables as part of tilerator / kartotherian.
Dec 5 2017, 5:25 PM · Services (watching), Patch-For-Review, Maps-Sprint, Maps (Maps-data)
Gehel updated subscribers of T180907: <mapframe>, Kartographer: adding zoom level 19.

https://gerrit.wikimedia.org/r/#/c/394948/ should do the trick and enable zoom level 19. Before merging it, I'd like to:

Dec 5 2017, 5:23 PM · Maps-Sprint, Patch-For-Review, Maps (Kartographer), Discovery
Gehel added a comment to T162241: Deploy meddo / cleartables as part of tilerator / kartotherian.

kartotherian has a new working packaging repo. tilerator is in progress. Still to do:

Dec 5 2017, 5:20 PM · Services (watching), Patch-For-Review, Maps-Sprint, Maps (Maps-data)
Gehel moved T181808: Kartotherian error with new deployment: "core.strToFloat is not a function" from In progress to Done on the Maps-Sprint board.

With @Pnorman's fix to kartotherian/snapshot, this seems to just work.

Dec 5 2017, 5:19 PM · Maps (Kartotherian), Maps-Sprint
Gehel updated the task description for T175830: decommission logstash100[1-3].
Dec 5 2017, 4:14 PM · hardware-requests, Wikimedia-Logstash, Operations
Gehel added a project to T175830: decommission logstash100[1-3]: hardware-requests.
Dec 5 2017, 4:11 PM · hardware-requests, Wikimedia-Logstash, Operations
Gehel moved T175830: decommission logstash100[1-3] from Backlog to In progress on the Discovery-Search (Current work) board.
Dec 5 2017, 4:08 PM · hardware-requests, Wikimedia-Logstash, Operations
Gehel moved T181412: HP RAID Battery issue on elastic2004 from In progress to Done on the Discovery-Search (Current work) board.

Maintenance has been done, icinga check is green again. We can close this.

Dec 5 2017, 2:12 PM · Discovery-Search (Current work), Elasticsearch, Operations, ops-codfw, Discovery
Gehel moved T182066: Requesting access to deploy-service for pnorman from Backlog to Needs review on the Maps-Sprint board.
Dec 5 2017, 10:10 AM · Maps-Sprint, Patch-For-Review, Operations, Ops-Access-Requests
Gehel added a project to T182066: Requesting access to deploy-service for pnorman: Maps-Sprint.
Dec 5 2017, 10:10 AM · Maps-Sprint, Patch-For-Review, Operations, Ops-Access-Requests
fgiunchedi awarded T175242: all log producers need to use the logstash LVS endpoint a Mountain of Wealth token.
Dec 5 2017, 9:43 AM · Patch-For-Review, Operations, Discovery-Search (Current work), Wikimedia-Logstash
Gehel updated the task description for T182066: Requesting access to deploy-service for pnorman.
Dec 5 2017, 9:42 AM · Maps-Sprint, Patch-For-Review, Operations, Ops-Access-Requests
Gehel created T182066: Requesting access to deploy-service for pnorman.
Dec 5 2017, 9:42 AM · Maps-Sprint, Patch-For-Review, Operations, Ops-Access-Requests

Dec 4 2017

Gehel moved T175242: all log producers need to use the logstash LVS endpoint from Needs review to Done on the Discovery-Search (Current work) board.

Monitoring traffic for a few hours on logstash100[123] shows that nothing is coming into any of the logstash ports. Thanks to everyone who helped this move forward!

Dec 4 2017, 6:40 PM · Patch-For-Review, Operations, Discovery-Search (Current work), Wikimedia-Logstash
Gehel added a comment to T181808: Kartotherian error with new deployment: "core.strToFloat is not a function".

@MaxSem my assumption is that because it was already there, we should keep it. That assumption might be wrong, but my understanding of deploying nodejs apps is severely limited :(

Dec 4 2017, 6:26 PM · Maps (Kartotherian), Maps-Sprint
Gehel added a comment to T181479: Requesting access to terbium/wasat for Trey Jones.

Has been approved during weekly Ops meeting

Dec 4 2017, 5:38 PM · Patch-For-Review, Ops-Access-Requests, Operations
Gehel added a comment to T181540: Enable wdqs-admin's to control nginx.

Has been approved during weekly Ops meeting

Dec 4 2017, 5:38 PM · Patch-For-Review, Discovery, Wikidata, Operations, Ops-Access-Requests, Wikidata-Query-Service
Gehel moved T175242: all log producers need to use the logstash LVS endpoint from In progress to Needs review on the Discovery-Search (Current work) board.

All references to logstash100[123] have been removed from puppet. I'll still check that no traffic is coming to those servers (we might have something outside of puppet) and start decommissioning the servers.

Dec 4 2017, 2:46 PM · Patch-For-Review, Operations, Discovery-Search (Current work), Wikimedia-Logstash
Gehel created T181989: Icinga check for WDQS should do an actual query.
Dec 4 2017, 1:28 PM · Patch-For-Review, Operations, Discovery, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata
Gehel updated the task description for T181988: Investigate and improve memory allocation rates of WDQS.
Dec 4 2017, 1:09 PM · Discovery, Wikidata, Operations, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service
Gehel created T181988: Investigate and improve memory allocation rates of WDQS.
Dec 4 2017, 1:08 PM · Discovery, Wikidata, Operations, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service
Gehel added a comment to T175919: investigate GC times on wikidata query service.

I am mostly happy with the current GC options. It would make sense to move those back from puppet to the wdqs code base, so that they can be reused by other deployments of wdqs.

Dec 4 2017, 1:04 PM · Patch-For-Review, Discovery, Wikidata-Query-Service, Wikidata
Gehel added a comment to T181808: Kartotherian error with new deployment: "core.strToFloat is not a function".

Re-generating shrinkwrap got rid of the dubious kartotherian-core 0.0.18, but not of the error. 0.0.23 is now deployed. Checking the upstream repo, it looks like the latest version deployed on npmjs is 0.1.1, which is not tagged in the repo. It looks like we still have a mess of incompatible dependency versions.

Dec 4 2017, 12:57 PM · Maps (Kartotherian), Maps-Sprint
Gehel moved T181808: Kartotherian error with new deployment: "core.strToFloat is not a function" from Backlog to In progress on the Maps-Sprint board.

Building with ./server.js build --deploy-repo --force --reshrinkwrap seems to do the trick.

Dec 4 2017, 12:40 PM · Maps (Kartotherian), Maps-Sprint
Gehel added a comment to T178721: WDQS tests unstable with some thread leak errors.

No idea what's going on here. I have seen the same error a few times as well. It looks like the leaked thread is an HttpClient, so my guess is that in some cases RdfRepository.httpClient isn't closed. But I can't understand in which code path this could happen. Looking deeper into HttpClient itself, it looks like HttpClient.executor is shut down correctly (as a LifeCycle bean). The executor itself is a QueuedThreadPool, which has some non-trivial logic to poison the threads and wait for completion. Increasing the log level on org.eclipse.jetty.util.thread might give us some insight into what is happening there.

Dec 4 2017, 10:46 AM · Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, Discovery, Wikidata-Query-Service, Wikidata