
Yesterday

elukey committed rOSDE951d14ac8668: Add metrics related to number of queries to Broker and Historicals (authored by elukey).
Add metrics related to number of queries to Broker and Historicals
Fri, Aug 16, 6:01 PM
elukey committed rOSDE0be5f541c4db: Fix README (authored by elukey).
Fix README
Fri, Aug 16, 6:01 PM
elukey added a comment to T226035: Dropping data from druid takes down aqs hosts .

Today I added some changes to the prometheus-druid-exporter:

Fri, Aug 16, 5:09 PM · Patch-For-Review, User-Elukey, Analytics-Kanban, Analytics
elukey reopened T226035: Dropping data from druid takes down aqs hosts as "Open".

Very interesting: yesterday the systemd timer dropped old snapshots and the AQS alert fired, but only briefly before it self-recovered. This should confirm the theory that our fix in https://gerrit.wikimedia.org/r/519181 is the right path to follow. Re-opening to see if anything can be done to prevent any alarm from firing at all.

Fri, Aug 16, 10:54 AM · Patch-For-Review, User-Elukey, Analytics-Kanban, Analytics

Wed, Aug 14

elukey updated subscribers of T225005: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345].

Completely ignorant about it, I'd loop in @jijiki :)

Wed, Aug 14, 4:26 PM · Patch-For-Review, Services (watching), Core Platform Team Legacy (Watching / External), Analytics, User-herron, Operations
elukey added a comment to T225005: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345].

Andrew is on holidays, but it looks good to me!

Wed, Aug 14, 3:52 PM · Patch-For-Review, Services (watching), Core Platform Team Legacy (Watching / External), Analytics, User-herron, Operations
elukey added a comment to T229682: Add more dimensions to netflow's druid ingestion specs.

Indexed one day (Aug 1st) of data in https://turnilo.wikimedia.org/#test_elukey_wmf_netflow with the new dimensions. The size looks good, will do other checks with Marcel. @ayounsi can you check if everything is there?

Wed, Aug 14, 2:54 PM · Analytics-Kanban, Analytics
elukey set the point value for T230136: Tune Wikistats 2 Varnish caching to 5.
Wed, Aug 14, 12:48 PM · Patch-For-Review, Analytics-Kanban, Analytics
elukey moved T230136: Tune Wikistats 2 Varnish caching from In Progress to Done on the Analytics-Kanban board.
Wed, Aug 14, 12:48 PM · Patch-For-Review, Analytics-Kanban, Analytics
elukey added a comment to T230136: Tune Wikistats 2 Varnish caching.
:~  curl -I https://stats.wikimedia.org/v2 -s | grep cache-control
:~  curl -I https://stats.wikimedia.org/v2/ -s | grep cache-control
cache-control: max-age=10
:~  curl -I https://stats.wikimedia.org/v2/#/it.wikipedia.org -s| grep cache-control
cache-control: max-age=10
:~ curl -I https://stats.wikimedia.org/v2/main.bundle.6bb1aa806f695a0bf1c1.css -s | grep cache-control
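The header checks above could be wrapped in a small helper. This is only a sketch and not part of the original comment: the function name `max_age` is hypothetical, and it assumes the response headers arrive on standard input in the form `curl -sI` prints them.

```shell
#!/bin/sh
# Hypothetical helper: print the max-age value (in seconds) found in the
# cache-control header of an HTTP response read from standard input.
# Prints nothing if the header or the max-age directive is missing.
max_age() {
    # Strip carriage returns, match the header name case-insensitively,
    # then keep only the digits that follow "max-age=".
    tr -d '\r' \
        | grep -i '^cache-control:' \
        | sed -n 's/.*max-age=\([0-9][0-9]*\).*/\1/p' \
        | head -n 1
}

# Usage (performs a network request, so it is left commented out):
# curl -sI https://stats.wikimedia.org/v2/ | max_age
```

Note that curl does not send the `#/it.wikipedia.org` fragment to the server, so that request is equivalent to fetching `/v2/` and is expected to return the same header.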
Wed, Aug 14, 12:47 PM · Patch-For-Review, Analytics-Kanban, Analytics
elukey edited projects for T229863: Refactor EventBus mediawiki configuration, added: Analytics; removed Analytics-Kanban.
Wed, Aug 14, 10:17 AM · Patch-For-Review, Analytics, CPT Initiatives (Modern Event Platform (TEC2)), Core Platform Team Workboards (Clinic Duty Team), Analytics-EventLogging, EventBus
elukey edited projects for T230049: Delayed jobs fail validation in eventgate, added: Analytics; removed Analytics-Kanban.
Wed, Aug 14, 10:17 AM · Analytics, Core Platform Team Workboards (Clinic Duty Team), CPT Initiatives (Modern Event Platform (TEC2)), MediaWiki-JobQueue, Analytics-EventLogging, EventBus
elukey moved T230416: Upgrade superset to 0.34 from Next Up to In Progress on the Analytics-Kanban board.
Wed, Aug 14, 10:16 AM · Analytics-Kanban
elukey moved T225297: Create a spicerack recipe to reboot the hadoop worker nodes from Next Up to In Progress on the Analytics-Kanban board.
Wed, Aug 14, 10:16 AM · Analytics-Kanban, Patch-For-Review, Analytics, User-Elukey
elukey added a project to T225297: Create a spicerack recipe to reboot the hadoop worker nodes: Analytics-Kanban.
Wed, Aug 14, 10:16 AM · Analytics-Kanban, Patch-For-Review, Analytics, User-Elukey

Tue, Aug 13

elukey closed T230242: Membership to 'wmf' LDAP group request for Connie Chen as Resolved.

Added the user to superset! :)

Tue, Aug 13, 5:11 PM · LDAP-Access-Requests, Operations
elukey added a comment to T203132: Streamline Superset signup and authentication.

I am testing Superset 0.34 in https://phabricator.wikimedia.org/T230416#5411856. The new version contains the fix for Flask-AppBuilder (https://github.com/dpgaspar/Flask-AppBuilder/issues/965), which enables the auto-creation of the superset user without weird errors. The caveat is that the email of the created user will be $uid@email.notfound, not the wikimedia one. I personally think that this is not a problem in our current use case; adding more code to FAB to implement an LDAP query might be overkill.

Tue, Aug 13, 2:56 PM · Analytics, Contributors-Analysis, Product-Analytics
elukey updated the task description for T211706: Superset Updates .
Tue, Aug 13, 2:32 PM · Better Use Of Data, Analytics-Kanban, Product-Analytics
elukey added a comment to T230416: Upgrade superset to 0.34.

Created https://gerrit.wikimedia.org/r/#/c/529936/ and deployed to an-tool1005, you can easily test it via:

Tue, Aug 13, 2:30 PM · Analytics-Kanban
elukey created T230416: Upgrade superset to 0.34.
Tue, Aug 13, 2:23 PM · Analytics-Kanban

Mon, Aug 12

elukey claimed T230136: Tune Wikistats 2 Varnish caching.
Mon, Aug 12, 4:09 PM · Patch-For-Review, Analytics-Kanban, Analytics

Fri, Aug 9

elukey added a comment to T229347: Rebuild spark2 for Debian Buster.

Interesting issue: python3-tk seems to require python3.5, forcing apt to uninstall python3.7 and libpython3.7, which puppet then tries to add back. From git blame we added the package because:

Fri, Aug 9, 4:45 PM · Analytics-Kanban, Analytics
elukey closed T226466: Move the puppet cdh and zookeeper submodules into operations/puppet as Resolved.
Fri, Aug 9, 3:50 PM · Analytics
elukey closed T226474: Archive cdh puppet submodule, a subtask of T226466: Move the puppet cdh and zookeeper submodules into operations/puppet, as Resolved.
Fri, Aug 9, 3:49 PM · Analytics
elukey closed T226474: Archive cdh puppet submodule as Resolved.
Fri, Aug 9, 3:49 PM · Analytics-Kanban, Cleanup, Operations, Analytics
elukey closed T227164: Archive zookeeper puppet submodule, a subtask of T226466: Move the puppet cdh and zookeeper submodules into operations/puppet, as Resolved.
Fri, Aug 9, 3:48 PM · Analytics
elukey closed T227164: Archive zookeeper puppet submodule as Resolved.
Fri, Aug 9, 3:48 PM · Patch-For-Review, Analytics-Kanban, Operations, Cleanup, Analytics
elukey updated the task description for T227164: Archive zookeeper puppet submodule.
Fri, Aug 9, 3:48 PM · Patch-For-Review, Analytics-Kanban, Operations, Cleanup, Analytics
elukey added a comment to T230206: Remove nginx submodule from puppet.

When I folded the Analytics modules I used a procedure suggested by Joe; here is an example:

Fri, Aug 9, 3:47 PM · User-jijiki, Operations, puppet-compiler

Thu, Aug 8

elukey added a comment to T230004: BGP session down for AS 20485 on cr2-esams.

Sent an email to their NOC, going to wait for an answer before closing.

Thu, Aug 8, 5:27 PM · Operations, netops
elukey claimed T230004: BGP session down for AS 20485 on cr2-esams.
Thu, Aug 8, 5:27 PM · Operations, netops
elukey added a comment to T230005: BGP session down for AS4739 on cr4-ulsfo.

Sent an email just now to their NOC, will wait for the answer before closing.

Thu, Aug 8, 5:26 PM · netops, Operations
elukey claimed T230005: BGP session down for AS4739 on cr4-ulsfo.
Thu, Aug 8, 5:26 PM · netops, Operations
elukey added a comment to T229682: Add more dimensions to netflow's druid ingestion specs.

Marcel and I discussed this: we already index IPs in webrequest_sampled_128, so it shouldn't be a huge problem, but we'll have to try with one day of data first (as a one-off) to be sure before proceeding :)

Thu, Aug 8, 1:55 PM · Analytics-Kanban, Analytics

Wed, Aug 7

elukey closed T230022: Create a cookbook to restart the jvms on a Cassandra cluster, a subtask of T203943: Spicerack cookbooks TODO list, as Resolved.
Wed, Aug 7, 1:25 PM · SRE-tools, User-jijiki, User-Joe, Operations
elukey closed T230022: Create a cookbook to restart the jvms on a Cassandra cluster as Resolved.
Wed, Aug 7, 1:25 PM · SRE-tools, Operations
elukey added a comment to T230022: Create a cookbook to restart the jvms on a Cassandra cluster.

Really nice! AQS is not supported and I wasn't aware :P

Wed, Aug 7, 1:25 PM · SRE-tools, Operations
elukey created T230022: Create a cookbook to restart the jvms on a Cassandra cluster.
Wed, Aug 7, 1:19 PM · SRE-tools, Operations
elukey added a comment to T226698: Allow all Analytics tools to work with Kerberos auth.

After some tests to make Refine work with Kerberos in T228291 we decided to leave RPC encryption and authentication disabled for Spark. Spark 2.3.x doesn't work well with local mode and authentication (https://issues.apache.org/jira/browse/SPARK-23476), so we'll enable both features after the Kerberos and Spark 2.4.x deployment.

Wed, Aug 7, 12:44 PM · Patch-For-Review, Analytics-Kanban, User-Elukey, Analytics
elukey moved T226089: Make the Kerberos infrastructure production ready from Next Up to In Progress on the Analytics-Kanban board.
Wed, Aug 7, 12:42 PM · Patch-For-Review, Analytics-Kanban, User-Elukey, Analytics
elukey added a project to T226089: Make the Kerberos infrastructure production ready: Analytics-Kanban.
Wed, Aug 7, 12:42 PM · Patch-For-Review, Analytics-Kanban, User-Elukey, Analytics
elukey added a comment to T226089: Make the Kerberos infrastructure production ready.

I tried to use kdb5_util dump on kerberos1001; the resulting file was 24K. It might be worth avoiding Bacula and having a simple rsync on the KDC slave that copies dumps periodically. As far as I understand, replicating from master to slave via krepl is not sufficient: if the master's database gets corrupted or inconsistent, the problem might get propagated before an admin can act. Keeping dumps of the database gives us a periodic (hopefully working) backup.
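The periodic dump idea could look roughly like the sketch below. Only `kdb5_util dump` comes from the comment; the directory, the retention period and the function names (`DUMP_DIR`, `KEEP_DAYS`, `dump_path`, `backup_kdc`) are assumptions made for illustration, and the script assumes it runs with enough privileges on the KDC.

```shell
#!/bin/sh
# Hypothetical location and retention for the database dumps.
DUMP_DIR="${DUMP_DIR:-/srv/kerberos/dumps}"
KEEP_DAYS="${KEEP_DAYS:-14}"

# Build the dump file name for a given date stamp (YYYYMMDD).
dump_path() {
    printf '%s/kdc-dump-%s\n' "$DUMP_DIR" "$1"
}

# Dump the KDC database to a dated file, then delete dumps
# (and their companion files) older than KEEP_DAYS days.
backup_kdc() {
    target="$(dump_path "$(date +%Y%m%d)")"
    mkdir -p "$DUMP_DIR" || return 1
    kdb5_util dump "$target" || return 1
    find "$DUMP_DIR" -name 'kdc-dump-*' -mtime +"$KEEP_DAYS" -delete
}

# A periodic job on the other KDC host could then copy the directory,
# for example (host name left as a placeholder):
# rsync -a <kdc-host>:/srv/kerberos/dumps/ /srv/kerberos/dumps/
```

Keeping several dated dumps, rather than overwriting one file, is what protects against a corrupted database being copied over the only good backup.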

Wed, Aug 7, 10:28 AM · Patch-For-Review, Analytics-Kanban, User-Elukey, Analytics
elukey triaged T230005: BGP session down for AS4739 on cr4-ulsfo as Normal priority.
Wed, Aug 7, 10:16 AM · netops, Operations
elukey triaged T230004: BGP session down for AS 20485 on cr2-esams as Normal priority.
Wed, Aug 7, 10:04 AM · Operations, netops
elukey added a comment to T226089: Make the Kerberos infrastructure production ready.

Very interesting reading: https://www.tldp.org/HOWTO/Kerberos-Infrastructure-HOWTO/server-replication.html

Wed, Aug 7, 9:41 AM · Patch-For-Review, Analytics-Kanban, User-Elukey, Analytics
elukey added a comment to T227288: eqiad: 1 misc node for the Kerberos KDC service.

Looks good to me (followed up only on the codfw task). Can we get them repurposed?

Wed, Aug 7, 9:00 AM · hardware-requests, Operations, User-Elukey, Analytics
elukey added a comment to T229357: Remove logster from cp* hosts.

From my point of view logster etc.. on the cp hosts can be removed!

Wed, Aug 7, 8:05 AM · Patch-For-Review, Operations, observability
elukey added a comment to T151304: tmpreaper possible race condition.

So that suggests that it's probably wise to keep using it on cloud VMs, as long as it still works. That said, I'm not sure that we couldn't just > /dev/null the cron job.

Wed, Aug 7, 6:35 AM · serviceops, Operations
elukey awarded T229347: Rebuild spark2 for Debian Buster a Yellow Medal token.
Wed, Aug 7, 6:18 AM · Analytics-Kanban, Analytics

Tue, Aug 6

elukey moved T227860: TLS certificates for Analytics origin servers from In Progress to Done on the Analytics-Kanban board.
Tue, Aug 6, 3:06 PM · Analytics-Kanban, User-Elukey, Operations, Analytics, Traffic
elukey updated the task description for T227860: TLS certificates for Analytics origin servers.
Tue, Aug 6, 3:05 PM · Analytics-Kanban, User-Elukey, Operations, Analytics, Traffic
elukey added a comment to T228291: Refine should accept principal name for hive2 jdbc connection for DDL.

The last failure is expected, since in the testing cluster the shuffler wants authentication via spark RPC native or SASL, and if disabled it should fail.

Tue, Aug 6, 2:10 PM · Patch-For-Review, Analytics-Kanban, Analytics
elukey added a comment to T229347: Rebuild spark2 for Debian Buster.

@Ottomata there is a caveat though: all the Python libraries are only available in the version shipped by Debian, so changing the version of the interpreter might be a problem. We'll need to test!

Tue, Aug 6, 1:29 PM · Analytics-Kanban, Analytics
elukey added a comment to T229682: Add more dimensions to netflow's druid ingestion specs.

Current config:

Tue, Aug 6, 1:02 PM · Analytics-Kanban, Analytics
elukey moved T229003: Roll restart all openjdk-8 jvms in Analytics from In Progress to Done on the Analytics-Kanban board.
Tue, Aug 6, 12:53 PM · Analytics-Kanban, Analytics
elukey updated the task description for T229003: Roll restart all openjdk-8 jvms in Analytics.
Tue, Aug 6, 12:53 PM · Analytics-Kanban, Analytics
elukey closed T203963: Convert makevm to spicerack cookbook, a subtask of T203943: Spicerack cookbooks TODO list, as Resolved.
Tue, Aug 6, 12:46 PM · SRE-tools, User-jijiki, User-Joe, Operations
elukey closed T203963: Convert makevm to spicerack cookbook as Resolved.

Same for me, please re-open if necessary!

Tue, Aug 6, 12:46 PM · serviceops-radar, Patch-For-Review, User-crusnov, SRE-tools, User-jijiki, User-Joe, Operations
elukey added a comment to T203963: Convert makevm to spicerack cookbook.

Can we close this?

Tue, Aug 6, 12:29 PM · serviceops-radar, Patch-For-Review, User-crusnov, SRE-tools, User-jijiki, User-Joe, Operations
elukey updated subscribers of T176875: Allow access to wdqs.svc.eqiad.wmnet on port 8888.

Adding @WMDE-leszek and @Ladsgroup since afaics they were/are working on this :)

Tue, Aug 6, 10:43 AM · Patch-For-Review, Traffic, Wikidata-Query-Service, Operations, WMDE-Analytics-Engineering, User-Addshore, Discovery, Wikidata
elukey added a comment to T176875: Allow access to wdqs.svc.eqiad.wmnet on port 8888.

Changed the following: (Cc: @ayounsi )

Tue, Aug 6, 10:40 AM · Patch-For-Review, Traffic, Wikidata-Query-Service, Operations, WMDE-Analytics-Engineering, User-Addshore, Discovery, Wikidata
elukey closed T226599: (OoW) Degraded RAID on analytics1039 as Resolved.

The alert should not fire again (I hope), I have disabled it via Icinga UI. Closing :)

Tue, Aug 6, 10:12 AM · ops-eqiad, Operations
elukey closed T227940: (OoW) Degraded RAID on analytics1032 as Resolved.

The alert should not fire again (I hope), I have disabled it via Icinga UI. Closing :)

Tue, Aug 6, 10:12 AM · ops-eqiad, Operations
elukey added a comment to T228291: Refine should accept principal name for hive2 jdbc connection for DDL.

I was finally able to test Refine on analytic1030. The missing bit was --conf spark.executorEnv.LD_LIBRARY_PATH=/usr/lib/hadoop/lib/native passed to spark-submit, but I am still not sure why it is needed (note that adding the same setting to the workers' spark-defaults does not seem to work if the client doesn't specify it).

Tue, Aug 6, 9:52 AM · Patch-For-Review, Analytics-Kanban, Analytics
elukey added a comment to T229347: Rebuild spark2 for Debian Buster.

Another gotcha:

Exception: Python in worker has different version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.

I think we can't use Spark in Buster without upgrading the whole cluster at once.
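A pre-flight check for this mismatch could compare the major.minor version of the interpreters before submitting a job. The sketch below is an assumption-laden illustration, not something from the task: the function names `minor_of` and `same_minor` are hypothetical, and it only parses version strings in the form printed by `python3 --version`.

```shell
#!/bin/sh
# Hypothetical helper: print "major.minor" for a version string
# such as "Python 3.7.3".
minor_of() {
    printf '%s\n' "$1" \
        | sed -n 's/^[^0-9]*\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed only if both version strings share the same major.minor,
# which is the condition PySpark enforces between driver and workers.
same_minor() {
    [ "$(minor_of "$1")" = "$(minor_of "$2")" ]
}

# Usage (the worker host is a placeholder):
# driver="$(python3 --version 2>&1)"
# worker="$(ssh <worker-host> python3 --version 2>&1)"
# same_minor "$driver" "$worker" || echo "Python minor versions differ" >&2
```

Pointing PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON at matching interpreters only helps if the same minor version is installed on both sides, which is the constraint described in the comment.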

Tue, Aug 6, 8:30 AM · Analytics-Kanban, Analytics
elukey added a comment to T227257: Move refinery to hive 2 actions.

We encountered two issues after the migration to hive2 actions:

Tue, Aug 6, 8:25 AM · Analytics-Kanban, User-Elukey, Analytics

Mon, Aug 5

elukey merged T229256: cassandra loading jobs for unqiue devices data need an SLA alarm into T228747: Review all the oozie coordinators/bundles in Refinery to add alerting when missing.
Mon, Aug 5, 5:03 PM · Analytics-Kanban, Analytics, Wikimedia-Portals
elukey merged task T229256: cassandra loading jobs for unqiue devices data need an SLA alarm into T228747: Review all the oozie coordinators/bundles in Refinery to add alerting when missing.
Mon, Aug 5, 5:03 PM · Analytics
elukey renamed T228747: Review all the oozie coordinators/bundles in Refinery to add alerting when missing from projectview-hourly-coordinator needs to alarm when in error to Review all the oozie coordinators/bundles in Refinery to add alerting when missing.
Mon, Aug 5, 5:02 PM · Analytics-Kanban, Analytics, Wikimedia-Portals
elukey added a comment to T228883: mediawiki-history-wikitext-coord job fails every month .

As FYI the last run succeeded: https://hue.wikimedia.org/oozie/list_oozie_workflow/0008208-190715143115257-oozie-oozi-W/?coordinator_job_id=0053331-190417151359684-oozie-oozi-C

Mon, Aug 5, 2:30 PM · Analytics
elukey added a comment to T151304: tmpreaper possible race condition.

The patch seems sane, but I'm wondering whether we actually need to pursue this further? tmpreaper is dead upstream (the Debian maintainer keeps it alive a little for security fixes, but the origin of the codebase is a 20-year-old tmpwatch RPM from Red Hat) and has significant bit rot on modern systems (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=881725). Notably we only use it on app servers; it seems to have been added back in 2015 to address core dumps from HHVM clogging up /tmp.
Given that we're moving away from HHVM, can we simply remove tmpreaper usage along with it? Arguably PHP might also dump core, but that applies to every other service we run, so in that case we should rather look into a solution which monitors/mitigates excessive /tmp usage in general.

Mon, Aug 5, 8:11 AM · serviceops, Operations
elukey added a comment to T227265: mcrouter codfw proxies sometimes lead to TKOs.

We deployed all the changes for T225642, so the async settings for codfw replication were not the culprit.

Mon, Aug 5, 7:10 AM · Performance-Team (Radar), User-Elukey, serviceops, Operations

Sun, Aug 4

elukey created T229755: csw2-esams's VCP link flapped.
Sun, Aug 4, 11:22 AM · Operations, netops

Fri, Aug 2

elukey updated the task description for T229682: Add more dimensions to netflow's druid ingestion specs.
Fri, Aug 2, 5:21 PM · Analytics-Kanban, Analytics
elukey created T229682: Add more dimensions to netflow's druid ingestion specs.
Fri, Aug 2, 5:19 PM · Analytics-Kanban, Analytics

Thu, Aug 1

elukey added a comment to T227065: Move icinga alarm for the EventStreams external endpoint to SRE.

We didn't discuss whether SERVICE UNKNOWN needs to alarm or not for some services :)

Thu, Aug 1, 9:09 AM · Analytics-Kanban, Wikimedia-Incident, Analytics, Operations
elukey reopened T227065: Move icinga alarm for the EventStreams external endpoint to SRE, a subtask of T226808: Eventstreams in codfw down for several hours due to kafka2001 -> kafka-main2001 swap, as Open.
Thu, Aug 1, 9:08 AM · Wikimedia-Incident, Security, Services (watching), Analytics, Operations
elukey reopened T227065: Move icinga alarm for the EventStreams external endpoint to SRE as "Open".
Thu, Aug 1, 9:08 AM · Analytics-Kanban, Wikimedia-Incident, Analytics, Operations
elukey added a comment to T226698: Allow all Analytics tools to work with Kerberos auth.

@Ottomata I think I have fixed the spark2shell --master yarn issue with the last patch. Added the current issues in https://wikitech.wikimedia.org/wiki/User:Elukey/Analytics/Hadoop_testing_cluster#Use_Spark_2

Thu, Aug 1, 8:18 AM · Patch-For-Review, Analytics-Kanban, User-Elukey, Analytics
elukey added a comment to T151304: tmpreaper possible race condition.

Added patch to the Debian bug in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=763858#10

Thu, Aug 1, 7:29 AM · serviceops, Operations
elukey reopened T186550: Anycast recdns, a subtask of T98006: Anycast (Auth)DNS, as Open.
Thu, Aug 1, 7:25 AM · Performance-Team (Radar), Patch-For-Review, netops, Operations, Traffic
elukey reopened T186550: Anycast recdns as "Open".

Couple of notes about the anycast-healthchecker:

Thu, Aug 1, 7:25 AM · Patch-For-Review, netops, Operations, Traffic
elukey added a comment to T228827: Instability of the Level3 link between cr2-eqiad and cr2-esams.

The link went down again:

Thu, Aug 1, 7:06 AM · Operations, netops

Wed, Jul 31

elukey updated the task description for T229003: Roll restart all openjdk-8 jvms in Analytics.
Wed, Jul 31, 2:56 PM · Analytics-Kanban, Analytics
elukey updated the task description for T229003: Roll restart all openjdk-8 jvms in Analytics.
Wed, Jul 31, 2:55 PM · Analytics-Kanban, Analytics
elukey updated the task description for T229003: Roll restart all openjdk-8 jvms in Analytics.
Wed, Jul 31, 2:51 PM · Analytics-Kanban, Analytics
elukey updated the task description for T229003: Roll restart all openjdk-8 jvms in Analytics.
Wed, Jul 31, 10:10 AM · Analytics-Kanban, Analytics
elukey added a comment to T228291: Refine should accept principal name for hive2 jdbc connection for DDL.

@Ottomata deleted and recreated your principal, you should have an email with the tmp password to reset :)

Wed, Jul 31, 9:20 AM · Patch-For-Review, Analytics-Kanban, Analytics
elukey updated the task description for T229003: Roll restart all openjdk-8 jvms in Analytics.
Wed, Jul 31, 8:46 AM · Analytics-Kanban, Analytics
elukey triaged T229357: Remove logster from cp* hosts as Normal priority.
Wed, Jul 31, 7:50 AM · Patch-For-Review, Operations, observability
elukey added a comment to T229357: Remove logster from cp* hosts.

https://grafana.wikimedia.org/d/000000253/varnishkafka is now prometheus based, while the old one is now at https://grafana.wikimedia.org/d/JzhtS4vWz/varnishkafka-graphite. The metrics look good: the per-host numbers are consistent, but of course the aggregates are not (in graphite I used to aggregate all the metrics from the pops, while in the prometheus dashboard I aggregate per-dc). Once the alerts are migrated I'd wait a couple of days to check metrics etc. and then we'll be able to proceed with removing logster :)

Wed, Jul 31, 7:50 AM · Patch-For-Review, Operations, observability

Tue, Jul 30

elukey added a comment to T229357: Remove logster from cp* hosts.

The Analytics team cares about two things:

Tue, Jul 30, 5:42 PM · Patch-For-Review, Operations, observability
elukey closed T225296: High Prometheus TCP retransmits as Resolved.
Tue, Jul 30, 4:12 PM · User-Elukey, User-fgiunchedi, Cloud-Services, observability, Analytics
elukey triaged T229347: Rebuild spark2 for Debian Buster as Normal priority.
Tue, Jul 30, 3:24 PM · Analytics-Kanban, Analytics
elukey updated the task description for T227860: TLS certificates for Analytics origin servers.
Tue, Jul 30, 3:16 PM · Analytics-Kanban, User-Elukey, Operations, Analytics, Traffic
elukey added a comment to T151304: tmpreaper possible race condition.

We can start by responding to Debian bug #763858 with your fix and see if the maintainer is willing to incorporate this!

Tue, Jul 30, 3:00 PM · serviceops, Operations
elukey changed the status of T227860: TLS certificates for Analytics origin servers from Stalled to Open.
Tue, Jul 30, 1:35 PM · Analytics-Kanban, User-Elukey, Operations, Analytics, Traffic
elukey updated the task description for T227860: TLS certificates for Analytics origin servers.
Tue, Jul 30, 1:33 PM · Analytics-Kanban, User-Elukey, Operations, Analytics, Traffic
elukey updated subscribers of T151304: tmpreaper possible race condition.

I have created tmpreaper_1.6.13+nmu1+deb9u1+wmf1_amd64.deb on boron, with the following patch:

Tue, Jul 30, 11:29 AM · serviceops, Operations

Mon, Jul 29

elukey added a comment to T151304: tmpreaper possible race condition.

Following the Debian Bug, it seems that in https://sources.debian.org/src/tmpreaper/1.6.13+nmu1/tmpreaper.c/?hl=452#L422 we could add a simple check to avoid this. From

Mon, Jul 29, 6:18 PM · serviceops, Operations