Joe (Giuseppe Lavagetto)
Spy

Projects (20)

User Details

User Since
Oct 3 2014, 5:57 AM (155 w, 18 h)
Availability
Available
LDAP User
Giuseppe Lavagetto
MediaWiki User
Unknown

Recent Activity

Yesterday

Dzahn awarded T176392: The aphlict systemd unit needs to be rewritten from scratch a Cup of Joe token.
Fri, Sep 22, 5:18 PM · Patch-For-Review, Release-Engineering-Team, Operations, Phabricator
mmodell awarded T176392: The aphlict systemd unit needs to be rewritten from scratch a Barnstar token.
Fri, Sep 22, 4:51 PM · Patch-For-Review, Release-Engineering-Team, Operations, Phabricator
Joe closed T176392: The aphlict systemd unit needs to be rewritten from scratch as Resolved.
Fri, Sep 22, 10:49 AM · Patch-For-Review, Release-Engineering-Team, Operations, Phabricator
Joe closed T176392: The aphlict systemd unit needs to be rewritten from scratch, a subtask of T765: Enable notification server (real-time pop-up notifications) in Phabricator, as Resolved.
Fri, Sep 22, 10:49 AM · Patch-For-Review, Phabricator
Joe added a comment to T176392: The aphlict systemd unit needs to be rewritten from scratch.

Thanks to @Paladox's work on this, the aphlict service unit now handles the software correctly.

Fri, Sep 22, 10:48 AM · Patch-For-Review, Release-Engineering-Team, Operations, Phabricator
Joe added a comment to T176437: puppet ca_server confusion.

If you want to better understand what puppet_ca does on an agent, and why removing it afterwards "doesn't break anything", there are good reads in the puppet docs:

Fri, Sep 22, 6:31 AM · cloud-services-team (Kanban), Operations

Thu, Sep 21

Joe triaged T176392: The aphlict systemd unit needs to be rewritten from scratch as High priority.
Thu, Sep 21, 7:49 AM · Patch-For-Review, Release-Engineering-Team, Operations, Phabricator
Joe created T176392: The aphlict systemd unit needs to be rewritten from scratch.
Thu, Sep 21, 7:49 AM · Patch-For-Review, Release-Engineering-Team, Operations, Phabricator
Joe added a comment to T174431: Migration of mw* servers to stretch.

I updated the steps based on the plan to use PHP 7 instead of HHVM.

Thu, Sep 21, 6:07 AM · User-Elukey, HHVM, Operations

Tue, Sep 19

Joe added a comment to T173710: Job queue is increasing non-stop.

FWIW we're seeing another almost uncontrollable growth of jobs on commons and probably other wikis. I might decide to raise the concurrency of those jobs.

Tue, Sep 19, 12:55 PM · Patch-For-Review, Services (watching), Performance-Team (Radar), Discovery-Search, CirrusSearch, Discovery, Wikidata-Sprint, Wikidata, Operations, MediaWiki-JobQueue
Joe added a comment to T176184: Check 'depool' failed while deploying.

This was caused by https://gerrit.wikimedia.org/r/#/c/365891/, yet another case of a labs-specific fix breaking production.

Tue, Sep 19, 8:29 AM · Services (watching), Operations, Release-Engineering-Team (Backlog), Scap, Parsoid
Joe claimed T176184: Check 'depool' failed while deploying.
Tue, Sep 19, 8:28 AM · Services (watching), Operations, Release-Engineering-Team (Backlog), Scap, Parsoid

Fri, Sep 15

Joe added a comment to T175527: Build a slim container for fluentd.

After some reasoning, I decided to go the following way:

Fri, Sep 15, 10:04 AM · Patch-For-Review, User-Joe, Services (watching), Kubernetes, Operations, Goal

Thu, Sep 14

Joe moved T147204: Update confd package from Backlog to Doing on the User-Joe board.
Thu, Sep 14, 9:43 AM · User-Joe, Beta-Cluster-reproducible, Operations
Joe moved T162013: etcd cluster in codfw has raft consensus issues from Doing to Backlog on the User-Joe board.
Thu, Sep 14, 9:43 AM · Patch-For-Review, User-Joe, Operations
Joe added a comment to T173129: Prove helm as a potential k8s deployment tool.

After the discussion the other day at the containers cabal meeting, I promised to come up with a proposal for helm chart development/management. So here it is.

Thu, Sep 14, 7:19 AM · User-Joe, Release-Engineering-Team (Next), Release Pipeline

Wed, Sep 13

Joe added a comment to T175736: Give ores admins read access to /srv/log/ores/main.log*.

I would suggest AGAINST giving access to all logs. We should have a tail-ores command that specifically tails the ores logs, like we do with other services.

Wed, Sep 13, 2:03 PM · Patch-For-Review, Operations, Scoring-platform-team (Current), ORES
Joe added a project to T175800: Allow easy tuning of the jobqueue concurrency.: Services.
Wed, Sep 13, 8:30 AM · Services (designing), MediaWiki-JobQueue, User-mobrovac, Analytics, ChangeProp, EventBus
Joe created T175800: Allow easy tuning of the jobqueue concurrency..
Wed, Sep 13, 8:29 AM · Services (designing), MediaWiki-JobQueue, User-mobrovac, Analytics, ChangeProp, EventBus
Joe added a comment to T175780: Requests for new JobQueue monitoring capabilities.

This is very promising, I was in the process of writing down my own requirements and it seems most things are already covered, although it's not clear from your post if we can have per-wiki stats as well as per-job stats.

Wed, Sep 13, 6:56 AM · MediaWiki-JobQueue, Services (designing), ChangeProp, EventBus, Analytics

Tue, Sep 12

Joe added a comment to T171704: Switch all hosts to the future parser.

We did a lot of work today on this, and I am thus running a new puppet compiler full run, which can be found here

Tue, Sep 12, 3:43 PM · Patch-For-Review, User-Joe, Puppet, Operations
Joe added a comment to T175609: Package Blubber.

dh-make-golang is what I'd use for creating a debian package from scratch, as it will also prepare packages for any dependency (read: any library dependency that still isn't in debian).

Tue, Sep 12, 9:30 AM · Release-Engineering-Team (Kanban), Release Pipeline (Blubber)

Mon, Sep 11

Joe created T175539: Build containers for statsd, prometheus-statsd-exporter.
Mon, Sep 11, 10:47 AM · User-Joe, Services (watching), Kubernetes, Operations, Goal
Joe created T175527: Build a slim container for fluentd.
Mon, Sep 11, 9:55 AM · Patch-For-Review, User-Joe, Services (watching), Kubernetes, Operations, Goal
Joe moved T170120: Standardize on the "default" pod setup from Backlog to Doing on the User-Joe board.
Mon, Sep 11, 9:50 AM · User-Joe, Services (watching), Kubernetes, Operations, Goal
Joe moved T173129: Prove helm as a potential k8s deployment tool from Backlog to Doing on the User-Joe board.
Mon, Sep 11, 9:50 AM · User-Joe, Release-Engineering-Team (Next), Release Pipeline

Fri, Sep 8

Joe reopened T162013: etcd cluster in codfw has raft consensus issues as "Open".
Fri, Sep 8, 3:21 PM · Patch-For-Review, User-Joe, Operations
Joe added a comment to T162013: etcd cluster in codfw has raft consensus issues.

I was too optimistic, it appears, in declaring victory. The new resync at reduced speeds still triggered consensus issues. It seems the version of etcd we're using is particularly sensitive to I/O latency spikes. So while I'm inclined to either disable the RAID resyncs altogether or to stagger them between the servers of each cluster, I will consider upgrading etcd to a newer version (still in the 2.x series) as an option.
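As a rough illustration of the staggering option mentioned in this comment, the sketch below assigns each server of a cluster its own resync slot so that no two members resync at the same time. The function name, the host names and the slot labels are hypothetical and not taken from the task; how the slots would actually be enforced (cron, puppet, or otherwise) is left open.

```python
from typing import Dict, Sequence


def stagger_resyncs(hosts: Sequence[str], slots: Sequence[str]) -> Dict[str, str]:
    """Give every host in one cluster a distinct resync slot.

    Hosts are sorted first so the assignment is stable between runs.
    Both the host names and the slot labels are illustrative assumptions.
    """
    ordered = sorted(hosts)
    if len(ordered) > len(slots):
        # Not enough distinct slots: two hosts would have to resync together.
        raise ValueError("need at least one slot per host")
    return dict(zip(ordered, slots))


# Hypothetical three-node cluster, one resync week per node.
schedule = stagger_resyncs(
    ["etcd-c", "etcd-a", "etcd-b"],
    ["week 1", "week 2", "week 3"],
)
```

With three hosts and three slots, each host ends up in a different week, so at most one cluster member sees resync-related I/O latency at a time.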

Fri, Sep 8, 3:21 PM · Patch-For-Review, User-Joe, Operations
Joe added a project to T173129: Prove helm as a potential k8s deployment tool: User-Joe.
Fri, Sep 8, 2:05 PM · User-Joe, Release-Engineering-Team (Next), Release Pipeline
Joe added a comment to T173129: Prove helm as a potential k8s deployment tool.

I did a review of how helm works/what it offers in relation to our environment.

Fri, Sep 8, 2:04 PM · User-Joe, Release-Engineering-Team (Next), Release Pipeline
Joe closed T162013: etcd cluster in codfw has raft consensus issues as Resolved.
Fri, Sep 8, 1:40 PM · Patch-For-Review, User-Joe, Operations
Joe added a comment to T162013: etcd cluster in codfw has raft consensus issues.

Reducing the sync speed manually did the job, so we can just puppetize this.

Fri, Sep 8, 12:52 PM · Patch-For-Review, User-Joe, Operations
Joe added a comment to T170120: Standardize on the "default" pod setup.

I am a bit more concerned about performance and reliability implications of adding indirections in the data path itself. TLS is supported by all major platforms we use, so we should be able to avoid indirections for that. The main requirement to enable this is centralized certificate management, and exposing certs to services in a standardized manner, often via env vars.

Fri, Sep 8, 10:59 AM · User-Joe, Services (watching), Kubernetes, Operations, Goal
Joe added a comment to T162013: etcd cluster in codfw has raft consensus issues.

Result of the latest experiment:

Fri, Sep 8, 10:48 AM · Patch-For-Review, User-Joe, Operations
Joe added a comment to T162013: etcd cluster in codfw has raft consensus issues.

Running the mdadm command on one host caused a re-election to happen. It seems likely we found the culprit, so now I'm going to run the command at the same time on the first two hosts, then on all three, to verify we found the origin of the issues.

Fri, Sep 8, 9:46 AM · Patch-For-Review, User-Joe, Operations

Thu, Sep 7

Joe added a project to T170120: Standardize on the "default" pod setup: User-Joe.
Thu, Sep 7, 2:44 PM · User-Joe, Services (watching), Kubernetes, Operations, Goal
Joe closed T174599: Set up LVS and VirtualHost for RunSingleJob.php as Resolved.
Thu, Sep 7, 2:03 PM · Analytics, Services (watching), Patch-For-Review, User-Joe, MediaWiki-JobQueue, User-mobrovac, ChangeProp, EventBus
Joe closed T174599: Set up LVS and VirtualHost for RunSingleJob.php, a subtask of T157088: [EPIC] Develop a JobQueue backend based on EventBus, as Resolved.
Thu, Sep 7, 2:03 PM · MediaWiki-JobQueue, Epic, Services (doing), User-mobrovac, Analytics, ChangeProp, EventBus
Joe added a comment to T174599: Set up LVS and VirtualHost for RunSingleJob.php.

Everything is set up and you can reach the correct LVS endpoint via the discovery DNS system at jobrunner.discovery.wmnet, via HTTPS.

Thu, Sep 7, 2:03 PM · Analytics, Services (watching), Patch-For-Review, User-Joe, MediaWiki-JobQueue, User-mobrovac, ChangeProp, EventBus
Joe added a comment to T173710: Job queue is increasing non-stop.

I did some more number crunching on the instances of runJob.php I'm running on terbium, and found the following:

Thu, Sep 7, 12:49 PM · Patch-For-Review, Services (watching), Performance-Team (Radar), Discovery-Search, CirrusSearch, Discovery, Wikidata-Sprint, Wikidata, Operations, MediaWiki-JobQueue
Joe moved T171704: Switch all hosts to the future parser from Doing to Blocked on others on the User-Joe board.
Thu, Sep 7, 7:25 AM · Patch-For-Review, User-Joe, Puppet, Operations

Wed, Sep 6

Joe moved T162013: etcd cluster in codfw has raft consensus issues from Backlog to Doing on the User-Joe board.
Wed, Sep 6, 3:44 PM · Patch-For-Review, User-Joe, Operations
Joe added a comment to T173710: Job queue is increasing non-stop.

As a side comment: this is one of the cases where I would've loved to have an elastic environment to run MediaWiki-related applications: I could've spun up 10 instances of jobrunner dedicated to refreshlinks (or, ideally, the system could have done it automagically), for example.

Yep! Very true. When I first read about Borg and whenever I read about similar principles since, the job queue always comes to mind as a great use case. Of course it would benefit app server maintenance too, but the job queue pressure tends to vary more than app server pressure. A Borg-like system would allow us to make the most of the idle time on all (app) servers and gracefully fill it up with job runners.

Of course, that doesn't apply to cases that are limited by a common resource (e.g. database). But the idea is still very attractive. Permanently setting up more job runners remains a difficult calculation for us, because in the end we must prioritise app servers for site availability. On the other hand, given how idle most app servers are most of the time, it seems like a royal waste to not put it to use.

Wed, Sep 6, 3:07 PM · Patch-For-Review, Services (watching), Performance-Team (Radar), Discovery-Search, CirrusSearch, Discovery, Wikidata-Sprint, Wikidata, Operations, MediaWiki-JobQueue
Joe added a comment to T173710: Job queue is increasing non-stop.

Those refreshLinks jobs (from wikibase) are the only ones that use multiple titles per job, so they will be a lot slower (seems to be 50 pages/job) than the regular ones from MediaWiki core. That is a bit on the slow side for a run time of a non-rare job type (e.g. TMH or GWT).

Wed, Sep 6, 9:08 AM · Patch-For-Review, Services (watching), Performance-Team (Radar), Discovery-Search, CirrusSearch, Discovery, Wikidata-Sprint, Wikidata, Operations, MediaWiki-JobQueue

Tue, Sep 5

Joe closed T122069: jobrunner memory leaks as Resolved.
Tue, Sep 5, 8:23 AM · JobRunner-Service, Wikimedia-General-or-Unknown, Operations
Joe closed T122069: jobrunner memory leaks, a subtask of T124194: Job queue is growing and growing, as Resolved.
Tue, Sep 5, 8:23 AM · Operations, Wikimedia-General-or-Unknown
Joe added a comment to T173710: Job queue is increasing non-stop.

We still have around 1.4 million items in queue for commons, evenly divided between htmlCacheUpdate jobs and refreshLinks jobs.

Tue, Sep 5, 7:09 AM · Patch-For-Review, Services (watching), Performance-Team (Radar), Discovery-Search, CirrusSearch, Discovery, Wikidata-Sprint, Wikidata, Operations, MediaWiki-JobQueue

Mon, Sep 4

Joe moved T174599: Set up LVS and VirtualHost for RunSingleJob.php from Backlog to Doing on the User-Joe board.
Mon, Sep 4, 8:11 AM · Analytics, Services (watching), Patch-For-Review, User-Joe, MediaWiki-JobQueue, User-mobrovac, ChangeProp, EventBus
Joe added a comment to T73853: Retry counts not working / jobs re-executed beyond retry limits.

The replication issues discussed in T163337 could play a role in duplication / keeping old jobs alive.

Mon, Sep 4, 6:58 AM · WMF-deploy-2015-08-25_(1.26wmf20), WMF-deploy-2015-07-28_(1.26wmf16), Patch-For-Review, MediaWiki-JobQueue

Fri, Sep 1

Joe added a comment to T165519: rack and setup mw1307-1348 .

To recap quickly the plan:

Fri, Sep 1, 2:38 PM · Patch-For-Review, User-Elukey, User-Joe, Operations, ops-eqiad
Joe added a comment to T171704: Switch all hosts to the future parser.

After my series of changes the situation looks much better:

Fri, Sep 1, 12:35 PM · Patch-For-Review, User-Joe, Puppet, Operations
Joe updated the task description for T171704: Switch all hosts to the future parser.
Fri, Sep 1, 12:33 PM · Patch-For-Review, User-Joe, Puppet, Operations
Joe added a comment to T170120: Standardize on the "default" pod setup.

A containerized microservice environment should make developing and deploying applications as easy as possible.

Fri, Sep 1, 9:05 AM · User-Joe, Services (watching), Kubernetes, Operations, Goal
Joe updated subscribers of T170120: Standardize on the "default" pod setup.

For metrics collection, my proposal (after a chat with @fgiunchedi) is to have another sidecar running prometheus-statsd-exporter in the modified version we maintain.

Fri, Sep 1, 8:57 AM · User-Joe, Services (watching), Kubernetes, Operations, Goal
Joe added a comment to T170120: Standardize on the "default" pod setup.

As far as logging goes, we have basically two big options:

Fri, Sep 1, 7:53 AM · User-Joe, Services (watching), Kubernetes, Operations, Goal
Joe added a project to T174599: Set up LVS and VirtualHost for RunSingleJob.php: User-Joe.
Fri, Sep 1, 7:31 AM · Analytics, Services (watching), Patch-For-Review, User-Joe, MediaWiki-JobQueue, User-mobrovac, ChangeProp, EventBus
Joe closed T173078: Fix the `base::service_unit` template scoping problem as Resolved.
Fri, Sep 1, 7:30 AM · Patch-For-Review, User-Joe, Puppet, Operations
Joe closed T173078: Fix the `base::service_unit` template scoping problem, a subtask of T171704: Switch all hosts to the future parser, as Resolved.
Fri, Sep 1, 7:30 AM · Patch-For-Review, User-Joe, Puppet, Operations
Joe added a comment to T171965: [Spike - 8 hours] How should the PDF post-processing script be exposed for use by Extension:Collection.

This is the control flow as proposed now:

  1. MediaWiki requests HTML of pages from RESTBase
    • RESTBase might fall back to Parsoid if not cached; Parsoid partially relies on the PHP API for rendering
  2. MediaWiki concatenates the pages into a single document ('concatenate' is a bit misleading; this requires parsing the HTML)
  3. MediaWiki POSTs the HTML to Electron (mediawiki/services/electron-render; RESTBase and ElectronPdfRender cannot be used as they do not expose rendering arbitrary HTML, for obvious reasons)
  4. Electron responds with a PDF file
  5. MediaWiki shells out to Python which does some post-processing on the PDF file
  6. MediaWiki returns the PDF to the user.
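
The six steps above can be sketched as a single pipeline. This is only an illustration of the proposed control flow, not the actual implementation: `render_book` and its three callables are hypothetical stand-ins for the RESTBase request, the POST to Electron and the Python post-processing script, and the naive string join glosses over the HTML parsing that step 2 really requires.

```python
from typing import Callable, Iterable


def render_book(
    titles: Iterable[str],
    fetch_html: Callable[[str], str],        # step 1: stand-in for the RESTBase request
    render_pdf: Callable[[str], bytes],      # steps 3-4: stand-in for the POST to Electron
    post_process: Callable[[bytes], bytes],  # step 5: stand-in for the Python script
) -> bytes:
    """Walk through the proposed flow with injected, hypothetical backends."""
    # Step 1: fetch the HTML of every page.
    pages = [fetch_html(title) for title in titles]
    # Step 2: build one document. A plain join is a simplification; the
    # real step has to parse the HTML rather than concatenate strings.
    document = "\n".join(pages)
    # Steps 3-4: send the combined HTML to the renderer and get a PDF back.
    pdf = render_pdf(document)
    # Steps 5-6: post-process the PDF and return it to the caller.
    return post_process(pdf)
```

Passing the backends in as callables makes the number of hops in the chain explicit and lets the flow be exercised with fakes, without any of the real services.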
Fri, Sep 1, 6:41 AM · Proton, Readers-Web-Kanban-Board, Electron-PDFs, Readers-Web-Backlog (Tracking), Spike

Thu, Aug 31

Joe updated subscribers of T173710: Job queue is increasing non-stop.

Correcting myself after a discussion with @ema: since we have at most 4 cache layers, we should process any job with a root timestamp newer than 4 times the cache TTL cap. So anything older than 4 days should be safely discardable.
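
The rule above is simple arithmetic and can be written down as a small check. This is a minimal sketch: the function and constant names are hypothetical, and the one-day TTL cap is an assumption inferred from the "4 layers, 4 days" figures in the comment rather than a value stated outright.

```python
from datetime import datetime, timedelta, timezone

CACHE_LAYERS = 4                    # "at most 4 cache layers", from the comment
CACHE_TTL_CAP = timedelta(days=1)   # assumption: implied by 4 layers -> 4 days


def is_discardable(
    root_timestamp: datetime,
    now: datetime,
    layers: int = CACHE_LAYERS,
    ttl_cap: timedelta = CACHE_TTL_CAP,
) -> bool:
    """Return True if the job's root timestamp is older than layers * ttl_cap."""
    return now - root_timestamp > layers * ttl_cap


now = datetime(2017, 8, 31, 14, 16, tzinfo=timezone.utc)
old_job = is_discardable(now - timedelta(days=5), now)     # older than 4 days
recent_job = is_discardable(now - timedelta(days=3), now)  # still within 4 days
```

Under these assumptions a job whose root timestamp is five days old is discardable, while one that is three days old still has to be processed.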

Thu, Aug 31, 2:16 PM · Patch-For-Review, Services (watching), Performance-Team (Radar), Discovery-Search, CirrusSearch, Discovery, Wikidata-Sprint, Wikidata, Operations, MediaWiki-JobQueue
Joe added a comment to T173710: Job queue is increasing non-stop.

@aaron so you're saying that when we have someone editing a lot of pages with a lot of backlinks we will see the jobqueue growing basically for quite a long time, as the divided jobs will be executed at a later time, and as long as the queue is long enough, we'll see jobs divided/inserted in the queue when division jobs are executed.

Thu, Aug 31, 2:10 PM · Patch-For-Review, Services (watching), Performance-Team (Radar), Discovery-Search, CirrusSearch, Discovery, Wikidata-Sprint, Wikidata, Operations, MediaWiki-JobQueue

Wed, Aug 30

Joe added a comment to T171965: [Spike - 8 hours] How should the PDF post-processing script be exposed for use by Extension:Collection.

Also, going through the remainder of the design document and the implementation PoC, I could summarize the flow as follows:

Wed, Aug 30, 5:43 PM · Proton, Readers-Web-Kanban-Board, Electron-PDFs, Readers-Web-Backlog (Tracking), Spike
Joe added a comment to T171965: [Spike - 8 hours] How should the PDF post-processing script be exposed for use by Extension:Collection.

Hi, I took a look at your current proposal and I see a series of issues with it. I might still not have fully understood what you're proposing; if that's the case, please let me know!

Wed, Aug 30, 3:56 PM · Proton, Readers-Web-Kanban-Board, Electron-PDFs, Readers-Web-Backlog (Tracking), Spike
Joe added a comment to T159922: pdfrender fails to serve requests since Mar 8 00:30:32 UTC on scb1003.

This task was about pdfrender failing to start, and that problem has been "hotfixed".

Wed, Aug 30, 5:22 AM · Services (done), Readers-Web-Backlog (Tracking), Operations, Electron-PDFs

Mon, Aug 28

Joe updated the task description for T171704: Switch all hosts to the future parser.
Mon, Aug 28, 3:14 PM · Patch-For-Review, User-Joe, Puppet, Operations
Joe added a comment to T171704: Switch all hosts to the future parser.

https://puppet-compiler.wmflabs.org/compiler02/7622/index-future.html has a list with most spurious differences removed.

Mon, Aug 28, 3:13 PM · Patch-For-Review, User-Joe, Puppet, Operations
Joe added a comment to T173786: Convert Wikimedia production HHVM instances to have hhvm.php7.all set true.

Beware that, as the HHVM developers themselves have declared, the PHP 7 implementation in HHVM will never be 100% compatible with PHP's own.

Mon, Aug 28, 8:21 AM · MediaWiki-Platform-Team, Performance-Team, Operations, HHVM
Joe raised the priority of T168271: Decommission mw1170-mw1179 from Normal to High.
Mon, Aug 28, 7:52 AM · Patch-For-Review, hardware-requests, User-Joe, ops-eqiad, Operations
Joe added a comment to T167130: Decom mw1170-mw1179, and replace them with new systems..

Any news on this? We do need to rack the new appservers, as putting them in production is needed in order to go on with the eqiad row D switch upgrade (T172459).

Mon, Aug 28, 7:52 AM · Patch-For-Review, User-Joe, ops-eqiad, Operations
Joe raised the priority of T167130: Decom mw1170-mw1179, and replace them with new systems. from Normal to High.
Mon, Aug 28, 7:51 AM · Patch-For-Review, User-Joe, ops-eqiad, Operations

Aug 23 2017

mmodell awarded T161675: Re-think puppet management for deployment-prep a Like token.
Aug 23 2017, 4:45 PM · Release-Engineering-Team (Next), User-Joe, Beta-Cluster-Infrastructure, Cloud-Services, Puppet

Aug 11 2017

Gehel awarded T173078: Fix the `base::service_unit` template scoping problem a Like token.
Aug 11 2017, 8:58 AM · Patch-For-Review, User-Joe, Puppet, Operations
Joe created T173078: Fix the `base::service_unit` template scoping problem.
Aug 11 2017, 8:56 AM · Patch-For-Review, User-Joe, Puppet, Operations
Joe moved T171704: Switch all hosts to the future parser from Backlog to Doing on the User-Joe board.
Aug 11 2017, 8:50 AM · Patch-For-Review, User-Joe, Puppet, Operations
Joe added a comment to T172459: eqiad row D switch upgrade.

I don't think we're safe to do this maintenance until we rack all the new MediaWiki machines. We have almost half of our capacity for MediaWiki in row D. We have plans to remediate that once the new MediaWiki servers are racked (see T165519), but I'd say racking and setting up those servers should be a hard blocker for this maintenance at the moment.

Aug 11 2017, 7:21 AM · Patch-For-Review, Operations, netops, Traffic

Aug 10 2017

Joe added a comment to T171704: Switch all hosts to the future parser.

Full list of hosts using the future parser:

Aug 10 2017, 1:23 PM · Patch-For-Review, User-Joe, Puppet, Operations
Joe closed T172362: New puppet compiler differ html escape as Resolved.
Aug 10 2017, 10:11 AM · Patch-For-Review, User-Joe, puppet-compiler
Joe added a comment to T172362: New puppet compiler differ html escape.

https://puppet-compiler.wmflabs.org/compiler02/7383/cp2001.codfw.wmnet/ shows the correct behaviour.

Aug 10 2017, 10:11 AM · Patch-For-Review, User-Joe, puppet-compiler
Joe closed T150456: puppet compiler fails with modules using puppetdb as Resolved.
Aug 10 2017, 10:02 AM · Patch-For-Review, User-Joe, puppet-compiler, Operations

Aug 9 2017

Dzahn awarded T162949: hosts with puppet compiler failures on every run a Love token.
Aug 9 2017, 1:23 PM · puppet-compiler, Operations
Joe added a comment to T150456: puppet compiler fails with modules using puppetdb.

So one additional complication: we need to refresh the facts timestamp for every pcc run, as we don't want to run into a case of https://tickets.puppetlabs.com/browse/PUP-5441

Aug 9 2017, 8:20 AM · Patch-For-Review, User-Joe, puppet-compiler, Operations
Joe closed T133979: puppet compiler error on catalog with non-ascii output as Resolved.
Aug 9 2017, 7:43 AM · puppet-compiler, Operations
Joe added a comment to T133979: puppet compiler error on catalog with non-ascii output.

This is resolved now that we use our own differ.

Aug 9 2017, 7:42 AM · puppet-compiler, Operations
Joe closed T157496: A few hosts never get clean puppet compiler runs as Resolved.
Aug 9 2017, 7:42 AM · Prometheus-metrics-monitoring, puppet-compiler, Puppet, Operations
Joe added a comment to T157496: A few hosts never get clean puppet compiler runs.

Yes, this is a duplicate of T150456

Aug 9 2017, 7:42 AM · Prometheus-metrics-monitoring, puppet-compiler, Puppet, Operations
Joe closed T162949: hosts with puppet compiler failures on every run as Resolved.
Aug 9 2017, 6:53 AM · puppet-compiler, Operations
Joe added a comment to T162949: hosts with puppet compiler failures on every run.

As can be seen here

Aug 9 2017, 6:52 AM · puppet-compiler, Operations
Joe added a comment to T150456: puppet compiler fails with modules using puppetdb.

I wrote a first version of the script that can be used to populate puppetdb; I'll upload it via puppet to all the compiler machines for now so that we can populate the db easily.

Aug 9 2017, 6:35 AM · Patch-For-Review, User-Joe, puppet-compiler, Operations

Aug 8 2017

Joe added a comment to T150456: puppet compiler fails with modules using puppetdb.

Status update:

Aug 8 2017, 1:58 PM · Patch-For-Review, User-Joe, puppet-compiler, Operations

Aug 7 2017

Joe closed T172547: ::profile::puppetmaster::common missing dependencies when $storeconfigs=puppetdb as Resolved.
Aug 7 2017, 6:32 AM · Patch-For-Review, Cloud-VPS, Puppet

Aug 4 2017

Joe closed T149432: puppet compiler claims "no change" when catalogs are actually different as Resolved.
Aug 4 2017, 9:22 AM · Patch-For-Review, puppet-compiler, Operations
Joe added a comment to T149432: puppet compiler claims "no change" when catalogs are actually different.

This should be resolved with the new home-brewed differ:

Aug 4 2017, 9:22 AM · Patch-For-Review, puppet-compiler, Operations
Joe moved T150456: puppet compiler fails with modules using puppetdb from Backlog to Doing on the User-Joe board.
Aug 4 2017, 9:03 AM · Patch-For-Review, User-Joe, puppet-compiler, Operations
Joe moved T172362: New puppet compiler differ html escape from Backlog to Doing on the User-Joe board.
Aug 4 2017, 8:48 AM · Patch-For-Review, User-Joe, puppet-compiler
Joe closed T166888: CI for operations/puppet is taking too long as Resolved.
Aug 4 2017, 8:45 AM · User-Joe, Release-Engineering-Team (Kanban), Patch-For-Review, Operations, Continuous-Integration-Infrastructure
Joe closed T166888: CI for operations/puppet is taking too long, a subtask of T169548: Prepare for Puppet 4, as Resolved.
Aug 4 2017, 8:44 AM · User-Joe, Puppet, Operations
Joe added a comment to T166888: CI for operations/puppet is taking too long.

I did rewrite the Rakefile according to @faidon's suggestions, did tweak the dockerfile/run environment a bit, and now an average job takes less than 20 seconds to execute, with the simple changes taking less than 10 seconds. All this while running specs if needed.

Aug 4 2017, 8:44 AM · User-Joe, Release-Engineering-Team (Kanban), Patch-For-Review, Operations, Continuous-Integration-Infrastructure

Aug 3 2017

Joe added a project to T172362: New puppet compiler differ html escape: User-Joe.
Aug 3 2017, 2:14 PM · Patch-For-Review, User-Joe, puppet-compiler
Joe added a comment to T169998: RFC: Container path conventions.

I think the proposal is pretty sound - with a couple of suggestions to keep things more "familiar" for ops, and more in line with what we use for our production environment:

Aug 3 2017, 10:46 AM · Release-Engineering-Team (Watching / External), MediaWiki-Containers, User-mobrovac, Kubernetes, Services (designing)

Jul 31 2017

Joe updated the task description for T171704: Switch all hosts to the future parser.
Jul 31 2017, 6:52 AM · Patch-For-Review, User-Joe, Puppet, Operations