Joe (Giuseppe Lavagetto)
Spy

Projects (22)

User Details

User Since
Oct 3 2014, 5:57 AM (215 w, 5 d)
Availability
Available
LDAP User
Giuseppe Lavagetto
MediaWiki User
GLavagetto (WMF)

Recent Activity

Yesterday

Joe closed T206338: Allow directing users to PHP7 based on a cookie as Resolved.
Tue, Nov 20, 4:02 PM · Core Platform Team Backlog (Watching / External), Core Platform Team (PHP7 (TEC4)), Patch-For-Review, Operations
Joe closed T206338: Allow directing users to PHP7 based on a cookie, a subtask of T206336: SRE quarterly goal: Ability to serve a fraction of the production traffic from PHP7, as Resolved.
Tue, Nov 20, 4:02 PM · Operations

Mon, Nov 19

Joe added a comment to T209802: Cannot vote on votewiki.

Some more information before I get away on a vacation day:

Mon, Nov 19, 12:22 PM · Patch-For-Review, Operations, Wikimedia-production-error, MediaWiki-extensions-SecurePoll
Joe added a comment to T209802: Cannot vote on votewiki.

I tried to debug further what the problem is, given I'm not inclined to disable the use of lightprocesses site-wide (although now it should be less of an issue than it used to be).

Mon, Nov 19, 11:22 AM · Patch-For-Review, Operations, Wikimedia-production-error, MediaWiki-extensions-SecurePoll
Joe added a comment to T209802: Cannot vote on votewiki.

So, redefining $wgSecurePollTempDir doesn't do the trick:

Mon, Nov 19, 10:54 AM · Patch-For-Review, Operations, Wikimedia-production-error, MediaWiki-extensions-SecurePoll
Joe added a comment to T209802: Cannot vote on votewiki.

I just successfully obtained an encrypted message by running the same script and disabling the light_process feature of HHVM.

Mon, Nov 19, 10:46 AM · Patch-For-Review, Operations, Wikimedia-production-error, MediaWiki-extensions-SecurePoll
Joe added a comment to T209802: Cannot vote on votewiki.

So looking in the logs, it seems like a log event is generated for importing the key into gpg, but there is no log event for actually encrypting the voting record (the next step after importing the key). This makes me wonder if it's an issue with shelling out to gpg.

Further proof of this:

 bawolff@mwmaint1002:/srv/mediawiki/php-1.33.0-wmf.4/maintenance$ mwscript eval.php votewiki 
> $context = new SecurePoll_Context;
> $e = $context->getElection( 730 );
> $c = $e->getCrypt();
> $stat = $c->encrypt( 'test' );
 [process hangs]
Mon, Nov 19, 8:56 AM · Patch-For-Review, Operations, Wikimedia-production-error, MediaWiki-extensions-SecurePoll

Sun, Nov 18

Joe added a comment to T206341: Evaluate scalability and performance of PHP7 compared to HHVM.

@Joe Might be interesting to look at specific calls that appear to perform less well, to see if we can identify specific calls that are slower. xhprof/tideways might be an approach...

Sun, Nov 18, 10:28 AM · Performance-Team (Radar), Operations
Joe merged T209781: parsoid-rt.service keeps failing on ruthenium causing alerts in icinga into T209758: parsoid-rt repeated failures on ruthenium (parsoid::testing).
Sun, Nov 18, 10:12 AM · Parsoid, Operations
Joe merged task T209781: parsoid-rt.service keeps failing on ruthenium causing alerts in icinga into T209758: parsoid-rt repeated failures on ruthenium (parsoid::testing).
Sun, Nov 18, 10:12 AM · Operations, Parsoid

Sat, Nov 17

Joe added a project to T209754: db1078 (s3 candidate master) crashed : Operations.
Sat, Nov 17, 6:25 AM · Patch-For-Review, Operations, DBA

Fri, Nov 16

Joe added a comment to T206341: Evaluate scalability and performance of PHP7 compared to HHVM.

Last thing to note:

  • pm = static vs pm = dynamic didn't really change much for long-lasting requests; it made smaller requests faster, though, so it's a net win
  • we need some tool to inspect php-fpm's inner workings in order to find what is going on there. I might need to look at perf recordings to get an idea. phpspy might help too.
Fri, Nov 16, 3:23 PM · Performance-Team (Radar), Operations
Joe added a comment to T206341: Evaluate scalability and performance of PHP7 compared to HHVM.

Results for more endpoints:

Fri, Nov 16, 2:41 PM · Performance-Team (Radar), Operations
Joe added a comment to T206341: Evaluate scalability and performance of PHP7 compared to HHVM.

After a more thorough analysis of parsing the Obama page:

Fri, Nov 16, 11:29 AM · Performance-Team (Radar), Operations
Joe added a comment to T207994: revision-create events are sometimes emitted in a secondary DC.

So while I think my theory is pretty sound, we can just set eventbus to be active/passive and see if the events still get created in codfw or not. If that's the case, we have an explanation. @elukey what do you think?

Fri, Nov 16, 9:17 AM · User-Elukey, Core Platform Team (Security, stability, performance and scalability (TEC1)), Core Platform Team Backlog (Later), Analytics, EventBus, Services (later)
Joe added a comment to T207994: revision-create events are sometimes emitted in a secondary DC.

So one scenario in which this can happen is the following:

Fri, Nov 16, 9:14 AM · User-Elukey, Core Platform Team (Security, stability, performance and scalability (TEC1)), Core Platform Team Backlog (Later), Analytics, EventBus, Services (later)
Gilles awarded T206341: Evaluate scalability and performance of PHP7 compared to HHVM a Love token.
Fri, Nov 16, 8:48 AM · Performance-Team (Radar), Operations
Joe added a comment to T206341: Evaluate scalability and performance of PHP7 compared to HHVM.

Forcing a reparse of the Obama page by requesting
curl -g -b "PHP_ENGINE=php7" -H 'Host: en.wikipedia.org' 'http://mw1261.eqiad.wmnet/w/api.php?action=parse&text={{:Barack%20Obama}}'
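
As a rough illustration of the request above, the following Python sketch builds (but does not send) the same `action=parse` call. The helper name `build_parse_request` is hypothetical, and the idea that omitting the `PHP_ENGINE` cookie yields the default (HHVM) backend is an assumption rather than something stated in the feed. The braces are percent-encoded here, which should be equivalent to what `curl -g` sends unencoded.

```python
from typing import Optional
from urllib.parse import quote
from urllib.request import Request


def build_parse_request(backend: str, engine: Optional[str],
                        title: str = "Barack Obama") -> Request:
    """Build an action=parse request for one app server without sending it.

    `engine` is the value of the PHP_ENGINE cookie (e.g. "php7");
    pass None to omit the cookie (assumed to select the default engine).
    """
    # Transclude the page so the parser has to do a full parse of its wikitext.
    text = quote("{{:" + title + "}}", safe="")
    url = f"http://{backend}/w/api.php?action=parse&text={text}"
    headers = {"Host": "en.wikipedia.org"}
    if engine is not None:
        headers["Cookie"] = f"PHP_ENGINE={engine}"
    return Request(url, headers=headers)


if __name__ == "__main__":
    req = build_parse_request("mw1261.eqiad.wmnet", "php7")
    print(req.full_url)
    print(req.header_items())
```

Building one request with the cookie and one without would make it possible to time or diff the two responses with whatever HTTP client is at hand.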

Fri, Nov 16, 8:16 AM · Performance-Team (Radar), Operations
Joe added a comment to T206341: Evaluate scalability and performance of PHP7 compared to HHVM.

Have you diffed the output coming from HHVM and PHP7, to ensure that they're generating the same HTML for these pages?

Fri, Nov 16, 8:10 AM · Performance-Team (Radar), Operations

Thu, Nov 15

Ladsgroup awarded T206341: Evaluate scalability and performance of PHP7 compared to HHVM a Like token.
Thu, Nov 15, 6:19 PM · Performance-Team (Radar), Operations
Joe added a comment to T206341: Evaluate scalability and performance of PHP7 compared to HHVM.

Here are the first results that I feel comfortable sharing!

Thu, Nov 15, 6:05 PM · Performance-Team (Radar), Operations
Joe added a comment to T206341: Evaluate scalability and performance of PHP7 compared to HHVM.

Is there anything specific being asked of the Performance Team, or is this something that @Joe (or others) were planning to do?

Thu, Nov 15, 1:03 PM · Performance-Team (Radar), Operations
Joe triaged T209573: Gather metrics from php-fpm as High priority.
Thu, Nov 15, 9:55 AM · Patch-For-Review, User-Joe, Operations
Joe added a project to T209568: The icinga web interface can't read the icinga log file: Operations.
Thu, Nov 15, 9:11 AM · Patch-For-Review, Operations
Joe created T209568: The icinga web interface can't read the icinga log file.
Thu, Nov 15, 9:08 AM · Patch-For-Review, Operations

Wed, Nov 14

Joe updated the task description for T176370: Migrate to PHP 7 in WMF production.
Wed, Nov 14, 5:02 PM · Patch-For-Review, Core Platform Team Backlog (Watching / External), TechCom-RFC (TechCom-Approved), User-ArielGlenn, HHVM, Operations
Joe updated the task description for T208524: RfC: Standards for external services that integrate with MediaWiki.
Wed, Nov 14, 4:05 PM · TechCom, TechCom-RFC
Joe added a comment to T208524: RfC: Standards for external services that integrate with MediaWiki.

I will start moving the RfC on-wiki, and add some proposed amendments in the talk page.

Wed, Nov 14, 3:02 PM · TechCom, TechCom-RFC
Joe added a comment to T208524: RfC: Standards for external services that integrate with MediaWiki.

+1 to moving on-wiki.

I'll also throw in a quibble with this criterion: "A non-PHP language or framework exists that significantly simplifies implementation." To me it seems better to keep the focus here squarely on architectural principles, and to consider the language out of scope as an implementation detail (subject to criteria set elsewhere about things like how many languages we can practically support in production). For our purposes here, could we simply say, "An external tool or framework exists that significantly simplifies implementation"?

Wed, Nov 14, 3:01 PM · TechCom, TechCom-RFC
Joe placed T209456: Gerrit is down "502 Proxy Error" up for grabs.
Wed, Nov 14, 8:23 AM · Operations, Gerrit
Joe lowered the priority of T209456: Gerrit is down "502 Proxy Error" from Unbreak Now! to High.
Wed, Nov 14, 5:49 AM · Operations, Gerrit
Joe added a comment to T209456: Gerrit is down "502 Proxy Error".

I was still quite asleep, but I saw a series of broken pipes from sockets and jetty refusing to manage any new connection in the logs, so I just restarted gerrit. It is now working, so we can lower the priority of this ticket to "High" but the correlation with the new release seems clear to me.

Wed, Nov 14, 5:49 AM · Operations, Gerrit
Joe added a project to T209456: Gerrit is down "502 Proxy Error": Operations.
Wed, Nov 14, 5:44 AM · Operations, Gerrit
Joe claimed T209456: Gerrit is down "502 Proxy Error".
Wed, Nov 14, 5:44 AM · Operations, Gerrit

Tue, Nov 13

Joe added a comment to T208433: Package and install php 7.2 in place of php 7.0.

status update:

Tue, Nov 13, 4:42 PM · Patch-For-Review, User-Joe, Operations
Joe added a comment to T208524: RfC: Standards for external services that integrate with MediaWiki.

If no one else wants to pick this up, I'm willing to work on this RfC. The fact I didn't participate in the session can be both an advantage (I have a fresh perspective) and a disadvantage (I don't have more background than what's written here).

Tue, Nov 13, 1:02 PM · TechCom, TechCom-RFC
Joe added a comment to T209265: Validate no namespaced keys are present in hieradata/*.yaml.

IIRC this is because of the expand_data directive in https://github.com/wikimedia/puppet/blob/production/modules/puppetmaster/files/production.hiera.yaml#L8

It's confusing indeed. That being said @Joe has expressed an interest in unifying the hiera backends so maybe we can get rid of this behavior.

Tue, Nov 13, 1:00 PM · Patch-For-Review, Puppet
Joe added a comment to T209271: improve docker registry architecture.

I would say that this sounds like a better direction to go into, yes.

Tue, Nov 13, 12:42 PM · Continuous-Integration-Infrastructure (shipyard), Kubernetes, Operations

Fri, Nov 9

Joe added a comment to T208524: RfC: Standards for external services that integrate with MediaWiki.

Since I got no feedback on the proposal to move this on-wiki, I'll start commenting and amending here, although I will eventually move everything on-wiki.

Fri, Nov 9, 8:10 AM · TechCom, TechCom-RFC
Joe added a comment to T208524: RfC: Standards for external services that integrate with MediaWiki.

@Joe the only distinction I see there is "stuff we write" vs "stuff we don't write", really.

But you are right that criteria for data storage technology selection don't belong here. But "we use an external service for storing X" should follow from this guideline, even if that external service is Redis or Swift. But the criteria for deciding *which* data storage tech to use do not belong here.

Actually, Swift over local files is a good example for a case where "inline" functionality was replaced with an off-the-shelf service. And the criteria developed here should apply to, and be consistent with, that decision.

Fri, Nov 9, 7:39 AM · TechCom, TechCom-RFC

Thu, Nov 8

Joe added a comment to T208524: RfC: Standards for external services that integrate with MediaWiki.

@daniel one detail I'd want to add: external storage systems like mysql or redis should not be considered "external services" - they're part of where MediaWiki stores its state.

Thu, Nov 8, 11:14 AM · TechCom, TechCom-RFC
Bawolff awarded T208524: RfC: Standards for external services that integrate with MediaWiki a Love token.
Thu, Nov 8, 11:12 AM · TechCom, TechCom-RFC
Joe added a comment to T208524: RfC: Standards for external services that integrate with MediaWiki.

I think @daniel raises quite a few good points; I will try to go through the document and integrate some more observations in the coming days. But I'd first propose we move the discussion mostly on-wiki: this is a big wall of text, and I feel a wiki page is a better place to collaboratively enhance it.

Thu, Nov 8, 11:01 AM · TechCom, TechCom-RFC

Wed, Nov 7

Joe updated the task description for T208433: Package and install php 7.2 in place of php 7.0.
Wed, Nov 7, 11:04 AM · Patch-For-Review, User-Joe, Operations

Tue, Nov 6

Joe updated the task description for T176370: Migrate to PHP 7 in WMF production.
Tue, Nov 6, 7:10 AM · Patch-For-Review, Core Platform Team Backlog (Watching / External), TechCom-RFC (TechCom-Approved), User-ArielGlenn, HHVM, Operations
Joe added a comment to T176370: Migrate to PHP 7 in WMF production.

Important status update:

Tue, Nov 6, 7:09 AM · Patch-For-Review, Core Platform Team Backlog (Watching / External), TechCom-RFC (TechCom-Approved), User-ArielGlenn, HHVM, Operations
Joe updated the task description for T176370: Migrate to PHP 7 in WMF production.
Tue, Nov 6, 7:07 AM · Patch-For-Review, Core Platform Team Backlog (Watching / External), TechCom-RFC (TechCom-Approved), User-ArielGlenn, HHVM, Operations

Fri, Nov 2

Joe added a comment to T208549: HHVM CPU usage when deploying MediaWiki.

So, we tried to redeploy the code on the mwdebug servers today, and I load tested both servers with ab before and after the deployment of the code.

Fri, Nov 2, 5:08 PM · Wikimedia-production-error, Release-Engineering-Team, Operations
Joe added a comment to T208549: HHVM CPU usage when deploying MediaWiki.

What I briefly observed yesterday night during the outage was:

Fri, Nov 2, 4:30 PM · Wikimedia-production-error, Release-Engineering-Team, Operations
Joe closed T201140: Puppetize the installation of PHP-FPM on the MediaWiki hosts, a subtask of T176370: Migrate to PHP 7 in WMF production, as Resolved.
Fri, Nov 2, 10:06 AM · Patch-For-Review, Core Platform Team Backlog (Watching / External), TechCom-RFC (TechCom-Approved), User-ArielGlenn, HHVM, Operations
Joe closed T201140: Puppetize the installation of PHP-FPM on the MediaWiki hosts as Resolved.
Fri, Nov 2, 10:06 AM · Patch-For-Review, User-Joe, User-ArielGlenn, Operations

Wed, Oct 31

Joe updated subscribers of T208188: Proposal for partial opt-out method for Content security policy.
Wed, Oct 31, 8:43 PM · TechCom-RFC, TechCom, Security-Team, Security
Joe claimed T208433: Package and install php 7.2 in place of php 7.0.
Wed, Oct 31, 3:33 PM · Patch-For-Review, User-Joe, Operations
Joe triaged T208433: Package and install php 7.2 in place of php 7.0 as High priority.
Wed, Oct 31, 3:32 PM · Patch-For-Review, User-Joe, Operations
Joe added a comment to T208272: codfw row C recable and add QFX.

To be clear: I think we should do the maintenance without depooling anything and check what would happen when we lose a row, even if in an inactive datacenter. But we should do that (call it an extreme chaos engineering drill) when most people are not on vacation.

Wed, Oct 31, 10:45 AM · Patch-For-Review, ops-codfw, Operations, netops
Joe updated subscribers of T208272: codfw row C recable and add QFX.

Here is the full list of hosts in that row. No outages expected, but brief (5s) connectivity interruption for some racks is possible.
CCing services owners, to know if it's an acceptable risk and if it can be mitigated by depooling services.

Wed, Oct 31, 10:43 AM · Patch-For-Review, ops-codfw, Operations, netops

Mon, Oct 29

Joe created P7727 dumpvars.
Mon, Oct 29, 8:20 AM

Sun, Oct 28

Joe added a comment to T208108: httpd class and php7.0 - conflict with mpm_event module.

I think we should stop using mod_php *anywhere*. We should really use php-fpm for anything that is not explicitly known not to work with fcgi (and I wonder, what that might be).

Sun, Oct 28, 11:47 AM · Patch-For-Review, Operations

Wed, Oct 24

Joe added a comment to P7712 Apache proxy fcgi headers sent.

WMF uses:

Wed, Oct 24, 8:46 AM
Joe added a comment to T201963: RFC: Modern Event Platform: Stream Intake Service.

I think in general it's ok to go with the nodejs rewrite - I only hope we've evaluated carefully that this service will not be very cpu-intensive; as we know, all systems that mock concurrency by using an event loop and non-blocking i/o are inherently limited in the amount of computational work they can do by running on a single CPU core.
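
The single-core limitation mentioned above can be sketched with Python's `asyncio`, used here only as a stand-in for the Node.js event loop (an assumption for illustration; the service under discussion is not written in Python). The names `cpu_bound`, `handler` and `run_demo` are hypothetical. Two "requests" are started concurrently, but because the CPU-bound work never yields to the loop, the second cannot start until the first has finished.

```python
import asyncio
from typing import List


def cpu_bound(n: int) -> int:
    """Pure computation: contains no await, so it never yields to the event loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


async def handler(name: str, log: List[str], n: int) -> None:
    """Simulate a request handler whose work is computational, not I/O."""
    log.append(f"{name} start")
    cpu_bound(n)  # the loop is blocked for the whole duration of this call
    log.append(f"{name} end")


async def main() -> List[str]:
    log: List[str] = []
    # Both handlers are scheduled "concurrently" on the same single-threaded loop.
    await asyncio.gather(handler("a", log, 200_000), handler("b", log, 200_000))
    return log


def run_demo() -> List[str]:
    return asyncio.run(main())


if __name__ == "__main__":
    print(run_demo())
```

With non-blocking I/O in place of `cpu_bound`, the two handlers would interleave; with computation they run strictly one after the other, which is why a CPU-heavy service gains little from an event loop alone.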

Wed, Oct 24, 7:23 AM · TechCom-RFC, Services (watching), Analytics-EventLogging, EventBus, Analytics, Analytics-Kanban

Tue, Oct 23

Joe created P7712 Apache proxy fcgi headers sent.
Tue, Oct 23, 1:14 PM

Oct 22 2018

Joe added a comment to T196968: Re-organize the apache configuration for MediaWiki in puppet.

This is now fully done in both deployment-prep (beta) and production.

Oct 22 2018, 1:22 PM · User-Joe, Patch-For-Review, Wikimedia-Apache-configuration, Operations
Joe closed T196968: Re-organize the apache configuration for MediaWiki in puppet, a subtask of T176370: Migrate to PHP 7 in WMF production, as Resolved.
Oct 22 2018, 1:21 PM · Patch-For-Review, Core Platform Team Backlog (Watching / External), TechCom-RFC (TechCom-Approved), User-ArielGlenn, HHVM, Operations
Joe closed T196968: Re-organize the apache configuration for MediaWiki in puppet as Resolved.
Oct 22 2018, 1:21 PM · User-Joe, Patch-For-Review, Wikimedia-Apache-configuration, Operations
Joe closed T196968: Re-organize the apache configuration for MediaWiki in puppet, a subtask of T87220: Minimize differences between beta and production (Tracking), as Resolved.
Oct 22 2018, 1:21 PM · Technical-Debt, Tracking, Operations, Puppet, Beta-Cluster-Infrastructure

Oct 19 2018

Krenair awarded T196968: Re-organize the apache configuration for MediaWiki in puppet a Mountain of Wealth token.
Oct 19 2018, 3:38 PM · User-Joe, Patch-For-Review, Wikimedia-Apache-configuration, Operations

Oct 17 2018

Dzahn awarded T196968: Re-organize the apache configuration for MediaWiki in puppet a Barnstar token.
Oct 17 2018, 6:03 PM · User-Joe, Patch-For-Review, Wikimedia-Apache-configuration, Operations
Joe closed T206841: Evaluate the consequences of the parsercache being empty post-switchover as Resolved.
Oct 17 2018, 7:58 AM · User-Joe, Datacenter-Switchover-2018, DBA, Operations
Bawolff awarded T206841: Evaluate the consequences of the parsercache being empty post-switchover a Yellow Medal token.
Oct 17 2018, 12:11 AM · User-Joe, Datacenter-Switchover-2018, DBA, Operations

Oct 15 2018

akosiaris awarded T206841: Evaluate the consequences of the parsercache being empty post-switchover a Love token.
Oct 15 2018, 3:35 PM · User-Joe, Datacenter-Switchover-2018, DBA, Operations
Joe added a comment to T205911: Track and install additional npm packages for all service container images.

My suggestion would be to create a nodeX-wmf-servicerunner-base image (or something with a less atrocious name) on top of our basic nodejs image.

Oct 15 2018, 2:05 PM · Release-Engineering-Team (Watching / External), Core Platform Team Backlog (Watching / External), Services (watching), Operations, Release Pipeline

Oct 12 2018

Joe added a comment to T206841: Evaluate the consequences of the parsercache being empty post-switchover.

Overall the absence of any valid parsercache entries can explain all the effects we've seen, except at least partially the very high volume of memcached errors.

Oct 12 2018, 3:05 PM · User-Joe, Datacenter-Switchover-2018, DBA, Operations
Joe added a comment to T206841: Evaluate the consequences of the parsercache being empty post-switchover.

As far as MediaWiki fatals go, we had far fewer issues than one would expect given the graphs above. We had only ~1000 fatals due to requests exceeding 60 seconds, and a bunch more for other somewhat related causes.

Oct 12 2018, 3:05 PM · User-Joe, Datacenter-Switchover-2018, DBA, Operations
Joe added a comment to T206841: Evaluate the consequences of the parsercache being empty post-switchover.

At the same time, a higher time for processing a single request meant that even in front of a substantially constant request flow

Oct 12 2018, 2:45 PM · User-Joe, Datacenter-Switchover-2018, DBA, Operations
Joe added a comment to T206841: Evaluate the consequences of the parsercache being empty post-switchover.

With memcached wiped clean, and the parsercache databases basically void of useful content, almost all requests needed to be fully parsed by our application servers. This had a series of consequences:

Oct 12 2018, 2:41 PM · User-Joe, Datacenter-Switchover-2018, DBA, Operations
Joe added a comment to T206841: Evaluate the consequences of the parsercache being empty post-switchover.

So first, what I think might be the full root cause of everything: when we switched from codfw to eqiad, the parser cache hit ratio dropped suddenly to 6% in a matter of minutes. At 15:00, when we started a full recovery, the hit rate was still only 18% but growing steadily.

Oct 12 2018, 1:54 PM · User-Joe, Datacenter-Switchover-2018, DBA, Operations
Joe renamed T206841: Evaluate the consequences of the parsercache being empty post-switchover from Evaluate the consequences of the parsercache fiasco post-switchover to Evaluate the consequences of the parsercache being empty post-switchover.
Oct 12 2018, 9:51 AM · User-Joe, Datacenter-Switchover-2018, DBA, Operations
Joe renamed T206841: Evaluate the consequences of the parsercache being empty post-switchover from Evaluate the consequences of the parsercache fiaso post-switchover to Evaluate the consequences of the parsercache fiasco post-switchover.
Oct 12 2018, 9:48 AM · User-Joe, Datacenter-Switchover-2018, DBA, Operations
Joe added a comment to T206841: Evaluate the consequences of the parsercache being empty post-switchover.

First, the timeline:

Oct 12 2018, 9:26 AM · User-Joe, Datacenter-Switchover-2018, DBA, Operations
Joe created T206841: Evaluate the consequences of the parsercache being empty post-switchover.
Oct 12 2018, 9:05 AM · User-Joe, Datacenter-Switchover-2018, DBA, Operations

Oct 11 2018

Joe edited P7661 pool_services.sh.
Oct 11 2018, 2:40 PM · Datacenter-Switchover-2018, Operations
akosiaris awarded P7661 pool_services.sh a Like token.
Oct 11 2018, 11:15 AM · Datacenter-Switchover-2018, Operations
Joe created P7661 pool_services.sh.
Oct 11 2018, 11:14 AM · Datacenter-Switchover-2018, Operations

Oct 8 2018

Joe added a parent task for T196685: rack/setup/install rdb10[09|10].eqiad.wmnet: T206450: Reorganize our redis rdb1/rdb2 clusters.
Oct 8 2018, 9:32 AM · User-Joe, User-Elukey, Operations
Joe added a subtask for T206450: Reorganize our redis rdb1/rdb2 clusters: T196685: rack/setup/install rdb10[09|10].eqiad.wmnet.
Oct 8 2018, 9:32 AM · Patch-For-Review, User-jijiki, User-Joe, Operations
Joe created T206450: Reorganize our redis rdb1/rdb2 clusters.
Oct 8 2018, 9:30 AM · Patch-For-Review, User-jijiki, User-Joe, Operations

Oct 5 2018

Joe created T206341: Evaluate scalability and performance of PHP7 compared to HHVM.
Oct 5 2018, 4:09 PM · Performance-Team (Radar), Operations
Imarlier awarded T206336: SRE quarterly goal: Ability to serve a fraction of the production traffic from PHP7 a Love token.
Oct 5 2018, 4:08 PM · Operations
Joe created T206339: Separate Traffic layer caches for PHP7/HHVM.
Oct 5 2018, 4:03 PM · Traffic, Operations
Joe added a project to T206336: SRE quarterly goal: Ability to serve a fraction of the production traffic from PHP7: Operations.
Oct 5 2018, 3:46 PM · Operations
Joe created T206338: Allow directing users to PHP7 based on a cookie.
Oct 5 2018, 3:46 PM · Core Platform Team Backlog (Watching / External), Core Platform Team (PHP7 (TEC4)), Patch-For-Review, Operations
Joe created T206336: SRE quarterly goal: Ability to serve a fraction of the production traffic from PHP7.
Oct 5 2018, 3:25 PM · Operations

Oct 4 2018

Joe added a comment to T206166: puppet compiler set to eqiad as primary dc while prod is codfw.

The master DC is a variable, and while in production that's dynamically generated from etcd (more or less), in the compiler it is a static value. That was a deliberate choice to decouple the compiler from the current state of production.

Oct 4 2018, 4:03 PM · Operations
Joe edited P7626 Apache config with sticky loadbalancer.
Oct 4 2018, 10:04 AM
Joe edited P7626 Apache config with sticky loadbalancer.
Oct 4 2018, 8:01 AM
Joe created P7626 Apache config with sticky loadbalancer.
Oct 4 2018, 7:46 AM

Oct 3 2018

Joe added a comment to T206003: Beta Cluster: Parsoid config request failures from the MediaWiki API.

Sorry, I misinterpreted the issue. We are connecting via http even though the configuration says https, so Arlo's patch is all that's needed.

Oct 3 2018, 6:10 PM · Services (done), Parsoid, Beta-Cluster-Infrastructure
Joe added a comment to T206003: Beta Cluster: Parsoid config request failures from the MediaWiki API.

While @Arlolra's patch surely does a sensible thing, I am not sure we have an https interface to mediawiki in beta. I probably didn't think of this when I did the transition in production. This is easy to solve though, and can go in parallel with what @Arlolra is doing (which is indeed correct anyway).

Oct 3 2018, 5:58 PM · Services (done), Parsoid, Beta-Cluster-Infrastructure