
Joe (Giuseppe Lavagetto)
Spy

Projects (22)

User Details

User Since
Oct 3 2014, 5:57 AM (233 w, 1 d)
Availability
Available
LDAP User
Giuseppe Lavagetto
MediaWiki User
GLavagetto (WMF) [ Global Accounts ]

Recent Activity

Yesterday

Joe added a comment to T218812: Provide the ability to have time-delayed or time-offset jobs in the job queue.

@Mooeypoo thanks for clarifying requirements of the current project better, it's much clearer now.

Fri, Mar 22, 6:52 AM · Services (watching), serviceops, Core Platform Team, Analytics, ChangeProp, EventBus, WMF-JobQueue, Core Platform Team Backlog (Next), TechCom, Community-Tech
Joe added a comment to T213493: Install PHP7 on scandium.

So first of all, why do wtp servers have php installed even? They should not, and they don't.

Fri, Mar 22, 6:31 AM · Patch-For-Review, Operations, Parsoid-PHP

Thu, Mar 21

Joe added a comment to T218812: Provide the ability to have time-delayed or time-offset jobs in the job queue.

I'm a bit conflicted about this, and let me clarify why:

Thu, Mar 21, 7:31 AM · Services (watching), serviceops, Core Platform Team, Analytics, ChangeProp, EventBus, WMF-JobQueue, Core Platform Team Backlog (Next), TechCom, Community-Tech
Joe added a project to T218812: Provide the ability to have time-delayed or time-offset jobs in the job queue: serviceops.
Thu, Mar 21, 7:23 AM · Services (watching), serviceops, Core Platform Team, Analytics, ChangeProp, EventBus, WMF-JobQueue, Core Platform Team Backlog (Next), TechCom, Community-Tech
Mill <mill@mail.com> committed rMSCAf4f665a41886: 4%5eaaaaaaaaaaaa (authored by Joe).
4%5eaaaaaaaaaaaa
Thu, Mar 21, 12:11 AM
Mill <mill@mail.com> committed rMSCA90c1d34088cb: u%5eaaaaaaaaaaaa (authored by Joe).
u%5eaaaaaaaaaaaa
Thu, Mar 21, 12:11 AM

Tue, Mar 19

Etonkovidova awarded T217938: Beta Cluster does not have php7.0-redis available a Yellow Medal token.
Tue, Mar 19, 5:13 PM · Patch-For-Review, PHP 7.0 support, Beta-Cluster-Infrastructure

Mon, Mar 18

Joe added a comment to T176916: Set up sampling profiler for PHP 7 (alternative to HHVM Xenon).

@Krinkle now that I'm back, what needs to be done on the SRE side for this to be considered done?

Mon, Mar 18, 4:22 PM · Core Platform Team Kanban (Doing), PHP 7.1 support, Core Platform Team (PHP7 (TEC4)), Performance-Team
mmodell awarded T217938: Beta Cluster does not have php7.0-redis available a Orange Medal token.
Mon, Mar 18, 2:12 PM · Patch-For-Review, PHP 7.0 support, Beta-Cluster-Infrastructure
Joe claimed T217938: Beta Cluster does not have php7.0-redis available.
Mon, Mar 18, 12:27 PM · Patch-For-Review, PHP 7.0 support, Beta-Cluster-Infrastructure
Joe added a comment to T217938: Beta Cluster does not have php7.0-redis available.

Digging further:

Mon, Mar 18, 12:27 PM · Patch-For-Review, PHP 7.0 support, Beta-Cluster-Infrastructure
Joe added a comment to T217938: Beta Cluster does not have php7.0-redis available.

Sorry for coming late to the party, I was AFK last week.

Mon, Mar 18, 12:19 PM · Patch-For-Review, PHP 7.0 support, Beta-Cluster-Infrastructure
Joe added a comment to T218005: Variable wmgUsePagedTiffHandler from InitialiseSettings undefined.

Looking at the logs on the server:

  • 2019-03-11T06:32:48 errors start
  • 2019-03-11T06:34:57 opcache received a request to invalidate a single file /opcache-free?file=wmf-config%2Fdb-eqiad.php, during a scap run.
  • 2019-03-11T06:49:09 last error is logged
  • 2019-03-11T06:49:22 a full opcache flush was requested /opcache-free - not sure what caused this, but it came from scap.
Mon, Mar 18, 9:26 AM · Patch-For-Review, PHP 7.2 support, Wikimedia-production-error, MediaWiki-extensions-PagedTiffHandler, User-DannyS712
Joe added a comment to T218005: Variable wmgUsePagedTiffHandler from InitialiseSettings undefined.

So while the errors found by @MaxSem on logstash seem to refer to a host that was somehow in a bad state in terms of opcache state, which will need further analysis, the other issue to fix is the display_errors ini setting, which used to be in our php configuration and I didn't modify.

Mon, Mar 18, 9:03 AM · Patch-For-Review, PHP 7.2 support, Wikimedia-production-error, MediaWiki-extensions-PagedTiffHandler, User-DannyS712

Tue, Mar 12

Joe added a comment to T217881: Decide whether to keep violating OpenAPI/Swagger specification in our REST services.

After reading the OpenAPI spec and examples, I think the best approach to address optional params is to make them query parameters.

So, an example path like this

/required{/optional1}{/optional2}

can be just

/required?optional1=value1&optional2=value2

As per the spec, query parameters can be optional. But this kind of change would mean that we need a new version of the API so that we don't break the old path-based approaches, plus a deprecation plan.
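
A minimal sketch of the path-to-query mapping described above, in Python. The helper name `build_url` and the parameter names are hypothetical and only illustrate the idea; this is not code from any existing service.

```python
from urllib.parse import quote, urlencode


def build_url(required, **optional):
    """Build '/<required>?k=v&...' and leave out optional params set to None.

    Hypothetical helper: it mirrors the example path above, where the
    optional path segments become optional query parameters instead.
    """
    # Keep only the optional parameters the caller actually supplied.
    params = {key: value for key, value in optional.items() if value is not None}
    path = "/" + quote(required, safe="")
    if not params:
        return path
    return path + "?" + urlencode(params)


print(build_url("required", optional1="value1", optional2="value2"))
print(build_url("required", optional1=None))
```

Because every optional value lives in the query string, a missing parameter simply disappears from the URL instead of leaving a hole in the path.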

Tue, Mar 12, 6:43 AM · Core Platform Team Kanban (Done with CPT), Services (done), TechCom, RESTBase-API, serviceops, Operations
Joe added a comment to T217881: Decide whether to keep violating OpenAPI/Swagger specification in our REST services.

Well, instead of copy/pasta, there is that thing called YAML anchors/references that deals with data repetition in YAML files.

Tue, Mar 12, 6:42 AM · Core Platform Team Kanban (Done with CPT), Services (done), TechCom, RESTBase-API, serviceops, Operations

Thu, Mar 7

Eevans awarded T217650: Deployment strategy for the session storage application. a Cookie token.
Thu, Mar 7, 8:14 PM · Patch-For-Review, Kubernetes, serviceops, Core Platform Team (Multi-DC (TEC1)), User-Clarakosi, Core Platform Team Backlog (Next), User-Eevans

Wed, Mar 6

Effie Mouzeli <effie@wikimedia.org> committed rMSCA218e431ec35d: Add tests for scap.cli (authored by Joe).
Add tests for scap.cli
Wed, Mar 6, 7:37 PM
Effie Mouzeli <effie@wikimedia.org> committed rMSCA569c262475c1: Remove functionality to talk to conftool (authored by Joe).
Remove functionality to talk to conftool
Wed, Mar 6, 7:37 PM

Tue, Mar 5

Joe closed T217587: Class 'Memcached' not found for php7 in beta as Resolved.
Tue, Mar 5, 8:06 PM · Release-Engineering-Team (Kanban), Beta-Cluster-Infrastructure, Scap
Joe added a comment to T217587: Class 'Memcached' not found for php7 in beta.

apt-get install php7.2-msgpack solved the problem FWIW.

Tue, Mar 5, 8:06 PM · Release-Engineering-Team (Kanban), Beta-Cluster-Infrastructure, Scap
Joe closed T217611: Deploy scap 3.9.2-1 as Resolved.
Tue, Mar 5, 2:32 PM · Release-Engineering-Team (Kanban), serviceops, Scap
Joe closed T217611: Deploy scap 3.9.2-1, a subtask of T217597: Scap: server_groups regression, as Resolved.
Tue, Mar 5, 2:32 PM · Patch-For-Review, Release-Engineering-Team (Kanban), serviceops, Scap
Joe added a comment to T208524: RfC: Standards for external services in the Wikimedia infrastructure..
Tue, Mar 5, 12:41 PM · TechCom-RFC (TechCom-Approved), serviceops
Joe updated the task description for T208524: RfC: Standards for external services in the Wikimedia infrastructure..
Tue, Mar 5, 12:40 PM · TechCom-RFC (TechCom-Approved), serviceops
Joe added a comment to T208524: RfC: Standards for external services in the Wikimedia infrastructure..

Production deployment
<snip>
Have backups if the service stores any data

Maybe we could add "and a restoration/emergency plan", since it is not always straightforward what to do with a backup or what is included in a backup.

Tue, Mar 5, 12:32 PM · TechCom-RFC (TechCom-Approved), serviceops
Joe updated the task description for T208524: RfC: Standards for external services in the Wikimedia infrastructure..
Tue, Mar 5, 12:32 PM · TechCom-RFC (TechCom-Approved), serviceops
Joe placed T217650: Deployment strategy for the session storage application. up for grabs.
Tue, Mar 5, 12:28 PM · Patch-For-Review, Kubernetes, serviceops, Core Platform Team (Multi-DC (TEC1)), User-Clarakosi, Core Platform Team Backlog (Next), User-Eevans
Joe triaged T217650: Deployment strategy for the session storage application. as Normal priority.
Tue, Mar 5, 12:26 PM · Patch-For-Review, Kubernetes, serviceops, Core Platform Team (Multi-DC (TEC1)), User-Clarakosi, Core Platform Team Backlog (Next), User-Eevans
Joe added a comment to T216456: JSTOR is blocking citoid IPs.

@Mvolz this means we can reconfigure citoid to use both proxies?

Tue, Mar 5, 7:41 AM · serviceops, Citoid

Mon, Mar 4

Joe updated the task description for T212828: SRE FY2019 Q3 goal: Ramp-up serving traffic to PHP 7 .
Mon, Mar 4, 9:45 AM · User-Joe, serviceops, Operations
Joe updated the task description for T212828: SRE FY2019 Q3 goal: Ramp-up serving traffic to PHP 7 .
Mon, Mar 4, 9:45 AM · User-Joe, serviceops, Operations
Joe closed T211964: Make scap and opcache work consistently together, a subtask of T176370: Migrate to PHP 7 in WMF production, as Resolved.
Mon, Mar 4, 9:44 AM · Core Platform Team Kanban (Doing), Core Platform Team (PHP7 (TEC4)), Patch-For-Review, TechCom-RFC (TechCom-Approved), User-ArielGlenn, HHVM, Operations
Joe closed T211964: Make scap and opcache work consistently together as Resolved.
Mon, Mar 4, 9:44 AM · User-Joe, Patch-For-Review, Scap, User-ArielGlenn, Operations

Thu, Feb 28

Joe added a comment to T217323: Beta site api.phps gets - RuntimeException: RedisConnectionPool requires a Redis client library.

Applies magic wand solved!

Thu, Feb 28, 5:54 PM · PHP 7.0 support, Beta-Cluster-Infrastructure
Joe added a comment to T201963: RFC: Modern Event Platform: Stream Intake Service.

Can we close this?

Thu, Feb 28, 6:31 AM · Core Platform Team Backlog (Watching / External), TechCom-RFC (TechCom-Approved), Services (watching), Analytics-EventLogging, EventBus, Analytics

Wed, Feb 27

Joe added a comment to T217020: Test different growth factors for memcached (prep step for upgrade to newer versions).

FWIW I would like to decrease the max key size rather than increase it.

Wed, Feb 27, 7:10 PM · User-jijiki, serviceops, Performance-Team (Radar), Operations, User-Elukey
Joe updated the task description for T208524: RfC: Standards for external services in the Wikimedia infrastructure..
Wed, Feb 27, 8:16 AM · TechCom-RFC (TechCom-Approved), serviceops
Joe added a comment to T208524: RfC: Standards for external services in the Wikimedia infrastructure..

Would it be possible to clarify the wording on "There is no existing FLOSS software that provides the same functionality"? I believe the intent here is about surveying the FLOSS ecosystem for well crafted, well maintained, architecturally compatible FLOSS software that provides comparable functionality before specifying and building new non-trivial standalone services.

Wed, Feb 27, 8:12 AM · TechCom-RFC (TechCom-Approved), serviceops
Joe added a comment to T208524: RfC: Standards for external services in the Wikimedia infrastructure..

I think that the

"Collect RED metrics; be able to export those metrics according to WMF standards specified in the implementation guidelines"

should be rephrased to be more generic, e.g.

"Be able to collect and expose operational metrics according to the current WMF standards specified in the implementation guidelines"

and then specify in the implementation guidelines that we want RED metrics, or their larger counterpart, the 4 golden signals[1] or perhaps radically different approaches.

[1] https://landing.google.com/sre/sre-book/chapters/monitoring-distributed-systems/#xref_monitoring_golden-signals

Wed, Feb 27, 8:10 AM · TechCom-RFC (TechCom-Approved), serviceops
Joe added a comment to T208524: RfC: Standards for external services in the Wikimedia infrastructure..

"Log all requests received via the production logging facilities"

Should we make this a bit more generic? e.g.

"Log actions via the production logging facilities"

Wed, Feb 27, 8:08 AM · TechCom-RFC (TechCom-Approved), serviceops

Tue, Feb 26

Joe committed rMSCAcb41af66b6c3: Fix OpcacheManager.invalidate_all() (authored by Joe).
Fix OpcacheManager.invalidate_all()
Tue, Feb 26, 4:11 PM
Joe added a comment to T216860: WMFTimeoutException in Special:Search.

ExcimerTimer reacts differently than HHVM set_time_limit?

As I understand it, there is a notable difference between PHP 7 timeouts and HHVM timeouts (if I recall correctly, PHP 7 measures CPU-time only, whereas HHVM counts wall-clock time). @tstarling added use of Excimer timeouts to approximate the HHVM-like behaviour under PHP 7.
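
To illustrate the CPU-time versus wall-clock distinction mentioned above, here is a small Python sketch. It does not model PHP 7, HHVM, or Excimer internals; it only assumes that a sleeping process uses almost no CPU time, and the helper name `measure` is made up for this example.

```python
import time


def measure(fn):
    """Return (wall_seconds, cpu_seconds) spent while running fn()."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time of this process only
    fn()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


# Sleeping stands in for waiting on I/O such as a slow backend request.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"wall={wall:.3f}s cpu={cpu:.3f}s")
```

A limit enforced on CPU time would barely notice this call, while a wall-clock limit would count the full wait, which is why the two kinds of timeout can behave so differently for requests that mostly wait on other services.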

Tue, Feb 26, 7:12 AM · Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, CirrusSearch, Wikimedia-production-error
Joe added a comment to T207703: Pruning docker-pkg images.

I think this just needs to be deployed at this point (which looking at /srv/deployment maybe hasn't happened in a while).

Tue, Feb 26, 6:28 AM · Patch-For-Review, docker-pkg, Continuous-Integration-Infrastructure

Mon, Feb 25

Joe committed rMSCA579fa8e18068: Explicitly pass the config to OpcacheManager.invalidate_all() (authored by Joe).
Explicitly pass the config to OpcacheManager.invalidate_all()
Mon, Feb 25, 5:45 PM
Joe added a comment to T192457: Reallocate former image scalers.

@Dzahn https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/485968/ removed the spare role from mw2151, but the host is still installed with role(spare) and puppet is failing:

Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Function Call, secret(): invalid secret mcrouter/mw2151.codfw.wmnet/mw2151.codfw.wmnet.crt.pem at /etc/puppet/modules/profile/manifests/mediawiki/mcrouter_wancache.pp:70:24 on node mw2151.codfw.wmnet
Mon, Feb 25, 8:56 AM · Patch-For-Review, Operations

Sat, Feb 23

Joe added a comment to T216676: Set up A/B testing mechanism for PHP7,.

@Joe I understand the choice between VCL in Varnish and client-side JS favouring the latter. While I'm not sure which exact issue you encountered, I can imagine several (e.g. difficult to produce rands in VCL, or wanting to avoid the cost server-side for bots/scrapers.)

Sat, Feb 23, 1:15 AM · MW-1.33-notes (1.33.0-wmf.19; 2019-02-26), Patch-For-Review, User-Joe, serviceops, Operations

Fri, Feb 22

Joe added a comment to T216712: Switch PHP 7.2 packages to an internal component.

As far as I can see we're using packages generated by the following source packages:

Fri, Feb 22, 7:40 AM · Patch-For-Review, cloud-services-team (Kanban), Toolforge, Operations
Joe added a comment to T216744: libpcre-related performance opportunities.

So, while jit is enabled by default on PHP 7.2 (pcre.jit is 1 by default), I don't see how perf could help in knowing how full the JIT VM is (which is what we want to measure probably).

Fri, Feb 22, 7:10 AM · Performance, Performance-Team
Joe added a comment to T122676: Implement sentinel for ORES production Redis.

Personally speaking, I think the real blocker is proper resourcing for doing this work, which isn't something we can do as a side job. Using redis-sentinel while maintaining our operational standards means we have to run tests, set up appropriate monitoring, learn how to recover from the inevitable failure scenarios.

Fri, Feb 22, 6:04 AM · Scoring-platform-team, ORES
Joe added a comment to T216689: Merge blocker: quibble-vendor-mysql-hhvm-docker in gate fails for most merges (exit status -11).

And -11 would be a segmentation fault.
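
As a quick check of that reading, here is a Python sketch that maps a negative exit status to a signal name. It assumes the convention used by Python's `subprocess` module, where a negative return code `-N` means the child was terminated by signal `N`; whether the CI tooling reports status the same way is an assumption. The helper name `describe_exit` is hypothetical.

```python
import signal


def describe_exit(status):
    """Describe an exit status, treating negative values as signal numbers."""
    if status < 0:
        # signal.Signals maps the platform's signal number to its name.
        return f"killed by {signal.Signals(-status).name}"
    return f"exited with code {status}"


print(describe_exit(-11))
print(describe_exit(0))
```

On Linux, signal 11 is SIGSEGV, which matches the segmentation fault reading.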

Fri, Feb 22, 5:50 AM · Wikimedia-production-error (Shared Build Failure), Language-Team (Language-2019-January-March), Continuous-Integration-Infrastructure, HHVM

Feb 21 2019

Joe updated the task description for T216676: Set up A/B testing mechanism for PHP7,.
Feb 21 2019, 6:43 AM · MW-1.33-notes (1.33.0-wmf.19; 2019-02-26), Patch-For-Review, User-Joe, serviceops, Operations
Joe triaged T216676: Set up A/B testing mechanism for PHP7, as Normal priority.
Feb 21 2019, 6:42 AM · MW-1.33-notes (1.33.0-wmf.19; 2019-02-26), Patch-For-Review, User-Joe, serviceops, Operations
Joe closed T210717: Find an alternative to HHVM curl connection pooling for PHP 7, a subtask of T176370: Migrate to PHP 7 in WMF production, as Resolved.
Feb 21 2019, 6:31 AM · Core Platform Team Kanban (Doing), Core Platform Team (PHP7 (TEC4)), Patch-For-Review, TechCom-RFC (TechCom-Approved), User-ArielGlenn, HHVM, Operations
Joe closed T210717: Find an alternative to HHVM curl connection pooling for PHP 7 as Resolved.

This task is resolved per se; we still might need the mw-config patches if they weren't merged.

Feb 21 2019, 6:31 AM · Patch-For-Review, Discovery-Search (Current work), serviceops, CirrusSearch, Operations
Joe added a comment to T214362: RFC: Store WikibaseQualityConstraint check data in persistent storage.
Feb 21 2019, 6:26 AM · Core Platform Team Backlog (Designing), Services (designing), User-mobrovac, wikidata-tech-focus, TechCom-RFC, Wikibase-Quality, Wikidata

Feb 20 2019

Joe committed rMSCA0e679e9cfe56: Remove functionality to talk to conftool (authored by Joe).
Remove functionality to talk to conftool
Feb 20 2019, 4:24 PM
Joe committed rMSCA251ece3beeff: Add tests for scap.cli (authored by Joe).
Add tests for scap.cli
Feb 20 2019, 4:19 PM
Joe added a comment to T176916: Set up sampling profiler for PHP 7 (alternative to HHVM Xenon).

On-demand profiling now works on mwdebug1002.

Feb 20 2019, 6:57 AM · Core Platform Team Kanban (Doing), PHP 7.1 support, Core Platform Team (PHP7 (TEC4)), Performance-Team

Feb 19 2019

Joe added a comment to T176916: Set up sampling profiler for PHP 7 (alternative to HHVM Xenon).

I did manually install php7.2-tideways-xhprof on mwdebug1001 and I now see the following error:

Feb 19 2019, 3:42 PM · Core Platform Team Kanban (Doing), PHP 7.1 support, Core Platform Team (PHP7 (TEC4)), Performance-Team
Joe added a comment to T176916: Set up sampling profiler for PHP 7 (alternative to HHVM Xenon).

So basically either we rewrite profiler.php to support the old tideways extension or (which I prefer at this point, tbh) we build just the xhprof component, which seems to have a smaller surface.

Feb 19 2019, 9:33 AM · Core Platform Team Kanban (Doing), PHP 7.1 support, Core Platform Team (PHP7 (TEC4)), Performance-Team
Joe added a comment to T176916: Set up sampling profiler for PHP 7 (alternative to HHVM Xenon).

I just ran a simple test to verify if profiling to xhgui works in PHP 7, after @Krinkle told me it doesn't.

Feb 19 2019, 9:01 AM · Core Platform Team Kanban (Doing), PHP 7.1 support, Core Platform Team (PHP7 (TEC4)), Performance-Team
Joe committed rMSCAbd1444349a84: Remove functionality to talk to conftool (authored by Joe).
Remove functionality to talk to conftool
Feb 19 2019, 8:26 AM
Joe committed rMSCAcb7d7c9ddef6: Rewrite the concurrency logic of OpcacheManager (authored by Joe).
Rewrite the concurrency logic of OpcacheManager
Feb 19 2019, 6:27 AM

Feb 15 2019

Joe committed rMSCAb66a4e6d8f0b: Rewrite the concurrency logic of OpcacheManager (authored by Joe).
Rewrite the concurrency logic of OpcacheManager
Feb 15 2019, 5:23 PM
Joe added a comment to T216164: Puppet failures on deployment-deploy01.deployment-prep.eqiad.wmflabs.

I did keep beta in mind when writing this, as it is apparent from labs/deployment-prep/common.yaml containing a hiera key to ensure we won't use or install the services proxy there.

Feb 15 2019, 7:00 AM · Patch-For-Review, Beta-Cluster-Infrastructure
Joe added a comment to T213318: Wikibase Front-End Architecture.

I frankly have a bit of a hard time imagining an IT person of the kind that commonly installs smaller wikis being able to efficiently maintain a zoo of services that we're now running in WMF. I think the model of "unpack the code, start httpd, welcome to the wiki world" should still be supported. Maybe not at 100% functionality, but at least basic things (like editing Wikibase) should work.

Another solution to this problem would be to require Docker and Kubernetes, which are both free and trivial to set up (especially if we distribute MediaWiki and all of its services with Helm).

Feb 15 2019, 6:53 AM · TechCom-RFC (TechCom-Approved), Wikidata
Joe added a comment to T151903: Special:Search performs DB writes on GET request.

Given how efficient the jobqueue is, we can expect that more than 99% of the preferences will be saved before a new search happens. Having some sort of protection against that 1% (like, localstorage "tricks") would help, but I'd argue that would be better than 50% of the calls needing a cross-datacenter write in terms of reliability and user experience.
But this is clearly a more general question: do we prefer to impose strong causality[1] at the risk of increasing the overall error rate, or do we prefer to accept asynchronicity and deal with the added complexity, while guaranteeing a better eventual consistency?

Feb 15 2019, 6:30 AM · Availability (MediaWiki-MultiDC), Discovery-Search, Discovery, CirrusSearch

Feb 14 2019

Joe added a comment to T216102: Determine which PHP version to target with Parsoid.

@ssastry you should aim at php 7.x support, with x >= 2.

Feb 14 2019, 3:22 PM · Patch-For-Review, Parsoid-PHP
Joe added a comment to T214362: RFC: Store WikibaseQualityConstraint check data in persistent storage.

In order to better understand your needs, let me ask you a few questions:

Feb 14 2019, 3:14 PM · Core Platform Team Backlog (Designing), Services (designing), User-mobrovac, wikidata-tech-focus, TechCom-RFC, Wikibase-Quality, Wikidata
Joe closed T212418: Memory error on restbase1016 as Resolved.
oblivian@restbase1016:~$ sudo -i pool-restbase 
oblivian@restbase1016:~$ echo $?
0
Feb 14 2019, 11:50 AM · Patch-For-Review, Core Platform Team Backlog (Watching / External), Services (watching), RESTBase-Cassandra, RESTBase, Operations
Joe closed T209136: python3-etcd needs python3-dnspython as Resolved.
Feb 14 2019, 11:49 AM · Core Platform Team Kanban (Done with CPT), Services (watching), Patch-For-Review, Operations, Operations-Software-Development
Joe closed T216084: mw1338 hhvm complaining intermittently about TC as Resolved.

The Jit TC cold portion was completely full and exhausted on that server. A restart of HHVM should've solved the issue.

Feb 14 2019, 11:44 AM · serviceops, HHVM, Operations, Wikimedia-production-error
Joe claimed T216084: mw1338 hhvm complaining intermittently about TC.
Feb 14 2019, 11:43 AM · serviceops, HHVM, Operations, Wikimedia-production-error
Joe added a project to T216084: mw1338 hhvm complaining intermittently about TC: serviceops.
Feb 14 2019, 9:36 AM · serviceops, HHVM, Operations, Wikimedia-production-error
Joe updated the task description for T208524: RfC: Standards for external services in the Wikimedia infrastructure..
Feb 14 2019, 9:35 AM · TechCom-RFC (TechCom-Approved), serviceops
Joe added a comment to T215046: RfC: Use Github login for mediawiki.org.

I think the privacy implications are a bit more nuanced than what you've stated, and should warrant a request to the legal department for a green light before we allow an external authentication provider to work on any site under our privacy policy, if we don't do this already (and AFAIK this would be a first, more or less).

Feb 14 2019, 6:31 AM · User-Tgr, Privacy, Security, TechCom-RFC, Wikimedia-General-or-Unknown, GitHub-Mirrors

Feb 13 2019

Joe added a comment to T212129: Use a multi-dc aware store for ObjectCache's MainStash if needed..

@jijiki what is the total number of items stored on those redises? So that I can understand how much of that is used by the sessions. I guess less than 1%?

Feb 13 2019, 11:14 AM · User-mobrovac, Services (doing), User-jijiki, Core Platform Team Kanban (Doing), Core Platform Team (Security, stability, performance and scalability (TEC1)), Performance-Team (Radar), Operations, MediaWiki-Cache, serviceops
Joe added a comment to T214073: Fix maps puppet to make sure apt-get update runs after configuration change.

@Gehel I would say what's missing is a clear dependency between the installation of the cassandra package and the apt-get update.

Feb 13 2019, 10:32 AM · Patch-For-Review, Puppet, Operations, Discovery-Search, Maps
Joe added a comment to T213708: Upgrade production prometheus-node-exporter to >= 0.16.

I've just noticed, based on a diffscan email, that the new version of prometheus-node-exporter ALSO binds to :::9100 on ipv6 and listens to all ipv6 clients, while the old node exporter version would only bind to a specific interface on ipv4 and listen on that interface.

Feb 13 2019, 10:29 AM · Patch-For-Review, Goal, monitoring, Operations

Feb 12 2019

Joe added a comment to T209136: python3-etcd needs python3-dnspython.

Please note this is fixed on jessie but not on stretch. I'm going to look into it now.

Feb 12 2019, 5:25 PM · Core Platform Team Kanban (Done with CPT), Services (watching), Patch-For-Review, Operations, Operations-Software-Development
Joe reopened T209136: python3-etcd needs python3-dnspython as "Open".
Feb 12 2019, 5:24 PM · Core Platform Team Kanban (Done with CPT), Services (watching), Patch-For-Review, Operations, Operations-Software-Development
Joe added a comment to T212418: Memory error on restbase1016.

this is a result of a defect in python3-etcd packaging (so, blame me!)

Feb 12 2019, 5:23 PM · Patch-For-Review, Core Platform Team Backlog (Watching / External), Services (watching), RESTBase-Cassandra, RESTBase, Operations
Joe closed T215481: Add Erika to techcom@ as Resolved.
Feb 12 2019, 5:11 PM · TechCom
Joe closed T215328: Requesting access to deployment, contint-admins, and contint-docker for Brennen Bearnes as Resolved.

@brennen your key was added to production; let me know if you have any problem accessing production servers, either here or on IRC.

Feb 12 2019, 1:19 PM · Patch-For-Review, Operations, SRE-Access-Requests
Joe closed T215328: Requesting access to deployment, contint-admins, and contint-docker for Brennen Bearnes, a subtask of T214556: Onboarding Brennen, as Resolved.
Feb 12 2019, 1:19 PM · User-greg, Release-Engineering-Team (Kanban)
Joe added a comment to T151903: Special:Search performs DB writes on GET request.

Wouldn't it be feasible to have the search request generate a simple job that saves that preference asynchronously?

Feb 12 2019, 7:27 AM · Availability (MediaWiki-MultiDC), Discovery-Search, Discovery, CirrusSearch
Joe closed T185195: tmpreaper doesn't play along with PrivateTmp systemd units as Resolved.
Feb 12 2019, 6:58 AM · Patch-For-Review, Operations, User-Elukey
Joe closed T185195: tmpreaper doesn't play along with PrivateTmp systemd units, a subtask of T132324: Tracking and Reducing cron-spam to root@ , as Resolved.
Feb 12 2019, 6:58 AM · Patch-For-Review, Operations
Joe added a comment to T185195: tmpreaper doesn't play along with PrivateTmp systemd units.

FYI, I merged a change yesterday that should fix the problem from now on.

Feb 12 2019, 6:55 AM · Patch-For-Review, Operations, User-Elukey

Feb 11 2019

Joe committed rMSCA73f63b5e5b6d: Fix invalidate_opcache (authored by Joe).
Fix invalidate_opcache
Feb 11 2019, 3:49 PM
Joe committed rMSCA938bdf479243: Fix invalidate_opcache (authored by Joe).
Fix invalidate_opcache
Feb 11 2019, 12:38 PM
Joe added a comment to T214130: Requesting access to production for dsharpe.

I will assume you can successfully access and just resolve the ticket. Please reopen it if any issue happens.

Feb 11 2019, 10:50 AM · SRE-Access-Requests, Operations
Joe closed T215376: mwscript dies on mwmaint with PHP=php7.2 due to php-redis missing, a subtask of T176370: Migrate to PHP 7 in WMF production, as Resolved.
Feb 11 2019, 10:45 AM · Core Platform Team Kanban (Doing), Core Platform Team (PHP7 (TEC4)), Patch-For-Review, TechCom-RFC (TechCom-Approved), User-ArielGlenn, HHVM, Operations
Joe closed T215376: mwscript dies on mwmaint with PHP=php7.2 due to php-redis missing as Resolved.
Feb 11 2019, 10:45 AM · User-jijiki, serviceops, Operations, Wikimedia-General-or-Unknown, PHP 7.2 support
Joe added a comment to T210717: Find an alternative to HHVM curl connection pooling for PHP 7.
Feb 11 2019, 5:22 AM · Patch-For-Review, Discovery-Search (Current work), serviceops, CirrusSearch, Operations

Feb 9 2019

Joe added projects to T215624: AMC – Opt-in for logged out users: Traffic, User-Joe, Operations.
Feb 9 2019, 11:03 AM · Operations, User-Joe, Traffic, Advanced Mobile Contributions
Joe added a comment to T215624: AMC – Opt-in for logged out users.

Thanks @phuedx for opening the task!

Feb 9 2019, 11:02 AM · Operations, User-Joe, Traffic, Advanced Mobile Contributions

Feb 8 2019

Joe added a comment to T215376: mwscript dies on mwmaint with PHP=php7.2 due to php-redis missing.

All the extensions were not upgraded at the time we did the 7.0 => 7.2 transition - my bad! Updating them manually was the only thing that was needed - we needed to update all of them and not just php-redis.

Feb 8 2019, 3:48 PM · User-jijiki, serviceops, Operations, Wikimedia-General-or-Unknown, PHP 7.2 support
Joe added a comment to T212129: Use a multi-dc aware store for ObjectCache's MainStash if needed..

One thing we'd need to make sure of is that the Session Storage API isn't designed to be a general-purpose key-value store. Brad covered it pretty well here. I think the primary feature that we use a lot in MW core and extensions is atomic increment, which is not important for sessions but pretty important for stats, counters, toggles, etc.

Feb 8 2019, 7:17 AM · User-mobrovac, Services (doing), User-jijiki, Core Platform Team Kanban (Doing), Core Platform Team (Security, stability, performance and scalability (TEC1)), Performance-Team (Radar), Operations, MediaWiki-Cache, serviceops