
Eevans (Eric Evans)
Senior Software Engineer

Projects (13)

User Details

User Since
Feb 27 2015, 10:47 PM (254 w, 6 d)
Availability
Available
IRC Nick
urandom
LDAP User
Eevans
MediaWiki User
Unknown

Recent Activity

Today

Eevans updated the task description for T243000: Bootstrap new Cassandra instances: restbase202[123]-{a,b,c}.
Fri, Jan 17, 9:56 PM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations
Eevans updated the task description for T243106: Phased rollout of sessionstore to production fleet.
Fri, Jan 17, 9:41 PM · Patch-For-Review, TPG-Epics (Team Practices Group Coaching Clinic), CPT Initiatives (Multi-DC (TEC1)), User-Clarakosi, User-Eevans
Eevans updated the task description for T243106: Phased rollout of sessionstore to production fleet.
Fri, Jan 17, 9:40 PM · Patch-For-Review, TPG-Epics (Team Practices Group Coaching Clinic), CPT Initiatives (Multi-DC (TEC1)), User-Clarakosi, User-Eevans
Eevans created T243106: Phased rollout of sessionstore to production fleet.
Fri, Jan 17, 9:36 PM · Patch-For-Review, TPG-Epics (Team Practices Group Coaching Clinic), CPT Initiatives (Multi-DC (TEC1)), User-Clarakosi, User-Eevans
Eevans updated the task description for T243000: Bootstrap new Cassandra instances: restbase202[123]-{a,b,c}.
Fri, Jan 17, 8:07 PM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations
Eevans updated the task description for T243000: Bootstrap new Cassandra instances: restbase202[123]-{a,b,c}.
Fri, Jan 17, 5:13 PM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations
Eevans updated the task description for T243000: Bootstrap new Cassandra instances: restbase202[123]-{a,b,c}.
Fri, Jan 17, 2:32 PM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations
Eevans updated the task description for T243000: Bootstrap new Cassandra instances: restbase202[123]-{a,b,c}.
Fri, Jan 17, 2:45 AM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations
Eevans updated the task description for T243000: Bootstrap new Cassandra instances: restbase202[123]-{a,b,c}.
Fri, Jan 17, 12:30 AM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations

Yesterday

Eevans updated the task description for T243000: Bootstrap new Cassandra instances: restbase202[123]-{a,b,c}.
Thu, Jan 16, 10:21 PM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations
Eevans updated the task description for T243000: Bootstrap new Cassandra instances: restbase202[123]-{a,b,c}.
Thu, Jan 16, 9:30 PM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations
Eevans added a comment to T234286: Multi-DC Echo Notification Storage.

TTBMK, everything here is done.

Thu, Jan 16, 7:21 PM · Growth-Team, Notifications, Core Platform Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans claimed T234296: Completed migration.

Done.

Thu, Jan 16, 7:21 PM · Growth-Team, Notifications, Core Platform Team Workboards (User Stories), Story, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans claimed T234963: Deploy final configuration.

Done.

Thu, Jan 16, 7:20 PM · Core Platform Team Workboards (Clinic Duty Team), Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans edited projects for T241784: (No Need By Date) rack/setup/install restbase1029, restbase1029, restbase1030, added: Core Platform Team Workboards (Clinic Duty Team); removed Core Platform Team.
Thu, Jan 16, 6:23 PM · Core Platform Team Workboards (Clinic Duty Team), ops-eqiad, Operations
Eevans edited projects for T241790: (No Need By Date Provided) rack/setup/install restbase202[123], added: Core Platform Team Workboards (Clinic Duty Team); removed Core Platform Team.
Thu, Jan 16, 6:23 PM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations
Eevans triaged T243000: Bootstrap new Cassandra instances: restbase202[123]-{a,b,c} as Medium priority.
Thu, Jan 16, 5:56 PM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations
Eevans moved T243000: Bootstrap new Cassandra instances: restbase202[123]-{a,b,c} from Inbox to Doing on the Core Platform Team Workboards (Clinic Duty Team) board.
Thu, Jan 16, 5:56 PM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations
Eevans edited projects for T243000: Bootstrap new Cassandra instances: restbase202[123]-{a,b,c}, added: Core Platform Team Workboards (Clinic Duty Team); removed Core Platform Team.
Thu, Jan 16, 5:55 PM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations
Eevans created T243000: Bootstrap new Cassandra instances: restbase202[123]-{a,b,c}.
Thu, Jan 16, 5:55 PM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations

Wed, Jan 15

Eevans moved T234963: Deploy final configuration from Doing to Waiting for Review on the Core Platform Team Workboards (Clinic Duty Team) board.
Wed, Jan 15, 9:14 PM · Core Platform Team Workboards (Clinic Duty Team), Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans moved T234963: Deploy final configuration from Inbox to Doing on the Core Platform Team Workboards (Clinic Duty Team) board.
Wed, Jan 15, 9:14 PM · Core Platform Team Workboards (Clinic Duty Team), Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans triaged T234963: Deploy final configuration as Medium priority.
Wed, Jan 15, 9:13 PM · Core Platform Team Workboards (Clinic Duty Team), Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)

Mon, Jan 13

Eevans added a comment to T242461: restrouter.svc.{eqiad,codfw}.wmnet in a failed state.

Since (long-term) we aim to replace all of this, is abandoning it entirely an option?

Is it possible to take it out for now until we either prioritize it again or drop it entirely?

You mean undeploy? Sure we can undeploy it. The only caveat being that redeploying it will take some time as we will need to create the necessary resources again (LVS entries, DNS, kubernetes namespaces etc).

We're running CI for RESTBase in both RESTBase and RESTRouter modes, so it will be in mostly deployable state if we want to put it back online, however maintaining an unused production deployment seems like a waste.

Indeed.

A lot has changed since we began this migration, including https://www.mediawiki.org/wiki/Core_Platform_Team/Decisions_Architecture_Research_Documentation/Services_Architecture_Recommendations_(2019), which is expected to be a lengthy process, but will ultimately result in a REST{Router,Base}-less world. I guess the question we should be asking is: Is this still something we should do in the meantime (and schedule and resource to complete), or should we cut bait, undeploy from k8s, and leave things as they are?
@WDoranWMF ?

Mon, Jan 13, 8:51 PM · serviceops, Core Platform Team Workboards (Clinic Duty Team)
Eevans added a comment to T242461: restrouter.svc.{eqiad,codfw}.wmnet in a failed state.

Since (long-term) we aim to replace all of this, is abandoning it entirely an option?

Is it possible to take it out for now until we either prioritize it again or drop it entirely?

You mean undeploy? Sure we can undeploy it. The only caveat being that redeploying it will take some time as we will need to create the necessary resources again (LVS entries, DNS, kubernetes namespaces etc).

We're running CI for RESTBase in both RESTBase and RESTRouter modes, so it will be in mostly deployable state if we want to put it back online, however maintaining an unused production deployment seems like a waste.

Indeed.

Mon, Jan 13, 4:43 PM · serviceops, Core Platform Team Workboards (Clinic Duty Team)

Fri, Jan 10

Eevans triaged T242461: restrouter.svc.{eqiad,codfw}.wmnet in a failed state as Medium priority.

It's not clear to me what the status of this is. Do we need to deploy the latest code here? Since (long-term) we aim to replace all of this, is abandoning it entirely an option?

Fri, Jan 10, 8:39 PM · serviceops, Core Platform Team Workboards (Clinic Duty Team)
Eevans created T242461: restrouter.svc.{eqiad,codfw}.wmnet in a failed state.
Fri, Jan 10, 8:35 PM · serviceops, Core Platform Team Workboards (Clinic Duty Team)
Eevans added a comment to T242344: Remove Parsoid-JS tables from Cassandra.

The tables have been dropped in all 3 environments. The only thing remaining is to clear the snapshots (and actually reclaim the space). Out of an abundance of caution, I'll sit on this for a couple days and close the ticket once complete.

Fri, Jan 10, 8:27 PM · Core Platform Team Workboards (Clinic Duty Team), Parsoid-PHP, RESTBase
Eevans added a comment to T242344: Remove Parsoid-JS tables from Cassandra.

OK, here is what I propose applying; Review appreciated!

Fri, Jan 10, 7:51 PM · Core Platform Team Workboards (Clinic Duty Team), Parsoid-PHP, RESTBase
Eevans created P10118 deployment-prep.yaml.
Fri, Jan 10, 7:50 PM
Eevans created P10116 dev.yaml.
Fri, Jan 10, 7:49 PM
Eevans created P10115 production.yaml.
Fri, Jan 10, 7:47 PM
Eevans updated the task description for T242344: Remove Parsoid-JS tables from Cassandra.
Fri, Jan 10, 7:35 PM · Core Platform Team Workboards (Clinic Duty Team), Parsoid-PHP, RESTBase
Eevans triaged T242344: Remove Parsoid-JS tables from Cassandra as Medium priority.
Fri, Jan 10, 7:34 PM · Core Platform Team Workboards (Clinic Duty Team), Parsoid-PHP, RESTBase
Eevans edited projects for T241068: Restrouter health checks fail when local wikifeeds instance is not pool in discovery records, added: Core Platform Team Workboards (Clinic Duty Team); removed Core Platform Team.
Fri, Jan 10, 5:47 PM · Core Platform Team Workboards (Clinic Duty Team), serviceops-radar
Eevans edited projects for T178445: flapping monitoring for recommendation_api on scb, added: Core Platform Team Workboards (Clinic Duty Team); removed Core Platform Team.
Fri, Jan 10, 5:43 PM · Core Platform Team Workboards (Clinic Duty Team), Recommendation-API, Discovery, Services (watching), Wikidata, Operations, observability
Eevans edited projects for T241905: Investigate JobQueue outage from 2020-01-04 22:00 UTC, added: Core Platform Team Workboards (Clinic Duty Team); removed Core Platform Team.
Fri, Jan 10, 5:41 PM · Core Platform Team Workboards (Clinic Duty Team), Wikimedia-Incident, WMF-JobQueue
Eevans edited projects for T241940: No option to continue querying for more results in globalallusers API, added: Core Platform Team Workboards (Clinic Duty Team); removed Core Platform Team.
Fri, Jan 10, 5:40 PM · Core Platform Team Workboards (Clinic Duty Team), MediaWiki-extensions-CentralAuth, MediaWiki-API
Eevans edited projects for T242249: Unclear MCR replacement for WikiPage::prepareContentForEdit, added: Core Platform Team Workboards (Clinic Duty Team); removed Core Platform Team.
Fri, Jan 10, 5:40 PM · Core Platform Team Workboards (Clinic Duty Team), Documentation, CPT Initiatives (MCR)
Eevans edited projects for T242409: languageinfo API returns a TypeError if you request fallbacks, added: Core Platform Team Workboards (Clinic Duty Team); removed Core Platform Team.
Fri, Jan 10, 5:40 PM · MW-1.35-notes (1.35.0-wmf.15; 2020-01-14), Core Platform Team Workboards (Clinic Duty Team), Wikimedia-production-error, MediaWiki-API, Regression
Eevans removed a project from T224425: MW Job consumers sometimes pause for several minutes: Core Platform Team.
Fri, Jan 10, 5:39 PM · Core Platform Team Workboards (Clinic Duty Team), CPT Initiatives (Modern Event Platform (TEC2)), WMF-JobQueue, Discovery-Search (Current work)
Eevans added a project to T224425: MW Job consumers sometimes pause for several minutes: Core Platform Team Workboards (Clinic Duty Team).
Fri, Jan 10, 5:38 PM · Core Platform Team Workboards (Clinic Duty Team), CPT Initiatives (Modern Event Platform (TEC2)), WMF-JobQueue, Discovery-Search (Current work)
Eevans triaged T240307: Hook container with strong types and DI as Medium priority.
Fri, Jan 10, 5:34 PM · User-Daniel, Core Platform Team, TechCom-RFC
Eevans triaged T170603: API Edit Requires a Captcha, but on Wiki edit does not as Medium priority.
Fri, Jan 10, 5:33 PM · MediaWiki-extensions-OAuth, ConfirmEdit (CAPTCHA extension), MediaWiki-API
Eevans triaged T192023: Allowing seaching the archive table for titles of deleted pages through the API as Medium priority.
Fri, Jan 10, 5:25 PM · MediaWiki-API
Eevans triaged T241940: No option to continue querying for more results in globalallusers API as Medium priority.
Fri, Jan 10, 5:23 PM · Core Platform Team Workboards (Clinic Duty Team), MediaWiki-extensions-CentralAuth, MediaWiki-API
Eevans triaged T242249: Unclear MCR replacement for WikiPage::prepareContentForEdit as Medium priority.
Fri, Jan 10, 5:22 PM · Core Platform Team Workboards (Clinic Duty Team), Documentation, CPT Initiatives (MCR)
Eevans triaged T241905: Investigate JobQueue outage from 2020-01-04 22:00 UTC as Medium priority.
Fri, Jan 10, 5:21 PM · Core Platform Team Workboards (Clinic Duty Team), Wikimedia-Incident, WMF-JobQueue
Eevans triaged T242409: languageinfo API returns a TypeError if you request fallbacks as Medium priority.
Fri, Jan 10, 5:05 PM · MW-1.35-notes (1.35.0-wmf.15; 2020-01-14), Core Platform Team Workboards (Clinic Duty Team), Wikimedia-production-error, MediaWiki-API, Regression

Tue, Jan 7

Eevans updated subscribers of T228294: Cassandra PHP driver evaluation.

This seems in our wheelhouse.

  • Once we have a packaged driver, what would we do with it?
Tue, Jan 7, 9:35 PM · Core Platform Team, User-Eevans

Mon, Jan 6

Eevans added a comment to T241790: (No Need By Date Provided) rack/setup/install restbase202[123].
In T238580#5710739, @Eevans wrote:
In T238580#5709953, @RobH wrote:

Also note I assumed details for the racking/hostnames and would appreciate confirmation of those details in task description, thanks!

This cluster uses a replication count of 3 (per-DC), and for eqiad we have machines evenly distributed over rows a, b, and d. This replica-to-row affinity makes it very easy to reason about where data will move from and to on topology changes, and it would be a shame if we lost that now. Will there be a problem keeping these to the same 3 rows currently in use?

Mon, Jan 6, 4:58 PM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations
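
The replica-to-row affinity described in the comment above can be illustrated with a simplified model of rack-aware placement. This is a sketch only, not Cassandra's actual placement code: the ring layout and the hostnames `node1` to `node6` are hypothetical, and the only details taken from the comment are the replication count of 3 and rows a, b, and d.

```python
from typing import List, Tuple

Node = Tuple[str, str]  # (hostname, row); hostnames are made up for illustration


def replicas_for(ring: List[Node], start: int, rf: int = 3) -> List[Node]:
    """Walk the ring clockwise from `start`, taking at most one node per row."""
    chosen: List[Node] = []
    seen_rows = set()
    for offset in range(len(ring)):
        host, row = ring[(start + offset) % len(ring)]
        if row not in seen_rows:
            chosen.append((host, row))
            seen_rows.add(row)
        if len(chosen) == rf:
            break
    return chosen


# Hypothetical six-node ring spread evenly over rows a, b, and d.
RING: List[Node] = [
    ("node1", "a"), ("node2", "b"), ("node3", "d"),
    ("node4", "a"), ("node5", "b"), ("node6", "d"),
]
```

With three rows and a replication count of 3, every starting position yields exactly one replica per row, which is why a topology change in one row only moves data within that row's replica set.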
Eevans updated the task description for T241790: (No Need By Date Provided) rack/setup/install restbase202[123].
Mon, Jan 6, 4:57 PM · Core Platform Team Workboards (Clinic Duty Team), ops-codfw, Operations

Thu, Dec 19

Eevans closed T218609: Figure out future for newly created deployment-prep jessie instances, a subtask of T218729: Migrate deployment-prep away from Debian Jessie to Debian Stretch/Buster, as Resolved.
Thu, Dec 19, 9:17 PM · Cloud-VPS (Debian Jessie Deprecation), Beta-Cluster-Infrastructure
Eevans closed T218609: Figure out future for newly created deployment-prep jessie instances as Resolved.

This is now done. Sorry for the long delay.

Thu, Dec 19, 9:17 PM · Patch-For-Review, Beta-Cluster-Infrastructure
Eevans added a comment to T122825: Service Ownership and Maintenance.

I think most of the issues described here have since been solved by the implementation of the code stewardship review process and a list of developers/maintainers. @Pchelolo @Eevans @Clarakosi any opinions?

Thu, Dec 19, 4:52 PM · Core Platform Team, TechCom, User-mobrovac, Operations
Eevans added a comment to T218609: Figure out future for newly created deployment-prep jessie instances.

@Eevans: It has been 6 months, please respond.

Thu, Dec 19, 1:46 AM · Patch-For-Review, Beta-Cluster-Infrastructure

Dec 4 2019

Eevans added a comment to T222851: Improve Echo seentime code for multi-DC access.

Summarizing an IRC discussion: @Catrope will pick this up mid-November(ish), and we'll target deployment for sometime after the November freeze (27th–29th), and before the December freeze (December 23rd-January 3rd).

Dec 4 2019, 7:55 PM · MW-1.35-notes (1.35.0-wmf.15; 2020-01-14), CPT Initiatives (Multi-DC Echo Notification Storage), User-Eevans, Growth-Team, Notifications
Eevans triaged T239856: Fold services recommendations into Standards for services RfC as Medium priority.
Dec 4 2019, 7:52 PM · Core Platform Team Workboards (Clinic Duty Team)
Eevans created T239856: Fold services recommendations into Standards for services RfC.
Dec 4 2019, 7:51 PM · Core Platform Team Workboards (Clinic Duty Team)

Dec 2 2019

Eevans added a comment to T236113: API developer creates automated documentation.

We discussed this in our kickoff meeting today.
There was a lot of resistance to the idea of having an endpoint for OpenAPI 3.0 definitions of the (other) endpoints. I like the idea of using OpenAPI since there are a lot of other tools that would benefit, such as client code generators.

Dec 2 2019, 11:02 PM · Core Platform Team Workboards (Green), Story, CPT Initiatives (Core REST API in PHP)

Nov 27 2019

Eevans moved T207946: Evaluate possible optimizations for concurrent JVMs from Inbox to Icebox on the Core Platform Team board.
Nov 27 2019, 7:23 PM · Core Platform Team, Cassandra, User-Eevans
Eevans edited projects for T207946: Evaluate possible optimizations for concurrent JVMs, added: Core Platform Team; removed Core Platform Team (Needs Cleaning - Cassandra Operational).
Nov 27 2019, 7:23 PM · Core Platform Team, Cassandra, User-Eevans
Eevans moved T226553: Install Cassandra table properties Debian package on Cassandra hosts from Inbox to Backlog on the Core Platform Team Workboards (Clinic Duty Team) board.
Nov 27 2019, 7:22 PM · Core Platform Team Workboards (Clinic Duty Team), Patch-For-Review, User-WDoran
Eevans edited projects for T226553: Install Cassandra table properties Debian package on Cassandra hosts, added: Core Platform Team Workboards (Clinic Duty Team); removed Core Platform Team (Needs Cleaning - Cassandra Operational).
Nov 27 2019, 7:22 PM · Core Platform Team Workboards (Clinic Duty Team), Patch-For-Review, User-WDoran
Eevans edited projects for T228294: Cassandra PHP driver evaluation, added: Core Platform Team; removed Core Platform Team (Needs Cleaning - Cassandra Operational).
Nov 27 2019, 7:20 PM · Core Platform Team, User-Eevans

Nov 25 2019

Eevans moved T237143: Log warning: Duplicate get(): "officewiki:echo:seen:message:time:{n}" fetched 2 times from Waiting for Review to Done on the Core Platform Team Workboards (Clinic Duty Team) board.

This was deployed during SWAT. See: https://logstash.wikimedia.org/goto/a61eb70c51d26b11835e4bb4caadda0b

Nov 25 2019, 11:13 PM · Notifications, MediaWiki-Cache, Growth-Team, Core Platform Team Workboards (Clinic Duty Team)

Nov 22 2019

Eevans moved T231027: Cassandra instances outages (was: Outage of restbase2017-b) from Backlog to Ready on the Core Platform Team Workboards (Green) board.
Nov 22 2019, 1:38 AM · Core Platform Team Workboards (Green), User-Eevans
Eevans edited projects for T231027: Cassandra instances outages (was: Outage of restbase2017-b), added: Core Platform Team Workboards (Green); removed Core Platform Team Workboards (Clinic Duty Team).

Since an upgrade to Cassandra 3.11.4 was pending anyway (T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release)), we prioritized that work a) in the event the issue had been fixed upstream, and b) so that if we had to dig deeper to troubleshoot, we would be doing so against a current release. Unfortunately, this does not seem to have fixed it (see: T238591), and we will indeed need to dig deeper.

Nov 22 2019, 1:38 AM · Core Platform Team Workboards (Green), User-Eevans
Eevans merged T238591: Cassandra node outage: restbase2015-c (Cassandra 3.11.4) into T231027: Cassandra instances outages (was: Outage of restbase2017-b).
Nov 22 2019, 1:37 AM · Core Platform Team Workboards (Green), User-Eevans
Eevans merged task T238591: Cassandra node outage: restbase2015-c (Cassandra 3.11.4) into T231027: Cassandra instances outages (was: Outage of restbase2017-b).
Nov 22 2019, 1:37 AM · Core Platform Team Workboards (Clinic Duty Team), Cassandra
Eevans renamed T238591: Cassandra node outage: restbase2015-c (Cassandra 3.11.4) from Casssandra node outage: restbase2015-c to Casssandra node outage: restbase2015-c (Cassandra 3.11.4).
Nov 22 2019, 1:32 AM · Core Platform Team Workboards (Clinic Duty Team), Cassandra

Nov 19 2019

Eevans triaged T238591: Cassandra node outage: restbase2015-c (Cassandra 3.11.4) as Medium priority.
Nov 19 2019, 7:50 PM · Core Platform Team Workboards (Clinic Duty Team), Cassandra

Nov 18 2019

Eevans added a subtask for T231027: Cassandra instances outages (was: Outage of restbase2017-b): T238591: Cassandra node outage: restbase2015-c (Cassandra 3.11.4).
Nov 18 2019, 8:46 PM · Core Platform Team Workboards (Green), User-Eevans
Eevans added a parent task for T238591: Cassandra node outage: restbase2015-c (Cassandra 3.11.4): T231027: Cassandra instances outages (was: Outage of restbase2017-b).
Nov 18 2019, 8:46 PM · Core Platform Team Workboards (Clinic Duty Team), Cassandra
Eevans added a comment to T238591: Cassandra node outage: restbase2015-c (Cassandra 3.11.4).

This looks suspiciously similar to T231027, right down to the read timeout exceptions during read-repair that precede the event:

Nov 18 2019, 8:46 PM · Core Platform Team Workboards (Clinic Duty Team), Cassandra
Eevans added projects to T238591: Cassandra node outage: restbase2015-c (Cassandra 3.11.4): Core Platform Team, Cassandra.
Nov 18 2019, 8:30 PM · Core Platform Team Workboards (Clinic Duty Team), Cassandra
Eevans created T238591: Cassandra node outage: restbase2015-c (Cassandra 3.11.4).
Nov 18 2019, 8:30 PM · Core Platform Team Workboards (Clinic Duty Team), Cassandra
Eevans raised the priority of T237143: Log warning: Duplicate get(): "officewiki:echo:seen:message:time:{n}" fetched 2 times from Medium to High.
Nov 18 2019, 5:05 PM · Notifications, MediaWiki-Cache, Growth-Team, Core Platform Team Workboards (Clinic Duty Team)

Nov 7 2019

Eevans added a comment to T227776: Generalize ParserCache into a generic service class for large "current" page-derived data.

I like the idea of having the ParserCache being a more generalized caching mechanism for MediaWiki. I have serious doubts about other things hinted here, specifically exposing a caching endpoint to other services. I'd argue that such a caching service should be separated from MediaWiki, have a simple API, and probably be structured around the page/revision identifier. We also probably don't want such a system to be written in PHP, as we would aim for the highest possible throughput.
We do not want an application doing some business logic to also be the cache storage for everything else. It was wrong with restbase, it would be wrong here. Each application should manage its own caching logic. This logic should not be delegated to another application and should not rely on the automagic properties of some centralized management system that then becomes the brain of the whole architecture. The only exception I see to this could be some purging logic.
So if we want such a system to be generalized and usable outside of MediaWiki it should be a thin service in front of a storage system[1] and it should:

  • Have primitives that reproduce whatever API we use with e.g. BagOfStuff
  • Be able to work across datacenters in write/write mode
Nov 7 2019, 10:57 PM · CPT Initiatives (Parsoid REST API in PHP (CDP2)), User-Eevans, User-mobrovac, TechCom, User-Daniel, Proposal
Eevans added a comment to T227776: Generalize ParserCache into a generic service class for large "current" page-derived data.

What is the use case for external access?

To clarify: I was referring to access outside MW core, but inside the local (in our case, WMF) network. The intent is not to make this a public service that can be accessed directly by external clients.
Concrete use cases (some currently in core), for extracting data from page content, and caching it for later access: Wikibase constraint validation, graphoid, kartographer, mathoid, template data, page summary...

How would an external consumer deal with value validation (e.g. matches known rev id), fragmentation parameters

Ideally, the cache service itself would know about these things and handle them correctly. E.g. before returning a cache entry, it would check that it's not stale, and when purging the entry for a given page, it would purge the entire "bucket" of cached variants.

and how would it deal with absence of the value? I see ParserCache as fundamentally a getWithSet-like interface (with very high persistence and poolcounter etc, but nonetheless fundamentally lazy-populated).

Currently, ParserCache isn't getWithSet. If there is no entry cached or the cached entry is stale, you get nothing back. Generating and then caching is the caller's responsibility.
For the new component described here, I'd propose to keep it that way. Generally, a component that accesses the cache (inside mw core or as a standalone service) would be using the cache for a kind of derived resource it knows how to generate.
The idea is: there would be one place to go to for getting rendered content, and one to go to for getting extracted infobox data, and one to go to for graphoid output, etc - and each of these places knows how to generate the derived resources, and uses the unified cache internally. This makes more sense to me than a generic endpoint for fetching any kind of resource, with some kind of internal routing to generate each resource.
Why then should different components that derive different kinds of things from pages share the caching infrastructure, instead of writing their own? Because the purging mechanism is the same, and the access keys are the same, and the scale is similar. Having to re-invent this wheel leads to duplication and annoyance, or the abuse of less-than-ideal mechanisms that exist, like page props.

Nov 7 2019, 10:47 PM · CPT Initiatives (Parsoid REST API in PHP (CDP2)), User-Eevans, User-mobrovac, TechCom, User-Daniel, Proposal
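
The cache semantics discussed in the comment above (a stale or missing entry returns nothing, generation is left to the caller, and a purge clears the whole bucket of variants for a page) can be sketched roughly as follows. The class and method names are hypothetical and do not correspond to ParserCache's real interface; staleness is reduced here to a simple revision ID comparison.

```python
from typing import Dict, Optional, Tuple


class PageDerivedCache:
    """Hypothetical cache keyed by page, holding a bucket of variants per page."""

    def __init__(self) -> None:
        # page_id -> {variant name: (rev_id the value was derived from, value)}
        self._buckets: Dict[int, Dict[str, Tuple[int, str]]] = {}

    def get(self, page_id: int, variant: str, current_rev: int) -> Optional[str]:
        entry = self._buckets.get(page_id, {}).get(variant)
        if entry is None:
            return None
        rev_id, value = entry
        # A stale entry counts as a miss; regenerating is the caller's job.
        return value if rev_id == current_rev else None

    def set(self, page_id: int, variant: str, rev_id: int, value: str) -> None:
        self._buckets.setdefault(page_id, {})[variant] = (rev_id, value)

    def purge(self, page_id: int) -> None:
        # Purging a page drops every cached variant for it at once.
        self._buckets.pop(page_id, None)
```

A component that knows how to generate its own derived resource would call `get`, generate on a miss, and then call `set`; the cache itself never generates anything.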
Eevans added a comment to T227776: Generalize ParserCache into a generic service class for large "current" page-derived data.

Kask,[1] accessed via RESTBagOStuff?

Probably not Kask, but perhaps something similar, or a derivative or successor of Kask.

This sounds an awful lot like file storage (where I'm defining "file" to mean some semi-large (for some definition of large) chunk of opaque data), which Kask (and Cassandra) aren't well suited for.

Though I'm not entirely sure that we want Cassandra as a backend for this.

Same.

Nov 7 2019, 10:35 PM · CPT Initiatives (Parsoid REST API in PHP (CDP2)), User-Eevans, User-mobrovac, TechCom, User-Daniel, Proposal
Eevans added a comment to T180051: Reduce the number of fields declared in elasticsearch by logstash.

An additional 2¢

Nov 7 2019, 7:18 PM · Patch-For-Review, observability, Core Platform Team Legacy (Watching / External), Services (watching), Operations, Wikimedia-Logstash

Nov 4 2019

Eevans updated subscribers of T234295: Migration of old timestamps.

During discussions w/ @Catrope, we determined that the impact of timestamp misses was sufficiently minor as to not justify us spending the time to write, test, and debug a migration of data from Redis. What we will do instead: Deploy with a MultiWriteBagOStuff that wraps the new and old store (read-from-new, fallback-to-old, write-to-both). There will be some 90 days or more between this deployment and the decommission of Redis, during which time most active users will have seen-times seeded into the new store.

Nov 4 2019, 3:26 PM · Growth-Team, Notifications, Core Platform Team Workboards (User Stories), Story, CPT Initiatives (Multi-DC Echo Notification Storage)
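
The read-from-new, fallback-to-old, write-to-both behaviour described in the comment above can be sketched as below. This is a minimal illustration in Python, not MediaWiki's MultiWriteBagOStuff; `DictStore` and `MultiWriteStore` are hypothetical stand-ins for the two backends and the wrapper.

```python
from typing import Dict, Optional


class DictStore:
    """Stand-in for a key-value backend (hypothetical; not a MediaWiki class)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class MultiWriteStore:
    """Read from the new store first, fall back to the old one, write to both."""

    def __init__(self, new: DictStore, old: DictStore) -> None:
        self.new = new
        self.old = old

    def get(self, key: str) -> Optional[str]:
        value = self.new.get(key)
        if value is None:
            # Miss in the new store: fall back to the old one.
            value = self.old.get(key)
        return value

    def set(self, key: str, value: str) -> None:
        # Every write lands in both stores, so ordinary traffic seeds the new
        # store while the old one stays current in case of a rollback.
        self.new.set(key, value)
        self.old.set(key, value)
```

Under this scheme no bulk migration is needed: a key that exists only in the old store is still readable, and it appears in the new store the next time it is written.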

Nov 2 2019

Eevans triaged T237143: Log warning: Duplicate get(): "officewiki:echo:seen:message:time:{n}" fetched 2 times as Medium priority.
Nov 2 2019, 12:33 AM · Notifications, MediaWiki-Cache, Growth-Team, Core Platform Team Workboards (Clinic Duty Team)
Eevans created T237143: Log warning: Duplicate get(): "officewiki:echo:seen:message:time:{n}" fetched 2 times.
Nov 2 2019, 12:33 AM · Notifications, MediaWiki-Cache, Growth-Team, Core Platform Team Workboards (Clinic Duty Team)

Nov 1 2019

Eevans added a comment to T222851: Improve Echo seentime code for multi-DC access.

Summarizing an IRC discussion: @Catrope will pick this up mid-November(ish), and we'll target deployment for sometime after the November freeze (27th–29th), and before the December freeze (December 23rd-January 3rd).

Nov 1 2019, 3:00 PM · MW-1.35-notes (1.35.0-wmf.15; 2020-01-14), CPT Initiatives (Multi-DC Echo Notification Storage), User-Eevans, Growth-Team, Notifications

Oct 31 2019

Eevans reopened T222851: Improve Echo seentime code for multi-DC access, a subtask of T212129: Use a multi-dc aware store for ObjectCache's MainStash if needed., as Open.
Oct 31 2019, 9:03 PM · MediaWiki-General, serviceops-radar, User-mobrovac, User-jijiki, Performance-Team (Radar), Operations
Eevans reopened T222851: Improve Echo seentime code for multi-DC access, a subtask of T234294: Configurable timestamp storage, as Open.
Oct 31 2019, 9:03 PM · Growth-Team, Notifications, Core Platform Team Workboards (User Stories), Story, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans reopened T222851: Improve Echo seentime code for multi-DC access as "Open".

I believe this task to be done.

Oct 31 2019, 9:03 PM · MW-1.35-notes (1.35.0-wmf.15; 2020-01-14), CPT Initiatives (Multi-DC Echo Notification Storage), User-Eevans, Growth-Team, Notifications
Eevans updated the task description for T222851: Improve Echo seentime code for multi-DC access.
Oct 31 2019, 9:02 PM · MW-1.35-notes (1.35.0-wmf.15; 2020-01-14), CPT Initiatives (Multi-DC Echo Notification Storage), User-Eevans, Growth-Team, Notifications
Eevans updated the task description for T222851: Improve Echo seentime code for multi-DC access.
Oct 31 2019, 9:01 PM · MW-1.35-notes (1.35.0-wmf.15; 2020-01-14), CPT Initiatives (Multi-DC Echo Notification Storage), User-Eevans, Growth-Team, Notifications
Eevans closed T222851: Improve Echo seentime code for multi-DC access, a subtask of T212129: Use a multi-dc aware store for ObjectCache's MainStash if needed., as Resolved.
Oct 31 2019, 9:00 PM · MediaWiki-General, serviceops-radar, User-mobrovac, User-jijiki, Performance-Team (Radar), Operations
Eevans closed T222851: Improve Echo seentime code for multi-DC access, a subtask of T234294: Configurable timestamp storage, as Resolved.
Oct 31 2019, 9:00 PM · Growth-Team, Notifications, Core Platform Team Workboards (User Stories), Story, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans closed T222851: Improve Echo seentime code for multi-DC access as Resolved.

I believe this task to be done.

Oct 31 2019, 9:00 PM · MW-1.35-notes (1.35.0-wmf.15; 2020-01-14), CPT Initiatives (Multi-DC Echo Notification Storage), User-Eevans, Growth-Team, Notifications

Oct 29 2019

Eevans added a comment to T222851: Improve Echo seentime code for multi-DC access.

@Catrope given the difficulty in doing a phased migration (vis-a-vis global notifications), do you have any objections to transitioning to the new store in a single go (i.e. updating it to be the default)? We accidentally did this yesterday, and TTBMK, there were no issues. We'd of course keep the multi-write configuration in place for the time being, so if rolling back were necessary, up-to-date seen times would continue to exist in Redis.

Oct 29 2019, 7:06 PM · MW-1.35-notes (1.35.0-wmf.15; 2020-01-14), CPT Initiatives (Multi-DC Echo Notification Storage), User-Eevans, Growth-Team, Notifications
Eevans added a comment to T92471: enable authenticated access to Cassandra JMX.

@Eevans Should we move this to the Icebox?

Oct 29 2019, 4:58 PM · Core Platform Team Workboards (Clinic Duty Team), User-Eevans, Cassandra, Operations, Patch-For-Review
Eevans awarded T92471: enable authenticated access to Cassandra JMX a Heartbreak token.
Oct 29 2019, 4:57 PM · Core Platform Team Workboards (Clinic Duty Team), User-Eevans, Cassandra, Operations, Patch-For-Review

Oct 28 2019

Eevans committed rDEPLOYCHARTS994baef95157: echostore: Set TTL to 1 year (31536000) (authored by Eevans).
Oct 28 2019, 11:22 PM
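
The value in the commit message matches a 365-day year expressed in seconds. A quick check, assuming no leap day is counted (the variable names are illustrative only):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

# One 365-day year in seconds, the unit the commit message uses for the TTL.
ttl_seconds = 365 * SECONDS_PER_DAY
```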
Eevans added a comment to T222851: Improve Echo seentime code for multi-DC access.

To summarize a conversation with @Catrope on IRC:

Oct 28 2019, 10:56 PM · MW-1.35-notes (1.35.0-wmf.15; 2020-01-14), CPT Initiatives (Multi-DC Echo Notification Storage), User-Eevans, Growth-Team, Notifications
Eevans added a comment to T222851: Improve Echo seentime code for multi-DC access.

The config change was rolled out in SWAT today. It was rolled back a short time later, out of an abundance of caution, because it seemed to apply more broadly than just testwiki (our expectation). During the time it was deployed, we were seeing ~1k requests/s. Sampling the records in Cassandra, the vast majority were of the form global:echo:seen:{alert,message}:time:%d. A number of them, however, were for enwiki, ruwiki, idwiki, outreachwiki, wikidatawiki, and others.

Oct 28 2019, 6:37 PM · MW-1.35-notes (1.35.0-wmf.15; 2020-01-14), CPT Initiatives (Multi-DC Echo Notification Storage), User-Eevans, Growth-Team, Notifications