
Eevans (Eric Evans)
Senior Software Engineer

Projects (13)

User Details

User Since
Feb 27 2015, 10:47 PM (245 w, 4 d)
Availability
Available
IRC Nick
urandom
LDAP User
Eevans
MediaWiki User
Unknown

Recent Activity

Thu, Nov 7

Eevans added a comment to T227776: Generalize ParserCache into a generic service class for large "current" page-derived data.

I like the idea of having the ParserCache being a more generalized caching mechanism for MediaWiki. I have serious doubts about other things hinted here, specifically exposing a caching endpoint to other services. I'd argue that such a caching service should be separated from MediaWiki, have a simple API, and probably be structured around the page/revision identifier. We also probably don't want such a system to be written in PHP, as we would aim for the highest possible throughput.
We do not want an application doing some business logic to also be the cache storage for everything else. It was wrong with restbase, it would be wrong here. Each application should manage its own caching logic. This logic should not be delegated to another application and should not rely on the automagic properties of some centralized management system that then becomes the brain of the whole architecture. The only exception I see to this could be some purging logic.
So if we want such a system to be generalized and usable outside of MediaWiki it should be a thin service in front of a storage system[1] and it should:

  • Have primitives that reproduce whatever API we use with e.g. BagOfStuff
  • Be able to work across datacenters in write/write mode
Thu, Nov 7, 10:57 PM · CPT Initiatives (Parsoid REST API in PHP (CDP2)), User-Eevans, User-mobrovac, TechCom, User-Daniel, Proposal
Eevans added a comment to T227776: Generalize ParserCache into a generic service class for large "current" page-derived data.

What is the use case for external access?

To clarify: I was referring to access outside MW core, but inside the local (in our case, WMF) network. The intent is not to make this a public service that can be accessed directly by external clients.
Concrete use cases (some currently in core), for extracting data from page content, and caching it for later access: Wikibase constraint validation, graphoid, kartographer, mathoid, template data, page summary...

How would an external consumer deal with value validation (e.g. matches known rev id), fragmentation parameters

Ideally, the cache service itself would know about these things and handle them correctly. E.g. before returning a cache entry, it would check that it's not stale, and when purging the entry for a given page, it would purge the entire "bucket" of cached variants.

and how would it deal with absence of the value? I see ParserCache as fundamentally a getWithSet-like interface (with very high persistence and poolcounter etc, but nonetheless fundamentally lazy-populated).

Currently, ParserCache isn't getWithSet. If there is no entry cached or the cached entry is stale, you get nothing back. Generating and then caching is the caller's responsibility.
For the new component described here, I'd propose to keep it that way. Generally, a component that accesses the cache (inside mw core or as a standalone service) would be using the cache for a kind of derived resource it knows how to generate.
The idea is: there would be one place to go to for getting rendered content, and one to go to for getting extracted infobox data, and one to go to for graphoid output, etc - and each of these places knows how to generate the derived resources, and uses the unified cache internally. This makes more sense to me than a generic end point for fetching any kind of resource, with some kind of internal routing to generate each resource.
Why then should different components that derive different kinds of things from pages share the caching infrastructure, instead of writing their own? Because the purging mechanism is the same, and the access keys are the same, and the scale is similar. Having to re-invent this wheel leads to duplication and annoyance, or the abuse of less-than-ideal mechanisms that exist, like page props.
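
A minimal sketch of the semantics described above, assuming a hypothetical in-memory cache (the class and function names are illustrative and are not MediaWiki APIs): a stale or absent entry yields nothing, generating and storing is the caller's job, and purging a page drops its whole bucket of variants.

```python
class PageDerivedCache:
    """Hypothetical in-memory cache keyed by page, then by variant."""

    def __init__(self):
        self._buckets = {}  # page_id -> {variant: (rev_id, value)}

    def get(self, page_id, variant, current_rev_id):
        entry = self._buckets.get(page_id, {}).get(variant)
        if entry is None:
            return None
        rev_id, value = entry
        # A stale entry is treated exactly like a miss.
        return value if rev_id == current_rev_id else None

    def set(self, page_id, variant, rev_id, value):
        self._buckets.setdefault(page_id, {})[variant] = (rev_id, value)

    def purge(self, page_id):
        # Drop the whole bucket of variants for the page.
        self._buckets.pop(page_id, None)


def get_summary(cache, page_id, rev_id, generate):
    """Caller-side logic: the component that knows how to generate
    the derived resource also decides when to populate the cache."""
    cached = cache.get(page_id, "summary", rev_id)
    if cached is None:
        cached = generate(page_id, rev_id)
        cache.set(page_id, "summary", rev_id, cached)
    return cached
```

Here `get_summary` stands in for one of the "places to go to" for a given kind of derived resource; the cache itself never calls a generator.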

Thu, Nov 7, 10:47 PM · CPT Initiatives (Parsoid REST API in PHP (CDP2)), User-Eevans, User-mobrovac, TechCom, User-Daniel, Proposal
Eevans added a comment to T227776: Generalize ParserCache into a generic service class for large "current" page-derived data.

Kask,[1] accessed via RESTBagOStuff?

Probably not Kask, but perhaps something similar, or a derivative or successor of Kask.

This sounds an awful lot like file storage (where I'm defining "file" to mean some semi-large (for definition of large) chunk of opaque data), which Kask (and Cassandra) aren't well suited for.

Though I'm not entirely sure that we want Cassandra as a backend for this.

Same.

Thu, Nov 7, 10:35 PM · CPT Initiatives (Parsoid REST API in PHP (CDP2)), User-Eevans, User-mobrovac, TechCom, User-Daniel, Proposal
Eevans added a comment to T180051: Reduce the number of fields declared in elasticsearch by logstash.

An additional 2¢

Thu, Nov 7, 7:18 PM · observability, Core Platform Team Legacy (Watching / External), Services (watching), Operations, Wikimedia-Logstash

Mon, Nov 4

Eevans updated subscribers of T234295: Migration of old timestamps.

During discussions w/ @Catrope, we determined that the impact of timestamp misses was sufficiently minor as to not justify us spending the time to write, test, and debug a migration of data from Redis. What we will do instead: Deploy with a MultiWriteBagOStuff that wraps the new and old store (read-from-new, fallback-to-old, write-to-both). There will be some 90 days or more between this deployment and the decommission of Redis, during which time most active users will have seen-times seeded into the new store.
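
A minimal sketch of the read-from-new, fallback-to-old, write-to-both behaviour described above. This is not MediaWiki's MultiWriteBagOStuff; the class name and the use of plain dicts as stores are assumptions made for illustration.

```python
class MultiWriteStore:
    """Hypothetical wrapper: read from the new store, fall back to
    the old one on a miss, and write every update to both."""

    def __init__(self, new_store, old_store):
        self.new_store = new_store
        self.old_store = old_store

    def get(self, key):
        value = self.new_store.get(key)
        if value is not None:
            return value
        # Miss in the new store: fall back to the old one.
        return self.old_store.get(key)

    def set(self, key, value):
        # Writing to both keeps the old store current, so a rollback
        # would still find up-to-date values there.
        self.new_store[key] = value
        self.old_store[key] = value


# Illustrative data only: one seen-time exists in the old store.
old = {"echo:seen:alert:time:1": "20191001000000"}
new = {}
store = MultiWriteStore(new, old)
```

With this arrangement, users who are active during the overlap period seed the new store through ordinary writes, which is why no bulk migration is needed.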

Mon, Nov 4, 3:26 PM · Growth-Team, Notifications, Core Platform Team Workboards (User Stories), Story, CPT Initiatives (Multi-DC Echo Notification Storage)

Sat, Nov 2

Eevans triaged T237143: Log warning: Duplicate get(): "officewiki:echo:seen:message:time:{n}" fetched 2 times as Normal priority.
Sat, Nov 2, 12:33 AM · Notifications, MediaWiki-Cache, Growth-Team, Core Platform Team Workboards (Clinic Duty Team)
Eevans created T237143: Log warning: Duplicate get(): "officewiki:echo:seen:message:time:{n}" fetched 2 times.
Sat, Nov 2, 12:33 AM · Notifications, MediaWiki-Cache, Growth-Team, Core Platform Team Workboards (Clinic Duty Team)

Fri, Nov 1

Eevans added a comment to T222851: Improve Echo seentime code for multi-DC access.

Summarizing an IRC discussion: @Catrope will pick this up mid-November(ish), and we'll target deployment for sometime after the November freeze (27th–29th), and before the December freeze (December 23rd-January 3rd).

Fri, Nov 1, 3:00 PM · CPT Initiatives (Multi-DC Echo Notification Storage), MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), User-Eevans, Notifications, Growth-Team

Thu, Oct 31

Eevans reopened T222851: Improve Echo seentime code for multi-DC access, a subtask of T212129: Use a multi-dc aware store for ObjectCache's MainStash if needed., as Open.
Thu, Oct 31, 9:03 PM · MediaWiki-General, serviceops-radar, User-mobrovac, User-jijiki, Performance-Team (Radar), Operations
Eevans reopened T222851: Improve Echo seentime code for multi-DC access, a subtask of T234294: Configurable timestamp storage, as Open.
Thu, Oct 31, 9:03 PM · Growth-Team, Notifications, Core Platform Team Workboards (User Stories), Story, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans reopened T222851: Improve Echo seentime code for multi-DC access as "Open".

I believe this task to be done.

Thu, Oct 31, 9:03 PM · CPT Initiatives (Multi-DC Echo Notification Storage), MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), User-Eevans, Notifications, Growth-Team
Eevans updated the task description for T222851: Improve Echo seentime code for multi-DC access.
Thu, Oct 31, 9:02 PM · CPT Initiatives (Multi-DC Echo Notification Storage), MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), User-Eevans, Notifications, Growth-Team
Eevans updated the task description for T222851: Improve Echo seentime code for multi-DC access.
Thu, Oct 31, 9:01 PM · CPT Initiatives (Multi-DC Echo Notification Storage), MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), User-Eevans, Notifications, Growth-Team
Eevans closed T222851: Improve Echo seentime code for multi-DC access, a subtask of T212129: Use a multi-dc aware store for ObjectCache's MainStash if needed., as Resolved.
Thu, Oct 31, 9:00 PM · MediaWiki-General, serviceops-radar, User-mobrovac, User-jijiki, Performance-Team (Radar), Operations
Eevans closed T222851: Improve Echo seentime code for multi-DC access, a subtask of T234294: Configurable timestamp storage, as Resolved.
Thu, Oct 31, 9:00 PM · Growth-Team, Notifications, Core Platform Team Workboards (User Stories), Story, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans closed T222851: Improve Echo seentime code for multi-DC access as Resolved.

I believe this task to be done.

Thu, Oct 31, 9:00 PM · CPT Initiatives (Multi-DC Echo Notification Storage), MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), User-Eevans, Notifications, Growth-Team

Tue, Oct 29

Eevans added a comment to T222851: Improve Echo seentime code for multi-DC access.

@Catrope given the difficulty in doing a phased migration (vis-a-vis global notifications), do you have any objections to transitioning to the new store in a single go (i.e. updating it to be the default)? We accidentally did this yesterday, and TTBMK, there were no issues. We'd of course keep the multi-write configuration in place for the time being, so if rolling back were necessary, up-to-date seen times would continue to exist in Redis.

Tue, Oct 29, 7:06 PM · CPT Initiatives (Multi-DC Echo Notification Storage), MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), User-Eevans, Notifications, Growth-Team
Eevans added a comment to T92471: enable authenticated access to Cassandra JMX.

@Eevans Should we move this to the Icebox?

Tue, Oct 29, 4:58 PM · Core Platform Team Workboards (Clinic Duty Team), User-Eevans, Cassandra, Operations, Patch-For-Review
Eevans awarded T92471: enable authenticated access to Cassandra JMX a Heartbreak token.
Tue, Oct 29, 4:57 PM · Core Platform Team Workboards (Clinic Duty Team), User-Eevans, Cassandra, Operations, Patch-For-Review

Mon, Oct 28

Eevans committed rDEPLOYCHARTS994baef95157: echostore: Set TTL to 1 year (31536000) (authored by Eevans).
echostore: Set TTL to 1 year (31536000)
Mon, Oct 28, 11:22 PM
Eevans added a comment to T222851: Improve Echo seentime code for multi-DC access.

To summarize a conversation with @Catrope on IRC:

Mon, Oct 28, 10:56 PM · CPT Initiatives (Multi-DC Echo Notification Storage), MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), User-Eevans, Notifications, Growth-Team
Eevans added a comment to T222851: Improve Echo seentime code for multi-DC access.

The config change was rolled out in SWAT today. It was rolled back a short time later, out of an abundance of caution, because it seemed to apply more broadly than just testwiki (our expectation). During the time it was deployed, we were seeing ~1k/s requests. Sampling the records in Cassandra, the vast majority were of the form global:echo:seen:{alert,message}:time:%d. A number of them however were for enwiki, ruwiki, idwiki, outreachwiki, wikidatawiki, and others.

Mon, Oct 28, 6:37 PM · CPT Initiatives (Multi-DC Echo Notification Storage), MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), User-Eevans, Notifications, Growth-Team

Fri, Oct 25

Eevans added a project to T236414: CPT review/work for MediaWiki caching class maintenance ramp-up: User-Eevans.
Fri, Oct 25, 4:38 PM · Performance-Team (Radar), User-Eevans, Core Platform Team Workboards (Clinic Duty Team)
Eevans moved T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release) from Doing to Done on the Core Platform Team Workboards (Green) board.

This is now complete.

Fri, Oct 25, 12:34 AM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans updated the task description for T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).
Fri, Oct 25, 12:34 AM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra

Thu, Oct 24

Eevans updated the task description for T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).
Thu, Oct 24, 4:37 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra

Wed, Oct 23

Eevans added a comment to T230848: Reader gets file description.

@mobrovac We already have a hierarchy for getting the content of the file. This is the metadata endpoint. I'll update the user story to reflect that.

Wed, Oct 23, 3:47 PM · Core Platform Team Workboards (User Stories), Story, CPT Initiatives (Core REST API in PHP)

Tue, Oct 22

Eevans updated the language for P9438 Masterwork From Distant Lands from autodetect to js.
Tue, Oct 22, 4:12 PM
Eevans edited P9438 Masterwork From Distant Lands.
Tue, Oct 22, 4:12 PM
Eevans added a comment to T230848: Reader gets file description.

@mobrovac We already have a hierarchy for getting the content of the file. This is the metadata endpoint. I'll update the user story to reflect that.

Tue, Oct 22, 3:59 PM · Core Platform Team Workboards (User Stories), Story, CPT Initiatives (Core REST API in PHP)

Fri, Oct 18

Eevans moved T235558: Dashboards for monitoring of echostore from Ready to Done on the Core Platform Team Workboards (Green) board.

This is now done: See https://logstash.wikimedia.org/app/kibana#/dashboard/AW3go_uCx3rdj6D8q-he & https://grafana.wikimedia.org/d/IfJykaTZk/echostore

Fri, Oct 18, 10:10 PM · Core Platform Team Workboards (Green), serviceops, Operations, Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans reassigned T235675: Upload 3.11.4 packages to APT repo from Eevans to Joe.

I believe this is complete.

Fri, Oct 18, 9:00 PM · Patch-For-Review, Operations, Core Platform Team Legacy (Later), User-Eevans, Cassandra
Eevans triaged T235920: Provision deployment-prep instance of echostore as Normal priority.
Fri, Oct 18, 8:04 PM · Beta-Cluster-Infrastructure, Core Platform Team Workboards (Green), Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans created T235920: Provision deployment-prep instance of echostore.
Fri, Oct 18, 8:03 PM · Beta-Cluster-Infrastructure, Core Platform Team Workboards (Green), Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans updated the task description for T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).
Fri, Oct 18, 3:53 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra

Thu, Oct 17

Eevans added a comment to T234376: Provision Kask for Echo timestamp storage in k8s.

Heh yes sorry, I forgot to tell you yesterday - you need to use helmfile destroy in newer versions of helmfile.

Thu, Oct 17, 1:58 PM · Patch-For-Review, Core Platform Team Workboards (Green), serviceops, Operations, Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)

Wed, Oct 16

Eevans moved T234376: Provision Kask for Echo timestamp storage in k8s from Doing to Done on the Core Platform Team Workboards (Green) board.
Wed, Oct 16, 11:12 PM · Patch-For-Review, Core Platform Team Workboards (Green), serviceops, Operations, Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans updated subscribers of T234376: Provision Kask for Echo timestamp storage in k8s.

Hat tip to @CDanis who pointed me at https://github.com/helm/helm/issues/3208#issuecomment-348154521; A helm delete production --purge did the trick.

Wed, Oct 16, 11:11 PM · Patch-For-Review, Core Platform Team Workboards (Green), serviceops, Operations, Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans committed rDEPLOYCHARTSc396479f7409: echostore: fixup Cassandra contact list (authored by Eevans).
echostore: fixup Cassandra contact list
Wed, Oct 16, 11:05 PM
Eevans updated subscribers of T234376: Provision Kask for Echo timestamp storage in k8s.

From a conversation w/ @Joe on IRC, it seems the nodeAffinity section (copypasta from the sessionstore deployment) was likely causing the problem. I issued a helmfile delete, and updated the config (removing that section), but am now getting:

Wed, Oct 16, 10:00 PM · Patch-For-Review, Core Platform Team Workboards (Green), serviceops, Operations, Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans committed rDEPLOYCHARTS67d540bac7b3: echostore: remove affinity (copypasta from sessionstore) (authored by Eevans).
echostore: remove affinity (copypasta from sessionstore)
Wed, Oct 16, 9:37 PM
Eevans added a comment to T234376: Provision Kask for Echo timestamp storage in k8s.

I'm unable to deploy to codfw; I'm seeing the following:

Wed, Oct 16, 9:22 PM · Patch-For-Review, Core Platform Team Workboards (Green), serviceops, Operations, Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans committed rDEPLOYCHARTS92e110e8c13f: echostore: create production deployments (authored by Eevans).
echostore: create production deployments
Wed, Oct 16, 9:11 PM
Eevans committed rDEPLOYCHARTS9d037746827b: echostore: create new staging deployment (authored by Eevans).
echostore: create new staging deployment
Wed, Oct 16, 8:42 PM
Eevans updated the task description for T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).
Wed, Oct 16, 8:20 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans updated the task description for T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).
Wed, Oct 16, 8:19 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans updated the task description for T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).
Wed, Oct 16, 8:18 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans moved T235558: Dashboards for monitoring of echostore from Backlog to Ready on the Core Platform Team Workboards (Green) board.
Wed, Oct 16, 8:17 PM · Core Platform Team Workboards (Green), serviceops, Operations, Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans added a project to T235558: Dashboards for monitoring of echostore: Core Platform Team Workboards (Green).
Wed, Oct 16, 8:17 PM · Core Platform Team Workboards (Green), serviceops, Operations, Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans moved T234376: Provision Kask for Echo timestamp storage in k8s from Backlog to Doing on the Core Platform Team Workboards (Green) board.
Wed, Oct 16, 8:16 PM · Patch-For-Review, Core Platform Team Workboards (Green), serviceops, Operations, Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans added a project to T234376: Provision Kask for Echo timestamp storage in k8s: Core Platform Team Workboards (Green).
Wed, Oct 16, 8:16 PM · Patch-For-Review, Core Platform Team Workboards (Green), serviceops, Operations, Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans moved T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release) from Backlog to Doing on the Core Platform Team Workboards (Green) board.
Wed, Oct 16, 8:14 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans edited projects for T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release), added: Core Platform Team Workboards (Green); removed Core Platform Team Legacy (Later), Services (next).
Wed, Oct 16, 8:14 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans triaged T235675: Upload 3.11.4 packages to APT repo as Normal priority.
Wed, Oct 16, 6:00 PM · Patch-For-Review, Operations, Core Platform Team Legacy (Later), User-Eevans, Cassandra
Eevans updated the task description for T235675: Upload 3.11.4 packages to APT repo.
Wed, Oct 16, 4:53 PM · Patch-For-Review, Operations, Core Platform Team Legacy (Later), User-Eevans, Cassandra
Eevans created T235675: Upload 3.11.4 packages to APT repo.
Wed, Oct 16, 4:52 PM · Patch-For-Review, Operations, Core Platform Team Legacy (Later), User-Eevans, Cassandra
Eevans updated the task description for T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).
Wed, Oct 16, 4:45 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans updated the task description for T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).
Wed, Oct 16, 4:01 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra

Tue, Oct 15

Eevans triaged T235558: Dashboards for monitoring of echostore as Normal priority.
Tue, Oct 15, 8:47 PM · Core Platform Team Workboards (Green), serviceops, Operations, Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans created T235558: Dashboards for monitoring of echostore.
Tue, Oct 15, 8:47 PM · Core Platform Team Workboards (Green), serviceops, Operations, Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans added a comment to T235299: Cassandra cluster management support for multi-tenancy.

Instead of having full templates in roles profiles, we could have an add_user.pp manifest in modules/cassandra that compiles it for the profile, so a structure with the relevant info could be passed, something like:

[
  {
    title => 'title1',
    user => 'user1',
    pass => 'pass1',
    keyspaces => ['ks1', 'ks2']
  },
  {
    title => 'title2',
    user => 'user2',
    pass => 'pass2',
    keyspaces => ['ks3', 'ks4']
  },
]

Compiling the CQL from these statements and writing it to disk on the target nodes would be rather straightforward. We could have a special 'all' argument for keyspaces for cases that remain unchanged, like AQS or Maps.

This might work. It would need to be more robust than this though. For example: (for historical reasons) RESTBase includes GRANTs for CREATE, ALTER, and DROP, but that would not be the case typically. Come to think of it, we could probably remove those for RESTBase now and standardize on an assumption of SELECT and MODIFY, but there may come a day when those assumptions don't hold anymore either.
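
A rough sketch of how CQL could be compiled from such a structure, assuming SELECT and MODIFY as the default grants with an optional per-user override. The function name and the `permissions` field are hypothetical; only the `user`, `pass`, and `keyspaces` fields and the special 'all' value come from the proposal above.

```python
def compile_cql(users, default_permissions=("SELECT", "MODIFY")):
    """Return a list of CQL statements for the given user records."""
    statements = []
    for u in users:
        # Double any single quotes so the password stays a valid CQL literal.
        password = u["pass"].replace("'", "''")
        statements.append(
            "CREATE ROLE IF NOT EXISTS {} WITH PASSWORD = '{}' AND LOGIN = true;".format(
                u["user"], password
            )
        )
        # 'permissions' is a hypothetical override for non-standard cases.
        permissions = u.get("permissions", default_permissions)
        for ks in u["keyspaces"]:
            target = "ALL KEYSPACES" if ks == "all" else "KEYSPACE {}".format(ks)
            for perm in permissions:
                statements.append(
                    "GRANT {} ON {} TO {};".format(perm, target, u["user"])
                )
    return statements


users = [
    {"user": "user1", "pass": "pass1", "keyspaces": ["ks1", "ks2"]},
    {"user": "user2", "pass": "pass2", "keyspaces": ["all"],
     "permissions": ["SELECT", "MODIFY", "CREATE", "ALTER", "DROP"]},
]
statements = compile_cql(users)
```

The per-user override is one way to cover a tenant such as RESTBase that needs CREATE, ALTER, and DROP without baking those grants into the default.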

Tue, Oct 15, 8:27 PM · Cassandra, Core Platform Team Workboards (Clinic Duty Team), User-Eevans
Eevans added a comment to T235299: Cassandra cluster management support for multi-tenancy.

Instead of having full templates in roles profiles, we could have an add_user.pp manifest in modules/cassandra that compiles it for the profile, so a structure with the relevant info could be passed, something like:

[
  {
    title => 'title1',
    user => 'user1',
    pass => 'pass1',
    keyspaces => ['ks1', 'ks2']
  },
  {
    title => 'title2',
    user => 'user2',
    pass => 'pass2',
    keyspaces => ['ks3', 'ks4']
  },
]

Compiling the CQL from these statements and writing it to disk on the target nodes would be rather straightforward. We could have a special 'all' argument for keyspaces for cases that remain unchanged, like AQS or Maps.

Tue, Oct 15, 8:21 PM · Cassandra, Core Platform Team Workboards (Clinic Duty Team), User-Eevans
Eevans added a project to T227776: Generalize ParserCache into a generic service class for large "current" page-derived data: User-Eevans.
Tue, Oct 15, 7:36 PM · CPT Initiatives (Parsoid REST API in PHP (CDP2)), User-Eevans, User-mobrovac, TechCom, User-Daniel, Proposal
Eevans moved T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release) from Next to In-Progress on the User-Eevans board.
Tue, Oct 15, 7:31 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans added a comment to T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).

If I hear no objections, I will upgrade a canary node in each datacenter of the RESTBase cluster on Monday, and plan to upgrade the remaining nodes on Tuesday if everything checks out.

Tue, Oct 15, 4:29 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans added a comment to T234464: Echostore service endpoints.

I've done all the puppet/dns prep work. You can now proceed to prepare this new kask deployment in operations/deployment-charts.
[ ... ]

Tue, Oct 15, 3:23 PM · serviceops, Operations, Core Platform Team Workboards (User Stories), Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)

Oct 11 2019

Eevans updated subscribers of T235299: Cassandra cluster management support for multi-tenancy.

/cc @mobrovac

Oct 11 2019, 8:03 PM · Cassandra, Core Platform Team Workboards (Clinic Duty Team), User-Eevans
Eevans added a project to T235299: Cassandra cluster management support for multi-tenancy: Core Platform Team Workboards (Clinic Duty Team).
Oct 11 2019, 8:03 PM · Cassandra, Core Platform Team Workboards (Clinic Duty Team), User-Eevans
Eevans moved T235299: Cassandra cluster management support for multi-tenancy from Backlog to Next on the User-Eevans board.
Oct 11 2019, 7:59 PM · Cassandra, Core Platform Team Workboards (Clinic Duty Team), User-Eevans
Eevans updated subscribers of T235299: Cassandra cluster management support for multi-tenancy.

/cc @elukey, @Joe

Oct 11 2019, 7:59 PM · Cassandra, Core Platform Team Workboards (Clinic Duty Team), User-Eevans
Eevans triaged T235299: Cassandra cluster management support for multi-tenancy as Low priority.
Oct 11 2019, 7:58 PM · Cassandra, Core Platform Team Workboards (Clinic Duty Team), User-Eevans
Eevans updated the task description for T235299: Cassandra cluster management support for multi-tenancy.
Oct 11 2019, 7:57 PM · Cassandra, Core Platform Team Workboards (Clinic Duty Team), User-Eevans
Eevans added a comment to T234374: Provision Cassandra access for Echo timestamp storage.

T235299: Cassandra cluster management support for multi-tenancy has been created for follow-up, this issue is otherwise complete.

Oct 11 2019, 7:36 PM · Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans updated the task description for T235299: Cassandra cluster management support for multi-tenancy.
Oct 11 2019, 7:34 PM · Cassandra, Core Platform Team Workboards (Clinic Duty Team), User-Eevans
Eevans created T235299: Cassandra cluster management support for multi-tenancy.
Oct 11 2019, 7:33 PM · Cassandra, Core Platform Team Workboards (Clinic Duty Team), User-Eevans

Oct 10 2019

Eevans added a comment to T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).

The session storage cluster has been upgraded, and things look good. If I hear no objections, I will upgrade a canary node in each datacenter of the RESTBase cluster on Monday, and plan to upgrade the remaining nodes on Tuesday if everything checks out.

Oct 10 2019, 7:57 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans updated the task description for T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).
Oct 10 2019, 4:24 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra

Oct 9 2019

Eevans added a comment to T234374: Provision Cassandra access for Echo timestamp storage.

The following has been created on the RESTBase cluster:

Oct 9 2019, 8:15 PM · Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans updated subscribers of T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).

Applicable staging/test environments have been updated to 3.11.4 and no issues are apparent. If this continues to look OK, and there are no objections, I will upgrade the production sessionstore cluster tomorrow (technically, it is not yet in production).

Oct 9 2019, 8:04 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans renamed T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release) from Test/evaluate Cassandra 3.11.4 for production upgrade to Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).
Oct 9 2019, 8:02 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans updated the task description for T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).
Oct 9 2019, 8:01 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans updated the task description for T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).
Oct 9 2019, 7:58 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans updated the task description for T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).
Oct 9 2019, 7:49 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans updated the task description for T200803: Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release).
Oct 9 2019, 7:42 PM · Core Platform Team Workboards (Green), Patch-For-Review, User-Eevans, Cassandra
Eevans awarded T234928: RESTBase sometimes not retaining stashed content? a Stroopwafel token.
Oct 9 2019, 4:45 PM · Core Platform Team Workboards (Clinic Duty Team), RESTBase-Cassandra, Cassandra, RESTBase
Eevans added a comment to T234928: RESTBase sometimes not retaining stashed content?.

Ha! I think I found the bug. The problem is that when a user first opens the page and starts editing with VE, then VE calls /page/html/{title}?stash=true, but RESTBase expects the revision ID to be present as well, which causes it to store the page under the key {title}:undefined:{tid} instead of {title}:{revid}:{tid} in the stash bucket. However, if the user tries to VE-edit a page that they edited before (i.e. it was already loaded, because they have already edited it either via VE or the wt editor), then VE calls /page/html/{title}/{revision}?stash=true.
I have deployed the fix as well as a fallback to look for {title}:undefined:{tid} in case the original stash could not be found, and so far so good - no transforms are failing! I will keep the task open and monitor for a little while longer to ensure this is truly the case (and that no other edge cases materialise).
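
The key mismatch and the fallback lookup can be sketched as follows. This is an illustration in Python, not RESTBase's actual code; the function names and the dict standing in for the stash bucket are assumptions.

```python
def stash_key(title, revid, tid):
    """Build a stash key of the form {title}:{revid}:{tid}."""
    return "{}:{}:{}".format(title, revid, tid)


def load_stash(stash, title, revid, tid):
    """Look up stashed content, falling back to the key that was
    written when no revision ID was supplied at stash time."""
    value = stash.get(stash_key(title, revid, tid))
    if value is None:
        # Entries stored before the fix ended up under 'undefined'.
        value = stash.get(stash_key(title, "undefined", tid))
    return value


# Illustrative data: content stashed without a revision ID.
stash = {stash_key("Foo", "undefined", "tid-1"): "<html>...</html>"}
```

A lookup with the real revision ID misses the primary key and then finds the entry under the `undefined` key, which matches the behaviour reported in the comment.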

Oct 9 2019, 4:45 PM · Core Platform Team Workboards (Clinic Duty Team), RESTBase-Cassandra, Cassandra, RESTBase

Oct 8 2019

Eevans added a comment to T234374: Provision Cassandra access for Echo timestamp storage.

Additionally, a dedicated application user should be created, with corresponding access rights to the allocated table.

Oct 8 2019, 7:10 PM · Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans removed projects from T234961: Deploy migration config: Core Platform Team Workboards (User Stories), Story.
Oct 8 2019, 4:47 PM · Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans created T234963: Deploy final configuration.
Oct 8 2019, 4:46 PM · Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans created T234961: Deploy migration config.
Oct 8 2019, 4:42 PM · Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans added a comment to T234928: RESTBase sometimes not retaining stashed content?.

@Pchelolo @Eevans could you take a look? Perhaps I've overlooked something, but to me this is starting to smell like Cassandra is losing data somehow.

Oct 8 2019, 2:59 PM · Core Platform Team Workboards (Clinic Duty Team), RESTBase-Cassandra, Cassandra, RESTBase
Eevans closed T209110: Logging for the session storage service as Resolved.

We've been calling this out as a blocker to moving session storage to production, so I guess what I'm trying to determine is: Are we still blocked?

For a few more days, it looks like yes.

I think this means the block has been removed.

Oct 8 2019, 2:47 PM · CPT Initiatives (Session Management Service (CDP2)), Patch-For-Review, User-Clarakosi, User-Eevans
Eevans closed T209110: Logging for the session storage service, a subtask of T206016: Create a service for session storage, as Resolved.
Oct 8 2019, 2:47 PM · CPT Initiatives (Multi-DC (TEC1)), User-Clarakosi, User-Eevans

Oct 3 2019

Eevans added a comment to T227514: k8s liveness check(?) generating session storage log noise.

I think there were (implicitly) two issues related to this open task: a) a superfluous log message (aka log spam), and b) unstructured log messages. The latter is now solved, the former is not. Possible options for addressing (a):

Oct 3 2019, 8:21 PM · CPT Initiatives (Multi-DC (TEC1)), serviceops
Eevans committed rDEPLOYCHARTS4830f43bccc9: sessionstore: Upgrade to v1.0.5 release (authored by Eevans).
sessionstore: Upgrade to v1.0.5 release
Oct 3 2019, 7:58 PM
Eevans added a comment to T227514: k8s liveness check(?) generating session storage log noise.

Deployed to staging, the log output now looks like:

Oct 3 2019, 7:15 PM · CPT Initiatives (Multi-DC (TEC1)), serviceops
Eevans committed rDEPLOYCHARTS2da687d1b87d: staging/sessionstore: Upgrade image to 2019-10-03-182310-production (authored by Eevans).
staging/sessionstore: Upgrade image to 2019-10-03-182310-production
Oct 3 2019, 7:10 PM
Eevans added projects to T234464: Echostore service endpoints: Operations, serviceops.
Oct 3 2019, 6:25 PM · serviceops, Operations, Core Platform Team Workboards (User Stories), Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans added projects to T234376: Provision Kask for Echo timestamp storage in k8s: Operations, serviceops.
Oct 3 2019, 6:24 PM · Patch-For-Review, Core Platform Team Workboards (Green), serviceops, Operations, Notifications, Growth-Team, CPT Initiatives (Multi-DC Echo Notification Storage)
Eevans committed rMSKSdbb8ec9ea896: Use Logger as Writer for log module (authored by Eevans).
Use Logger as Writer for log module
Oct 3 2019, 6:20 PM