akosiaris (Alexandros Kosiaris)
Senior Site Reliability Engineer

User Details

User Since
Oct 3 2014, 8:40 AM (262 w, 6 d)
Availability
Available
IRC Nick
akosiaris
LDAP User
Alexandros Kosiaris
MediaWiki User
AKosiaris (WMF)

Recent Activity

Yesterday

akosiaris committed rDEPLOYCHARTS93cafe931384: echostore: Add namespace creation stanzas (authored by akosiaris).
echostore: Add namespace creation stanzas
Wed, Oct 16, 2:08 PM
akosiaris added a comment to T229209: Strengthen backup infrastructure and support.

Sorry I missed that, thanks for pinging me on T234900.

Wed, Oct 16, 1:29 PM · Patch-For-Review, Goal, DBA, serviceops, Operations

Tue, Oct 15

akosiaris added a comment to T234900: Setup bacula backup monitoring.
  • Certain configured backup is not active (as far as I can see, configurations are not cleaned up on decommission, something to look at)
Tue, Oct 15, 11:46 AM · Availability, observability, Goal, Operations
akosiaris added a comment to T234900: Setup bacula backup monitoring.

There are a number of ready-made plugins for Icinga at https://exchange.nagios.org/directory/Plugins/Backup-and-Recovery/Bacula

Tue, Oct 15, 10:25 AM · Availability, observability, Goal, Operations
akosiaris added a comment to T235479: codfw: 1 VM for idp.

Same as eqiad, LGTM

Tue, Oct 15, 9:46 AM · vm-requests, Operations

Mon, Oct 14

akosiaris added a comment to T223953: Deploy the RESTBase front-end service (RESTRouter) to Kubernetes.

In the interest of splitting off from this task what is probably going to be somewhat of a discussion, I've created subtask T235437 for the rate limiting functionality of RESTBase/RESTRouter.

Mon, Oct 14, 2:41 PM · Core Platform Team Workboards (Clinic Duty Team), CPT Initiatives (RESTBase Split (CDP2)), Patch-For-Review, Release Pipeline, Kubernetes, serviceops, Operations, Service-deployment-requests
akosiaris added a comment to T235437: RESTBase/RESTRouter/service-runner rate limiting plans.

For what it's worth, the poolcounter approach is probably the saner one long term. And per https://www.mediawiki.org/wiki/PoolCounter the protocol is simple enough that having a PoC to gauge whether it is a valid replacement shouldn't take too much work.

Mon, Oct 14, 2:41 PM · service-runner, User-mobrovac, Core Platform Team Workboards (Clinic Duty Team), Services (doing), CPT Initiatives (RESTBase Split (CDP2)), serviceops, Kubernetes, Service-deployment-requests, Operations
akosiaris added a subtask for T223953: Deploy the RESTBase front-end service (RESTRouter) to Kubernetes: T235437: RESTBase/RESTRouter/service-runner rate limiting plans.
Mon, Oct 14, 2:39 PM · Core Platform Team Workboards (Clinic Duty Team), CPT Initiatives (RESTBase Split (CDP2)), Patch-For-Review, Release Pipeline, Kubernetes, serviceops, Operations, Service-deployment-requests
akosiaris added a parent task for T235437: RESTBase/RESTRouter/service-runner rate limiting plans: T223953: Deploy the RESTBase front-end service (RESTRouter) to Kubernetes.
Mon, Oct 14, 2:39 PM · service-runner, User-mobrovac, Core Platform Team Workboards (Clinic Duty Team), Services (doing), CPT Initiatives (RESTBase Split (CDP2)), serviceops, Kubernetes, Service-deployment-requests, Operations
akosiaris triaged T235437: RESTBase/RESTRouter/service-runner rate limiting plans as High priority.
Mon, Oct 14, 2:38 PM · service-runner, User-mobrovac, Core Platform Team Workboards (Clinic Duty Team), Services (doing), CPT Initiatives (RESTBase Split (CDP2)), serviceops, Kubernetes, Service-deployment-requests, Operations
akosiaris created T235437: RESTBase/RESTRouter/service-runner rate limiting plans.
Mon, Oct 14, 2:36 PM · service-runner, User-mobrovac, Core Platform Team Workboards (Clinic Duty Team), Services (doing), CPT Initiatives (RESTBase Split (CDP2)), serviceops, Kubernetes, Service-deployment-requests, Operations
akosiaris committed rDEPLOYCHARTSa41634f83458: Publish cxserver-0.0.8 (authored by akosiaris).
Publish cxserver-0.0.8
Mon, Oct 14, 1:11 PM
akosiaris committed rDEPLOYCHARTS619a431b8c4b: Also add templatemapping to cxserver prod config (authored by KartikMistry).
Also add templatemapping to cxserver prod config
Mon, Oct 14, 1:10 PM

Fri, Oct 11

akosiaris added a comment to T234900: Setup bacula backup monitoring.

The last issue we had with bacula host itself was some sort of storage degradation/failure, no?

Fri, Oct 11, 9:17 AM · Availability, observability, Goal, Operations
Dzahn awarded T234890: Upgrade OTRS to 5.0.38 an Orange Medal token.
Fri, Oct 11, 4:01 AM · serviceops, OTRS, Security

Thu, Oct 10

akosiaris awarded T234641: Wikimedia Technical Conference 2019 Session: Continuous Delivery/Deployment in Wikimedia: The Future of the Deployment Pipeline a Like token.
Thu, Oct 10, 7:27 PM · International-Developer-Events, Wikimedia-Technical-Conference-2019

Wed, Oct 9

akosiaris committed rDEPLOYCHARTS4f850c6b5236: restrouter: Allow egress on 3050/udp as well (authored by akosiaris).
restrouter: Allow egress on 3050/udp as well
Wed, Oct 9, 10:30 PM
akosiaris committed rDEPLOYCHARTSb5fa99f84507: restrouter: Allow the kademlia port in ingress (authored by akosiaris).
restrouter: Allow the kademlia port in ingress
Wed, Oct 9, 10:30 PM

Tue, Oct 8

akosiaris committed rLPRIad316a6bcfdf: k8s: Correctly escape double quotes (authored by akosiaris).
k8s: Correctly escape double quotes
Tue, Oct 8, 10:54 AM
akosiaris committed rLPRId3a0e02273f3: k8s: Escape double quotes (authored by akosiaris).
k8s: Escape double quotes
Tue, Oct 8, 10:47 AM
akosiaris committed rLPRId3c384ee0fbd: k8s: Move rsyslog/prometheus tokens to read only (authored by akosiaris).
k8s: Move rsyslog/prometheus tokens to read only
Tue, Oct 8, 10:47 AM
akosiaris committed rDEPLOYCHARTS962b76028815: RBAC: Add an api-metrics ClusterRole and binding (authored by akosiaris).
RBAC: Add an api-metrics ClusterRole and binding
Tue, Oct 8, 10:14 AM
akosiaris committed rDEPLOYCHARTS0943169c9ebc: admin: Fix typo with Group definition (authored by akosiaris).
admin: Fix typo with Group definition
Tue, Oct 8, 8:10 AM
akosiaris committed rDEPLOYCHARTS6f044feb9f10: admin: Add view clusterrolebinding (authored by akosiaris).
admin: Add view clusterrolebinding
Tue, Oct 8, 8:00 AM
akosiaris closed T234890: Upgrade OTRS to 5.0.38 as Resolved.

Upgrade done.

Tue, Oct 8, 7:53 AM · serviceops, OTRS, Security
akosiaris created T234890: Upgrade OTRS to 5.0.38.
Tue, Oct 8, 7:52 AM · serviceops, OTRS, Security
akosiaris added a comment to T209110: Logging for the session storage service.

We've been calling this out as a blocker to moving session storage to production, so I guess what I'm trying to determine is: Are we still blocked?

Tue, Oct 8, 6:20 AM · CPT Initiatives (Session Management Service (CDP2)), Patch-For-Review, User-Clarakosi, User-Eevans

Mon, Oct 7

akosiaris committed rDEPLOYCHARTS29ce36f19ae7: restrouter: Kademlia should listen on all IPs (authored by akosiaris).
restrouter: Kademlia should listen on all IPs
Mon, Oct 7, 3:27 PM

Thu, Oct 3

akosiaris triaged T234545: Alert on coreDNS misbehaving as High priority.
Thu, Oct 3, 4:24 PM · observability, serviceops
akosiaris created T234545: Alert on coreDNS misbehaving.
Thu, Oct 3, 4:24 PM · observability, serviceops
akosiaris triaged T234544: Alert on 0 zotero requests from zotero as Normal priority.
Thu, Oct 3, 4:21 PM · observability, serviceops, Citoid
akosiaris created T234544: Alert on 0 zotero requests from zotero.
Thu, Oct 3, 4:20 PM · observability, serviceops, Citoid
akosiaris added a comment to T209110: Logging for the session storage service.

@Eevans Logs from kubernetes make it to logstash now, albeit we lack one last change in logstash to correctly parse the JSON fields (for container runtime engine reasons they are JSON-in-JSON). We'll get on that soon.

Great!

Thu, Oct 3, 1:49 PM · CPT Initiatives (Session Management Service (CDP2)), Patch-For-Review, User-Clarakosi, User-Eevans
akosiaris committed rDEPLOYCHARTS3229da692ef3: zotero: Set the currently deployed version (authored by akosiaris).
zotero: Set the currently deployed version
Thu, Oct 3, 6:59 AM
akosiaris committed rDEPLOYCHARTS827891af3d49: zotero: kill logs (authored by akosiaris).
zotero: kill logs
Thu, Oct 3, 6:54 AM

Tue, Oct 1

akosiaris closed T230917: celery-ores-worker service failed on ores100[2,4,5] without any apparent reason or significant log as Resolved.

I'll resolve this. All workers now have the check, with 91 children each; we will be alerted if this deviates too much from the configured thresholds.

Tue, Oct 1, 9:16 PM · Patch-For-Review, Scoring-platform-team (Current), ORES, Operations, serviceops
akosiaris committed rDEPLOYCHARTS50b5d9ab7990: Bump restrouter chart version (authored by akosiaris).
Bump restrouter chart version
Tue, Oct 1, 4:09 PM
akosiaris committed rDEPLOYCHARTS57cbac754184: restrouter: Add ratelimiting support to chart (authored by akosiaris).
restrouter: Add ratelimiting support to chart
Tue, Oct 1, 4:09 PM
akosiaris added a comment to T223953: Deploy the RESTBase front-end service (RESTRouter) to Kubernetes.

@akosiaris regarding rate limiting, you mentioned a (semi-)permanent DNS entry.

Tue, Oct 1, 2:26 PM · Core Platform Team Workboards (Clinic Duty Team), CPT Initiatives (RESTBase Split (CDP2)), Patch-For-Review, Release Pipeline, Kubernetes, serviceops, Operations, Service-deployment-requests

Mon, Sep 30

herron awarded T207200: Revisit the logging work done on Q1 2017-2018 for the standard pod setup a Party Time token.
Mon, Sep 30, 4:52 PM · serviceops, Release-Engineering-Team (Pipeline), Release-Engineering-Team-TODO, Core Platform Team Legacy (Watching / External), Services (watching), Release Pipeline, Operations
akosiaris closed T207200: Revisit the logging work done on Q1 2017-2018 for the standard pod setup as Resolved.
Mon, Sep 30, 4:42 PM · serviceops, Release-Engineering-Team (Pipeline), Release-Engineering-Team-TODO, Core Platform Team Legacy (Watching / External), Services (watching), Release Pipeline, Operations
akosiaris closed T207200: Revisit the logging work done on Q1 2017-2018 for the standard pod setup, a subtask of T198901: Migrate production services to kubernetes using the pipeline, as Resolved.
Mon, Sep 30, 4:42 PM · Release-Engineering-Team (Pipeline), Release-Engineering-Team-TODO, Core Platform Team Legacy (Watching / External), Epic, Services (watching), Operations, Release Pipeline
akosiaris added a comment to T209110: Logging for the session storage service.

@Eevans Logs from kubernetes make it to logstash now, albeit we lack one last change in logstash to correctly parse the JSON fields (for container runtime engine reasons they are JSON-in-JSON). We'll get on that soon.

Mon, Sep 30, 4:33 PM · CPT Initiatives (Session Management Service (CDP2)), Patch-For-Review, User-Clarakosi, User-Eevans
akosiaris added a comment to T207200: Revisit the logging work done on Q1 2017-2018 for the standard pod setup.

Logs are now making it to logstash, so I am gonna boldly resolve this. That being said, there is a minor straggler that needs to be resolved, namely the JSON-in-JSON parsing of logs, as most services ship logs in JSON format which gets wrapped in docker's JSON. Discussion is ongoing in https://gerrit.wikimedia.org/r/539519, although the approach in that patch will probably not be chosen and the parsing of the JSON-in-JSON will be done in logstash.

Mon, Sep 30, 4:27 PM · serviceops, Release-Engineering-Team (Pipeline), Release-Engineering-Team-TODO, Core Platform Team Legacy (Watching / External), Services (watching), Release Pipeline, Operations
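
To illustrate the JSON-in-JSON situation described in the comment above, here is a minimal Python sketch of the unwrapping step. It is not the logstash configuration that was eventually used. It assumes docker's json-file log layout (an outer object with `log`, `stream` and `time` keys), and the function name `unwrap_docker_log` and the `message` fallback key are hypothetical names chosen for this example.

```python
import json


def unwrap_docker_log(line: str) -> dict:
    """Unwrap one docker json-file log line (assumed layout: log/stream/time).

    If the inner "log" payload is itself a JSON object, its fields are merged
    next to the docker metadata; otherwise the raw text is kept under the
    hypothetical "message" key.
    """
    outer = json.loads(line)
    payload = outer.get("log", "")
    meta = {"stream": outer.get("stream"), "time": outer.get("time")}

    try:
        inner = json.loads(payload)
    except (json.JSONDecodeError, TypeError):
        inner = None

    if isinstance(inner, dict):
        # Service emitted structured JSON: expose its fields directly.
        return {**meta, **inner}
    # Plain-text log line: keep it as a single message field.
    return {**meta, "message": str(payload).rstrip("\n")}


if __name__ == "__main__":
    service_line = json.dumps({"level": "info", "msg": "started"}) + "\n"
    wrapped = json.dumps(
        {"log": service_line, "stream": "stdout", "time": "2019-09-30T16:27:00Z"}
    )
    print(unwrap_docker_log(wrapped))
```

The fallback branch matters because not every container writes JSON; a parser that assumes structured payloads everywhere would drop or mangle plain-text lines.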
akosiaris updated the task description for T207200: Revisit the logging work done on Q1 2017-2018 for the standard pod setup.
Mon, Sep 30, 4:19 PM · serviceops, Release-Engineering-Team (Pipeline), Release-Engineering-Team-TODO, Core Platform Team Legacy (Watching / External), Services (watching), Release Pipeline, Operations
akosiaris awarded T222424: configure BGP route damping on IX sessions a Love token.
Mon, Sep 30, 3:08 PM · Operations, netops
akosiaris triaged T234207: Investigate improvements to how puppet manages interfaces as Lowest priority.
Mon, Sep 30, 3:05 PM · Puppet, Operations, netops

Fri, Sep 27

akosiaris closed T233906: Broken network connection on ganeti2001 after reboot as Resolved.

We've sidestepped the problem for now by disabling IPv6 mapped addresses for ganeti hosts. This solves the chicken-and-egg problem, although we should arguably find a way to better configure IPv4 and IPv6 addresses on our hosts instead of relying on tricks like setting the token. I'll resolve this for now.

Fri, Sep 27, 1:15 PM · Operations
akosiaris edited P9212 rsyslog with https://gerrit.wikimedia.org/r/#/q/status:open+project:operations/puppet+branch:production+topic:T207200 applied.
Fri, Sep 27, 6:55 AM
akosiaris edited P9212 rsyslog with https://gerrit.wikimedia.org/r/#/q/status:open+project:operations/puppet+branch:production+topic:T207200 applied.
Fri, Sep 27, 6:54 AM
akosiaris updated the language for P9212 rsyslog with https://gerrit.wikimedia.org/r/#/q/status:open+project:operations/puppet+branch:production+topic:T207200 applied from autodetect to json.
Fri, Sep 27, 6:54 AM
akosiaris created P9212 rsyslog with https://gerrit.wikimedia.org/r/#/q/status:open+project:operations/puppet+branch:production+topic:T207200 applied.
Fri, Sep 27, 6:46 AM
akosiaris archived P9187 msg missing from $!.
Fri, Sep 27, 6:45 AM

Thu, Sep 26

akosiaris claimed T233906: Broken network connection on ganeti2001 after reboot.

Sure.

Thu, Sep 26, 3:48 PM · Operations
akosiaris lowered the priority of T233906: Broken network connection on ganeti2001 after reboot from High to Normal.

Changing priority to normal since the host is now up and running, but we have a chicken and egg problem to solve here.

Thu, Sep 26, 3:14 PM · Operations
akosiaris added a comment to T233906: Broken network connection on ganeti2001 after reboot.

Found it. I had to comment out from /etc/network/interfaces the line

Thu, Sep 26, 3:12 PM · Operations
akosiaris removed a project from T233906: Broken network connection on ganeti2001 after reboot: ops-codfw.

I don't think this is hardware related.

Thu, Sep 26, 3:09 PM · Operations
akosiaris added a comment to T228910: Move restbase chart from local-charts to deployment-charts repository.

I wasn't aware of that. What's the rationale?

Basically, once RESTRouter is fully functional, RESTBase itself will represent only the storage API layer, so to speak. We will have two options at that point: (i) keep using it; or (ii) use kask instead. In the first case, IMHO it's better to keep the service as close to the storage as possible, which would mean keeping RB where it is, while in the latter case RB goes away completely.

Thu, Sep 26, 2:54 PM · Release-Engineering-Team-TODO (201910), RESTBase, Core Platform Team Workboards (Clinic Duty Team), Release-Engineering-Team (Local Dev), Developer Productivity, local-charts
akosiaris added a comment to T228910: Move restbase chart from local-charts to deployment-charts repository.

As I wrote on the patch, I wonder if it's a good idea to put RESTBase in deployment-charts given that it will never be put into k8s, and given that we already have restrouter in here I find it confusing. @akosiaris @Joe thoughts?

Thu, Sep 26, 1:54 PM · Release-Engineering-Team-TODO (201910), RESTBase, Core Platform Team Workboards (Clinic Duty Team), Release-Engineering-Team (Local Dev), Developer Productivity, local-charts
akosiaris added a comment to T223953: Deploy the RESTBase front-end service (RESTRouter) to Kubernetes.
  • set up the rate-limiting DHT inside k8s for RESTRouter (this is currently disabled, and not having rate-limiting is not acceptable)
Thu, Sep 26, 10:14 AM · Core Platform Team Workboards (Clinic Duty Team), CPT Initiatives (RESTBase Split (CDP2)), Patch-For-Review, Release Pipeline, Kubernetes, serviceops, Operations, Service-deployment-requests
akosiaris added a comment to T233831: ores-redis-02 is out of disk space. AOF file is too big.

@Halfak, this is pretty much blog post material. Anyway, I'll try to summarize what I think happened and what we can do below.

Thu, Sep 26, 10:10 AM · Scoring-platform-team (Current), Patch-For-Review, ORES

Wed, Sep 25

akosiaris created P9187 msg missing from $!.
Wed, Sep 25, 4:52 PM
akosiaris closed T223953: Deploy the RESTBase front-end service (RESTRouter) to Kubernetes, a subtask of T198901: Migrate production services to kubernetes using the pipeline, as Resolved.
Wed, Sep 25, 1:42 PM · Release-Engineering-Team (Pipeline), Release-Engineering-Team-TODO, Core Platform Team Legacy (Watching / External), Epic, Services (watching), Operations, Release Pipeline
akosiaris closed T223953: Deploy the RESTBase front-end service (RESTRouter) to Kubernetes, a subtask of T220449: Split RESTBase in two services: storage service and API router/proxy, as Resolved.
Wed, Sep 25, 1:42 PM · CPT Initiatives (RESTBase Split (CDP2)), User-mobrovac, serviceops, Epic, RESTBase
akosiaris closed T223953: Deploy the RESTBase front-end service (RESTRouter) to Kubernetes, a subtask of T228676: Self-service Deployment Pipeline, as Resolved.
Wed, Sep 25, 1:42 PM · Goal, Operations, Release Pipeline, Release-Engineering-Team (Pipeline), serviceops
akosiaris closed T223953: Deploy the RESTBase front-end service (RESTRouter) to Kubernetes as Resolved.

restrouter is up and running, LVS is set up and discovery records have been merged. I think the migration can start. A draft dashboard is present at https://grafana.wikimedia.org/d/ZA_JiypZk/restrouter; however, restrouter differs enough from the other service-runner based services, as far as the statsd-emitted metrics go, that I don't feel qualified to delve more into this. Feel free to amend it to your needs.

Wed, Sep 25, 1:42 PM · Core Platform Team Workboards (Clinic Duty Team), CPT Initiatives (RESTBase Split (CDP2)), Patch-For-Review, Release Pipeline, Kubernetes, serviceops, Operations, Service-deployment-requests
akosiaris committed rDEPLOYCHARTS1b299988657a: calico: Add port 8000 (parsoid) to restrouter (authored by akosiaris).
calico: Add port 8000 (parsoid) to restrouter
Wed, Sep 25, 12:43 PM
akosiaris committed rDEPLOYCHARTSf99f9f1323a6: restrouter: Fix the parsoid port in the configuration (authored by akosiaris).
restrouter: Fix the parsoid port in the configuration
Wed, Sep 25, 12:28 PM
akosiaris committed rDEPLOYCHARTS79efc26f5dd0: Rename codfw releases to production (authored by akosiaris).
Rename codfw releases to production
Wed, Sep 25, 12:28 PM
akosiaris closed T212189: New Service Request: Wikidata Termbox SSR as Resolved.

The service has long been deployed and even has nice dashboards in grafana; resolving.

Wed, Sep 25, 10:09 AM · Core Platform Team Legacy (Later), User-Addshore, serviceops, Services (next), Wikidata-Termbox, Wikidata, Service-deployment-requests, Operations
akosiaris committed rDEPLOYCHARTSc9942d61627f: Bump number of replicas for restrouter (authored by akosiaris).
Bump number of replicas for restrouter
Wed, Sep 25, 9:49 AM
akosiaris committed rDEPLOYCHARTSb93b2c1c5cc4: restrouter: Stop passing the image parameter (authored by akosiaris).
restrouter: Stop passing the image parameter
Wed, Sep 25, 8:58 AM
akosiaris committed rDEPLOYCHARTS5d3f419d7905: Remove old deprecated helper scripts (authored by akosiaris).
Remove old deprecated helper scripts
Wed, Sep 25, 8:58 AM

Tue, Sep 24

akosiaris committed rDEPLOYCHARTSf5c3a0b84a0a: restrouter: Skip using https for mwapi_uri (authored by akosiaris).
restrouter: Skip using https for mwapi_uri
Tue, Sep 24, 2:38 PM
akosiaris committed rDEPLOYCHARTS294664cdfdf6: restrouter: Skip probes for the first 60 seconds (authored by akosiaris).
restrouter: Skip probes for the first 60 seconds
Tue, Sep 24, 2:25 PM
akosiaris committed rDEPLOYCHARTSa8fede2878ed: Publish restrouter 0.0.4 chart version (authored by akosiaris).
Publish restrouter 0.0.4 chart version
Tue, Sep 24, 2:21 PM
akosiaris added a comment to P9132 restrouter error logs.

And the config (sanitized)

Tue, Sep 24, 1:20 PM · RESTBase
akosiaris added a comment to P9132 restrouter error logs.

Tue, Sep 24, 1:09 PM · RESTBase
akosiaris committed rLPRIdb71ac548548: Populate dummy kubernetes tokens for syslog (authored by akosiaris).
Populate dummy kubernetes tokens for syslog
Tue, Sep 24, 9:29 AM

Sat, Sep 21

stjn awarded T211881: graphoid: Code stewardship request a Heartbreak token.
Sat, Sep 21, 12:17 PM · Release-Engineering-Team-TODO (201908), Release-Engineering-Team (Code Health), Core Platform Team Legacy (Watching / External), Services (watching), Operations, Code-Stewardship-Reviews, Graphoid

Fri, Sep 20

akosiaris added a comment to P9132 restrouter error logs.

After merging and deploying https://gerrit.wikimedia.org/r/#/c/operations/deployment-charts/+/538242/ and https://gerrit.wikimedia.org/r/#/c/operations/deployment-charts/+/538241/, more or less the same.

Fri, Sep 20, 11:33 AM · RESTBase
akosiaris committed rDEPLOYCHARTSc485c5ba3078: Release restrouter chart version 0.0.3 (authored by akosiaris).
Release restrouter chart version 0.0.3
Fri, Sep 20, 11:22 AM
akosiaris committed rDEPLOYCHARTSa124e95ebbb2: restrouter: Upgrade to version v1.1.1 (authored by akosiaris).
restrouter: Upgrade to version v1.1.1
Fri, Sep 20, 11:14 AM

Thu, Sep 19

akosiaris committed rDEPLOYCHARTSa0a41b06ad9c: scaffold: Fix bug with concatenation of args/command (authored by akosiaris).
scaffold: Fix bug with concatenation of args/command
Thu, Sep 19, 4:14 PM
akosiaris awarded T233189: Requesting access to Ops Group for papaul@ a Love token.
Thu, Sep 19, 3:37 PM · Operations, SRE-Access-Requests
akosiaris added a comment to P9132 restrouter error logs.

The previous was with v1.0.0-RC2

Thu, Sep 19, 12:39 PM · RESTBase
akosiaris renamed T233028: Define an SLIs/SLOs for wikifeeds from Define an SLA for wikifeeds to Define an SLIs/SLOs for wikifeeds.
Thu, Sep 19, 12:23 PM · Product-Infrastructure-Team-Backlog, Wikifeeds
akosiaris added a comment to T233028: Define an SLIs/SLOs for wikifeeds.

For starters, let me say that the service owners should be the ones setting the SLIs/SLOs and those should be the ones the team can commit to. They are also not set in stone, but can be amended to better reflect the present reality (e.g. in case the SLOs were set very optimistically and it's impossible to reach them, or so pessimistically that they are always hit with extreme ease despite prolonged outages of the service) as long as they are clearly communicated and advertised (updating the wikipage and an email should suffice).

Thu, Sep 19, 12:23 PM · Product-Infrastructure-Team-Backlog, Wikifeeds
akosiaris awarded T233298: Proposal: simplify set up of basic CI jobs for new projects a Love token.
Thu, Sep 19, 11:22 AM · Release-Engineering-Team (CI & Testing services), serviceops-radar, Continuous-Integration-Infrastructure
akosiaris added a comment to T233298: Proposal: simplify set up of basic CI jobs for new projects.

I like this approach; it has the benefit of removing toil from releng and abstracting CI from repo owners, as they only have to care about a well-documented and well-defined contract point.

Thu, Sep 19, 11:22 AM · Release-Engineering-Team (CI & Testing services), serviceops-radar, Continuous-Integration-Infrastructure
akosiaris updated the task description for T233298: Proposal: simplify set up of basic CI jobs for new projects.
Thu, Sep 19, 11:21 AM · Release-Engineering-Team (CI & Testing services), serviceops-radar, Continuous-Integration-Infrastructure
akosiaris created P9132 restrouter error logs.
Thu, Sep 19, 9:43 AM · RESTBase

Wed, Sep 18

akosiaris added a comment to T223953: Deploy the RESTBase front-end service (RESTRouter) to Kubernetes.

Going forward with Plan #1 (which I also find better)

Wed, Sep 18, 2:35 PM · Core Platform Team Workboards (Clinic Duty Team), CPT Initiatives (RESTBase Split (CDP2)), Patch-For-Review, Release Pipeline, Kubernetes, serviceops, Operations, Service-deployment-requests
akosiaris reopened T170455: Extract the feed endpoints from PCS into a new wikifeeds service, a subtask of T229286: Resolve service instability due to excessive event loop blockage since starting PCS response pregeneration, as Open.
Wed, Sep 18, 2:05 PM · Product-Infrastructure-Team-Backlog, Epic, Page Content Service, Mobile-Content-Service, serviceops
akosiaris reopened T170455: Extract the feed endpoints from PCS into a new wikifeeds service as "Open".
Wed, Sep 18, 2:05 PM · Core Platform Team Workboards (Clinic Duty Team), Product-Infrastructure-Team-Backlog, Epic, Wikifeeds, Patch-For-Review, Page Content Service
akosiaris reopened T170455: Extract the feed endpoints from PCS into a new wikifeeds service, a subtask of T169242: Develop Page Content Service for Reading Clients, as Open.
Wed, Sep 18, 2:05 PM · Page Content Service, Product-Infrastructure-Team-Backlog, Epic, Reading Epics (Platform JS CSS and HTML consolidation)
akosiaris closed T170455: Extract the feed endpoints from PCS into a new wikifeeds service, a subtask of T229286: Resolve service instability due to excessive event loop blockage since starting PCS response pregeneration, as Resolved.
Wed, Sep 18, 2:04 PM · Product-Infrastructure-Team-Backlog, Epic, Page Content Service, Mobile-Content-Service, serviceops
akosiaris closed T170455: Extract the feed endpoints from PCS into a new wikifeeds service, a subtask of T169242: Develop Page Content Service for Reading Clients, as Resolved.
Wed, Sep 18, 2:04 PM · Page Content Service, Product-Infrastructure-Team-Backlog, Epic, Reading Epics (Platform JS CSS and HTML consolidation)
akosiaris closed T170455: Extract the feed endpoints from PCS into a new wikifeeds service as Resolved.

https://grafana.wikimedia.org/d/35vIuGpZk/wikifeeds?refresh=1m&orgId=1 has a tentative dashboard. Feel free to augment it.

Wed, Sep 18, 2:04 PM · Core Platform Team Workboards (Clinic Duty Team), Product-Infrastructure-Team-Backlog, Epic, Wikifeeds, Patch-For-Review, Page Content Service
akosiaris added a comment to T225128: Move cloudvirtan* hardware out of CloudVPS back into production Analytics VLAN..

Proposed fix for asw2-b:

delete interfaces interface-range cloud-hosts1-b-eqiad member xe-4/0/5
set interfaces interface-range vlan-analytics1-b-eqiad member xe-4/0/5
Wed, Sep 18, 1:55 PM · Analytics-Kanban, ops-eqiad, Operations, netops, Analytics