jcrespo (Jaime Crespo)
Sr Database Administrator

User Details

User Since
May 11 2015, 8:31 AM (239 w, 1 d)
Availability
Available
IRC Nick
jynus
LDAP User
Jcrespo
MediaWiki User
JCrespo (WMF) [ Global Accounts ]

Recent Activity

Yesterday

jcrespo edited Description on Operations.
Tue, Dec 10, 5:42 PM
jcrespo edited Description on Operations.
Tue, Dec 10, 5:39 PM
jcrespo edited Description on Operations.
Tue, Dec 10, 5:37 PM
jcrespo edited Description on Operations.
Tue, Dec 10, 5:31 PM
jcrespo edited Description on Operations.
Tue, Dec 10, 5:30 PM
jcrespo edited Description on Operations.
Tue, Dec 10, 5:20 PM
jcrespo edited Description on Operations.
Tue, Dec 10, 5:17 PM
jcrespo added a comment to T240243: Add accraze to analytics-privatedata-users.

Thanks!

Tue, Dec 10, 5:05 PM · Patch-For-Review, Analytics, Operations, SRE-Access-Requests
jcrespo claimed T240243: Add accraze to analytics-privatedata-users.
Tue, Dec 10, 5:05 PM · Patch-For-Review, Analytics, Operations, SRE-Access-Requests
jcrespo moved T240243: Add accraze to analytics-privatedata-users from Backlog to Acknowledged on the Operations board.
Tue, Dec 10, 4:50 PM · Patch-For-Review, Analytics, Operations, SRE-Access-Requests
jcrespo moved T240285: Clean up DNS server puppetization from Backlog to Acknowledged on the Operations board.
Tue, Dec 10, 4:46 PM · Operations, Traffic
jcrespo assigned T239957: Degraded RAID on cloudelastic1002 to Mathew.onipe.

Assigning to Mathew based on above update as part of clinic duty. Feel free to revert if this is wrong.

Tue, Dec 10, 4:37 PM · Discovery-Search (Current work), Discovery, ops-eqiad, Operations
jcrespo added a comment to T238048: Followup to backup1001 bacula switchover (misc pending tasks).

Copy jobs are running now; we will see how long a full copy takes.

Tue, Dec 10, 3:24 PM · Goal, Operations
jcrespo updated the task description for T138562: Improve regular production database backups handling.
Tue, Dec 10, 3:14 PM · Epic, DBA
jcrespo added a comment to T240341: redirect non-existing wikimania2020.wikimedia.org to wikimania.wikimedia.org.

I am just here doing clinic duty for the Operations tag. Traffic should decide on this ticket, but based on my (limited) understanding of our setup, I suggest we should not do this unless there is a really good reason to.

Tue, Dec 10, 2:53 PM · Traffic, Operations, DNS
jcrespo awarded T237259: Document all uses of the puppetCA certificate a Love token.
Tue, Dec 10, 2:08 PM · Patch-For-Review, User-jbond, Puppet, Operations
jcrespo reassigned T240177: backup2001 rebooted itself from jcrespo to Papaul.

This is not the first time this has happened: see T237730. The firmware was updated at that time.

Tue, Dec 10, 11:46 AM · Operations, DBA
jcrespo assigned T240303: Add wikiworkshop.org to the Foundation's DNS to leila.

Assigning to @leila as per BBlack and Reedy comments, as there seems to be some additional information required. Please feel free to reassign to the right person you are in contact with, as per your original comment there may be 3rd parties involved. Other than that, I will let Traffic handle the request on their own (I am just trying to move forward tasks while on clinic duty).

Tue, Dec 10, 11:40 AM · Research, Traffic, DNS, Operations
jcrespo moved T240243: Add accraze to analytics-privatedata-users from Untriaged to Awaiting User Input on the SRE-Access-Requests board.
Tue, Dec 10, 11:35 AM · Patch-For-Review, Analytics, Operations, SRE-Access-Requests
jcrespo added a comment to T223463: (2019-09) Create secteam groups in admin.yaml and define permissions.

Hey, @chasemp, is this on your radar (a lot of time has passed since the last update)? If yes, but "there is need of some discussion and work not involving SRE", I would remove the SRE-Access-Requests tag so it doesn't appear on the clinic duty dashboard. If no, maybe this should be closed and a different task should be opened with further actionables (technically, the title has already been fulfilled: secteam-users exist on production). If yes, but SREs are blocking work, please let us know how. Cheers!

Tue, Dec 10, 11:34 AM · SRE-Access-Requests, Operations, Security-Team, Patch-For-Review
jcrespo assigned T240243: Add accraze to analytics-privatedata-users to Nuria.

Please reassign to me when ok or if there are comments.

Tue, Dec 10, 11:27 AM · Patch-For-Review, Analytics, Operations, SRE-Access-Requests
jcrespo added a comment to T240243: Add accraze to analytics-privatedata-users.

^I have prepared the patch to merge it as soon as everybody agrees.

Tue, Dec 10, 11:26 AM · Patch-For-Review, Analytics, Operations, SRE-Access-Requests
jcrespo claimed T240177: backup2001 rebooted itself.
Tue, Dec 10, 11:16 AM · Operations, DBA
jcrespo added a comment to T240243: Add accraze to analytics-privatedata-users.

@Nuria See the original request at T226204#5279623, where @Ottomata suggested this group but it was not added. Is this something you approve, as an addendum to the original request?

Tue, Dec 10, 11:13 AM · Patch-For-Review, Analytics, Operations, SRE-Access-Requests
jcrespo added a comment to T219592: Frequent Echo DB_MASTER write queries on HTTP GET.

though this case is complicated since people want their "latest views" to be immediately reflected

Tue, Dec 10, 9:36 AM · CPT Initiatives (Multi-DC (TEC1)), Growth-Team, Notifications, Services (watching), Performance-Team (Radar), Availability (MediaWiki-MultiDC)
jcrespo updated subscribers of T183485: Please consider purging/moving the cx_corpora table at x1.

FWD: @Marostegui You may want to defragment the named table before answering the question.

Tue, Dec 10, 9:27 AM · Language-Team (Language-2019-July-September), ContentTranslation

Thu, Dec 5

jcrespo added a comment to T239900: Sync understanding of MediaWiki rdbms 'weight' behaviour with DBAs.

we'd still probably lose the ability to reuse the same opened connection

Thu, Dec 5, 5:38 PM · Core Platform Team Workboards (Clinic Duty Team), DBA, Wikimedia-Rdbms
jcrespo added a comment to T239900: Sync understanding of MediaWiki rdbms 'weight' behaviour with DBAs.

Should we consider changing any of this?

Thu, Dec 5, 3:28 PM · Core Platform Team Workboards (Clinic Duty Team), DBA, Wikimedia-Rdbms
jcrespo awarded T239901: Disallow 'weight: 0' for MW db config in dbctl a Dislike token.
Thu, Dec 5, 11:49 AM · Operations, DBA, Wikimedia-Incident
jcrespo added a comment to T215445: comment and actor view challenges for Cloud Services.

I've not yet managed to find suitable ways to join the tables and make some query against revisions and usernames/comments.

Thu, Dec 5, 10:07 AM · cloud-services-team (Kanban), Data-Services

Wed, Dec 4

jcrespo added a comment to T143870: Some mw snapshot hosts are accessing main db servers.

I am seeing db1118 serving dumps. This is a high-throughput main-traffic enwiki replica. I thought at first this was the cause of an outage, but it was unrelated. However, it seems quite worrying.

Wed, Dec 4, 7:04 PM · MW-1.35-notes (1.35.0-wmf.10; 2019-12-10), Dumps-Generation, DBA
jcrespo lowered the priority of T143870: Some mw snapshot hosts are accessing main db servers from Unbreak Now! to High.
Wed, Dec 4, 5:22 PM · MW-1.35-notes (1.35.0-wmf.10; 2019-12-10), Dumps-Generation, DBA
jcrespo raised the priority of T143870: Some mw snapshot hosts are accessing main db servers from Medium to Unbreak Now!.
Wed, Dec 4, 5:15 PM · MW-1.35-notes (1.35.0-wmf.10; 2019-12-10), Dumps-Generation, DBA
jcrespo added a comment to T238048: Followup to backup1001 bacula switchover (misc pending tasks).

Now it is ok:

Wed, Dec 4, 4:53 PM · Goal, Operations
jcrespo updated the task description for T238048: Followup to backup1001 bacula switchover (misc pending tasks).
Wed, Dec 4, 4:47 PM · Goal, Operations
jcrespo added a comment to T238048: Followup to backup1001 bacula switchover (misc pending tasks).

I accidentally scheduled the migration, not the copy.

Wed, Dec 4, 4:39 PM · Goal, Operations
jcrespo added a comment to T238048: Followup to backup1001 bacula switchover (misc pending tasks).

I have provided them already in

Wed, Dec 4, 4:33 PM · Goal, Operations
jcrespo created T239837: prometheus hosts try to start rsync and fails on every puppet run.
Wed, Dec 4, 4:28 PM · observability
jcrespo updated the task description for T238048: Followup to backup1001 bacula switchover (misc pending tasks).
Wed, Dec 4, 4:06 PM · Goal, Operations
jcrespo added a comment to T238048: Followup to backup1001 bacula switchover (misc pending tasks).

Retention change documented at: https://wikitech.wikimedia.org/wiki/Bacula#Modify_a_pool's_retention_(or_other_similar_properties)

Wed, Dec 4, 4:06 PM · Goal, Operations
jcrespo added a comment to T238048: Followup to backup1001 bacula switchover (misc pending tasks).

After update, the pools seem ok, although we probably should also increase the offsite one (creating patch).

*list pool
+--------+------------+---------+---------+-----------------+--------------+---------+----------+-------------+
| PoolId | Name       | NumVols | MaxVols | MaxVolBytes     | VolRetention | Enabled | PoolType | LabelFormat |
+--------+------------+---------+---------+-----------------+--------------+---------+----------+-------------+
|      1 | Default    |       0 |       1 |               0 |  155,520,000 |       1 | Backup   | *           |
|      2 | production |      33 |      60 | 536,870,912,000 |    7,776,000 |       1 | Backup   | production  |
|      3 | Archive    |       2 |       5 | 536,870,912,000 |  157,680,000 |       1 | Backup   | archive     |
|      4 | offsite    |       0 |      60 | 536,870,912,000 |    2,592,000 |       1 | Backup   | offsite     |
|      5 | Databases  |       5 |      60 | 536,870,912,000 |    7,776,000 |       1 | Backup   | databases   |
+--------+------------+---------+---------+-----------------+--------------+---------+----------+-------------+
Wed, Dec 4, 3:17 PM · Goal, Operations
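
As a reading aid for the pool listing above, here is a minimal sketch that converts the raw numbers into friendlier units. It assumes VolRetention is reported in seconds and MaxVolBytes in bytes; the dictionary and helper names are illustrative and not part of Bacula.

```python
# Values copied from the "list pool" output above.
# Assumption: VolRetention is in seconds and MaxVolBytes is in bytes.
SECONDS_PER_DAY = 86_400
GIB = 1024 ** 3

pools = {
    "Default": 155_520_000,
    "production": 7_776_000,
    "Archive": 157_680_000,
    "offsite": 2_592_000,
    "Databases": 7_776_000,
}


def retention_days(seconds: int) -> float:
    """Convert a retention period in seconds to days (hypothetical helper)."""
    return seconds / SECONDS_PER_DAY


for name, seconds in pools.items():
    print(f"{name:<10} {retention_days(seconds):7.1f} days")

# Volume size limit shared by every pool except Default.
max_vol_gib = 536_870_912_000 / GIB
print(f"MaxVolBytes = {max_vol_gib:.0f} GiB")
```

Under those assumptions, the offsite pool keeps volumes for a shorter period than production and Databases, which is consistent with the remark about increasing it.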
jcrespo added a comment to T238048: Followup to backup1001 bacula switchover (misc pending tasks).

Hi, @akosiaris, thanks for the reviews and feedback. Could I have your further thoughts on T238048#5701519 and T238048#5701534? Normally I would just find a solution or workaround on my own, but archive file copy was one of the parts in which I compromised my suggested plan because you were quite confident on its forward compatibility :-/. On the other side, most of those files seem to be around 5 years old, which may mean some should actually be purged. Let me know your thoughts.

Wed, Dec 4, 3:14 PM · Goal, Operations
akosiaris awarded T238048: Followup to backup1001 bacula switchover (misc pending tasks) a Love token.
Wed, Dec 4, 1:05 PM · Goal, Operations
jcrespo added a comment to T238048: Followup to backup1001 bacula switchover (misc pending tasks).

"Offsite Job" seems to be correctly configured as "Copy", but it is not showing any activity. Needs checking.

Wed, Dec 4, 11:44 AM · Goal, Operations
jcrespo added a comment to T239791: DB: perform rolling restart of mariadb deamons to pick up CA changes.

I wonder if some of these could be done on reimage, if/when there is one planned anyway.

Wed, Dec 4, 11:14 AM · DBA, User-jbond, Puppet, Operations

Tue, Dec 3

jcrespo updated the task description for T238048: Followup to backup1001 bacula switchover (misc pending tasks).
Tue, Dec 3, 6:09 PM · Goal, Operations
jcrespo added a comment to T238048: Followup to backup1001 bacula switchover (misc pending tasks).

Yay!

Full           Backup    10  04-Dec-19 02:05    dbprov2002.codfw.wmnet-Monthly-1st-Wed-Databases-mysql-srv-backups-dumps-latest *unknown*
Full           Backup    10  04-Dec-19 02:05    dbprov2001.codfw.wmnet-Monthly-1st-Wed-Databases-mysql-srv-backups-dumps-latest *unknown*
Tue, Dec 3, 6:08 PM · Goal, Operations
jcrespo updated the task description for T238048: Followup to backup1001 bacula switchover (misc pending tasks).
Tue, Dec 3, 5:41 PM · Goal, Operations
jcrespo added a comment to T239415: Modify maintain-views to skip "#" on new dblist files.

3fe8d696da846e6f3be372e8bf62939242857d99 could help inspire this. This is the reference implementation: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/mediawiki-config/+/master/multiversion/MWWikiversions.php#77

Tue, Dec 3, 9:00 AM · Patch-For-Review, cloud-services-team (Kanban), Data-Services
jcrespo awarded T208323: Predictive failures on disk S.M.A.R.T. status a Like token.
Tue, Dec 3, 8:54 AM · Operations, DBA

Mon, Dec 2

jcrespo added a comment to T236833: wt2html: Out of memory crashers.

if the expectation is that the production error tag will give this higher priority compared to other Parsoid bugs, that is not going to be the case right now because of the reality of Parsoid vs Parser.php differences. But, if the tag is just an indicator that this is an exception raised on the production cluster, then, that is fine.

Mon, Dec 2, 3:39 PM · Wikimedia-production-error, serviceops, Operations, Parsoid-PHP
jcrespo added a subtask for T172492: Improve database alerting (tracking): T205628: Handle object metadata backups and compare it with stored database object inventory.
Mon, Dec 2, 10:05 AM · Epic, observability, DBA
jcrespo added a parent task for T205628: Handle object metadata backups and compare it with stored database object inventory: T172492: Improve database alerting (tracking).
Mon, Dec 2, 10:05 AM · DBA
jcrespo updated the task description for T138562: Improve regular production database backups handling.
Mon, Dec 2, 10:03 AM · Epic, DBA

Fri, Nov 29

jcrespo added a comment to T238048: Followup to backup1001 bacula switchover (misc pending tasks).

Same for bast1001:

Fri, Nov 29, 1:26 PM · Goal, Operations
jcrespo added a comment to T238048: Followup to backup1001 bacula switchover (misc pending tasks).

Error while trying to restore sodium contents:

Fri, Nov 29, 1:13 PM · Goal, Operations

Wed, Nov 27

jcrespo triaged T238048: Followup to backup1001 bacula switchover (misc pending tasks) as High priority.
root@db1135.eqiad.wmnet[bacula9]> UPDATE Media SET StorageId = 11 WHERE StorageId = 4;
Query OK, 2 rows affected (0.00 sec)
Rows matched: 2  Changed: 2  Warnings: 0
Wed, Nov 27, 3:43 PM · Goal, Operations
jcrespo moved T234450: Some Special:Contributions requests cause "Error: 0" from database or WMFTimeoutException from Done to Inbox on the Core Platform Team Workboards (Clinic Duty Team) board.
Wed, Nov 27, 12:05 PM · MW-1.35-notes (1.35.0-wmf.5; 2019-11-05), Patch-For-Review, User-notice, Core Platform Team Workboards (Clinic Duty Team), Vuln-DoS, Security, Performance Issue, MediaWiki-Special-pages, Wikimedia-production-error
jcrespo reopened T234450: Some Special:Contributions requests cause "Error: 0" from database or WMFTimeoutException as "Open".

{P9764}

Wed, Nov 27, 12:04 PM · MW-1.35-notes (1.35.0-wmf.5; 2019-11-05), Patch-For-Review, User-notice, Core Platform Team Workboards (Clinic Duty Team), Vuln-DoS, Security, Performance Issue, MediaWiki-Special-pages, Wikimedia-production-error
jcrespo added a comment to T239211: HP SSD Failure Firmware Fix.

Please note those are for SAS disks. I believe we have more SATA ones, which are affected by https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-a00048133ja_jp

Wed, Nov 27, 10:35 AM · ops-eqiad, ops-codfw, Operations, DC-Ops

Tue, Nov 26

jcrespo added a comment to T239261: add wbc_entity_usage to all labs projects?.

On closer inspection, though, this should be closed as invalid: those wikis don't have Wikidata enabled, so there are no such tables. Not all wikis have the same tables; some depend on the plugin configuration.

Tue, Nov 26, 6:45 PM · Data-Services, cloud-services-team, Analytics
jcrespo edited projects for T239261: add wbc_entity_usage to all labs projects?, added: cloud-services-team, Data-Services; removed DBA.
Tue, Nov 26, 6:41 PM · Data-Services, cloud-services-team, Analytics
jcrespo awarded T237559: wfEscapeWikiText() emits error "PHP Notice: Array to string conversion" on Special:Search a Like token.
Tue, Nov 26, 5:55 PM · MW-1.35-notes (1.35.0-wmf.10; 2019-12-10), Discovery-Search (Current work), Wikimedia-production-error, affects-translatewiki.net, MediaWiki-Search
jcrespo added a comment to T237559: wfEscapeWikiText() emits error "PHP Notice: Array to string conversion" on Special:Search.

Thank you for checking!

Tue, Nov 26, 5:55 PM · MW-1.35-notes (1.35.0-wmf.10; 2019-12-10), Discovery-Search (Current work), Wikimedia-production-error, affects-translatewiki.net, MediaWiki-Search
jcrespo added a comment to T238048: Followup to backup1001 bacula switchover (misc pending tasks).

Batch editing the DB

The update should be:

UPDATE Media SET StorageId = 11 WHERE StorageId = 4;
Tue, Nov 26, 5:22 PM · Goal, Operations
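
A batch edit like the one above can be rehearsed before touching the real catalog. The following is only a sketch: it uses an in-memory SQLite database as a stand-in for the MariaDB catalog, a two-column `Media` table that is far simpler than the real one, and a hypothetical `move_storage` helper. It counts the matching rows first and rolls back if the update touches a different number.

```python
import sqlite3

# Stand-in catalog: an in-memory SQLite table with only the columns needed here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Media (MediaId INTEGER PRIMARY KEY, StorageId INTEGER)")
conn.executemany("INSERT INTO Media (StorageId) VALUES (?)", [(4,), (4,), (11,)])
conn.commit()


def move_storage(conn: sqlite3.Connection, old_id: int, new_id: int) -> int:
    """Repoint rows from old_id to new_id, verifying the affected row count."""
    expected = conn.execute(
        "SELECT COUNT(*) FROM Media WHERE StorageId = ?", (old_id,)
    ).fetchone()[0]
    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        cur = conn.execute(
            "UPDATE Media SET StorageId = ? WHERE StorageId = ?", (new_id, old_id)
        )
        if cur.rowcount != expected:
            raise RuntimeError(f"expected {expected} rows, changed {cur.rowcount}")
    return cur.rowcount


changed = move_storage(conn, 4, 11)
remaining = conn.execute("SELECT COUNT(*) FROM Media WHERE StorageId = 4").fetchone()[0]
print(f"changed={changed} remaining={remaining}")
```

Checking the count before and after mirrors the "Rows matched / Changed" figures that the MariaDB client reports for the same statement.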
jcrespo added a comment to T239170: Create a new nova database on m5 named 'nova_cell0'.

It was more a question for @Andrew (which he already answered)

Tue, Nov 26, 4:42 PM · DBA, cloud-services-team (Kanban)
jcrespo added a comment to T239170: Create a new nova database on m5 named 'nova_cell0'.

Does this need backups? cc @jcrespo

Tue, Nov 26, 4:12 PM · DBA, cloud-services-team (Kanban)
jcrespo added a project to T236833: wt2html: Out of memory crashers: Wikimedia-production-error.

This is ongoing, so adding production error tag:

Tue, Nov 26, 6:42 AM · Wikimedia-production-error, serviceops, Operations, Parsoid-PHP
jcrespo added a comment to T237559: wfEscapeWikiText() emits error "PHP Notice: Array to string conversion" on Special:Search.

I am seeing errors at the moment:

/wiki/Special:Search?search=<search string>&ns0=1   ErrorException from line 1591 of /srv/mediawiki/php-1.35.0-wmf.5/includes/GlobalFunctions.php: PHP Notice: Array to string conversion

Such as: https://logstash.wikimedia.org/goto/ba65dd4317ecfef44eac4372c9c13a62

Tue, Nov 26, 6:20 AM · MW-1.35-notes (1.35.0-wmf.10; 2019-12-10), Discovery-Search (Current work), Wikimedia-production-error, affects-translatewiki.net, MediaWiki-Search

Mon, Nov 25

jcrespo added a comment to T228759: Merge the Phabricator Priority values "Low" and "Lowest".

"Lowest" sounds belittling and demotivating

Mon, Nov 25, 3:01 PM · Phabricator
jcrespo added a comment to T99216: Please set up a CNAME for videoserver.wikimedia.org to Video Editing Server.

@Aklapper an answer to T99216#2057570

Mon, Nov 25, 8:17 AM · Traffic, Operations, Internet-Archive, DNS, Wikimedia-Video

Fri, Nov 22

jcrespo added a comment to T234826: Repurpose db1108 as generic Analytics db replica.

I was planning to have only one mariadb instance acting as multi-source

Fri, Nov 22, 12:30 PM · Patch-For-Review, User-Elukey, Analytics-Kanban, Analytics
jcrespo added a comment to T234900: Setup bacula backup monitoring.
Fri, Nov 22, 12:10 PM · Patch-For-Review, Availability, observability, Goal, Operations
jcrespo added a comment to T234826: Repurpose db1108 as generic Analytics db replica.

buster + last version of mariadb

Fri, Nov 22, 10:31 AM · Patch-For-Review, User-Elukey, Analytics-Kanban, Analytics
jcrespo added a comment to T234900: Setup bacula backup monitoring.

I will document the graph when it is "finished" (WIP), but for now:

  • Backup time: end_time - start_time of the last backup
  • Backup level: if it is a Full backup ord('F') => 70, incremental ord('I') => 73 or Differential ord('D') => 68, and other options may exist too.
  • Backup status: terminated successfully ord('T') => 84, still running, aborted by user, fatal error ('f'), ...
Fri, Nov 22, 10:09 AM · Patch-For-Review, Availability, observability, Goal, Operations
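
The numeric encoding described above can be sketched as follows. This is an illustration only, not the actual exporter code: the lookup tables contain just the codes named in the comment, and the function names are hypothetical.

```python
# Hypothetical lookup tables: only the codes mentioned in the comment above.
# A real exporter would likely need to handle more of them.
LEVELS = {"F": "Full", "I": "Incremental", "D": "Differential"}
STATUSES = {"T": "terminated successfully", "f": "fatal error"}


def encode(code: str) -> int:
    """Turn a one-letter code into the numeric value stored in the metric."""
    return ord(code)


def describe(value: int, table: dict) -> str:
    """Map a numeric metric value back to a human-readable label."""
    code = chr(value)
    return table.get(code, f"unknown ({code!r})")


def backup_time(start_time: float, end_time: float) -> float:
    """Duration of the last backup, in the same unit as the inputs."""
    return end_time - start_time


print(encode("F"), describe(73, LEVELS), describe(84, STATUSES))
```

Storing `ord()` of the letter keeps the metric numeric, which a time-series backend requires, while `chr()` recovers the original code when labelling a graph.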
jcrespo added a comment to T234900: Setup bacula backup monitoring.

As I feared, the exporter gets too slow during peak hours: https://grafana.wikimedia.org/d/413r2vbWk/bacula?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-job=gerrit1001.wikimedia.org-Hourly-Sun-production-srv-gerrit-git&from=1574390374636&to=1574405281903

Fri, Nov 22, 8:29 AM · Patch-For-Review, Availability, observability, Goal, Operations

Thu, Nov 21

jcrespo awarded T238301: MediaWiki\Extension\MachineVision\Maintenance\FetchSuggestions::execute: no transaction to commit, something got out of sync regularly erroring since 2019-11-13 a Like token.
Thu, Nov 21, 6:30 PM · MW-1.35-notes (1.35.0-wmf.8; 2019-11-26), Product-Infrastructure-Team-Backlog (Kanban), Machine vision, Wikimedia-production-error
jcrespo added a comment to T237229: Schema change procedure documentation blanked, no alternative given for 8 months.

I see, thanks. Again, just to be clear, I don't need this to be an approved policy; I just need something written on mw.org to direct WMF developers and deployers to the wikitech instructions, because of ongoing coordination issues.

Thu, Nov 21, 12:37 PM · TechCom, MediaWiki-General, Documentation
jcrespo added a comment to T234900: Setup bacula backup monitoring.

Working now:

Thu, Nov 21, 9:48 AM · Patch-For-Review, Availability, observability, Goal, Operations
jcrespo added a comment to T237229: Schema change procedure documentation blanked, no alternative given for 8 months.

It is not in progress, I commented precisely there saying that.

Thu, Nov 21, 9:20 AM · TechCom, MediaWiki-General, Documentation
jcrespo added a comment to T234900: Setup bacula backup monitoring.

There is a bug:

Thu, Nov 21, 9:11 AM · Patch-For-Review, Availability, observability, Goal, Operations

Wed, Nov 20

jcrespo added a comment to T234900: Setup bacula backup monitoring.

This is what I got so far (only per-job information so far):

Wed, Nov 20, 8:45 AM · Patch-For-Review, Availability, observability, Goal, Operations

Fri, Nov 15

jcrespo added a comment to T238370: Apply schema changes for OAuth 2.0.

I doubt labtestwiki has replicas...

Fri, Nov 15, 4:55 PM · CPT Initiatives (OAuth 2.0), Blocked-on-schema-change, DBA
jcrespo added a comment to T231858: Archive data on eventlogging MySQL to analytics replica before decomisioning .

If you don't plan to recover the data, and it is for archival purposes, that is ok. However, I strongly suggest using mydumper in the future; otherwise a recovery on a single thread would take around 5 days and would make a partial recovery very difficult. That is precisely why we wrap backup_mariadb.py and recover_dump.py, so that sane defaults are used. Taking the backup would also have been 5-10 times faster.

Fri, Nov 15, 8:08 AM · Analytics-Kanban, Analytics, Analytics-EventLogging
jcrespo added a comment to T238301: MediaWiki\Extension\MachineVision\Maintenance\FetchSuggestions::execute: no transaction to commit, something got out of sync regularly erroring since 2019-11-13.

Is this serious enough that we should halt the script and deploy a fix immediately, or can it wait until after the current run finishes?

Fri, Nov 15, 8:00 AM · MW-1.35-notes (1.35.0-wmf.8; 2019-11-26), Product-Infrastructure-Team-Backlog (Kanban), Machine vision, Wikimedia-production-error

Thu, Nov 14

akosiaris awarded T236406: Switchover backup director service from helium to backup1001 a Yellow Medal token.
Thu, Nov 14, 7:25 PM · Patch-For-Review, Goal, DBA, serviceops, Operations
jcrespo added a comment to T238296: job queue insert rate metrics gone from Grafana.

Just to be clear, I wasn't suggesting removing it; it was mostly about fixing the missing metrics and making things easier to find and document.

Thu, Nov 14, 10:00 AM · Core Platform Team Workboards (Clinic Duty Team), serviceops, WMF-JobQueue, MediaWiki-JobQueue, observability
jcrespo added a project to T238296: job queue insert rate metrics gone from Grafana: Core Platform Team.

In both cases, be it deprecated or not, we will probably want better discoverability (tags) on the new dashboards, a documentation update at https://wikitech.wikimedia.org/wiki/Kafka_Job_Queue and potentially a link to the above dashboards on the old one (just a suggested fix).

Thu, Nov 14, 8:36 AM · Core Platform Team Workboards (Clinic Duty Team), serviceops, WMF-JobQueue, MediaWiki-JobQueue, observability
jcrespo created T238301: MediaWiki\Extension\MachineVision\Maintenance\FetchSuggestions::execute: no transaction to commit, something got out of sync regularly erroring since 2019-11-13.
Thu, Nov 14, 7:03 AM · MW-1.35-notes (1.35.0-wmf.8; 2019-11-26), Product-Infrastructure-Team-Backlog (Kanban), Machine vision, Wikimedia-production-error
jcrespo added a project to T238296: job queue insert rate metrics gone from Grafana: serviceops.
Thu, Nov 14, 6:40 AM · Core Platform Team Workboards (Clinic Duty Team), serviceops, WMF-JobQueue, MediaWiki-JobQueue, observability
jcrespo created T238296: job queue insert rate metrics gone from Grafana.
Thu, Nov 14, 6:38 AM · Core Platform Team Workboards (Clinic Duty Team), serviceops, WMF-JobQueue, MediaWiki-JobQueue, observability
jcrespo added a comment to T237559: wfEscapeWikiText() emits error "PHP Notice: Array to string conversion" on Special:Search.

Also happening on production (rarely though) according to Logstash.

Thu, Nov 14, 6:14 AM · MW-1.35-notes (1.35.0-wmf.10; 2019-12-10), Discovery-Search (Current work), Wikimedia-production-error, affects-translatewiki.net, MediaWiki-Search

Wed, Nov 13

jcrespo awarded T237650: Renew and deploy GlobalSign unified cert (2019) a Like token.
Wed, Nov 13, 2:16 PM · Operations, Traffic
jcrespo added a comment to T232446: Compress new Wikibase tables.

I'm on the Wikibase team. Can you tell me who said it and where? Maybe I'm missing something. Technically it's not possible, but it's just a matter of sending the proper connection to the class and that's all.

Wed, Nov 13, 2:06 PM · DBA
jcrespo added a comment to T193224: Evaluate and decide the future of relational datastore at WMF after the upgrade of MariaDB 10.1 is finished.

db1114 is now running percona-server 8.0, if anyone wants to test it.

Wed, Nov 13, 10:03 AM · MediaWiki-General, Operations, DBA
jcrespo added a comment to T232446: Compress new Wikibase tables.

on our side (which stores and uses most of the stuff SDC uses), we can safely move to another server; we don't do any joins with other tables in the code.

Wed, Nov 13, 8:22 AM · DBA

Tue, Nov 12

jcrespo added a comment to T238048: Followup to backup1001 bacula switchover (misc pending tasks).

@akosiaris Could you take a quick look to see if this seems like the complete archive contents?
{P9597}

Tue, Nov 12, 11:23 AM · Goal, Operations
jcrespo changed the status of T224589: Migrate dbmonitor hosts to Stretch/Buster, a subtask of T224549: Track remaining jessie systems in production, from Open to Stalled.
Tue, Nov 12, 9:39 AM · Operations
jcrespo changed the status of T224589: Migrate dbmonitor hosts to Stretch/Buster from Open to Stalled.
Tue, Nov 12, 9:39 AM · Operations