
Feb 4 2019

Marostegui added a comment to T215107: Global rename of The_Photographer → Wilfredor: supervision needed.

Thanks for adding the URL @1997kB!

Feb 4 2019, 8:12 AM · MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio, DBA, Wikimedia-Site-requests

Feb 3 2019

Marostegui added a comment to T213670: dbstore1002 Mysql errors.

You did it in the last all hands! :-)
I will walk you through it so you can fix it entirely by yourself!

Feb 3 2019, 8:44 PM · Patch-For-Review, SRE, Product-Analytics, Analytics-Kanban, Analytics
Marostegui added a comment to T215107: Global rename of The_Photographer → Wilfredor: supervision needed.

Can you also post the wiki with this rename request so we can check which wikis have more edits?

Feb 3 2019, 9:50 AM · MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio, DBA, Wikimedia-Site-requests

Feb 1 2019

Marostegui moved T214840: db2085/db1106 don't boot with 4.9.0-8-amd64 from Triage to In progress on the DBA board.
Feb 1 2019, 11:18 PM · ops-codfw, Patch-For-Review, SRE, DBA
Marostegui triaged T214840: db2085/db1106 don't boot with 4.9.0-8-amd64 as Medium priority.
Feb 1 2019, 11:18 PM · ops-codfw, Patch-For-Review, SRE, DBA
Marostegui moved T215050: Degraded RAID on db1073 from Triage to In progress on the DBA board.
Feb 1 2019, 11:18 PM · DBA, ops-eqiad, SRE
Marostegui added a project to T215050: Degraded RAID on db1073: DBA.
Feb 1 2019, 6:09 AM · DBA, ops-eqiad, SRE
Marostegui assigned T215050: Degraded RAID on db1073 to Cmjohnson.

Let's get it replaced sooner rather than later, as it is a master on m5.

Feb 1 2019, 6:07 AM · DBA, ops-eqiad, SRE

Jan 31 2019

Marostegui closed T215040: dbtree.wikimedia.org down as Resolved.

From what I can see it was failing on the call to: google.setOnLoadCallback(drawChart);

Jan 31 2019, 10:52 PM · SRE
Marostegui created T215040: dbtree.wikimedia.org down.
Jan 31 2019, 9:28 PM · SRE

Jan 30 2019

Marostegui added a comment to T206965: Degraded RAID on dbstore1002.

Thanks!

Jan 30 2019, 10:37 PM · Product-Analytics, Analytics, ops-eqiad, SRE
Marostegui added a comment to T214840: db2085/db1106 don't boot with 4.9.0-8-amd64.

I had a chat with Moritz about this and he was not too sure it would be a kernel thing itself (as in something really wrong with the kernel); it could be some sort of hardware thing, or maybe just a one-off, although you mentioned it was tried several times.

Jan 30 2019, 2:13 PM · ops-codfw, Patch-For-Review, SRE, DBA
Marostegui added a comment to T213670: dbstore1002 Mysql errors.

And after 4 days of trying to alter mep_word_persistence, dbstore1002 crashed again (T213706#4917915)

Jan 30 2019, 1:55 AM · Patch-For-Review, SRE, Product-Analytics, Analytics-Kanban, Analytics
Marostegui added a comment to T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002.

So, basically, the table mep_word_persistence cannot be altered without making dbstore1002 crash. So I guess once we decide to fully migrate, we will need to convert that table to InnoDB after it has been moved to the new servers.
I will do a proof of concept to make sure it works on the new servers.

Jan 30 2019, 1:55 AM · Analytics-Kanban, User-Elukey, Analytics

Jan 28 2019

Marostegui added a comment to T214720: db1114 crashed (HW memory issues).

Go for it!
Thanks for checking it!

Jan 28 2019, 5:53 PM · Patch-For-Review, DBA, SRE, ops-eqiad
Marostegui added a comment to T214796: ms-be1034 icinga alers .

This might be more likely: T214838: ms-be1034 crash

Jan 28 2019, 3:49 PM · SRE
Marostegui added a comment to T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002.

The ALTER TABLE on the last Aria table on the staging database (mep_word_persistence) is still running after 3 days.

Jan 28 2019, 2:35 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui added a comment to T214831: dbstore2001 low on disk space.

+1 to get rid of it:

root@dbstore2001:/srv/backups# ls -lhrt | tail -n5
drwx------ 2 dump dump  24K Feb 28  2018 s1.20180228121150
-rw-r--r-- 1 dump dump   86 Feb 28  2018 dump.s3.log
drwx------ 2 dump dump 7.7M Feb 28  2018 s3.20180228121150
-rw-r--r-- 1 dump dump    0 Feb 28  2018 dump.s4.log
drwx------ 2 dump dump  24K Feb 28  2018 s4.20180228121150
Jan 28 2019, 2:20 PM · SRE, DBA

Jan 26 2019

Marostegui added a comment to T213670: dbstore1002 Mysql errors.

All the tables on incubatorwiki are now InnoDB and replication is catching up.

Jan 26 2019, 4:00 AM · Patch-For-Review, SRE, Product-Analytics, Analytics-Kanban, Analytics
Marostegui added a comment to T213670: dbstore1002 Mysql errors.

s3 thread broke with:

             Last_SQL_Error: Error 'Got error 22 "Invalid argument" from storage engine TokuDB' on query. Default database: 'incubatorwiki'. Query: 'INSERT /* ActorMigration::getInsertValuesWithTempTable  */ INTO `revision_actor_temp` (revactor_rev,revactor_actor,revactor_timestamp,revactor_page) VALUES xxxxx
Replicate_Ignore_Server_Ids:
Jan 26 2019, 3:24 AM · Patch-For-Review, SRE, Product-Analytics, Analytics-Kanban, Analytics
Marostegui moved T214740: Provide access to testreduce* databases on scandium + revoke from ruthenium from Triage to In progress on the DBA board.

@ssastry just to make sure we have all the data we need here, so it is easier, faster and we can avoid mistakes, can you confirm the following info:

Jan 26 2019, 3:15 AM · Patch-For-Review, Parsing-Team--ARCHIVED, DBA
Marostegui closed T213748: swap a2-eqiad PDU with on-site spare as Resolved.

I believe there is nothing else pending here, and this was re-opened just to get an answer from Chris, which was done.
I am going to close this; if anyone feels it should remain open, feel free to reopen it!

Jan 26 2019, 3:07 AM · Analytics-Radar, DBA, ops-eqiad, SRE
Marostegui closed T213748: swap a2-eqiad PDU with on-site spare, a subtask of T212861: Rack A2's hosts alarm for PSU broken, as Resolved.
Jan 26 2019, 3:07 AM · Analytics-Radar, ops-eqiad, SRE
Marostegui moved T214720: db1114 crashed (HW memory issues) from Triage to In progress on the DBA board.
Jan 26 2019, 3:06 AM · Patch-For-Review, DBA, SRE, ops-eqiad
Marostegui assigned T214720: db1114 crashed (HW memory issues) to Cmjohnson.

@jcrespo +1 to reimage/reclone from an existing host (or mariabackup!)

Jan 26 2019, 3:06 AM · Patch-For-Review, DBA, SRE, ops-eqiad
Marostegui closed T196726: db1115 (tendril DB) had OOM for some processes and some hw (memory) issues as Resolved.

I am going to close this for now, as it has been 10 days without issues:

root@db1115:~# free -g
              total        used        free      shared  buff/cache   available
Mem:            125          65           1           1          59          58
Swap:             0
Jan 26 2019, 3:02 AM · ops-eqiad, SRE, DBA

Jan 25 2019

Marostegui closed T214663: Degraded RAID on db2068 as Resolved.

Thanks!

18:15 <+icinga-wm> RECOVERY - HP RAID on db2068 is OK: OK: Slot 0: OK: 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:9, 1I:1:10, 1I:1:11, 1I:1:12 - Controller: OK - Battery/Capacitor: OK
Jan 25 2019, 6:27 PM · SRE, ops-codfw
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Jan 25 2019, 9:51 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Jan 25 2019, 9:51 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui assigned T214663: Degraded RAID on db2068 to Papaul.

@Papaul let's get it replaced - thanks!

Jan 25 2019, 6:06 AM · SRE, ops-codfw

Jan 24 2019

Marostegui added a comment to T212487: Review dbstore1002's non-wiki databases and decide which ones needs to be migrated to the new multi instance setup.

FWIW (I know I'm a little late on this) I think that illustration project was something we either never got off the ground, or haven't looked at in some time.

Jan 24 2019, 8:26 PM · User-Elukey, Analytics-Kanban, DBA, Analytics
Marostegui added a comment to T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002.

These are the only two tables left in Aria after today's alters:

root@dbstore1002.eqiad.wmnet[information_schema]> select TABLE_NAME from tables where ENGINE='Aria' and TABLE_SCHEMA='staging';
+----------------------+
| TABLE_NAME           |
+----------------------+
| mep_word_persistence |
| organic_link         |
+----------------------+
2 rows in set (0.05 sec)
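
As an illustration only (not taken from the task), the remaining conversions could be scripted by generating one ALTER TABLE ... ENGINE=InnoDB statement per table. The Python sketch below only builds the statements and does not connect to any server. The helper name build_engine_alters and its identifier check are assumptions made for this example.

```python
import re

# Hypothetical helper, not from the task: it only builds SQL text.
# Identifiers are restricted to a conservative character set before quoting.
_IDENT = re.compile(r"^[A-Za-z0-9_$]+$")

def build_engine_alters(schema, tables, engine="InnoDB"):
    """Return one ALTER TABLE ... ENGINE statement per table name."""
    for name in [schema, *tables]:
        if not _IDENT.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    return [f"ALTER TABLE `{schema}`.`{table}` ENGINE={engine};" for table in tables]

# The two tables reported as still being Aria in the output above.
remaining = ["mep_word_persistence", "organic_link"]
for statement in build_engine_alters("staging", remaining):
    print(statement)
```

Generating the statements first, rather than running them directly, leaves room to review them and to run the large tables one at a time.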
Jan 24 2019, 7:10 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui updated subscribers of T172497: Fix mediawiki heartbeat model, change pt-heartbeat model to not use super-user, avoid SPOF and switch automatically to the real master without puppet dependency.

@Addshore let us know that a "new" error started happening today, which looks related to this thread (I think):
https://logstash.wikimedia.org/goto/018d06f1ac178c272964fa71b76702e1

Jan 24 2019, 10:50 AM · SRE-Sprint-Week-Sustainability-March2023, Sustainability (Incident Followup), MediaWiki-libs-Rdbms, DBA
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Jan 24 2019, 8:59 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui added a comment to T210713: Drop change_tag.ct_tag column in production.

s2 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1004
  • dbstore1002
  • db1125
  • db1122
  • db1105
  • db1103
  • db1095
  • db1090
  • db1076
  • db1074
  • db1066
Jan 24 2019, 8:58 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Jan 24 2019, 8:57 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Jan 24 2019, 8:39 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Jan 24 2019, 8:31 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Jan 24 2019, 7:56 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Jan 24 2019, 7:49 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Jan 24 2019, 6:45 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
RandomDSdevel awarded T213858: s3 master emergency failover (db1075) a Baby Tequila token.
Jan 24 2019, 12:47 AM · Patch-For-Review, DBA, SRE

Jan 23 2019

Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Jan 23 2019, 4:02 PM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated the task description for T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002.
Jan 23 2019, 2:53 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui added a comment to T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002.

All the TokuDB tables on the staging database have been migrated to InnoDB:

root@DBSTORE[information_schema]> select TABLE_SCHEMA,TABLE_NAME,UPDATE_TIME,TABLE_ROWS from tables where ENGINE='TokuDB' and TABLE_SCHEMA='staging' order by update_time desc;
Empty set (1.20 sec)
Jan 23 2019, 2:52 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui added a comment to T211613: rack/setup/install db11[26-38].eqiad.wmnet.

Thank you!

Jan 23 2019, 2:46 PM · Goal, DBA, ops-eqiad, User-Marostegui, SRE
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Jan 23 2019, 2:22 PM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui edited P8027 (An Untitled Masterwork).
Jan 23 2019, 2:15 PM
Marostegui updated the task description for T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002.
Jan 23 2019, 1:57 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui edited P8027 (An Untitled Masterwork).
Jan 23 2019, 1:57 PM
Marostegui added a comment to T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002.

The following Aria tables have been converted to InnoDB on dbstore1002 on the staging database:

tbayer_test2
tbayer_test1
theodora
tgr_uw_terminating_errors
tgr_gather_user_requests
top_2016_by_month
tgr_revdel_tmp
rad_labeled_user
pageviews_by_country_language
temp3
ve2_pilot_users
wiki_month_registrations
pageviews_per_project_country_v2
th2_experimental_user
ve2_experimental_users
tr_experimental_user
rev_reverted_20k_sample
tr_experimental_user_revision
ve2_experimental_user_revision_stats
woe_wiki_edit_count
rev_ids_20k_sample
overall_control_month_stats
th_link_additions
wikidata_nonbot_reverted_sample
wikidata_nonbot_sample
overall_token_stats_cleaned
overall_token_stats
revert_20150301_commonswiki
tbayer_readnavtimesessions_20160107
tbayer_readnavtimesessions5sec_20160107
tbayer_readnavsessions_20160107
tbayer_test3
pentaho04
pentahoviews_countries
pentahoviews
pentahoviews05
temp
resolved_organic_inlink_count
revert_20150301_ptwiki
resolved_inlink_count
tbayer_readnavevents_20160107
yearly_page_edits
record_impression
revert_20150301_dewiki
user_registration_approx
Jan 23 2019, 1:56 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui updated the task description for T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002.
Jan 23 2019, 1:50 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui created P8027 (An Untitled Masterwork).
Jan 23 2019, 1:50 PM
Marostegui added a comment to P7978 Aria tables on dbstore1002.

Aria tables on the staging database:

root@dbstore1002.eqiad.wmnet[staging]>  select TABLE_SCHEMA,TABLE_NAME,UPDATE_TIME,TABLE_ROWS from information_schema.tables where ENGINE='aria' and TABLE_SCHEMA='staging' order by table_rows asc;
+--------------+-----------------------------------------+---------------------+------------+
| TABLE_SCHEMA | TABLE_NAME                              | UPDATE_TIME         | TABLE_ROWS |
+--------------+-----------------------------------------+---------------------+------------+
| staging      | tbayer_test2                            | 2019-01-14 13:57:19 |          8 |
| staging      | tbayer_test1                            | 2019-01-14 13:57:19 |         28 |
| staging      | theodora                                | 2019-01-14 13:57:37 |         79 |
| staging      | tgr_uw_terminating_errors               | 2019-01-14 13:57:37 |        110 |
| staging      | tgr_gather_user_requests                | 2019-01-14 13:57:37 |        123 |
| staging      | top_2016_by_month                       | 2019-01-14 14:07:43 |        240 |
| staging      | tgr_revdel_tmp                          | 2019-01-14 13:57:37 |        343 |
| staging      | rad_labeled_user                        | 2019-01-14 13:37:44 |       1063 |
| staging      | pageviews_by_country_language           | 2019-01-14 13:35:21 |       1228 |
| staging      | temp3                                   | 2019-01-14 13:57:23 |       2978 |
| staging      | ve2_pilot_users                         | 2019-01-14 14:08:52 |       4189 |
| staging      | wiki_month_registrations                | 2019-01-14 14:08:56 |      11184 |
| staging      | pageviews_per_project_country_v2        | 2019-01-14 13:37:21 |      12274 |
| staging      | th2_experimental_user                   | 2019-01-14 13:57:37 |      14766 |
| staging      | ve2_experimental_users                  | 2019-01-14 14:08:52 |      26971 |
| staging      | tr_experimental_user                    | 2019-01-14 14:07:44 |      41033 |
| staging      | rev_reverted_20k_sample                 | 2019-01-14 13:47:51 |      47289 |
| staging      | tr_experimental_user_revision           | 2019-01-14 14:07:44 |      50696 |
| staging      | ve2_experimental_user_revision_stats    | 2019-01-14 14:08:52 |      61541 |
| staging      | woe_wiki_edit_count                     | 2019-01-14 14:08:57 |      70877 |
| staging      | rev_ids_20k_sample                      | 2019-01-14 13:47:51 |      80000 |
| staging      | overall_control_month_stats             | 2019-01-14 13:21:55 |     407344 |
| staging      | th_link_additions                       | 2019-01-14 13:57:39 |     444682 |
| staging      | wikidata_nonbot_reverted_sample         | 2019-01-14 14:08:54 |     488536 |
| staging      | wikidata_nonbot_sample                  | 2019-01-14 14:08:56 |    1000000 |
| staging      | overall_token_stats_cleaned             | 2019-01-14 13:21:57 |    1028955 |
| staging      | overall_token_stats                     | 2019-01-14 13:22:00 |    1028955 |
| staging      | revert_20150301_commonswiki             | 2019-01-14 13:40:33 |    1460090 |
| staging      | tbayer_readnavtimesessions_20160107     | 2019-01-14 13:57:15 |    1498379 |
| staging      | tbayer_readnavtimesessions5sec_20160107 | 2019-01-14 13:57:19 |    1498379 |
| staging      | tbayer_readnavsessions_20160107         | 2019-01-14 13:57:12 |    1498379 |
| staging      | tbayer_test3                            | 2019-01-14 13:57:23 |    1510999 |
| staging      | pentaho04                               | 2019-01-14 13:37:26 |    1530114 |
| staging      | pentahoviews_countries                  | 2019-01-14 13:37:38 |    1679403 |
| staging      | pentahoviews                            | 2019-01-14 13:37:44 |    1679403 |
| staging      | pentahoviews05                          | 2019-01-14 13:37:32 |    1828373 |
| staging      | temp                                    | 2019-01-14 13:57:36 |    4686386 |
| staging      | resolved_organic_inlink_count           | 2019-01-14 13:40:27 |    4782450 |
| staging      | revert_20150301_ptwiki                  | 2019-01-14 13:41:47 |    4948260 |
| staging      | resolved_inlink_count                   | 2019-01-14 13:40:14 |    5044527 |
| staging      | tbayer_readnavevents_20160107           | 2019-01-14 13:57:05 |    7200310 |
| staging      | yearly_page_edits                       | 2019-01-14 14:09:28 |    8541686 |
| staging      | record_impression                       | 2019-01-14 13:38:48 |   13167822 |
| staging      | revert_20150301_dewiki                  | 2019-01-14 13:41:30 |   14081776 |
| staging      | user_registration_approx                | 2019-01-14 14:08:52 |   21283152 |
| staging      | referer_data                            | 2019-01-14 13:39:59 |   31056556 |
| staging      | pageviews04                             | 2019-01-14 13:33:05 |   43488232 |
| staging      | pageviews                               | 2019-01-14 13:37:20 |   43568621 |
| staging      | pageviews05                             | 2019-01-14 13:35:19 |   51109873 |
| staging      | revert_20150304_enwiki                  | 2019-01-14 13:47:49 |   87573047 |
| staging      | th_subst_template_additions             | 2019-01-14 14:07:32 |   94939126 |
| staging      | page_name_views_dupes                   | 2019-01-14 13:31:12 |  127987744 |
| staging      | sessions_enwiki_20150801                | 2019-01-14 13:56:47 |  154379416 |
| staging      | organic_link                            | 2019-01-14 13:12:06 |  209134095 |
| staging      | mep_word_persistence                    | 2019-01-14 12:34:42 |  435986042 |
+--------------+-----------------------------------------+---------------------+------------+
Jan 23 2019, 1:49 PM
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Jan 23 2019, 1:44 PM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui added a comment to T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002.

I have altered the following tables:

Jan 23 2019, 1:31 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui edited P8023 (An Untitled Masterwork).
Jan 23 2019, 1:30 PM
Marostegui edited P8023 (An Untitled Masterwork).
Jan 23 2019, 1:26 PM
Marostegui edited P8023 (An Untitled Masterwork).
Jan 23 2019, 1:22 PM
Marostegui updated the task description for T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002.
Jan 23 2019, 11:38 AM · Analytics-Kanban, User-Elukey, Analytics
Marostegui added a comment to T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002.

@elukey we need to get rid of TokuDB before importing on the final dbstore hosts.
These are the tables that currently run TokuDB on staging:

Jan 23 2019, 11:37 AM · Analytics-Kanban, User-Elukey, Analytics
Marostegui renamed T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002 from Convert Aria tables to InnoDB on dbstore1002 to Convert Aria/Tokudb tables to InnoDB on dbstore1002.
Jan 23 2019, 11:31 AM · Analytics-Kanban, User-Elukey, Analytics
Marostegui created P8023 (An Untitled Masterwork).
Jan 23 2019, 11:31 AM
Marostegui updated subscribers of T213566: Transferring data from Hadoop to production MySQL database.

Maybe @MoritzMuehlenhoff can give some ideas

Jan 23 2019, 11:03 AM · serviceops-radar, Platform Team Legacy (Watching / External), Services (watching), User-Marostegui, SRE, Article-Recommendation, Analytics
Marostegui added a comment to T211613: rack/setup/install db11[26-38].eqiad.wmnet.

@Cmjohnson do you have a rough ETA for these?
Thanks!

Jan 23 2019, 10:27 AM · Goal, DBA, ops-eqiad, User-Marostegui, SRE
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Jan 23 2019, 9:39 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui added a comment to T210713: Drop change_tag.ct_tag column in production.

s5 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1003
  • dbstore1002
  • db1124
  • db1113
  • db1110
  • db1102
  • db1100
  • db1097
  • db1096
  • db1082
  • db1070
Jan 23 2019, 9:39 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Jan 23 2019, 9:31 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Jan 23 2019, 9:19 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui removed a watcher for DBA: Banyek.
Jan 23 2019, 9:16 AM
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Jan 23 2019, 8:53 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Jan 23 2019, 8:25 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated subscribers of P8014 SlowTimer.
Jan 23 2019, 8:17 AM
Marostegui moved T174802: Archive and drop education program (ep_*) tables on all wikis from Backlog to Pending comment on the DBA board.
Jan 23 2019, 6:57 AM · User-notice-archive, Datasets-General-or-Unknown, Data-Services, DBA
Marostegui moved T108255: Enable MariaDB/MySQL's Strict Mode from Triage to Backlog on the DBA board.
Jan 23 2019, 6:39 AM · SRE-Sprint-Week-Sustainability-March2023, Epic, Beta-Cluster-Infrastructure, DBA, MediaWiki-libs-Rdbms
Marostegui moved T71222: list=logevents slow for users with last log action long time ago from Triage to Backlog on the DBA board.
Jan 23 2019, 6:39 AM · mariadb-optimizer-bug, User-Marostegui, MW-1.33-notes (1.33.0-wmf.22; 2019-03-19), DBA, Performance Issue, MediaWiki-Action-API
Marostegui created P8020 (An Untitled Masterwork).
Jan 23 2019, 6:26 AM
Marostegui added a comment to T210992: Increase parsercache keys TTL from 22 days back to 30 days.

I have merged both changes after the review from @Krinkle (thanks!).
Let's see how it goes

Jan 23 2019, 6:18 AM · Performance-Team (Radar), SRE, DBA

Jan 22 2019

Marostegui created P8019 (An Untitled Masterwork).
Jan 22 2019, 5:52 PM
Marostegui edited projects for T214402: populateCognatePages.php query keeps timing out while waiting for replication, added: MediaWiki-Maintenance-system; removed DBA.

There is really not much we (DBAs) can do about this particular issue other than T172497 - check also T203059#4896539

Jan 22 2019, 4:46 PM · Wikidata, Cognate
Marostegui added a comment to T210713: Drop change_tag.ct_tag column in production.

db1098:3316 has some differences in the change_tag table compared with the rest of the hosts in the section. I am going to get those fixed.

Jan 22 2019, 11:10 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui added a comment to T212308: Rerun maintain-views for all tables to drop valid_tag and tag_summary tables.

This can now fully go, as tag_summary got fully dropped everywhere yesterday.

Jan 22 2019, 9:27 AM · Patch-For-Review, cloud-services-team (Kanban)
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Jan 22 2019, 8:21 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui closed T85757: Dropping user.user_options on wmf databases as Resolved.

All done

Jan 22 2019, 8:08 AM · User-Banyek, Schema-change-in-production, DBA, Schema-change
Marostegui closed T85757: Dropping user.user_options on wmf databases, a subtask of T51188: [DO NOT USE] Schema changes for Wikimedia wikis (tracking) [superseded by #Blocked-on-schema-change], as Resolved.
Jan 22 2019, 8:08 AM · DBA, Tracking-Neverending, Schema-change
Marostegui updated the task description for T85757: Dropping user.user_options on wmf databases.
Jan 22 2019, 8:08 AM · User-Banyek, Schema-change-in-production, DBA, Schema-change
Marostegui updated the task description for T85757: Dropping user.user_options on wmf databases.
Jan 22 2019, 8:08 AM · User-Banyek, Schema-change-in-production, DBA, Schema-change
Marostegui added a comment to T157227: MediaWiki DB tables with columns which references other columns but have different type (tracking) .

From the databases point of view it is all done (I just did a quick check to confirm)

Jan 22 2019, 8:07 AM · MediaWiki-General, User-Reedy, MediaWiki-Change-tagging, Schema-change
Marostegui added a comment to T214003: Merge the "extended-uploader" and "autopatrolled" user groups on Commons.

Ah ok! Thanks :)

Jan 22 2019, 8:04 AM · Patch-For-Review, User-Kizule, Wikimedia-Site-requests, Commons
Marostegui updated the task description for T85757: Dropping user.user_options on wmf databases.
Jan 22 2019, 7:48 AM · User-Banyek, Schema-change-in-production, DBA, Schema-change
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Jan 22 2019, 6:35 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui added a comment to T213670: dbstore1002 Mysql errors.

All the replication threads but x1 started fine.
I have fixed all the x1 rows that failed and it has now caught up

Jan 22 2019, 6:34 AM · Patch-For-Review, SRE, Product-Analytics, Analytics-Kanban, Analytics
Marostegui added a comment to T213670: dbstore1002 Mysql errors.

Another crash happened last night

Thread pointer: 0x0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x48000
mysys/stacktrace.c:247(my_print_stacktrace)[0xbdd6ee]
sql/signal_handler.cc:153(handle_fatal_signal)[0x73dc40]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7fe0261f7330]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7fe02500bc37]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7fe02500f028]
srv/srv0srv.cc:2200(srv_error_monitor_thread)[0x9870aa]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7fe0261ef184]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fe0250d303d]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
190122 00:19:05 mysqld_safe Number of processes running now: 0
190122 00:19:05 mysqld_safe mysqld restarted
190122  0:19:06 [Note] /opt/wmf-mariadb10/bin/mysqld (mysqld 10.0.22-MariaDB) starting as process 18005 ...
2019-01-22 00:19:06 7efde9a2e7c0 InnoDB: Warning: Using innodb_locks_unsafe_for_binlog is DEPRECATED. This option may be removed in future releases. Please use READ COMMITTED transaction isolation level instead, see http://dev.mysql.com/doc/refman/5.6/en/set-transaction.html.
190122  0:19:06 [Note] InnoDB: Using mutexes to ref count buffer pool pages
190122  0:19:06 [Note] InnoDB: The InnoDB memory heap is disabled
190122  0:19:06 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
190122  0:19:06 [Note] InnoDB: Memory barrier is not used
190122  0:19:06 [Note] InnoDB: Compressed tables use zlib 1.2.8
190122  0:19:06 [Note] InnoDB: Using CPU crc32 instructions
190122  0:19:06 [Note] InnoDB: Initializing buffer pool, size = 18.0G
190122  0:19:07 [Note] InnoDB: Completed initialization of buffer pool
190122  0:19:07 [Note] InnoDB: Highest supported file format is Barracuda.
190122  0:19:07 [Note] InnoDB: Log scan progressed past the checkpoint lsn 99856990038248
190122  0:19:07 [Note] InnoDB: Database was not shutdown normally!
190122  0:19:07 [Note] InnoDB: Starting crash recovery.
190122  0:19:07 [Note] InnoDB: Reading tablespace information from the .ibd files...
190122  0:27:20 [Note] InnoDB: Restoring possible half-written data pages
190122  0:27:20 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Doing recovery: scanned up to log sequence number 99856995280896
InnoDB: Doing recovery: scanned up to log sequence number 99857000523776
InnoDB: Doing recovery: scanned up to log sequence number 99857005766656
InnoDB: Doing recovery: scanned up to log sequence number 99857011009536
InnoDB: Doing recovery: scanned up to log sequence number 99857016252416
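
For illustration, a minimal Python sketch (not part of the original log) that extracts the recovery progress lines above and reports how far past the checkpoint LSN the scan has gone. The function name recovery_progress is hypothetical, and the regular expression assumes the exact wording shown in this log.

```python
import re

# Matches the InnoDB progress lines as they are worded in the log above.
LSN_LINE = re.compile(r"scanned up to log sequence number (\d+)")

def recovery_progress(log_text, checkpoint_lsn):
    """Return (lsn, bytes scanned past the checkpoint) for each progress line."""
    return [
        (int(m.group(1)), int(m.group(1)) - checkpoint_lsn)
        for m in LSN_LINE.finditer(log_text)
    ]

sample = """\
InnoDB: Doing recovery: scanned up to log sequence number 99856995280896
InnoDB: Doing recovery: scanned up to log sequence number 99857000523776
"""

# Checkpoint LSN taken from the "Log scan progressed past the checkpoint" line.
for lsn, delta in recovery_progress(sample, 99856990038248):
    print(lsn, delta)
```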
Jan 22 2019, 6:13 AM · Patch-For-Review, SRE, Product-Analytics, Analytics-Kanban, Analytics
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Jan 22 2019, 6:10 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA

Jan 21 2019

Marostegui updated the task description for T208323: Predictive failures on disk S.M.A.R.T. status.
Jan 21 2019, 4:55 PM · SRE, DBA
Marostegui updated the task description for T208323: Predictive failures on disk S.M.A.R.T. status.
Jan 21 2019, 4:54 PM · SRE, DBA
Marostegui added a comment to T188327: Deploy refactored actor storage.

Thanks for the heads up

Jan 21 2019, 4:43 PM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), Platform Team Initiatives (Revision Storage Schema Improvements), Platform Team Workboards (Clinic Duty Team), MW-1.33-notes, MW-1.32-notes, Epic
Marostegui added a comment to P8014 SlowTimer.

Keep in mind that the Actor migration (T188327#4895206) is also ongoing

Jan 21 2019, 1:33 PM
Marostegui updated the task description for T85757: Dropping user.user_options on wmf databases.
Jan 21 2019, 9:50 AM · User-Banyek, Schema-change-in-production, DBA, Schema-change
Marostegui updated the task description for T85757: Dropping user.user_options on wmf databases.
Jan 21 2019, 8:20 AM · User-Banyek, Schema-change-in-production, DBA, Schema-change