Banyek (Balazs Pocze)
User

User Details

User Since
Aug 27 2018, 1:02 PM (15 w, 6 d)
Availability
Available
LDAP User
Banyek
MediaWiki User
Unknown

Recent Activity

Fri, Dec 14

Banyek added a comment to T209488: Global rename of Massimo Telò → Teseo: supervision needed.

the progress url for rename is: https://meta.wikimedia.org/wiki/Special:GlobalRenameProgress/Teseo

Fri, Dec 14, 9:24 AM · DBA, Wikimedia-Site-requests
Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

We haven't talked about this so far, but don't these views need proper indexes?

Fri, Dec 14, 8:34 AM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
Banyek added a comment to T209488: Global rename of Massimo Telò → Teseo: supervision needed.

@1997kB I'd say we should stick to the date on the calendar event

Fri, Dec 14, 8:26 AM · DBA, Wikimedia-Site-requests
Banyek added a comment to T209488: Global rename of Massimo Telò → Teseo: supervision needed.

Aye, I am here, noted

Fri, Dec 14, 8:16 AM · DBA, Wikimedia-Site-requests

Wed, Dec 12

Banyek moved T211804: A huge spike on read rows for commonswiki from Triage to In progress on the DBA board.
Wed, Dec 12, 9:29 PM · Core Platform Team Kanban (Done with CPT), MW-1.33-notes (1.33.0-wmf.8; 2018-12-11), Patch-For-Review, DBA
Banyek created T211804: A huge spike on read rows for commonswiki.
Wed, Dec 12, 6:22 PM · Core Platform Team Kanban (Done with CPT), MW-1.33-notes (1.33.0-wmf.8; 2018-12-11), Patch-For-Review, DBA
Banyek added a comment to T211544: Drop FlaggedRevs tables in database for ptwikipedia.

Tables were renamed on db1122 for proof:

Wed, Dec 12, 2:45 PM · Patch-For-Review, User-Banyek, DBA, User-Zoranzoki21
Banyek added a comment to T211544: Drop FlaggedRevs tables in database for ptwikipedia.

I'll first rename the tables on db1122, and if nothing breaks this week, I'll do the drops

Wed, Dec 12, 1:30 PM · Patch-For-Review, User-Banyek, DBA, User-Zoranzoki21
Banyek created P7908 (An Untitled Masterwork).
Wed, Dec 12, 10:53 AM
Banyek updated the task description for T85757: Dropping user.user_options on wmf databases.
Wed, Dec 12, 10:16 AM · Patch-For-Review, User-Banyek, Blocked-on-schema-change, DBA, Schema-change
Banyek updated the task description for T85757: Dropping user.user_options on wmf databases.
Wed, Dec 12, 9:35 AM · Patch-For-Review, User-Banyek, Blocked-on-schema-change, DBA, Schema-change
Banyek closed T211537: Degraded RAID on db1063 as Resolved.

The sync finished, thank you @Cmjohnson

	Virtual Drive: 0 (Target Id: 0)
	RAID Level: Primary-1, Secondary-0, RAID Level Qualifier-0
	State: Optimal
	Number Of Drives per span: 2
	Number of Spans: 6
	Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Wed, Dec 12, 9:28 AM · DBA, ops-eqiad, Operations
Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

The materialized view generation completed.
The total size of the materialized views is ~150G in total.

root@labsdb1010:~# find /srv -iname comment_mat.ibd -ls | awk '{size_in_g +=$7} END {print "Total size: " size_in_g/1024/1024/1024}'
Total size: 149.266

The total view generation time is ~18 hours: this run took ~13 hours (see the log below) and excluded enwiki_p.comment_mat, which was created earlier and took ~5 hours.

root@labsdb1010:~# cat create_mat.log | egrep  "Starting|Completed"
Tue Dec 11 15:07:11 UTC 2018 - Starting
Wed Dec 12 04:18:08 UTC 2018 - Completed
Wed, Dec 12, 8:27 AM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
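
As a cross-check of the timings quoted in the comment above, the duration of this run can be computed from the two create_mat.log timestamps. This is only a minimal sketch: the helper name `elapsed` and the format string are illustrative assumptions, not part of the tooling used on labsdb1010.

```python
from datetime import datetime, timedelta

# Layout of the `date` output seen in create_mat.log, with the trailing
# " - Starting" / " - Completed" marker already stripped off.
LOG_FORMAT = "%a %b %d %H:%M:%S UTC %Y"


def elapsed(start: str, end: str) -> timedelta:
    """Return the time between two log timestamps (hypothetical helper)."""
    return datetime.strptime(end, LOG_FORMAT) - datetime.strptime(start, LOG_FORMAT)


run = elapsed("Tue Dec 11 15:07:11 UTC 2018", "Wed Dec 12 04:18:08 UTC 2018")
print(run)  # 13:10:57
```

Roughly 13 hours for this run, plus the ~5 hours reported for enwiki, matches the ~18 hour total.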

Tue, Dec 11

Banyek added a comment to T211544: Drop FlaggedRevs tables in database for ptwikipedia.

ptwikipedia lives in the s2 section, the following hosts need to be done:

Tue, Dec 11, 7:19 PM · Patch-For-Review, User-Banyek, DBA, User-Zoranzoki21
Banyek added a comment to T211537: Degraded RAID on db1063.
root@db1063:~# megacli -PDList -aall | egrep -i "Slot|Firmw"
Slot Number: 0
Firmware state: Rebuild
Device Firmware Level: 0008

awesome, thanks!

Tue, Dec 11, 7:12 PM · DBA, ops-eqiad, Operations
Banyek updated the task description for T85757: Dropping user.user_options on wmf databases.
Tue, Dec 11, 4:57 PM · Patch-For-Review, User-Banyek, Blocked-on-schema-change, DBA, Schema-change
Banyek updated the task description for T85757: Dropping user.user_options on wmf databases.
Tue, Dec 11, 4:55 PM · Patch-For-Review, User-Banyek, Blocked-on-schema-change, DBA, Schema-change
Banyek updated the task description for T85757: Dropping user.user_options on wmf databases.
Tue, Dec 11, 4:45 PM · Patch-For-Review, User-Banyek, Blocked-on-schema-change, DBA, Schema-change
Banyek updated the task description for T85757: Dropping user.user_options on wmf databases.
Tue, Dec 11, 4:34 PM · Patch-For-Review, User-Banyek, Blocked-on-schema-change, DBA, Schema-change
Banyek updated the task description for T85757: Dropping user.user_options on wmf databases.
Tue, Dec 11, 4:21 PM · Patch-For-Review, User-Banyek, Blocked-on-schema-change, DBA, Schema-change
Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

On labsdb1010 I started to create materialized views from all the comment views except enwiki, as we already know how much time and space that one demands. The process is running on the host in a screen named create_mat, and it writes a create_mat.log logfile recording the time it takes

Tue, Dec 11, 3:09 PM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
Banyek added a comment to T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].

While the section install schema is being discussed on T172410, I created and mounted the data directories on the new hosts.

Tue, Dec 11, 2:39 PM · Patch-For-Review, User-Banyek, Analytics-Kanban, DBA, Analytics
Banyek moved T211544: Drop FlaggedRevs tables in database for ptwikipedia from Backlog to next on the User-Banyek board.
Tue, Dec 11, 1:50 PM · Patch-For-Review, User-Banyek, DBA, User-Zoranzoki21
Banyek claimed T211544: Drop FlaggedRevs tables in database for ptwikipedia.
Tue, Dec 11, 1:50 PM · Patch-For-Review, User-Banyek, DBA, User-Zoranzoki21

Mon, Dec 10

Banyek added a comment to T211210: labsdb1004 replication broken for linkwatcher_linklog table.

I was thinking we should let it catch up and then redo the import, but with mydumper instead of mysqldump, as we discussed.
Also, it's weird that it happened again

Mon, Dec 10, 6:38 PM · DBA
Banyek added a comment to T211210: labsdb1004 replication broken for linkwatcher_linklog table.

aand after I restarted the instance, I've got:

Mon, Dec 10, 6:09 PM · DBA
Banyek added a comment to T208231: Issues with purgeUnusedProjects.php cron job on mwmaint1002 (Fri Oct 26).

I think we should adjust the slow timer so that it does not alert if the script runs for n seconds

Mon, Dec 10, 6:08 PM · Community-Tech, Performance, MediaWiki-extensions-PageAssessments, User-Banyek, Operations
Banyek added a comment to T208622: Import recommendations into production database.

A quick recap on today's meeting:

  • we'll have an import every month or every quarter (this is tbd); it will happen continuously, but not too often
  • the data to import is about 3-4 gigabytes. Not too much, but definitely a volume.
Mon, Dec 10, 5:55 PM · Analytics, User-Banyek, Patch-For-Review, Operations, Research
Banyek updated the task description for T85757: Dropping user.user_options on wmf databases.
Mon, Dec 10, 3:37 PM · Patch-For-Review, User-Banyek, Blocked-on-schema-change, DBA, Schema-change
Banyek updated the task description for T85757: Dropping user.user_options on wmf databases.
Mon, Dec 10, 3:12 PM · Patch-For-Review, User-Banyek, Blocked-on-schema-change, DBA, Schema-change
Banyek added a project to T211544: Drop FlaggedRevs tables in database for ptwikipedia: User-Banyek.
Mon, Dec 10, 2:43 PM · Patch-For-Review, User-Banyek, DBA, User-Zoranzoki21
Banyek moved T211544: Drop FlaggedRevs tables in database for ptwikipedia from Triage to Backlog on the DBA board.
Mon, Dec 10, 2:29 PM · Patch-For-Review, User-Banyek, DBA, User-Zoranzoki21
Banyek triaged T211544: Drop FlaggedRevs tables in database for ptwikipedia as Normal priority.
Mon, Dec 10, 2:28 PM · Patch-For-Review, User-Banyek, DBA, User-Zoranzoki21

Sun, Dec 9

Banyek added a comment to T211210: labsdb1004 replication broken for linkwatcher_linklog table.

ETA is still 20hrs

Sun, Dec 9, 8:45 AM · DBA

Sat, Dec 8

Banyek added a comment to T211210: labsdb1004 replication broken for linkwatcher_linklog table.

Importing back the table is still in progress after ~1 day. The file is ~137G currently, on the source it is 384G.
The pv tool estimates another 23 hours. I'll check it again later.

Sat, Dec 8, 12:13 PM · DBA

Fri, Dec 7

Banyek added a comment to T208622: Import recommendations into production database.

I propose a quick talk with @bmansurov and @Ottomata to clarify a few questions on monday

Fri, Dec 7, 3:52 PM · Analytics, User-Banyek, Patch-For-Review, Operations, Research
Banyek added a comment to T208622: Import recommendations into production database.

ignore https://phabricator.wikimedia.org/T208622#4803814 I mis-read something

Fri, Dec 7, 3:39 PM · Analytics, User-Banyek, Patch-For-Review, Operations, Research
Banyek moved T208231: Issues with purgeUnusedProjects.php cron job on mwmaint1002 (Fri Oct 26) from Wait on external to FYI on the User-Banyek board.
Fri, Dec 7, 3:24 PM · Community-Tech, Performance, MediaWiki-extensions-PageAssessments, User-Banyek, Operations
Banyek added a comment to T172410: Phase out and replace analytics-store (multisource).

@elukey the ports are mapped from 3311 to 3318 along with the section names (eg. s3 will be on 3313, s5 on 3315 etc.)

Fri, Dec 7, 3:23 PM · Analytics, WMDE-Analytics-Engineering, User-Addshore, User-Elukey, Research
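
The port scheme described in the comment above (s3 on 3313, s5 on 3315, range 3311 to 3318) can be sketched as a small lookup. This assumes the mapping is simply 3310 plus the section number for s1 through s8; the function name `section_port` and the constant are hypothetical and only illustrate the convention.

```python
import re

# Assumption: port = 3310 + section number, which reproduces the quoted
# examples (s3 -> 3313, s5 -> 3315) and the 3311-3318 range.
BASE_PORT = 3310


def section_port(section: str) -> int:
    """Map a section name such as 's3' to its port (illustrative only)."""
    match = re.fullmatch(r"s([1-8])", section)
    if match is None:
        raise ValueError(f"not a section in s1..s8: {section!r}")
    return BASE_PORT + int(match.group(1))


print(section_port("s3"))  # 3313
print(section_port("s5"))  # 3315
```

Sections outside s1..s8 are rejected rather than guessed, since the comment only covers that range.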
Banyek moved T208383: Implement parsercache service on pc[12]0(07|08|09|10) and replace leased pc[12]00[456] from In progress to FYI on the User-Banyek board.
Fri, Dec 7, 3:18 PM · Patch-For-Review, User-Banyek, DBA, Operations
Banyek added a comment to T208622: Import recommendations into production database.

@Banyek another Q: Can we add permissions to the recommendationapi user on m2-master to be able to connect from stat1007? This might not be the final place where this import runs from, but it will at least allow @bmansurov to do this on his own next time.

Fri, Dec 7, 3:16 PM · Analytics, User-Banyek, Patch-For-Review, Operations, Research
Banyek added a comment to T107610: Setup separate logical External Store for Flow in production.

Sorry for the late answer @Catrope,

Fri, Dec 7, 12:31 PM · Growth-Team (Current Sprint), User-Banyek, DBA, Operations, WorkType-Maintenance, Collaboration-Team-Triage, StructuredDiscussions
Banyek created P7894 (An Untitled Masterwork).
Fri, Dec 7, 10:13 AM
Banyek added a comment to T211210: labsdb1004 replication broken for linkwatcher_linklog table.

You could've used compare.py, it might have taken several hours though.

Fri, Dec 7, 10:06 AM · DBA
Banyek added a comment to T211210: labsdb1004 replication broken for linkwatcher_linklog table.

The table is too large to easily find the inconsistencies (a SELECT COUNT(*) ... command took so long that I stopped it), so I decided to reimport everything. The script has been started and is running on labsdb1004 in a screen named T211210. I'll keep an eye on it.

Fri, Dec 7, 9:49 AM · DBA
Banyek created P7893 (An Untitled Masterwork).
Fri, Dec 7, 9:00 AM
Banyek moved T208231: Issues with purgeUnusedProjects.php cron job on mwmaint1002 (Fri Oct 26) from Backlog to Wait on external on the User-Banyek board.
Fri, Dec 7, 8:55 AM · Community-Tech, Performance, MediaWiki-extensions-PageAssessments, User-Banyek, Operations
Banyek added a comment to T208231: Issues with purgeUnusedProjects.php cron job on mwmaint1002 (Fri Oct 26).

@kaldari if you need any help for further debugging this, you can ask me

Fri, Dec 7, 8:48 AM · Community-Tech, Performance, MediaWiki-extensions-PageAssessments, User-Banyek, Operations
Banyek moved T207584: Prepare and check storage layer for punjabiwikimedia from In progress to Done on the DBA board.
Fri, Dec 7, 8:39 AM · DBA, Data-Services
Banyek added a project to T207584: Prepare and check storage layer for punjabiwikimedia: DBA.
Fri, Dec 7, 8:39 AM · DBA, Data-Services
Banyek edited projects for T207584: Prepare and check storage layer for punjabiwikimedia, added: Data-Services; removed Cloud-Services, DBA.
Fri, Dec 7, 8:34 AM · DBA, Data-Services

Thu, Dec 6

Banyek added a comment to T210749: Hardware for cloud db replicas for analytics usage .

Ok, done and agreed. But instead of trying to find hardware that will keep up with replication, I'm asking if replication is necessary, could we do it any other way, given the relatively simple requirements?

Thu, Dec 6, 5:15 PM · User-Banyek, Data-Services, User-Elukey, DBA, Analytics
Banyek added a project to T210749: Hardware for cloud db replicas for analytics usage : User-Banyek.
Thu, Dec 6, 5:12 PM · User-Banyek, Data-Services, User-Elukey, DBA, Analytics
Banyek added a comment to T208231: Issues with purgeUnusedProjects.php cron job on mwmaint1002 (Fri Oct 26).

I'd like to add the owner of the script as a subscriber, but I don't know how to find out who that is

Thu, Dec 6, 5:04 PM · Community-Tech, Performance, MediaWiki-extensions-PageAssessments, User-Banyek, Operations
Banyek removed a project from T207584: Prepare and check storage layer for punjabiwikimedia: User-Banyek.
Thu, Dec 6, 5:03 PM · DBA, Data-Services
Banyek edited projects for T207584: Prepare and check storage layer for punjabiwikimedia, added: Cloud-Services; removed Data-Services.

@Urbanecm Nope, I checked it on the labsdb instances and it was sanitized properly.
@Bstorm you can create the views (if the fishbowl wikis have those views

Thu, Dec 6, 5:02 PM · DBA, Data-Services
Banyek added a comment to T207253: Compare a few tables per section between hosts and DC.

@Anomie it's worth a look, thanks

Thu, Dec 6, 4:59 PM · Patch-For-Review, User-Banyek, Wikimedia-Incident, DBA
Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

@Bstorm if you prepare a depool patch for me for tomorrow, I can start creating the mat. views on the other schemas besides enwiki, and we can evaluate time and disk space

Thu, Dec 6, 4:58 PM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

150 Gb seems acceptable to me, but

  • we still have the time factor to think about, because those tables have to be created.
  • if analytics gets a separate instance that shouldn't be a problem, but as long as the labsdb hosts are used, we need to depool one of the hosts while the mat. views are being generated
  • we also need to account for the sizes (and creation time) of indexes, because the new tables will not have those
  • even if we can afford depooling one of the labsdb hosts for building the new indexes, we still have to find a way to do it automatically, because right now it involves a puppet deploy
Thu, Dec 6, 4:57 PM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
Banyek added a comment to T208622: Import recommendations into production database.

@Ottomata Yes, I don't see anything against this. Just make sure that the data is copied over a secure channel and is removed from both the export and the import servers after the import has finished.

Thu, Dec 6, 4:39 PM · Analytics, User-Banyek, Patch-For-Review, Operations, Research
Banyek moved T208231: Issues with purgeUnusedProjects.php cron job on mwmaint1002 (Fri Oct 26) from Triage to Backlog on the DBA board.
Thu, Dec 6, 2:34 PM · Community-Tech, Performance, MediaWiki-extensions-PageAssessments, User-Banyek, Operations
Banyek added a project to T208231: Issues with purgeUnusedProjects.php cron job on mwmaint1002 (Fri Oct 26): User-Banyek.
Thu, Dec 6, 2:22 PM · Community-Tech, Performance, MediaWiki-extensions-PageAssessments, User-Banyek, Operations
Banyek updated the task description for T208231: Issues with purgeUnusedProjects.php cron job on mwmaint1002 (Fri Oct 26).
Thu, Dec 6, 2:21 PM · Community-Tech, Performance, MediaWiki-extensions-PageAssessments, User-Banyek, Operations
Banyek edited projects for T208231: Issues with purgeUnusedProjects.php cron job on mwmaint1002 (Fri Oct 26), added: DBA; removed WMF-NDA.
Thu, Dec 6, 1:55 PM · Community-Tech, Performance, MediaWiki-extensions-PageAssessments, User-Banyek, Operations
Banyek closed T211269: cron spam from mwmaint1002 as Invalid.

This is a duplicate of T208231

Thu, Dec 6, 1:54 PM · MediaWiki-Database, DBA, MediaWiki-extensions-PageAssessments, Operations, Community-Tech
Banyek added a comment to T208231: Issues with purgeUnusedProjects.php cron job on mwmaint1002 (Fri Oct 26).

I checked the query

SELECT /* Wikimedia\Rdbms\Database::select www-data@mwmain... */  DISTINCT( pa_project_id )  FROM `page_assessments
Thu, Dec 6, 1:47 PM · Community-Tech, Performance, MediaWiki-extensions-PageAssessments, User-Banyek, Operations
Banyek moved T211269: cron spam from mwmaint1002 from Triage to Backlog on the DBA board.
Thu, Dec 6, 1:34 PM · MediaWiki-Database, DBA, MediaWiki-extensions-PageAssessments, Operations, Community-Tech
Banyek moved T211338: Make a copy of the current wb_terms table on the MCR testing DB servers from Triage to Backlog on the DBA board.
Thu, Dec 6, 1:34 PM · Wikidata, DBA
Banyek triaged T211338: Make a copy of the current wb_terms table on the MCR testing DB servers as Normal priority.
Thu, Dec 6, 1:34 PM · Wikidata, DBA
Banyek added a comment to T211269: cron spam from mwmaint1002.

I checked the query

SELECT /* Wikimedia\Rdbms\Database::select www-data@mwmain... */  DISTINCT( pa_project_id )  FROM `page_assessments
Thu, Dec 6, 1:32 PM · MediaWiki-Database, DBA, MediaWiki-extensions-PageAssessments, Operations, Community-Tech
Banyek updated the task description for T211269: cron spam from mwmaint1002.
Thu, Dec 6, 1:20 PM · MediaWiki-Database, DBA, MediaWiki-extensions-PageAssessments, Operations, Community-Tech
Banyek added a comment to T211269: cron spam from mwmaint1002.

I'll take a look into this

Thu, Dec 6, 12:38 PM · MediaWiki-Database, DBA, MediaWiki-extensions-PageAssessments, Operations, Community-Tech
Banyek added a comment to T211210: labsdb1004 replication broken for linkwatcher_linklog table.

hm, I leave this as-is as I am not sure which user to use

Thu, Dec 6, 10:27 AM · DBA
Banyek added a comment to T211210: labsdb1004 replication broken for linkwatcher_linklog table.

The operation would be:

  • reimport the data with the /home/marostegui/reimport_from_master.sh script. (It will take time as the table is ~400G)
  • when it's done, I'll remove the replication filter from the s51230__linkwatcher.linkwatcher_linklog table with

SET SESSION SQL_LOG_BIN=0; SET GLOBAL Replicate_Wild_Ignore_Table='s51412\_\_data.%,s51071\_\_templatetiger\_p.%,s52721\_\_pagecount\_stats\_p.%,s51290\_\_dpl\_p.%,';

  • restart the slave with START SLAVE\G
Thu, Dec 6, 10:17 AM · DBA
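
The filter change in the steps above amounts to dropping one entry from the comma-separated Replicate_Wild_Ignore_Table value. A minimal sketch of that string manipulation follows, assuming the current value is the one set on Dec 5; the helper `drop_filter` is hypothetical, only builds the new string, and does not talk to any server.

```python
def drop_filter(current: str, pattern: str) -> str:
    """Return the filter list without `pattern` (hypothetical helper)."""
    entries = [entry for entry in current.split(",") if entry]
    if pattern not in entries:
        raise ValueError(f"pattern not in filter list: {pattern}")
    entries.remove(pattern)
    return ",".join(entries)


# Raw strings keep the backslash escapes exactly as they appear in the SQL.
current = (
    r"s51412\_\_data.%,s51071\_\_templatetiger\_p.%,"
    r"s52721\_\_pagecount\_stats\_p.%,s51290\_\_dpl\_p.%,"
    r"s51230\_\_linkwatcher.linkwatcher\_linklog"
)
new_value = drop_filter(current, r"s51230\_\_linkwatcher.linkwatcher\_linklog")
print(new_value)
```

Unlike the statement quoted in the comment, this sketch does not leave a trailing comma on the resulting value.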
Banyek added a comment to T211210: labsdb1004 replication broken for linkwatcher_linklog table.

Replication has caught up; doing the reimport now

Thu, Dec 6, 10:11 AM · DBA

Wed, Dec 5

Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

also we need to create indices for comment_mat

Wed, Dec 5, 8:09 PM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

If you want to fake up an actor_mat for a size estimate, I think something like this would do it. Obviously it won't work for a speed estimate though.

INSERT INTO actor_mat (actor_user, actor_name) SELECT user_id, user_name FROM user;
 -- All the rest have "xx_user = 0" because any that aren't 0 should have been done above
INSERT INTO actor_mat (actor_user, actor_name) 
 SELECT DISTINCT rev_user, rev_user_text FROM revision WHERE rev_user = 0
 UNION SELECT DISTINCT ar_user, ar_user_text FROM archive WHERE ar_user = 0
 UNION SELECT DISTINCT ipb_by, ipb_by_text FROM ipblocks WHERE ipb_by = 0
 UNION SELECT DISTINCT img_user, img_user_text FROM image WHERE img_user = 0
 UNION SELECT DISTINCT oi_user, oi_user_text FROM oldimage WHERE oi_user = 0
 UNION SELECT DISTINCT fa_user, fa_user_text FROM filearchive WHERE fa_user = 0
 UNION SELECT DISTINCT log_user, log_user_text FROM logging WHERE log_user = 0;
Wed, Dec 5, 8:09 PM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

The actor view was empty, and so is actor_mat.

Wed, Dec 5, 8:01 PM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

The 'materialized view' for comments is completed. I moved it into enwiki_p with name comment_mat.
It took almost six hours, and the file is 31G

Wed, Dec 5, 7:50 PM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
Banyek moved T210693: Create materialized views on Wiki Replica hosts for better query performance from Backlog to In progress on the User-Banyek board.
Wed, Dec 5, 7:39 PM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
Banyek lowered the priority of T211210: labsdb1004 replication broken for linkwatcher_linklog table from High to Normal.
Wed, Dec 5, 7:28 PM · DBA
Banyek added a comment to T211210: labsdb1004 replication broken for linkwatcher_linklog table.

replication is catching up

Wed, Dec 5, 7:28 PM · DBA
Banyek added a comment to T211210: labsdb1004 replication broken for linkwatcher_linklog table.

I ignore the table with

SET SQL_LOG_BIN=0; 
SET GLOBAL Replicate_Wild_Ignore_Table='s51412\_\_data.%,s51071\_\_templatetiger\_p.%,s52721\_\_pagecount\_stats\_p.%,s51290\_\_dpl\_p.%,s51230\_\_linkwatcher.linkwatcher\_linklog';
Wed, Dec 5, 7:22 PM · DBA
Banyek moved T211210: labsdb1004 replication broken for linkwatcher_linklog table from Triage to In progress on the DBA board.
Wed, Dec 5, 7:15 PM · DBA
Banyek triaged T211210: labsdb1004 replication broken for linkwatcher_linklog table as High priority.
Wed, Dec 5, 5:23 PM · DBA
Banyek added a comment to T211210: labsdb1004 replication broken for linkwatcher_linklog table.

I don't find the corresponding event in the master binary log:

Wed, Dec 5, 5:22 PM · DBA
Banyek updated the task description for T85757: Dropping user.user_options on wmf databases.
Wed, Dec 5, 3:11 PM · Patch-For-Review, User-Banyek, Blocked-on-schema-change, DBA, Schema-change
Banyek updated the task description for T85757: Dropping user.user_options on wmf databases.
Wed, Dec 5, 3:10 PM · Patch-For-Review, User-Banyek, Blocked-on-schema-change, DBA, Schema-change
Banyek added a comment to T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].

This is a very good point, I'll bring it up to my team's standup today and I'll let you know. It has been used, as far as I know, for two purposes:

  • join tables from different databases in tmp tables to work on them freely (thing not possible anymore)
  • use it as a holding area for various scripts/analytics-reporting/etc.

    So we should have one mysql instance running staging for sure, and I'd lean towards having it in one host only. I'll update the task once discussed with my team :)
Wed, Dec 5, 1:59 PM · Patch-For-Review, User-Banyek, Analytics-Kanban, DBA, Analytics
Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

Thanks for the clarification - I just wanted to know if there was some specific reason for it that I might have missed.
Once the test is done make sure to either clean it up or move it to the views DB, as that is where the labsdbuser has access to and to keep it consistent. (views on one db and underlying core tables on the other)

Wed, Dec 5, 1:49 PM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
Banyek updated subscribers of T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].

maybe I can summon here @Milimetric about the staging db?

Wed, Dec 5, 1:42 PM · Patch-For-Review, User-Banyek, Analytics-Kanban, DBA, Analytics
Banyek added a comment to T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].

About the staging db:

Wed, Dec 5, 1:23 PM · Patch-For-Review, User-Banyek, Analytics-Kanban, DBA, Analytics
Banyek added a comment to T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].

What I am not sure about is what to do with the 'extra' databases (I see that no previous clusters have them, but if we move them to a separate instance, that instance has to be mapped 'somewhere').
Plus, I don't know who to ask about those databases.

Wed, Dec 5, 1:21 PM · Patch-For-Review, User-Banyek, Analytics-Kanban, DBA, Analytics
Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

No, there is no real reason, but as this is only a test I wanted to keep it 'clean', and enwiki feels more suitable for testing than enwiki_p.
Don't ask why, I can't answer it.
But at the end I can move the test table to enwiki_p if that is preferable

Wed, Dec 5, 11:52 AM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

For testing I created the view as comment_view_temp with the query @Bstorm wrote in T210693#4798638 and the table is being created with

create table comment_mat_view as select * from comment_view_temp;

it will take time :/

Wed, Dec 5, 11:39 AM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
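
The approach in the comment above (materializing a view with CREATE TABLE ... AS SELECT) can be tried out on a toy scale. The sketch below uses an in-memory SQLite database rather than the MariaDB instances discussed in the task, and the `comment` table layout and the index name are made-up assumptions; only the names comment_view_temp and comment_mat_view come from the comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Toy stand-in for the real underlying table (assumed layout).
    CREATE TABLE comment (comment_id INTEGER PRIMARY KEY, comment_text TEXT);
    INSERT INTO comment (comment_text) VALUES ('first'), ('second');
    CREATE VIEW comment_view_temp AS
        SELECT comment_id, comment_text FROM comment;
    -- 'Materialize' the view: copy its current rows into a plain table.
    CREATE TABLE comment_mat_view AS SELECT * FROM comment_view_temp;
    -- The copy starts without indexes, so any index is added explicitly.
    CREATE INDEX comment_mat_view_id ON comment_mat_view (comment_id);
    """
)

# A later write to the base table is visible in the view but not in the copy.
conn.execute("INSERT INTO comment (comment_text) VALUES ('third')")
view_rows = conn.execute("SELECT COUNT(*) FROM comment_view_temp").fetchone()[0]
mat_rows = conn.execute("SELECT COUNT(*) FROM comment_mat_view").fetchone()[0]
indexes = [row[1] for row in conn.execute("PRAGMA index_list('comment_mat_view')")]
print(view_rows, mat_rows, indexes)
```

The copy is a snapshot, which is why the task also has to deal with regeneration time and with building indexes on the new tables.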

Tue, Dec 4

Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

@Milimetric could you be a little more specific, please? As I understand it, you currently run the queries described in T210693#4795470, and what you need from us is to create a few tables which would make those queries run faster, right?
What is not clear to me is whether we, as the Persistence Team, should figure out how to build those tables, or whether you can provide us with SQL queries which build those tables, so that our job is to integrate this process into our current system.
I have a proposal about the integration part in T210693#4796863: it would install a script on any chosen host and run it via cron. Currently it is aimed at labsdb1010, but it could easily be moved to any future host.

Tue, Dec 4, 2:51 PM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

In https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/477503/ I have a proposal about how to install the materialized view generator script on any of the hosts.
In this case it will be put on one of the wikireplica_analytics hosts (labsdb1010), but it can easily be moved anywhere

Tue, Dec 4, 10:33 AM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics

Mon, Dec 3

Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

We talked about the kick-off, and tomorrow we'll sync up about what we found.
@Bstorm will give me a few SQL queries and I'll check what I can do with those

Mon, Dec 3, 6:06 PM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
Banyek added a comment to T210693: Create materialized views on Wiki Replica hosts for better query performance.

@Bstorm when could we talk about the details?

Mon, Dec 3, 5:44 PM · Patch-For-Review, User-Banyek, Core Platform Team Backlog (Watching / External), Analytics-Kanban, DBA, Data-Services, Analytics
Banyek moved T202367: Productionize dbproxy101[2-7].eqiad.wmnet from Backlog to In progress on the User-Banyek board.
Mon, Dec 3, 5:23 PM · User-Banyek, Patch-For-Review, DBA