
Modify analytics/wmde/scripts repo to work with the new analytics multisource db setup
Closed, Resolved · Public · 8 Estimate Story Points

Description

Right now these scripts assume that there is a single MySQL DB and that everything lives there.
The scripts need to be modified for the new layout described in T210478#4794536.

dbstore1002 (the current host of the single source) will be decommissioned in April because Ubuntu Trusty is going EOL.
The new hosts should be set up and ready to transfer to in roughly a month.

New docs @ https://wikitech.wikimedia.org/wiki/Analytics/Data_access#MariaDB_replicas

Files that need changing:

  • ./advancedsearch/userprops.php
  • ./revslider/userprops.php
  • ./wikidata/datamodel/sitelinks_per_item.php
  • ./wikidata/datamodel/sitelinks_per_site.php
  • ./wikidata/datamodel/properties_by_datatype.php
  • ./wikidata/datamodel/statements_per_entity.php
  • ./wikidata/datamodel/terms_by_language.php
  • ./wikidata/site_stats/active_users.php
  • ./wikidata/site_stats/page_size.php
  • ./wikidata/site_stats/total_pages.php
  • ./wikidata/site_stats/good_articles.php
  • ./wikidata/site_stats/pages_by_namespace.php
  • ./wikidata/site_stats/user_groups.php
  • ./wikidata/site_stats/total_edits.php
  • ./wikidata/site_stats/users.php
  • ./wikidata/site_stats/rolling_rc.php
  • ./wikidata/site_stats/lexemes.php
  • ./wikidata/site_stats/user_languages.php
  • ./wikidata/entityUsage.php
  • ./betafeatures/counts.php
  • ./echo/statusNotifications.php
  • ./catwatch/userprops.php

Details

Related Gerrit Patches:
[analytics/wmde/scripts@production] Remove WikimediaDb::getPdo()
[analytics/wmde/scripts@master] Remove WikimediaDb::getPdo()
[analytics/wmde/scripts@production] Rewrite user_langauges.php to use babel table and Inclusion–exclusion principle
[analytics/wmde/scripts@master] Rewrite user_langauges.php to use babel table and Inclusion–exclusion principle
[analytics/wmde/scripts@production] Change conditions on active_users
[analytics/wmde/scripts@master] Change conditions on active_users
[analytics/wmde/scripts@production] Rebuild the PDO when the insert fails
[analytics/wmde/scripts@master] Rebuild the PDO when the insert fails
[analytics/wmde/scripts@production] Fix active_users.php by doing most of the work in code
[analytics/wmde/scripts@master] Fix active_users.php by doing most of the work in code
[analytics/wmde/scripts@production] Move reads to multi-source db setup
[analytics/wmde/scripts@master] Move reads to multi-source db setup
[analytics/wmde/scripts@production] Fix connecting to the right port of multisource db setup
[analytics/wmde/scripts@master] Fix connecting to the right port of multisource db setup
[analytics/wmde/scripts@production] Fixes for new multisource db setup
[analytics/wmde/scripts@master] Fixes for new multisource db setup
[operations/puppet@production] statistics: Add port for staging database
[analytics/wmde/scripts@master] Add methods for new hosts and changing good_articles.php to use that
[analytics/wmde/scripts@production] Add methods for new hosts and changing good_articles.php to use that
[operations/puppet@production] statistics: Add configs for new analytics db hosts
[analytics/wmde/scripts@production] Introduce WikimediaDbSectionMapper based on db-eqiad.php config
[analytics/wmde/scripts@master] Introduce WikimediaDbSectionMapper based on db-eqiad.php config

Event Timeline

Change 491508 had a related patch set uploaded (by Addshore; owner: Ladsgroup):
[analytics/wmde/scripts@production] Introduce WikimediaDbSectionMapper based on db-eqiad.php config

https://gerrit.wikimedia.org/r/491508

Change 489097 merged by jenkins-bot:
[analytics/wmde/scripts@master] Introduce WikimediaDbSectionMapper based on db-eqiad.php config

https://gerrit.wikimedia.org/r/489097

Change 491508 merged by jenkins-bot:
[analytics/wmde/scripts@production] Introduce WikimediaDbSectionMapper based on db-eqiad.php config

https://gerrit.wikimedia.org/r/491508

Change 490085 merged by Elukey:
[operations/puppet@production] statistics: Add configs for new analytics db hosts

https://gerrit.wikimedia.org/r/490085

Change 490088 merged by jenkins-bot:
[analytics/wmde/scripts@master] Add methods for new hosts and changing good_articles.php to use that

https://gerrit.wikimedia.org/r/490088

Change 490105 merged by jenkins-bot:
[analytics/wmde/scripts@production] Add methods for new hosts and changing good_articles.php to use that

https://gerrit.wikimedia.org/r/490105

Addshore updated the task description. Feb 19 2019, 5:16 PM

It's way more work than expected because the temp tables get directly populated from other tables. I built a non-working POC for anyone who wants to pick this up: P8107
I need to fix some small issues in the code too.

One important thing: ports need to be addressed too. I completely missed those.

  • 331 + the digit of the section in case of sX. Example: s5 will be accessible at s5-analytics-replica.eqiad.wmnet:3315
  • 3320 for x1. Example: x1-analytics-replica.eqiad.wmnet:3320
  • 3350 for staging
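
To make the port rule above concrete, here is a minimal sketch in Python (the scripts in this repo are PHP, so this is only an illustration, not the code from the patches). The function names `replica_port` and `replica_address` are made up for this example. The hostname pattern is taken from the two examples above; no hostname for staging is given here, so the sketch only returns a port for it.

```python
import re

# Hostname pattern taken from the examples in the comment above.
HOST_TEMPLATE = "{section}-analytics-replica.eqiad.wmnet"


def replica_port(section: str) -> int:
    """Return the port for a section, following the rule in the comment above."""
    if section == "staging":
        return 3350
    if section == "x1":
        return 3320
    match = re.fullmatch(r"s(\d)", section)
    if match is None:
        raise ValueError(f"unknown section: {section!r}")
    # "331 + the digit of the section": s5 -> 3315
    return 3310 + int(match.group(1))


def replica_address(section: str) -> tuple[str, int]:
    """Return (host, port) for an sX or x1 section.

    Staging is rejected because no staging hostname is stated in this thread.
    """
    if section == "staging":
        raise ValueError("no staging hostname is given in this thread")
    return HOST_TEMPLATE.format(section=section), replica_port(section)


if __name__ == "__main__":
    host, port = replica_address("s5")
    print(f"{host}:{port}")
```

A script would still need to know which section a wiki lives in before calling this; that lookup is what WikimediaDbSectionMapper provides in the real code.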

You can also use the DNS PTR records (if the DNS library/code supports that type of query), so you'll get hostname+port without hardcoding anything.

Change 492986 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[operations/puppet@production] statistics: Add port for staging database

https://gerrit.wikimedia.org/r/492986

Change 492988 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/scripts@master] Fixes for new multisource db setup

https://gerrit.wikimedia.org/r/492988

Change 492986 merged by Elukey:
[operations/puppet@production] statistics: Add port for staging database

https://gerrit.wikimedia.org/r/492986

Change 493013 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/scripts@master] Fix connecting to the right port of multisource db setup

https://gerrit.wikimedia.org/r/493013

Change 493198 had a related patch set uploaded (by Addshore; owner: Ladsgroup):
[analytics/wmde/scripts@production] Fixes for new multisource db setup

https://gerrit.wikimedia.org/r/493198

Change 492988 merged by jenkins-bot:
[analytics/wmde/scripts@master] Fixes for new multisource db setup

https://gerrit.wikimedia.org/r/492988

Change 493198 merged by jenkins-bot:
[analytics/wmde/scripts@production] Fixes for new multisource db setup

https://gerrit.wikimedia.org/r/493198

Change 493200 had a related patch set uploaded (by Addshore; owner: Ladsgroup):
[analytics/wmde/scripts@production] Fix connecting to the right port of multisource db setup

https://gerrit.wikimedia.org/r/493200

Change 493013 merged by jenkins-bot:
[analytics/wmde/scripts@master] Fix connecting to the right port of multisource db setup

https://gerrit.wikimedia.org/r/493013

Change 493200 merged by jenkins-bot:
[analytics/wmde/scripts@production] Fix connecting to the right port of multisource db setup

https://gerrit.wikimedia.org/r/493200

Addshore updated the task description. Feb 27 2019, 11:14 AM
Addshore updated the task description.

Change 493207 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/scripts@master] Move reads to multi-source db setup

https://gerrit.wikimedia.org/r/493207

Ladsgroup updated the task description. Feb 27 2019, 11:38 AM

With my patch, everything is fixed except src/wikidata/site_stats/active_users.php and src/wikidata/site_stats/user_languages.php, because they both read and write at the same time. I will fix those in a later patch.
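
For anyone following along, the general shape of splitting a combined read and write across two hosts can be sketched as below. This is an assumption-laden illustration in Python, not the PHP from the patches: the helper name `copy_in_batches` and the batch size are hypothetical, and any two DB-API connections can stand in for the replica (read side) and the staging database (write side).

```python
def copy_in_batches(source, target, select_sql, insert_sql, batch_size=1000):
    """Read rows from one connection and insert them through another.

    This replaces a single "INSERT ... SELECT" statement, which only works
    when both tables live on the same server. `source` and `target` are
    DB-API connections; `insert_sql` must use that driver's placeholder style.
    Returns the number of rows copied.
    """
    read_cursor = source.cursor()
    read_cursor.execute(select_sql)
    write_cursor = target.cursor()
    copied = 0
    while True:
        # Fetch a bounded chunk so large result sets do not sit in memory.
        rows = read_cursor.fetchmany(batch_size)
        if not rows:
            break
        write_cursor.executemany(insert_sql, rows)
        # Commit per batch so a failure loses at most one batch of work.
        target.commit()
        copied += len(rows)
    return copied
```

The trade-off is that the work moves from the database into the script, so the rows travel over the network twice; batching keeps memory use bounded while that happens.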

Change 493207 merged by jenkins-bot:
[analytics/wmde/scripts@master] Move reads to multi-source db setup

https://gerrit.wikimedia.org/r/493207

Change 493448 had a related patch set uploaded (by Addshore; owner: Ladsgroup):
[analytics/wmde/scripts@production] Move reads to multi-source db setup

https://gerrit.wikimedia.org/r/493448

Change 493448 merged by jenkins-bot:
[analytics/wmde/scripts@production] Move reads to multi-source db setup

https://gerrit.wikimedia.org/r/493448

Change 493701 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/scripts@master] Fix active_users.php by doing most of the work in code

https://gerrit.wikimedia.org/r/493701

Change 493701 merged by jenkins-bot:
[analytics/wmde/scripts@master] Fix active_users.php by doing most of the work in code

https://gerrit.wikimedia.org/r/493701

Change 493706 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/scripts@production] Fix active_users.php by doing most of the work in code

https://gerrit.wikimedia.org/r/493706

Change 493706 merged by jenkins-bot:
[analytics/wmde/scripts@production] Fix active_users.php by doing most of the work in code

https://gerrit.wikimedia.org/r/493706

WMDE-Fisch added a comment. Edited Mar 1 2019, 3:40 PM

Hmm it seems that on https://grafana.wikimedia.org/d/000000259/betafeatures the Global users enabled Features are broken. :-/

Yes, I mentioned this above and I'm currently working on it.

Looking at the logs it's very likely due to lack of write rights (pun intended) for the wmde db user (in /etc/mysql/conf.d/research-wmde-client.cnf). Hopefully this will be fixed soon.

The ./wikidata/site_stats/user_languages.php script is such a mess that I doubt we will be able to fix it at all without a lot of work :(((

> Looking at the logs it's very likely due to lack of write rights (pun intended) for the wmde db user (in /etc/mysql/conf.d/research-wmde-client.cnf). Hopefully this will be fixed soon.

It's not, back to square one.
One note: the beta features script takes seven hours to finish. We should find a better way to do this.

Change 493802 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/scripts@master] Rebuild the PDO when the insert fails

https://gerrit.wikimedia.org/r/493802

I put the debugger on it. The reason it fails is: "2019-03-01 23:49:33 betafeature-counts MySQL server has gone away". The proper way to handle this is to have a proper connection manager, but that is outside the scope of this ticket.
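
As a rough illustration of the stopgap (rebuild the connection and retry once when a write fails), here is a minimal sketch in Python. It is not the PHP/PDO code from the patch: the class name `ReconnectingWriter`, its parameters, and the single-retry policy are all assumptions made for this example.

```python
class ReconnectingWriter:
    """Run writes on a connection, rebuilding it once if a write fails.

    `connect` is a zero-argument callable returning a fresh DB-API connection.
    `retry_on` is the tuple of exception types treated as "connection lost";
    which types those are depends on the driver, so the caller supplies them.
    """

    def __init__(self, connect, retry_on=(Exception,)):
        self._connect = connect
        self._retry_on = retry_on
        self._conn = connect()

    def _run(self, sql, params):
        cursor = self._conn.cursor()
        cursor.execute(sql, params)
        self._conn.commit()
        return cursor.rowcount

    def execute(self, sql, params=()):
        try:
            return self._run(sql, params)
        except self._retry_on:
            # The server may have dropped an idle connection during a long
            # run: build a new one and retry exactly once. A second failure
            # propagates to the caller.
            self._conn = self._connect()
            return self._run(sql, params)
```

Retrying once is deliberately conservative: it covers a connection that timed out while the script was busy elsewhere, without hiding a database that is actually down.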

Change 493802 merged by jenkins-bot:
[analytics/wmde/scripts@master] Rebuild the PDO when the insert fails

https://gerrit.wikimedia.org/r/493802

Change 494056 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/scripts@production] Rebuild the PDO when the insert fails

https://gerrit.wikimedia.org/r/494056

Change 494056 merged by jenkins-bot:
[analytics/wmde/scripts@production] Rebuild the PDO when the insert fails

https://gerrit.wikimedia.org/r/494056

Change 494203 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/scripts@master] Change conditions on active_users

https://gerrit.wikimedia.org/r/494203

Change 494203 merged by jenkins-bot:
[analytics/wmde/scripts@master] Change conditions on active_users

https://gerrit.wikimedia.org/r/494203

Change 494217 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/scripts@production] Change conditions on active_users

https://gerrit.wikimedia.org/r/494217

Change 494217 merged by jenkins-bot:
[analytics/wmde/scripts@production] Change conditions on active_users

https://gerrit.wikimedia.org/r/494217

Ladsgroup updated the task description. Mar 4 2019, 2:12 PM

Change 494245 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/scripts@master] Rewrite user_langauges.php to use babel table and Inclusion–exclusion principle

https://gerrit.wikimedia.org/r/494245

Change 494248 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/scripts@master] Remove WikimediaDb::getPdo()

https://gerrit.wikimedia.org/r/494248

Change 494245 merged by jenkins-bot:
[analytics/wmde/scripts@master] Rewrite user_langauges.php to use babel table and Inclusion–exclusion principle

https://gerrit.wikimedia.org/r/494245

Change 494248 merged by jenkins-bot:
[analytics/wmde/scripts@master] Remove WikimediaDb::getPdo()

https://gerrit.wikimedia.org/r/494248

Change 494692 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/scripts@production] Rewrite user_langauges.php to use babel table and Inclusion–exclusion principle

https://gerrit.wikimedia.org/r/494692

Change 494692 merged by jenkins-bot:
[analytics/wmde/scripts@production] Rewrite user_langauges.php to use babel table and Inclusion–exclusion principle

https://gerrit.wikimedia.org/r/494692

Change 494693 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/scripts@production] Remove WikimediaDb::getPdo()

https://gerrit.wikimedia.org/r/494693

Change 494693 merged by jenkins-bot:
[analytics/wmde/scripts@production] Remove WikimediaDb::getPdo()

https://gerrit.wikimedia.org/r/494693