Page MenuHomePhabricator

Use External Store on Beta Cluster
Closed, ResolvedPublic

Description

This is how production stores content. Beta Cluster should follow the same path.

This will also effect StructuredDiscussions and other extensions that automatically or can use External Store if available.

Event Timeline

greg renamed this task from Use External Store on Beta Labs to Use External Store on Beta Cluster.Apr 13 2015, 3:39 AM
greg updated the task description. (Show Details)
greg set Security to None.
greg triaged this task as Medium priority.Apr 13 2015, 11:33 PM
greg added a project: Staging.

@Mattflaschen External Store is a bit tricky. Gotta need to setup a few instances to hold the SQL databases then add the related mediawiki-config configuration. One might want to migrate existing content to the External Stores as well but I can't remember which maintenance script make it possible.

Could you reach out to engineering and figure out whom has the knowledge / skill to make it happen? Thanks!

@Mattflaschen External Store is a bit tricky. Gotta need to setup a few instances to hold the SQL databases then add the related mediawiki-config configuration. One might want to migrate existing content to the External Stores as well but I can't remember which maintenance script make it possible.

Could you reach out to engineering and figure out whom has the knowledge / skill to make it happen? Thanks!

https://github.com/wikimedia/mediawiki/blob/master/maintenance/storage/moveToExternal.php

Don't know the last time it was actually tested though... :)

Quite a few other related scripts in https://github.com/wikimedia/mediawiki/tree/master/maintenance/storage

https://github.com/wikimedia/mediawiki/blob/master/maintenance/storage/moveToExternal.php

Don't know the last time it was actually tested though... :)

It won't work for Flow anyway.

@jcrespo, do you think we actually have to do that migration on Beta? In my opinion should be enough to switch it then create some content on Beta (the browser tests should also help with that).

That will populate the initial External Store on Beta. We can then test the migration for these ES rows. Old content won't be using ES, but will still work and should be skipped by the migration script (this code is already in place).

In my opinion should be enough to switch it then create some content on Beta (the browser tests should also help with that).

I was thinking exactly that, testing it on a few records created from now on.

@Reedy @jcrespo @hashar @greg So it looks like we have make-all-blobs in WikimediaMaintenance, in addition to make-blobs in core which I knew about previously.

There is support for the Beta Cluster (knowing which dblist to loop over). I checked to confirm, and it looks like DB is a defined store, and as expected there is no external store enabled ($wgDefaultExternalStore).

P2644

I don't see make-all-blobs referenced anywhere, so it's not puppetized that I can tell.

So it looks like we need to choose a DB server to use on the Beta Cluster (deployment-db1, deployment-db2, or make another one, ...), then run:

make-all-blobs <server> blobs1

Adding Chad for their help/expertise.

Adding Chad for their help/expertise.

So I'm not exactly sure what's needed here from me :) I don't really know much of anything about External Storage. I can certainly run some scripts or setup a new DB server I suppose tho :)

Talked to @demon a bit about this.

In T95871#2047174, @Mattflaschen wrote:

make-all-blobs <server> blobs1

If no one objects in a day, I'll just do this on both db1 and db2 after verifying locally that migration is not necessary. One is master and one is slave.

Then, I'll put up a config change for core and Flow to use the cluster going forward.

The config change will first be done for one Beta wiki, then if that goes fine I'll do the rest.

Yep, that about summarizes it. Ping me on IRC when you wanna do this so you can have a second set of hands :)

Yep, that about summarizes it. Ping me on IRC when you wanna do this so you can have a second set of hands :)

I retested locally, so it's ready. Looks like you're away on IRC right now.

I'll check back, but you could also ping me at a good time for you, or schedule it.

Change 273989 had a related patch set uploaded (by Mattflaschen):
WIP: Fix make-all-blobs-beta to work on Beta Cluster

https://gerrit.wikimedia.org/r/273989

@jcrespo Please review that patch when you're able to.

I'm guessing you're not using it, since wikiadmin_pass no longer exists. Do you have a proposed solution for making it work in production? If not, is it okay to just make it Beta-only for now?

For Beta, we can use sql.php, which uses the main wiki server and database (unless you think we should setup a dedicated External Store DB server on Beta). One question, is it okay to make Beta schema changes using sql.php like this? Will they be properly replicated?

For production, that is not suitable since External Store in production uses dedicated servers/databases.

Change 273989 abandoned by Mattflaschen:
Make make-all-blobs-beta to work on Beta Cluster (rather than nowhere)

Reason:
Actually, I guess we can just do it on Beta the same way for now. I will just add a note about wikiadmin_pass instead.

https://gerrit.wikimedia.org/r/273989

Change 273989 restored by Mattflaschen:
Make make-all-blobs-beta to work on Beta Cluster (rather than nowhere)

https://gerrit.wikimedia.org/r/273989

Change 273989 merged by jenkins-bot:
Add comment to note that wikiadmin_pass does not exist

https://gerrit.wikimedia.org/r/273989

I can fix production, but I need the original script for that (even if it is non-working). I will not be able to find it if it is deleted. I have not used it yet, but I plan to (we are still migrating the hardware right now).

The wikiadmin_pass was provisioned via a puppet erb template:

#! /bin/bash
echo -n "<%= @wikiadmin_pass %>"

From misc::deployment::passwordscripts which had:

$wikiadmin_pass = $passwords::misc::scripts::wikiadmin_pass

That has been removed in August 2015 by ee4975d9c5bd6fbf83f70db4d3bbce646c6b380c and the script have switched to use something like:

pass=`echo 'echo \$wgDBadminpassword;' | /usr/local/bin/mwscript eval.php --wiki="$db"`

In effect: ask MediaWiki (eval.php) for $wgDBadminpassword which is itself in /private/PrivateSettings.php

I can fix production, but I need the original script for that (even if it is non-working). I will not be able to find it if it is deleted.

Yep, I will not delete or move it. Thanks for clarifying you are still using it.

The wikiadmin_pass was provisioned via a puppet erb template:

#! /bin/bash
echo -n "<%= @wikiadmin_pass %>"

From misc::deployment::passwordscripts which had:

$wikiadmin_pass = $passwords::misc::scripts::wikiadmin_pass

That has been removed in August 2015 by ee4975d9c5bd6fbf83f70db4d3bbce646c6b380c and the script have switched to use something like:

pass=`echo 'echo \$wgDBadminpassword;' | /usr/local/bin/mwscript eval.php --wiki="$db"`

In effect: ask MediaWiki (eval.php) for $wgDBadminpassword which is itself in /private/PrivateSettings.php

Thanks, hashar. I'm going to try that.

Change 275680 had a related patch set uploaded (by Mattflaschen):
make-all-blobs: Password from MW (i.e. ultimately PrivateSettings.php)

https://gerrit.wikimedia.org/r/275680

Change 275680 merged by jenkins-bot:
make-all-blobs: Password from MW (i.e. ultimately PrivateSettings.php)

https://gerrit.wikimedia.org/r/275680

@demon The script should work now. Do you want to pair again on this, or should I just do it?

Created the tables on Beta Cluster. Working command (working directory is also significant):

mattflaschen@deployment-tin:/srv/mediawiki/php-master/maintenance/storage$ export MEDIAWIKI_STAGING_DIR; ../../extensions/WikimediaMaintenance/storage/make-all-blobs deployment-db1.deployment-prep.eqiad.wmflabs blobs1

I did not need to (i.e. it failed) do the same on db2, since it replicated it.

Next is the config.

Change 282440 had a related patch set uploaded (by Mattflaschen):
Beta Cluster: Use ExternalStore on testwiki

https://gerrit.wikimedia.org/r/282440

Change 282440 merged by jenkins-bot:
Beta Cluster: Use ExternalStore on testwiki

https://gerrit.wikimedia.org/r/282440

Change 285765 had a related patch set uploaded (by Mattflaschen):
Enable External Store everywhere on Beta Cluster

https://gerrit.wikimedia.org/r/285765

Change 285765 merged by jenkins-bot:
Enable External Store everywhere on Beta Cluster

https://gerrit.wikimedia.org/r/285765

A few DB queries I used to verify (260 is http://test.wikimedia.beta.wmflabs.org/wiki/Vorlage:Lk_%28Bibel%29):

mysql> SELECT LOWER(HEX(rev_type_id)), LOWER(HEX(rev_id)), rev_content FROM flow_revision ORDER by rev_id DESC LIMIT 5;
+-------------------------+------------------------+------------------------------------------------+
| LOWER(HEX(rev_type_id)) | LOWER(HEX(rev_id))     | rev_content                                    |
+-------------------------+------------------------+------------------------------------------------+
| 0551674379b265ddb4696b  | 0551674379b265ddb4696b | DB://cluster1/5                                |
| 0551674379aa65ddb4696b  | 0551674379ae65ddb4696b | DB://cluster1/4                                |
| 0551674197c1e14f40af18  | 055167419885e14f40af18 | DB://cluster1/3                                |
| 05499b41d6ea654f40af18  | 05499b41d6ea654f40af18 | �)PHI,I�-H,*��L�U�VJ).R��6�1�1�1��U�+I-.��/�                   |
| 05499b41d6e6654f40af18  | 05499b41d6e6664f40af18 | �)N.�,(�K�I-*�0Դч
                                  |
+-------------------------+------------------------+------------------------------------------------+
5 rows in set (0.00 sec)

mysql> SELECT rev_id, old_id, old_text FROM revision JOIN text ON old_id = rev_text_id ORDER by rev_id DESC LIMIT 5;
+--------+--------+--------------------------------------------------------------------------------------------------------------------------------------------+
| rev_id | old_id | old_text                                                                                                                                   |
+--------+--------+--------------------------------------------------------------------------------------------------------------------------------------------+
|   2460 |   1980 | DB://cluster1/6                                                                                                                            |
|   2459 |   1979 | DB://cluster1/2                                                                                                                            |
|   2458 |   1978 | DB://cluster1/1                                                                                                                            |
|   2457 |   1975 | m˱!��W�0�_gaaabag,6�w���%���vv&�܂4\�����BU�gh`܃ji��c
                                                                             �����u�{��K95�i:V��q��sş��ap�����B3�?{S�P"Sc����k�UҲ
(-��K�ʴ�                                                           |
|   2456 |   1974 | ]�1�0
                         �=���247`� ���K#�I0
Bܝ�t��_z�[r`�B{z�����&��By��U!Y��+�>���ƭ5�,�ԏ�3vnC'�
                                                        ڄ�V���n^<�Y-�Q8`�Ҝ�-u�vV5��y]�o����F�                                                               |
+--------+--------+--------------------------------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)

mysql> SELECT page_id, rev_id, old_id, old_text FROM page JOIN revision ON page_latest = rev_id JOIN text ON rev_text_id = old_id WHERE page_id = 260 ORDER BY rev_id DESC LIMIT 5;
+---------+--------+--------+-----------------+
| page_id | rev_id | old_id | old_text        |
+---------+--------+--------+-----------------+
|     260 |   2458 |   1978 | DB://cluster1/1 |
+---------+--------+--------+-----------------+
1 row in set (0.00 sec)

Tested enwiki at http://en.wikipedia.beta.wmflabs.org/wiki/UserMergeoujuvz and http://en.wikipedia.beta.wmflabs.org/wiki/Talk:Flow .

Equivalent queries for enwiki:

mattflaschen@deployment-tin:~$ sql enwiki
[...]
mysql> SELECT LOWER(HEX(rev_type_id)), LOWER(HEX(rev_id)), rev_content FROM flow_revision ORDER by rev_id DESC LIMIT 5;
+-------------------------+------------------------+------------------------------------------------------------+
| LOWER(HEX(rev_type_id)) | LOWER(HEX(rev_id))     | rev_content                                                |
+-------------------------+------------------------+------------------------------------------------------------+
| 055167930119e14f40af18  | 055167930119e14f40af18 | DB://cluster1/3                                            |
| 055167930115e14f40af18  | 055167930115e24f40af18 | DB://cluster1/2                                            |
| 055167763fae66ddb4696b  | 055167763fae66ddb4696b | �)PHI,I�-H,*��L�U�VJ).R��6�14�1�1��U�K.JM,IMQ(�LTp
                                                                                                       ���/�                     |
� 055167763faa65ddb4696b  | 055167763fae65ddb4696b | +�,�I��0364�073450�023��45rL͌MLLL
 �M�-�                             |
| 05516774639265ddb4696b  | 05516774639265ddb4696b | �)PHI,I�-H,*��L�U�VJ).R��6�14�1�1��U�K.JM,IMQ(�LTp
                                                                                                       ���/�                     |
+-------------------------+------------------------+------------------------------------------------------------+
5 rows in set (0.00 sec)

mysql> SELECT rev_id, old_id, old_text FROM revision JOIN text ON old_id = rev_text_id ORDER by rev_id DESC LIMIT 5;
+--------+--------+-------------------------------------+
| rev_id | old_id | old_text                            |
+--------+--------+-------------------------------------+
| 336606 | 336429 | DB://cluster1/4                     |
| 336605 | 336428 | DB://cluster1/1                     |
| 336604 | 336427 | �VJ��/�-�/�1���J��+K
                                         2sr�K�,򳋔j                  |
| 336603 | 336426 | �VJ��/�-�/�1���J��+K��
                                           ҍ�ͫ*L��j                   |
| 336602 | 336425 | �VJ��/�-�/�1���J��+�
K�,,LM                                   ,
��j                |
+--------+--------+-------------------------------------+
5 rows in set (0.00 sec)

mysql> SELECT page_id, rev_id, old_id, old_text FROM page JOIN revision ON page_latest = rev_id JOIN text ON rev_text_id = old_id WHERE page_namespace = 0 AND page_title =  'UserMergeoujuvz';
+---------+--------+--------+-----------------+
| page_id | rev_id | old_id | old_text        |
+---------+--------+--------+-----------------+
|  106003 | 336605 | 336428 | DB://cluster1/1 |
+---------+--------+--------+-----------------+
1 row in set (0.00 sec)
    • Re-run the @Mattflaschen queries on betalabs for newly created flow-board and wikitext pages - all refer to DB://cluster1
  • did general regression testing - all look normal

The separate blobs1 tables (e.g. testwiki.blobs1, enwiki.blobs1) (which is the actual External Store) are also getting populated as expected.