
Establish process of determining shard for new wikis
Closed, ResolvedPublic

Description

Hello,

Currently, the DBAs get a heads-up about a newly requested wiki via a "Prepare and check storage layer for XXX" task. Previously, the wiki creator created the wiki on s3, as suggested at https://wikitech.wikimedia.org/wiki/Add_a_wiki. However, that practice is about to change, as s3 already holds a lot of wikis.

I propose that the wiki creator create the heads-up task and wait for a DBA to decide. The DBA would then update the parent task's description, which already has a "Shard" field that says TBD by default. See T259432 as an example.

What do the DBAs think?

Event Timeline

Reedy renamed this task from Estabilish process of determining shard for new wikis to Establish process of determining shard for new wikis. Aug 2 2020, 5:21 PM
Marostegui subscribed.

I would prefer that we update the docs and the phab template to point to s5 (once we are fully ready for it) and even send an email to wikitech-l to make sure everyone knows that from that date every new wiki will be on s5.
The reason I propose that is because I don't want us to be a blocking point for the wiki creation (we are already one for the views creation).

Marostegui triaged this task as Medium priority. Aug 3 2020, 5:22 AM

Well, anything that works for you works for me too, I really shouldn't be the blocking one. I thought what happened now could happen at any point in the future. If you're saying it's not likely to change, I'm happy with changing it to be s5 by default.

Yeah, let's change it once we've moved the two wikis on Tuesday. Then we can update the documentation and all that.
Thank you very much for your help

Assigning to @Urbanecm for the documentation change.
Thank you very much

Updated the shard in https://meta.wikimedia.org/w/index.php?title=Template:New_wiki_request&diff=20355000&oldid=20353965 to be s5 by default. The Add a wiki docs were updated to say s5 is the default. Guidance was added for changing the db-* files, since that is needed now.

Marostegui added subscribers: Bstorm, bd808.

@bd808 @Bstorm we have moved two wikis from s3 to s5.
While nothing needs to be done data/views-wise on labsdb hosts, Jaime has noticed that on https://replag.toolforge.org/ those two still appear on s3 and not on s5.
Is there anything needed from your side or from the meta_p table to update that?

Not sure if a change is needed on the per-wiki DNS endpoints too.

@elukey not sure if there's something from your side when you sqoop stuff (even though those two wikis are closed), see: T259438#6375572

Replag learns which slice (I thought we didn't use the "shard" word) a wiki is in via the meta_p databases which are locally created on each Wiki Replica host. This is a common pattern for Toolforge tools in general. Any time that existing wikis are moved around from slice to slice the /usr/local/sbin/maintain-meta_p script needs to be run on each Wiki Replica server to update the meta_p database. The /usr/local/sbin/wikireplica_dns script also needs to be run to update the <wikidb>.{analytics,web}.db.svc.eqiad.wmflabs DNS entries as well. This second part is not super important today with all the slices attached to all of the database servers, but it will be very important once that is no longer the case.

Both of these scripts read the wiki->slice mapping data from the dblist files in operations/mediawiki-config. Maintain-meta_p reads these files from a local git clone in /usr/local/lib/mediawiki-config. Wikireplica_dns fetches the files on demand from https://noc.wikimedia.org/.
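The lookup described above can be sketched roughly as follows. This is not the code of maintain-meta_p or the DNS script; it is a minimal illustration that assumes one `<slice>.dblist` file per slice, with one database name per line and `#` starting a comment. The function names, the directory layout, and `examplewiki` are hypothetical.

```python
import tempfile
from pathlib import Path


def load_slice_map(dblist_dir, slices):
    """Return {dbname: slice} built from <slice>.dblist files in dblist_dir."""
    mapping = {}
    for slice_name in slices:
        path = Path(dblist_dir) / f"{slice_name}.dblist"
        for line in path.read_text().splitlines():
            dbname = line.split("#", 1)[0].strip()  # drop comments and blanks
            if dbname:
                mapping[dbname] = slice_name
    return mapping


def replica_hostnames(dbname):
    """Per-wiki names in the <wikidb>.{analytics,web} pattern quoted above."""
    return [f"{dbname}.{kind}.db.svc.eqiad.wmflabs" for kind in ("analytics", "web")]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Illustrative file contents only, not the real dblist files.
        Path(tmp, "s3.dblist").write_text("# s3 wikis\nexamplewiki\n")
        Path(tmp, "s5.dblist").write_text("muswiki\nmhwiktionary\n")
        slices = load_slice_map(tmp, ["s3", "s5"])
        print(slices["muswiki"], replica_hostnames("muswiki"))
```

Because the mapping is rebuilt from the dblist files on every run, a wiki that moves between slices is only picked up once the files change and the scripts are rerun, which matches the manual steps described in this task.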

Makes sense, thanks. @Marostegui Do we have any docs for changing the shard/slice/section/whatever the terminology is? If so, can we add the info from @bd808's comment?

Change 619627 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] labsdb.zone: Update s5 wikis

https://gerrit.wikimedia.org/r/619627

Replag learns which slice (I thought we didn't use the "shard" word) a wiki is in via the meta_p databases which are locally created on each Wiki Replica host. This is a common pattern for Toolforge tools in general. Any time that existing wikis are moved around from slice to slice the /usr/local/sbin/maintain-meta_p script needs to be run on each Wiki Replica server to update the meta_p database.

I have run the following on all labsdb hosts, but I am not sure whether it worked or not. @bd808, could you double check?

root@labsdb1012:~# /usr/local/sbin/maintain-meta_p --databases muswiki (also done for mhwiktionary)
/usr/lib/python2.7/dist-packages/pymysql/cursors.py:166: Warning: (1007, u"Can't create database 'meta_p'; database exists")
  result = self._query(query)
/usr/lib/python2.7/dist-packages/pymysql/cursors.py:166: Warning: (1050, u"Table 'wiki' already exists")
  result = self._query(query)
/usr/lib/python2.7/dist-packages/pymysql/cursors.py:166: Warning: (1050, u"Table 'properties_anon_whitelist' already exists")
  result = self._query(query)
root@labsdb1012:~# /usr/local/sbin/maintain-meta_p --databases mhwitionary
/usr/lib/python2.7/dist-packages/pymysql/cursors.py:166: Warning: (1007, u"Can't create database 'meta_p'; database exists")
  result = self._query(query)
/usr/lib/python2.7/dist-packages/pymysql/cursors.py:166: Warning: (1050, u"Table 'wiki' already exists")
  result = self._query(query)
/usr/lib/python2.7/dist-packages/pymysql/cursors.py:166: Warning: (1050, u"Table 'properties_anon_whitelist' already exists")
  result = self._query(query)

The /usr/local/sbin/wikireplica_dns script also needs to be run to update the <wikidb>.{analytics,web}.db.svc.eqiad.wmflabs DNS entries as well. This second part is not super important today with all the slices attached to all of the database servers, but it will be very important once that is no longer the case.

Couldn't find wikireplica_dns, assuming you meant:

wmcs-wikireplica-dns --aliases --shard s3
wmcs-wikireplica-dns --aliases --shard s5

After running those, muswiki and mhwiktionary now show under s5 at https://replag.toolforge.org/

I have sent https://gerrit.wikimedia.org/r/c/operations/puppet/+/619627. Is this also needed?

We have https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Wiki_Replica_DNS but we don't have a specific doc about moving wikis between sections that groups the above comments from Bryan.
We can probably create one once we've gathered all the info from this task and from @bd808 :)

$ sql meta_p
(u3518@s7.analytics.db.svc.eqiad.wmflabs) [meta_p]> select dbname, slice from wiki where dbname in ('muswiki', 'mhwiktionary');
+--------------+-----------+
| dbname       | slice     |
+--------------+-----------+
| mhwiktionary | s5.labsdb |
| muswiki      | s5.labsdb |
+--------------+-----------+
2 rows in set (0.00 sec)

Looks good to me. Those warnings come from the seed_schema() function in the script. It does things like CREATE DATABASE IF NOT EXISTS ... & CREATE TABLE IF NOT EXISTS ... and that log output is what the python mysql library we are using outputs when the target schema objects do exist. The upstream seems uninterested in making them easy to suppress as they are normal warnings generated from the server side.
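For readers who want to see the mechanism, here is a minimal sketch of how such "already exists" warnings can be silenced with Python's standard `warnings` module. It is not taken from maintain-meta_p. `DbWarning` is a hypothetical stand-in for the driver's warning class, and `seed_schema_stub` only simulates the warnings shown in the transcript, so the example runs without a database.

```python
import warnings


class DbWarning(Warning):
    """Hypothetical stand-in for the database driver's warning class."""


def seed_schema_stub():
    """Simulate a driver surfacing server-side 'already exists' notes as warnings."""
    warnings.warn(DbWarning(1007, "Can't create database 'meta_p'; database exists"))
    warnings.warn(DbWarning(1050, "Table 'wiki' already exists"))


def run_quietly(func):
    """Run func with DbWarning silenced; other warning categories still surface."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=DbWarning)
        return func()


if __name__ == "__main__":
    run_quietly(seed_schema_stub)
    print("seeded without warning noise")
```

Filtering by category rather than globally keeps unrelated warnings visible. Whether the real script should hide these at all is a separate judgment call, since they are harmless but do confirm that the schema objects already existed.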

Yes, because these 2 wikis existed in the era of dbname.labsdb DNS records we should keep their legacy names in sync with their new slice. I left some notes on the patchset about possibly removing those legacy DNS records in the next round of Wiki Replica breaking changes. It has been nearly 3 years since we introduced the new *.{analytics,web}.db.svc.eqiad.wmflabs names to replace *.labsdb and any wikis created in those intervening years do not have *.labsdb records so the legacy names are becoming less and less useful globally. I know I have seen a lot of older tools though that are still using them because these tools typically only work with a small number of dbs (or very often just one).

Change 619627 merged by Marostegui:
[operations/puppet@production] labsdb.zone: Update s5 wikis

https://gerrit.wikimedia.org/r/619627

Thank you Bryan.
@Urbanecm I think we are done from the "code" side. I think I am going to add this step to https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Wiki_Replicas as a new section: "Moving wikis between shards"

Change 693148 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] site.pp: s3 is no longer default for new wikis.

https://gerrit.wikimedia.org/r/693148

Change 693148 merged by Marostegui:

[operations/puppet@production] site.pp: s3 is no longer default for new wikis.

https://gerrit.wikimedia.org/r/693148