Wiki replicas are not fully set up for newly created wikis
Closed, Resolved, Public

Description

I noticed that several wikis created a couple of months ago (kcgwiki [2022-05], guwwiki [2022-03], shnwikivoyage [2022-03]) do not have wiki replicas fully working (sql <wiki> does not connect to the wiki's replica):

urbanecm@tools-sgebastion-10  ~
$ sql kcgwiki
Could not find requested database
Make sure to ask for a db in format of <wiki>_p
urbanecm@tools-sgebastion-10  ~
$ sql guwwiki
Could not find requested database
Make sure to ask for a db in format of <wiki>_p
urbanecm@tools-sgebastion-10  ~
$ sql shnwikivoyage
Could not find requested database
Make sure to ask for a db in format of <wiki>_p
urbanecm@tools-sgebastion-10  ~
$

However, manually checking s5.analytics.db.svc.wikimedia.cloud shows that the views themselves exist:

urbanecm@tools-sgebastion-10  ~
$ mysql -h s5.analytics.db.svc.wikimedia.cloud
[...]
MariaDB [(none)]> use kcgwiki_p
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [kcgwiki_p]> select page_id from page limit 1;
+---------+
| page_id |
+---------+
|     893 |
+---------+
1 row in set (0.005 sec)

MariaDB [kcgwiki_p]> use guwwiki_p
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [guwwiki_p]> select page_id from page limit 1;
+---------+
| page_id |
+---------+
|      21 |
+---------+
1 row in set (0.002 sec)

MariaDB [guwwiki_p]> use shnwikivoyage_p
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [shnwikivoyage_p]> select page_id from page limit 1;
+---------+
| page_id |
+---------+
|    1300 |
+---------+
1 row in set (0.003 sec)

MariaDB [shnwikivoyage_p]> Bye
urbanecm@tools-sgebastion-10  ~
$

Further investigation shows that the issue is that /usr/bin/sql determines the right host via the <wiki>.{analytics,web}.db.svc.wikimedia.cloud DNS record, which does not exist for the three wikis mentioned above:

urbanecm@tools-sgebastion-10  ~
$ host kcgwiki.analytics.db.svc.wikimedia.cloud
Host kcgwiki.analytics.db.svc.wikimedia.cloud not found: 3(NXDOMAIN)
urbanecm@tools-sgebastion-10  ~
$ host guwwiki.analytics.db.svc.wikimedia.cloud
Host guwwiki.analytics.db.svc.wikimedia.cloud not found: 3(NXDOMAIN)
urbanecm@tools-sgebastion-10  ~
$ host shnwikivoyage.analytics.db.svc.wikimedia.cloud
Host shnwikivoyage.analytics.db.svc.wikimedia.cloud not found: 3(NXDOMAIN)
urbanecm@tools-sgebastion-10  ~
$
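
The DNS check above can be repeated for several wikis at once. The following is a minimal Python sketch, not the logic of /usr/bin/sql itself: the function names are hypothetical, and the only thing taken from this task is the <wiki>.{analytics,web}.db.svc.wikimedia.cloud naming pattern. The resolver is passed in as a parameter so the check can be exercised without network access.

```python
import socket
from typing import Callable, Dict, Iterable, List

# Naming pattern taken from the task description.
DOMAIN = "db.svc.wikimedia.cloud"
CLUSTERS = ("analytics", "web")


def replica_hostnames(wiki: str) -> List[str]:
    """Return the per-wiki service names for both clusters."""
    return [f"{wiki}.{cluster}.{DOMAIN}" for cluster in CLUSTERS]


def resolves(hostname: str) -> bool:
    """Return True if the name resolves; lookup failures such as NXDOMAIN give False."""
    try:
        socket.gethostbyname(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True


def missing_dns(
    wikis: Iterable[str],
    resolver: Callable[[str], bool] = resolves,
) -> Dict[str, List[str]]:
    """Map each wiki to the service names that do not resolve (hypothetical helper)."""
    report: Dict[str, List[str]] = {}
    for wiki in wikis:
        missing = [h for h in replica_hostnames(wiki) if not resolver(h)]
        if missing:
            report[wiki] = missing
    return report
```

Calling missing_dns(["kcgwiki", "guwwiki", "shnwikivoyage"]) from a host that can reach the Cloud Services resolver would list the names that still return NXDOMAIN; an empty result means every record resolves.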

In addition to the wiki-replica issue mentioned above, it also looks like the wikis were not added to meta_p.wiki, meaning tools relying on that database will not be aware of the wikis' existence:

urbanecm@tools-sgebastion-10  ~
$ sql meta
[...]
MariaDB [meta_p]> select * from wiki where dbname in ('kcgwiki', 'guwwiki', 'shnwikivoyage');
Empty set (0.003 sec)

MariaDB [meta_p]> Bye
urbanecm@tools-sgebastion-10  ~
$
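
The same meta_p.wiki lookup can be scripted to report which wikis are absent. This is a rough Python sketch under stated assumptions: the helper names are hypothetical, the query uses the "%s" placeholder style accepted by common MySQL drivers for Python, and the rows are assumed to arrive as tuples whose first element is dbname. Connecting to the replica is deliberately left out.

```python
from typing import Iterable, List, Sequence, Tuple


def build_meta_query(dbnames: Sequence[str]) -> Tuple[str, Tuple[str, ...]]:
    """Build a parameterised query against meta_p.wiki for the given dbnames."""
    if not dbnames:
        raise ValueError("at least one dbname is required")
    placeholders = ", ".join(["%s"] * len(dbnames))
    sql = f"SELECT dbname FROM wiki WHERE dbname IN ({placeholders})"
    return sql, tuple(dbnames)


def missing_from_meta(
    expected: Sequence[str],
    rows: Iterable[Sequence[str]],
) -> List[str]:
    """Return the expected dbnames that do not appear in the query result."""
    present = {row[0] for row in rows}
    return [name for name in expected if name not in present]
```

With an empty result set, as in the transcript above, missing_from_meta returns all of the expected names in their original order.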

According to a discussion with @taavi, this seems to be Data-Engineering's responsibility. Tagging as such (feel free to correct me if this is not correct).

Event Timeline

Further investigation shows that @BTullis likely created the views by running maintain-views manually (T305280#8016106, T302798#7881457; T303761#7886862 does not mention any particular method). As far as I can see, there is a wmcs.wikireplicas.add_wiki cookbook (source), which should take care of the three steps that are necessary for wiki replicas to fully work (view creation, DNS records, update of meta_p.wiki). The cookbook appears to have been used successfully by @nskaggs in the past.

@BTullis I'm wondering, is there any reason why the cookbook was not used (perhaps it's obsolete, or something)?
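
For illustration, the three prerequisites named above (views, DNS records, meta_p.wiki row) can be tracked per wiki with a small Python sketch. The class and field names are hypothetical, and this is not the cookbook's code; it only models which steps are still outstanding.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReplicaStatus:
    """State of one wiki with respect to the three steps (hypothetical model)."""

    wiki: str
    views_exist: bool
    dns_exists: bool
    in_meta_p: bool

    def outstanding(self) -> List[str]:
        """List the steps that still need to be carried out, in a fixed order."""
        steps: List[str] = []
        if not self.views_exist:
            steps.append("view creation")
        if not self.dns_exists:
            steps.append("DNS records")
        if not self.in_meta_p:
            steps.append("meta_p.wiki row")
        return steps


# State of kcgwiki as described when this task was filed:
# the views existed, but the DNS record and the meta_p.wiki row did not.
kcgwiki = ReplicaStatus("kcgwiki", views_exist=True, dns_exists=False, in_meta_p=False)
print(kcgwiki.outstanding())
```

A wiki counts as fully set up only when outstanding() returns an empty list.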

Thanks for the investigation @Urbanecm - You're right that I did run the maintain-views manually, without using the cookbook.
I clearly didn't know all of the steps that were required and I didn't know about the existence of the cookbook, so apologies for this inconvenience.

The responsibility for maintaining these wikireplica views has only recently (i.e. in the last year or so) moved from the Cloud-Services team to the Data-Engineering team and it looks like there's been some degree of incomplete knowledge transfer in handing over the processes. e.g.

I'm more than happy to run the cookbook now and to try to update the Wikitech documentation on the process, if you believe that's the best course of action.
Alternatively, I could execute the steps manually to update the DNS and the meta_p database, if you think that's best.

I'll try to make sure that we have a better set of reference material for this process in future too.

> Thanks for the investigation @Urbanecm - You're right that I did run the maintain-views manually, without using the cookbook.
> I clearly didn't know all of the steps that were required and I didn't know about the existence of the cookbook, so apologies for this inconvenience.

No problem, I totally understand that.

> The responsibility for maintaining these wikireplica views has only recently (i.e. in the last year or so) moved from the Cloud-Services team to the Data-Engineering team and it looks like there's been some degree of incomplete knowledge transfer in handing over the processes. e.g.
>
> I'm more than happy to run the cookbook now and to try to update the Wikitech documentation on the process, if you believe that's the best course of action.
> Alternatively, I could execute the steps manually to update the DNS and the meta_p database, if you think that's best.

I'm not sure if the cookbook is safe to run if some of the steps are already done. I defer to your judgement on the best course of action here. So long as the wikis have the DNS records, a row in meta_p.wiki and the views, I'll be happy :-).

> I'll try to make sure that we have a better set of reference material for this process in future too.

Thanks!

It looks like the DNS records got updated, as I can connect to the replicas now. However, meta_p did not get updated, so a rOPUP modules/profile/files/wmcs/db/wikireplicas/maintain-meta_p.py run is still required.

pcmwiki is also missing from meta_p.wiki.

Apologies for the delay in responding again to this.
I've checked the four wikis mentioned here and they all seem to be working now. Are you able to verify this, please, @Urbanecm?

  • kcgwiki
  • guwwiki
  • shnwikivoyage
  • pcmwiki

btullis@tools-sgebastion-10:~$ sql kcgwiki "select * from page limit 2"
+---------+----------------+----------------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
| page_id | page_namespace | page_title           | page_is_redirect | page_is_new | page_random    | page_touched   | page_links_updated | page_latest | page_len | page_content_model | page_lang |
+---------+----------------+----------------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
|       2 |              8 | Sitesupport-url      |                0 |           1 | 0.780423446314 | 20220516120808 | 20220516120808     |           2 |      109 | wikitext           | NULL      |
|       3 |             10 | A̱kuu_nta̱mpi̱let    |                0 |           0 | 0.683607919661 | 20221222112537 | 20221222112553     |       13714 |       76 | wikitext           | NULL      |
+---------+----------------+----------------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
btullis@tools-sgebastion-10:~$ sql guwwiki "select * from page limit 2"
+---------+----------------+-----------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
| page_id | page_namespace | page_title      | page_is_redirect | page_is_new | page_random    | page_touched   | page_links_updated | page_latest | page_len | page_content_model | page_lang |
+---------+----------------+-----------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
|       1 |              0 | Weda_Tangan     |                0 |           0 | 0.443739644563 | 20230105004345 | 20230105004826     |       30684 |       15 | wikitext           | NULL      |
|       2 |              8 | Sitesupport-url |                0 |           1 | 0.273839265839 | 20220323151647 | 20220323151647     |           2 |      109 | wikitext           | NULL      |
+---------+----------------+-----------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
btullis@tools-sgebastion-10:~$ sql shnwikivoyage "select * from page limit 2"
+---------+----------------+--------------------------------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
| page_id | page_namespace | page_title                           | page_is_redirect | page_is_new | page_random    | page_touched   | page_links_updated | page_latest | page_len | page_content_model | page_lang |
+---------+----------------+--------------------------------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
|       1 |              0 | ၼႃႈႁူဝ်ႁႅၵ်ႈ                         |                0 |           0 | 0.689626825083 | 20230105004352 | 20230105004858     |        5946 |     7605 | wikitext           | NULL      |
|       2 |              8 | Sitesupport-url                      |                0 |           1 | 0.142618882848 | 20220323150434 | 20220323150434     |           2 |      110 | wikitext           | NULL      |
+---------+----------------+--------------------------------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
btullis@tools-sgebastion-10:~$ sql pcmwiki "select * from page limit 2"
+---------+----------------+-----------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
| page_id | page_namespace | page_title      | page_is_redirect | page_is_new | page_random    | page_touched   | page_links_updated | page_latest | page_len | page_content_model | page_lang |
+---------+----------------+-----------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
|       1 |              0 | Main_Pej        |                0 |           0 |   0.4287915033 | 20230105004350 | 20230105004837     |       17442 |       19 | wikitext           | NULL      |
|       2 |              8 | Sitesupport-url |                0 |           1 | 0.078457951077 | 20220817111906 | 20220817111210     |           2 |      109 | wikitext           | NULL      |
+---------+----------------+-----------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+

Please do let me know if there is still anything missing and I will investigate further.

Hello @BTullis, some wikis are still missing from the meta_p.wiki table. Browsing Wiki-Setup (Create) for the most recently created wikis, I see that only two out of seven are there:

MariaDB [meta_p]> SELECT dbname, lang FROM wiki WHERE dbname IN ('kcgwiki', 'guwwiki', 'guwwiktionary', 'guwwikiquote', 'shnwikivoyage', 'pcmwiki', 'gurwiki');
+--------------+------+
| dbname       | lang |
+--------------+------+
| guwwikiquote | guw  |
| pcmwiki      | pcm  |
+--------------+------+
2 rows in set (0.002 sec)

Could the missing 5 be added?

Thanks.

Hello @BTullis - Any updates? Some tools are not working for us due to those wikis not being in meta_p. Regards.

Hi @MarcoAurelio - Apologies for the delay. I'll update these asap.

@MarcoAurelio - I believe that these seven wikis you mentioned are all present in meta_p.wiki now.

btullis@tools-sgebastion-10:~$ sql meta_p
MariaDB [meta_p]> SELECT dbname, lang FROM wiki WHERE dbname IN ('kcgwiki', 'guwwiki', 'guwwiktionary', 'guwwikiquote', 'shnwikivoyage', 'pcmwiki', 'gurwiki');
+---------------+------+
| dbname        | lang |
+---------------+------+
| gurwiki       | gur  |
| guwwiki       | guw  |
| guwwikiquote  | guw  |
| guwwiktionary | guw  |
| kcgwiki       | kcg  |
| pcmwiki       | pcm  |
| shnwikivoyage | shn  |
+---------------+------+
7 rows in set (0.002 sec)

MariaDB [meta_p]>

The DNS aliases for all seven are also present. Would you be able to confirm for me please?

Once again, I apologise for the delay in getting these fully configured.

Hello @BTullis - thank you. I checked for the list of wikis created in 2022 and 2023 as per the site creation log:

MariaDB [meta_p]> SELECT dbname, lang FROM wiki WHERE dbname IN ('guwwiki', 'shnwikivoyage', 'kcgwiki', 'blkwiki', 'bjnwiktionary', 'guwwiktionary', 'pcmwiki', 'bnwikiquote', 'tlwikiquote', 'bclwikiquote', 'igwikiquote', 'igwiktionary', 'gorwiktionary', 'shnwikibooks', 'guwwikiquote', 'aswikiquote', 'gurwiki', 'gucwiki', 'anpwiki', 'ckbwiktionary');
+---------------+------+
| dbname        | lang |
+---------------+------+
| aswikiquote   | as   |
| bclwikiquote  | bcl  |
| bjnwiktionary | bjn  |
| blkwiki       | blk  |
| bnwikiquote   | bn   |
| gorwiktionary | gor  |
| gurwiki       | gur  |
| guwwiki       | guw  |
| guwwikiquote  | guw  |
| guwwiktionary | guw  |
| igwikiquote   | ig   |
| igwiktionary  | ig   |
| kcgwiki       | kcg  |
| pcmwiki       | pcm  |
| shnwikibooks  | shn  |
| shnwikivoyage | shn  |
| tlwikiquote   | tl   |
+---------------+------+
17 rows in set (0.002 sec)

So yes, this seems fixed :) Only the three most recently created wikis remain to be done:

MariaDB [meta_p]> SELECT dbname, lang FROM wiki WHERE dbname IN ('gucwiki', 'anpwiki', 'ckbwiktionary');
Empty set (0.002 sec)

These are awaiting view creation and are tracked at:

As such I think this Task may be closed as resolved unless there's further investigation that needs to be done?

Thanks for confirmation @MarcoAurelio

> As such I think this Task may be closed as resolved unless there's further investigation that needs to be done?

I think it's just the documentation that needs updating, since this procedure was originally carried out by a different team. There were clearly some gaps in my understanding.

Today I learnt about the site creation log, so thanks for that.

I'll get on with the other three sites that you linked, while I'm on a roll. :-)