Actually three tables: globaljsonlinks, globaljsonlinks_wiki,
and globaljsonlinks_target.
Description
| Status | Subtype | Assigned | Task | ||
|---|---|---|---|---|---|
| Stalled | None | T387419 Create wiki replicas views for globaljsonlinks tables | |||
| Stalled | None | T395881 Set up x1 replication to Wiki Replicas | |||
| Stalled | None | T407485 Set up x1 replication to an-redacteddb1001 | |||
| Duplicate | None | T407486 Set up x1 replication to an-redacteddb1001 | |||
| Duplicate | Marostegui | T408692 Set up replication on new hosts clouddb102[2-5] | |||
| Resolved | fnegri | T401295 Decide how to use the new clouddb hosts (clouddb102[2-5]) | |||
| Open | Marostegui | T409557 Productionize new clouddb* hosts (clouddb1022-1033) | |||
| Stalled | Marostegui | T404715 Setup fake x4 section | |||
| Resolved | Marostegui | T406550 Productionize db126[0-3] | |||
| Resolved | Marostegui | T406551 Productionize db224[5-8] | |||
| Resolved | Marostegui | T410365 SystemdUnitFailed: generate-mysqld-exporter-config.service on prometheus100[56]:9100 | |||
| Stalled | None | T409560 Add support for x1 and x4 sections on wiki replicas on the load balancer layer | |||
| In Progress | Rsilvola | T415219 Privacy review of x1 tables in preparation of adding them to wikireplicas |
Event Timeline
@Bugreporter : It's not very clear what you are expecting here. Could you provide more details so we can understand what needs to be done without too much investigation?
Currently there is no view of these three tables in Cloud, so they are not queryable in Cloud.
@Ahoelzl : could you validate if those tables should or should not be exposed? Is redaction needed?
@Bugreporter thanks for filing. Can you elaborate on the use cases? And also on priority?
I think that this is likely to be uncontentious and relatively easy to achieve.
The three tables mentioned appear to be related in some way to the use of the Charts extension, which has extended the JsonConfig model.
We can see more details in this ticket: T380226: Install globaljsonlinks* tables on X1 for use with commons commons for Charts deployment although I'm not sure whether there is a better source for information.
I had a quick look at one of the databases where this is installed, commonswiki and the tables are installed in the x1 section, since they relate to an extension.
btullis@stat1008:~$ analytics-mysql --use-x1 commonswiki Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MariaDB monitor. Commands end with ; or \g. Your MariaDB connection id is 8035121 Server version: 10.6.20-MariaDB-log MariaDB Server Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql:research@dbstore1009.eqiad.wmnet [commonswiki]> show tables; +---------------------------------------+ | Tables_in_commonswiki | +---------------------------------------+ | echo_email_batch | | echo_event | | echo_notification | | echo_target_page | | globaljsonlinks | | globaljsonlinks_target | | globaljsonlinks_wiki | | mediamoderation_scan | | translate_cache | | translate_message_group_subscriptions | +---------------------------------------+ 10 rows in set (0.001 sec) mysql:research@dbstore1009.eqiad.wmnet [commonswiki]> describe globaljsonlinks; +---------------+---------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +---------------+---------------------+------+-----+---------+-------+ | gjl_wiki | int(11) | NO | PRI | NULL | | | gjl_namespace | int(11) | NO | PRI | NULL | | | gjl_title | varbinary(255) | NO | PRI | NULL | | | gjl_target | bigint(20) unsigned | NO | PRI | NULL | | +---------------+---------------------+------+-----+---------+-------+ 4 rows in set (0.002 sec) mysql:research@dbstore1009.eqiad.wmnet [commonswiki]> describe globaljsonlinks_target; +----------------+---------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +----------------+---------------------+------+-----+---------+----------------+ | gjlt_id | bigint(20) unsigned | NO | PRI | NULL | auto_increment | | gjlt_namespace | int(11) | NO | MUL | NULL | | | gjlt_title | varbinary(255) | NO | | NULL | | +----------------+---------------------+------+-----+---------+----------------+ 3 rows in set (0.001 sec) mysql:research@dbstore1009.eqiad.wmnet [commonswiki]> describe globaljsonlinks_wiki; +---------------------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +---------------------+------------------+------+-----+---------+----------------+ | gjlw_id | int(10) unsigned | NO | PRI | NULL | auto_increment | | gjlw_wiki | varbinary(32) | NO | MUL | NULL | | | gjlw_namespace | int(11) | YES | | NULL | | | gjlw_namespace_text | varbinary(255) | YES | | NULL | | +---------------------+------------------+------+-----+---------+----------------+ 4 rows in set (0.001 sec) mysql:research@dbstore1009.eqiad.wmnet [commonswiki]>
From an eyeball of the schemas, I think that there is unlikely to be any need for redaction of data.
We can also see another table which is similarly named: globalimagelinks
btullis@stat1008:~$ analytics-mysql commonswiki Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MariaDB monitor. Commands end with ; or \g. Your MariaDB connection id is 43933951 Server version: 10.6.20-MariaDB-log MariaDB Server Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql:research@dbstore1007.eqiad.wmnet [commonswiki]> describe globalimagelinks; +-----------------------+------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-----------------------+------------------+------+-----+---------+-------+ | gil_wiki | varbinary(32) | NO | PRI | | | | gil_page | int(10) unsigned | NO | PRI | 0 | | | gil_page_namespace_id | int(11) | NO | | 0 | | | gil_page_namespace | varbinary(255) | NO | | | | | gil_page_title | varbinary(255) | NO | | | | | gil_to | varbinary(255) | NO | PRI | | | +-----------------------+------------------+------+-----+---------+-------+ 6 rows in set (0.002 sec) mysql:research@dbstore1007.eqiad.wmnet [commonswiki]>
This can be found in the fullviews section of the maintain-views.yaml data file.
https://github.com/wikimedia/operations-puppet/blob/production/modules/profile/templates/wmcs/db/wikireplicas/maintain-views.yaml#L85
So, while I would like a second opinion from someone who understands the JsonConfig model and the Charts extension better, I would say that this request is reasonable, from both a privacy and technical viewpoint.
@EBernhardson do you know enough about JsonConfig (and maybe charts) to validate Ben's analysis above?
I don't know a ton about this, but i took a look at it. A few thoughts:
- x1 doesn't really mean it's about extensions, but typically that means more that these are tables shared between wikis. Normal per-wiki tables are on the s* db clusters, the x* clusters are typically shared between wikis.
- Plausibly the view that exposes these tables publicly may need to filter for when wiki is a private wiki? I took a look over select distinct gjlw_wiki from globaljsonlinks_wiki and didn't notice any private wikis, but not clear if that's by accident or by design.
- The gjl_wiki column of globaljsonlinks looks to be a foreign key to globaljsonlinks_wiki. I imagine if we filter the private wiki from globaljsonlinks_wiki then matching rows in globaljsonlinks also have to be elided?
The Cloud-Services project tag is not intended to have any tasks. Please check the list on https://phabricator.wikimedia.org/project/profile/832/ and replace it with a more specific project tag to this task. Thanks!
@Gehel I don't think there's anything actionable for cloud-services-team yet. I'm looping in Data-Persistence as I think we would need to add x1 to the replicated sections. Currently Wiki Replicas only include s1 to s8, plus x3.
(replicating x3 was already recently asked in T395072, but likely should have its own task?)
but likely should have its own task?
Agree, I created T395881: Set up x1 replication to Wiki Replicas.
Does anybody have a specific use case that needs to access these tables? If yes, that might help in bumping the priority of T395881: Set up x1 replication to Wiki Replicas.
It would be very helpful to have access to these tables on toolforge (and quarry, https://superset.wmcloud.org/ etc) to generate statistics and usage information about charts.