Page MenuHomePhabricator

Create database schema for the CampaignEvents extension on testwiki, test2wiki, and officewiki
Closed, ResolvedPublic

Description

Once the database review (T318593) is completed, we would like that the extension tables be created in production.

Partial checklist (from mw:Creating_new_tables)

Details

  • Should this table be replicated to wiki replicas (does it not contain private data)?
    • There's more than one table with different rules. See P35370
  • Will you be doing cross-joins with the wiki metadata?
    • If I understood the question correctly, no.
  • Expected amount of queries, both writes and reads (per minute, per hour...per day, any of those are ok).
    • See P35379 for some estimates.
  • Size of the table (number of rows expected).
    • Initially 0, then it grows as detailed below.
  • Expected growth per year (number of rows).
    • Using the same (over)estimates as P35379, we could say:
      • campaign_events: 500 * 1 * 12 = 6000
      • ce_address: 500 * 0.5 * 12 = 3000
      • ce_event_address: 500 * 0.5 * 12 = 3000
      • ce_organizers: 500 * 5 * 12 = 30000
      • ce_participants: 500 * 300 * 12 = 1800000
  • Examples of queries that will be using the table.
    • See T308738#8085151 for a description of the schema and some example queries (could be slightly outdated)
  • The release plan for the feature (are there specific wikis you'd like to test first etc).
    • For now, we are requesting that the schema be created on testwiki, test2wiki, and officewiki

Auto-generated mysql file (master version)

Deployment

  • Make sure the patch which introduces the tables got merged, and confirm files were deployed to production -- done by @Daimona
  • Schedule table creation at a time that doesn't conflict with the deployment schedule -- done: https://wikitech.wikimedia.org/w/index.php?title=Deployments&diff=2023555&oldid=2023550
  • Create the tables for each wiki:
    • mwscript sql.php --wiki=testwiki --cluster=extension1 /srv/mediawiki/php-1.40.0-wmf.8/extensions/CampaignEvents/db_patches/mysql/tables-generated.sql
    • mwscript sql.php --wiki=test2wiki --cluster=extension1 /srv/mediawiki/php-1.40.0-wmf.8/extensions/CampaignEvents/db_patches/mysql/tables-generated.sql
    • mwscript sql.php --wiki=officewiki --cluster=extension1 /srv/mediawiki/php-1.40.0-wmf.8/extensions/CampaignEvents/db_patches/mysql/tables-generated.sql

Event Timeline

Daimona changed the task status from Open to Stalled.Sep 26 2022, 4:37 PM
Daimona updated the task description. (Show Details)

Stalled because subtask.

Marostegui added a subscriber: Ladsgroup.
vyuen changed the task status from Stalled to Open.Oct 5 2022, 5:31 PM
vyuen moved this task from Blocked to Triage on the DBA board.
vyuen added a subscriber: vyuen.

I think this is unblocked now that we have completed the subtasks. I am thinking the review will be a part of this process?

Hi, just a friendly comment that there are currently no DBAs available, so expect no response at the very least until next week. :-(

Just to double check, we decided to put this in x1. Do you want the testwiki ones in x1 too or you want them in core dbs?

Just to double check, we decided to put this in x1. Do you want the testwiki ones in x1 too or you want them in core dbs?

No, the ones that we need for this task (testwiki, test2wiki, officewiki) should be in the "normal" wiki databases.

Also, for completeness... We are reconsidering the current central DB setup (T318381), and depending on the outcome of that task we could abandon the x1 idea.

Also, for completeness... We are reconsidering the current central DB setup (T318381), and depending on the outcome of that task we could abandon the x1 idea.

I strongly recommend having these dbs in x1. If you want more details on why, let me know. If you want per-wiki data, it's still possible to keep them in x1 (each wiki has a database in x1)

Also, for completeness... We are reconsidering the current central DB setup (T318381), and depending on the outcome of that task we could abandon the x1 idea.

I strongly recommend having these dbs in x1. If you want more details on why, let me know. If you want per-wiki data, it's still possible to keep them in x1 (each wiki has a database in x1)

I'd definitely be interested! We're still trying to understand if we want central or local, and T318381 (as well as the page on meta linked there) explains some of the problems we're facing. More context would be very useful to help us make a better decision.

Also, for completeness... We are reconsidering the current central DB setup (T318381), and depending on the outcome of that task we could abandon the x1 idea.

I strongly recommend having these dbs in x1. If you want more details on why, let me know. If you want per-wiki data, it's still possible to keep them in x1 (each wiki has a database in x1)

Hi @Ladsgroup, thank you!
I also would be interested in more details about this, do you think we could have a meeting, so you can give us more details about it?

@Daimona by checking P35370, please do not create the tables until we've deployed the filters for those columns.

@Daimona by checking P35370, please do not create the tables until we've deployed the filters for those columns.

Hi @Marostegui, just to clarify, we won't be creating these tables ourselves. And related to your question, should I create a separate task about setting up the replicas with those filters?

you technically can create the tables if you have access to mwmaint[1], That shouldn't be an issue (and we generally encourage it) once the DBA sign-off is done. Schema change is a different story.

Also if you want to get the views, I suggest creating a patch in puppet: https://github.com/wikimedia/puppet/blob/production/modules/profile/templates/wmcs/db/wikireplicas/maintain-views.yaml

@Marostegui to my knowledge the feature not deployed yet so it won't have any issues (yet)

[1] Just make sure you connect to the master: sql officewiki --write, otherwise it'll be a mess and replication might break.

you technically can create the tables if you have access to mwmaint, That shouldn't be an issue (and we generally encourage it) once the DBA sign-off is done. Schema change is a different story.

WHAT? :D I didn't realize we could do that, but now that I read the docs again, it seemed pretty obvious... That said:

  • I don't have 'deployment' access, but I am in the 'restricted' group and can access mwmaint, run scripts, etc. So maybe the documentation could be reworded a bit to not mention deployments
  • I know how to run maintenance scripts and queries in prod from mwmain, having done that several times in the past, but I've never actually modified the schema. I may perhaps use a little guidance. I think it might also be a great idea to do that together with @cmelo and @MHorsey-WMF, so that we all learn more about the process (and there'd be more brains that could react to incidents).

Also if you want to get the views, I suggest creating a patch in puppet: https://github.com/wikimedia/puppet/blob/production/modules/profile/templates/wmcs/db/wikireplicas/maintain-views.yaml

Sure, I can do that.

you technically can create the tables if you have access to mwmaint, That shouldn't be an issue (and we generally encourage it) once the DBA sign-off is done. Schema change is a different story.

WHAT? :D I didn't realize we could do that, but now that I read the docs again, it seemed pretty obvious... That said:

  • I don't have 'deployment' access, but I am in the 'restricted' group and can access mwmaint, run scripts, etc. So maybe the documentation could be reworded a bit to not mention deployments
  • I know how to run maintenance scripts and queries in prod from mwmain, having done that several times in the past, but I've never actually modified the schema. I may perhaps use a little guidance. I think it might also be a great idea to do that together with @cmelo and @MHorsey-WMF, so that we all learn more about the process (and there'd be more brains that could react to incidents).

Also if you want to get the views, I suggest creating a patch in puppet: https://github.com/wikimedia/puppet/blob/production/modules/profile/templates/wmcs/db/wikireplicas/maintain-views.yaml

Sure, I can do that.

Yes, that will be great, we need to share this knowledge as much as possible, thanks @Ladsgroup, @Daimona

Also if you want to get the views, I suggest creating a patch in puppet: https://github.com/wikimedia/puppet/blob/production/modules/profile/templates/wmcs/db/wikireplicas/maintain-views.yaml

Sure, I can do that.

@Ladsgroup on second thought, I realized I've also never written replication rules for things not in the core DBs. Does it make any difference that this will be on x1? Or should I just update maintain-views and filtered_tables as normal?

We don't have replication from x1 to the cloud: https://orchestrator.wikimedia.org/web/cluster/alias/x1

We can set it up but it'll take time. How important the replication to the cloud is?

We don't have replication from x1 to the cloud: https://orchestrator.wikimedia.org/web/cluster/alias/x1

We can set it up but it'll take time. How important the replication to the cloud is?

i.e. let's do it if there is an actual need.

We don't have replication from x1 to the cloud: https://orchestrator.wikimedia.org/web/cluster/alias/x1

We can set it up but it'll take time. How important the replication to the cloud is?

Aha, got it. I don't think we discussed this as a team, which also means this is probably not important, and that there's no need to do it now. CC'ing @vyuen and @cmelo for a definitive answer, though.

Does this also mean that we can go ahead and create the new tables?

Probably. Let's double check with Manuel tomorrow.

We don't have replication from x1 to the cloud: https://orchestrator.wikimedia.org/web/cluster/alias/x1

We can set it up but it'll take time. How important the replication to the cloud is?

Aha, got it. I don't think we discussed this as a team, which also means this is probably not important, and that there's no need to do it now. CC'ing @vyuen and @cmelo for a definitive answer, though.

Does this also mean that we can go ahead and create the new tables?

Thanks, @Daimona and @Ladsgroup, and yes, I think we don't need it for now.

I realized I've also never written replication rules for things not in the core DBs.

If I may, it wouldn't hurt to add filtering rules to tables, in case in the future x1 tables are replicated to cloud. AKA, the answer to Should this table be replicated to wiki replicas (does it not contain private data)? -> should inform the updates to https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/role/files/mariadb/filtered_tables.txt?blame=off to prevent future accidents, specially if there are known private columns, even if it is a noop now. I can even do it myself if you indicate the list of tables and colums and how public they are.

I realized I've also never written replication rules for things not in the core DBs.

If I may, it wouldn't hurt to add filtering rules to tables, in case in the future x1 tables are replicated to cloud. AKA, the answer to Should this table be replicated to wiki replicas (does it not contain private data)? -> should inform the updates to https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/role/files/mariadb/filtered_tables.txt?blame=off to prevent future accidents, specially if there are known private columns, even if it is a noop now. I can even do it myself if you indicate the list of tables and colums and how public they are.

Yep, that is what I meant at T318595#8338100
Even if they won't arrive to clouddb hosts (as x1 isn't present) we must add them to either that or even realm.pp

It is not yet clear to me which tables are private, apart from what was mentioned at https://phabricator.wikimedia.org/P35370 but those are columns for https://gerrit.wikimedia.org/g/mediawiki/extensions/CampaignEvents/+/4bb97dda96a9a77d00280282f96fabbf4d759bae/db_patches/mysql/tables-generated.sql

@Daimona which tables at https://gerrit.wikimedia.org/g/mediawiki/extensions/CampaignEvents/+/4bb97dda96a9a77d00280282f96fabbf4d759bae/db_patches/mysql/tables-generated.sql should not be replicated at all?

Change 849029 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] mariadb: Add production-side filters for CampaignEvents extension tables

https://gerrit.wikimedia.org/r/849029

I realized I've also never written replication rules for things not in the core DBs.

If I may, it wouldn't hurt to add filtering rules to tables, in case in the future x1 tables are replicated to cloud.

Ah, in that case, sure! Just to explain my thoughts: I was under the impression that cloud replication is currently unavailable for x1, AND that whenever it becomes available, tables would get added to it incrementally (and optionally), and that the replication rules could be written at that point, if someone actually wants their tables to be replicated.

I can even do it myself if you indicate the list of tables and colums and how public they are.

It is not yet clear to me which tables are private, apart from what was mentioned at https://phabricator.wikimedia.org/P35370

@Daimona which tables at https://gerrit.wikimedia.org/g/mediawiki/extensions/CampaignEvents/+/4bb97dda96a9a77d00280282f96fabbf4d759bae/db_patches/mysql/tables-generated.sql should not be replicated at all?

P35370 actually answers those questions: no table should be completely private, and rules for individual columns are in the paste. Every column not specifically mentioned in the paste should be public.

I realized I've also never written replication rules for things not in the core DBs.

If I may, it wouldn't hurt to add filtering rules to tables, in case in the future x1 tables are replicated to cloud.

Ah, in that case, sure! Just to explain my thoughts: I was under the impression that cloud replication is currently unavailable for x1, AND that whenever it becomes available, tables would get added to it incrementally (and optionally), and that the replication rules could be written at that point, if someone actually wants their tables to be replicated.

Your assumption is totally correct, but we've been trying to do this along with tables creation requests, as it is easier to track them slowly than all at once where we could forget things :)

I can even do it myself if you indicate the list of tables and colums and how public they are.

It is not yet clear to me which tables are private, apart from what was mentioned at https://phabricator.wikimedia.org/P35370

@Daimona which tables at https://gerrit.wikimedia.org/g/mediawiki/extensions/CampaignEvents/+/4bb97dda96a9a77d00280282f96fabbf4d759bae/db_patches/mysql/tables-generated.sql should not be replicated at all?

P35370 actually answers those questions: no table should be completely private, and rules for individual columns are in the paste. Every column not specifically mentioned in the paste should be public.

Excellent, thank you!

Change 849029 merged by Marostegui:

[operations/puppet@production] mariadb: Add production-side filters for CampaignEvents extension tables

https://gerrit.wikimedia.org/r/849029

I have merged Jaime's patch, so we should be good. However, if we ever replicate x1 and we need to create the views for them to be queried, they should follow these rules

1==Field-level rules==
2===campaign_events===
3All fields should be public except for:
4- event_chat_url
5- event_meeting_url
6
7===ce_participants===
8All fields should be public
9
10===ce_organizers===
11All fields should be public
12
13===ce_address===
14All fields should be public
15
16===ce_event_address===
17All fields should be public
18
19
20==Additional rules==
21===campaign_events===
22- A row should be completely hidden if its 'event_deleted_at' is NOT NULL.
23
24===ce_participants===
25- A row should be completely hidden if its 'cep_private' is NOT NULL.
26- A row should be completely hidden if its 'cep_unregistered_at' is NOT NULL.
27
28===ce_organizers===
29- A row should be completely hidden if its 'ceo_deleted_at' is NOT NULL.
30
31===ce_address===
32None
33
34===ce_event_address===
35None

Your assumption is totally correct, but we've been trying to do this along with tables creation requests, as it is easier to track them slowly than all at once where we could forget things :)

FWIW, I agree :)

I have merged Jaime's patch, so we should be good. However, if we ever replicate x1 and we need to create the views for them to be queried, they should follow these rules

Cool, so I guess we are really unblocked now?

From all the filtering side of things you should be good to go!

Hi @Ladsgroup, @Marostegui, @jcrespo! After talking with the other engineers, we would have the following questions for you:

  • Do we need to add these tables to createExtensionTables.php and addWiki.php? Since we're only rolling out the extension to a limited number of wikis, with potentially different configurations, and given some open questions we have about central vs local db (T318381), I'm not sure if it's a good idea to add these tables to automatic scripts.
  • Is the checklist also valid for tables on x1, or do we need to do something differently?
    • In particular, I think if those tables are added to createExtensionTables.php, it may choose the right cluster and DB for us, but again, I'm not sure if we should add the tables there.

Thank you!

Sorry for late response.

Hi @Ladsgroup, @Marostegui, @jcrespo! After talking with the other engineers, we would have the following questions for you:

  • Do we need to add these tables to createExtensionTables.php and addWiki.php? Since we're only rolling out the extension to a limited number of wikis, with potentially different configurations, and given some open questions we have about central vs local db (T318381), I'm not sure if it's a good idea to add these tables to automatic scripts.

Yes. Makes sense. Let's not add it.

  • Is the checklist also valid for tables on x1, or do we need to do something differently?
    • In particular, I think if those tables are added to createExtensionTables.php, it may choose the right cluster and DB for us, but again, I'm not sure if we should add the tables there.

For x1 deployment, it shouldn't make a difference. If you're not sure where to put the table yet, I think it's okay to skip this step for now and later add it once the team comes to an agreement.

Mentioned in SAL (#wikimedia-operations) [2022-11-02T16:12:37Z] <Daimona> Creating schema for the CampaignEvents extension on testwiki, test2wiki and officewiki # T318595

Daimona claimed this task.
Daimona updated the task description. (Show Details)

This was done!