Page MenuHomePhabricator

Create the ce_tracking_tools table in production
Closed, ResolvedPublic

Description

Once the database review (T318593) is completed, we would like that the extension tables be created in production.

Partial checklist (from mw:Creating_new_tables)

  • Create a Phabricator task to track the table creation
  • Design the table schema
  • Write the gerrit patch and get it merged
  • Request DBA signoff
    • See information below in the Details section.
    • Move the task to the Triage column on the DBA workboard.
  • Create the views with appropriate restrictions (T337377)

Details

  • Should this table be replicated to wiki replicas (does it not contain private data)?
    • The data is public, but in order to respect (soft) deletion of events, rows should not be visible if cett_event corresponds to a row in campaign_events with event_id=cett_event and event_deleted_at IS NOT NULL.
  • Will you be doing cross-joins with the wiki metadata?
    • No
  • Size of the table (number of rows expected).
    • Starts off as 0, then it'll grow as explained in the next section.
  • Expected growth per year (number of rows).
  • Expected amount of queries, both writes and reads (per minute, per hour...per day, any of those are ok).
    • Roughly the same as the existing ce_address table, see P35379.
  • Examples of queries that will be using the table.
    • SELECT * FROM ce_tracking_tools WHERE cett_event = 123 [FOR UPDATE]
    • SELECT * FROM ce_tracking_tools WHERE cett_event IN (123, 456, 789, ...)
    • DELETE FROM ce_tracking_tools WHERE cett_id IN (123, 456, 789, ...)
    • INSERT IGNORE INTO ce_tracking_tools $rows
  • The release plan for the feature (are there specific wikis you'd like to test first etc).
    • The extension is only enabled on testwiki, test2wiki, officewiki, and metawiki, and the schema should be created on these wikis.

Deployment

  • Make sure the patch which introduces the tables got merged, and confirm files were deployed to production (will be in 1.41.0-wmf.11)
  • Schedule table creation at a time that doesn't conflict with the deployment schedule; note the time & date here. -- scheduled for 2023-06-01 12:00 UTC
  • Create the tables for each wiki. Note that we don't have a standalone .sql file, so the SQL below should be run interactively for each wiki:
    • mwscript sql.php --wiki=testwiki --cluster=extension1
    • mwscript sql.php --wiki=test2wiki --cluster=extension1
    • mwscript sql.php --wiki=officewiki --cluster=extension1
    • mwscript sql.php --wiki=metawiki --cluster=extension1 --wikidb=wikishared
CREATE TABLE ce_tracking_tools (
  cett_id BIGINT UNSIGNED AUTO_INCREMENT NOT NULL,
  cett_event BIGINT UNSIGNED NOT NULL,
  cett_tool_id INT NOT NULL,
  cett_tool_event_id VARBINARY(512) NOT NULL,
  cett_sync_status INT NOT NULL,
  cett_last_sync BINARY(14) DEFAULT NULL,
  UNIQUE INDEX cett_event_tool_teid (
    cett_event, cett_tool_id, cett_tool_event_id
  ),
  PRIMARY KEY(cett_id)
);

Event Timeline

Remember that as DBAs we do not create the tables, as that can be done during the deployment window.
@Ladsgroup can you review the table and usage?

Also, we need to create a subtask for cloud-services-team as the views have restrictions (per the comment: The data is public, but in order to respect (soft) deletion of events, rows should not be visible if cett_event corresponds to a row in campaign_events with event_id=cett_event and event_deleted_at IS NOT NULL )

Remember that as DBAs we do not create the tables, as that can be done during the deployment window.

Yup, thanks for the reminder, the Campaigns team will take care of this once we have the green light.

Also, we need to create a subtask for cloud-services-team as the views have restrictions (per the comment: The data is public, but in order to respect (soft) deletion of events, rows should not be visible if cett_event corresponds to a row in campaign_events with event_id=cett_event and event_deleted_at IS NOT NULL )

Right, and also: these tables are on x1, and IIRC, replication from x1 doesn't work yet (although we'd still want to write the replication rules to be future-proof).

Remember that as DBAs we do not create the tables, as that can be done during the deployment window.
@Ladsgroup can you review the table and usage?

I looked at it and looks okay to me.

Also, we need to create a subtask for cloud-services-team as the views have restrictions (per the comment: The data is public, but in order to respect (soft) deletion of events, rows should not be visible if cett_event corresponds to a row in campaign_events with event_id=cett_event and event_deleted_at IS NOT NULL )

I think it's data engineering now? But also I suggest simply making it a private table and when we start replication from x1 and community wants it, we can partially enable it.

Remember that as DBAs we do not create the tables, as that can be done during the deployment window.
@Ladsgroup can you review the table and usage?

I looked at it and looks okay to me.

Also, we need to create a subtask for cloud-services-team as the views have restrictions (per the comment: The data is public, but in order to respect (soft) deletion of events, rows should not be visible if cett_event corresponds to a row in campaign_events with event_id=cett_event and event_deleted_at IS NOT NULL )

I think it's data engineering now? But also I suggest simply making it a private table and when we start replication from x1 and community wants it, we can partially enable it.

That's not how private tables work on replication filtering. If we make it private it means we are placing a replication filter so the table won't be replicated at all. Which means, if we want to enable it at a later stage (if we ever do it for x1) we'd need to reimport it again, which is painful.

The code has been merged and I plan to create the new table at some point next week, once the code has reached meta. @Marostegui I'd be happy to create the task for cloud-services-team, is there a template I can use?

@Daimona there is no template as far as I know

Mentioned in SAL (#wikimedia-operations) [2023-06-01T12:03:23Z] <Daimona> Creating ce_tracking_tools table for the CampaignEvents extension on testwiki, test2wiki, officewiki, and metawiki # T336365

Daimona claimed this task.
Daimona updated the task description. (Show Details)

Tables created, and I verified that they look good in all of the 4 wikis.