Page MenuHomePhabricator

Create the tables for participant questions in prod
Closed, ResolvedPublic

Description

Acceptance criteria

  • The new schema defined in T335526 is created in production

Partial checklist (from mw:Creating_new_tables)

  • Create a Phabricator task to track the table creation
  • Design the table schema
  • Write the gerrit patch and get it merged
  • Request DBA signoff
    • See information below in the Details section.
    • Move the task to the Triage column on the DBA workboard.
  • Create the views with appropriate restrictions -- not needed, these tables will be in x1

Details

  • Should this table be replicated to wiki replicas (does it not contain private data)?
    • The ce_event_questions table is public, but in order to respect (soft) deletion of events, rows should not be visible if ceeq_question_id corresponds to a row in campaign_events with event_id=ceeq_question_id and event_deleted_at IS NOT NULL.
    • The other two tables (ce_question_answers and ce_question_aggregation) must be private because they will contain PII or information that is otherwise hidden to unprivileged users.
  • Will you be doing cross-joins with the wiki metadata?
    • No
  • Size of the table (number of rows expected).
    • They'd start off as 0, then they'll grow as explained in the next section.
  • Expected growth per year (number of rows).
    • According to P35379, some (over)estimates could be:
      • ce_event_questions: 5 * number_of_events, i.e., 30 000
      • ce_question_answers: Assuming that 3/5 of the participants will answer all questions, it would be 5*300*500*12 = 9 000 000. But 3/5 of the answers would be deleted after 90 days, so a better estimate would be 2*300*500*12 + 3*300*500*3 = 4 950 000
      • ce_question_aggregation: 5 * number_of_events, i.e., 30 000
  • Expected amount of queries, both writes and reads (per minute, per hour...per day, any of those are ok).
    • According to P35379, queries per month:
      • ce_event_questions:
        • read: every time the registration form is opened, events_per_month * 1 = 500
        • write: roughly the same as the reads, events_per_month * 1 = 500
      • ce_question_answers:
        • read: register or edit your registration, assuming 1 every 2 participants does this once: participants_per_month * 1.5 * 2 queries = 500 * 300 * 1.5 * 2 = 450 000; aggregate data: 500 * 300 = 150 000.
        • write: register or edit your registration, assuming 1 every 3 participants will change something: participants_per_month * 1/3 * 2 queries = 100 000
      • ce_question_aggregation:
        • read: every time organizer opens stats: assuming they'll do so 20 times per event: events_per_month * 20 = 500 * 20 = 10 000
        • write: every time participant answers are aggregated: roughly 500 * 300 = 150 000.
  • Examples of queries that will be using the table.
  • The release plan for the feature (are there specific wikis you'd like to test first etc).
    • The extension is only enabled on testwiki, test2wiki, officewiki, and metawiki, and the schema should be created on these wikis.

Deployment

  • Make sure the patch which introduces the tables got merged, and confirm files were deployed to production (will be in 1.41.0-wmf.15)
  • Schedule table creation at a time that doesn't conflict with the deployment schedule; note the time & date here.
  • Log the table creation in the SAL (#wikimedia-operations): !log Creating new DB tables for the CampaignEvents extension in x1.testwiki, x1.test2wiki, x1.officewiki, and x1.wikishared # T340000
  • Create the tables for each wiki. Note that we don't have a standalone .sql file, so the SQL below should be run interactively for each wiki:
    • mwscript sql.php --wiki=testwiki --cluster=extension1
    • mwscript sql.php --wiki=test2wiki --cluster=extension1
    • mwscript sql.php --wiki=officewiki --cluster=extension1
    • mwscript sql.php --wiki=metawiki --cluster=extension1 --wikidb=wikishared
CREATE TABLE /*_*/ce_event_questions (
  ceeq_id BIGINT UNSIGNED AUTO_INCREMENT NOT NULL,
  ceeq_event_id BIGINT UNSIGNED NOT NULL,
  ceeq_question_id INT UNSIGNED NOT NULL,
  UNIQUE INDEX ceeq_event_question (ceeq_event_id, ceeq_question_id),
  PRIMARY KEY(ceeq_id)
) /*$wgDBTableOptions*/;


CREATE TABLE /*_*/ce_question_answers (
  ceqa_id BIGINT UNSIGNED AUTO_INCREMENT NOT NULL,
  ceqa_event_id BIGINT UNSIGNED NOT NULL,
  ceqa_user_id INT UNSIGNED NOT NULL,
  ceqa_question_id INT UNSIGNED NOT NULL,
  ceqa_answer_option INT UNSIGNED DEFAULT NULL,
  ceqa_answer_text BLOB DEFAULT NULL,
  UNIQUE INDEX ceqa_event_user_question (
    ceqa_event_id, ceqa_user_id, ceqa_question_id
  ),
  PRIMARY KEY(ceqa_id)
) /*$wgDBTableOptions*/;


CREATE TABLE /*_*/ce_question_aggregation (
  ceqag_id BIGINT UNSIGNED AUTO_INCREMENT NOT NULL,
  ceqag_event_id BIGINT UNSIGNED NOT NULL,
  ceqag_question_id INT UNSIGNED NOT NULL,
  ceqag_answer_option INT UNSIGNED NOT NULL,
  ceqag_answers_amount INT UNSIGNED NOT NULL,
  UNIQUE INDEX ceqag_event_question (
    ceqag_event_id, ceqag_question_id
  ),
  PRIMARY KEY(ceqag_id)
) /*$wgDBTableOptions*/;

Event Timeline

The table is small enough not to need through review IMO (specially given that it's in x1)

The table is small enough not to need through review IMO (specially given that it's in x1)

Thanks! We will create the tables when the patch is merged and deployed, then.

Mentioned in SAL (#wikimedia-operations) [2023-06-29T15:06:26Z] <Daimona> Creating new DB tables for the CampaignEvents extension in x1.testwiki, x1.test2wiki, x1.officewiki, and x1.wikishared # T340000