Page MenuHomePhabricator
Paste P35379

CampaignEvents database queries estimation
ActivePublic

Authored by Daimona on Oct 7 2022, 3:48 PM.
What follows is an estimation, done by the Campaigns team, of the number of queries that could be expected for the tables added by the CampaignEvents extension. Note that these are overestimates, and we don't really expect these numbers, especially not immediately after the release.
==Monthly event data==
Based on data provided by @ifried, we estimate:
- 500 events per month
- 300 participants per event
- 5 organizers per event
==Cost of each operation that involves DB access==
This is the number of queries (on average) of each operation that can happen in the CampaignEvents extension. We may have forgotten something, but since these are overestimates, it shouldn't really matter.
- W: create registration:
-- 1 campaign_events, 0-1 ce_event_address, 0-1 ce_address, 1 ce_organizers → 1 + 0.5 + 0.5 + 1 = 3
- W: update registration:
-- 1 campaign_events, 0-1 ce_event_address, 0-1 ce_address -> 1 + 0.5 + 0.5 = 2
- W: disable registration
-- 1 campaign_events -> 1
- W: participant registers
-- 1 ce_participants -> 1
- W: participant unregisters
-- 1 ce_participants -> 1
- W: participants removed
-- 1 ce_participants -> 1
- W: participant changes visibility
-- 1 ce_participants -> 1
- W: organizer added
-- 1 ce_organizers -> 1
- W: organizer removed
-- 1 ce_organizers -> 1
- R: participant opens Special:EventDetails
-- 1 campaign_events, 0-1 ce_event_address, 0-1 ce_address, 2 ce_participants (assuming that participants will always stop after the initial batch + count), 1 ce_organizers -> 1 + 0.5 + 0.5 + 2 + 1 = 5
- R: organizer opens Special:EventDetails
-- 1 campaign_events, 0-1 ce_event_address, 0-1 ce_address, (num_participants / batch_size) ce_participants (assuming they will always scroll to bottom), 1 ce_organizers -> 1 + 0.5 + 0.5 + num_part / 20 + 1 = 3 + num_part / 20
- R: user opens Special:MyEvents and interacts with it
-- Not accounting for interaction: 1 campaign_events+ce_participants+ce_organizers -> 1
- R: someone opens event page
-- 1 campaign_events, 0-1 ce_event_address, 0-1 ce_address, 2 ce_organizers, 2 ce_participants -> 1 + 0.5 + 0.5 + 2 + 2 = 6
- R: organizer opens Special:EditEventRegistration
-- 1 campaign_events, 0-1 ce_event_address, 0-1 ce_address -> 1 + 0.5 + 0.5 = 2
- W+R: timezone script runs!
-- R: num_rows / batch_size on campaign_events = num_rows/500
-- W: ignore: not significant (only happens when timezone rules change, and for events in that timezone)
==Total queries per month==
These were obtained by combining the two sections above, plus some additional assumptions.
Total costs per month:
- W: create registration
-- Events_per_month * 3 = 500 * 3 = 1500
- W: update registration
-- assuming each event is modified once after creation, on average: events_per_month * 1 * 2 = 500 * 2 = 1000
- W: disable registration
-- Assuming it will happen for 5% of events, so events_per_month * 0.05 = 500 * 0.05 = 25
- W: participant registers
-- Participants_per_event * events_per_month * 1 = 300 * 500 = 150000
- W: participant unregisters
-- Assuming 10% of participants will unregister (overestimate): participants_per_event * events_per_month * 1/10 * 1 = 300 * 500 / 10 = 15000
- W: participants removed
-- Assuming that for each event, the organizer will remove 1+ users 5 times: events_per_month * 5 * 1 = 500 * 5 = 2500
- W: participant changes visibility
-- Assuming that 10% of participants will do it: events_per_month * participants_per_event * 1/10 * 1 = 300 * 500 / 10 = 15000
- W: organizer added
-- Assuming that it’ll happen 5 times per event: events_per_month * 5 * 1 = 500 * 5 = 2500
- W: organizer removed
-- Assuming that it’ll happen 5 times per event: events_per_month * 5 * 1 = 500 * 5 = 2500
- R: participant opens Special:EventDetails
-- Assuming that each participant will open the page 10 times (and the number of queries assumes that they’ll always stop after the initial batch): events_per_month * participants_per_event * 10 * 5 = 500 * 300 * 10 * 5 = 7500000
- R: organizer opens Special:EventDetails
-- Assuming that each organizer will open the page 20 times (and the number of queries assumes that they’ll always scroll to the bottom of the list of participants): events_per_month * organizers_per_event * 20 * (3 + num_part / 20) = 500 * 5 * 20 * ( 3 + 300 / 20 ) = 900000
- R: user opens Special:MyEvents and interacts with it
-- Assuming that users will open the page twice a day, i.e. 60 times per month, and each time they’ll “interact” (change page, change sorting, filter) 5 times with it. Also assuming that the number of people who have reasons to open that page is gonna be 2000. 2000 * 60 * 5 * 1 = 600000
- R: someone opens event page
-- Assuming each event page will be visited by three times the number of average participants to each event. Events_per_month * participants_per_event * 3 * 6 = 500 * 300 * 3 * 6 = 2700000
- R: organizer opens Special:EditEventRegistration
-- As above, assuming each event is modified once after creation, on average: events_per_month * 1 * 2 = 500 * 2 = 1000
- W+R: timezone script runs!
-- Assuming that we will set it to run daily, so 30 times per month. It scans the whole table though, so the cost increases each month. We can estimate the reads as events_per_month * months_elapsed / 500 = months_elapsed
==Queries per table==
This is the number of queries, as obtained in the previous section, but per-table rather than per-action. The reason why this wasn't done before is that at first I didn't realize the question was per-table.
- campaign_events: 500 + 500 + 25 + 1500000 + 50000 + 600000 + 450000 + 500 + months_elapsed = 2600000 cca
- ce_address: 250 + 250 + 750000 + 25000 + 225000 + 250 = 1000000 cca
- ce_event_address: assumed same as ce_address: 1000000
- ce_organizers: 500 + 2500 + 2500 + 1500000 + 50000 + 600000 + 900000 = 3100000 cca
- ce_participants: 150000 + 15000 + 2500 + 15000 + 3000000 + 750000 + 600000 + 900000 = 5500000 cca