Ensure that all tables have all the necessary indexes
Closed, ResolvedPublic3 Estimated Story Points
Actions

Description

AC

Tables should have all the indexes recommended by Amir: T308738#8103421
Those indexes should also be created (manually) on beta

NOTE: Since we're not doing this in a backwards-compatible way, you will have to create the index manually on your local once the patch is merged

Details

	Subject	Repo	Branch	Lines +/-
	Add indexes to the schema	mediawiki/extensions/CampaignEvents	master	+62 -2

Customize query in gerrit

Related Objects

Mentioned In: T350947: Performance review of Extension:WikimediaCampaignEvents
T318595: Create database schema for the CampaignEvents extension on testwiki, test2wiki, and officewiki
T318593: Review database schema of the CampaignEvents extension
T302858: Performance review of CampaignEvents extension
T316943: Request access to deployment-prep
T316765: investigation: preliminary performance assessment
rUCAM14a487e04cd0: Add indexes to the schema
Mentioned Here: T311126: Implement strategy to account for DST variations
T312870: Add timestamps to ce_organizers table [V1]
T308694: Sorting in TablePager doesn't work for aggregates
T312229: Make the search on Special:MyEvents case-insensitive

Event Timeline

Daimona created this task.May 19 2022, 10:29 AM

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMay 19 2022, 10:29 AM

Discussed: this should come after we have created all our special pages (probably a simple task, but we don't have enough information). Essentially, all V0 features should be in place.

ldelench_wmf moved this task from Backlog to To be Estimated/Discussed on the Campaign-Tools board.Jun 30 2022, 3:48 PM

ldelench_wmf moved this task from Backlog to V0 on the Campaign-Registration board.

ldelench_wmf triaged this task as High priority.Jun 30 2022, 4:31 PM

ldelench_wmf set the point value for this task to 3.

ldelench_wmf edited projects, added Campaign-Tools (Campaign-Tools-Sprint-16); removed Campaign-Tools.

I think the indexes we already have are fine, I just would add more 3:

event_meeting_country_and_meeting_type
- Would be used when querying events by country and meeting type (Can be useful for the event calendar in the future).
event_meeting_country
- Would be used when querying events by country (Can be useful for the event calendar in the future).
event_status
- Used when querying events by status.

cmelo changed the task status from Open to In Progress.Jul 15 2022, 5:45 PM

cmelo claimed this task.

cmelo moved this task from Ready 🎬 to In Progress 💻 on the Campaign-Tools (Campaign-Tools-Sprint-16) board.

cmelo moved this task from In Progress 💻 to Review/Feedback 💬 on the Campaign-Tools (Campaign-Tools-Sprint-16) board.Jul 15 2022, 5:56 PM

In T308738#8081199, @cmelo wrote:

I think the indexes we already have are fine, I just would add more 3:

event_meeting_country_and_meeting_type

Would be used when querying events by country and meeting type (Can be useful for the event calendar in the future).

event_meeting_country

Would be used when querying events by country (Can be useful for the event calendar in the future).

The thing about the country is that we're likely to change the schema anyway for V1, so I'm not sure if we need to account for a future use case now. Also, the second index would be redundant because the country is already the leftmost column of the first index.

event_status

Used when querying events by status.

Slightly unsure about this one, the column can only have 2 values right now (and even in future versions its cardinality is going to remain very low), so I'm not sure if an index would help much.

I'm creating a prod-like set of tables with thousands of events locally and will run some queries on it to see what indexes would make a difference to the optimizer. I'll work on it in the weekend and I expect it'll take some time to come back with the results.

Ladsgroup subscribed.Jul 16 2022, 3:00 PM

I played with the dataset a little bit, and here are my conclusions. This is not an extensive analysis, but I felt like it wasn't worth investing more time into it. I also think we shouldn't consider this a blocker for the beta deployment for the following reasons:

We don't expect huge datasets on beta, especially not for the first days after the deployment.
Schema updates will have to be run manually anyway because we don't use the local wiki database, so we can still ignore backwards compatibility-related concerns (e.g., writing actual schema change files). This is the reason why I wanted to do this before the deployment, before realizing that it's actually a non-concern.

However, we should still address this as soon as possible. @cmelo, I asked @Ladsgroup if he could give us his recommendations on the schema, and he kindly agreed to meet with us in the upcoming two weeks. Amir, below you can find information about the schema, the queries, and the questions I have. This comment is long, I know it and I apologize for it. I tried to divide it in multiple sections, so that you can skip whatever you believe is not relevant to you. Thanks in advance for your help and for reading this wall of text.

Current schema

See the db_patches directory. See autogenerated schema for MySQL. The purpose of the tables is fairly self-explanatory:

campaign_events contains information about an event (name, event page, meeting links etc.). Note that currently we can only have an event per page, and this won't be changed anytime soon if at all.
ce_participants contains a row for each (event,user) pair. cep_event_id references campaign_events.event_id (PK)
ce_organizers contains a row for each (event,user,role) 3-uple. For now there's only a single role, so you can ignore that part, but in the future the same user could have more than one role in organizing a given event. ceo_event_id references campaign_events.event_id (PK)

We currently don't have many indexes, except for primary keys and some uniqueness constraints.

Test dataset

As I wrote in T308738#8081669, I wanted to create a large dataset locally with production-like data so that I could play with it and see which changes were worth implementing. If you want the code that I used to generate the data, please ask for it, but note that I'll have to rewrite it first because I made some mistakes that had to be later accounted for in the code. Here are the relevant specifications given the simple use cases we have so far:

The campaign_events table has a total of 102000 event rows
The ce_organizers table has one row for each event (so 102000 rows), and I tried a few different distributions:
- Only two different ceo_user_id, divided as 60/40
- An average of 6 events by organizer, generated with ceo_user_id = mod(ceo_id, 20000)
For the ce_participants table, I added an average of ~20 participants for each event, using an N(20, 100) distribution, rounded and capped at 0.

Schema access in the code

Each table is managed by (almost) a single class:

campaign_events: EventStore.php
ce_participants: ParticipantsStore.php
ce_organizers: OrganizersStore.php

I said "almost" because we also have EventsPager.php which queries all tables directly. The (current) purpose of EventsPager is to list the events organized by a given user. Also, EventStore needs to query ce_participants and ce_organizers as well.

Current queries

This is a list of queries for the current version of the extension. We will definitely add more in the future, but those should not be optimized now.

Get an event by its unique ID. This is a simple PK lookup and can be ignored.
Get an event by page:

SELECT * FROM campaign_events WHERE event_page_namespace=X AND event_page_title=Y AND event_page_wiki=Z

Get all the events organized by a given user:

SELECT * FROM campaign_events JOIN ce_organizers ON (event_id=ceo_event_id) WHERE ceo_user_id=X

Get all events that a user is actively participating in:

SELECT * FROM campaign_events JOIN ce_participants ON (event_id=cep_event_id) WHERE cep_user_id=X AND cep_unregistered_at IS NULL

Add a participant to an event. This can either insert a new row for a new participant (no need to optimize), or update cep_unregistered_at or cep_registered_at for an existing but soft-deleted record. This is done via IDatabase::upsert, so the actual query will vary depending on the DBMS. For MySQL, this is:

INSERT INTO ce_participants (cep_event_id,cep_user_id,cep_registered_at,cep_unregistered_at) VALUES (X,Y,Z,NULL) ON DUPLICATE KEY UPDATE cep_unregistered_at = NULL,cep_registered_at = Z

Remove one or more participants from an event:

-- Single participant
UPDATE ce_participants SET cep_unregistered_at = X WHERE cep_event_id = Y AND cep_unregistered_at IS NULL AND cep_user_id = Z
-- Multiple participants
UPDATE ce_participants SET cep_unregistered_at = X WHERE cep_event_id = Y AND cep_unregistered_at IS NULL AND cep_user_id IN ( x_1, ..., x_n )

Get the participants of an event; supports basic pagination, parts in square brackets are optional:

SELECT cep_id, cep_user_id, cep_registered_at FROM ce_participants WHERE cep_event_id = X AND cep_unregistered_at IS NULL [ AND (cep_id > Y) ] ORDER BY cep_id LIMIT Z

Check whether a user participates in an event

SELECT * FROM ce_participants WHERE cep_event_id = X AND cep_user_id = Y AND cep_unregistered_at IS NULL LIMIT 1

Count participants of an event

SELECT COUNT(*) FROM ce_participants WHERE cep_event_id = X AND cep_unregistered_at IS NULL LIMIT 1

Get the organizers of an event; normally used with LIMIT:

SELECT ceo_user_id,ceo_role_id FROM ce_organizers WHERE ceo_event_id = X LIMIT Y

Check whether a user organizes an event

SELECT * FROM ce_organizers WHERE ceo_event_id = X AND ceo_user_id = Y LIMIT 1

Count organizers of an event

SELECT COUNT(*) FROM ce_organizers WHERE ceo_event_id = X LIMIT 1

Adding a new organizer is a simple insert and can be ignored.

Then we have EventsPager, used on Special:MyEvents to display a list of event that you organize. The events are sortable by start date, name, and number of participants. You can also filter them by status. The basic query is:

SELECT event_id,event_name,event_page_namespace,event_page_title,event_page_prefixedtext,event_page_wiki,event_status,event_start,event_meeting_type,num_participants FROM (SELECT event_id,event_name,event_page_namespace,event_page_title,event_page_prefixedtext,event_page_wiki,event_status,event_start,event_meeting_type,COUNT(cep_id) AS `num_participants` FROM `campaign_events` LEFT JOIN `ce_participants` ON ((event_id=cep_event_id)) LEFT JOIN `ce_organizers` ON ((event_id=ceo_event_id)) WHERE event_deleted_at IS NULL AND cep_unregistered_at IS NULL AND ceo_user_id = X GROUP BY cep_event_id,event_id,event_name,event_page_namespace,event_page_title,event_page_prefixedtext,event_page_wiki,event_status,event_start,event_meeting_type ) `tmp` ORDER BY Y LIMIT Z

We use a temporary table because otherwise we could not sort by the number of participants, due to limitations in the core pager logic and for postgres support (see T308694). Also, note that the query above is valid under ONLY_FULL_GROUP_BY in MySQL < 5.7.5, where it still didn't detect functional dependency, which explains why all fields are appearing in the GROUP BY even if they're redundant.

The list can be filtered by name and status; in both cases we change the WHERE condition of the inner query;

Filtering by event name is done with event_name LIKE '%abc%'. I know, this sucks and cannot use any indexes. But it will be improved as part of T312229.
Filtering by event status is a simple event_status = X. Currently, the possible values are 1 and 2.

The possible orderings are:

By start date, default: ORDER BY event_start,event_name,event_id
By name: ORDER BY event_name,event_start,event_id
By number of participants: ORDER BY num_participants,event_start,event_id

Potential performance issues

Below I'm listing the potential issues I noticed with my questions. @Ladsgroup I'd appreciate if you could answer those.

The first issue I noticed is when selecting events organized by a given user (either as a simple list, or the paginated version for Special:MyEvents). The current index is ceo_event_id, ceo_user_id, ceo_role_id, so it cannot be used for the JOIN (because ceo_user_id is needed first). As such, the whole campaign_events table is scanned, and then filtered later. This could be resolved by indexing the ceo_user_id column to result in a better query plan where the join condition uses the index and cuts down the number of rows. However, I think we will still need both indexes, because ceo_event_id can be used without ceo_user_id when retrieving the count/list of organizers of a specific event.

Getting all the events that a user participates in shows similar issues and would benefit from having an index on the user ID. Again though, we also need cep_event_id to be indexed, as it's used e.g when counting participants.

The "get the participants of an event" one seems good, but it filesorts because the PK index is not used (it uses the one on event+user instead). I'm not sure if it's worth adding the ID to a composite index, since for now only low LIMITs can be used.

Now for the pager one. First of all, this query would also benefit from the index on ceo_user_id proposed above. Then, the inner query has Using temporary; Using filesort for ce_organizers. I'm sure this is caused by the GROUP BY but I'm not sure how to fix that. The outer query (on the derived table) also filesorts, but I'm not sure if we can avoid that. Overall, this query is noticeably slow if an organizer has many events (which is maybe not very realistic, but still worth noting). I don't think adding indexes to campaign_events for the where condition would make any difference, because ce_organizer is scanned first and only the corresponding rows in campaign_events are returned. Also, I was wondering at T308738#8081669 if an index on event_status would help at all, since the cardinality is just 2. As for indexes on the ORDER BY columns: since it's the derived table which is ordered, I'm not sure if they would help. I think MySQL should be able to use indexes from inner tables in some situations, but this does not seem to be the case here.

As a final doubt/question: the current schema forces us to use joins + group by + temp tables in the pager query for the number of participants. I was wondering if it would make sense to denormalize the value, adding a campaign_events.event_num_participants field that is updated whenever the ce_participants table is updated. This would greatly simplify the query, but I'm not sure if it's really worth doing.

Adding indexes requires not just the schema and queries but also data and sometimes optimizer bugs and so on. At some point it basically turns into an art than actual engineering/architecture work but I doubt we would get there.

We also have the space/speed tradeoff. We are rather okay in space in x1 so we can err on side of more indexes.

Here is rule of thumb:
This should be the order of importance for index (and the ones that must get on the leftmost part of the index):

columns used to join
columns used for ordering. This is important in our infra as I've seen mariadb basically throwing away everything and sticking to the index that's using order by columns
columns used for conditions
- columns that have higher selectivity take precedence. e.g. Age takes precedence over gender.
- High Performance MySQL says you should swap the columns order in some special cases. When almost exclusively in all select queries you have condition on that column. E.g. gender can take precedence if you always condition it (vs. you might condition on age in some queries)
columns that are used in select fields.
- It's called "covering index" and speeds up the select but it's more of a nice to have (unless the pressure on read is quite high which is not the case here)

With these notes, let me take a look at your schema and queries. I saw you already have index on ceo_event_id which is great.

ce_organizers doesn't need more indexes for performance. it might be a good idea to add a unique index on ceo_event_id, ceo_user_id to avoid one user having multiple rows. I leave that to you.
ce_participants needs index on cep_event_id, cep_unregistered_at for the count measurement.
I'm generally not a big fan of upsert. It can easily start wasting auto increment values in updates. I suggest doing a query first and if it exists, just going with update. If not, then upsert (to handle rare race conditions).

I'm looking at campaign_events and its queries now.

Ladsgroup added a project: Data-Persistence (work done).Jul 19 2022, 11:33 AM

On joining to campaign_events table, you definitely need to add these indexes (I know it's opposite of the join column but join order is important, these will be the first tables in the join order so they should have the join on where condition):

ceo_user_id, ceo_event_id on ce_organizers
cep_user_id, cep_unregistered_at, cep_event_id on ce_participants

Add index on event_id, event_deleted_at on campaign_events and I think you'd be good to go.

ldelench_wmf edited projects, added Campaign-Tools (Campaign-Tools-Sprint-17); removed Campaign-Tools (Campaign-Tools-Sprint-16).Jul 20 2022, 1:28 AM

ldelench_wmf moved this task from Ready 🎬 to Review/Feedback 💬 on the Campaign-Tools (Campaign-Tools-Sprint-17) board.Jul 20 2022, 1:33 AM

Amir, thanks for your detailed replies. Sorry if I didn't get back to you sooner, but with the release happening last week I had so many things on my mind.

In T308738#8087224, @Ladsgroup wrote:

ce_organizers doesn't need more indexes for performance. it might be a good idea to add a unique index on ceo_event_id, ceo_user_id to avoid one user having multiple rows. I leave that to you.

Right now we already have a unique index on ceo_event_id, ceo_user_id, ceo_role_id (in the future, each user could have multiple roles for the same event).

I'm generally not a big fan of upsert. It can easily start wasting auto increment values in updates. I suggest doing a query first and if it exists, just going with update. If not, then upsert (to handle rare race conditions).

Yup, we're already doing that, see here. Now that I think about it, this is also something I wanted to ask you: right now we're doing that in an atomic section + a FOR UPDATE lock. I think the lock should be enough and the atomic section wouldn't be necessary though, am I right?

Summing up, here are the final indexes to have (including those that already exist) if I understood your comments:

-- On campaign_events
PRIMARY KEY(event_id),
UNIQUE INDEX event_page ( event_page_wiki, event_page_namespace, event_page_title ),
INDEX event_id_deleted (event_id, event_deleted_at)

-- On ce_organizers
PRIMARY KEY(ceo_id),
UNIQUE INDEX ceo_event_user_role ( ceo_event_id, ceo_user_id, ceo_role_id ), -- Used when retrieving organizers for a single event
INDEX ceo_user_event ( ceo_user_id, ceo_event_id ) -- Used for JOINs

-- On ce_participants
PRIMARY KEY(cep_id),
UNIQUE INDEX cep_event_participant (cep_event_id, cep_user_id), -- Uniqueness constraint
INDEX cep_event_unregistered ( cep_event_id, cep_unregistered_at ), -- For COUNT
INDEX cep_user_unregistered_event ( cep_user_id, cep_unregistered_at, cep_event_id ) -- For JOIN

Is this correct? I'm also wondering if we could avoid one of those indexes on ce_participants.

Thank you again!

Sorry it took me a bit to respond, I was out sick (and half-time for a while due to sickness)

In T308738#8103421, @Daimona wrote:

Yup, we're already doing that, see here. Now that I think about it, this is also something I wanted to ask you: right now we're doing that in an atomic section + a FOR UPDATE lock. I think the lock should be enough and the atomic section wouldn't be necessary though, am I right?

I doubt we would ever hit race conditions like that. Still you can do a read with a replica for getting the id (to avoid insert) and then do a read with FOR UPDATE if needed. The thing is that specially if you select ranges or if the id doesn't exist, it might lock way more rows (e.g. gap lock) than you intend to. So I'm not sure you'd FOR UPDATE first, try replica (which would catch most cases I assume), then master with FOR UPDATE

Summing up, here are the final indexes to have (including those that already exist) if I understood your comments:

-- On campaign_events
PRIMARY KEY(event_id),
UNIQUE INDEX event_page ( event_page_wiki, event_page_namespace, event_page_title ),
INDEX event_id_deleted (event_id, event_deleted_at)

-- On ce_organizers
PRIMARY KEY(ceo_id),
UNIQUE INDEX ceo_event_user_role ( ceo_event_id, ceo_user_id, ceo_role_id ), -- Used when retrieving organizers for a single event
INDEX ceo_user_event ( ceo_user_id, ceo_event_id ) -- Used for JOINs

-- On ce_participants
PRIMARY KEY(cep_id),
UNIQUE INDEX cep_event_participant (cep_event_id, cep_user_id), -- Uniqueness constraint
INDEX cep_event_unregistered ( cep_event_id, cep_unregistered_at ), -- For COUNT
INDEX cep_user_unregistered_event ( cep_user_id, cep_unregistered_at, cep_event_id ) -- For JOIN

Is this correct? I'm also wondering if we could avoid one of those indexes on ce_participants.

Yes, it looks good. If you are worried about storage (which I'm not much given that it's going to be on x1), you can avoid adding cep_event_unregistered and for counts, it would scan all rows of an event which should be just slightly more than the rows it would scan with the index.

HTH

ldelench_wmf edited projects, added Campaign-Tools (Campaign-Tools-Current-Sprint); removed Campaign-Tools (Campaign-Tools-Sprint-17).Aug 14 2022, 9:30 PM

ldelench_wmf moved this task from Needs Input / Design in progress to Code Review on the Campaign-Tools (Campaign-Tools-Current-Sprint) board.Aug 14 2022, 9:37 PM

Thanks Amir, this definitely helps! @cmelo Would you like to implement the suggestions above, or should I?

• vyuen moved this task from V0 to Darkship on the Campaign-Registration board.Aug 24 2022, 11:27 AM

• vyuen reassigned this task from cmelo to Daimona.Aug 25 2022, 1:15 PM

• vyuen changed the task status from In Progress to Open.Aug 25 2022, 5:04 PM

Daimona updated the task description. (Show Details)Aug 26 2022, 2:07 PM

Change 826877 had a related patch set uploaded (by Daimona Eaytoy; author: Daimona Eaytoy):

[mediawiki/extensions/CampaignEvents@master] Add indexes to the schema

https://gerrit.wikimedia.org/r/826877

gerritbot added a project: Patch-For-Review.Aug 26 2022, 2:30 PM

Daimona moved this task from Needs Input / Design in progress to Code Review on the Campaign-Tools (Campaign-Tools-Current-Sprint) board.Aug 26 2022, 2:30 PM

I've implemented the suggestion above, and can confirm that the queries are really fast on the test dataset.

• vyuen changed the task status from Open to In Progress.Aug 30 2022, 12:13 PM

Change 826877 merged by jenkins-bot:

[mediawiki/extensions/CampaignEvents@master] Add indexes to the schema

https://gerrit.wikimedia.org/r/826877

Daimona mentioned this in rUCAM14a487e04cd0: Add indexes to the schema.Aug 31 2022, 3:54 AM

ReleaseTaggerBot added a project: MW-1.39-notes (1.39.0-wmf.28; 2022-09-05).Aug 31 2022, 4:00 AM

Maintenance_bot removed a project: Patch-For-Review.Aug 31 2022, 4:30 AM

Commands that I will run on beta:

ALTER TABLE campaign_events ADD INDEX event_id_deleted (event_id, event_deleted_at);
ALTER TABLE ce_participants ADD INDEX cep_event_unregistered ( cep_event_id, cep_unregistered_at );
ALTER TABLE ce_participants ADD INDEX cep_user_unregistered_event ( cep_user_id, cep_unregistered_at, cep_event_id );
ALTER TABLE ce_organizers ADD INDEX ceo_user_event (ceo_user_id, ceo_event_id);

Mentioned in SAL (#wikimedia-releng) [2022-08-31T16:39:06Z] <Daimona> Applying schema change to the wikishared DB on beta for the CampaignEvents extension # T308738

Daimona moved this task from Code Review to QA 🐛 on the Campaign-Tools (Campaign-Tools-Current-Sprint) board.Aug 31 2022, 4:42 PM

Daimona mentioned this in T316765: investigation: preliminary performance assessment.Aug 31 2022, 5:23 PM

vaughnwalters mentioned this in T316943: Request access to deployment-prep.Sep 2 2022, 3:45 PM

• vyuen mentioned this in T302858: Performance review of CampaignEvents extension.Sep 6 2022, 1:03 PM

See notes at T312870#8222116

Index recommendations:

-- On campaign_events
PRIMARY KEY(event_id),
UNIQUE INDEX event_page ( event_page_wiki, event_page_namespace, event_page_title ),
INDEX event_id_deleted (event_id, event_deleted_at)

SHOW INDEX FROM campaign_events;

MariaDB [wikishared]> SHOW INDEX FROM campaign_events;
+-----------------+------------+-------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table           | Non_unique | Key_name          | Seq_in_index | Column_name          | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+-------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| campaign_events |          0 | PRIMARY           |            1 | event_id             | A         |         114 |     NULL | NULL   |      | BTREE      |         |               |
| campaign_events |          0 | event_page        |            1 | event_page_wiki      | A         |          57 |     NULL | NULL   |      | BTREE      |         |               |
| campaign_events |          0 | event_page        |            2 | event_page_namespace | A         |          57 |     NULL | NULL   |      | BTREE      |         |               |
| campaign_events |          0 | event_page        |            3 | event_page_title     | A         |         114 |     NULL | NULL   |      | BTREE      |         |               |
| campaign_events |          1 | event_id_deleted  |            1 | event_id             | A         |         114 |     NULL | NULL   |      | BTREE      |         |               |
| campaign_events |          1 | event_id_deleted  |            2 | event_deleted_at     | A         |         114 |     NULL | NULL   | YES  | BTREE      |         |               |
| campaign_events |          1 | event_timezone_id |            1 | event_timezone       | A         |           2 |     NULL | NULL   |      | BTREE      |         |               |
| campaign_events |          1 | event_timezone_id |            2 | event_id             | A         |         114 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------+------------+-------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Index recommendations:

-- On ce_organizers
PRIMARY KEY(ceo_id),
UNIQUE INDEX ceo_event_user_role ( ceo_event_id, ceo_user_id, ceo_role_id ), -- Used when retrieving organizers for a single event
INDEX ceo_user_event ( ceo_user_id, ceo_event_id ) -- Used for JOINs

SHOW INDEX FROM ce_organizers;

MariaDB [wikishared]> SHOW INDEX from ce_organizers;
+---------------+------------+---------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table         | Non_unique | Key_name            | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------------+------------+---------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| ce_organizers |          0 | PRIMARY             |            1 | ceo_id       | A         |         133 |     NULL | NULL   |      | BTREE      |         |               |
| ce_organizers |          0 | ceo_event_user_role |            1 | ceo_event_id | A         |         133 |     NULL | NULL   |      | BTREE      |         |               |
| ce_organizers |          0 | ceo_event_user_role |            2 | ceo_user_id  | A         |         133 |     NULL | NULL   |      | BTREE      |         |               |
| ce_organizers |          0 | ceo_event_user_role |            3 | ceo_role_id  | A         |         133 |     NULL | NULL   |      | BTREE      |         |               |
| ce_organizers |          1 | ceo_user_event      |            1 | ceo_user_id  | A         |         133 |     NULL | NULL   |      | BTREE      |         |               |
| ce_organizers |          1 | ceo_user_event      |            2 | ceo_event_id | A         |         133 |     NULL | NULL   |      | BTREE      |         |               |
+---------------+------------+---------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Index recommendations:

-- On ce_participants
PRIMARY KEY(cep_id),
UNIQUE INDEX cep_event_participant (cep_event_id, cep_user_id), -- Uniqueness constraint
INDEX cep_event_unregistered ( cep_event_id, cep_unregistered_at ), -- For COUNT
INDEX cep_user_unregistered_event ( cep_user_id, cep_unregistered_at, cep_event_id ) -- For JOIN

SHOW INDEX FROM ce_participants;

MariaDB [wikishared]> SHOW INDEX FROM ce_participants;
+-----------------+------------+-----------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table           | Non_unique | Key_name                    | Seq_in_index | Column_name         | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+-----------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| ce_participants |          0 | PRIMARY                     |            1 | cep_id              | A         |         276 |     NULL | NULL   |      | BTREE      |         |               |
| ce_participants |          0 | cep_event_participant       |            1 | cep_event_id        | A         |         276 |     NULL | NULL   |      | BTREE      |         |               |
| ce_participants |          0 | cep_event_participant       |            2 | cep_user_id         | A         |         276 |     NULL | NULL   |      | BTREE      |         |               |
| ce_participants |          1 | cep_event_unregistered      |            1 | cep_event_id        | A         |         276 |     NULL | NULL   |      | BTREE      |         |               |
| ce_participants |          1 | cep_event_unregistered      |            2 | cep_unregistered_at | A         |         276 |     NULL | NULL   | YES  | BTREE      |         |               |
| ce_participants |          1 | cep_user_unregistered_event |            1 | cep_user_id         | A         |         276 |     NULL | NULL   |      | BTREE      |         |               |
| ce_participants |          1 | cep_user_unregistered_event |            2 | cep_unregistered_at | A         |         276 |     NULL | NULL   | YES  | BTREE      |         |               |
| ce_participants |          1 | cep_user_unregistered_event |            3 | cep_event_id        | A         |         276 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------+------------+-----------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

AC passes as index recommendations match what are currently in the tables on betacluster. One note though, campaign_events table also has the event_timezone_id index with columns event_timezone and event_id which are not mentioned in the index recommendations AC. @Daimona is this acceptable and correct? Wanted to check on this before marking this ticket as done.

In T308738#8225467, @vaughnwalters wrote:

AC passes as index recommendations match what are currently in the tables on betacluster. One note though, campaign_events table also has the event_timezone_id index with columns event_timezone and event_id which are not mentioned in the index recommendations AC. @Daimona is this acceptable and correct? Wanted to check on this before marking this ticket as done.

Yes, I just added that index for T311126.

In T308738#8225504, @Daimona wrote:

In T308738#8225467, @vaughnwalters wrote:

AC passes as index recommendations match what are currently in the tables on betacluster. One note though, campaign_events table also has the event_timezone_id index with columns event_timezone and event_id which are not mentioned in the index recommendations AC. @Daimona is this acceptable and correct? Wanted to check on this before marking this ticket as done.

Yes, I just added that index for T311126.