
Performance review of CampaignEvents extension
Closed, Resolved · Public

Description


We are creating an on-wiki registration system for campaigns in the Wikimedia movement (e.g. Wiki Loves Monuments, WikiGap, #1Lib1Ref, etc.). For this iteration, we are focusing on the campaign organisers' experience to create/edit registration, as well as for participants' experience to sign up for campaigns.

User flows are linked here: one for campaign organisers, and one for campaign participants.

Additional references:

Preview environment
Which code to review

The code repository is hosted on: https://gerrit.wikimedia.org/g/mediawiki/extensions/CampaignEvents

Performance assessment

Please initiate the performance assessment by answering the below:

What work has been done to ensure the best possible performance of the feature?

We tried to adhere to the performance best practices, making use of the abstraction layers provided by MediaWiki. We evaluated each feature in edge case scenarios (e.g., what happens if the number of X records is one million times the expected number of average records), to ensure that it was built in a scalable way. We use pagination or infinite scrolling when presenting potentially-"unlimited" lists of records. We made sure to use DeferredUpdates where applicable (right now the only example is handling the PageMoveCompleteHook). We consulted DBAs to make sure that our queries could use indexes and remain fast (T308738); the schema was tested with a big random dataset to make sure that reality reflected expectations. We tried bundling client-side features in fewer modules, to avoid bloating the RL registry.

What are likely to be the weak areas (e.g. bottlenecks) of the code in terms of performance?

The extension will be configured to use a central database (on x1) for Wikimedia production. I'm not sure if this would be slower than just using the wiki database.

We currently do not have many features, and consequently not many possible bottlenecks. I would say the main weak area right now is database reads, especially when many records are potentially scanned/retrieved at the same time (e.g. Special:MyEvents).

Also, the fact that our extension is "global" (i.e., uses a central DB, works with central user accounts and references pages cross-wiki) means that some operations could be slower than normal. For instance, Special:EventDetails has an input for filtering a list of users by name. Since we only store the central user IDs, we need to run an unfiltered query on the whole table, then convert IDs to names with CentralAuth and filter the list on the PHP side. Another example is that we cannot preload event pages en masse on Special:MyEvents, because LinkBatch does not support cross-wiki page references (so we only use it for local pages). Looking up a central user account happens in many places throughout the codebase.
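The filter-after-fetch pattern described above can be sketched as follows. This is an illustration in Python, not the extension's PHP; `lookup_names` stands in for the CentralAuth ID-to-name lookup and is a hypothetical function.

```python
# Because only central user IDs are stored, name filtering cannot happen
# in SQL; it has to run after an ID->name lookup, on the application side.

def filter_participants_by_name(central_ids, lookup_names, name_prefix):
    """Return (id, name) pairs whose name starts with name_prefix."""
    # Step 1: the unfiltered DB fetch already happened; we hold all IDs.
    # Step 2: batch-convert IDs to names (one round trip, not N).
    names = lookup_names(central_ids)  # dict: central_id -> name
    # Step 3: filter on the application side.
    prefix = name_prefix.lower()
    return [
        (cid, name)
        for cid, name in names.items()
        if name.lower().startswith(prefix)
    ]

# Example with a stubbed lookup:
stub = {1: "Alice", 2: "Bob", 3: "alina"}
result = filter_participants_by_name(
    [1, 2, 3], lambda ids: {i: stub[i] for i in ids}, "al")
```

The cost profile is visible here: the DB query is unbounded by the filter, so the work grows with the table size rather than the result size.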

Are there potential optimisations that haven't been performed yet?

One thing that we didn't think about (yet) is caching DB records. We use neither in-object caching nor object cache. We haven't looked into which places would benefit from this, and what the appropriate cache type would be for each of them.
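For concreteness, the kind of caching the paragraph says is missing usually follows a get-with-set pattern. In MediaWiki this would typically be `WANObjectCache::getWithSetCallback`; the sketch below is a simplified, dependency-free analogue with a TTL, not the extension's code.

```python
import time

class TtlCache:
    """Minimal in-process cache with per-entry expiry."""

    def __init__(self):
        self._store = {}  # key -> (expires_at, value)

    def get_with_set(self, key, ttl, compute):
        """Return the cached value for key, recomputing when absent or expired."""
        now = time.monotonic()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = compute()
        self._store[key] = (now + ttl, value)
        return value

calls = []
def fetch_event_from_db():
    calls.append(1)  # count actual "DB" hits
    return {"id": 7, "name": "WikiGap"}

cache = TtlCache()
first = cache.get_with_set("event:7", 60, fetch_event_from_db)
second = cache.get_with_set("event:7", 60, fetch_event_from_db)  # cache hit
```

The open design question flagged above is exactly which keys deserve this treatment and what TTL/invalidation strategy each needs.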

Also, this is not really an optimization per se, but we don't have any performance measurement in place. We could maybe use some help in determining which operations would need to be measured.

Please list which performance measurements are in place for the feature and/or what you've measured ad-hoc so far. If you are unsure what to measure, ask the Performance Team for advice: performance-team@wikimedia.org.

As explained above, no measurements are in place. The only thing we have measured, informally, is the impact of adding the proposed indexes; this was tested on a random dataset, see T308738.

Event Timeline

vyuen changed the task status from Open to Stalled. Mar 2 2022, 9:16 AM
vyuen moved this task from Backlog to V0 on the Campaign-Registration board.

@vyuen: Assuming this task is about the CampaignEvents code project, hence adding that project tag so other people who don't know or don't care about team tags can also find this task when searching via projects. Please set appropriate project tags when possible. Thanks!

Krinkle subscribed.

Thanks for filing this and letting us know about it. We'll plan this in accordingly.

Feel free to ask any questions before and during development as well. Either here, through our team mail, in #wikimedia-perf on Libera Chat, or on a specific change during code review.

Krinkle triaged this task as Medium priority. Apr 11 2022, 5:55 PM

Hi @Krinkle, the project is now ready for a performance review! I have updated the task descriptions to include relevant info. Please let me know if I can help with clarifying! Thanks.

vyuen changed the task status from Stalled to Open. Sep 20 2022, 8:32 AM
Krinkle renamed this task from Performance review of CampaignEvents Registration to Performance review of CampaignEvents extension. Sep 26 2022, 5:22 PM

Hey @Krinkle , wanted to check in and see if 1) you had an estimated completion date in mind for this review; 2) any detail is missing from our side. Please let me know if a discussion with Campaigns engineers would be helpful at all. Thanks!

I'm looking at the backend code today.

Since I expect this extension to have low traffic and modest table sizes, there isn't a lot of risk. The REST APIs are fairly narrow as well. The hook handlers only apply to Event pages and modules.

Here are my notes:

  • Why are event_chat_url, event_tracking_tool_event_id, event_meeting_url BLOBs instead of BINARY and why is the max size so high (especially event_tracking_tool_event_id)?
  • It seems like events only have one address, but there are separate ce_address and ce_event_address tables. Is this supposed to change in the future? A lot of these updates would be simplified if it's 1:1.
  • updateStoredAddresses() uses a DELETE/INSERT pattern that's prone to deadlocks. One workaround is to use GET_LOCK, non-locking SELECT, and INSERT/UPDATE based on the SELECT, but it could be left as is since it will be called rarely.
  • ce_address should at least have something like a cea_sha1 field that acquireAddressID() could make use of (with FOR UPDATE or INSERT IGNORE if the field is UNIQUE).
  • onPageDeleteComplete and onPageMoveComplete should load data from DB_PRIMARY, not DB_REPLICA
  • The deleteJoin() in updateStoredAddresses() is strange. The table with full addresses is shared between events, so if two events use the same address, the DELETE seems problematic.
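One deadlock-narrowing alternative to the DELETE-all/INSERT-all pattern flagged above is a differential update that only touches rows that actually changed, shrinking the locked row set. This sketch uses Python with sqlite3 purely for illustration (the extension is PHP on MySQL); the table and column names mirror the extension's, but the schema here is an assumption, not copied from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ce_event_address (ceea_event INT, ceea_address INT)")
conn.executemany("INSERT INTO ce_event_address VALUES (?, ?)", [(1, 10), (1, 11)])

def update_stored_addresses(conn, event_id, new_address_ids):
    """Sync the bridge table to new_address_ids, touching only changed rows."""
    current = {row[0] for row in conn.execute(
        "SELECT ceea_address FROM ce_event_address WHERE ceea_event = ?",
        (event_id,))}
    desired = set(new_address_ids)
    # Delete only rows that are no longer wanted.
    for stale in current - desired:
        conn.execute(
            "DELETE FROM ce_event_address WHERE ceea_event = ? AND ceea_address = ?",
            (event_id, stale))
    # Insert only rows that are missing.
    conn.executemany("INSERT INTO ce_event_address VALUES (?, ?)",
                     [(event_id, a) for a in desired - current])
    conn.commit()

update_stored_addresses(conn, 1, [11, 12])  # drop 10, keep 11, add 12
rows = sorted(r[0] for r in conn.execute(
    "SELECT ceea_address FROM ce_event_address WHERE ceea_event = 1"))
```

In the common case (no change), this issues zero writes, which also sidesteps the gap-lock interleavings that make DELETE/INSERT pairs deadlock-prone under MySQL.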

Also:

  • CampaignsDatabaseHelper seems like it can be removed, using the core ConnectionManager class (we have too many extensions with duplicative wrappers)
  • MWDatabaseProxy seems like it can be removed in favor of direct use of IDatabase methods (we have too many extensions with paper-thin wrappers around LoadBalancer/Database methods)
  • Would be nice to define getGroupName() and i18n for the entries at Special:SpecialPages

It would be nice to track the request rate and execution time of the various REST entrypoints in statsd. That metric data really should just be sent by MediaWiki core like it is for api.php (and graphed at https://grafana-rw.wikimedia.org/d/000000559/api-requests-breakdown?orgId=1). I filed T321969.
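The kind of per-entrypoint tracking suggested above boils down to a counter and a timer per route, in the spirit of statsd. This is a hedged, self-contained sketch; the metric name is made up for illustration, and in production MediaWiki core would emit the real metrics rather than extension code.

```python
import time
from collections import defaultdict

# In production these would be statsd counters/timers, not in-process dicts.
metrics = {"count": defaultdict(int), "timing_ms": defaultdict(list)}

def timed_endpoint(name):
    """Decorator recording call count and wall-clock duration per endpoint."""
    def wrap(fn):
        def inner(*args, **kwargs):
            start = time.monotonic()
            try:
                return fn(*args, **kwargs)
            finally:
                metrics["count"][name] += 1
                metrics["timing_ms"][name].append(
                    (time.monotonic() - start) * 1000)
        return inner
    return wrap

@timed_endpoint("campaignevents.rest.get_event")
def get_event(event_id):
    return {"id": event_id}

get_event(7)
get_event(8)
```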

Since I expect this extension to have low traffic and modest table sizes, there isn't a lot of risk. The REST APIs are fairly narrow as well. The hook handlers only apply to Event pages and modules.

Thanks for the review! I can explain some of the choices we made, and that you pointed out. In fact, I really want to apologize if these weren't clear, and I believe they should be documented somewhere...

  • Why are event_chat_url, event_tracking_tool_event_id, event_meeting_url BLOBs instead of BINARY and why is the max size so high (especially event_tracking_tool_event_id)?

I wasn't sure what data type and size to use here, so I just went with a BLOB. Is there a recommended data type for URLs? As for event_tracking_tool_event_id, the possible values for that field depend on the tracking tool, so we wanted something that will probably work for most (or all) tools.

  • It seems like events only have one address, but there are separate ce_address and ce_event_address tables. Is this supposed to change in the future? A lot of these updates would be simplified if it's 1:1.

Yes, we plan to have a many:many relation in the future, see T321811. It really used to be a 1:1 relation, but we changed that in T316128 to make the future change easier.

  • ce_address should at least have something like a cea_sha1 field that acquireAddressID() could make use of (with FOR UPDATE or INSERT IGNORE if the field is UNIQUE).

We want to add geocoding support in the near future using Pelias, see T316126. The details are still TBD, but our idea is to add more fields to the ce_address table containing structured data about an address. One of these fields would be a unique address ID; we hope to get that from Pelias, or we may use a hash instead as you suggested. So yeah, it's definitely on our radar.
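The hash-based dedup idea discussed here can be sketched briefly. This uses Python with sqlite3 for illustration only (`INSERT OR IGNORE` is sqlite's analogue of MySQL's `INSERT IGNORE`); the column names are assumptions, not the extension's actual schema.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE ce_address (
    cea_id INTEGER PRIMARY KEY,
    cea_full_address TEXT,
    cea_sha1 TEXT UNIQUE)""")

def acquire_address_id(conn, full_address):
    """Insert the address if new, then return its ID; idempotent via cea_sha1."""
    digest = hashlib.sha1(full_address.encode("utf-8")).hexdigest()
    # A duplicate cea_sha1 makes the insert a no-op instead of an error,
    # so concurrent callers converge on one row without a prior SELECT.
    conn.execute(
        "INSERT OR IGNORE INTO ce_address (cea_full_address, cea_sha1) "
        "VALUES (?, ?)",
        (full_address, digest))
    return conn.execute(
        "SELECT cea_id FROM ce_address WHERE cea_sha1 = ?",
        (digest,)).fetchone()[0]

first = acquire_address_id(conn, "Piazza del Duomo, Milan")
again = acquire_address_id(conn, "Piazza del Duomo, Milan")
```

If a stable external ID from the geocoder (e.g. Pelias) materialises, it could play the same role as the hash, as long as it carries a UNIQUE constraint.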

  • The deleteJoin() in updateStoredAddresses() is strange. The table with full addresses is shared between events, so if two events use the same address, the DELETE seems problematic.

The DELETE only happens on the ce_event_address table (first argument), though, no? Records in ce_address are never deleted; only records in the bridge table are deleted.
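The point can be shown concretely. Below is a quick illustration in Python with sqlite3 (schema assumed, not copied from the extension): deleting a row from the ce_event_address bridge table detaches one event from an address while the shared ce_address row survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ce_address (cea_id INTEGER PRIMARY KEY, cea_full_address TEXT)")
conn.execute("CREATE TABLE ce_event_address (ceea_event INT, ceea_address INT)")
conn.execute("INSERT INTO ce_address VALUES (10, 'Shared address')")
# Two events reference the same address through the bridge table.
conn.executemany("INSERT INTO ce_event_address VALUES (?, ?)", [(1, 10), (2, 10)])

# Event 1 drops its address: only its bridge row goes away.
conn.execute("DELETE FROM ce_event_address WHERE ceea_event = 1")

addresses_left = conn.execute("SELECT COUNT(*) FROM ce_address").fetchone()[0]
links_left = conn.execute("SELECT COUNT(*) FROM ce_event_address").fetchone()[0]
```

Event 2's link and the address row itself are untouched, which is why the DELETE is safe here.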

  • CampaignsDatabaseHelper seems like it can be removed, using the core ConnectionManager class (we have too many extensions with duplicative wrappers)
  • MWDatabaseProxy seems like it can be removed in favor of direct use of IDatabase methods (we have too many extensions with paper-thin wrappers around LoadBalancer/Database methods)

The context for these is that we originally considered having the backend code outside MW (so not as a MW extension). For several reasons, we eventually decided to implement it inside MediaWiki, but with a thin abstraction layer around several MW classes (like UserIdentity, Authority, PageIdentity, Database, etc.). We did that in the hope that moving the code outside MW in the future (if we decide to do so) would be easier, if the extension is not too tightly coupled with MediaWiki. This is not perfect, of course, and those classes are in some sort of limbo until we make a decision on that. The whole MWEntity directory serves this purpose.

@aaron thanks so much for your review! Looks like we're OK to close this? Please give a shout if there are followup tasks we want to capture elsewhere.

Re-opening per @Daimona's request; there are still some open questions - apologies!

Hi @aaron @Krinkle , thanks for your support with this so far. The Campaigns team is seeking some clarity around our 'definition of done' for this review. Looks like there are some things that need to be fixed on our side (for which @Daimona has asked for clarification above). Can those issues be followup tasks? Or do they need to be fixed before we can consider the review closed?

Given @aaron's note in T302858#8354617,

Since I expect this extension to have low traffic and modest table sizes, there isn't a lot of risk.

... it seems like we can consider this review closed, and open up separate tasks for the improvements? Please let us know what you would recommend.

We have enabled the CampaignEvents extension on testwiki, test2wiki, and office-wiki, and we are planning to release to meta-wiki the week of November 28.

I see us using BLOB for other URL fields and this is a small table, so it's fine (though varbinary can go pretty high, depending on the size of other such columns on the table).

The deleteJoin() is fine btw, I misread how we implemented it. I need to update the comments to clarify how it works.

I'd strongly suggest removing any needless DB layers, though that doesn't have to happen right now.

Re-opening per @Daimona's request; there are still some open questions - apologies!

Replied. No blockers.