
Investigation: Can we display event registration actions in event page history?
Open, Low, Public

Description

As a Campaigns team member, I want to know what technical options are available to display event registration actions in the event page history, so that Wikimedians can all easily watch and monitor event activity in the way that they do other actions related to wiki pages.

Background: Right now, an event page can be created and monitored like any other wiki page. This means that its creation and any actions performed on the event page are displayed in the page history, and users can watch these changes via their watchlist or RecentChanges. However, actions that are tied to event registration (such as the organizer enabling or disabling registration, or a participant registering for an event) are not displayed in the page history, so users currently cannot see such changes in their watchlist or RecentChanges. We would like to improve this experience, so that there are easier ways for users to see changes related to event activity. One solution is to implement logging in Special:Log, which we have already written tickets for (see log when registration enabled: T321018, log when registration disabled: T318160, and log when someone registers publicly for an event: T321019). However, we want to see if it is also possible to add this information into the page history itself, since this could be especially helpful to users.

Acceptance Criteria:

  • Investigate if/how we can display event registration actions in event page history
  • Event registration actions include:
    • Organizer enables registration on an event page
    • Organizer edits registration information (such as date, time, location)
    • Organizer or admin disables registration
    • Participant publicly registers for event (maybe not)
    • Participant cancels their public registration (maybe not)
    • Organizer removes public registrant from an event (maybe not)
    • (in the future) Organizer or admin restores registration
  • Share findings that include:
    • Any potential solutions (the "how") for how we can display event registration actions in event page history
    • Any potential risks, blockers, or dependencies to flag
    • If you have any general recommendations or commentary based on your findings

Event Timeline

ifried updated the task description.
ifried updated the task description.

As anticipated, the option I investigated for this task is MCR. I think it could provide some of the features we're looking for; maybe not all of them, but the remaining features could be implemented via logging or something similar. Note: some of the documents linked below are private. There isn't much I can do about it unfortunately, but I copied all the relevant parts here.

Glossary

Before anything else, I think it's important to include a glossary for some concepts that will be discussed below.

  1. Multi-Content Revisions (in short, MCR): a technology which allows wiki pages to have more than one section of content (called slots); for instance, normal pages only have a wikitext slot, but with MCR, a single page could have a wikitext slot and a JSON slot. See documentation
  2. Slot: an entity that contains part of the content of a page. Each page can have multiple slots, though most pages only have one.
  3. Content model: defines the "type" of the slot content. This could be wikitext, plaintext, JSON, JavaScript, etc. See the documentation
  4. Event page: a normal wikipage that contains information about an event. Presently, these are normal pages with a single wikitext slot.
  5. Registration form: the form that lets you create or edit a registration, currently on Special:EnableEventRegistration and Special:EditEventRegistration respectively.

Previous work

This is not the first time the Campaigns team has investigated MCR as a potential approach. In the very early stages of development, we considered building the registration form on the event page. The event data would be stored in a JSON-like slot alongside the main wikitext slot, and both slots would be editable in action=edit. The final decision, recorded here, was not to do this, because the "editing interface" part of MCR does not exist yet: editing multiple slots at the same time is not supported TTBOMK, and EditPage is what it is, making it very hard to customize the editing interface the way we wanted to. As such, since this option would require a huge refactoring, we decided to discard it.

We then wondered if it would be possible to rescope the original proposal in the following way: use the backend/storage part of MCR, but leave the registration form on a special page, instead of trying to integrate it with EditPage. However, we didn't investigate this fully, and eventually other things took priority and this was never discussed again.

Current proposal

We would define a new content model that stores the event data as JSON. The structure would be pretty simple, something like:

{
   "is_online": true,
   "meeting_url": "https://....",
   "address": "..."
   ...
}

We would also define a new slot, that uses this content model to store event data alongside the wikitext slot. The editing interface for this data, i.e. the registration form, would remain in a special page as it is now.
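
As a rough illustration of the data shape above, here is a minimal Python sketch that models the slot content as a small typed record that round-trips through JSON. This is a conceptual sketch, not MediaWiki code: the class name `EventData` and its method names are hypothetical, and only the three field names come from the example above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class EventData:
    """Hypothetical in-memory form of the event data stored in the extra slot."""

    is_online: bool
    meeting_url: Optional[str] = None
    address: Optional[str] = None

    def serialize(self) -> str:
        # Sorted keys keep the stored text stable, so identical data
        # always produces an identical blob.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def unserialize(cls, blob: str) -> "EventData":
        raw = json.loads(blob)
        if not isinstance(raw, dict) or not isinstance(raw.get("is_online"), bool):
            raise ValueError("invalid event data")
        unknown = set(raw) - {"is_online", "meeting_url", "address"}
        if unknown:
            # Reject fields this version of the model does not know about.
            raise ValueError(f"unknown fields: {sorted(unknown)}")
        return cls(**raw)
```

Rejecting unknown fields is only one possible policy; a real content model would also need a plan for reading older revisions after the schema changes.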

Is MCR good for this use case?

This is one of the questions we had. We met with @daniel a couple times to discuss this, while we were still actively considering the MCR approach. Those meetings gave us a better understanding of MCR, and below I'm copying the relevant notes. I've skipped things related to the editing interface, since it's no longer in scope, and added some new considerations.

Jan 31, 2022 and Feb 2, 2022

  • Things like (un)deleting, moving, importing/exporting, (un)protecting pages etc. would work out of the box with MCR.
  • Daniel shared this checklist that answers precisely the question in the section title. I think this is a great resource, and we should probably go through the checklist and see how much our answers align with MCR.
  • Potential limitations of MCR:
    • RecentChanges, history, and watchlist cannot be filtered by slot. This means that edits to the main slot (wikitext, corresponding with the event page as it is now) would be listed together with edits to the registration data itself (secondary slot). This can be mitigated (autosummary, tags), but in general it's tricky.
    • Layout of the page: the content from the extra slot can only be displayed at the bottom, or hidden. There are plans to change this in the future. However, I don't think this would be an issue: we can probably just hide the slot itself and continue displaying the registration data like we're already doing.
  • We should write the code for displaying a diff between two versions of an event registration. We can take a look at the diff/diff composer package and how Wikibase uses it.
  • It's possible to access data from the extra slot in Lua (Scribunto), but there might be complications in the details. I think this is very appealing, because someone could write a template that pulls the data from the extra slot and puts it inside the wikitext, in case people want to repeat the information there.
  • Even if we represent our structured data as JSON, we should not use JsonContent. That class is only meant for JSON content that users should be able to edit manually. Instead, subclass AbstractContent and implement the tiny bits of JSON Parsing that we need. See T275976 and r674572.
  • Potential issue: implementing different permission levels for each slot (e.g., everyone can edit the event page, but only organizer can edit event data). This would have to be investigated more carefully.
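
To make the diff item in the notes above more concrete, here is a minimal Python sketch of a field-level comparison between two versions of the event data. It does not use the diff/diff package or any MediaWiki class; the function name and the dictionary representation are assumptions made for illustration.

```python
from typing import Any, Dict, Tuple

# (old value, new value). None stands for "absent", so this sketch does not
# distinguish a missing field from a field explicitly set to null.
Change = Tuple[Any, Any]


def diff_event_data(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Change]:
    """Return only the fields whose value differs between two versions."""
    changes: Dict[str, Change] = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes
```

A result like this could then be turned into whatever rows the diff view expects, with one row per changed field.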

April 26, 2022

  • MCR is a good candidate if we find ourselves re-implementing typical "wiki features" (history, blocks, watchlist, ...) from scratch.
  • However, it wouldn't be a good candidate if we don't want those features for event registrations.
  • Potential integration: partial blocks. If a user is blocked on the event page, they would not be able to register. Namespace-wide ban to ban someone from all events. (this doesn't really require MCR, but it reinforces the idea of event page and registration being closely related)
  • Hard to say how complex it would be to migrate to MCR once we already have existing event registrations.

Demos

Currently, the only usage of MCR in production is for structured data on Commons. You can see it in action by finding a file with structured data (example), and clicking the "structured data" label below the image. The data that you see there is stored in an extra slot on the page, as the API shows. Edits to that data can be found in the page history, see example diff.

I also adapted some code that I wrote months ago when first exploring MCR and prepared a demo of what this would look like for event pages. This is the patch, and here is a patchdemo instance where you can try it out. You can check out the history of the event I created, and create your own event and see how it works. Note that this is really just a POC, and many details of what you see are just for testing.

What it could (not) help us with

From the task description:

  • Organizer enables registration on an event page: This would appear in the page history as a normal page edit. Maybe we can explain that the registration was enabled in the auto-generated summary. We could also have a separate log on Special:Log.
  • Organizer edits registration information (such as date, time, location): That's the main idea behind using MCR: this information would appear in the page history, RecentChanges, watchlist, etc.
  • Organizer or admin disables registration: This depends on the implementation. I assume that when the registration is deleted, we would delete the extra slot. I'm not sure how this would appear in the page history; the MCR checklist says this is possible but "UI is lacking". Nonetheless, we could have a separate log on Special:Log for this (just like the deletion log).
  • Participant publicly registers for event: This is a huge question. It would be possible only if we decide to store the list of participants on the event page, alongside the event data. I'm not sure if it would be a good idea. It would make things more complex, and could potentially clutter the history. And this is without considering private participants, which would make this even more difficult given the view restrictions that I'll mention in a later section. At any rate, this could be logged in Special:Log.
  • Participant cancels their public registration: Same as the one above.
  • Organizer removes public registrant from an event: Idem.
  • (in the future) Organizer or admin restores registration: Similar to when registration is deleted, it depends on the implementation.
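
The auto-generated summary idea from the list above could look roughly like the following Python sketch. The function name and the summary strings are hypothetical, and it assumes that "no registration" is represented as `None`; it is not tied to any MediaWiki API.

```python
from typing import Any, Dict, Optional


def auto_summary(old: Optional[Dict[str, Any]], new: Optional[Dict[str, Any]]) -> str:
    """Build an edit summary describing what happened to the registration."""
    if old is None and new is None:
        raise ValueError("nothing to summarize")
    if old is None:
        return "Enabled event registration"
    if new is None:
        return "Disabled event registration"
    # List the changed fields so the history entry is readable without a diff.
    changed = sorted(k for k in set(old) | set(new) if old.get(k) != new.get(k))
    if not changed:
        return "No changes to event registration"
    return "Edited event registration: " + ", ".join(changed)
```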

Advantages of using MCR

Aside from what's requested in the task description, there could be some additional advantages in using MCR; some were already mentioned above, and I'm repeating them here.

  • Content of the extra slot could be accessed in Lua, and used in the main wikitext slot (templates etc.).
  • Organizers could enter a summary when making changes to an event registration, and that summary would appear in the edit history.
  • Once MCR supports displaying the slot content in places other than the bottom, we could streamline the code that renders the registration header.
    • At that point, it would be much easier to implement the "preview" functionality requested in T317690.
  • It's closer to Everything is a wiki page (still not quite there though)

Caveats

While MCR is great and worth exploring, there are a few things to keep in mind if we want to use it:

  • The extra slot is attached to a single page (i.e., the event page), not subpages, associated talk, etc. I'm not sure if this can be a problem, but I thought it was important to mention.
  • Data for the latest version of a registration would be stored twice: in the extension's tables, and in the revision store. This is not a huge problem, but still, it's redundant and one more thing to keep in mind. There are plans to fix this at T209044.
  • If we decide to change something in the event registration data, like removing "country" (T317579) or implementing multi-events (T321811), these changes would have to be made to the custom content model as well, and they should be backwards-compatible.

And finally, what I think is the biggest caveat: permission checks. First, we need to make sure that only organizers can edit the extra slot. I think this would be doable, as long as we make sure that the slot can only be edited through our code, and that our code performs the permission checks. Second, and most important: some of the data in the extra slot probably shouldn't be public. Private participants, if we choose to include them, would be an example. But even now, the meeting link and chat invite link should only be visible to participants. Since the extra slot is part of the page, this is roughly equivalent to asking that the content of the page should not be public. This is just not how MediaWiki works, see https://www.mediawiki.org/wiki/Manual:Preventing_access#Restrict_viewing. The content is generally public, and restricting specific parts of it is very hard; not just in the interface, but also API, search, dumps etc. Maybe we could avoid putting those fields in the extra slot, but this is also not ideal.

Open questions

These are some questions that I have, and for which I could not find a good answer:

  • Can we add some constraints, like:
    • Prevent direct editing of the extra slot, via interface or edit API. This seems doable with ContentHandler::supportsDirectEditing, but I want to double-check.
    • Make sure that the content model is not used outside of the extra slot in event pages.
    • Prevent changing the content model of the extra slot.
    • Make sure that the extra slot can only be added to event pages.
  • IIUC, it's possible to delete a single slot from a page. How would this work exactly, especially in the UI? Is there any caveat? Is this really fully supported?
  • What can we do about the caveats above, particularly visibility?
  • Some things I was wondering while writing the code for the demo:
    • I'm not entirely sure how some methods in Content should be implemented, things like getSize() and isCountable()
    • We're not really interested in displaying content from the extra slot at the moment, but fillParserOutput() wants some text from us?!
    • Huge mess in core's diff classes (DifferenceEngine, DiffFormatter): unsure what belongs to which class, and more generally, is there a way to reuse the table structure without stealing the code from there (like Wikibase already does)?
    • More generally, I'd appreciate it if someone more familiar with Content(Handler) could take a look at the patch once it's polished and finished, just to make sure that we're using the interface correctly.

Estimates and conclusions

I think MCR would bring some advantages, not just for the page history, but for potential future plans as well. However, there are some questions that need to be answered first. So I would propose the following two things as the next steps:

  1. Get answers to the "open questions" above
  2. Go through the checklist, maybe in a sync group conversation, and make sure that MCR fits our use case

These could be done in parallel, and the ideal outcome of these steps is that we can make an informed decision on whether to use MCR.

Finally, some estimates. Since we wouldn't be changing anything in the interface, this proposal would be easier to implement than the original idea where the registration form was on the event page. The patch I linked above is very experimental, but it's a working POC that took me just an hour or two to write. Many details are still to be implemented, but it shouldn't be hard. For me, the main source of complexity is that I don't know much about Content(Handler), so I had to experiment a bit. All in all, I would say the implementation wouldn't take too long, and it may be completed in a single sprint, assuming that all the blockers are cleared beforehand.

And also an idea about migrating later: we could have a maintenance script that checks every event page to see if it has the extra slot, and if not, it makes an edit to that page (using a dummy system user) to fill it with the current data. This would ensure consistency, and reinforce the assumption that all event pages have the extra slot.
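
The migration idea above can be sketched as follows. This is a minimal Python outline under stated assumptions, not a MediaWiki maintenance script: the function, the `save_slot` callback, and the system user name are all hypothetical stand-ins for whatever the real storage layer provides.

```python
from typing import Any, Callable, Dict, Iterable, List, Set


def backfill_event_slots(
    event_pages: Iterable[str],
    pages_with_slot: Set[str],
    current_data: Dict[str, Dict[str, Any]],
    save_slot: Callable[[str, Dict[str, Any], str], None],
) -> List[str]:
    """Give every event page that lacks the extra slot a copy of its current data."""
    migrated: List[str] = []
    for page in event_pages:
        if page in pages_with_slot:
            # Already migrated; skipping makes the script safe to re-run.
            continue
        # "Migration bot" is a placeholder name for the dummy system user.
        save_slot(page, current_data[page], "Migration bot")
        migrated.append(page)
    return migrated
```

Because pages that already have the slot are skipped, running the script a second time should produce no further edits.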

ldelench_wmf subscribed.

Discussing this on Dec 1 & will log decisions/next steps here.

Prevent direct editing of the extra slot, via interface or edit API. This seems doable with ContentHandler::supportsDirectEditing, but I want to double-check.

Yes, that should work.

Make sure that the content model is not used outside of the extra slot in event pages.
Prevent changing the content model of the extra slot.
Make sure that the extra slot can only be added to event pages.

This should be covered by SlotRoleHandler::isAllowedModel and ContentHandler::canBeUsedOn.

IIUC, it's possible to delete a single slot from a page. How would this work exactly, especially in the UI? Is there any caveat? Is this really fully supported?

It's supported in the storage layer. There is no generic UI or API for it at the moment.

I'm not entirely sure how some methods in Content should be implemented, things like getSize() and isCountable()

isCountable() is normally only used on the main slot. If you want the extra slot to be able to make the page "count" against the wiki's article count even when the main slot doesn't, then this would be relevant.

getSize() is tricky for structured data. The only use for it is really the indication of how "big" an edit is, i.e. how much the size changed. Wikibase uses return strlen( serialize( $this->getNativeData() ) ).
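
As a rough Python analogue of that approach, the sketch below measures the size of the serialized data and derives the change in size between two versions. It uses a compact JSON encoding rather than PHP's serialize(), which is an assumption of this sketch; the function names are hypothetical.

```python
import json
from typing import Any, Dict


def content_size(data: Dict[str, Any]) -> int:
    """Size in bytes of a compact, stable serialization of the data."""
    blob = json.dumps(data, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    # Count bytes rather than characters, like PHP's strlen() does.
    return len(blob.encode("utf-8"))


def size_delta(old: Dict[str, Any], new: Dict[str, Any]) -> int:
    """How much an edit grew (positive) or shrank (negative) the content."""
    return content_size(new) - content_size(old)
```

The absolute number matters less than the delta, since the size is mainly used to show how big an edit was.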

We're not really interested in displaying content from the extra slot at the moment, but fillParserOutput() wants some text from us?!

This is used when showing diffs, iirc. Just put pretty-printed json in <pre> tags, that should be ok for now.
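
In Python terms, that suggestion amounts to something like the sketch below. It only shows the idea of escaping pretty-printed JSON and wrapping it in a pre element; the function name is hypothetical and it does not model ParserOutput.

```python
import html
import json
from typing import Any, Dict


def render_event_data(data: Dict[str, Any]) -> str:
    """Return pretty-printed JSON wrapped in <pre>, safe to embed in HTML."""
    pretty = json.dumps(data, indent=4, sort_keys=True, ensure_ascii=False)
    # Escape the JSON text so user-provided values cannot inject markup.
    return "<pre>" + html.escape(pretty) + "</pre>"
```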

Huge mess in core's diff classes (DifferenceEngine, DiffFormatter): unsure what belongs to which class, and more generally, is there a way to reuse the table structure without stealing the code from there (like Wikibase already does)?

Write a DiffFormatter, you should have no need to touch DifferenceEngine. The table structure is a real mess. The output of the DiffFormatter has to match what DifferenceEngine expects, IIRC.

More generally, I'd appreciate it if someone more familiar with Content(Handler) could take a look at the patch once it's polished and finished, just to make sure that we're using the interface correctly.

Sure, just ping me!

  • The extra slot is attached to a single page (i.e., the event page), not subpages, associated talk, etc. I'm not sure if this can be a problem, but I thought it was important to mention.

We want to use subpages of a "sandbox" page to create our test events, wondering if this can be a blocker.
T323299

Also, wondering... Would it be kind of blocking users to be able to use subpages as event pages?

Subpages are just pages - whether they have the extra slot or not is up to you. They wouldn't share the slot with the parent page, they would have their own. Or none.

Thank you, your answers above are already useful! The team is going to discuss this proposal this week, and I'll let you know if we have more questions.

The reason why I mentioned this is because some events use subpages to organize information. If, in the future, we use MCR to display the registration header on the event page, then it wouldn't be possible to display the header on subpages. At least not natively, though I guess we could still do that by manually invoking the rendering logic.

Update: the team discussed this on Nov 30. We went through the checklist and had some brainstorming, and here are the relevant points:

  • According to the checklist, MCR seems reasonable for our use case. The only thing that would make things harder is read permissions, as noted in T322657#8389818. This is a big pain point that we're not quite sure how to address.
  • We would be storing the data twice, in the campaign_events table and in the revision store. This means that the data is duplicated, but on the bright side, it means that the content of the revision is essentially just a copy of the original data, and we may choose to not replicate some data if we don't want it to be publicly available.
  • Is it necessary/expected to respect page protection? If so, how to do that? PermissionManager::userCan?

@daniel We would love some feedback about our current proposal, to see if it would make sense to use MCR this way. And we would also like to discuss potential ways to resolve the issue with some data not being public. Would you be available for a conversation with us? Thanks in advance!

The reason why I mentioned this is because some events use subpages to organize information. If, in the future, we use MCR to display the registration header on the event page, then it wouldn't be possible to display the header on subpages. At least not natively, though I guess we could still do that by manually invoking the rendering logic.

Yes, correct. It wouldn't be part of the page content. It can still be part of how the page is displayed. You just have to be careful about caching.

Update: the team discussed this on Nov 30. We went through the checklist and had some brainstorming, and here are the relevant points:

  • According to the checklist, MCR seems reasonable for our use case. The only thing that would make things harder is read permissions, as noted in T322657#8389818. This is a big pain point that we're not quite sure how to address.

I wasn't aware that you were planning to restrict reads. That is indeed tricky, for all wiki content. MediaWiki just isn't made for it.
It's easy enough to have some data that is not rendered per default, and is only available in the edit interface, after additional permission checks. But such data would still be in the dumps, and accessible via Special:Export.

It's not safe to manage sensitive information as page content on a public wiki. If you have such data (e.g. email addresses), the only option I see is to store them elsewhere, and reference them by ID.

Write permissions are easy to implement, you just check them in the Special page or API you use for updating the slot data.

  • We would be storing the data twice, in the campaign_events table and in the revision store. This means that the data is duplicated, but on the bright side, it means that the content of the revision is essentially just a copy of the original data, and we may choose to not replicate some data if we don't want it to be publicly available.

This isn't a bug, it's a feature - think of it as custom indexing, or strategic de-normalization. We do this for all aspects of pages (template usage, categories, rendered HTML, search index, etc); it's a deliberate aspect of MW architecture, and it's working nicely. If you do it right (using DataUpdates from ContentHandler), the derived structured data gets updated automatically and atomically, there is a single source of truth, compact archival, and fast queries.

  • Is it necessary/expected to respect page protection? If so, how to do that? PermissionManager::userCan?

Any UI/API you write for updating the slot data should apply the appropriate permission checks. The correct (new-ish) way to do that is to use the Authority interface (you can call getAuthority() on the RequestContext to get one). It offers probablyCan(), definitelyCan(), authorizeRead() and authorizeWrite() checks.

If you check the "edit" permission, page protection and user blocks are automatically taken into account. Soon, rate limits will automatically apply as well.

@daniel We would love some feedback about our current proposal, to see if it would make sense to use MCR this way. And we would also like to discuss potential ways to resolve the issue with some data not being public. Would you be available for a conversation with us? Thanks in advance!

Sure, send me an invite for January!

Caching is one of the reasons why I'd like to leverage MCR for the rendering logic in the future. Right now we're using the ArticleViewHeader hook to show the data, and I'm not sure how that behaves re caching. Once SlotRoleHandler supports displaying content at the top of the page, we're definitely going to use that.

Yup, unfortunately I know how hard it would be to restrict reads... I'm sure there might be reasonable workarounds, so I think the focus would be on finding the best (tm) one.

Indeed, I did realize how this can actually be of help. I don't really know about the "do it right" part, but I agree that the duplication could be beneficial.

Right, I didn't realize that we could use Authority for that.

Thank you for the availability, I'll do that!

Update: here are the key points from the meeting between Daniel and the Campaigns team.

  • Main issue with MCR is event data that should not be publicly visible, like the meeting link. Aside from that, MCR seems reasonable for our use case, so that was the main focus of our conversation.
  • Putting private info in the page content and hoping to later restrict its visibility is a lost cause -- abandon all hope ye who enter here.
  • Generally speaking, we have 2 ways of doing things: 1) the wikipage is the source of truth, and the extension tables would contain a copy of that data that is only there for querying etc; 2) the extension tables are the source of truth, and the page content is a read-only copy of that data that is only there for display and logging.
    • Originally, we wanted to implement option 2 because of the private data: an option we considered was to only include a subset of the event details in the wiki page (i.e., the public data), which means the page can't be the source of truth.
    • However, option 2 may not work well with MCR because it's a different paradigm. For instance, Wikibase uses option 1. Going with option 2 means that things like reverts/rollbacks wouldn't work, and we would actually have to actively prevent them. On top of that, option 2 may not even need MCR at all: we could simply create a fake revision for the event page (like the one that is created when a page is moved or protected).
    • All in all, if we choose to use MCR we should most likely go with option 1, not option 2.
  • Then, assuming that we choose option 1, we need to put all the event data inside the page, including private data. One way to do that is to encrypt this information, and decrypt it upon display if the user can see it. The data itself could be stored elsewhere. For instance, it could use a dedicated DB table, perhaps append-only, and the page revision would only contain a pointer to a record in that table. This is doable but a bit messy, also depending on how we implement the storage itself. OTOH, this option would probably be more convenient than option 2 or fake revision if there are only a couple pieces of data that have to be private (in our case it should be meeting link and chat link only).
  • In the future, we may want to store participants with MCR, so not only the event details.
    • This could (and probably should) use a third slot on the page
    • Again, the main issue is that this data should not always be public; specifically, private registrations should not be displayed publicly.
    • An alternative is to only store the number of participants, not the names. Would this be useful, though?
    • The other approach for participants is to use Special:Log.
    • A potential issue of both approaches is that in this case, the data that we may want to hide is the username of the performer (of the edit or log action), and possibly not the content of the page itself. This is not really doable in a clean way.
    • To complicate things further, the visibility of each data point could change over time. For instance, someone may register publicly and then switch to private, in which case the previous entry, that used to be public, should also become private.
    • In general, this seems harder to implement, and it's not obvious whether it would be a useful feature.
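
The "pointer to a record in a dedicated table" idea from the notes above can be sketched in Python as follows. Everything here is hypothetical (class, function, and field names such as `private_ref`), and the in-memory list merely stands in for an append-only DB table; it is meant to show the shape of the approach, not a vetted design.

```python
from typing import Any, Dict, List, Set


class PrivateDetailsStore:
    """Hypothetical append-only table for non-public event details."""

    def __init__(self) -> None:
        self._rows: List[Dict[str, str]] = []

    def append(self, details: Dict[str, str]) -> int:
        # Rows are only ever added, so an old revision keeps pointing
        # at the exact data it was saved with.
        self._rows.append(dict(details))
        return len(self._rows) - 1

    def get(self, row_id: int) -> Dict[str, str]:
        return dict(self._rows[row_id])


def public_slot_content(public_fields: Dict[str, Any], private_ref: int) -> Dict[str, Any]:
    """What the page revision would store: public data plus a pointer."""
    return {**public_fields, "private_ref": private_ref}


def view_event(
    slot: Dict[str, Any],
    store: PrivateDetailsStore,
    viewer: str,
    participants: Set[str],
) -> Dict[str, Any]:
    """Resolve the pointer only for viewers who are allowed to see private data."""
    visible = {k: v for k, v in slot.items() if k != "private_ref"}
    if viewer in participants:
        visible.update(store.get(slot["private_ref"]))
    return visible
```

With this layout, dumps and exports of the page would only ever contain the pointer, while the access check lives entirely in the code that resolves it.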

I feel like using MCR for event data would've been a small/medium effort, if it weren't for the private data. That definitely increases the complexity of this task. I'm still unable to quantify the added complexity, though. Also, there might be more blockers that we didn't think of, that may or may not have to do with private data. Another thing to keep in mind is that even if we don't want to use MCR, hiding data could still be difficult. For instance, I don't think you can easily hide log entries selectively from certain people if using Special:Log. And building a custom log would also be quite complex, in addition to being less standard and less integrated with other workflows (e.g., it would not appear in page histories etc.).

Daimona lowered the priority of this task from Medium to Low. Jul 13 2023, 2:50 PM