Page MenuHomePhabricator

PageUpdatedEvent: model edit flags
Closed, ResolvedPublic

Description

WikiPage and PageUpdater make use "edit flags" represented as a bitfield using the EDIT_XXX constants. At least some of these flags need to be represented in the PageUpdatedEvent somehow.

The contract of these flags are defined in the documentation of WikiPage::doUSerEditContent:

	 * @param int $flags Bitfield:
	 *      EDIT_NEW
	 *          Article is known or assumed to be non-existent, create a new one
	 *      EDIT_UPDATE
	 *          Article is known or assumed to be pre-existing, update it
	 *      EDIT_MINOR
	 *          Mark this edit minor, if the user is allowed to do so
	 *      EDIT_SUPPRESS_RC
	 *          Do not log the change in recentchanges
	 *      EDIT_FORCE_BOT
	 *          Mark the edit a "bot" edit regardless of user rights
	 *      EDIT_AUTOSUMMARY
	 *          Fill in blank summaries with generated text where possible
	 *      EDIT_INTERNAL
	 *          Signal that the page retrieve/save cycle happened entirely in this request.

The significance of these flags to event consumers is as follows:

  • EDIT_NEW and EDIT_UPDATE are irrelevant. They are basically used as a "check and set" feature to prevent race conditions during edits.
  • EDIT_AUTOSUMMARY is not relevant, it just instructs PageUpdater to automatically create the edit summary.
  • EDIT_SUPPRESS_RC seems to be the must relevant: it's needed in ChangeTrackingInventIngress to decide whether to add an entry to recentChnages, and it's set by a large number of use cases.
  • EDIT_MINOR and EDIT_FORCE_BOT are needed to create RecentChanges entries, at the very least.
  • EDIT_INTERNAL indicates that the edit was not interactive. It is used to decide whether to check the EditStash. Note however that it cannot be used to decide whther an edit should be counted against the user's edit count, see (T382592). We need a separate flag for that.

The following questions need to be answered for modeling these flags in PageUpdatedEvent:

  • Should the flags be looped through blindly, without the event class being aware of their meaning?
    • DECISION: no, but we will try to avoid event flags that are 'instructions' for specific consumers, and instead model them as things that happened.
  • Alternatively, should the flags be only accessible through individual getters, and not as a bit field?
    • DECISION: the flag bitfield should not be directly accessible.
  • How should these flags be serialized for use outside of MediaWiki?
    • TBD: but likely either as booleans or a list of strings.
  • Should we add a flag that indicates whether an edit should contribute to the user edit count?
    • DECISION: yes, as an 'implicit edit'.

Notes

Event Timeline

Change #1100392 had a related patch set uploaded (by Daniel Kinzler; author: Daniel Kinzler):

[mediawiki/core@master] PageUpdatedEvent: explicitly model cause and performer

https://gerrit.wikimedia.org/r/1100392

Conversation from https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1100392, regarding FLAG_SILENT resp. EDIT_SUPPRESS_RC:

Andrew:

I don't love this flag.
How an event should be used by a consumer should be up to the consumer.
In what cases is this flag set? I see when "Update derived slots in an existing revision"?
Instead of a 'silent' instruction, is there something more meaningful here?
Perhaps FLAG_DERIVED was better?

Daniel:

No, it's not for derived slots ( actually now think derived slot updates shouldn't trgger a PageUpdatedEvent at all). It's the equivalent of EDIT_SUPPRESS_RC, which is used a lot: https://codesearch.wmcloud.org/search/?q=EDIT_SUPPRESS_RC
I see you point, but I see no alternative. The emitter needs to be able to signal to the listeners that the update shouldn't be brought to the user's attention. If we use a listener to put things into recentchanges, I see no alternative. We need a way to loop this flag through.
I think the concept of "silent updates" makes sense.

Andrew:

The emitter needs to be able to signal to the listeners that the update shouldn't be brought to the user's attention.

The emitter should include information about the event that happened, not instructions for what listeners should do with the event. Instructions for listeners add semantic coupling between emitter and listeners; it means the emitter has some knowledge of how a listener will use the event.

If we use a listener to put things into recentchanges, I see no alternative.

I think this can be accomplished with a different semantic than 'silent'.
Edit Suppression is something that happens as part of the page update, yes?. Is this 'silent' flag only used for suppressions? If so, then a FLAG_SUPPRESSED kind of flag could make sense.

Daniel:

it means the emitter has some knowledge of how a listener will use the event.

Only very broadly: it means the emitter knows that users can be notified about changes, and it wants to have some control over whether this will happen.

Edit Suppression is something that happens as part of the page update, yes?

This is not about edit supression at all, it's about supressing the entry to recentchanges (it does not affect the page history).
The desire is avoiding redundant notifications to the user. If we don't convey the desire to make the update silent as a flag, all relevant listeners have to know about all the use cases that should be silent: https://codesearch.wmcloud.org/deployed/?q=EDIT_SUPPRESS_RC. That doesn't seem feasible, especially since most of them are in extensions.
I think the fact that all these callers set the EDIT_SUPPRESS_RC flag demonstrates that "silent update" is a concept defined by PageUpdater. It's a property of the update, similar to a tag (but not stored).

Andrew:

Ah okay, this is a specific instruction for RC consumers that this update should not go in the RC log. Seems like explicit listener coupling to me! ;)
Hm, but, looking at https://gerrit.wikimedia.org/g/mediawiki/core/+/c2c98ca061e3962650db212af827566688611c35/includes/Storage/PageUpdater.php#289,
It seems these are very explicit flags that are set by the caller when using PageUpdater. In that case, they do indeed seem to be part of the PageUpdatedEvent (even if they are not explicitly persisted in the page table).
Instead of making a new 'silent' concept, would it be worth just passing the PageUpdater flags used through, just as we do for other properties of the event? I don't think the caller cares if whether or not a property is persisted as part of the page table?
E.g. can we make 'PageUpdatedFlags' equivalent to rev_timestamp, performer, etc? Can/should we just use the same semantics from PageUpdater $flags as in PageUpdatedEvent $flags ?

Ah okay, this is a specific instruction for RC consumers that this update should not go in the RC log. Seems like explicit listener coupling to me! ;)

Yes, my goal was to make it less specific, be making the contract a bit more semantic: instead of saying "don't put this update into RecentChanges", make it "this update is silent, don't poke the user". But perhaps it's better to stick with the stablished meaning, of that's what the listsner needs, and the emitter provides.

I don't think the caller cares if whether or not a property is persisted as part of the page table?

Yes, we should include relevant information even if it's not persistet.

Instead of making a new 'silent' concept, would it be worth just passing the PageUpdater flags used through, just as we do for other properties of the event?

That's what I originally had. I ran into a couple of problems that made me want to use an array:

  • when serializing the event for external consumers, we wouldn't want this to be a bit field I think. We could map back and forth, but that means the event *does* know about the meaning of each bit.
  • local listeners can call nice isXyz() methods for some things, but have to check bitmaps for other things, with no obvious reson for this difference.
  • I needed additional flags in the event, but didn't want to make them available for use with PageUpdater (in particular, FLAG_AUTOMATED. FLAG_REVERTED may not be needed, since it's redundant to the info in EditResult).

when serializing the event for external consumers, we wouldn't want this to be a bit field I think.

Agree. We made a similar decision for the revision visibility (aka revision deleted) flags.

means the event *does* know about the meaning of each bit.

I don't actually mind if the event has properties about the edit that caused the page to update, that is good. My wariness is about when we make the event do something special for specific consumers. (not saying that is the case here, I think we are onto something...)

just as we do for other properties of the event?

local listeners can call nice isXyz() methods for some things, but have to check bitmaps for other things, with no obvious reason for this difference.

Ah! Sorry, I didn't meant to not use a bitfield flag for these properties, I just meant that things like EDIT_SUPPRESS_RC seemed similar to other regular things about the event that happened. As I interpret it, SILENT isn't about the thing that happened (the editor did not specify that the event was silent), it is something that the emitter is deciding to set on the event in order to instruct a specific listener. 'silent edit' isn't a thing the editor specified.

Maybe my wariness is more about how EDIT_SUPPRESS_RC is an instruction for a specific consumer (RecentChanges).

I needed additional flags in the event, but didn't want to make them available for use with PageUpdater (in particular, FLAG_AUTOMATED. FLAG_REVERTED may not be needed, since it's redundant to the info in EditResult).

Makes sense!

I think what I am pushing for is to prefer properties about things that happened rather than instructions for specific consumers.

Yes, my goal was to make it less specific, be making the contract a bit more semantic: instead of saying "don't put this update into RecentChanges", make it "this update is silent, don't poke the user". But perhaps it's better to stick with the stablished meaning, of that's what the listsner needs, and the emitter provides.

Okay, this makes more sense then. Your intention was to make 'silent edit' be a new concept that hopefully abstracted away coupling with RecentChanges.

Now that you put it that way, I think this might be a good idea after all. Can we either:

  • codify this concept of 'silent' (or 'quiet'?) above the event in MW core, and then make EDIT_SUPPRESS_RC (and other things?) result in EDIT_SILENT?

or

  • Just document very explicitly this intention for FLAG_SILENT and what it means, in the event?

Should the flags be looped through blindly, without the event class being aware of their meaning?

Generally: I think it is okay to have the event class present a better API to users, as long as the differences are very clearly documented, and the properties presented are directly connected with something that happened in the past, and ideally presented to the users that way.

When to do so is sometimes hard to decide though. I guess it depends on how nasty the original concept is. E.g. 'revision deleted' was just so confusing as a name and as a way of viewing what has happening. revision visibility settings made much more sense, so we went with that for mediawiki/page/change.

Daniel and I discussed in today's DomainEvents Working Session, notes here.

We decided

  • We will use PageUpdatedEvent specific flags, rather than just passing edit flags through, in order to make a better API.
  • EDIT_SUPPRESS_RC can just renamed something like EDIT_SILENT, and we can codify the concept of a 'silent' edit in MW core. We can then just use 'silent' as a flag on the event.
  • We will add an 'implicit' flag on the event which will indicate that the event was usually 'automated' in someway, meaning that it should not count towards a user's edit count. I don't recall if @daniel wanted to codify the 'implicit' edit concept above the event in MW core.

Other suggested PageUpdatedEvent flags are uncontroversial so go for it! :)

Thanks!

Change #1110858 had a related patch set uploaded (by Daniel Kinzler; author: Daniel Kinzler):

[mediawiki/core@master] PageUpdatedEvent: improve modeling of flags

https://gerrit.wikimedia.org/r/1110858

Change #1119474 had a related patch set uploaded (by Daniel Kinzler; author: Daniel Kinzler):

[mediawiki/core@master] DomainEvents: More clearly model null edits and dummy revisions

https://gerrit.wikimedia.org/r/1119474

Change #1110858 merged by jenkins-bot:

[mediawiki/core@master] PageUpdatedEvent: improve modeling of flags

https://gerrit.wikimedia.org/r/1110858

Change #1119474 merged by jenkins-bot:

[mediawiki/core@master] DomainEvents: More clearly model null edits and dummy revisions

https://gerrit.wikimedia.org/r/1119474

Closing sprint 4 and marking as resolved!