Turn edit summary hashtags into change tags
Open, Needs Triage, Public, Feature

Description

Hashtags have become the standard way of marking edits as belonging to campaigns, but there is no easy way to filter for them (hashtags.wmcloud.org is handy but does not help with filtering recent changes, watchlists etc.), they cannot be searched efficiently in the database, and there is no way to add or fix them after the edit is made. Change tags provide all of this functionality, but they are meant to be added by software, not manually.

A simple way to bridge the two would be to automatically turn hashtags into change tags, e.g. if an edit summary contains the word #wpwp, it would automatically receive the change tag hashtag-wpwp (or something like that; the hash character cannot be used in change tag names and they need to be namespaced somehow). This could either be unrestricted, or restricted to hashtags that campaign organizers have pre-registered in some way; in the latter case, hashtag registration would have to be global as some campaigns run on many wikis, and manually declaring the hashtag on dozens of wikis isn't really feasible. A possible low-effort approach would be to only create the change tag if the i18n message for the change tag name exists, and campaign organizers could use Translatewiki (e.g. via WikimediaMessages) to register and translate hashtag-related change tags.
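The mapping described above can be illustrated with a minimal sketch in Python (not MediaWiki code). The `hashtag-` prefix follows the example in the description. The regular expression, the lowercasing and the function name `change_tags_for_summary` are assumptions made for illustration only; a real implementation would also need to decide how to treat section links and similar uses of `#`.

```python
import re

# Assumption: a hashtag is '#' followed by word characters, and the '#' must
# not directly follow a word character or another '#'. This skips anchors such
# as "Foo#History" but is still a naive rule.
HASHTAG_RE = re.compile(r"(?<![\w#])#(\w+)")

# Namespacing prefix taken from the "hashtag-wpwp" example in the description.
TAG_PREFIX = "hashtag-"


def change_tags_for_summary(summary: str) -> list[str]:
    """Return the change tag names implied by hashtags in an edit summary."""
    seen = set()
    tags = []
    for match in HASHTAG_RE.finditer(summary):
        # Assumption: tags are case-insensitive, so normalise to lowercase.
        name = TAG_PREFIX + match.group(1).lower()
        if name not in seen:  # keep first occurrence, drop repeats
            seen.add(name)
            tags.append(name)
    return tags


print(change_tags_for_summary("Added image #WPWP #wpwp"))
```

Whether the resulting names are applied unconditionally or only when pre-registered is a separate policy decision; the sketch only covers the extraction step.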

Such a scheme would have several advantages, while only taking a few lines of code to implement:

  • hashtags would become filterable inside MediaWiki (e.g. recent changes, watchlists, user contributions, page history all support change tag filters)
  • hashtags would become easily searchable by external tools relying on the replica DB (substring searches on edit summaries are inefficient, but change tags are indexed)
  • hashtags would become more visually prominent on various MediaWiki interfaces, and more self-explanatory (as you can associate a description with the hashtag, link from the description to the project page etc, although for cross-wiki campaigns it would take extra tooling to make this option useful - see the Translatewiki suggestion above)
  • it would be possible for wiki administrators to add/remove hashtags for an existing edit (though there probably wouldn't be much use for this in practice, it's nice to have the option)
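To illustrate the searchability point, here is a rough sketch using an in-memory SQLite database. The schema is deliberately simplified and is an assumption: the column names `ct_rev_id`, `ct_tag_id`, `ctd_id` and `ctd_name` mirror MediaWiki's `change_tag` and `change_tag_def` tables, but the real replicas store edit summaries in a separate comment table rather than in a `rev_comment` column, and the data below is made up.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny, simplified stand-in for the relevant tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_comment TEXT);
        CREATE TABLE change_tag_def (ctd_id INTEGER PRIMARY KEY, ctd_name TEXT UNIQUE);
        CREATE TABLE change_tag (ct_rev_id INTEGER, ct_tag_id INTEGER);
        CREATE INDEX change_tag_by_tag ON change_tag (ct_tag_id, ct_rev_id);
        """
    )
    conn.executemany(
        "INSERT INTO revision VALUES (?, ?)",
        [(1, "Added photo #wpwp"), (2, "typo fix"), (3, "Infobox image #wpwp #commons")],
    )
    conn.executemany(
        "INSERT INTO change_tag_def VALUES (?, ?)",
        [(1, "hashtag-wpwp"), (2, "hashtag-commons")],
    )
    conn.executemany("INSERT INTO change_tag VALUES (?, ?)", [(1, 1), (3, 1), (3, 2)])
    return conn


def revisions_by_summary(conn: sqlite3.Connection, hashtag: str) -> list[int]:
    """Substring search: has to inspect every summary, and also matches longer tags."""
    rows = conn.execute(
        "SELECT rev_id FROM revision WHERE rev_comment LIKE ? ORDER BY rev_id",
        (f"%#{hashtag}%",),
    )
    return [row[0] for row in rows]


def revisions_by_tag(conn: sqlite3.Connection, tag_name: str) -> list[int]:
    """Tag lookup: an exact match on the tag name plus an indexed join."""
    rows = conn.execute(
        "SELECT ct_rev_id FROM change_tag "
        "JOIN change_tag_def ON ctd_id = ct_tag_id "
        "WHERE ctd_name = ? ORDER BY ct_rev_id",
        (tag_name,),
    )
    return [row[0] for row in rows]


conn = build_demo_db()
print(revisions_by_summary(conn, "wpwp"), revisions_by_tag(conn, "hashtag-wpwp"))
```

Both queries find the same revisions here, but only the second one can use an index and match the tag exactly.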

Event Timeline

(The high-effort alternative approach would probably involve doing something with the comment_data DB field which might be more appropriate for this as the data really depends on the comment, not the revision - see T215637: Implement translatable edit summaries / multilingual comments using comment_data for a similar use case. But there is no UI or built-in filtering functionality for comment data, and I don't think anything uses it to date.)

Maybe this should be merged into T123529.

That task is about adding editor support for automatically appending hashtags to the edit summary.

Bawolff subscribed.

I really like this idea and think we should do it.

In particular, I think this is better than alternative ideas around having a distinct section for change tags, or allowing change tags from a closed vocabulary.

I think the world & twitter has proven that users like the ability to just add a #tagname to the end of the text box. I also think allowing the vocabulary to be open instead of just a pre-approved list of allowed things has also proven important to the world.

The fact that people have implemented complex toolforge tools to do this speaks to the fact that the existing system of manual change tags & wpChangeTags url parameter does not meet the users' needs.

This is an easy change engineering-wise but in terms of functionality significant enough that I think a product owner should get a chance to veto it. Normally I think this would be Editing team territory, but @ppelberg is on leave - @Esanders do you have an idea who should look at it?

I agree with you, it is a good idea to convert hashtags to tags so they are more useful on Recent Changes or Watchlists.

Creating tags can be done by any user, even accidentally. We could see databases with dozens of hashtags only used once.

Removing tags from an edit could be possible, but it would be one more task to put on admins' already full plates.

Hence, I would restrict the conversion to:

  • hashtags that campaign organizers have pre-registered; they can be local or trans-wiki, and could support i18n (maybe just at the campaign level?)
  • hashtags created by the communities, through MediaWiki-extensions-CommunityConfiguration. These would be local only, and would need less i18n.

Ideally, these hashtags should be recognized somehow when a user needs to use one. While suggesting or autocompleting the tags would be nice, there is always the possibility of users using hashtags outside their original intent, or misunderstanding them, with usage comparable to social media hashtags.

How users can discover the right hashtag for their edit when they add an edit for a valid campaign would require a little bit of design. But it is incidental, as campaign participants are supposed to know which tag to use (hopefully, they won't misspell it). What would be cool is to suggest tags that match the edit, by analysing the content. Another option would be to suggest tags to users who registered for a campaign or a WikiProject that defined some hashtags.

I would continue to have a distinction between the tags created by MediaWiki and these hashtags.

Chiming in as a major supporter of this ticket. Turning semi-structured data (hashtags) into structured data (edit tags) enables so so much more tooling / discovery. I also had some of @Trizek-WMF's concerns that converting every hashtag could create overload on the database side or on Special:Tags pages, though perhaps making these tags hidden by default, or using some of the existing options, would make that less of an issue, at least from the UI side.

How users can discover the right hashtag for their edit when they add an edit for a valid campaign would require a little bit of design.

If we put more support behind WikiProjects, this would be a fantastic way to bridge this gap. A given WikiProject could register hashtags somewhere and these could be suggested to anyone editing articles that are on that WikiProject's worklist. Community Tech already built the PageAssessments extension so you have a quick way to get the relevant WikiProjects for a page (for wikis where it's enabled) and then you'd just need a way to retrieve hashtags registered by a WikiProject, which I assume is a solvable problem.

converting every hashtag could create overload on the database side or on Special:Tags pages, though perhaps making these tags hidden by default, or using some of the existing options, would make that less of an issue, at least from the UI side.

I would go with Special:Hashtags, to make a distinction, with a link between the two. The special page would also allow local hashtag management.

Creating tags can be done by any user, even accidentally. We could see databases with dozens of hashtags only used once.

Per the task description:

A possible low-effort approach would be to only create the change tag if the i18n message for the change tag name exists, and campaign organizers could use Translatewiki (e.g. via WikimediaMessages) to register and translate hashtag-related change tags.

That would mean 1) adding the tag-<tag> and tag-<tag>-description messages to an extension's i18n file if the hashtag is statically defined in the extension (which seems like an unlikely scenario), 2) defining them in the extension's MessagesPreLoad hook if the hashtag is dynamically defined in the extension (maybe read from some community configuration for campaigns), 3) adding them to WikimediaMessages i18n files for Wikimedia-wide campaigns, 4) adding them to the wiki's MediaWiki namespace for local campaigns.
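As a rough sketch of that gating rule (plain Python, not MediaWiki code): a hashtag-derived change tag is kept only if a `tag-<tag>` message can be found in one of the message sources. The dictionaries, the function names and the lookup order are assumptions for illustration; in MediaWiki the message system itself would resolve the lookup.

```python
from typing import Optional

# Hypothetical message sources; the precedence order is an assumption.
LOCAL_MEDIAWIKI_NAMESPACE = {"tag-hashtag-localdrive": "Local editing drive"}
DYNAMIC_HOOK_MESSAGES = {}  # e.g. filled in from community configuration
WIKIMEDIA_MESSAGES = {"tag-hashtag-wpwp": "Wikipedia Pages Wanting Photos"}
EXTENSION_I18N = {}

MESSAGE_SOURCES = [
    LOCAL_MEDIAWIKI_NAMESPACE,
    DYNAMIC_HOOK_MESSAGES,
    WIKIMEDIA_MESSAGES,
    EXTENSION_I18N,
]


def find_tag_message(tag: str, sources: list[dict]) -> Optional[str]:
    """Return the display message for a tag, or None if nobody registered it."""
    key = f"tag-{tag}"
    for source in sources:
        if key in source:
            return source[key]
    return None


def registered_tags(candidate_tags: list[str], sources: list[dict]) -> list[str]:
    """Keep only candidate tags that have a tag-<tag> message in some source."""
    return [tag for tag in candidate_tags if find_tag_message(tag, sources) is not None]


print(registered_tags(["hashtag-wpwp", "hashtag-random"], MESSAGE_SOURCES))
```

With this rule, an unregistered hashtag simply stays in the edit summary and never creates a change tag.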

Ideally, these hashtags should be recognized somehow when a user needs to use one. While suggesting or autocompleting the tags would be nice, there is always the possibility of users using hashtags outside their original intent, or misunderstanding them, with usage comparable to social media hashtags.

How users can discover the right hashtag for their edit when they add an edit for a valid campaign would require a little bit of design. But it is incidental, as campaign participants are supposed to know which tag to use (hopefully, they won't misspell it). What would be cool is to suggest tags that match the edit, by analysing the content. Another option would be to suggest tags to users who registered for a campaign or a WikiProject that defined some hashtags.

Those sound like neat ideas that are quite complex to do and are unlikely to happen without the WMF resourcing it, so best discussed in another task.

I also had some of @Trizek-WMF's concerns that converting every hashtag could create overload on the database side or on Special:Tags pages, though perhaps making these tags hidden by default, or using some of the existing options, would make that less of an issue, at least from the UI side.

In terms of number of revisions, I think it would be marginal compared to e.g. change tags reflecting what editor one is using. In terms of number of change tags, I think Special:Tags is pretty scalable since the creation of the change_tag_def table. In RecentChanges and similar interfaces you can always hide tags by setting the tag name i18n message to -.
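The hiding convention mentioned here can be sketched as follows (plain Python, with a hypothetical `messages` dictionary standing in for the wiki's message store): a tag whose name message is set to `-` is left out of the display, and a tag with no message falls back to its raw name.

```python
def visible_tag_labels(tags: list[str], messages: dict[str, str]) -> list[str]:
    """Return the labels to display for an edit's tags, skipping hidden ones."""
    labels = []
    for tag in tags:
        # No tag-<tag> message defined: fall back to the raw tag name.
        label = messages.get(f"tag-{tag}", tag)
        if label == "-":
            # A message set to "-" marks the tag as hidden in the interface.
            continue
        labels.append(label)
    return labels


# Hypothetical message store for illustration.
messages = {
    "tag-hashtag-wpwp": "#WPWP campaign",
    "tag-hashtag-noise": "-",
}
print(visible_tag_labels(["hashtag-wpwp", "hashtag-noise", "hashtag-other"], messages))
```

The tag itself still exists in the database in this scheme; only its display is suppressed.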

We could see databases with dozens of hashtags only used once.

Is that a bad thing? I think one of the benefits of hashtags is you don't need permission to create a new one. It's just open, and natural selection determines what is useful and what is not.

A possible low-effort approach would be to only create the change tag if the i18n message for the change tag name exists, and campaign organizers could use Translatewiki (e.g. via WikimediaMessages) to register and translate hashtag-related change tags.

That would mean 1) adding the tag-<tag> and tag-<tag>-description messages to an extension's i18n file if the hashtag is statically defined in the extension (which seems like an unlikely scenario), 2) defining them in the extension's MessagesPreLoad hook if the hashtag is dynamically defined in the extension (maybe read from some community configuration for campaigns), 3) adding them to WikimediaMessages i18n files for Wikimedia-wide campaigns, 4) adding them to the wiki's MediaWiki namespace for local campaigns.

I'm not sure I get all the subtleties, but I have the feeling that it is a little bit complicated to create a new tag, as it is based on translations. It works for the many cross-language campaigns, but not for the local ones.

Community Configuration could be a place where local tags would be defined and described, very simply. These tags would then be complemented by the ones created by the Campaigns team tools, locally or cross-wiki.

We could see databases with dozens of hashtags only used once.

Is that a bad thing? I think one of the benefits of hashtags is you don't need permission to create a new one. It's just open, and natural selection determines what is useful and what is not.

I can see the benefits of natural selection, but I also see a potentially poor noise-to-utility ratio.

For some users, our wikis aren't social media, and hashtags are perceived as part of social media's DNA. As I mentioned earlier in this task, we could see users adding hashtags because they saw them on other diffs. It could lead to some #unexpectedtagging with #links leading nowhere. Not to mention cases where tags would be perceived as a new way to add unsolicited content to edit summaries or, worse, to vandalize. Tags shouldn't be an incentive to move article content into the edit summary (users adding promotional content in both).

The usefulness of tags will be easier to perceive if we provide some boundaries (Campaigns, Community configuration...), while natural selection will not necessarily be perceived as a good thing.

Just FYI, I started an extension to do something like this: https://www.mediawiki.org/wiki/Extension:Hashtags . The extension is not necessarily aimed at Wikimedia. I personally believe that hashtags being ad hoc (Be bold!) is where their primary value proposition is, but there is a config option in the extension to make it only work with specific hashtags.

Anyways, if nothing else, the extension might work as a good base for what is being discussed here.

I think Special:Tags is pretty scalable since the creation of the change_tag_def table.

Special:Tags has no paging, so that is a bit :S . Also, the various backend methods aren't paged (e.g. ChangeTagsStore::tagUsageStatistics(), but also the way software-defined tags work assumes you will load them all at once). All that is probably fixable, but I'd be a little concerned about the current system if there were more than 20,000 tags defined in the system.