Instead of just showing the type and action, use the existing messages to clarify
Additionally, use the log parameters (when relevant)
Rather than trying to implement a version of LogFormatter in javascript, just send the display to use in the api
Instead of just showing the type and action, use the existing messages to clarify
Additionally, use the log parameters (when relevant)
Rather than trying to implement a version of LogFormatter in javascript, just send the display to use in the api
Subject | Repo | Branch | Lines +/- | |
---|---|---|---|---|
Handle rendering edits and log entries separately | mediawiki/extensions/GlobalWatchlist | master | +335 -107 |
Status | Subtype | Assigned | Task | ||
---|---|---|---|---|---|
Open | None | T260220 Use proper messages for log entries | |||
Resolved | DannyS712 | T260227 Watchlist api: Include log entry action text | |||
Open | Feature | None | T287929 Show page move destination | ||
Resolved | DannyS712 | T288385 Add objects to represent data for entry rows |
...so it turns out that each log type has its own handling of the log parameters, etc. Rather than recreating the handling in javascript, I suggest adding an additional property to the watchlist api for the log display instead.
Hmm, started looking into doing this now, and it seems that its not the most compatible with how we're currently formatting things - doesn't give a good place to add the links to the log entry and pages logs, or to unwatch. Might not be able to use the logdisplay returned by the api after all
Dumped a bunch of notes at T287929#7265674, but some more:
There are some log entry messages that will be easy to deal with, because they don't have any more than the 3 basic parameters (user, user's gender, target page), and others that will be much harder. It'll probably start with slowly adding support for some log entries, and using the existing format for the others.
We'll want to load the messages we need via the api, not via ResourceLoader modules, at least for non-core log entries, to avoid a bunch of warnings for missing messages. We should add some logic to avoid trying to load a missing message multiple times, based on the existing messages stuff at https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/8d5ed3572dbd642ebd0788c6441e683d460b2da4/resources/src/mediawiki.api/messages.js
Log entries that will be easy (just those three params) (basically https://codesearch.wmcloud.org/deployed/?q=%22logentry-%5CS%2B%22%3A%20%22%5B%5E4%5D%2B%22&i=nope&files=%5C%2Fen%5C.json%24&excludeFiles=&repos= but only listing the most common ones for now)
More complicated
We will also need some logic to figure out which message to use (move over redirect or without leaving a redirect is for the same log type and action)
Basic sketch of gender support (needed for all log entries):
Since we only need it for log entries, filter entries to get unique names of users having performed a log action. Ideally, we will cache this between sites in MultiSiteWrapper, with the assumption that a user's gender is consistent across wikis, but at least should be cached for a single site. Anonymous users are assumed to have unknown.
Processing (examples use $2 but this will need to be done for other variables too in some messages) (https://gerrit.wikimedia.org/g/mediawiki/core/+/8d5ed3572dbd642ebd0788c6441e683d460b2da4/languages/Language.php#3965):
// '{{GENDER:$2}}' -> '' if ( message.match( /{{GENDER:\$2}}/ ) ) { return message.replace( /{{GENDER:\$2}}/, '' ); }
// '{{GENDER:$2|foo}}' -> 'foo' if ( message.match( /{{GENDER:\$2\|[^|]+}}/ ) ) { return message.replace( /{{GENDER:\$2\|([^|]+)}}/, '$1' ); }
// '{{GENDER:$2|bar|baz}}' -> '{{GENDER:$2|bar|baz|bar}}' if ( message.match( /{{GENDER:\$2\|[^|]+\|[^|]+}}/ ) ) { message = message.replace( /{{GENDER:\$2\|([^|]+)\|([^|]+)}}/, '{{GENDER:$$2|$1|$2|$1}}' ); }
// '{{GENDER:$2|foo|bar|baz}}' -> 'foo' or 'bar' or 'baz' return message.replace( /{{GENDER:\$2\|([^|]+)\|([^|]+)\|([^|]+)}}/, function ( match, form1, form2, form3, offset, string ) { switch ( gender ) { case 'male': return form1; case 'female': return form2; default: return form3; } } );
Initial sketch of PLURAL support:
"$1 {{GENDER:$2|changed}} visibility of {{PLURAL:$5|a revision|$5 revisions}} on page $3: $4" = "$1 {{GENDER:$2|changed}} visibility of {{PLURAL:1|a revision|1 revisions}} on page $3: $4" OR = "$1 {{GENDER:$2|changed}} visibility of {{PLURAL:3|a revision|3 revisions}} on page $3: $4"
extract the relevant part
"{{PLURAL:1|a revision|1 revisions}}" or "{{PLURAL:3|a revision|3 revisions}}"
and convert to forms like mediawiki.jqueryMsg.js, then call mw.language.convertPlural which should handle things.
Looking at some of the more complicated uses of PLURAL in deployed log entry messages (https://codesearch.wmcloud.org/deployed/?q=%22logentry-.*PLURAL&i=nope&files=&excludeFiles=&repos=) it seems Arabic has the most forms and we should try and run tests with that too:
"logentry-delete-revision" -> "غيّر{{GENDER:$2||ت}} $1 إمكانية مشاهدة {{PLURAL:$5||مراجعة واحدة|مراجعتين|$5 مراجعات|$5 مراجعة}} في صفحة $3: $4"
For general collection of log entries to investigate parameters, viewers watchlist: https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&list=watchlist&formatversion=2&wllimit=50&wlprop=ids%7Ctitle%7Cuser%7Ctimestamp%7Ccomment%7Cloginfo&wltype=log
Initial skitch of replacing non-string variables:
var message = "$1 deleted page $3"; // logentry-delete-delete for English after GENDER processing var groups = message.split( /(\$\d+)/ ); // groups = [ "", "$1", " deleted page ", "$3", "" ] // handle messages starting and ending with variables, don't create extra text nodes (TODO is this needed?) if ( groups[0] === '' ) { groups.shift(); } if ( groups[ groups.length - 1 ] === '' ) { groups.pop(); } // groups = [ "$1", " deleted page ", "$3" ] // TODO do we care about leading and trailing whitespace in the array elements? if ( groups.indexOf( '$1' ) !== -1 ) { // userLink variable is what we want to have for $1 groups[ groups.indexOf( '$1' ) ] = userLink; } if ( groups.indexOf( '$3' ) !== -1 ) { // targetPage variable is what we want to have for $3 groups[ groups.indexOf( '$3' ) ] = targetPage; } // And so on for any other parameters, message-specific return groups; // returns array of string or (jQuery or VNode) objects to pass to jQuery.append() or a custom render function
Overall for the simple case:
I'm not inventing my own, I'm going to use the existing messages, but they need to be formatted correctly - eg for page deletion the message is "$1 {{GENDER:$2|deleted}} page $3"
Sketch of gender fetching:
class GenderLookup { // Cache is for a specific wiki, at least for now - it would be pretty complicated to support a single cache for multiple wikis that are being processed in parallel, especially with dealing with users who may not have set their gender on all sites // Cache assumes gender doesn't change for any user during the viewing, including between refreshes - it is unlikely that a user, who has made a log entry that is being shown, changes their gender preference in the time between the initial fetching of their gender and any subsequent refresh, and it makes things a whole lot easier this.cache = mw.Map; // key is username, value is gender "male", "female", or "unknown" this.api = mw.ForeignApi; // api to use for this wiki saveResults: function ( apiResult ) { // Convert raw api result to the format that mw.Map.set will accept toSet = {}; apiResult.query.users.forEach( u => toSet[ u.name ] = u.gender ) this.cache.set( toSet ); } loadGenders: function ( usernames ) : Promise { // Like SiteBase.actuallyGetWatchlist() except simpler recursion, load the first 50 and // then call saveResults(), either resolve immediately or call recursively with the next, // Promise doesn't include the actual genders, this is called after we analyze all of the // log entries to figure out which genders we needed to fetch } getGender: function ( username ) : string { // Fallback to "unknown" in case we missed a lookup return this.cache.get( username, 'unknown' ); } }
I realized that in order to simplify things, and to easily handle variables other than $2 without needing to create RegExp objects, we should do some simplifications of the message when its first loaded, and then handle needing multiple variables with genders by substituting them and then using a regex with a global modifier:
// Handle {{GENDER}} tags that don't actually depend on the gender given // 1. For translations where there is no form specified, replace with empty string: '{{GENDER:$2}}' -> '' message = message.replace( /{{GENDER:\$\d}}/g, '' ); // 2. For translations where there is only one form specified, use that: '{{GENDER:$2|foo}}' -> 'foo' message = message.replace( /{{GENDER:\$\d\|([^|]+)}}/g, '$1' ); // 3. For translations that only specify 2 forms, add the 3rd form using male for unknown: '{{GENDER:$2|bar|baz}}' -> '{{GENDER:$2|bar|baz|bar}}' message = message.replace( /{{GENDER:\$(\d)\|([^|]+)\|([^|]+)}}/g, '{{GENDER:$$$1|$2|$3|$2}}' );
// Starting from the pre-processed message above, substitute all gender variables, which should each have a value of "male", "female", or "unknown" (code should enforce this) // Then: message = message.replace( /{{GENDER:(male|female|unknown)\|([^|]+)\|([^|]+)\|([^|]+)}}/g, function ( match, gender, form1, form2, form3, offset, string ) { switch ( gender ) { case 'male': return form1; case 'female': return form2; case 'unknown': return form3; } } );
This should be able to handle all valid cases, and works for messages with multiple gender variables, eg block logs, user right logs.
For substituting gender and plural variables, use
// Where paramName is '$2', '$5' or similar, and paramValue is the result to replace it with, and message is the source string // Need to replace all matches, string.replaceAll is not always available, so use a regex var regex = new RegExp( mw.util.escapeRegExp( paramName ), 'g' ); return message.replace( regex, paramValue );
For extracting plural into a form usable by mw.language.convertPlural, based on jqueryMsg
// Where message is a string like '{{PLURAL:10|one edit|12=a dozen edits|10 edits}}' from {{PLURAL:$5|one edit|12=a dozen edits|$5 edits}} message = message.replace( /{{PLURAL:(\d+)\|([^}]+)}}/g, function ( match, num, formsStr, offset, string ) { // We know that `num` should be a numerical string, because we just substituted it. Convert to a number for mw.language num = parseInt( num, 10 ); // Split up form to either explicit (use a specific message if the number is exactly the one specified) or normal (use a specific message based on the plural rules for the language) var forms = formsStr.split( '|' ); var explicitForms = {}; var normalForms = []; forms.forEach( function ( form ) { if ( /^\d+=/.test( form ) ) { // Explicit plural forms like 12=a dozen var explicitFormNumber = parseInt( form.split( /=/ )[ 0 ], 10 ); explicitForms[ explicitFormNumber ] = form.slice( form.indexOf( '=' ) + 1 ); } else { normalForms.push( form ); } } ); return mw.language.convertPlural( num, normalForms, explicitForms ); } );
For block logs that are partial blocks for a namespace, we need to fetch the local names of each namespace, plus the local version of the blanknamespace message, since different wikis have different names for their main namespace and have different configured extra namespaces. Use the query
{ "action": "query", "meta": "siteinfo|allmessages", "siprop": "namespaces", "ammessages": "blanknamespace", "format": "json", "formatversion": "2" }
This should be the only message we need to fetch per wiki, the rest we will use the central wiki for. Will probably add block log support not in the first round since this will take some extra work. It appears that namespace names other than the main namespace are always in English? Based on looking at zhwiki's Special:BlockList for partial blocks in Chinese, even if the rest of the description is in Chinese the namespace names are in English.
Change 714147 had a related patch set uploaded (by DannyS712; author: DannyS712):
[mediawiki/extensions/GlobalWatchlist@master] Handle rendering edits and log entries separately
Hacks to test new javascript in production by bypassing ResourceLoader's logic
1) Visit w/load.php?modules=ext.globalwatchlist.specialglobalwatchlist on patchdemo or wherever the new code is, and copy 2) Visit https://meta.wikimedia.org/w/load.php?modules=ext.globalwatchlist.specialglobalwatchlist, identify the current version (ext.globalwatchlist.specialglobalwatchlist@12789 -> 12789) and in the new code change the implemented version 3) delete from current module registry: `delete mw.loader.moduleRegistry['ext.globalwatchlist.specialglobalwatchlist']` 4) delete from current store: `delete mw.loader.store.items['ext.globalwatchlist.specialglobalwatchlist@12789']` 5) run the implement script from 1) with the changed version from 2) in the console 6) save the changed version `mw.loader.store.add( 'ext.globalwatchlist.specialglobalwatchlist' )`
But, that seems to have broken the messages, and I'm not sure how to load those properly...
Add a snippet (new mw.Api()).loadMessages(Object.keys(mw.loader.moduleRegistry['ext.globalwatchlist.specialglobalwatchlist'].messages)); to fix that, plus a similar snippet to mw.load dependencies just added (mw.loader.load(['mediawiki.language','mediawiki.util']);), both put directly at the start of the declaration of MultiSiteWrapper or similar, also remove the line only enabling it in dev mode
Reveals that
Part of the reason I was looking at re-implementing the GENDER and PLURAL handling was to avoid needing to depend on the mediawiki.jqueryMsg ResourceLoader module. But, I just realized that the mediawiki.api module, which we already depend on and use heavily, itself loads the mediawiki.jqueryMsg, so we can add an explicit dependency with no additional cost and then make use of the message parsing there for the non-node replacements