Page MenuHomePhabricator

Reference types stop working when there are more CSS classes
Closed, ResolvedPublic2 Estimated Story Points

Description

As of now, the reference type in <cite class="journal"> is only detected as long as there is no other class. The only exception is the class citation. This means that <cite class="citation journal"> will show the journal icon, but <cite class="journal whatever"> will show the generic one.

As of now, this behavior is by design. It's strictly modeled after the previously existing (now discontinued, even removed) RESTBase-API endpoint for references.

Relevant code in the Popups extension:

gateway/reference.js
/**
 * This extracts the type (e.g. "web") from one or more <cite> elements class name lists, as
 * long as these don't conflict. A "citation" class is always ignored. <cite> elements without
 * another class (other than "citation") are ignored as well.
 *
 * Note this might return multiple types, e.g. <cite class="web citation paywalled"> will be
 * returned as "web paywalled". Validation must be done in the code consuming this.
 *
 * This duplicates the strict type detection from
 * @see https://phabricator.wikimedia.org/diffusion/GMOA/browse/master/lib/transformations/references/structureReferenceListContent.js$93
 *
 * @param {JQuery} $referenceText
 * @return {string|null}
 */
function scrapeReferenceType( $referenceText ) {
	let type = null;

	$referenceText.find( 'cite[class]' ).each( function ( index, el ) {
		const nextType = el.className.replace( /\bcitation\b\s*/g, '' ).trim();

		if ( !type ) {
			type = nextType;
		} else if ( nextType && nextType !== type ) {
			type = null;
			return false;
		}
	} );

	return type;
}

The comment refers to the strict === 1 comparison in https://phabricator.wikimedia.org/diffusion/GMOA/browse/master/lib/transformations/references/structureReferenceListContent.js;40bd75693a8fb9f66555c861d5238edc63a2e265$93:

structureReferenceListContent.js
const getCitationType = (citations) => {
    if (citations.size === 1) {
        const value = citations.values().next().value;
        if (CITATION_TYPES.includes(value)) {
            return value;
        }
    }
    return DEFAULT_CITATION_TYPE;
};

This overly strict behavior is not obvious, even confusing. We want to relax it. Suggestions:

  1. Just use the first known class. This means that class="book journal" will be shown as a book, but class="journal book" will be shown as a journal.
  2. Be more strict, drop all unknown classes, and only use the remaining one if it's a single one.

Event Timeline

Tobi_WMDE_SW set the point value for this task to 2.Feb 17 2021, 9:46 AM

Decision from story time meeting:

  • Implement option 2: Be more strict, drop all unknown classes, and only use the remaining one if it's a single one.
  • For the edge case: if two known classes are detected, use the last class (following CSS semantics)

Change 665139 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/Popups@master] Much more relaxed reference type detection

https://gerrit.wikimedia.org/r/665139

Change 665139 merged by jenkins-bot:
[mediawiki/extensions/Popups@master] Much more relaxed reference type detection

https://gerrit.wikimedia.org/r/665139

Good news, we made the branch cut! Thiemo's patch will automatically go out with MW-1.36-notes (1.36.0-wmf.32; 2021-02-23), this week.