
Implement mediawiki TagModes for CodeMirror 6
Closed, ResolvedPublic

Description

Background

Other extensions can register tags to be syntax highlighted wherever CodeMirror is used. This is done using extension attributes: in an extension.json, you specify CodeMirrorTagModes as an object whose keys are tag names and whose values are the MIME content type to use when highlighting each tag's contents. For example, the Cite extension registers its <ref> tag to be highlighted as wikitext like so:

"CodeMirrorTagModes": {
	"ref": "text/mediawiki"
}

This task tracks the work needed to make this system work in CodeMirror 6, which is built on JavaScript (ES) modules, as opposed to the CommonJS style that ResourceLoader supports.

Acceptance criteria

  • Any extension that supplies a CodeMirrorTagModes for text/mediawiki (or just mediawiki) should highlight the specified tag, and highlight the contents therein like normal wikitext.
  • Support for languages other than MediaWiki wikitext will be saved for a separate task, and likewise for PluginModules. (T357480)
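As a rough illustration of the first criterion, here is a minimal sketch of how the registered tags could be narrowed down to those that should be treated as wikitext. The helper name getMediaWikiTags and the exampletag/othertag entries are hypothetical; only the ref entry comes from the Cite example above, and this is not the extension's actual code.

```javascript
// Hypothetical helper: pick out the tags registered for wikitext.
// `tagModes` stands in for the merged CodeMirrorTagModes attribute.
function getMediaWikiTags( tagModes ) {
	// Both spellings are accepted, per the acceptance criteria.
	const wikitextModes = [ 'text/mediawiki', 'mediawiki' ];
	return Object.keys( tagModes )
		.filter( ( tag ) => wikitextModes.includes( tagModes[ tag ] ) );
}

const tags = getMediaWikiTags( {
	ref: 'text/mediawiki', // from the Cite example
	exampletag: 'mediawiki', // made-up tag, short form of the mode name
	othertag: 'text/x-php' // made-up tag, non-wikitext: out of scope here
} );

console.log( tags ); // [ 'ref', 'exampletag' ]
```

Tags registered for any other content type are simply left out, which matches the scope of this task.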

Event Timeline

On closer investigation, this should probably wait for T348019.

Some immature thoughts: Since CodeMirror6 prefers a large bundle of all possibly used resources, we may have to dynamically generate (part of) the source scripts to determine which legacy language modes should also be imported in the bundle.

For example, we can generate one script (pluginModules.js) like this:

// php legacy mode is unavailable
export { css } from '@codemirror/legacy-modes/mode/css';
export { javascript } from '@codemirror/legacy-modes/mode/javascript';

And then in another script (e.g., codemirror.mode.mediawiki.js),

import * as pluginModules from './pluginModules';

for ( const [ language, parser ] of Object.entries( pluginModules ) ) {
    Object.defineProperty( CodeMirrorModeMediaWiki.prototype, language, {
        get() {
            return parser;
        }
    } );
}

[Attached screenshot: IMG_4427.jpeg]

Attaching a screenshot from my cellphone. Note that the styles of the embedded javascript mode are also influenced by mediawiki.less.

I haven't gotten that far, but my thoughts were to use the Lezer packages for other languages, at least when the CM instance is only dealing with one language. I.e. if we were to replace CodeEditor (which I hope to do), we wouldn't use CodeMirrorModeMediaWiki at all, or even WikiEditor for that matter. Just load the Lezer package, and instantiate the EditorView. Lezer packages won't use our styling, and will be more performant than the stream parser implementation. At any rate, I'm saving that effort for later.

For now, I'm trying to just focus on getting the wikitext-based TagModes or simple tag registration working, as with Cite, MediaWiki-extensions-Phonos, TemplateStyles, etc.

<nowiki> and <pre> are currently registered as TagModes in CM5, but I don't see the need for that in CM6. Rather, we just treat them as such and bundle the logic into our stream parser, as opposed to pretending they're not actually part of the MediaWiki language. Patch incoming for that!

Change 987489 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/CodeMirror@master] CodeMirrorModeMediaWiki: add highlighting for <nowiki> and <pre>

https://gerrit.wikimedia.org/r/987489

I haven't gotten that far, but my thoughts were to use the Lezer packages for other languages, at least when the CM instance is only dealing with one language. I.e. if we were to replace CodeEditor (which I hope to do), we wouldn't use CodeMirrorModeMediaWiki at all, or even WikiEditor for that matter. Just load the Lezer package, and instantiate the EditorView. Lezer packages won't use our styling, and will be more performant than the stream parser implementation.

In this case, do you suggest multiple ~300kB bundles, e.g., one for Wikitext and the other for JavaScript/CSS? By the way, as far as I know, there is no Lezer package for Lua.

My hope was to use dynamic imports so the necessary PluginModules can be loaded at runtime, but I haven't gotten this to work yet. In the end, it may not matter much; we don't have many use cases for mixed languages within wikitext, anyway. SyntaxHighlight could use CodeMirror, but it has a backend implementation, so I don't think we'll integrate CodeMirror there. As a practical example, take MediaWiki-extensions-Score, which doesn't yet have a mode. We could write one (it'd have to be a StreamParser I think, not Lezer), and since we know before page load that Score is used, Score can use a PHP hook to force CodeMirror to load e.g. CodeMirrorModeScore, which extends CodeMirrorModeMediaWiki, adding in the eatScoreTag() method or whatever and providing it as the tokenizer. So essentially Score tells CodeMirror to use a different endpoint, but no code is duplicated, and caching is consistent for that page. I suppose this still boils down to what you say -- multiple bundles, but the difference is the other extension provides the customized bundle, rather than tossing it all in CodeMirror. In the end, the storage footprint should be about the same if not lower than what we have now (hopefully!).

Another idea which mayyybe will work is to use the mw.hook system. If we pass around the scope, it seemingly won't matter if the code is built separately. So instead of Score extending CodeMirror internals (and thus delivering a separate package with the whole CM library), it ships one smaller package using just hooks. That might actually be better... if it works.
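To make the hook idea a bit more concrete, here is a rough sketch of passing a shared object through a hook so that separately built code can add to it. The hook() function is a tiny stand-in for mw.hook so the snippet runs outside MediaWiki (unlike the real mw.hook, it does not replay earlier fire() calls to handlers added later). The hook name, the tokenizers object and the returned style string are all made up for illustration.

```javascript
// Minimal stand-in for mw.hook( name ).add() / .fire(); assumption for this sketch only.
const registry = {};
function hook( name ) {
	registry[ name ] = registry[ name ] || [];
	return {
		add( handler ) {
			registry[ name ].push( handler );
			return this;
		},
		fire( ...args ) {
			registry[ name ].forEach( ( handler ) => handler( ...args ) );
			return this;
		}
	};
}

// In the other extension's small, separately built package:
// register a tokenizer without importing any CodeMirror internals.
hook( 'ext.CodeMirror.example.tokenizers' ).add( ( tokenizers ) => {
	// Hypothetical tokenizer; a real one would consume the stream.
	tokenizers.score = () => 'example-score-style';
} );

// In CodeMirror: pass the scope around by firing the hook with a shared object.
const tokenizers = {};
hook( 'ext.CodeMirror.example.tokenizers' ).fire( tokenizers );

console.log( Object.keys( tokenizers ) ); // [ 'score' ]
```

Because the handler mutates the object it is given, the two packages never need to be bundled together, which is the point of the idea above.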

I do realize PHP is not available as a legacy stream parser, which means MediaWiki-extensions-PhpTags will lose its CM integration. I feel bad about that, as it stomps on the original developer's integration, but I see no way around it sadly. It seems PhpTags isn't maintained much anymore as it is. It emits warnings on REL1_41.

As for pages with different content types, I'd say it's similar to the MediaWiki-extensions-Score example, where Scribunto could extend just the CodeMirror class (or use hooks), and we leverage all the fancy code editing features that CM provides as extensions, and solve things like T261118: Save code editor settings / preferences across pages in the process. A pity that Lua isn't available as Lezer, but we can still use the legacy stream parser. This again would live in the extension repo, not in CodeMirror. It should be drastically smaller, as opposed to loading it alongside WikiEditor like we have to do for wikitext. It depends on the use case, I guess. T301615: Show syntax highlighting on View Source/protected pages should be easy peasy and lightweight as it doesn't even require an editor. Replacing CodeEditor is a separate effort that needs more thought. I want to do it, but it's outside the scope of the CM6 upgrade project and more just a personal interest of mine. I do want to deliberately leave room for it, and avoid making it harder to do down the road, so keep sharing your thoughts! :)

Change 987489 merged by jenkins-bot:

[mediawiki/extensions/CodeMirror@master] CodeMirrorModeMediaWiki: add highlighting for <nowiki> and <pre>

https://gerrit.wikimedia.org/r/987489

Change 989232 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/CodeMirror@master] CodeMirrorModeMediaWiki: add support for mediawiki TagModes

https://gerrit.wikimedia.org/r/989232

I'm probably missing something obvious, but it seems like there's special handling for ampersands, e.g. <nowiki>&</nowiki>. This gives an "Unknown highlighting tag undefined" error.

Change 989232 merged by jenkins-bot:

[mediawiki/extensions/CodeMirror@master] CodeMirrorModeMediaWiki: add support for mediawiki TagModes

https://gerrit.wikimedia.org/r/989232

I guess this is due to type coercion from undefined to 'undefined'. Maybe returning an empty string will solve this problem.
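A minimal sketch of the kind of coercion I mean, assuming the style string is assembled by concatenation. The function names and the mw-tag style are made up here; this is not the actual code in the patch.

```javascript
// Hypothetical style builder that shows the suspected bug.
function buildStyle( baseStyle, extraStyle ) {
	// If extraStyle is undefined, concatenation coerces it to the
	// string 'undefined', which then looks like a highlighting tag name.
	return baseStyle + ' ' + extraStyle;
}

// Same builder, falling back to an empty string instead.
function buildStyleSafe( baseStyle, extraStyle ) {
	return ( baseStyle + ' ' + ( extraStyle || '' ) ).trim();
}

console.log( buildStyle( 'mw-tag', undefined ) ); // 'mw-tag undefined'
console.log( buildStyleSafe( 'mw-tag', undefined ) ); // 'mw-tag'
```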

MusikAnimal renamed this task from Implement TagModes and PluginModules for CodeMirror 6 to Implement mediawiki TagModes for CodeMirror 6. (Feb 13 2024, 9:59 PM)
MusikAnimal updated the task description.
MusikAnimal closed this task as Resolved. (Edited Feb 13 2024, 10:03 PM)

I have trimmed down the requirements for this task to cover only TagModes that contain MediaWiki wikitext. Doing the same for other languages is something we still need to figure out, but currently it would seem only MediaWiki-extensions-PhpTags would be affected. That integration is likely to break anyway, as there's currently no legacy stream parser available for PHP.

I think the MVP here is mainly to get CM6 working with wikitext only, and that is now complete. I'll create other tasks for PluginModules and mixed languages. QA efforts can be bundled into T259059. Resolving!

I'm probably missing something obvious, but it seems like there's special handling for ampersands, e.g. <nowiki>&</nowiki>. This gives an "Unknown highlighting tag undefined" error.

Not missing anything at all! Noting for the record that @Bhsd fixed this with r991613.

Mixed languages within wikitext is tracked at T357480, but I'm not considering that a "must" for the CM6 upgrade.