Port MediaWiki mode to CodeMirror 6 stream-parser
Closed, ResolvedPublic

Description

Background

In CodeMirror 6, the "proper" way to introduce syntax highlighting for a language is to write a language pack. Several people have investigated this and concluded that it would be a daunting challenge, so we have instead settled on porting our existing MediaWiki mode to CodeMirror 6's stream-parser interface. Efforts to build a custom parser for MediaWiki should be tracked in a separate task.
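
For readers unfamiliar with the stream-parser approach: a stream parser is essentially an object with a token(stream, state) method that consumes some characters from the current line and returns a style name for them. The following is a minimal, library-free sketch of that shape. MiniStream, headingParser, tokenize, and the style names are hypothetical stand-ins written for illustration; they are not the real CodeMirror stream class or the actual MediaWiki mode.

```typescript
// Hypothetical stand-in for a stream over a single line of text.
class MiniStream {
  pos = 0;
  start = 0;
  line: string;
  constructor(line: string) { this.line = line; }
  sol(): boolean { return this.pos === 0; }
  eol(): boolean { return this.pos >= this.line.length; }
  // Consume a regex match anchored at the current position, if there is one.
  match(re: RegExp): string | null {
    const m = re.exec(this.line.slice(this.pos));
    if (!m || m.index !== 0) return null;
    this.pos += m[0].length;
    return m[0];
  }
  next(): string | undefined {
    return this.pos < this.line.length ? this.line.charAt(this.pos++) : undefined;
  }
  skipToEnd(): void { this.pos = this.line.length; }
  current(): string { return this.line.slice(this.start, this.pos); }
}

interface HeadingState { inHeading: boolean; }

// Illustrative parser that only recognises "== Heading ==" lines.
const headingParser = {
  startState(): HeadingState { return { inHeading: false }; },
  token(stream: MiniStream, state: HeadingState): string | null {
    if (stream.sol()) {
      state.inHeading = false;
      if (stream.match(/^=+/)) { state.inHeading = true; return "heading-mark"; }
    }
    if (state.inHeading) {
      // A run of "=" that reaches the end of the line closes the heading.
      if (stream.match(/^=+\s*$/)) { state.inHeading = false; return "heading-mark"; }
      stream.next();
      stream.match(/^[^=]*/);
      return "heading-text";
    }
    stream.skipToEnd();
    return null; // unstyled text
  },
};

// Drive the parser over one line and collect [text, style] pairs.
function tokenize(line: string): Array<[string, string | null]> {
  const stream = new MiniStream(line);
  const state = headingParser.startState();
  const out: Array<[string, string | null]> = [];
  while (!stream.eol()) {
    stream.start = stream.pos;
    const style = headingParser.token(stream, state);
    out.push([stream.current(), style]);
  }
  return out;
}
```

With this sketch, tokenize("== Title ==") yields three tokens: "==" as heading-mark, " Title " as heading-text, and "==" as heading-mark. Each span carries only the style returned for it, and no syntax tree is built along the way.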

Acceptance criteria

  • With $wgCodeMirrorV6 set to true (following T317243), or with cm6enable=1 passed in the URL query string, we should now have actual syntax highlighting
  • All syntax highlighting should work identically to what's currently in prod, with the exception of extensions (e.g. <ref>, <phonos>) (T348684)
  • Add bracket matching to keep feature parity with CodeMirror 5

Event Timeline

Noting that we received interest from a non-WMF engineer about writing a CM6 language package for MediaWiki. I'm waiting to hear back, but when someone picks this task up, they should probably focus on porting the stream parser (much easier!) for the time being.

I'm going to take a stab at porting the existing stream parser. Any effort to build a language pack can proceed at any time (also happy to create a separate task, if desired). The language pack presumably will take a while to perfect.

I've looked deeply into creating a language pack for wikitext and have some conclusions. A Lezer grammar is impossible to write for wikitext, which makes it necessary to write a completely custom parser (as is done for Markdown). This is a lot of work and unfortunately cannot be done alone. I recommend focusing fully on the stream-parser approach for now.
From my observations, the downside of using a stream parser is nested styling (maybe not an issue here). For example, at Fandom we had headings whose equal signs were blue and whose text had a bigger font size; I couldn't get the bigger font size without the text between the equal signs also turning blue instead of white.

I have also looked into Lezer since CM6 became stable, and I completely agree with MrVanosh's comment on the difficulty of writing a language pack.

Thanks for investigating! A pity that we'd have to write our own parser if we want a proper language pack, but I'm still interested in that effort in the long-term. For now, we will indeed be porting our existing stream parser, which I'm nearly done with.

The situation you describe with nested styling is one of the downsides for us too, but we have various hacks in place to make it work a bit better, and more can be added if there's enough demand for it. I believe our users are mostly content with the status quo, so it's not terrible that we stick with the same. However, without a structured syntax tree, we may be limited in which new CodeMirror 6 features we can take advantage of, such as code folding and autocompletion (but I could be wrong).

I'm going to rewrite this task to focus on just porting the stream parser. Writing a new language pack can be made into a separate ticket, once we are committed to that effort.

MusikAnimal renamed this task from Port MediaWiki mode to CodeMirror 6 stream-parser, or build new MediaWiki language pack to Port MediaWiki mode to CodeMirror 6 stream-parser. Nov 6 2023, 8:49 PM
MusikAnimal updated the task description.

Change 972438 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/CodeMirror@master] [WIP] Implement MediaWiki stream parser for CodeMirror 6

https://gerrit.wikimedia.org/r/972438

Okay, allllmost done with this! I have discovered there is one CodeMirror 5 feature that we may have to do away with in the new version. That is, the changing of the font size and line height of level 1, 2 and 3 section headings, as seen here:

Screenshot from 2023-11-20 20-06-47.png (244×291 px, 13 KB)

I've been trying hard to reverse-engineer the old stream parser code, and I see nothing that should make these styles apply to the whole line and not just the individual tokens. There's some magic somewhere that converts .line- CSS classes into full line styles. This is normally done with doc.addLineClass() but I see we aren't using that method.

Anyway, in CodeMirror 6, we need to use Decorations and apply them to the EditorView, which isn't available within the scope of the stream parser code. This means that, while possible, it's very difficult to replicate the same behaviour in CodeMirror 6.
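
To make the gap concrete, here is a rough, library-free sketch of the information a line-decoration approach would need: the start offset and level of each heading line, computed from the whole document rather than from inside the tokenizer. findHeadingLines, HeadingLine, and the heading regex are assumptions made for illustration; nothing here is taken from the patch. In a real extension, each from offset could presumably be handed to CodeMirror 6's Decoration.line(...) from code that has access to the view, which is exactly the part the stream parser cannot reach.

```typescript
interface HeadingLine {
  from: number;  // offset of the first character of the line in the document
  level: number; // number of leading "=" characters
}

// Scan the document line by line and report heading lines up to maxLevel.
// The default of 3 mirrors the level 1, 2 and 3 headings mentioned above.
function findHeadingLines(doc: string, maxLevel: number = 3): HeadingLine[] {
  // Simplified heading pattern: the same run of "=" on both sides of some text.
  const headingRe = /^(={1,6})(.+?)\1\s*$/;
  const result: HeadingLine[] = [];
  let offset = 0;
  for (const line of doc.split("\n")) {
    const m = headingRe.exec(line);
    if (m && m[1].length <= maxLevel) {
      result.push({ from: offset, level: m[1].length });
    }
    offset += line.length + 1; // +1 for the newline removed by split
  }
  return result;
}
```

For the document "= One =\ntext\n== Two ==\n==== Four ====", this returns the lines starting at offsets 0 (level 1) and 13 (level 2); the level 4 heading is skipped because of the maxLevel default.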

I'm hoping it's okay that we do without the styling, like so:

Screenshot from 2023-11-20 20-06-52.png (244×291 px, 11 KB)

I'm not even sure the section heading styling is widely desired; to me personally, it's only a tiny bit beneficial when scanning the wikitext. It's also worth noting the 2017 wikitext editor doesn't show this styling either, so at least some users are used to section headings having the same font size and line height as everything else.

EDIT: Now tracked at T351686

Change 980541 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/CodeMirror@master] CodeMirrorWikiEditor: add bracketMatching as default extension

https://gerrit.wikimedia.org/r/980541

This is now ready for review! I tried to break this up into as many smaller patches as possible. Note there are known issues which I hope to tackle separately (and I may need some help!):

Change 981420 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/CodeMirror@master] CodeMirrorModeMediaWiki: don't detect additional { as template name

https://gerrit.wikimedia.org/r/981420

Change 980541 merged by jenkins-bot:

[mediawiki/extensions/CodeMirror@master] CodeMirrorWikiEditor: add bracketMatching as default extension

https://gerrit.wikimedia.org/r/980541

Change 987271 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/CodeMirror@master] mode.mediawiki: rename mnemonic to html-entity and deprecate variants

https://gerrit.wikimedia.org/r/987271

Change 972438 merged by jenkins-bot:

[mediawiki/extensions/CodeMirror@master] Implement core MediaWiki stream parser for CodeMirror 6

https://gerrit.wikimedia.org/r/972438

Change 981420 merged by jenkins-bot:

[mediawiki/extensions/CodeMirror@master] CodeMirrorModeMediaWiki: don't detect additional { as template name

https://gerrit.wikimedia.org/r/981420

Change 987271 merged by jenkins-bot:

[mediawiki/extensions/CodeMirror@master] mode.mediawiki: rename mnemonic to html-entity and deprecate variants

https://gerrit.wikimedia.org/r/987271

This is done. QA efforts can be consolidated into T259059 and/or new tasks can be created for any outstanding issues.

Change 1007029 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/CodeMirror@master] CodeMirrorModeMediaWikiConfig: add missing tokens for nested templates

https://gerrit.wikimedia.org/r/1007029

Change 1007029 merged by jenkins-bot:

[mediawiki/extensions/CodeMirror@master] CodeMirrorModeMediaWikiConfig: add missing tokens for nested templates

https://gerrit.wikimedia.org/r/1007029

Change #1014700 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/CodeMirror@master] CodeMirrorModeMediaWikiConfig: add missing TagStyle for cm-mw-link

https://gerrit.wikimedia.org/r/1014700

Change #1014700 merged by jenkins-bot:

[mediawiki/extensions/CodeMirror@master] CodeMirrorModeMediaWikiConfig: add missing TagStyle for cm-mw-link

https://gerrit.wikimedia.org/r/1014700