Page MenuHomePhabricator

Attempting to substitute #var_final results in strip markers
Closed, ResolvedPublicBUG REPORT

Description

Per title. Previewing or saving a page with a substituted #var_final results in strip markers like ?'"UNIQ--finalizedvar-0--QINU`"'?` being emitted; see https://yugipedia.com/wiki/User:Dinoguy1000/sandbox/var for test cases.

Event Timeline

MGChecker added subscribers: thiemowmde, MGChecker.

Well, this does not really surprise me, as substitution does not really like this extension. (T201857) I guess this won't be easy to fix, but I believe the reason why this is happening is quite simple: While ExtVariables->reqestFinanlizedVar is called without any problems and the strip markes inserted, the InternalParseBeforeSanitize function is not called in the subst parser run, so ExtVariables->reqestFinanlizedVar is never called.

Because Parser->internalParse is only called for OT_HTML, while substitutions are done for OT_WIKI. The latter seems to go like this:

  1. Entry point is Parser->preSaveTransform
  2. Some initialization
  3. Parser->pstPass2 is called. I assume $options->getPreSaveTransform() is true here for normal parsing, as this is the default.
    1. substitutions are done by Parser->replace Variables and the subst values inserted. As this works, I did not look further here.
  4. In Parser->internalParse, at a similar place the InternalParseBeforeSanitize hook is called. There is no hook here that could be used for a similar purpose.
  5. The general (and the nowiki) unstrip of Parser->mStripState happens, but this does not replace the Variables strip markers as they use their own strip state.
  6. As after this run, there still is the normal parser run, no sanitizing seems to be necessary here.

There are two possible ways out of this, I imagine. Either we add a hook into core to mirror InternalParseBeforeSanitize, PreSaveTransformBeforeUnstrip or something like that. Or we abandon the FinalizedVarStripState altogether and are using the default Parser StripState instead (as everyone else does, not a single other extension instantiates their own StripState), which might be possible as well and looks cleaner than the current approach anyway. However, I am unsure if this will work, I would have to test it.

Change 504488 had a related patch set uploaded (by MGChecker; owner: MGChecker):
[mediawiki/extensions/Variables@master] WIP: Use default StripState

https://gerrit.wikimedia.org/r/504488

As the documentation in insertFinalizedVars suggested, wikitext is not parsed properly if we use the default strip process. I don't think we can fix this, as even parsing the wikitext ourselves, which might be possible, will probably yield further ugly issues. However, the solution I implemented here works just fine for subst, as this is done before wikitext is parsed.

We could switch the behavior #var_final depending on Parser->mOutputType, but I consider this quite ugly. However, I have no better idea at the moment (besides maybe just accepting that subst does not work).

What sorts of changes would be required to StripState to get it to work? Would these be incompatible with current behaviors? If not, it might be worth instead trying to get a patch merged for it, since then potentially other uses could benefit as well. (If we investigate that route at all, it would probably be worth adding whatever Phab project(s) covers StripState as tags on this task, to get input from people who are more likely to be familiar with it.)

Unstripping using mStripState is done after wikitext is parsed to HTML, but after parser functions and templates are evaluated (first, signatures and substs are parsed to the actual wikitext to evaluate, then the preprocessor inserts the wikitext produced by the templates, than this wikitext is translated to HTML).

For using var_final, we require a unstripping process before parsing wikitext happens, but after templates are evaluated, and this does not exist in Core as far as I know. The only things happening between the preprocessing and the actual HTML parsing are sanitizing and two hooks before and after it (InternalParseBeforeSanitize and InternalParseBeforeLinks), so there isn't much that can be done different there.

If you are aware of any extension that does a similar thing with parser functions, which are inserted in place (References, for example, are not inserted in place, and are using tags), this would be helpful, but I doubt it is possible to do this without handling subst and other uses separately.

MGChecker changed the subtype of this task from "Task" to "Bug Report".Jun 5 2020, 11:44 PM

Change 504488 merged by jenkins-bot:

[mediawiki/extensions/Variables@master] Use the default StripState for '#var_final'

https://gerrit.wikimedia.org/r/504488