Page MenuHomePhabricator

Parsoid: DOM for template objects should expose template name
Closed, ResolvedPublic

Description

{{foo}} -> Template:Foo
{{:foo}} -> foo
{{:user:Foo}} -> User:Krinkle
{{Template:foo}} -> Template:Foo
etc.

VisualEditor needs the name of the page that ends up being transcluded by the template invocation.

From conversation with Gabriel I gather that currently the entire tranclusion system (at least in production) is deferred to the PHP parser (including the invocation itself). Which means Parsoid is unable to determine it itself right now.

Gabriel suggested Parsoid will provide the name the best it can for simple cases that don't dynamically construct the page name (e.g. not something like:
{{ {{getTemplateName|x=foo}} | bar }}
).


Version: unspecified
Severity: normal

Details

Reference
bz48663

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 22 2014, 1:20 AM
bzimport added a project: Parsoid.
bzimport set Reference to bz48663.

Related URL: https://gerrit.wikimedia.org/r/64651 (Gerrit Change Ie0114a81eead86d7a3b3e3a7a5b10d25c457b524)

Related URL: https://gerrit.wikimedia.org/r/64987 (Gerrit Change I3145a4168a2347cecc573dcc07bb4128962a5b11)

https://gerrit.wikimedia.org/r/64987 (Gerrit Change I3145a4168a2347cecc573dcc07bb4128962a5b11) | change APPROVED and MERGED [by jenkins-bot]

We are exposing an 'url' format for static targets now. In a follow-up patch we plan to make that editable, and remove the 'wt' value for static targets. For templated targets we will initially still return 'wt', but in the longer term we will expose either 'url' for static targets or 'html' for templated ones.

(In reply to comment #4)

We are exposing an 'url' format for static targets now. In a follow-up patch
we
plan to make that editable, and remove the 'wt' value for static targets. For
templated targets we will initially still return 'wt', but in the longer term
we will expose either 'url' for static targets or 'html' for templated ones.

So while discussing WikiData integration today, we noticed that parser functions aren't recognized very well. {{#property:capital|of=Germany}} returns {'target':{'wt':'#property:captial'}, 'params':{'of':'Germany'}}

Will this by improved in this URL change you're talking about?

This looks sane to me, and won't change much with the url patch. Transclusions are all treated the same in Parsoid. Would you like us to split on the colon and provide a special-case 0th parameter if something looks like a parser function?

We were just looking at the wiki config to see if we can recognize something as a parser function (instead of a template) and ran into trouble:

  • Looks like parser functions usage comes in 2 flavours: (a) with the #-prefix: ex: {{#if ..}}, etc. (b) without the #-prefix: ex: {{lc:...}}
  • config.functionhooks provides a list of parser functions, but that list doesn't distinguish between (a) and (b) above. So, the functionhooks list has an entry for "lc" and an entry of "if" (not #if).

So, ostensibly we could have tested if (a) the name starts with a # in which case it is a parser function; OR (b) the name (lc/if/padleft) is present in config.functionhooks.

But (b) gets us into trouble because you also have templates like "if". So, {{if...}} ought to be treated as a template and not a parser function.

We could then use an additional check to see if there is a ":" in the name to distinguish between {{lc:foo}} and {{lc|foo}}.

This could work, but seems a bit hacky and maybe fragile wrt edge cases. If knowing that something is a parser function is essential for VE, then we could implement this check and maybe add a flag in data-mw or add some other way of marking up a parser function. data-mw.target.isPF = true possibly?

(In reply to comment #7)

We were just looking at the wiki config to see if we can recognize something
as
a parser function (instead of a template) and ran into trouble:

  • Looks like parser functions usage comes in 2 flavours: (a) with the

#-prefix:
ex: {{#if ..}}, etc. (b) without the #-prefix: ex: {{lc:...}}

  • config.functionhooks provides a list of parser functions, but that list

doesn't distinguish between (a) and (b) above. So, the functionhooks list
has
an entry for "lc" and an entry of "if" (not #if).

This is controlled by the SFH_NO_HASH flag in CoreParserFunctions.php. Is this flag not exposed in config.functionhooks?

If
knowing that something is a parser function is essential for VE, then we
could
implement this check and maybe add a flag in data-mw or add some other way of
marking up a parser function. data-mw.target.isPF = true possibly?

It's not essential for July, but it's required in order to be able to support Wikidata editing. For that purpose, we'd like for parserfunctions to be marked as something like typeof="mw:Object/ParserFunction/#if" , as opposed as a template with a special data-mw bit (we could deal with the latter too, but the former would be much more convenient).

https://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&format=jsonfm&siprop=functionhooks doesn't expose the flag.

We were thinking of replacing mw:Object/Template with mw:Object/Transclusion to make it clear that the object being annotated is a transclusion. We want to retain a Transclusion type so all transclusions can be recognized for those cases where you dont need to distinguish between types. But, I suppose we could make this: "mw:Object/Transclusion/Template" or "mw:Object/Tranclusion/ParserFunction". Right now, the target is not part of this, since this is a type but we are being inconsistent since extensions provide the target name as part of the typeof. So, we should probably make all those typeofs consistent in some way.

Since this is not necessary for July, let us take a bit more time to work through these (and any other) details rather than make a quick fix.

Given that transclusion targets can be dynamic ({{ {{echo|lc}}:FOO}} and such), it wont be possible to always provide a statically determined tpl/pf target. In the general case, the user shouldn't be editing parameters for these dynamic cases without knowing that the tpl. target is specific to the current editing session and arbitrary editing of the parameters might break the transclusion in other scenarios (day-of-week or month-of-year based selection of tpl targets).

So, if we cannot always provide the template or parser function target, it seems we shouldn't add that to the typeof field and leave the typeof as: "mw:Object/Transclusion/Template" or "mw:Object/Transclusion/ParserFunction" and rely on the data-mw.target json property to communicate info about the target (fully resolved url, if static OR html with transclusion wrappers, if templated).

Related URL: https://gerrit.wikimedia.org/r/65622 (Gerrit Change I3b2fbf3b533b14680d2b881ac9fb43218d4f023a)

Latest proposal that is now pending review and merge (based on discussions on IRC).

All transclusions now have "mw:Transclusion" typeof.

data-mw.target.href will point to the template source page, if the template target is known statically. Will not be set set if target is templated.

data-mw.target.function will provide the name of the parser-function/magic-word, if the function target is known statically. Will not be set otherwise.

https://gerrit.wikimedia.org/r/65622 (Gerrit Change I3b2fbf3b533b14680d2b881ac9fb43218d4f023a) | change APPROVED and MERGED [by jenkins-bot]