CX2: Highlight (and skip) references with a template that could not be adapted
Open, High, Public

Description

Reference content is often produced by a template that gives structure to the data. Mapping these templates across languages is not always easy, since the templates may differ in each language and may lack templateData (T200314) to map parameters between equivalent templates.

In general, we want to communicate when templates cannot be mapped (T192271), but the particular case of templates in references may need special attention. This case is especially relevant since (a) references are important in Wikipedia, (b) issues adapting them are easy for translators to miss if they don't open the references and check their details, and (c) such issues become complex to fix afterwards.

This ticket proposes to explore ways in which references whose content template is not properly mapped can be (a) highlighted to the user and (b) skipped from the final article if they are not fixed. In this way, users will be aware of the references they need to fill in manually, and the generation of empty references (and the associated cleanup work) will be avoided.

This ticket focuses on cases where the reference template could not be adapted at all, resulting in an empty reference (if the target template could not be found), or a reference with an empty template (if parameters could not be mapped for the target template).

Example of the issue

When translating "Jony Ive" from English to Catalan, the resulting list of references showed the 2nd, 3rd, and 5th references as empty (" .") and the 4th showing a warning message:

The problem in both cases was the same. Content translation generated a reference, and mapped the template to the equivalent one, but could not map the parameters due to the lack of templateData. As a result, an empty template was generated.

For the 4th reference, an empty London Gazette template was created, which includes a warning message in its rendering when mandatory parameters are missing:

The other empty references (2nd, 3rd, and 5th) used the Ref Notícia template, which is the equivalent of the Cite News template in the original English article. The reference created by Content Translation also resulted in a template with no parameters, which in this case just renders as " ." (a space plus a dot).

After manually filling the parameters the list of references looked as shown below, which is the expected result:

Proposed solution

The proposed solution combines the following ideas:

  • Represent the unadapted reference in grey (similar to unadapted links, T193233).
    • If the user adds information to the reference (or the template it contains), it will be rendered normal (blue) and the warning will disappear.
  • Show a warning to communicate the reference content could not be adapted. Similar to T192271, but customising the text for this specific case:

Missing reference
A reference could not be transferred to the translation since it uses a template with a different structure.
Please, edit the reference in the translation to fill the missing information.
[Learn more]

  • "Learn more" will link to the user documentation help page describing how to work with references.
  • "Replace with a new citation" will open the new citation dialog (in the same way as if the user clicked on "Cite"), to allow inserting a new citation that will replace the missing reference.
  • When publishing the translation, the missing references will be skipped if they are still empty or contain only an empty template. That is, for cases where the reference is still marked as missing (grey) and the user has not edited it.
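
The publishing rule in the last bullet can be sketched as a simple filter. This is only an illustration under stated assumptions, not the ContentTranslation implementation: the reference shape (`text`, `template`, `params`, `userEdited`) and the function names are hypothetical.

```javascript
// Hypothetical reference shape, for illustration only:
//   {
//     text: string,            // free text outside any template
//     template: { name: string, params: object } | null,
//     userEdited: boolean      // true once the translator has changed the reference
//   }

// True when none of the template's parameters has a non-blank value.
function isEmptyTemplate(template) {
  return Object.values(template.params).every(
    (value) => String(value ?? "").trim() === ""
  );
}

// A reference is skipped on publish only if the user has not edited it and it
// is either completely empty or holds nothing but an empty template.
function shouldSkipOnPublish(ref) {
  if (ref.userEdited) {
    return false;
  }
  const hasText = ref.text.trim() !== "";
  if (ref.template === null) {
    return !hasText;
  }
  return !hasText && isEmptyTemplate(ref.template);
}

// Keep only the references that should reach the published article.
function referencesToPublish(refs) {
  return refs.filter((ref) => !shouldSkipOnPublish(ref));
}
```

With this rule, a reference that the user filled in manually is always kept, while an untouched reference that is still empty (or contains only an empty template) is dropped.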


For references with partially adapted templates, there is a related ticket: T206310: CX2: Highlight references with a template that is missing mandatory parameters after being adapted

Event Timeline

Pginer-WMF updated the task description. Oct 31 2018, 10:41 AM
Pginer-WMF updated the task description. Nov 6 2018, 12:17 PM
Pginer-WMF updated the task description. Nov 6 2018, 12:33 PM

Change 472416 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/services/cxserver@master] Reference adaptation: If content is template, use its result

https://gerrit.wikimedia.org/r/472416

Change 472925 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/extensions/ContentTranslation@master] WIP: Show adaptation status for references

https://gerrit.wikimedia.org/r/472925

santhosh added a subscriber: Catrope.

Subclassing ve.dm.MWReferenceNode to achieve this feature does not seem to work. The code change in https://gerrit.wikimedia.org/r/472925 gives this error:

jQuery.Deferred exception: Cannot read property 'indexOrder' of undefined TypeError: Cannot read property 'indexOrder' of undefined
    at VeDmInternalList.ve.dm.InternalList.getIndexPosition 
    at Object.ve.dm.MWReferenceNode.static.getIndex (<anonymous>:506:792)
    at Object.ve.dm.MWReferenceNode.static.getIndexLabel (<anonymous>:507:116)
    at VeDmCXReferenceNode.ve.dm.MWReferenceNode.getIndexLabel (<anonymous>:508:592)
    at VeCeCXReferenceNode.ve.ce.MWReferenceNode.update (<anonymous>:516:601)
    at VeCeCXReferenceNode.ve.ce.CXReferenceNode.update (<anonymous>:461:248)
    at VeCeCXReferenceNode.VeCeMWReferenceNode (<anonymous>:515:313)
    at VeCeCXReferenceNode (<anonymous>:461:1)
    at VeCeNodeFactory.oo.Factory.create (<anonymous>:72:718)
    at VeCeNodeFactory.ve.ce.NodeFactory.createFromModel

The 'mwReference' prefix is hardcoded in many places in the Cite extension code for the internal list, while the subclass has a different name, 'cxReference'. Is that the reason for these errors?

Need help from the Cite extension team to resolve this.

+cc @Catrope

Change 472416 merged by jenkins-bot:
[mediawiki/services/cxserver@master] Reference adaptation: If content is template, use its result

https://gerrit.wikimedia.org/r/472416

The patch 472416 submitted for this ticket is causing the following error:

TypeError: Cannot read property 'cx' of undefined
    at isAdapted.every (\cxserver\lib\translationunits\MWReference.js:72:41)
    at Array.every (native)
    at MWReference.<anonymous> (\cxserver\lib\translationunits\MWReference.js:71:34)
    at Generator.next (<anonymous>)
    at resume (\cxserver\lib\util.js:277:21)
    at resumeNext (\cxserver\lib\util.js:287:30)
    at <anonymous>

Found this error while checking T208386. I used the leading paragraph of Hugo Kołłątaj from English to Japanese.
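
The TypeError above comes from a callback passed to `Array.prototype.every` that reads `.cx` from a value that turned out to be undefined. A defensive version of such a check might look like the sketch below. The data shape (`part.data.cx.adapted`) and the function name are assumptions made for illustration; the real structure in cxserver's MWReference.js is not shown in this ticket.

```javascript
// Hypothetical shape: each part MAY carry { data: { cx: { adapted: boolean } } }.
// Optional chaining turns a missing `part`, `data`, or `cx` into `undefined`
// instead of throwing, so incomplete parts simply count as "not adapted".
function allPartsAdapted(parts) {
  return parts.every((part) => part?.data?.cx?.adapted === true);
}

// Note: like Array.prototype.every itself, this returns true for an empty list.
```

Treating missing metadata as "not adapted" is a design choice in this sketch: it errs on the side of showing the reference as needing attention rather than failing the whole request.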

Could not reproduce now with cx master. Can you tell me which MT provider you used?

+cc @Esanders, we need some help subclassing ve.dm.MWReferenceNode for CX to resolve the above-mentioned error.

Stashbot added a subscriber: Stashbot.

Mentioned in SAL (#wikimedia-operations) [2018-11-16T11:08:55Z] <kartik@deploy1001> Started deploy [cxserver/deploy@473b0de]: Update cxserver to b7cdb26 (T208831, T203077, T203160, T206777)

Mentioned in SAL (#wikimedia-operations) [2018-11-16T11:13:21Z] <kartik@deploy1001> Finished deploy [cxserver/deploy@473b0de]: Update cxserver to b7cdb26 (T208831, T203077, T203160, T206777) (duration: 04m 26s)

I have used Yandex. Still reproducible with cxserver master.

I'm not hugely familiar with your subclasses/mixins and how they relate to un-adapted references. Could you be a bit more specific about what the problem is?

The comment above has the error details https://phabricator.wikimedia.org/T203160#4738735 and the subclassing attempt is in patch https://gerrit.wikimedia.org/r/472925

Changing the name seems unnecessary since you are fully overriding the node; you could just keep the name mwReference. That doesn't appear to be the cause of the issue, though. I am still looking into it.

The problem is that you are mixing in CXLintableNode, which itself mixes in OO.EventEmitter. EventEmitter is already mixed in to nodes, so applying it twice leads to unexpected behaviour (probably the second call to the mixin constructor re-initializes the bindings and wipes them).
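
The suspected failure mode can be reproduced in isolation. The sketch below uses a tiny stand-in emitter (`MiniEmitter`), not the real OO.EventEmitter, and hypothetical node classes; it only illustrates how calling a mixin constructor a second time can discard handlers that were already bound.

```javascript
// Stand-in for an event emitter mixin; NOT the real OO.EventEmitter.
function MiniEmitter() {
  // The constructor initializes the bindings store, so calling it a second
  // time on the same object discards every handler registered so far.
  this.bindings = {};
}
MiniEmitter.prototype.on = function (event, handler) {
  (this.bindings[event] = this.bindings[event] || []).push(handler);
};
MiniEmitter.prototype.emit = function (event) {
  (this.bindings[event] || []).forEach((handler) => handler());
};

// Base node: already mixes in the emitter and binds a handler.
function BaseNode() {
  MiniEmitter.call(this);
  this.updates = 0;
  this.on("update", () => {
    this.updates += 1;
  });
}
Object.assign(BaseNode.prototype, MiniEmitter.prototype);

// Hypothetical lintable mixin that redundantly runs the emitter constructor.
function LintableMixin() {
  MiniEmitter.call(this); // second initialization wipes BaseNode's binding
}

// Subclass that applies the redundant mixin after the base constructor.
function BrokenNode() {
  BaseNode.call(this);
  LintableMixin.call(this);
}
BrokenNode.prototype = Object.create(BaseNode.prototype);

// Subclass that relies on the emitter already provided by the base node.
function FixedNode() {
  BaseNode.call(this); // emitter initialized exactly once
}
FixedNode.prototype = Object.create(BaseNode.prototype);

const broken = new BrokenNode();
broken.emit("update");
const fixed = new FixedNode();
fixed.emit("update");
console.log(broken.updates, fixed.updates); // 0 1
```

In the broken variant the "update" handler never fires because its binding was reset, which matches the kind of missing-state error seen in the stack trace above.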

Change 479022 had a related patch set uploaded (by Esanders; owner: Esanders):
[mediawiki/extensions/ContentTranslation@master] Don't double-mixin EventEmitter to LintableNodes

https://gerrit.wikimedia.org/r/479022

Change 479022 merged by jenkins-bot:
[mediawiki/extensions/ContentTranslation@master] Don't double-mixin EventEmitter to LintableNodes

https://gerrit.wikimedia.org/r/479022

Restricted Application added a subscriber: Liuxinyu970226. Apr 2 2019, 11:25 AM

There is a patch associated with this ticket that is still in review.

santhosh removed santhosh as the assignee of this task. Apr 17 2019, 8:27 AM

Unassigning myself since @Petar.petkovic self-assigned the patch.

OK. So I guess it is up to @Petar.petkovic to either merge the current patchset (which covers only part of the spec) and create a new patch for the rest, or extend the current one.

Change 472925 had a related patch set uploaded (by Petar.petkovic; owner: Santhosh):
[mediawiki/extensions/ContentTranslation@master] Show adaptation status for references

https://gerrit.wikimedia.org/r/472925

This is a misunderstanding. I assigned myself to mark myself as the main reviewer of this patch. When we agreed to have a rotating code reviewer, I thought we established the practice of assigning the patch to the person in charge of doing reviews.

I don't plan to work on this ticket nor on T206310, but I can complete that one single patch if needed.

The issue was mentioned in this comment.

Change 472925 merged by jenkins-bot:
[mediawiki/extensions/ContentTranslation@master] Show adaptation status for references

https://gerrit.wikimedia.org/r/472925

Translating "List of books considered the worst" from English to French, a grey reference is shown for the second bullet point ("the social war"). The grey reference suggests that content translation was not able to find the template for French, but inspecting it shows that the template was actually found, and several parameters (including all the mandatory ones) were actually filled:

This sends a confusing message to the user since the reference was properly adapted but we are presenting it as if it was not.

Moving the ticket back to in progress.

There seem to be some issues identifying and distinguishing the cases of templates not being found vs. templates being found but not filled. I created a separate ticket (T224437: Incomplete template adaptation presented as missing), but this may be relevant for the current one too.

Petar.petkovic removed Petar.petkovic as the assignee of this task. May 28 2019, 12:38 AM

My understanding is that for this ticket, the missing parts are:

  • If the user adds information to the reference (or the template it contains), it will be rendered normal (blue) and the warning will disappear.
  • When publishing the translation, the missing references will be skipped if they are still empty or contain only an empty template. That is, for cases where the reference is still marked as missing (grey) and the user has not edited it.

There are also a couple of issues in a similar area but those are defined in separate tickets:

  • T225716 Don't mark in grey those that were well adapted (i.e., avoid false positives).
  • T224437 Show the correct warning depending on whether the template was incomplete or missing.

Change 522010 had a related patch set uploaded (by Petar.petkovic; owner: Petar.petkovic):
[mediawiki/extensions/ContentTranslation@master] Skip empty references when publishing translation

https://gerrit.wikimedia.org/r/522010

518922 deals with the first point, whereas 522010 covers the second. After those two are completed, I suggest opening new tickets for any loose ends.

Change 522010 merged by jenkins-bot:
[mediawiki/extensions/ContentTranslation@master] Skip empty references when publishing translation

https://gerrit.wikimedia.org/r/522010

Moving this to QA to check the parts from T203160#5267819, but checking the full spec would also be good.

Jpita added a subscriber: Jpita.

References on wikis often use templates. When they are translated in Content Translation, several things can lead to an unadapted or empty reference after translating a paragraph. The description of this ticket gives a good picture of how templates are used for references and what should happen when those templates are not well adapted. I suggest reading the description once again, perhaps ignoring the example in the "Example of the issue" section, since it is likely outdated.

Let me sum up the state of the requirements for this ticket. When a reference template is not adapted or is empty after adaptation, the following should happen:

  • The reference should turn grey. The mockup and the title also suggest that some highlight (in yellow) should attract the user's attention, but we dropped that.
  • A warning with the title "Missing reference" should be registered in the issue system.
    • The action button "Replace with a new citation" has not been developed.
  • Empty/unadapted references need to be skipped when publishing an article.

I would say all three points need careful testing. Here are some of the test cases that I propose including in tests:

  • The template used inside the reference in the source article is not available on the target-language wiki
    • This is the easiest case to find. For example, many wikis don't have Cite podcast, and we can translate the second paragraph of en:Boltzmann brain to French to see this in action.
  • No parameters are successfully mapped
    • For this, I would look at the Cite journal template on Esperanto Wikipedia. Many of its parameters use non-English names, and we need to find a usage where the "url" and "doi" params are not specified. I suggest the article about the year 2014, "Filmoj" section, with English as the target language. The second reference in that paragraph fits the description.
  • Some parameters are mapped to the target-language reference template, but all are empty
    • We again look at the Esperanto Cite journal, but this time use Esperanto as the target language. I found Cite Journal in en:Vaudeville, in the first paragraph under the "Immigrant America" section. That usage does not specify "url", but specifies "doi", which is empty in the source.
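
The three test cases above correspond to three distinct adaptation outcomes. A minimal sketch of how they could be told apart is shown below; the input shape (`targetTemplate`, `mappedParams`), the status names, and the sample values (including the placeholder DOI) are all hypothetical and not taken from the ContentTranslation or cxserver code.

```javascript
// Hypothetical input shape:
//   targetTemplate: string | null  (null when no equivalent exists on the target wiki)
//   mappedParams: object           (target parameter name -> adapted value)
function classifyAdaptation(ref) {
  if (ref.targetTemplate === null) {
    return "template-missing";
  }
  const values = Object.values(ref.mappedParams);
  if (values.length === 0) {
    return "no-params-mapped";
  }
  const allEmpty = values.every((value) => String(value ?? "").trim() === "");
  return allEmpty ? "mapped-but-empty" : "adapted";
}

// One made-up sample per test case, plus a well-adapted reference for contrast.
const samples = [
  { targetTemplate: null, mappedParams: {} },
  { targetTemplate: "Cite journal", mappedParams: {} },
  { targetTemplate: "Cite journal", mappedParams: { doi: "" } },
  { targetTemplate: "Cite journal", mappedParams: { doi: "10.1000/example" } },
];
const statuses = samples.map(classifyAdaptation);
console.log(statuses);
```

Under the requirements summarized earlier, the first three statuses would be shown in grey with a "Missing reference" warning and skipped on publish if left untouched; only the last one counts as adapted.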

All requirements should be carefully tested with more examples. Also, keep in mind that references used in multiple places might not work perfectly, see T203772.