Adapt Content translation's editor to support section translation on desktop
Open, High, Public

Description

As part of Section Translation (T243495), we want to support expanding existing articles by translating new sections. On desktop we want to reuse the translation editor that Content translation provides. In this way, users can translate a specific section on desktop with a familiar tool that takes advantage of the available space.

Currently, the Content translation editor loads and publishes a complete article. This ticket proposes extending its capabilities so that it can work with a single section instead. For example, based on a URL parameter it should be possible to load the History section of the Ukulele article.

Details

Expanding Content translation editor with a "section" mode requires some considerations (covered in separate sub-tasks):

  • T311614 Loading a single section. Load contents for a single section in a way that users can add them to the translation and edit them normally. More efficient loading may require support from the Parsing team and can be considered separately (T237614).
  • T311635 Adjust the translation title. For this case both the article title (non-editable) and the section title (editable in the translation) will be shown.
  • T311997 Publishing behaviour. Publishing will add the new section to the target article at the end of the document. This will be refined in follow-up tickets (adjusting the action and messaging to the circumstances).
    • Content published through section translation will include the "sectiontranslation" tag in addition to the usual "contenttranslation" one.
  • Publish settings. We may need to initially remove the option to customize the target namespace when translating sections.
  • Access through the URL. As a first step, the section mode will be accessible through a URL parameter. Once the overall workflow for section translation is specified, the UI supporting other steps (e.g., letting the user pick a section to translate) will connect to the current step without the need for manually creating a URL.
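
As a rough illustration of the URL-based access described in the list above, the following minimal sketch decides between full-article and section mode from the query string. The parameter names (page, from, to, section), the example URL and the TranslationRequest type are assumptions made for illustration; they are not confirmed by this task.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, urlparse


@dataclass
class TranslationRequest:
    """Hypothetical container for the parameters read from the URL."""
    source_title: str
    source_language: str
    target_language: str
    section: Optional[str]  # None means the classic full-article mode

    @property
    def is_section_mode(self) -> bool:
        return self.section is not None


def parse_translation_url(url: str) -> TranslationRequest:
    """Read the (assumed) translation parameters from a URL's query string."""
    query = parse_qs(urlparse(url).query)

    def first(name: str) -> Optional[str]:
        values = query.get(name)
        return values[0] if values else None

    page, source, target = first("page"), first("from"), first("to")
    if not (page and source and target):
        raise ValueError("page, from and to are required")
    # The presence of "section" is what switches the editor to section mode.
    return TranslationRequest(page, source, target, first("section"))
```

With this shape, a URL ending in `?page=Ukulele&from=en&to=tl&section=History` would select section mode, and the same URL without `section` would fall back to the full-article workflow.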

Apart from the differences noted, the translation workflow should work in the same way it does when a full article is translated.
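
The initial publishing behaviour from the list above (append the new section at the end, and tag the edit with both tags) could be sketched as follows. This is a simplified illustration working on plain wikitext; the function names are hypothetical, and it ignores details such as categories that usually sit at the end of a page.

```python
from typing import List, Tuple

# Tag names as given in the task description.
SECTION_TRANSLATION_TAGS = ["contenttranslation", "sectiontranslation"]


def append_section(article_wikitext: str, section_title: str, section_body: str) -> str:
    """Return the article wikitext with a new level-2 section added at the end."""
    existing = article_wikitext.rstrip("\n")
    new_section = f"== {section_title.strip()} ==\n{section_body.strip()}\n"
    if not existing:
        # Nothing to expand: the new section is the whole page.
        return new_section
    return f"{existing}\n\n{new_section}"


def build_publish_request(
    article_wikitext: str, section_title: str, section_body: str
) -> Tuple[str, List[str]]:
    """Return the new page text and the edit tags to apply (hypothetical helper)."""
    text = append_section(article_wikitext, section_title, section_body)
    return text, list(SECTION_TRANSLATION_TAGS)
```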

CX-translation-view.png (1×1 px, 170 KB)


Since some limitations of the current database schema may apply, it may be good to keep the following ticket in mind: T192065 (Starting a new translation is loading contents from a previously deleted one).

Event Timeline

Looks mostly OK.

Just a few things:

One:

Publishing will add the new section to the target article at the end of the document. This will be refined in follow-up tickets (adjusting the action and messaging to the circumstances).

Does this mean that there is a plan that in the future there will be a way to publish somewhere other than the end of the article?

Yes, but there are many possibilities. We need to observe users to learn whether the default placement is right most of the time. Depending on how often users need to change the default position, we can provide a more or less prominent way to change it. Some possibilities:

  • Users mostly translate sections in the order they expect them to appear. Providing a follow-up option to move the section after it is published may be enough to correct the few cases where the default is not ideal.
  • Users mostly need a section to go in a different, specific place. Then an additional step to select the destination placement may be convenient.

So for the first iteration, I think it makes sense to start with a basic default.

Two:
In the current image, the name of the section has the same appearance as the name of the article in the full-article mode. The name of the article is a new element, shown in a small font at the top. I suspect that this may be confusing. It makes more sense to me to show the article name and the section name identically to how they are shown in the full-article mode.

Good point. Here I was trying to emphasize the main element the user is working on (the section), keeping the article as a secondary contextual element. However, it is true that this contradicts the usual document hierarchy, and we need to check how much distraction/confusion this may generate. In any case, I think that both pieces of information are needed and adjusting the style in one direction or another does not seem to be a blocker for the technical exploration.

Three:
It's not exactly about this visual design, but generally about section translation: How will section translation actions be counted in CXStats? This feature will need some metrics.

I'd expect section translation to be reflected as an edit with a special tag ("section-translation"). The metrics for the current fiscal year (T226171) are designed to allow obtaining the number of articles translated (with the current Content translation workflow), sections translated (with the future section translation workflow), or both.

Note that with this approach the articles translated with the classic Content translation workflow may contain several sections, but those are not counted as independent section translations.
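
To make the counting rule concrete, here is a minimal sketch of how tagged edits could be split into article and section translations. It assumes each edit is represented only by its set of tags and uses the tag names from the task description ("contenttranslation" and "sectiontranslation"); the real metrics pipeline is not described in this task.

```python
from typing import Dict, Iterable, Set


def count_translations(edits: Iterable[Set[str]]) -> Dict[str, int]:
    """Count article and section translations from the tags of each edit."""
    counts = {"articles": 0, "sections": 0}
    for tags in edits:
        if "sectiontranslation" in tags:
            # Section translations carry both tags, so check this one first.
            counts["sections"] += 1
        elif "contenttranslation" in tags:
            # Classic workflow: counted once per article, however many
            # sections the translated article contains.
            counts["articles"] += 1
    counts["total"] = counts["articles"] + counts["sections"]
    return counts
```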

This topic was discussed in detail and the current understanding is given below:

  1. The section translation workflow within CX, from starting to publishing, will be a minimal CX both in terms of technology and user experience. CX will have a newly defined mode; we can name it properly later, for now "minimal".
  2. Minimal mode will be used for section translation. It will NOT have the following features:
    • Auto save or any kind of saving. Start, edit and publish in one go. No entries are made in the CX central database.
    • Category display and actions to add or remove categories
    • Progress calculation
    • Translation-progress-based validation or abuse filter checks. Hence no error cards at all
    • Namespace selection, since the article exists already
  3. CX Dashboard does not show any ongoing section translation or any statistics. No changes there
  4. CX Stats does not show anything about section translation
  5. Published translation can have an edit tag

Change 547708 had a related patch set uploaded (by Petar.petkovic; owner: Petar.petkovic):
[mediawiki/services/cxserver@master] Section translation test

https://gerrit.wikimedia.org/r/547708

Change 547709 had a related patch set uploaded (by Petar.petkovic; owner: Petar.petkovic):
[mediawiki/extensions/ContentTranslation@master] Section translation

https://gerrit.wikimedia.org/r/547709

@Petar.petkovic Can you clarify why we need to fetch a single section from cxserver? Don't we need to fetch the full article, as I explained in the comment above?

  1. Even though we want to show only one section of the source article to users, in the background we need the full article so that we can resolve inter-content references. There are two ways to present it:
    • Show the full source article and highlight only the selected section. For the target article, either (a) show the full existing article or (b) show just a placeholder for the selected section. Option (a) has the problem of aligning target content against source content, which is not easy to resolve (maybe it can be solved if the target article is not aligned at all). Pros: seeing the whole context of source and target will help translators and may even allow them to select where the published section goes.
    • Hide everything except the selected section. Cons: translators miss the context of the article.
  2. Since we need to load the full source article, there is no need for any new API in cxserver.

Change 548589 had a related patch set uploaded (by Petar.petkovic; owner: Petar.petkovic):
[mediawiki/services/cxserver@master] Add classes to disctinct sections in an article

https://gerrit.wikimedia.org/r/548589

Change 548590 had a related patch set uploaded (by Petar.petkovic; owner: Petar.petkovic):
[mediawiki/extensions/ContentTranslation@master] Allow to load only a single section

https://gerrit.wikimedia.org/r/548590

Change 547708 abandoned by Petar.petkovic:
Section translation test

Reason:
Different approach taken

https://gerrit.wikimedia.org/r/547708

Change 547709 abandoned by Petar.petkovic:
Section translation

Reason:
Different approach taken

https://gerrit.wikimedia.org/r/547709

@Petar.petkovic Can you clarify why we need to fetch a single section from cxserver? Don't we need to fetch the full article, as I explained in the comment above?

We don't necessarily need to load the full article. There is still no REST API that returns only one section using Parsoid, and maybe their team had a valid reason not to include such an option.
There is a simpler version catered to mobile devices, which I wanted to try out. Loading the whole article seems lazy to me. There are some challenges with cross-section content, like named references, but they can be expanded to their full definition for that one section.
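
As an illustration of expanding named references for a single section, here is a minimal sketch. It works on plain wikitext with regular expressions, which is a simplification chosen for the example (the editor works with Parsoid HTML), and the function names are hypothetical.

```python
import re
from typing import Dict

# Matches a full definition such as <ref name="x">...</ref>.
REF_DEFINITION = re.compile(r'<ref\s+name\s*=\s*"([^"]+)"\s*>(.*?)</ref>', re.DOTALL)
# Matches a reuse such as <ref name="x" />.
REF_REUSE = re.compile(r'<ref\s+name\s*=\s*"([^"]+)"\s*/>')


def collect_definitions(wikitext: str) -> Dict[str, str]:
    """Map each reference name to the full text of its definition."""
    return {m.group(1): m.group(0) for m in REF_DEFINITION.finditer(wikitext)}


def expand_named_references(section: str, article: str) -> str:
    """Replace the first reuse of each name with the definition found in the article."""
    definitions = collect_definitions(article)
    defined_here = set(collect_definitions(section))

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name in defined_here or name not in definitions:
            return match.group(0)  # already self-contained, or unknown name
        defined_here.add(name)  # later reuses can keep the short form
        return definitions[name]

    return REF_REUSE.sub(replace, section)
```

Only the first reuse of a name is expanded, so the section stays self-contained without duplicating the definition.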

To have this feature sooner, I did load the full article and hid the unnecessary sections. My understanding is that we still don't have a concrete plan for how section translation will work and many questions remain open, so, as a start, we can use the simpler approach you proposed.

The section translation workflow within CX - from starting to publish will be a minimal CX both in terms of technology and user experience. CX will have a newly defined mode - we can name it properly, for now "minimal"

This will most likely be a URL flag which disables/enables certain features. In the future, section translation is planned to become more complex and we may want to have more minimal ResourceLoader (RL) modules to load for such a limited UX. However, RL penalizes the creation of new modules, since their names are loaded on every page view, as discussed many times in the past.

Thanks, @Petar.petkovic for all the details. I captured the idea of avoiding loading extra contents for future iterations, once the initial version is completed: T237614: Explore ways to avoid loading the whole article when showing only one section

VE added support for section editing earlier this year (T76541). In your current approach, you are still building the full CE tree, and having it laid out. Setting attachedRoot to the required <section> node in ve.dm.Surface will be much faster :)

We don't necessarily need to load the full article.

You will need to, in order to support references defined elsewhere on the page. Also, a change to one section can affect other sections because of references (e.g. deleting a reference can result in its contents being moved to another part of the page).

That said, server-side section editing will give you smaller performance gains than you might think (if you are using attachedRoot instead): T206228#5330185. Parsoid HTML download is usually not a bottleneck, and more time is spent building and rendering the CE tree than the DM.

That graph applies to ArticleTarget, so it's not certain the same applies to CX, but if you have performance issues, I would suggest investigating things like loading the page content earlier in the application cycle, in parallel with the editor initialising.
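
The suggestion of loading the page content in parallel with editor initialisation can be illustrated with a small, generic sketch. The two coroutines are stand-ins invented for the example (they only sleep); they are not CX or VisualEditor functions.

```python
import asyncio
from typing import Tuple


async def fetch_page_content(title: str) -> str:
    """Stand-in for the network request that downloads the page HTML."""
    await asyncio.sleep(0.05)
    return f"<section>{title}</section>"


async def initialise_editor() -> str:
    """Stand-in for loading and setting up the editor modules."""
    await asyncio.sleep(0.05)
    return "editor-ready"


async def start_translation(title: str) -> Tuple[str, str]:
    # Start both tasks at once instead of fetching content only after the
    # editor is ready; total time is roughly the slower of the two.
    content, editor = await asyncio.gather(
        fetch_page_content(title), initialise_editor()
    )
    return editor, content
```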

Change 550050 had a related patch set uploaded (by Petar.petkovic; owner: Petar.petkovic):
[mediawiki/extensions/ContentTranslation@master] Split CX into minimal and full version

https://gerrit.wikimedia.org/r/550050

server-side section editing will give you smaller performance gains than you might think (if you are using attachedRoot instead): T206228#5330185. Parsoid HTML download is usually not a bottleneck, and more time is spent building and rendering the CE tree than the DM.

Does attachedRoot allow using multiple adjacent <section>s, or do we need to wrap those that span the range between two <h2> headers?

We still haven't decided on the level of granularity we want for section translation, but for the initial exploration I went with a larger set.

Currently attachedRoot only allows a single <section> which Parsoid adds as defined here: https://www.mediawiki.org/wiki/Parsing/Notes/Section_Wrapping

It may be possible to change the code to attach multiple siblings, but it would be easier to just pre-modify the DOM to wrap all the content you want to display in a new <section> tag.
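
A minimal sketch of that pre-modification step is shown below: it wraps a range of adjacent sibling <section> elements in one new <section>, so that a single node could then be used as the root. It uses Python's xml.etree on a well-formed fragment purely for illustration; the data-wrapper attribute and the function name are hypothetical, and the real change would operate on the Parsoid DOM in the browser.

```python
import xml.etree.ElementTree as ET


def wrap_sections(body: ET.Element, start: int, end: int) -> ET.Element:
    """Wrap body's children in [start, end) inside one new <section> element."""
    children = list(body)[start:end]
    if not children:
        raise ValueError("the range does not select any element")
    # The marker attribute is an assumption, used only to recognise the wrapper.
    wrapper = ET.Element("section", {"data-wrapper": "true"})
    for child in children:
        body.remove(child)
        wrapper.append(child)
    # Put the wrapper where the first wrapped child used to be.
    body.insert(start, wrapper)
    return wrapper
```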

Let me write a roundup of the current state of the changes made for section translation support and ask the questions coming out of it. The work is not merged, so this is preliminary.

  • The goal is to load articles that already exist in the target wiki, so that we can expand them with one section. The entry point for starting section translation is the presence of the section param in the URL, which means section translation could be loaded for articles that don't have a translation in the target wiki. What should happen in such a case?
  • The current working version removes all issues that are shown inside the issue card. In full article translation, we have an error that is shown when the user is not allowed to publish in the main namespace of the wiki. We show this only on English Wikipedia, because their community defined an Abuse Filter rule to combat article publishing via Content Translation. This exact Abuse Filter rule would not catch sections published with section translation, because it is based on the edit summary that we add to articles published with Content Translation. However, the community might implement a similar solution against section translation and, without the issue tracking system, there would be no upfront warning for users. This was one of the warnings I found while looking at those displayed in issue cards. But there are more important ones that we might not want to miss, which leads me to the next point.
  • Do we really want to get rid of MT abuse checks? While section translation is in the exploration phase, it would be nice to anticipate some basic needs we will have in the future, and not go through big rewrites to remove some parts of the code only to bring them back later. The reasons why we have MT abuse checks for article translation apply to section translation as well.
  • The mockups in the description (F30517880) show how the title of the section being translated should be edited. @Pginer-WMF and I discussed this during the Language team offsite last week. Article titles have strict rules about which characters they allow, and wiki syntax is not permitted. Section titles are different and could include a reference, for example. The text area element that contains the article title in full article translation cannot be used for this rich wikitext editing experience. Also, a text area does not make it possible to use MT to aid translation of the section title. Therefore, I ask to revisit the design with this in mind. Below is a screenshot of the current state, where the section title is displayed as one of the paragraphs of the translation. Some tweaks would be needed: the target article title should not be editable, and we would need to make sure the section title gets translated, similar to what we had in CX1.

ukulele-en-tg-section-1.png (937×1 px, 121 KB)

Let me write a roundup of the current state of the changes made for section translation support and ask the questions coming out of it.

Thanks for the initial work in this front and surfacing these questions. Some comments below:

...section translation could be loaded for articles that don't have a translation in the target wiki. What should happen in such a case?

For this particular case, I think the expected behavior would be to create a new article consisting of the selected section.

Most of the UI workflows for section translation would limit the user choice to sections of articles that already exist in the target wiki. However, for mobile we also plan to support the creation of new articles by using Section translation (not only expanding existing ones). The idea is to treat the lead section as a section that users can translate to start an article. So the proposed behavior would be consistent with that (and also allow for starting with a different section than the lead one). In addition, the proposed behavior can be useful for other tools that may integrate by setting the URL parameters.
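
The proposed behaviour (create a new article when the target is missing, otherwise expand it), together with the redirect consideration raised in this comment, could be sketched as follows. The in-memory set of pages and the redirect mapping are assumptions standing in for lookups against the target wiki; the function names are hypothetical.

```python
from typing import Dict, Set, Tuple


def resolve_target(title: str, redirects: Dict[str, str], max_hops: int = 5) -> str:
    """Follow redirects from title until a non-redirect page title is reached."""
    seen = {title}
    for _ in range(max_hops + 1):
        next_title = redirects.get(title)
        if next_title is None:
            return title
        if next_title in seen:
            raise ValueError(f"redirect loop at {next_title!r}")
        seen.add(next_title)
        title = next_title
    raise ValueError("too many redirects")


def plan_publish(
    title: str, existing_pages: Set[str], redirects: Dict[str, str]
) -> Tuple[str, str]:
    """Return (action, destination): expand an existing page or create a new one."""
    destination = resolve_target(title, redirects)
    action = "expand" if destination in existing_pages else "create"
    return action, destination
```

With a redirect from X to Y, plan_publish("X", ...) returns Y as the destination, so Y is expanded and X is not overwritten.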

Some considerations (we can create separate tickets for these):

  • Show a message to let the user know that a new article will be created.
  • Make sure redirects are dealt with properly. If the target page X is a redirect to Y, we should treat Y as the destination. That is, expand Y with the new section (not overwrite X).

Do we really want to get rid of MT abuse checks?

The proposal to skip some checks was for simplification purposes. I think it makes sense to support translating the contents of a section in the same way we support it when translating as part of a larger article. If there are checks that are safe to apply to a fragment of the content, it is great to support them.

My current thinking:

  • If supporting the checks requires additional effort, I'd recommend creating a follow-up ticket.
  • If supporting the checks does not require any effort, verify and document that they work at the section level.

I ask to revisit the design with this in mind. Below is a screenshot of the current state, where the section title is displayed as one of the paragraphs of the translation. Some tweaks would be needed: the target article title should not be editable, and we would need to make sure the section title gets translated, similar to what we had in CX1.

Please let me know whether the following is correct:

  • It is possible to make the page title non-editable and adjust the styling so that it is presented differently (e.g., in a smaller size, as in the proposed design)
  • It is hard for the section title to be presented above the language indicators (i.e., the "English - view page. Tagalog" line)

So the new design you are asking for can adjust the style but not change the placement of elements. Is that correct? Is that a hard limitation or something that can be refined as part of follow-up tickets?

Making the target title read-only is possible. We already do that for the source title. The text can be styled to be smaller as well.

On the second point: the article title is presented in a separate HTML element, outside the VE editing surface. The section title is inside the VE surface, among many other sibling section nodes, which means we cannot move it above the language indicators. If we want full editing capabilities for the section title, we would need a separate VE surface.

Change 554971 had a related patch set uploaded (by Petar.petkovic; owner: Petar.petkovic):
[mediawiki/extensions/ContentTranslation@master] Add link to target article if it exists

https://gerrit.wikimedia.org/r/554971

Change 555680 had a related patch set uploaded (by Petar.petkovic; owner: Petar.petkovic):
[mediawiki/services/cxserver@master] Extract section titles

https://gerrit.wikimedia.org/r/555680

Change 555681 had a related patch set uploaded (by Petar.petkovic; owner: Petar.petkovic):
[mediawiki/extensions/ContentTranslation@master] Replace article title with section title

https://gerrit.wikimedia.org/r/555681

Change 555756 had a related patch set uploaded (by Petar.petkovic; owner: Petar.petkovic):
[mediawiki/extensions/ContentTranslation@master] Allow publishing of section translation

https://gerrit.wikimedia.org/r/555756

Change 548589 merged by jenkins-bot:
[mediawiki/services/cxserver@master] Add mw-section-number data attribute to distinct sections in an article

https://gerrit.wikimedia.org/r/548589

Change 548590 merged by jenkins-bot:
[mediawiki/extensions/ContentTranslation@master] Allow to load only a single section

https://gerrit.wikimedia.org/r/548590

Change 570515 had a related patch set uploaded (by KartikMistry; owner: KartikMistry):
[operations/deployment-charts@master] Update cxserver to 2020-02-05-051751-production

https://gerrit.wikimedia.org/r/570515

Change 570515 merged by jenkins-bot:
[operations/deployment-charts@master] Update cxserver to 2020-02-05-051751-production

https://gerrit.wikimedia.org/r/570515

Unassigning from me since I am not actively working on this part at this point in time.

We will need to rethink the technical approach for this. Petar Petkovic had submitted a huge refactoring of the CX codebase to support section translation, but that cannot be merged now because of:

(a) the new Vue-based section translation approach,
(b) merge conflicts,
(c) concerns about the stability of CX2 with such a big change,
(d) Petar no longer working with us,
(e) the Language team's staffing constraints, which prevent initiating such big changes in CX2.

Since section translation is responsive and can work on desktop, we may also consider not redoing this use case in CX2 and just using SX on desktop?

(not for immediate answers, just writing it down since Petar's big patch is not merged and if anybody wonders this is the reason)

Since section translation is responsive and can work on desktop, we may also consider not redoing this use case in CX2 and just using SX on desktop?

The plan is to:

  • Reuse the dashboard and the process to select a section from the Section Translation implementation currently available on mobile. That part of the process is responsive and intended to be used on both mobile and desktop.
  • Adapt the translation editor on desktop to load only one section when expanding an article by translating a new section. This provides the familiar editor users are used to in the platform loading the necessary contents.

The general idea is to have a single workflow with two platform-specific editors (desktop and mobile). It does not make much sense to me for desktop users to get an inconsistent experience: translating a new article with an editor that takes advantage of the desktop characteristics, but adding a new section with an editor adapted to mobile screen size constraints that they don't have.

I don't know whether it is preferred to continue the previous attempt or start from scratch, and I think it makes sense to wait until we can focus on this to avoid breaking changes, but I don't think that exposing the mobile translation editor on desktop is the best solution.

Change 555680 abandoned by Nikerabbit:

[mediawiki/services/cxserver@master] Extract section titles

Reason:

https://gerrit.wikimedia.org/r/555680

Change 550050 abandoned by Nikerabbit:

[mediawiki/extensions/ContentTranslation@master] Split CX into section translation and article translation

Reason:

no longer relevant

https://gerrit.wikimedia.org/r/550050

Change 554971 abandoned by Nikerabbit:

[mediawiki/extensions/ContentTranslation@master] Add link to target article if it exists

Reason:

no longer relevant

https://gerrit.wikimedia.org/r/554971

Change 555681 abandoned by Nikerabbit:

[mediawiki/extensions/ContentTranslation@master] Replace article title with section title

Reason:

no longer relevant

https://gerrit.wikimedia.org/r/555681

Change 555756 abandoned by Nikerabbit:

[mediawiki/extensions/ContentTranslation@master] Allow publishing of section translation

Reason:

no longer relevant

https://gerrit.wikimedia.org/r/555756

Pginer-WMF renamed this task from Load a single section in Content translation's editor to Adapt Content translation's editor to support section translation on desktop. Jun 28 2022, 1:42 PM
Pginer-WMF updated the task description.