Unable to translate an article for which another user has a deleted draft
Closed, Resolved (Public)

Description

Other-deleted translation & other-published then other-deleted:
-> Fresh start
-> Unable to save
-> Article appears in self-drafts and disappears in other-drafts
-> Opening again shows it is an ongoing translation by other

It seems that ownership is not properly updated. The warning about an ongoing translation is not shown initially (which is expected), but saving fails because the code still attempts to load the old draft (and is prevented, since it is not ours). The main table is updated properly and the translation is usurped, given that the article appears there from then onwards. But it is still impossible to continue, and since the state is now draft, we get to see the warning about an ongoing translation.

Amire80 triaged this task as High priority.
Amire80 added a subscriber: Amire80.

High priority: a lot of people are complaining about this.

Nikerabbit claimed this task.

Change 337405 had a related patch set uploaded (by Nikerabbit):
Gray out translation columns if draft restoration fails

https://gerrit.wikimedia.org/r/337405

Change 337409 had a related patch set uploaded (by Nikerabbit):
Big rework to cx_translations table

https://gerrit.wikimedia.org/r/337409

Change 337405 merged by jenkins-bot:
Gray out translation columns if draft restoration fails

https://gerrit.wikimedia.org/r/337405

Nemo_bis added a subscriber: Nemo_bis.

Writing down the changes that happen in https://gerrit.wikimedia.org/r/#/c/337409

  1. Before the patch, a translation is uniquely defined by source language, target language and source title. The translation also has started-by and updated-by properties, which already imply ownership. But there is also a cx_translators table which maps a user id to a translation id, which is again ownership. Any inconsistency between that mapping and the updated_by and started_by fields can cause a lot of trouble.
  2. The above design was originally meant to make a translation "transferable" to another translator just by changing the mapping (the cx_translators table entry), but we have now discovered several issues with that.
    1. Translations that go through draft -> deleted status can have stale translation units. If the same translation is later restarted by another translator, ownership is changed (though not consistently) but the old translation remains. Should we load it or abandon it? Instead of starting from a clean state, we end up partially using, or at least getting confused by, the old translation.
    2. Confusing interpretation of ownership, leading to inconsistencies.
  3. In the new approach by @Nikerabbit, a translator owns a translation and it never passes into the hands of another translator. A new translator who starts a translation for the same source-target language pair and source title gets a fresh translation record. This simplifies most of the above issues. But in doing this, we give up on designing for future use cases like collaboration, transferring a translation, etc. I think that is okay, since we can devise alternative mechanisms when we get to those problems.
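The ownership rule in point 3 can be sketched roughly as follows. This is only an illustration using an in-memory SQLite database, not the code from the patch: the table is a cut-down stand-in for cx_translations, the source/target/title column names are assumptions, and the helper find_or_start_translation is hypothetical. The idea it shows is that the lookup key includes the translator, so a second translator never picks up someone else's record.

```python
import sqlite3


def make_db():
    # Cut-down, hypothetical version of cx_translations (assumed columns).
    db = sqlite3.connect(":memory:")
    db.execute(
        """
        CREATE TABLE cx_translations (
            translation_id INTEGER PRIMARY KEY,
            translation_source_language TEXT,
            translation_target_language TEXT,
            translation_source_title TEXT,
            translation_started_by INTEGER,
            translation_status TEXT
        )
        """
    )
    return db


def find_or_start_translation(db, source_lang, target_lang, title, user_id):
    # Hypothetical helper: the translator is part of the lookup key, so
    # records started by other users are never returned.
    row = db.execute(
        """
        SELECT translation_id FROM cx_translations
        WHERE translation_source_language = ?
          AND translation_target_language = ?
          AND translation_source_title = ?
          AND translation_started_by = ?
        """,
        (source_lang, target_lang, title, user_id),
    ).fetchone()
    if row is not None:
        return row[0]
    # No record owned by this user: start from a clean state.
    cur = db.execute(
        """
        INSERT INTO cx_translations (
            translation_source_language, translation_target_language,
            translation_source_title, translation_started_by,
            translation_status
        ) VALUES (?, ?, ?, ?, 'draft')
        """,
        (source_lang, target_lang, title, user_id),
    )
    return cur.lastrowid


db = make_db()
# User 1 starts a draft and then deletes it.
first = find_or_start_translation(db, "en", "fi", "Example", user_id=1)
db.execute(
    "UPDATE cx_translations SET translation_status = 'deleted' "
    "WHERE translation_id = ?",
    (first,),
)
# User 2 starts the same article and gets a separate record.
second = find_or_start_translation(db, "en", "fi", "Example", user_id=2)
total = db.execute("SELECT COUNT(*) FROM cx_translations").fetchone()[0]
```

In this sketch the deleted record of user 1 stays untouched, so no stale translation units can leak into user 2's draft. How the real patch treats a user restarting their own deleted translation is not covered here.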

Since the patch changes some of the basic assumptions, it needs thorough review and testing.

Yes, that is correct. Collaboration and similar features are still possible to add in future, and I don't think this change will make it harder.

David has looked at the patch, but obviously this is critical code that benefits from multiple eyes looking at it.

The biggest implication of this patch is that we need to later manually clean up the existing data to restore access to drafts which have confusing ownership. We are talking about two thousand such cases, if I remember correctly.

The patch works without the schema change, so it can be (and should be) deployed first, followed by the schema change anytime after.

Unpublished translations where the starting translator differs from the last translator:

[wikishared]> select translation_id, translation_status,  translation_last_updated_timestamp from cx_translations where translation_last_update_by != translation_started_by and translation_status = 'draft';

We have 249766 records so far in the cx_translations table, and 178 records match the above query. The timestamp of the last such translation is 20160623120049.

[wikishared]> select translation_id, translation_status,  translation_last_updated_timestamp from cx_translations where translation_last_update_by != translation_started_by and translation_status = 'deleted';

Returns 55 records, with the last timestamp being 20160218085829.

What would be a migration strategy to the new approach? Update translation_started_by to translation_last_update_by, since these records are basically translations abandoned (deleted) by translation_started_by and later restarted by translation_last_update_by?
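As a sketch only, the reassignment proposed above could look like the following when tried on a toy in-memory SQLite table. The sample rows and the helper reassign_owner are made up for illustration, the table is reduced to the four columns the queries above use, and this is not an agreed migration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Reduced, hypothetical copy of cx_translations with made-up rows.
db.execute(
    """
    CREATE TABLE cx_translations (
        translation_id INTEGER PRIMARY KEY,
        translation_status TEXT,
        translation_started_by INTEGER,
        translation_last_update_by INTEGER
    )
    """
)
db.executemany(
    "INSERT INTO cx_translations VALUES (?, ?, ?, ?)",
    [
        (1, "draft", 10, 10),      # consistent owner, left alone
        (2, "draft", 10, 20),      # restarted by user 20
        (3, "deleted", 30, 40),    # restarted by user 40, then deleted
        (4, "published", 50, 60),  # published, outside the proposal
    ],
)


def reassign_owner(db):
    # Hypothetical helper: make the last updater the owner of every
    # unpublished record whose two user fields disagree.
    cur = db.execute(
        """
        UPDATE cx_translations
        SET translation_started_by = translation_last_update_by
        WHERE translation_last_update_by != translation_started_by
          AND translation_status IN ('draft', 'deleted')
        """
    )
    return cur.rowcount


changed = reassign_owner(db)
owners = dict(
    db.execute(
        "SELECT translation_id, translation_started_by FROM cx_translations"
    ).fetchall()
)
```

Running the selection as a SELECT first, as in the queries above, would show which rows such an update touches before anything is changed.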

I don't have a full answer to that. I shared two docs with you where I had been doing a little research on how to clean up the data. It looks like some parts are easy to handle (just delete all deleted stuff with no drafts) and some need more investigation, or are even impossible to solve.

Also, the mismatch between cx_translations and cx_translators table:

select translation_id, translation_started_by, translation_last_update_by, translator_user_id, translation_last_updated_timestamp from cx_translations, cx_translators where translation_id = translator_translation_id and translator_user_id != translation_started_by and  translator_user_id != translation_last_update_by and translation_status = 'draft';

350 records. This should get fixed when we read the ownership data only from the cx_translations table (as done in the patch).

Change 337409 merged by jenkins-bot:
[mediawiki/extensions/ContentTranslation] Big rework to cx_translations table

https://gerrit.wikimedia.org/r/337409

Arrbee moved this task from In Review to Done on the Language-2017 Sprint 3 board. Mar 7 2017, 6:59 AM

Mentioned in SAL (#wikimedia-releng) [2017-03-10T12:29:39Z] <kart_> Beta: T159800: Update DB index for T146450