Implement external user name handling in the file importer
Closed, Resolved · Public · 8 Estimated Story Points

Description

Original Question:
With the FileImporter extension we are copying the file page's history from another wiki. What should happen with the user names of the people who made the edits? Should they be linked to their Commons account or should they be linked to the wiki where they actually made that edit?

Our decision
We want to link user names to the user page on the wiki where they made the edit, since they made the edit in that role. Furthermore, prefixing the names makes it more obvious on which wiki each edit was made, as shown here:

Screenshot-2018-4-24 Revision history of Module Arguments - Wikimedia Commons.png (465×1 px, 75 KB)

Task

  • Prefix and link users in the version history to the wiki where they made the edit

For technical details see T180466: Investigate possible issues with imported user names for FileImporter

Related Objects

Mentioned In
rEFLI613e495f3c52: Final NoteDb migration updates
rEFLI4dade443d498: Make side-effect of ExternalUserNames::applyPrefix() more obvious
rEFLI5e9e78bb5fa3: Always use plain username for file revisions
rEFLIa6a7f9435961: Make side-effect of ExternalUserNames::applyPrefix() more obvious
rEFLIfba031e2d3b8: Use default 'imported' prefix when prefix empty
rEFLIbb50c49fb521: Use default 'imported' prefix when prefix empty
rEFLIdae6812f38b5: Always use plain username for file revisions
rEFLIff27face3120: Always use plain username for file revisions
rEFLIea3ed975f044: Use default 'imported' prefix when prefix empty
rEFLIc870c05b210a: Add basic logging to SiteTableSourceInterWikiLookup
rEFLI133244e45c6a: Add basic logging to SiteTableSourceInterWikiLookup
rEFLIe5fdf2176a79: Always use plain username for file revisions
rEFLIfc032729db89: Use default 'imported' prefix when prefix empty
rEFLI4974027d1ebb: Add external username handling
rEFLId365a5f8a6ad: Add external username handling
rEFLI5d26f05b9c0c: Add external username handling
rEFLI362e8582c318: Add interface to allow interwiki references
rEFLIbea311793f13: Add interface to allow interwiki references
rEFLI906354b88792: Add external username handling
rEFLId124524bf241: Add interface to allow interwiki references
rEFLIed63207cd858: Add external username handling
rEFLI0c08bd1983c9: Add interface to allow interwiki references
rEFLI43b7fd83703e: Add external username handling
rEFLI3becab69bf79: Add external username handling
rEFLIffbaf556034f: Add interface to allow interwiki references
rEFLIf3cf0a3d351e: Add external username handling
rEFLI32ad31b92caa: Add interface to allow interwiki references
rEFLIba0963bb9bc3: Add external username handling
rEFLIc5194a36f3a3: Add interface to allow interwiki references
rEFLI967082c01ce1: Add interface to allow interwiki references
rEFLIcdb7fce6c0a5: Add interface to allow interwiki references
rEFLI6503ae96e558: Add interface to allow interwiki references
Mentioned Here
T180466: Investigate possible issues with imported user names for FileImporter

Event Timeline

WMDE-Design we need to consider whether we want to trust Single User Login, and link to the current wiki, or link to the source wiki user pages

Lea_WMDE set the point value for this task to 8. (Apr 24 2018, 2:35 PM)

User links with a prefix (a reference to the source wiki) will look like this in the history:

Screenshot-2018-4-24 Revision history of Module Arguments - Wikimedia Commons.png (465×1 px, 75 KB)

Lea_WMDE triaged this task as Medium priority. (Apr 25 2018, 1:27 PM)
Lea_WMDE updated the task description.

Change 432729 had a related patch set uploaded (by WMDE-Fisch; owner: WMDE-Fisch):
[mediawiki/extensions/FileImporter@master] Add interface to allow interwiki references

https://gerrit.wikimedia.org/r/432729

Change 436225 had a related patch set uploaded (by WMDE-Fisch; owner: WMDE-Fisch):
[mediawiki/extensions/FileImporter@master] Add external username handling

https://gerrit.wikimedia.org/r/436225

Change 432729 merged by jenkins-bot:
[mediawiki/extensions/FileImporter@master] Add interface to allow interwiki references

https://gerrit.wikimedia.org/r/432729

Change 436225 merged by jenkins-bot:
[mediawiki/extensions/FileImporter@master] Add external username handling

https://gerrit.wikimedia.org/r/436225

So the patches are merged and deployed on beta. I could run tests there because the beta configuration mirrors the real production cluster fairly closely. All seems to work fine. I moved the following images from en.wp.beta / de.wp.beta to commons.beta using the FileExporter link, and the generated interwiki prefixes and links to the user profiles are working.

https://commons.wikimedia.beta.wmflabs.org/w/index.php?title=File:MO-WMF.jpg&action=history
https://commons.wikimedia.beta.wmflabs.org/w/index.php?title=File:OOjs_UI_icon_alert.svg&action=history

This is for the text history. The usernames in the file history do not support external user name handling with prefixes, so the plain username is displayed there. Still, in these cases we at least make sure that the user is created via CentralAuth SUL if the account is not present. This is also done via the new external username handling support.
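A rough sketch of that distinction, in Python for illustration only (the extension itself is PHP). The function name `stored_username` and the `de` prefix are hypothetical; the `prefix>username` form matches the `imported>` names reported further down in this task. Account creation via CentralAuth is not modeled here.

```python
def stored_username(username: str, prefix: str, is_file_revision: bool) -> str:
    """Return the name recorded for an imported revision (illustrative only)."""
    if is_file_revision:
        # File history: prefixes are not supported, keep the plain username.
        return username
    # Text history: mark the name as external using "<prefix>><username>".
    return f"{prefix}>{username}"


print(stored_username("Alice", "de", is_file_revision=False))
print(stored_username("Alice", "de", is_file_revision=True))
```

The first call yields the prefixed form used in the text history, the second the plain name used in the file history.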

Another hint here:

Due to the 2-level linking (here de:de), this also means that we can probably do general interwiki linking, for example for the source file's location in the comment. In an earlier ticket dealing with that, I thought this was not possible when linking from Commons to, for example, fr.wp or de.wikivoyage.

We could revisit that. The foundation for this is implemented now and we just have to make use of the prefixes generated here.

Change 437272 had a related patch set uploaded (by WMDE-Fisch; owner: WMDE-Fisch):
[mediawiki/extensions/FileImporter@master] Use default 'imported' prefix when prefix empty

https://gerrit.wikimedia.org/r/437272

Change 437273 had a related patch set uploaded (by WMDE-Fisch; owner: WMDE-Fisch):
[mediawiki/extensions/FileImporter@master] Always use plain username for file revisions

https://gerrit.wikimedia.org/r/437273

Change 437280 had a related patch set uploaded (by WMDE-Fisch; owner: WMDE-Fisch):
[mediawiki/extensions/FileImporter@master] Add basic logging to SiteTableSourceInterWikiLookup

https://gerrit.wikimedia.org/r/437280

Change 437280 merged by jenkins-bot:
[mediawiki/extensions/FileImporter@master] Add basic logging to SiteTableSourceInterWikiLookup

https://gerrit.wikimedia.org/r/437280

Change 437272 merged by jenkins-bot:
[mediawiki/extensions/FileImporter@master] Use default 'imported' prefix when prefix empty

https://gerrit.wikimedia.org/r/437272

Change 437452 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/core@master] ExternalUserNames: Update partly incomplete documentation

https://gerrit.wikimedia.org/r/437452

Change 437453 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/FileImporter@master] Make side-effect of ExternalUserNames::applyPrefix() more obvious

https://gerrit.wikimedia.org/r/437453

Change 437273 merged by jenkins-bot:
[mediawiki/extensions/FileImporter@master] Always use plain username for file revisions

https://gerrit.wikimedia.org/r/437273

Change 437453 merged by jenkins-bot:
[mediawiki/extensions/FileImporter@master] Make side-effect of ExternalUserNames::applyPrefix() more obvious

https://gerrit.wikimedia.org/r/437453

When importing files from commons to test.wikimedia.beta.wmflabs.org this does not work. I don't get any link and the usernames are prefixed with imported>. Is this expected behavior?

https://test.wikimedia.beta.wmflabs.org/w/index.php?title=File:Snap_fastener_female_(outer)_side_components.jpg&action=history

Change 437452 merged by jenkins-bot:
[mediawiki/core@master] ExternalUserNames: Update partly incomplete documentation

https://gerrit.wikimedia.org/r/437452

So the test.wikimedia.beta server is not part of the interwiki map, so it is not possible to interwiki link to it. That's why no interwiki prefix can be retrieved. The default behavior of the importer used for wikitext imports in these cases is to use imported as the prefix, so that was implemented here. An alternative worth discussing would be not prefixing at all, but I think that would go against the "new" policy there.

So looking at the logs, I can see that the site object for the test server could be retrieved from the sites table, but no interwikiIds were configured. In that case the prefix will be empty and we set the imported prefix for the usernames in the revision history.
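The fallback described here can be sketched roughly as follows. This is a minimal Python illustration, not the extension's actual PHP code: `resolve_prefix`, `apply_prefix`, the `sites` dictionary, and the host names are all hypothetical. Only the `imported` default and the `prefix>username` form come from this discussion.

```python
from typing import Dict, List

DEFAULT_PREFIX = "imported"  # default prefix named in the discussion


def resolve_prefix(source_host: str, sites: Dict[str, List[str]]) -> str:
    """Return the first configured interwiki id, or the default prefix."""
    # `sites` is a hypothetical stand-in for the sites table:
    # host name -> list of configured interwiki ids (possibly empty).
    interwiki_ids = sites.get(source_host, [])
    if interwiki_ids:
        return interwiki_ids[0]
    # Site unknown, or known but without interwiki ids: the prefix would
    # be empty, so fall back to the default.
    return DEFAULT_PREFIX


def apply_prefix(prefix: str, username: str) -> str:
    """Build the external user name in "<prefix>><username>" form."""
    return f"{prefix}>{username}"


# Hypothetical data: one site with an interwiki id, one without.
sites = {"de.source.example": ["de"], "test.source.example": []}
print(apply_prefix(resolve_prefix("de.source.example", sites), "Alice"))
print(apply_prefix(resolve_prefix("test.source.example", sites), "Alice"))
```

A site that is found but has no interwiki ids is treated the same as a site that is missing entirely, which is the case observed in the logs.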

Also (sorry, I only realized this just now, looking at the link you provided): the interwiki-prefixed links for test images (e.g. from de.wp.beta) only work when importing to commons.wikimedia.beta, since only the FileImporter there is correctly configured to use the sites table and retrieve the interwiki IDs from it. The FileImporter on test.wm.beta is configured to accept imports from everywhere and does not respect the sites table.

Last comment (for now) on this:

This mirrors best what would happen in production :-)

Change 439566 had a related patch set uploaded (by WMDE-Fisch; owner: WMDE-Fisch):
[mediawiki/extensions/FileImporter@master] Use interwiki lookup to get prefixes

https://gerrit.wikimedia.org/r/439566

So I tested this with various other pages and looked into the sites table in the DB on the production server. It seems that in many cases the interwiki prefixes are not set or are incomplete in the sites configuration.

I explored the alternative in the patch above. Some general issues with interwiki linking remain, since adding the language code is not valid for all wikis, and the lookup should probably be improved for links where the source is not Wikipedia.

Lea_WMDE moved this task from Demo to Done on the WMDE-QWERTY-Sprint-2018-05-23 board.

Change 439566 abandoned by WMDE-Fisch:
Use interwiki lookup to get prefixes

https://gerrit.wikimedia.org/r/439566