The export and import processes should transport revision tags
Open, Medium, Public

Description

The import and export processes currently do not handle revision tags.
There are plenty of reasons why they should handle them, at least optionally.

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 10:53 PM
bzimport set Reference to bz20691.

Removing i18n; this has nothing to do with that.

Should be straightforward to add them to the export format at least.

Should we include them in the export unconditionally, or rather have the exporter ask for them in the export form? After all, they cost an extra query per revision.

I believe we've got a summary table which can be left-joined, which'll pull the info in with the main query.

Yes. I was thinking that this would also increase the query result size considerably,
but real data suggests otherwise. There are hardly any such tags at all, and even fewer
cases where a single revision has several tags.

Nevertheless, there may be more than 1 result row for one revision with the join, which
makes loop control over rows more complicated.

The tag_summary table has one row per revision for exactly this reason, so there's no change to loop control.
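
A minimal sketch of the difference being discussed, using an in-memory SQLite database. The table and column names below are simplified stand-ins chosen for illustration (not the real MediaWiki schema), and the tag names are made-up sample data. It only shows why joining a one-row-per-(revision, tag) table can multiply result rows, while joining a one-row-per-revision summary table cannot.

```python
import sqlite3

# Assumption: simplified stand-in tables, not the actual MediaWiki schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_comment TEXT);
    -- one row per (revision, tag) pair
    CREATE TABLE change_tag (ct_rev_id INTEGER, ct_tag TEXT);
    -- one row per revision, all tags packed into one comma-separated column
    CREATE TABLE tag_summary (ts_rev_id INTEGER PRIMARY KEY, ts_tags TEXT);

    INSERT INTO revision VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO change_tag VALUES (2, 'tag-a'), (2, 'tag-b'), (3, 'tag-a');
    INSERT INTO tag_summary VALUES (2, 'tag-a,tag-b'), (3, 'tag-a');
""")

# Joining the per-tag table: revision 2 appears twice, so the export loop
# would have to detect when a new revision starts.
per_tag_rows = conn.execute("""
    SELECT rev_id, ct_tag FROM revision
    LEFT JOIN change_tag ON ct_rev_id = rev_id
    ORDER BY rev_id, ct_tag
""").fetchall()

# Joining the summary table: exactly one row per revision, tagged or not.
summary_rows = conn.execute("""
    SELECT rev_id, ts_tags FROM revision
    LEFT JOIN tag_summary ON ts_rev_id = rev_id
    ORDER BY rev_id
""").fetchall()


def tags_by_revision(rows):
    """Split the packed tag column; untagged revisions get an empty list."""
    return {rev_id: (tags.split(",") if tags else []) for rev_id, tags in rows}


print(len(per_tag_rows), len(summary_rows))  # 4 rows vs. 3 rows
print(tags_by_revision(summary_rows))
```

With the summary join, the existing one-row-per-revision loop keeps working and only has to split one extra column.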

TTO subscribed.

I don't think Ariel is working on this any longer.

I wonder about this a bit. Sure, if XML dumps are supposed to be a backup of the wiki, then they should include change tags. But for importing pages from one wiki to another, importing change tags rarely, if ever, makes sense. Tags are specific to the individual wiki in most cases. Just as we don't export patrol status, I don't know if we should export change tags.

Perhaps change tag import/export could be made available via the command line (maintenance scripts) only.

TTO renamed this task from "Special:Export and Special:Import should transport revision tags" to "The export and import processes should transport revision tags". Jan 8 2016, 9:04 AM
TTO updated the task description.
TTO added a project: Google-Code-In-2015.
TTO set Security to None.
TTO removed subscribers: Aklapper, Georggi199.

This is out of scope for GCI. There are various issues, like extension-specific tags and local wiki tags. Should they be mapped on import (e.g. if there is an analogous tag, but maybe with a different name)? Is the wiki being imported to required to have the same extensions and versions (for extension tags)?

In T22691#1922139, @Mattflaschen wrote:

This is out of scope for GCI. There are various issues, like extension-specific tags and local wiki tags. Should they be mapped on import (e.g. if there is an analogous tag, but maybe with a different name)? Is the wiki being imported to required to have the same extensions and versions (for extension tags)?

The "export" part is certainly a lot more straightforward than this, and makes an appropriate GCI task. Tags will be exported using dumpBackup.php with the --change-tags switch. No tag metadata will be exported for now; that can be added later if desired.

As for "import", in my mind, the purpose of exporting and importing change tags is part of a full-wiki-backup type scenario (not, say, transwiki import), so we wouldn't really need to add anything to valid_tag or do any mapping. That would be up to the end user.

In any case, the "export" part (only) is already being worked on by a GCI student, so we will re-add the project.

Change 263168 had a related patch set uploaded (by Georggi199):
Export: Exporting now includes custom tags

https://gerrit.wikimedia.org/r/263168

@TTO added a project: Google-Code-In-2015.

@TTO: Is there a corresponding task on the GCI site already? If so, link is welcome. And if not, how would you like to proceed? :)

@Aklapper: I am not @TTO, but I am the one who already did part of this task with this change (it will be merged once all XSD changes are ready). I am not aware of any other task besides this one.