Import should allow mapping of namespace names and aliases
Open, Lowest, Public, Feature

Assigned To

None

Authored By

	• Purodha
	Nov 10 2012, 3:08 PM

Description

Import data and the importing wiki may differ in their namespace names and namespace aliases; either side may have more or fewer of them. Currently, if adjustments are needed, one has to know that before the import is started and alter the import data to match the importing wiki's namespace names and aliases. This can be cumbersome and time-consuming. It is error-prone, and next to impossible for large automated imports.

We could add an option (checkbox) asking for a stepwise approach like this:

1) Show a list of all namespace names in the import and in the local wiki, with an automatically generated mapping suggestion.
2) Allow the importer to adjust the mapping.
3) Do the final import.
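
The mapping suggestion in step 1 could be sketched roughly as follows. This is an illustration in Python, not MediaWiki code: the function names `dump_namespaces` and `suggest_mapping`, the matching order (name or alias first, then numeric key), and the sample data are all assumptions. It relies only on the `<namespace key="…">` elements that XML dumps carry in their `<siteinfo>` header.

```python
import xml.etree.ElementTree as ET

def dump_namespaces(xml_text):
    """Return {key: name} for every <namespace> element in a dump header."""
    root = ET.fromstring(xml_text)
    found = {}
    for el in root.iter():
        # Ignore the export schema's XML namespace URI, which varies by version.
        if el.tag.rsplit("}", 1)[-1] == "namespace":
            found[int(el.get("key"))] = el.text or ""
    return found

def suggest_mapping(source, local):
    """Suggest a local namespace key for each source namespace name.

    source: {key: name} as read from the dump.
    local:  {key: [name, alias, ...]} for the importing wiki (hypothetical input).
    A value of None means no suggestion; the importer has to choose.
    """
    by_name = {n.casefold(): key for key, names in local.items() for n in names}
    suggestion = {}
    for key, name in source.items():
        if name.casefold() in by_name:      # same name or alias exists locally
            suggestion[name] = by_name[name.casefold()]
        elif key in local:                  # fall back to the same numeric key
            suggestion[name] = key
        else:
            suggestion[name] = None
    return suggestion

SAMPLE_DUMP = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
<siteinfo><namespaces>
<namespace key="0" case="first-letter" />
<namespace key="4" case="first-letter">Wikipedia</namespace>
<namespace key="100" case="first-letter">Portal</namespace>
<namespace key="102" case="first-letter">Kursus</namespace>
</namespaces></siteinfo></mediawiki>"""

# Made-up local configuration: key -> canonical name plus aliases.
LOCAL = {0: [""], 4: ["Project", "WP"], 100: ["Appendix"], 200: ["Portal"]}

print(suggest_mapping(dump_namespaces(SAMPLE_DUMP), LOCAL))
# {'': 0, 'Wikipedia': 4, 'Portal': 200, 'Kursus': None}
```

In this sketch "Portal" is matched by name to local key 200 rather than by its original key 100, and "Kursus" is left unmapped for the importer to decide in step 2.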

The downsides:
A) An uploaded file has to be preserved over some time, possibly across multiple data submissions by the importer.
B) The import file has to be read twice. It has to be read and analyzed in its entirety during the first scan already, since the list of original namespaces at the beginning does not cover possible occurrences of namespace aliases embedded in page data. Those need to be part of the mapping, however.
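
To illustrate why the first scan must cover the whole file, here is a rough Python sketch that streams a dump and collects every title or wikilink prefix that is not declared in the header. It assumes that "page data" means page titles and `[[Prefix:...]]` links in revision text; the function name `scan_prefixes` and the regular expression are illustrative, not part of MediaWiki. The candidates it finds would also include interwiki prefixes and ordinary text containing colons, so they would still need review by the importer.

```python
import io
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Prefix of a wikilink such as [[WP:Help]] or [[:Category:X]] (assumed syntax subset).
LINK_PREFIX = re.compile(r"\[\[\s*:?\s*([^\[\]|:#]+?)\s*:")

def scan_prefixes(stream):
    """Stream a dump once; return (declared names, {undeclared prefix: count})."""
    declared = set()
    prefixes = Counter()
    for _event, el in ET.iterparse(stream, events=("end",)):
        tag = el.tag.rsplit("}", 1)[-1]   # drop the export schema URI
        if tag == "namespace":
            declared.add(el.text or "")
        elif tag == "title":
            prefix, sep, _rest = (el.text or "").partition(":")
            if sep:
                prefixes[prefix] += 1
        elif tag == "text":
            prefixes.update(LINK_PREFIX.findall(el.text or ""))
        elif tag == "page":
            el.clear()                    # keep memory use flat on large dumps
    undeclared = {p: n for p, n in prefixes.items() if p not in declared}
    return declared, undeclared

SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
<siteinfo><namespaces>
<namespace key="0" case="first-letter" />
<namespace key="4" case="first-letter">Wikipedia</namespace>
</namespaces></siteinfo>
<page><title>Wikipedia:About</title><revision>
<text>See [[WP:Help]] and [[Wikipedia:FAQ]] and [[Plain]].</text></revision></page>
<page><title>Foo</title><revision>
<text>[[WP:Other]] [[:Category:X]]</text></revision></page>
</mediawiki>"""

declared, undeclared = scan_prefixes(io.BytesIO(SAMPLE))
print(sorted(declared), undeclared)
```

Here "WP" appears only inside page text, so a mapping built from the header alone would miss it.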

The good sides:

Most flexible.
Often used mappings can be preserved and automagically be recalled by the import process.

Step 1) could, by the way, also reveal some statistics to the importer, allowing them e.g. to skip importing implausible data.

This looks like a major revamping of the import code, however.

Version: unspecified
Severity: enhancement
URL: https://bugzilla.wikimedia.org/show_bug.cgi?id=30723#c6
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=62111

Details

Reference: bz41969

Related Objects

Status	Subtype	Assigned	Task
Resolved		TTO	T32723 Import should always use original wiki's namespace names in log entries and trim namespaces it doesn't know in the target title to allow manual choice
Open	Feature	None	T64111 Importing XML dumps should validate that the target wiki has the same namespaces as the pages being imported
Open	Feature	None	T43969 Import should allow mapping of namespace names and aliases
Open		None	T113472 Warnings functionality for the import process

Event Timeline

• bzimport raised the priority of this task from to Low. Nov 22 2014, 12:55 AM

• bzimport added projects: Future-Release, MediaWiki-Core-Snapshots.

• bzimport set Reference to bz41969.

• bzimport added a subscriber: Unknown Object (MLST).

• Purodha created this task. Nov 10 2012, 3:08 PM

TTO mentioned this in T32723: Import should always use original wiki's namespace names in log entries and trim namespaces it doesn't know in the target title to allow manual choice. Jan 6 2015, 1:59 AM

This would be a significant feature to add, not least because it would turn import into a two-step procedure. The dump would be uploaded to the server at the first stage, then MW would look at it and display a namespace choice form. Then, upon submission of that form, it would go back to the previously uploaded dump and actually do the import procedure. So among other things, it would require some kind of intermediate storage place for dumps.

What's more, while this feature would be nice in some situations, I don't think it would see a whole lot of use. The resolution of T32723 has helped with namespace matching.

In summary, there is still some room for improvement here, so I won't close this task as declined. But this is very unlikely to be implemented unless we end up implementing the "intermediate storage place for dumps" as part of some other feature or bug fix.

Restricted Application added a subscriber: Aklapper. Sep 23 2015, 11:58 AM

TTO added a subtask: T113472: Warnings functionality for the import process. Sep 23 2015, 1:01 PM

The mapping could be input before the upload and import begins, but that has disadvantages:

users must know all original namespace names
users must type the original namespace names and cannot be prompted
misspellings would have to abort the entire process after the initial dump lines are scanned

Advantages:

much more easily implemented
import is a single-pass process

Of course, uploaders could copy and paste the original namespace names from the lines at the beginning of the dump.
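
A minimal Python sketch of this single-pass variant, assuming the mapping is given up front as source name to target name: the mapping is checked against the namespace names declared at the start of the dump, a misspelled name aborts before any page is imported, and titles are rewritten as they stream past. The names `validate_mapping` and `map_title` are hypothetical, and an empty target is taken to mean the main namespace.

```python
def validate_mapping(mapping, declared_names):
    """Abort early if the mapping names a namespace the dump does not declare."""
    unknown = sorted(set(mapping) - set(declared_names))
    if unknown:
        raise ValueError("unknown source namespaces: " + ", ".join(unknown))

def map_title(title, mapping):
    """Rewrite the namespace prefix of one title according to the mapping."""
    prefix, sep, rest = title.partition(":")
    if sep and prefix in mapping:
        target = mapping[prefix]
        # An empty target moves the page into the main namespace.
        return f"{target}:{rest}" if target else rest
    return title

declared = ["", "Wikipedia", "Portal"]          # as read from the dump header
mapping = {"Wikipedia": "Project", "Portal": ""}

validate_mapping(mapping, declared)             # passes silently
print(map_title("Wikipedia:About", mapping))    # Project:About
print(map_title("Portal:Science", mapping))     # Science
print(map_title("Plain page", mapping))         # Plain page
```

A mapping such as `{"Wikipeda": "Project"}` would raise `ValueError` in `validate_mapping`, which corresponds to the abort-on-misspelling disadvantage listed above.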

TTO mentioned this in T64111: Importing XML dumps should validate that the target wiki has the same namespaces as the pages being imported. Sep 24 2015, 10:15 AM

TTO added a parent task: T64111: Importing XML dumps should validate that the target wiki has the same namespaces as the pages being imported.

• Purodha mentioned this in T114662: RFC: Per-language URLs for multilingual wiki pages. Feb 5 2016, 5:51 PM

• Phabricator_maintenance removed a project: Future-Release. Aug 12 2016, 4:54 PM

• Phabricator_maintenance removed a subscriber: • wikibugs-l-list.

jayvdb mentioned this in T143687: Allow setting target namespace in importDump.php. Aug 25 2016, 12:55 AM

ArielGlenn moved this task from Backlog to Snapshot import on the MediaWiki-Core-Snapshots board. Oct 29 2018, 10:04 AM

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:14 AM

Aklapper removed a subscriber: • Purodha.
