
Special:Import should optionally set timestamp and summary comment
Closed, Declined (Public)

Description

The easiest way to search-and-replace in MediaWiki is to use Special:Export, then edit the export dump file, then use Special:Import. Unfortunately, there are two pieces missing from this method:

  • All <timestamp> fields are wrong, because the import is happening NOW, not at the time of the last revision
  • All summary comments are wrong, because they don't mention the search-and-replace operation.

To achieve this right now, we have to edit the dump file to change timestamps and summary comments... but couldn't Special:Import do this for us, optionally? (Or perhaps maintenance/importDump.php?)
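As a rough illustration of the manual dump edit described above, here is a minimal Python sketch that overrides every revision's timestamp and summary comment in an export file before it is re-imported. The function names `restamp_dump` and `local_name` are hypothetical, and the sketch assumes the usual export layout in which each `<revision>` holds `<timestamp>` and `<comment>` children; it is not part of MediaWiki.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def restamp_dump(xml_text, summary, now=None):
    """Set every revision's <timestamp> to `now` and its <comment> to `summary`.

    `now` is assumed to be a UTC datetime; it defaults to the current time.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")  # e.g. 2009-02-18T12:00:00Z

    root = ET.fromstring(xml_text)
    if root.tag.startswith("{"):
        # Keep the dump's namespace as the default one so tags are written
        # back without an "ns0:" prefix.
        ET.register_namespace("", root.tag[1:root.tag.index("}")])

    for element in root.iter():
        if local_name(element.tag) != "revision":
            continue
        for child in element:
            name = local_name(child.tag)
            if name == "timestamp":
                child.text = stamp
            elif name == "comment":
                # Only existing <comment> elements are overridden; revisions
                # exported without a summary are left without one.
                child.text = summary
    return ET.tostring(root, encoding="unicode")
```

The search-and-replace on the page text itself would still be done separately; this only covers the two fields the request is about.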

I suggest adding two new fields to Special:Import:

  1. A checkbox, "Set all import timestamps to NOW", defaulted to unchecked
  2. A text input, "Override all summary comments", defaulted to empty

and/or modifying maintenance/importDump.php to add these features via command line.


Version: 1.13.x
Severity: enhancement

Details

Reference
bz17554

Event Timeline

bzimport raised the priority of this task from to Low. Nov 21 2014, 10:31 PM
bzimport set Reference to bz17554.
bzimport added a subscriber: Unknown Object (MLST).

This isn't really what Import was designed for. If you want to edit the contents of a page, you should really do this via a normal edit. If a lot of pages need an automatic search-and-replace, there is an API for this. But letting people modify aspects of the page history via the import form sounds a little scary.

Hmmm. I don't see why this is "modifying aspects of the page history." It's just importing a new, current version at the current time, not in the past.

If the MediaWiki API can do this simply, that's great. Where can I see a complete example that does search-and-replace in general? ("In general" meaning "you don't have to write a new PHP script for every search/replace operation.") Somehow I wonder, if it were this simple, wouldn't someone already have written a search-and-replace Special Page?

Also, I think the import solution is better than an API solution in one respect: you can see and validate your changes (in the XML file) before import. With the API, if you get your search-and-replace patterns wrong (which is VERY easy to do), you've modified your content wrongly. I find that more "scary" than modifying an XML file you can check in advance.

(In reply to comment #2)

> Hmmm. I don't see why this is "modifying aspects of the page history." It's
> just importing a new, current version at the current time, not in the past.

Export/Import is designed for copying the history or content of a page from one wiki to another. AFAIK, Import isn't designed so that people can download the XML document, then re-upload it.

> If the MediaWiki API can do this simply, that's great. Where can I see a
> complete example that does search-and-replace in general? ("In general"
> meaning "you don't have to write a new PHP script for every search/replace
> operation.") Somehow I wonder, if it were this simple, wouldn't someone
> already have written a search-and-replace Special Page?

Something like http://www.mediawiki.org/wiki/Extension:Replace_Text? It doesn't support regex yet, but it does seem to be maintained at least.
I don't know of any premade PHP scripts, but I know Pywikipedia has one, and writing a simple one shouldn't be too difficult:

  1. Define the search expression and replace text
  2. Load the list of pages to edit from a text file or something
  3. Log in to the site with action=login
  4. Get the page text for each page from prop=revisions
  5. Perform the replacements
  6. Save the pages with action=edit
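The six steps above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: it targets a recent MediaWiki `api.php` (tokens from `meta=tokens`, `formatversion=2`, `rvslots=main`), which differs from the 1.13-era API this thread is about, and the helper names (`make_api`, `load_titles`, `login`, `fetch_text`, `search_and_replace`) are hypothetical.

```python
import json
import re
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_api(api_url):
    """Return a function that POSTs parameters to api.php and decodes the JSON reply."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))  # keeps the login session

    def api(**params):
        params.update(format="json", formatversion="2")
        data = urllib.parse.urlencode(params).encode("utf-8")
        with opener.open(api_url, data) as response:
            return json.load(response)

    return api


def load_titles(path):
    """Step 2: read one page title per line, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def login(api, username, password):
    """Step 3: log in with action=login (recent wikis expect a login token)."""
    token = api(action="query", meta="tokens", type="login")["query"]["tokens"]["logintoken"]
    status = api(action="login", lgname=username, lgpassword=password,
                 lgtoken=token)["login"]["result"]
    if status != "Success":
        raise RuntimeError("login failed: " + status)


def fetch_text(api, title):
    """Step 4: return the current wikitext of a page, or None if it does not exist."""
    reply = api(action="query", prop="revisions", rvprop="content",
                rvslots="main", titles=title)
    page = reply["query"]["pages"][0]
    if page.get("missing"):
        return None
    return page["revisions"][0]["slots"]["main"]["content"]


def search_and_replace(api, titles, pattern, replacement, summary):
    """Steps 1, 5 and 6: apply a regex replacement and save every page that changed."""
    regex = re.compile(pattern)
    token = api(action="query", meta="tokens")["query"]["tokens"]["csrftoken"]
    changed = []
    for title in titles:
        old_text = fetch_text(api, title)
        if old_text is None:
            continue  # page is missing, nothing to edit
        new_text = regex.sub(replacement, old_text)
        if new_text == old_text:
            continue  # no match, so no edit is made
        reply = api(action="edit", title=title, text=new_text,
                    summary=summary, token=token)
        if reply.get("edit", {}).get("result") != "Success":
            raise RuntimeError("edit failed for " + title)
        changed.append(title)
    return changed
```

Passing the `api` callable in, rather than hard-coding the HTTP calls, keeps the replacement logic testable against a stub before it is pointed at a live wiki.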

> Also, I think the import solution is better than an API solution in one
> respect: you can see and validate your changes (in the XML file) before import.
> With the API, if you get your search-and-replace patterns wrong (which is VERY
> easy to do), you've modified your content wrongly. I find that more "scary"
> than modifying an XML file you can check in advance.

If this is that much of a concern, add a step to the list above before saving: write the new text to a file, or run it through action=parse and load the result in a browser, then wait for user input.
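One way to sketch such a review step in Python is shown below. Note that it swaps in a plain-text unified diff (via the standard `difflib` module) instead of rendering through action=parse in a browser, and the names `preview` and `confirm` are hypothetical.

```python
import difflib


def preview(title, old_text, new_text):
    """Return a unified diff of the proposed change, or '' if nothing changes."""
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=title + " (current)",
        tofile=title + " (proposed)",
        lineterm="",
    )
    return "\n".join(diff)


def confirm(title, old_text, new_text, ask=input):
    """Show the diff and return True only if the user explicitly answers 'y'."""
    diff = preview(title, old_text, new_text)
    if not diff:
        return False  # nothing to save
    print(diff)
    return ask("Save " + title + "? [y/N] ").strip().lower() == "y"
```

Calling `confirm` just before the save step means a wrong pattern shows up as an unexpected diff rather than as a bad edit on the wiki.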

Closing as WONTFIX per earlier comments. If you want to do a search-and-replace, use the edit API (or if keen and able, modify the database directly). Export/modify/reimport (aka rewriting history) is not something we should be supporting in MW core.