Add Hooks to SpecialImport and SpecialExport
Closed, Resolved | Public | Feature

Description

I have an extension that adds additional metadata to pages. I've modified SpecialImport and SpecialExport to handle importing/exporting this additional metadata, but it would be nice if this could be done via hooks instead of modifying core code. It doesn't look like an easy problem to solve since the importer relies on callbacks but...


Version: 1.11.x
Severity: enhancement

Details

Reference
bz11539

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 9:58 PM
bzimport set Reference to bz11539.
bzimport added a subscriber: Unknown Object (MLST).

Fix against 1.11.0 release

Adds a hook to Export.php right before it writes the closing </revision> to allow extensions to export additional metadata for a revision.

Changes the way importing is handled slightly. When an unknown tag is encountered in the <revision> block now, instead of throwing an error, the importer stores the tag in an array. After the revision is saved in the WikiRevision class, a hook is fired with the array as a parameter allowing extensions to import the custom tags. Hopefully nothing currently relies on the error throwing bit.

Usage:

$wgHooks['ExportPageRevision'][] = 'efExportMetadata';
function efExportMetadata( $writer, $revid, &$out ) {

    // Look up additional metadata for the revision using $revid
    $metadata = getRevisionMetadata( $revid );

    // Build an XML element for each metadata field and append it to $out.
    // $out is taken by reference here so the appended text reaches the exporter.
    foreach ( $metadata as $field => $value ) {
        $out .= "      " . wfElement( $field, null, strval( $value ) ) . "\n";
    }
    return true;

}

$wgHooks['ImportPageRevision'][] = 'efImportMetadata';
function efImportMetadata( $wikirev, $revid, $metadata ) {

    // Do something with the imported metadata
    $revision = Revision::newFromId( $revid );
    $custom_metadata_handler = new CustomMetadataHandler( $revision );

    // $metadata maps each unknown tag name to its contents
    foreach ( $metadata as $name => $value ) {
        $custom_metadata_handler->handle( $name, $value );
    }
    return true;

}

attachment ImportExport.diff ignored as obsolete

Please post patches as attachments so they can be managed without copy/paste problems.

I don't understand. Do you mean don't select the patch checkbox when you are submitting an attachment?

I mean *attach a file*, don't paste it into a comment.

Um, I did attach the patch. It's id 4219. It's the one and only attachment on this bug. What I typed into the first comment is not a patch; it's the recommended usage of the code that was added in the patch, i.e. something that an end user would put in an extension to use the new hooks.

Patch against 1.13r39163

The import hook in this patch fires right before the revision is saved and passes the revision object so it can be manipulated.

attachment ImportExport.patch ignored as obsolete

Mass component change: <some> -> Export/Import

mike.lifeguard+bugs wrote:

Is this a dupe of bug 11537?

It seems to me that this is a superset of bug 11537 in that it proposes more changes to the code. The part corresponding to bug 11537 is the patch to importOldRevision(). This patch is different from that of bug 11537 and appears at a different place, but I think it can be used for similar purposes. (The order in which hooks would be fired is different between the two bugs, but this should be of little importance.)

Personally, I'd be happy if either of the two were included in one of the next releases.

My DataTable extension (http://www.mediawiki.org/wiki/Extension:DataTable) now uses the hook NewRevisionFromEditComplete (http://www.mediawiki.org/wiki/Manual:Hooks/NewRevisionFromEditComplete), so personally I don't have any further need for the hook suggested here.

So I still need this, but I'm not going to continue making patches for it until there is some indication that it is going to get added to the code. If someone with permissions feels like committing this, let me know and I'll generate a new patch against trunk.

Oh so close!

Unless I'm missing something, ImportHandleRevisionXMLTag is worthless for processing the imported XML. It doesn't pass the name of the tag being processed or the contents of the tag. Also, the tag name/contents are stored in a private member/function of the importer so you can't get to them in a hook.

Can we either, 1) modify the hook to pass the tag name and tag contents or 2) make the $reader member and nodeContents function in WikiImporter public?

sumanah wrote:

Christian, I am very sorry for the wait. I've cc'd the people who currently work on these dumps so they can advise you. Thank you for your contribution and sorry again for the delay in response!

Comment on attachment 5175
Patch against 1.13r39163

Marking as obsolete as it is superseded in core

(In reply to comment #13)

Oh so close!

Unless I'm missing something, ImportHandleRevisionXMLTag is worthless for
processing the imported XML. It doesn't pass the name of the tag being
processed or the contents of the tag. Also, the tag name/contents are stored
in a private member/function of the importer so you can't get to them in a
hook.

Can we either, 1) modify the hook to pass the tag name and tag contents or 2)
make the $reader member and nodeContents function in WikiImporter public?

Well, making the variables directly public is a no, but adding a getFunction() is fair enough.

Extra parameters passed to the hook should be fine.

What do the currently passed things give you?

Make nodeContents public and add the $tag param to various hooks

(In reply to comment #16)

What do the currently passed things give you?

http://www.mediawiki.org/wiki/Manual:Hooks/ImportHandleRevisionXMLTag
http://www.mediawiki.org/wiki/Manual:Hooks/AfterImportPage

  • $importer: The WikiImporter object
  • $pageInfo: An array of xml tag names => xml tag content for the <page> object
  • $revisionInfo: An array of xml tag names => xml tag contents for the <revision> object

Theoretically in the ImportHandleRevisionXMLTag hook, you would process the XML input and add data to the $pageInfo or $revisionInfo array. Then later on in the AfterImportPage hook, you could process the data and save it to the database or whatever.

The problem is, the actual data being parsed out of the XML is stored in the $tag object in the importer and that isn't passed to the hook so you can't actually see what tag is being encountered. Adding the $tag param to the hook would fix that. To get the contents of the XML tag, you need to call $importer->nodeContents() which is currently a private function. Making that function public would totally solve that.
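
To make the request concrete, here is a minimal standalone sketch of the behaviour being asked for: the handler is given both the tag name and the tag contents, and stashes them in $revisionInfo for a later step. It is not MediaWiki code. collectUnknownRevisionTags(), $handler and the sample <rating>/<reviewer> tags are hypothetical names made up for illustration, and SimpleXML stands in for the importer's own reader.

```php
<?php
// Standalone sketch: this is NOT MediaWiki's WikiImporter. All function and
// variable names below are hypothetical and only mirror the idea in the thread.

/**
 * Walk the children of a <revision> element and pass every tag that is not in
 * $knownTags to $hook, together with the tag name and its text contents.
 */
function collectUnknownRevisionTags( string $revisionXml, array $knownTags, callable $hook ): array {
    $revision = simplexml_load_string( $revisionXml );
    if ( $revision === false ) {
        throw new InvalidArgumentException( 'Could not parse revision XML' );
    }

    $revisionInfo = [];
    foreach ( $revision->children() as $child ) {
        $tag = $child->getName();
        if ( in_array( $tag, $knownTags, true ) ) {
            continue; // a real importer would handle these tags itself
        }
        // The handler gets what the comment above asks for: name and contents.
        $hook( $tag, (string)$child, $revisionInfo );
    }
    return $revisionInfo;
}

// Hypothetical handler: record custom tags so a later step can save them.
$handler = function ( string $tag, string $contents, array &$revisionInfo ): void {
    $revisionInfo[$tag] = $contents;
};

$xml = '<revision><id>7</id><text>Hello</text>'
    . '<rating>5</rating><reviewer>Ann</reviewer></revision>';
$info = collectUnknownRevisionTags( $xml, [ 'id', 'timestamp', 'text' ], $handler );
```

In this sketch $info ends up holding only the two custom tags, which is the data an AfterImportPage-style handler would then write to the database.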

(In reply to comment #17)

Created attachment 9869 [details]
Make nodeContents public and add the $tag param to various hooks

(In reply to comment #16)

What do the currently passed things give you?

http://www.mediawiki.org/wiki/Manual:Hooks/ImportHandleRevisionXMLTag
http://www.mediawiki.org/wiki/Manual:Hooks/AfterImportPage

  • $importer: The WikiImporter object
  • $pageInfo: An array of xml tag names => xml tag content for the <page> object
  • $revisionInfo: An array of xml tag names => xml tag contents for the <revision> object

Theoretically in the ImportHandleRevisionXMLTag hook, you would process the XML
input and add data to the $pageInfo or $revisionInfo array. Then later on in
the AfterImportPage hook, you could process the data and save it to the
database or whatever.

The problem is, the actual data being parsed out of the XML is stored in the
$tag object in the importer and that isn't passed to the hook so you can't
actually see what tag is being encountered. Adding the $tag param to the hook
would fix that. To get the contents of the XML tag, you need to call
$importer->nodeContents() which is currently a private function. Making that
function public would totally solve that.

I think your patch is the wrong way round, as it shows you're making it private

Also, please use a unified diff against a file, see https://www.mediawiki.org/wiki/Subversion#Making_a_diff

(In reply to comment #18)

I think your patch is the wrong way round, as it shows you're making it private

Also, please use a unified diff against a file, see
https://www.mediawiki.org/wiki/Subversion#Making_a_diff

Bah! I was trying to make a diff without checking anything out. I'll try again later when I have some time to mess with it.

sumanah wrote:

Christian, is this still something you have an interest in? We now use Git (https://www.mediawiki.org/wiki/Git/Tutorial), in case you'd like to address this hook addition again. (And if the answer is "no, I'm rather tired of this issue and wash my hands of it" I completely understand.)

Best wishes.

I am still interested in the capability but realistically I'm not going to be submitting a patch for it again. Thanks for following up.

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 12:24 PM
Aklapper removed a subscriber: wikibugs-l-list.
Pppery subscribed.

Ancient task with no updates in a decade. https://www.mediawiki.org/wiki/Category:MediaWiki_hooks_included_in_WikiImporter.php shows that there are now plenty of import hooks available.