
mwdumper documentation for --filter is incorrect for namespaces
Closed, ResolvedPublic

Description

Author: JArmistead

Description:
If I run the following command:

java -jar mwdumper.jar --output=file:temp10.xml --format=xml --filter=namespace:10 metawiki-20060202-pages-meta-current.xml

Then the resulting XML file, temp10.xml, actually INCLUDES all articles from namespace 10 (templates).

Yet the README.txt file for mwdumper

http://download.wikimedia.org/tools/README.txt

clearly says

--filter=namespace:[!]<NS_KEY,NS_OTHERKEY,100,...>
    Excludes all pages in (or not in, with "!") the given namespaces.
    You can use the NS_* constant names or the raw numeric keys.

i.e. it should EXCLUDE all namespace 10 (Template) pages.

It seems the README.txt has it the wrong way round, and it should instead read:

--filter=namespace:[!]<NS_KEY,NS_OTHERKEY,100,...>
    Includes all pages in (or not in, with "!") the given namespaces.
    You can use the NS_* constant names or the raw numeric keys.

At least, then it would line up with the observed behaviour.
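
To make the observed semantics concrete, here is a minimal sketch of an include-style namespace filter with "!" negation. It is only an illustration of the behaviour described above, not mwdumper's actual code: the class name NamespaceFilterSketch and the method keeps() are hypothetical, and the sketch assumes raw numeric keys only (no NS_* constant names).

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration only; not taken from the mwdumper source. */
class NamespaceFilterSketch {
    private final Set<Integer> namespaces = new HashSet<Integer>();
    private final boolean invert;

    /** Parses a spec such as "10" or "!0,10" (assumption: numeric keys only). */
    NamespaceFilterSketch(String spec) {
        invert = spec.startsWith("!");
        String list = invert ? spec.substring(1) : spec;
        for (String key : list.split(",")) {
            namespaces.add(Integer.parseInt(key.trim()));
        }
    }

    /** Returns true if a page in the given namespace is kept in the output. */
    boolean keeps(int namespace) {
        // Listed namespaces are kept; "!" flips this to keep the unlisted ones.
        return namespaces.contains(namespace) != invert;
    }

    public static void main(String[] args) {
        NamespaceFilterSketch only10 = new NamespaceFilterSketch("10");
        System.out.println(only10.keeps(10)); // prints "true": templates are included
        System.out.println(only10.keeps(0));  // prints "false": main namespace is dropped

        NamespaceFilterSketch not10 = new NamespaceFilterSketch("!10");
        System.out.println(not10.keeps(10));  // prints "false"
        System.out.println(not10.keeps(0));   // prints "true"
    }
}
```

Under this reading, --filter=namespace:10 keeps only namespace 10, which matches what temp10.xml contains, and --filter=namespace:!10 would be the way to exclude it.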

My Windows XP SP2 PC reports the following Java version:

java -version

java version "1.5.0_04"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_04-b05)
Java HotSpot(TM) Client VM (build 1.5.0_04-b05, mixed mode)


Version: 1.5.x
Severity: normal
OS: Windows XP
Platform: PC

Details

Reference
bz4835

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 9:04 PM
bzimport set Reference to bz4835.
bzimport added a subscriber: Unknown Object (MLST).