Running WikiExtractor.py on the same Swedish Wikipedia dump file on two different machines produced different text files: one used "\n" as the paragraph delimiter, the other "\n\n" (the latter is probably the more correct one).
Furthermore, WikiExtractor.py drops some tokens, such as "1 500": they are missing from the extracted text.
(WikiExtractor.py is currently used but will probably be replaced.)
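
Until the extractor is replaced, the two delimiter variants described above could be brought to a common form in a post-processing step. The following is only a minimal sketch, not part of WikiExtractor.py: the function name normalize_paragraph_breaks is hypothetical, and the code assumes that every run of one or more newlines in the extracted text marks a paragraph break, which has not been verified against the actual output.

```python
import re


def normalize_paragraph_breaks(text: str, delimiter: str = "\n\n") -> str:
    """Rewrite extracted text so that paragraphs are separated by `delimiter`.

    Assumption (not confirmed by the source): any run of newlines,
    whether "\n" or "\n\n", is a paragraph boundary.
    """
    # Unify Windows and old Mac line endings before splitting.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Split on runs of newlines and trim surrounding whitespace.
    paragraphs = [p.strip() for p in re.split(r"\n+", text)]
    # Drop empty pieces (e.g. whitespace-only lines) and rejoin.
    return delimiter.join(p for p in paragraphs if p)


if __name__ == "__main__":
    single = "First paragraph.\nSecond paragraph.\n"
    double = "First paragraph.\n\nSecond paragraph.\n"
    # Both variants should end up identical after normalization.
    print(normalize_paragraph_breaks(single) == normalize_paragraph_breaks(double))
```

This only addresses the delimiter difference. It cannot restore tokens such as "1 500" that are already missing from the extracted text.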