
Cosmetic_changes.py deletes poorly formatted cross wiki-links
Closed, Duplicate (Public)

Description

Originally from: http://sourceforge.net/p/pywikipediabot/feature-requests/296/
Reported by: Anonymous user
Created on: 2011-12-26 21:47:33
Subject: Cosmetic_changes.py deletes cross wiki-links
Original description:
Python 2.6.7 (r267:88850, Sep 19 2011, 13:25:28)
[GCC 4.5.2]
config-settings:
use_api = True
use_api_login = True
unicode test: ok

1. I ran one command on the RU wiki:
python /home/$USERNAME/pywiki/cosmetic_changes.py -lang:ru -always -file:/tmp/somefile

2. The file "/tmp/somefile" contains the list of articles to process:
Struthiomimus
QoS

3. In the article "Struthiomimus" on the RU wiki, I see that the EN cross-wiki link was deleted:
[[en:Steveville|Steveville]]
https://secure.wikimedia.org/wikipedia/ru/w/index.php?title=Struthiomimus&diff=prev&oldid=40295326

4. I think cosmetic_changes.py should not delete cross-links to other wikis.


Version: unspecified
Severity: enhancement
See Also:
https://sourceforge.net/p/pywikipediabot/feature-requests/296

Details

Reference
bz55041

Event Timeline

bzimport raised the priority of this task to Low. (Nov 22 2014, 2:16 AM)
bzimport set Reference to bz55041.
bzimport added a subscriber: Unknown Object (????).

The error is in "standardizePageFooter".
When I comment out the line
# text = self.standardizePageFooter(text)
in the file cosmetic_changes.py,
I can no longer reproduce the error.

  • status: open --> open-duplicate

This may be caused by the same code, but the desired effect is slightly different, so it's not entirely a dup, though it is obviously related. In this case, cosmetic_changes.py should convert [[en:Steveville|Steveville]] to [[:en:Steveville|Steveville]], because the link is in the body of the article, rather than treating it as an interwiki language link.
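The conversion proposed above could be sketched roughly as follows. This is only an illustration, not the code in cosmetic_changes.py: the function name escape_language_links and the short LANG_CODES tuple are made up for this example, and a real bot would take the language codes from its family configuration. The function adds the leading colon to every language-prefixed link in the text it is given, so it should only be fed body text, not the interwiki block.

```python
import re

# Hypothetical subset of language codes, for illustration only.
LANG_CODES = ("en", "ru", "de", "fr")

# Matches [[xx:Title]] or [[xx:Title|label]] without a leading colon.
LANG_LINK = re.compile(
    r"\[\[(?P<code>%s):(?P<rest>[^\[\]]+)\]\]" % "|".join(LANG_CODES)
)


def escape_language_links(text):
    """Prefix a colon to every [[xx:...]] link in the given text."""
    return LANG_LINK.sub(
        lambda m: "[[:%s:%s]]" % (m.group("code"), m.group("rest")),
        text,
    )
```

Links that already start with a colon are not matched by the pattern, so running the function twice leaves the text unchanged.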

  • priority: 5 --> 7
  • status: open-duplicate --> open

Like the MediaWiki software, pwb interprets [[en:Steveville|Steveville]] as an interwiki link, not as a crosswiki link. Crosswiki links must be written as [[:en:Steveville|Steveville]]. I do not see anything wrong with the bot.

  • assigned_to: nobody --> xqt
  • status: open --> pending-invalid

It's still a bug to remove a link that isn't supposed to be removed, even if the link is improperly formed. The bug isn't invalid; it's just not possible to overcome by normal means (as someone pointed out to me on IRC, users sometimes put real interwiki language links in the body because they don't know the correct place). Still, it would probably be possible to get the bot to avoid these.

If this link is the only iw link to another site, it will be moved to the bottom of the page. Otherwise it will be deleted (which is tracked in another bug). The reported behavior is indeed wrong. It was not an inline interwiki link as you proposed; it was a malformed crosswiki link. Maybe interwiki links should always start at the beginning of a new line, which would have to be supported by the mw software. But there are several wikis which place interwiki links on one line and/or at the top of a page. See some of the switches in config.py.

Moved to feature request: maybe it is an idea to fix crosswiki links inside the text. But we should remember that interwiki links do not always start on a new line. They could be placed on a single line; this is set by family.interwiki_on_one_line.
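One possible way to tell the trailing interwiki block apart from links inside the text, without requiring each link to start on a new line, is sketched below. It is a heuristic under stated assumptions, not the bot's actual logic: split_interwiki_footer and LANG_CODES are hypothetical names, and the sketch assumes the interwiki block sits at the end of the page, which does not hold for wikis that place it at the top.

```python
import re

# Hypothetical subset of language codes, for illustration only.
LANG_CODES = ("en", "ru", "de", "fr")

LANG_LINK = r"\[\[(?:%s):[^\[\]]+\]\]" % "|".join(LANG_CODES)

# A trailing run of language links separated only by whitespace, so the
# links may sit on separate lines or all on a single line.
FOOTER = re.compile(r"(?:\s*%s)+\s*\Z" % LANG_LINK)


def split_interwiki_footer(text):
    """Return (body, footer_links); footer_links is empty if no block is found."""
    match = FOOTER.search(text)
    if match is None:
        return text, []
    links = re.findall(LANG_LINK, match.group(0))
    return text[:match.start()], links
```

A language-prefixed link followed by further prose stays in the body, where it could then be reported or rewritten with a leading colon instead of being removed.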

  • priority: 7 --> 5
  • labels: 745455 --> interwiki
  • assigned_to: xqt --> nobody
  • status: pending-invalid --> open
  • This bug has been marked as a duplicate of bug 55298
jayvdb set Security to None.
jayvdb removed a subscriber: Unknown Object (????).