claimit.py: Duplicated references in single edit
Closed, Resolved | Public | Bug Report

Description

A total of 8 references were added in a single edit. This cannot be normal.

Command used

$ python pwb.py claimit -cat:"미스코리아 지역 대회" P361 Q494195

Version info

$ git fetch --all
Fetching origin
$ git rebase origin/master
Current branch master is up to date.
$ git show
commit c1ed3a2b8932b9e05f186c3c52e6f34d531e6f27 (HEAD -> master, origin/master, origin/HEAD)
Author: xqt <info@gno.de>
Date:   Sun Mar 17 18:36:56 2019 +0100

Event Timeline

Tracked this down. The cause is a combination of rPWBC80e00af9e1139b16842535163f8656a5cea9737d and a quite permissive implementation of pywikibot.Claim. This applies to T224283 as well.

In claimit.py, we have:

scripts/claimit.py
def treat_page_and_item(self, page, item):
    for claim in self.claims:
        # The generator might yield pages from multiple sites
        site = page.site if page is not None else None
        self.user_add_claim_unless_exists(
            item, claim, self.exists_arg, site)

self.claims is a list of references to preloaded Claim objects. These objects are mutable, so when a source is attached to one of them during the import, the instance keeps it. For the next item another source is added to the same instance, so the claim is saved with two sources, and so on. This can only be prevented by creating a new instance of the claim every time it is added to an item. Note that harvest_template.py doesn't have this problem because a fresh instance of Claim is always constructed there.
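
To illustrate the accumulation, here is a minimal sketch that does not use pywikibot at all. ToyClaim and import_claim are hypothetical stand-ins invented for this example; the only thing they share with the real code is the pattern of reusing one mutable object across several items.

```python
class ToyClaim:
    """Hypothetical stand-in for a claim; this is NOT pywikibot.Claim."""

    def __init__(self, prop, target):
        self.prop = prop
        self.target = target
        self.sources = []  # mutable state that lives on the instance

    def add_source(self, source):
        self.sources.append(source)


def import_claim(saved, claim, source):
    """Attach a source, then record how many sources would be saved."""
    claim.add_source(source)
    saved.append((claim.prop, claim.target, len(claim.sources)))


# The claim is built once and reused for every page, as with self.claims.
preloaded = [ToyClaim("P361", "Q494195")]
saved = []
for page in ["page A", "page B", "page C"]:
    for claim in preloaded:
        import_claim(saved, claim, source="imported from " + page)

source_counts = [count for _, _, count in saved]
print(source_counts)  # [1, 2, 3]
```

Each later item is saved with one more source than the previous one, which matches the growing number of references seen in the report.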

This can be fixed by adding a method that makes a copy of a claim (and possibly by preventing a claim that has already been imported from being saved to another item).
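
A rough sketch of that idea, again with toy classes rather than pywikibot itself. ToyClaim, ToyItem, the on_item attribute and the ValueError are all assumptions made for illustration; in particular, this toy copy() carries over only the property and target, which may differ from what the actual Claim.copy does.

```python
class ToyClaim:
    """Hypothetical stand-in for a claim; this is NOT pywikibot.Claim."""

    def __init__(self, prop, target):
        self.prop = prop
        self.target = target
        self.sources = []
        self.on_item = None  # set once the claim has been saved to an item

    def copy(self):
        """Return a fresh, unsaved claim with the same property and target."""
        return ToyClaim(self.prop, self.target)


class ToyItem:
    """Hypothetical item that refuses claims already saved elsewhere."""

    def __init__(self, item_id):
        self.item_id = item_id
        self.claims = []

    def add_claim(self, claim, source):
        if claim.on_item is not None:
            raise ValueError("claim was already saved to " + claim.on_item)
        claim.sources.append(source)
        claim.on_item = self.item_id
        self.claims.append(claim)


template = ToyClaim("P361", "Q494195")
items = [ToyItem("Q1"), ToyItem("Q2"), ToyItem("Q3")]
for item in items:
    # A new instance per item, so sources never leak between items.
    item.add_claim(template.copy(), source="imported from " + item.item_id)

counts = [len(item.claims[0].sources) for item in items]
print(counts)  # [1, 1, 1]
```

The guard in add_claim covers the second half of the suggestion: passing a claim that already belongs to one item to another item fails loudly instead of silently sharing state.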

Change 512415 had a related patch set uploaded (by Matěj Suchánek; owner: Matěj Suchánek):
[pywikibot/core@master] [IMPR] [FIX] Introduce Claim.copy and prevent adding already saved claims

https://gerrit.wikimedia.org/r/512415

Change 512415 merged by jenkins-bot:
[pywikibot/core@master] [IMPR] [FIX] Introduce Claim.copy and prevent adding already saved claims

https://gerrit.wikimedia.org/r/512415

Xqt claimed this task.

Reopened due to a lot of failing tests

Xqt changed the subtype of this task from "Task" to "Bug Report". May 25 2019, 3:16 PM

Change 512491 had a related patch set uploaded (by Matěj Suchánek; owner: Matěj Suchánek):
[pywikibot/core@master] Fix Claim.copy to make tests pass

https://gerrit.wikimedia.org/r/512491

Change 512491 merged by jenkins-bot:
[pywikibot/core@master] Fix Claim.copy to make tests pass

https://gerrit.wikimedia.org/r/512491