
Pagefromfile.py script returns errors
Closed, Resolved, Public

Description

I'm trying to use the pagefromfile.py script from pywikibot to import text files as new pages on my MediaWiki server. I am able to log in and do some basic maintenance, but this specific bot gives me an error: https://pastebin.com/WDTfuLLH.

Steps to reproduce:

  • nano pages.txt
  • Paste the example from the wiki:

{{-start-}}
'''Pywikibot''' is a Python library and collection of scripts that automate wor$
Originally designed for Wikipedia, it is now used throughout the Wikimedia Foun$
{{-stop-}}
{{-start-}}
'''AutoWikiBrowser''' (often abbreviated '''AWB''') is a semi-automated MediaWi$
{{-stop-}}

  • python pwb.py pagefromfile -file:pages.txt
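For readers unfamiliar with the file format used above, here is a minimal sketch of how such a file could be split into (title, content) tuples. It assumes the entries are delimited by {{-start-}} and {{-stop-}} and that the title is the first text wrapped in triple quotes. `read_entries` and both regular expressions are hypothetical names for illustration, not the actual pagefromfile.py reader, and the sample text is shortened placeholder content.

```python
import re

# Hypothetical stand-ins; this is NOT the real pagefromfile.py implementation.
ENTRY_RE = re.compile(r"\{\{-start-\}\}(.*?)\{\{-stop-\}\}", re.DOTALL)
TITLE_RE = re.compile(r"'''(.*?)'''")


def read_entries(text):
    """Yield a (title, content) tuple for each start/stop delimited entry."""
    for match in ENTRY_RE.finditer(text):
        content = match.group(1).strip()
        title_match = TITLE_RE.search(content)
        if title_match is None:
            continue  # no bold title found, so the entry is skipped
        yield title_match.group(1), content


# Shortened placeholder text in the same shape as pages.txt above.
sample = """{{-start-}}
'''Pywikibot''' is a Python library.
{{-stop-}}
{{-start-}}
'''AutoWikiBrowser''' (often abbreviated '''AWB''') is a tool.
{{-stop-}}"""

entries = list(read_entries(sample))
titles = [title for title, _ in entries]
```

Note that the sketch yields plain tuples rather than page objects.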

Event Timeline

@Sc4s2cg Hello, thank you for taking the time to create a task for your issue. In which folder is the file pages.txt stored? Could you also please include the version information (the output of pwb.py version) so we can investigate?

@Dvorapa thanks for the quick reply!

pages.txt was stored in the core directory. I copied it to the scripts directory and the same error persists. Version information:

NetShare\core> python pwb.py version
Pywikibot: pywikibot/__init__.py (, -1 (unknown), 2018/06/06, 08:01:50, UNKNOWN)
Release version: 3.1.dev0
requests version: 2.18.4
  cacerts: C:\Users\Peter\AppData\Local\Programs\Python\Python36-32\lib\site-packages\certifi\cacert.pem
    certificate test: ok
Python: 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 16:07:46) [MSC v.1900 32 bit (Intel)]
PYWIKIBOT2_DIR: Not set
PYWIKIBOT2_DIR_PWB:
PYWIKIBOT2_NO_USER_CONFIG: Not set
Config base dir: \\192.168.1.240\NetShare\core
Usernames for family "famwiki":
        en: Petike (no sysop configured)

I can reproduce the issue. @Xqt The traceback involves skip_page and run. Do you know what the problem could be here?

It looks like page isn’t a Page object but a tuple.

Xqt triaged this task as Medium priority.

Yes, because it is created by the generator, and the generator here returns a tuple.

What's weird is that this worked correctly until recently, so some more recent change must have broken it.

I found the issue. The bot's generator yields a (title, content) tuple, and treat() then builds the Page object from it. But d582f9dd0a09872906abe1ba4c10074275e68a52 added a skip_page() check on lines 1470 and 1471, before the treat() call, and skip_page() already expects a Page object. So the solution here would be to move the Page creation in pagefromfile from treat() to init_page().
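To illustrate the diagnosis, here is a simplified, self-contained sketch of the pattern. `Page`, `BaseBotSketch` and `PageFromFileSketch` are hypothetical stand-ins, not the real pywikibot classes, and the run loop only assumes what is described above: skip_page() is called before treat(). The exact exception in the original traceback is not reproduced; in this sketch the tuple fails with an AttributeError.

```python
class Page:
    """Hypothetical stand-in for a wiki page object."""

    def __init__(self, title, text=""):
        self._title = title
        self.text = text

    def title(self):
        return self._title


class BaseBotSketch:
    """Simplified run loop: skip_page() is checked before treat()."""

    def __init__(self, generator):
        self.generator = generator
        self.saved = []

    def init_page(self, item):
        # Default: assume the generator already yields page objects.
        return item

    def skip_page(self, page):
        # Expects a page object; here, pages without a title are skipped.
        return not page.title()

    def treat(self, page):
        self.saved.append((page.title(), page.text))

    def run(self):
        for item in self.generator:
            page = self.init_page(item)
            if self.skip_page(page):
                continue
            self.treat(page)


class PageFromFileSketch(BaseBotSketch):
    """Converts (title, content) tuples before skip_page() sees them."""

    def init_page(self, item):
        title, content = item
        return Page(title, content)


entries = [("Pywikibot", "'''Pywikibot''' is a Python library."),
           ("", "untitled entry")]

fixed = PageFromFileSketch(entries)
fixed.run()

# Without the init_page() override, the raw tuple reaches skip_page().
try:
    BaseBotSketch(entries).run()
    error = None
except AttributeError as exc:
    error = exc
```

The design point is that the conversion happens in one hook that runs before any check touching the page, so skip_page() and treat() can both rely on receiving a page object.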

Change 437959 had a related patch set uploaded (by Dvorapa; owner: Dvorapa):
[pywikibot/core@master] [bugfix] Initialize page before skip_page

https://gerrit.wikimedia.org/r/437959

Dvorapa raised the priority of this task from Medium to High. Jun 7 2018, 11:20 AM
Dvorapa removed a subscriber: Pywikibot.

Change 439446 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [bugfix] Ensure that BaseBot.treat is always processing a Page object

https://gerrit.wikimedia.org/r/439446

Change 437959 abandoned by Dvorapa:
[bugfix] Initialize page before skip_page

Reason:
In favor of https://gerrit.wikimedia.org/r/#/c/pywikibot/core/+/439446/

https://gerrit.wikimedia.org/r/437959

Change 439446 merged by jenkins-bot:
[pywikibot/core@master] [bugfix] Ensure that BaseBot.treat is always processing a Page object

https://gerrit.wikimedia.org/r/439446

I tested Change 439446 on two different wikis. It seems to work fine and produces no errors.

Edit: oh, it has been merged already, good.

Vvjjkkii renamed this task from Pagefromfile.py script returns errors to 9ibaaaaaaa. Jul 1 2018, 1:05 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii removed Xqt as the assignee of this task.
Vvjjkkii updated the task description. (Show Details)
Vvjjkkii removed subscribers: gerritbot, Aklapper.