
Latest Pywikibot git version doesn't work on Toolforge
Closed, Resolved | Public | BUG REPORT

Description

I have been running the git version of Pywikibot on several accounts for Toolforge for years. This way I have an easy way to control upgrades.

Due to T280806 I upgraded one of the accounts. I'm on the latest version now:

$ git pull origin master
From https://gerrit.wikimedia.org/r/pywikibot/core
 * branch                master     -> FETCH_HEAD
Already up to date.
$ git rev-parse HEAD
2cf5b090d79cd8a27ad8c446286bc89a7f1083ff
$ git submodule update --recursive --remote
Submodule path 'scripts/i18n': checked out '184a0978948d3b0a59b3c133024a87ade720a4f9'

But it crashes hard:

$ python3 pwb.py version.py

Pywikibot is missing a MediaWiki markup parser which is necessary.
Please update the required module with either

    pip install "mwparserfromhell>=0.5.0"

or

    pip install "wikitextparser>=0.47.5"

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/mnt/nfs/labstore-secondary-tools-project/geograph/pywikibot/pywikibot/comms/http.py", line 88, in flush
    log('Closing network session.')
  File "/mnt/nfs/labstore-secondary-tools-project/geograph/pywikibot/pywikibot/logging.py", line 193, in log
    logoutput(text, decoder, newline, VERBOSE, **kwargs)
  File "/mnt/nfs/labstore-secondary-tools-project/geograph/pywikibot/pywikibot/logging.py", line 79, in logoutput
    _init()
  File "/mnt/nfs/labstore-secondary-tools-project/geograph/pywikibot/pywikibot/logging.py", line 38, in _init
    init_routine()
  File "/mnt/nfs/labstore-secondary-tools-project/geograph/pywikibot/pywikibot/tools/_deprecate.py", line 487, in wrapper
    return obj(*new_args, **new_kwargs)
  File "/mnt/nfs/labstore-secondary-tools-project/geograph/pywikibot/pywikibot/bot.py", line 398, in init_handlers
    if pywikibot.Site.__doc__ != 'TEST':  # set by aspects.DisableSiteMixin
AttributeError: module 'pywikibot' has no attribute 'Site'

There are two parts to this (probably introduced in T106763):

  • If this is such an important dependency, why isn't it in Toolforge?
  • The AttributeError should be resolved

Event Timeline

Multichill added a subscriber: JJMC89.

Not sure why the Toolforge project was removed by @JJMC89. A package is missing on the generic infrastructure, affecting multiple tools. There is also no mention of this on https://wikitech.wikimedia.org/wiki/Help:Toolforge/Pywikibot#Clone_pywikibot_git_repo

JJMC89 removed a project: Toolforge.
JJMC89 updated the task description. (Show Details)

If you are not using the shared install, then you are required to install the dependencies in a virtual environment.

(Toolforge is not an appropriate tag. See its description.)

> If you are not using the shared install, then you are required to install the dependencies in a virtual environment.
>
> (Toolforge is not an appropriate tag. See its description.)

Really? You're being extremely unhelpful here.

  1. The virtual environment has always been a recommendation. It was never a requirement. It shouldn't be a hard requirement, so either the package should be installed in a central location or the hard dependency should be removed.
  2. We also have the "AttributeError: module 'pywikibot' has no attribute 'Site'"

You can request Toolforge software installs by creating a task with Toolforge (Software install/update); however, there is no requirement for Toolforge to have Pywikibot dependencies installed. mwparserfromhell is provided centrally in the shared install, which is the recommended way to use Pywikibot on Toolforge.

The AttributeError will not be raised when the dependencies are installed.

We (Toolforge admins) encourage people to use virtualenvs to install dependencies for their tools as those will get you much more up to date versions than what Debian ships and don't require us to install the dependencies globally on all grid nodes. You need to have Pywikibot's dependencies installed somehow if you plan to use it. I don't believe we've ever guaranteed to include dependencies needed by clones of (popular) frameworks as it's outside of our control when those frameworks add new/updated requirements (although it looks like Pywikibot has been working with what's installed on the grid by default in the past).

That being said, we do install mwparserfromhell on all grid nodes, but for Python 2 only. Installing the Python 3 version is likely possible as well, but the version on Debian Stretch is 0.4 (compared to the 0.5 that the error message claims to require). Upgrading the grid to a newer operating system release has been largely stalled as we've been working on creating Kubernetes-based replacements for the grid.

To summarize a bit:

  • Toolforge policy is to not install Python packages system-wide because it causes confusion and conflicts over versions. People should use virtualenvs or pip install --user instead.
    • Debian Stretch (what the bastions/grid run) contains mwparserfromhell 0.4.2, which wouldn't meet Pywikibot's requirement of >=0.5.0 anyways.
  • Pywikibot requires external dependencies and the shared Toolforge installation (maintained by Pywikibot developers, not Toolforge admins) installs them for you.
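The version mismatch in the summary above can be checked with a few lines of Python. This is only a minimal sketch: `parse_version` and `meets_minimum` are hypothetical helper names, they handle plain dotted numeric versions only, and they are not how Pywikibot itself performs the check.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '0.4.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split('.'))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    # Tuples compare element by element, so (0, 4, 2) < (0, 5, 0).
    return parse_version(installed) >= parse_version(minimum)


# Versions quoted in this thread, checked against the >=0.5.0 requirement.
print(meets_minimum('0.4.2', '0.5.0'))  # Debian Stretch package: False
print(meets_minimum('0.6.2', '0.5.0'))  # a newer release: True
```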

If you want to maintain your own Pywikibot installation, IMO using a virtualenv is the easiest. You could also pip install --user mwparserfromhell; I don't remember exactly what downsides that has over a virtualenv.
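To see which of the two options a given interpreter is actually using, something like the following may help. It is a rough sketch using only the standard library; `in_virtualenv` and `describe_install_target` are hypothetical names, and the returned strings are purely illustrative.

```python
import site
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv or virtualenv."""
    # venv keeps the system prefix in sys.base_prefix; older virtualenv
    # releases set sys.real_prefix instead.
    base = getattr(sys, 'base_prefix', sys.prefix)
    return hasattr(sys, 'real_prefix') or base != sys.prefix


def describe_install_target():
    """Describe where an installed dependency would be picked up from."""
    if in_virtualenv():
        return 'virtualenv at ' + sys.prefix
    # Outside a virtualenv, `pip install --user` targets the user site.
    return 'user site at ' + site.getusersitepackages()


print(describe_install_target())
```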

Otherwise, I'm not really sure what's actionable in this ticket besides asking for a revert of the mwparserfromhell/wikitextparser dependency.

  • The MediaWiki markup parser dependency was introduced because Pywikibot's regex parser failed for nested templates with depth 3, and the old compat parser was replaced years ago due to other bugs, see T106763.
  • mwparserfromhell >= 0.5.0 is required because of T71384 and because it solves issues with iterables.
  • The exception at exit time occurs because the Pywikibot framework isn't fully loaded when the dependency is checked, and the script exits in this early state. This is indeed ugly, but circular dependencies between framework parts, mainly involving the logging module, are still problematic in such cases (btw, circular import problems were the main reason to drop Python 3.5.0-3.5.2). Will be solved shortly, I guess.
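The exit-time failure described in the last point can be reproduced in miniature. The sketch below is a simplified stand-in, not Pywikibot code: `framework`, `exit_handlers`, `load_framework` and `run` are hypothetical names, and a plain list replaces the real `atexit` registry so the example can be run repeatedly.

```python
import types

# Hypothetical stand-in for a package that is only partly imported.
framework = types.ModuleType('framework')
exit_handlers = []  # stand-in for the atexit registry


def init_logging():
    # Like the failing line in the traceback, this needs an attribute
    # that is only defined late in the import.
    return framework.Site.__doc__ != 'TEST'


def flush():
    init_logging()
    return 'Closing network session.'


def load_framework(parser_available):
    exit_handlers.append(flush)  # registered early in the import
    if not parser_available:
        raise ImportError('missing MediaWiki markup parser')
    framework.Site = type('Site', (), {'__doc__': 'site factory'})


def run(parser_available):
    """Return the messages produced during startup and shutdown."""
    messages = []
    exit_handlers.clear()
    if hasattr(framework, 'Site'):
        del framework.Site
    try:
        load_framework(parser_available)
    except ImportError as error:
        messages.append(str(error))
    # Simulated interpreter shutdown: run every registered handler.
    for handler in exit_handlers:
        try:
            messages.append(handler())
        except AttributeError as error:
            messages.append('AttributeError: ' + str(error))
    return messages


print(run(parser_available=False))
print(run(parser_available=True))
```

With the parser missing, the handler registered before the dependency check still runs at shutdown and trips over the attribute that was never defined, which is the same shape as the traceback in the description.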

What can we do on Pywikibot's side?

  • We can check the impact of downgrading the mwpfh requirement to 0.4.2, but that version is 6 years old and I am not sure whether it works with newer Pythons or whether other bugs would come back. Unfortunately we have lost Travis CI for testing.
  • We can recover the old regex parser, but then a lot of bugs remain unsolved and the results behave differently.
  • We can have our own MW markup parser, but I have no time to recover it from compat and fix the known issues.

I am a bit confused about a preinstalled mwpfh noted above because textlib itself doesn’t care about its version [1] and should not raise that given exception. Is it really preinstalled? Or is it for Py2 only?

[1] https://gerrit.wikimedia.org/r/plugins/gitiles/pywikibot/core/+/13c2008f4eb6fe44450339372085f3faf22434a4/pywikibot/textlib.py

The "AttributeError: module 'pywikibot' has no attribute 'Site'" error is fixed with T298384. Is anything left to do here?

> The "AttributeError: module 'pywikibot' has no attribute 'Site'" error is fixed with T298384. Is anything left to do here?

Did you try a blank install on Toolforge without virtualenv? As long as that doesn't work, this problem isn't solved. A work-around exists, so you can lower the priority. To really fix this, two options exist:

  • Toolforge should do the (bit overdue) update of Debian so that all dependencies are installed by default
  • Pywikibot should strip out some of the dependencies so it runs on the current Toolforge

Hi @Multichill, thanks for your response.

> Did you try a blank install on Toolforge without virtualenv?

I have no Toolforge access, therefore I have no clue what packages are preinstalled there. It would be a great help if you could give me the output of pip freeze, which shows the preinstalled packages with their versions.

> Toolforge should do the (bit overdue) update of Debian so that all dependencies are installed by default

I suggest creating a separate task for it (if that's not already done).

> Pywikibot should strip out some of the dependencies so it runs on the current Toolforge

A wikitext parser became mandatory due to T106763 with Pywikibot 6.3 (rPWBC21f9f7d); the old behaviour was deprecated with 6.1 (rPWBC525dbc7), and a lot of bugs could be solved with that change.
I don't know about the installation status of the other mandatory packages. What about requests, which has been mandatory since Pywikibot 3.0, I guess? What about setuptools, which is shipped with Python but for which a minimum version is required for versioning?
Is using venv too complicated on Toolforge? An idea would be to include the needed wikitext parser as an external, like we had previously with BeautifulSoup in the compat branch. Would that be appropriate and help in this matter?

Xqt claimed this task.

As I found out, mwparserfromhell 0.6.2 is preinstalled on Toolforge now:

$ python3 pwb.py version
Pywikibot: [https] r-pywikibot-core (4cebe88, g15737, 2022/01/05, 12:09:13, master)
Release version: 7.0.0.dev0
setuptools version: 33.1.1
mwparserfromhell version: 0.6.2
wikitextparser version: n/a
requests version: 2.21.0
  cacerts: /etc/ssl/certs/ca-certificates.crt
    certificate test: ok
Python: 3.5.3 (default, Nov  4 2021, 15:29:10) 
[GCC 6.3.0 20170516]
Toolforge hostname: tools-sgebastion-07
PYWIKIBOT_DIR: Not set
PYWIKIBOT_DIR_PWB: ''
PYWIKIBOT_NO_USER_CONFIG: 2
Config base dir: /mnt/nfs/labstore-secondary-tools-home/xqt/pywikibot