textlib extract_templates_and_params depends on future on Python2.6
Closed, Declined · Public

Description

Python 2.6.6 is supported, see https://www.mediawiki.org/wiki/Manual:Pywikibot/Version_table , so why the hell are we using features that are not part of python 2.6 in core?!?!?
I'm on a Red Hat install where Python 2.6.6 is still the default Python, and it is next to impossible to upgrade without disturbing other things on the system.

Event Timeline

Multichill raised the priority of this task from to High.
Multichill updated the task description.
Multichill subscribed.
Restricted Application added subscribers: Aklapper, Unknown Object (MLST). Feb 20 2015, 3:39 PM
jayvdb set Security to None.

What is the problem you saw? We run the full test suite against Python 2.6.9 on Travis.
I am now running Win32 and Win64 Python 2.6.6 tests on Appveyor.
https://ci.appveyor.com/project/jayvdb/pywikibot-core/build/1.0.py2.6.6.66

Can't reproduce. I've installed 2.6.6 locally and I didn't get any textlib-related errors. I got three other failures: a Python 2.6-specific one, one related to a broken wiki, and a script test that didn't work because of an insecure Python egg directory:

======================================================================
ERROR: test_valid_entities (tests.page_tests.HtmlEntity)
Test valid entities.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/xzise/Programms/pywikibot/core/tests/page_tests.py", line 871, in test_valid_entities
    self.assertEqual(pywikibot.page.html2unicode('𐀀'), u'\U00010000')
  File "/home/xzise/Programms/pywikibot/core/pywikibot/page.py", line 4881, in html2unicode
    return entityR.sub(handle_entity, text)
  File "/home/xzise/Programms/pywikibot/core/pywikibot/page.py", line 4875, in handle_entity
    return eval("'\\U{:08x}'".format(unicodeCodepoint))
ValueError: zero length field name in format

======================================================================
FAIL: testSearch (tests.site_tests.SiteUserTestCase)
Test the site.search() method.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/xzise/Programms/pywikibot/core/tests/site_tests.py", line 1000, in testSearch
    self.assertTrue(all(hit.namespace() == 0 for hit in se))
AssertionError: False is not true

======================================================================
FAIL: test__login_help (tests.script_tests.TestScript)
Test running login -help
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/xzise/Programms/pywikibot/core/tests/script_tests.py", line 278, in testScript
    self.assertIsNone(stderr_other)
AssertionError: [u'WARNING: /home/xzise/.pyenv/versions/2.6.6/lib/python2.6/site-packages/setuptools-15.2-py2.6.egg/pkg_resources/__init__.py:1250: UserWarning: /home/xzise/.python-eggs is writable by group/others and vulnerable to attack when used with get_resource_filename. Consider a more secure location (set with .set_extraction_path or the PYTHON_EGG_CACHE environment variable).', u'', u''] is not None
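
For reference, the first error above (ValueError: zero length field name in format) comes from the auto-numbered '{}' field in str.format, which Python 2.6 does not accept; explicit field indexes work on 2.6 and later. A minimal sketch of a 2.6-safe format call follows. The helper name codepoint_escape is made up for illustration and is not the code in page.py:

```python
def codepoint_escape(codepoint):
    """Return the backslash-U escape text for a code point.

    Hypothetical helper, only meant to show the format call.
    """
    # '{0:08x}' names the positional argument explicitly. The bare
    # '{:08x}' form raises ValueError on Python 2.6.
    return "\\U{0:08x}".format(codepoint)


print(codepoint_escape(0x10000))
```

The same string could also be built with the % operator ("\\U%08x" % codepoint), which behaves the same way on every Python version involved here.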

And is the tone necessary in the bug report? Considering that the information is very vague (textlib and 2.6.6, but what in textlib?), I think it's inappropriate. If we broke compatibility at some point, then it happened and we need to fix that (or drop support for it), but starting the issue that way does not help your cause.

The appveyor build is a bust so you need to give more information like a stack trace or similar.

Hopefully fixed that build setup problem; new build underway

https://ci.appveyor.com/project/jayvdb/pywikibot-core/build/1.0.appveyor-more-builds.67

We managed to reproduce the bug. It's caused by the check of whether a bot is allowed to edit. I talked with Jay about this. A nice solution would be to check if the template actually exists before trying to parse a page with textlib.

So… is that now a Python 2.6.6 specific problem? And where and how does it fail now? When it tries to parse the text and extract the templates? Because that is the only usage of textlib via botMayEdit, and that doesn't need to know which templates actually exist on that wiki. So it can't be fixed by checking if the template actually exists; that would only hide the problem.

I tested Python 2.6.6 on a wiki without the Bots and Nobots templates, and botMayEdit worked as long as the future package is installed:

Python 2.6.6 (r266:84292, May  3 2015, 02:23:05) 
[GCC 4.8.3 20140911 (Red Hat 4.8.3-7)] on linux3
Type "help", "copyright", "credits" or "license" for more information.
>>> import pywikibot
>>> s = pywikibot.Site()
>>> pywikibot.Page(s, 'Template:Bots').exists()
False
>>> pywikibot.Page(s, 'Template:Nobots').exists()
False
>>> pywikibot.Page(s, 'Main Page').botMayEdit()
True

So is this bug report actually about the fact that you need the future package in Python 2.6 (or that pywikibot needs OrderedDict)? Or did I misunderstand that update?

Yes. On that box, @Multichill can't install Python packages into the global system, and it doesn't have virtualenv installed. So reducing the code paths that utilise OrderedDict helps keep that box running.

In this case the workaround is to set config.ignore_bot_templates to True.
A simple way to bed that workaround into the system is by adding post-config logic to config: 'if an OrderedDict isn't available, turn on ignore_bot_templates with a user warning'. We could take that further and add 'if an OrderedDict or Counter isn't available, make DataSite and the new Page methods depending on OrderedDict/Counter raise NotImplementedError'.
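
A rough sketch of what that post-config logic could look like. The Config class and the apply_post_config function are stand-ins made up for illustration; only ignore_bot_templates is an actual setting name from the discussion above:

```python
import warnings

try:
    from collections import OrderedDict  # Python 2.7+ and 3.x
except ImportError:
    OrderedDict = None  # plain Python 2.6 without a backport


class Config(object):
    """Hypothetical stand-in for the config module."""

    ignore_bot_templates = False


def apply_post_config(config, ordered_dict=OrderedDict):
    """Enable ignore_bot_templates when no OrderedDict is available.

    The ordered_dict parameter exists only so the fallback path can be
    exercised on a Python version that does ship OrderedDict.
    """
    if ordered_dict is None and not config.ignore_bot_templates:
        warnings.warn(
            'OrderedDict is not available; enabling ignore_bot_templates',
            UserWarning)
        config.ignore_bot_templates = True
    return config
```

With this in place a user on such a box gets one warning at startup instead of an import failure the first time botMayEdit runs.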

To reduce how much needs to be disabled, my initial attempt at adding an OrderedDict dependency could also be used, as it falls back to using an (unordered) dict, which is perfectly acceptable for most uses. https://gerrit.wikimedia.org/r/#/c/147665/1/pywikibot/__init__.py,cm . Counter is very easy to reimplement, esp. if some of its methods raise NotImplementedError.
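
A minimal sketch of that kind of fallback, not the code from the Gerrit change. FallbackCounter is a made-up name, and it assumes callers only count items and look counts up:

```python
class FallbackCounter(dict):
    """Tiny stand-in for collections.Counter (hypothetical)."""

    def __init__(self, iterable=()):
        dict.__init__(self)
        for item in iterable:
            self[item] = self.get(item, 0) + 1

    def __missing__(self, key):
        # Like the real Counter, unknown keys count as zero.
        return 0

    def most_common(self, n=None):
        raise NotImplementedError('most_common needs the real Counter')


try:
    from collections import OrderedDict
except ImportError:
    # Assumption: the caller can live without insertion order.
    OrderedDict = dict

try:
    from collections import Counter
except ImportError:
    Counter = FallbackCounter
```

Code that really depends on ordering would still need the real OrderedDict, so this only shrinks the set of features that have to be disabled on 2.6.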

Nice solution would be to check if the template actually exists before trying to parse a page with textlib

This approach would work for @Multichill and some others, as he is using py2.6 on a private wiki which doesn't have this template. The current code is very Wikimedia-centric.
A patch to reduce the problem is https://gerrit.wikimedia.org/r/#/c/179177/ . That is -1'd because 7e3772ca added a warning that some templates may not appear in https://www.mediawiki.org/wiki/API:Templates , so falling back to text parsing is 'mandatory'. I can't see any mention of this problem in any relevant documentation.

This approach could be achieved by botMayEdit asking for a specific template list very early, such as:

self.templatesWithParams(only_template_names=['bots', 'nobots'])

Then templatesWithParams could pass the list to extract_templates_and_params , and the regex and mwpfh implementations could optimise their algorithm accordingly, and return less data, and often no data at all.
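
To illustrate the idea, here is a simplified, self-contained sketch. The function extract_templates and its regex are hypothetical and much cruder than the real extract_templates_and_params (no nested templates, no named-parameter parsing); it only shows how a name filter allows an early exit:

```python
import re

# Simplified pattern: matches non-nested templates such as {{bots|allow=Foo}}.
TEMPLATE_RE = re.compile(r'\{\{\s*([^{}|]+?)\s*(?:\|([^{}]*))?\}\}')


def extract_templates(text, only_template_names=None):
    """Return (name, params) tuples, optionally limited to some names."""
    wanted = None
    if only_template_names is not None:
        wanted = set(name.lower() for name in only_template_names)
        lowered = text.lower()
        # Cheap pre-check: skip parsing when no wanted name occurs at all.
        if not any(name in lowered for name in wanted):
            return []
    result = []
    for match in TEMPLATE_RE.finditer(text):
        name = match.group(1)
        if wanted is not None and name.lower() not in wanted:
            continue
        raw = match.group(2)
        params = [p.strip() for p in raw.split('|')] if raw else []
        result.append((name, params))
    return result
```

On a wiki whose pages never mention bots or nobots, the pre-check returns an empty list before any template parsing happens, which is the "often no data at all" case.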

jayvdb renamed this task from Textlib.py broken in Python2.6.6 to textlib extract_templates_and_params depends on future on Python2.6. Jun 6 2015, 5:41 AM

Successful 2.6.6 Win32 builds now happening at https://ci.appveyor.com/project/jayvdb/pywikibot-core

Xqt subscribed.

py2.6 will be dropped with T154771