
The order of parameters is lost when using extract_templates_and_params() and extract_templates_and_params_regex()
Closed, Resolved, Public

Description

In compat, templatesWithParams from class Page used to
provide a pair containing the template name and a list of parameters,
with the full "key=value" string. Nowadays, we're getting a dictionary
instead of that list. Normally there is nothing wrong with that,
except that in Python 2 the dictionary is unordered, which means that:

  • the order of the parameters is forever lost - this can be easily solved using OrderedDict instead.
  • the original text cannot be reconstructed (because of the above and the missing whitespace information) - this means there is no easy way to identify and/or replace a particular instance of the template in a page with many identical templates.
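To illustrate the first point, here is a minimal sketch of how an OrderedDict keeps parameters in the order they appear in the template. The helper `parse_params` and the sample parameter string are hypothetical and only for illustration; this is not the pywikibot implementation of extract_templates_and_params().

```python
from collections import OrderedDict


def parse_params(raw):
    """Split an 'a=1|b=2' parameter string into an OrderedDict.

    Hypothetical helper for illustration only; it ignores nested
    templates, links and other wikitext subtleties.
    """
    params = OrderedDict()
    positional = 0
    for part in raw.split("|"):
        if "=" in part:
            key, value = part.split("=", 1)
        else:
            # Unnamed parameters are numbered 1, 2, 3, ... as in wikitext.
            positional += 1
            key, value = str(positional), part
        params[key.strip()] = value.strip()
    return params


params = parse_params("name=Bucharest|country=Romania|population=1883425")
# Keys come back in insertion order: name, country, population.
print(list(params.keys()))
```

Note that this only addresses ordering. Because the values are stripped, the original whitespace is still lost, so the second point above remains open.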


Version: core-(2.0)
Severity: normal

Details

Reference
bz55882

Event Timeline

bzimport raised the priority of this task from to Normal. Nov 22 2014, 2:15 AM
bzimport added a project: Pywikibot-textlib.py.
bzimport set Reference to bz55882.
bzimport added a subscriber: Unknown Object (????).
Strainu created this task. Oct 18 2013, 4:38 PM

(In reply to Strainu from comment #0)

  • the original text cannot be reconstructed (because of the above and the missing whitespace information) - this means there is no easy way to identify and/or replace a particular instance of the template in a page with many identical templates.

If you're looking for a way to preserve spacing rules, try mwparserfromhell:
http://mwparserfromhell.readthedocs.org

Change 126147 had a related patch set uploaded by Ricordisamoa:
use collections.OrderedDict instead of built-in dict for params

https://gerrit.wikimedia.org/r/126147

Change 126147 merged by jenkins-bot:
use collections.OrderedDict instead of built-in dict for params

https://gerrit.wikimedia.org/r/126147

Xqt added a comment. Apr 20 2014, 4:19 PM

Reverted in https://gerrit.wikimedia.org/r/#/c/127467/ because OrderedDict is not available in Python < 2.7

What about just:

    try:
        from collections import OrderedDict
    except ImportError:
        OrderedDict = dict

(In reply to Kunal Mehta (Legoktm) from comment #6)

Preserving the order when OrderedDict is available while discarding it otherwise would be an inconsistent behavior.

(In reply to Ricordisamoa from comment #7)

Preserving the order when OrderedDict is available while discarding it
otherwise would be an inconsistent behavior.

I suppose. I think bundling the ordereddict pypi package is a good idea then.
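A minimal sketch of what that could look like, assuming the `ordereddict` backport from PyPI is installed or bundled for Python < 2.7 (how it would actually be bundled with pywikibot is not specified here):

```python
try:
    # Available in the standard library from Python 2.7 onwards.
    from collections import OrderedDict
except ImportError:
    # Assumption: the 'ordereddict' backport from PyPI is importable
    # on older interpreters, so ordering is preserved everywhere.
    from ordereddict import OrderedDict

params = OrderedDict()
params["2"] = "second"
params["1"] = "first"
# Iteration follows insertion order, not key order.
print(list(params))
```

Unlike falling back to a plain dict, this keeps the behavior consistent across Python versions.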

The Python layer for ItemPage.claims is greatly simplified if it is an ordered set. The qualifiers also have an order that needs to be preserved by default and be manipulable.

Change 140212 had a related patch set uploaded by Merlijn van Deen:
use collections.OrderedDict instead of built-in dict for params

https://gerrit.wikimedia.org/r/140212

jayvdb added a comment. Aug 4 2014, 4:02 PM

Are there some complicated structures that we should add to the unit tests to ensure it correctly organises them?

Change 140212 merged by jenkins-bot:
use OrderedDict instead of builtin dict for params

https://gerrit.wikimedia.org/r/140212

Ricordisamoa closed this task as Resolved. Dec 12 2014, 8:14 AM
Ricordisamoa claimed this task.

Yay!

This task has been assigned to me since the Bugzilla era. It is obviously a bug in Phabricator.

Note also that extract_templates_and_params_regex() has bugs that are not related to the OrderedDict thing.