
PagesTagParser doesn't display as expected when attributes have two or more words
Open, Needs Triage, Public, Feature

Description

Feature summary:
I would like to easily create multiple <pages /> tags using pywikibot. However, when I create a new instance of PagesTagParser and assign its attributes, the output is malformed whenever a value contains two or more words, because the parser doesn't add quotes automatically as one might expect:

tag = PagesTagParser()  
tag.index = 'Sample index with more than two words.pdf'
tag.ffrom = 5
tag.to = "6"
tag.fromsection = '"chapter XVI"'
print(tag)

<pages index=Sample index with more than two words.pdf from=5 to=6 fromsection="chapter XVI" />

This <pages> tag won't render correctly, as there is no "Sample" index page.

I propose the following change to the definition of TagAttr in proofreadpage.py (line 137 onwards):

if ' ' in str(value) and '"' not in value and "'" not in value:
    self._orig_value = f'"{value}"'
else:
    self._orig_value = value

This might not match pywikibot's coding style, but it does the job, and it doesn't make a mess when you add the quotes explicitly.

tag = PagesTagParser()  
tag.index = 'Sample index with more than two words.pdf'
tag.ffrom = 5
tag.to = "6"
tag.fromsection = '"chapter XVI"'
print(tag)

<pages index="Sample index with more than two words.pdf" from=5 to=6 fromsection="chapter XVI" />
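The quoting rule proposed above can be tried in isolation. This is only a minimal sketch: quote_attr_value is a hypothetical helper name, not part of pywikibot, and unlike the proposed setter it always returns a string.

```python
def quote_attr_value(value):
    """Return value as text, double-quoted if it contains a space.

    Hypothetical stand-alone helper; it mirrors the condition in the
    proposal but is not the actual TagAttr code.
    """
    text = str(value)
    # Values that already contain a quote character are left untouched,
    # so explicitly quoted input is not quoted a second time.
    if ' ' in text and '"' not in text and "'" not in text:
        return f'"{text}"'
    return text


print(quote_attr_value('Sample index with more than two words.pdf'))
print(quote_attr_value(5))
print(quote_attr_value('"chapter XVI"'))
```

One limitation of this rule: a value with a space and an embedded apostrophe, such as "O'Brien papers.pdf", is treated as already quoted and stays unquoted.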

Event Timeline

@Ninovolador: Could you please describe this issue and give a few examples?

@Xqt I thought I already gave an example, so I will copy the one from the description.

Suppose I want to create a bunch of <pages> tags. Maybe I want to create all the subpages of a very complex work with many chapters, so I want to do it via a script and not manually. To make sure I comply with the correct format of a <pages> tag, I want to use PagesTagParser. So I create an instance of PagesTagParser and set its attributes programmatically. The snippet of code below would normally go inside a loop, and the values would be other variables or come from a list or a dictionary:

tag = PagesTagParser()  
tag.index = 'Sample index with more than two words.pdf'
tag.ffrom = 5
tag.to = "6"
tag.fromsection = '"chapter XVI"'
print(tag)

This snippet of code gives the following output:

<pages index=Sample index with more than two words.pdf from=5 to=6 fromsection="chapter XVI" />

This is not a correctly formatted <pages> tag. It renders as a big, bold, red error message saying that there is no index called "Sample". This is due to the lack of quotes ("" or '') around the index name. Note that the fromsection attribute IS correctly formatted, because I included the quotes at the moment of assignment:

tag.fromsection = '"chapter XVI"'
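To illustrate why the unquoted tag ends up pointing at "Sample", here is a rough sketch of HTML-style attribute parsing. The ATTR_RE pattern and the parse_attrs helper are assumptions made for illustration only; they approximate the behaviour described above and are not the parser that MediaWiki or pywikibot actually use.

```python
import re

# Simplified pattern: name=value, where the value is double-quoted,
# single-quoted, or an unquoted run that stops at the first space.
ATTR_RE = re.compile(r'''(\w+)=("[^"]*"|'[^']*'|[^\s"'>]+)''')


def parse_attrs(tag):
    """Return a dict of attribute names to values, with quotes stripped."""
    return {name: value.strip('"\'') for name, value in ATTR_RE.findall(tag)}


unquoted = ('<pages index=Sample index with more than two words.pdf '
            'from=5 to=6 fromsection="chapter XVI" />')
quoted = ('<pages index="Sample index with more than two words.pdf" '
          'from=5 to=6 fromsection="chapter XVI" />')

print(parse_attrs(unquoted)['index'])  # value is cut at the first space
print(parse_attrs(quoted)['index'])    # full file name is preserved
```

Under this simplified model, the unquoted index value ends at the first space, while the quoted fromsection value keeps both words.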

It would be nice to not have to add the quotes manually.

Change #1209061 had a related patch set uploaded (by Dumbledore; author: Dumbledore):

[pywikibot/core@master] Fix PagesTagParser: Support attributes with multiple words

https://gerrit.wikimedia.org/r/1209061

Hi @Ninovolador

I have submitted a patch in Gerrit that addresses this task.
Feedback and review would be appreciated. Thank you!