It seems there is an issue with pywikibot.textlib.extract_templates_and_params sometimes. For instance, pywikibot.textlib.extract_templates_and_params('{{Temp <\u0021-- -->}}') returns Temp <!-- --> now (vs. Temp before).
Description
Details
Status | Subtype | Assigned | Task | ||
---|---|---|---|---|---|
Resolved | matej_suchanek | T113892 templateWithParams() and HTML comments | |||
Resolved | matej_suchanek | T127691 harvest_template.py crashes on comment |
Event Timeline
Please note that this broke templatesWithParams as it doesn't detect the Temp template somewhere.
>>> pywikibot.textlib.extract_templates_and_params('{{Temp <!-- -->}}', remove_disabled_parts=True) [(u'Temp', OrderedDict())]
To me it looks like the code is doing the opposite of what is stated in the docstring:
@param remove_disabled_parts: Remove disabled wikitext such as comments and pre. If None (default), this is enabled when mwparserfromhell is not available or is disabled in the config, and disabled if mwparserfromhell is present and enabled in the config. ... if use_mwparserfromhell: if remove_disabled_parts is None: remove_disabled_parts = False ... if use_mwparserfromhell: return extract_templates_and_params_mwpfh(text) else: return extract_templates_and_params_regex(text, False) --> shouldn't this be True?
Cannot say if the docstring is correct or the code.
Also it is not clear why in the docstring a opposite behaviour is wanted with/without mpfh.
@jayvdb?
I got this on the master branch:
if remove_disabled_parts: text = removeDisabledParts(text)
So in theory if this was called before reliably the False parameter later is fine.
Interestingly I have mwparserfromhell installed and it use it and it does return the comment inside (so it would need to remove disabled parts too).
@param remove_disabled_parts: If None (default), this is enabled when mwparserfromhell is not available or is disabled in the config
By default remove_disabled_parts=None, so text = removeDisabledParts(text) is not executed.
My point was that maybe the False is not wrong as it should've been called earlier in the code block I quoted. If you were to change the False into True you wouldn't be able to switch it off. So you either need there a more complex logic… or actually look at the code before and fix it. To me it looks like it should be the inverse of use_mwparserfromhell if it's None.
Anyway it was added in 13cd73de but I can't find any comment about that though. And the reason it's disabled by default when using mwpfh is that it wasn't doing that before.
At least for non-mwpfh users this issue could be solved easily. For mwpfh users this is still a problem as it'll break backwards compatibility when we remove disabled parts by default too in templatesWithParams.
We might only need to remove them for the template name itself and the indexed parameters.
Change 241566 had a related patch set uploaded (by XZise):
[FIX] Remove disabled parts when extracting templates
Change 241565 had a related patch set uploaded (by XZise):
[FIX] Removing disabled parts without mwpfh
Change 227073 had a related patch set uploaded (by John Vandenberg):
templatesWithParams: cache and standardise params
Change 227073 merged by jenkins-bot:
[pywikibot/core@master] templatesWithParams: cache and standardise params
Change 241566 abandoned by Xqt:
[pywikibot/core@master] [FIX] Remove disabled parts when extracting templates
Reason:
The underlying task is fixed