LRM in a section heading's wikitext causes SectionError when the LRM is not in the section part of the Page object's title
Closed, Resolved | Public | Bug Report

Description

Steps to replicate the issue (include links if applicable):

python3 pwb.py shell
>>> site = pywikibot.Site()
>>> page = pywikibot.Page(site, 'Wikipedia:Categories for discussion/Log/2025 November 8#Japanese superhero films by decade')
>>> page.text

What happens?:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/jjmc89/wrk/pwb/pywikibot/page/_basepage.py", line 581, in text
    return self.get(get_redirect=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jjmc89/wrk/pwb/pywikibot/page/_basepage.py", line 396, in get
    raise SectionError(f'{page_section!r} is not a valid section '
pywikibot.exceptions.SectionError: 'Japanese superhero films by decade' is not a valid section of Wikipedia:Categories for discussion/Log/2025 November 8

What should have happened instead?:

No exception should be raised merely because an LRM (left-to-right mark, U+200E) is present in the heading's wikitext but absent from page.section().

[[Wikipedia:Categories for discussion/Log/2025 November 8#Japanese superhero films by decade]] is a working link to the section of the page whose heading in the wikitext is ==== Japanese superhero films\u200e by decade ==== (note the LRM after "films").
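
To illustrate the mismatch outside Pywikibot, here is a minimal standalone sketch. The two strings are taken from this report; the helper name invisible_chars is hypothetical and not part of Pywikibot. It only shows that the heading and the requested section differ by a single Unicode format (Cf) character, which is presumably why a plain string comparison fails.

```python
import unicodedata

# Heading text as it appears in the wikitext (with LRM) and the section
# name as requested in the page title (without LRM), both from this report.
heading_in_wikitext = 'Japanese superhero films\u200e by decade'
requested_section = 'Japanese superhero films by decade'


def invisible_chars(text):
    """Return (index, code point, name) for each Unicode format (Cf) char.

    Hypothetical helper for diagnosis only; not a Pywikibot function.
    """
    return [
        (i, f'U+{ord(ch):04X}', unicodedata.name(ch, 'UNKNOWN'))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == 'Cf'
    ]


# The strings look identical when printed but do not compare equal.
print(heading_in_wikitext == requested_section)
print(invisible_chars(heading_in_wikitext))
```

Running the same helper on a suspect heading from any page should show whether an invisible character is the cause.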

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):
Pywikibot: [ssh] pywikibot-core (af27117, g20065, 2025/11/28, 21:37:46, master)
Release version: 11.0.0.dev2

Other information (browser name/version, screenshots, etc.):

Report based on the content of the page as of https://en.wikipedia.org/w/index.php?title=Wikipedia:Categories_for_discussion/Log/2025_November_8&oldid=1324681231, which was subsequently edited to remove the LRM.

Event Timeline

Here's the traceback from where I originally discovered this issue.

2025-11-29 20:47:15         logging.py,  355 in          exception: ERROR    'Japanese superhero films by decade' is not a valid section of Wikipedia:Categories for discussion/Log/2025 November 8
Traceback (most recent call last):
  File "/data/project/jjmc89-bot/repos/jjmc89-bot-scripts/enwiki/cfdw.py", line 338, in parse
    self._parse_section(str(section))
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/data/project/jjmc89-bot/repos/jjmc89-bot-scripts/enwiki/cfdw.py", line 375, in _parse_section
    _, action = cfd.get_result_action(
                ~~~~~~~~~~~~~~~~~~~~~^
        instruction["bot_options"]["old_cat"]
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/data/project/jjmc89-bot/repos/jjmc89-bot-scripts/enwiki/cfdw.py", line 284, in get_result_action
    text = removeDisabledParts(self.text, tags=EXCEPTIONS, site=self.site)
                               ^^^^^^^^^
  File "/data/project/jjmc89-bot/repos/.venvs/jjmc89-bot-scripts/lib/python3.13/site-packages/pywikibot/page/_basepage.py", line 580, in text
    return self.get(get_redirect=True)
           ~~~~~~~~^^^^^^^^^^^^^^^^^^^
  File "/data/project/jjmc89-bot/repos/.venvs/jjmc89-bot-scripts/lib/python3.13/site-packages/pywikibot/page/_basepage.py", line 395, in get
    raise SectionError(f'{page_section!r} is not a valid section '
                       f'of {self.title(with_section=False)}')
pywikibot.exceptions.SectionError: 'Japanese superhero films by decade' is not a valid section of Wikipedia:Categories for discussion/Log/2025 November 8

Change #1225150 had a related patch set uploaded (by Xqt; author: Xqt):

[pywikibot/core@master] FIX: Remove invisible chars from Section.heading

https://gerrit.wikimedia.org/r/1225150

Change #1225150 merged by jenkins-bot:

[pywikibot/core@master] FIX: Remove invisible chars from Section.heading

https://gerrit.wikimedia.org/r/1225150
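
The patch title suggests that invisible characters are stripped from the heading before it is compared. The following is a rough sketch of that idea only, not the merged change: strip_invisible and find_section are hypothetical names, and treating every Unicode format (Cf) character as invisible is an assumption made for this example.

```python
import unicodedata


def strip_invisible(text):
    """Drop Unicode format (Cf) characters such as LRM (U+200E).

    Assumption for this sketch: "invisible" means general category Cf.
    """
    return ''.join(ch for ch in text if unicodedata.category(ch) != 'Cf')


def find_section(headings, wanted):
    """Return the first heading that matches `wanted` once invisible
    characters are stripped from both sides, or None if nothing matches.
    """
    target = strip_invisible(wanted).strip()
    for heading in headings:
        if strip_invisible(heading).strip() == target:
            return heading
    return None


# Heading strings modelled on the page from this report; the first entry
# is a made-up neighbouring heading.
headings = [
    'Some other discussion',
    'Japanese superhero films\u200e by decade',
]
match = find_section(headings, 'Japanese superhero films by decade')
print(repr(match))
```

One caveat with this approach: category Cf also covers characters such as the zero-width joiner and non-joiner, which carry meaning in some scripts, so a real fix may want a narrower set of characters.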

Xqt claimed this task.