
Page.get(force=True) and Page.save() do not refresh page data
Open, High | Public | BUG REPORT

Description

I am filing two bugs here because they are closely connected. The whole process of handling page data should be revised. At least the following methods are involved (there may be others):

get()
save()
exists()
has_content()

delete() seems to work well.
Steps to replicate the issue:

>>> page = pywikibot.Page(site, 'Teszt')  # It does not exist at this point
>>> # Now I create [[Teszt]] manually
...
>>> page.get()
'Teszt'
>>> page.text
'Teszt'
>>> # Now I delete [[Teszt]] manually
...
>>> page.get()
'Teszt'
>>> page.text
'Teszt'
>>> page.exists()
True
>>> page.get(force=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\Pywikibot\pywikibot\page\_page.py", line 397, in get
    self._getInternals()
  File "c:\Pywikibot\pywikibot\page\_page.py", line 436, in _getInternals
    self.site.loadrevisions(self, content=True)
  File "c:\Pywikibot\pywikibot\site\_generators.py", line 772, in loadrevisions
    raise NoPageError(page)
pywikibot.exceptions.NoPageError: Page [[hu:Teszt]] doesn't exist.
>>> page.exists()
True
>>>

What happens?:
After page.get(force=True), the page data is not fully refreshed: page.exists() still reports that the page exists.

What should have happened instead?:
page.get(force=True) should refresh self.pageid so that page.exists() is not misled.
I don't know what further measures are necessary for other methods.

The documentation of page.exists() should clearly describe what happens and how to force a recheck.
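The stale state reported above can be reproduced without a live wiki. The following is only a toy model, not Pywikibot's implementation: ToyWiki, ToyPage and their attributes are hypothetical stand-ins, and the caching logic is an assumption made to mirror the observed behaviour. It also shows that a freshly created page object sees the deletion.

```python
class NoPageError(Exception):
    """Toy stand-in for pywikibot.exceptions.NoPageError."""


class ToyWiki:
    """Hypothetical in-memory wiki mapping title -> (pageid, text)."""

    def __init__(self):
        self.pages = {}
        self._next_id = 1

    def create(self, title, text):
        self.pages[title] = (self._next_id, text)
        self._next_id += 1

    def delete(self, title):
        del self.pages[title]


class ToyPage:
    """Hypothetical page object that caches whatever it loaded last."""

    def __init__(self, wiki, title):
        self.wiki = wiki
        self.title = title

    def _load(self):
        if self.title not in self.wiki.pages:
            # The cached _pageid (if any) is left untouched here;
            # this is the assumed cause of the stale exists() result.
            raise NoPageError(self.title)
        self._pageid, self._text = self.wiki.pages[self.title]

    def get(self, force=False):
        if force or not hasattr(self, '_text'):
            self._load()
        return self._text

    def exists(self):
        if not hasattr(self, '_pageid'):
            try:
                self._load()
            except NoPageError:
                self._pageid = 0
        return self._pageid > 0


wiki = ToyWiki()
page = ToyPage(wiki, 'Teszt')      # does not exist yet
wiki.create('Teszt', 'Teszt')      # "manual" creation
page.get()
wiki.delete('Teszt')               # "manual" deletion
stale_text = page.get()            # served from the cache

try:
    page.get(force=True)
    forced_raised = False
except NoPageError:
    forced_raised = True

stale_exists = page.exists()                     # cached pageid is still used
fresh_exists = ToyPage(wiki, 'Teszt').exists()   # new object reloads
print(stale_text, forced_raised, stale_exists, fresh_exists)
```

In this model the only way to get a correct answer is to build a new page object.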

Steps to replicate the issue:

page = pywikibot.Page(site, 'Teszt')
print('Exists:', page.exists())
print('Id:', page.pageid)
page.text = 'Teszt'
page.save('Botteszt')
print('Exists:', page.exists())
print('Id:', page.pageid)

What happens?:

Exists: False
Id: 0
Page [[Teszt]] saved
Exists: False
Id: 0

What should have happened instead?:
Page.revid should have been updated upon save().
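The expected behaviour for the second bug can be sketched in the same spirit. This is a toy model under stated assumptions, not Pywikibot code: ToySite, ToyPage, write() and pageinfo() are hypothetical names. The point is only that save() reloads the page info after writing, so exists() and pageid reflect the newly created page.

```python
import itertools


class ToySite:
    """Hypothetical in-memory site mapping title -> (pageid, text)."""

    def __init__(self):
        self.pages = {}
        self._ids = itertools.count(1)

    def write(self, title, text):
        # Reuse the existing page id, or allocate a new one on creation.
        pageid = self.pages.get(title, (None, None))[0] or next(self._ids)
        self.pages[title] = (pageid, text)

    def pageinfo(self, title):
        # Returns 0 for a missing page, as in the report's output.
        return self.pages.get(title, (0, None))[0]


class ToyPage:
    """Hypothetical page object with a cached page id."""

    def __init__(self, site, title):
        self.site = site
        self.title = title
        self.text = ''
        self._pageid = None  # None means "not loaded yet"

    @property
    def pageid(self):
        if self._pageid is None:
            self._pageid = self.site.pageinfo(self.title)
        return self._pageid

    def exists(self):
        return self.pageid > 0

    def save(self):
        self.site.write(self.title, self.text)
        # Refresh the cached page info after a successful write.
        self._pageid = self.site.pageinfo(self.title)


site = ToySite()
page = ToyPage(site, 'Teszt')
before = (page.exists(), page.pageid)
page.text = 'Teszt'
page.save()
after = (page.exists(), page.pageid)
print(before, after)
```

Without the refresh in save(), the cached id 0 from the first exists() call would survive the write, which matches the output reported above.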

Software version (skip for WMF-hosted wikis like Wikipedia): 8.0.0

Event Timeline

binbot updated the task description.

Change 894208 had a related patch set uploaded (by Xqt; author: Xqt):

[pywikibot/core@master] [bugfix] load page info when creating a page if not updated previously

https://gerrit.wikimedia.org/r/894208

Xqt triaged this task as High priority. Mar 5 2023, 4:25 PM
Xqt moved this task from Backlog to Needs Review on the Pywikibot board.

Change 894208 merged by jenkins-bot:

[pywikibot/core@master] [bugfix] load page info when creating a page if not updated previously

https://gerrit.wikimedia.org/r/894208

binbot reopened this task as Open. Mar 5 2023, 8:43 PM

Sorry, but the first problem with deletion still exists. page.exists() only notices the deletion if I recreate the Page object.

I am mentally blocked. Could you please explain a bit or give an example of that issue?

When a page is deleted by another process (e.g. a human) after the page has been fetched for the first time, the deletion is not noticed by page.exists() or any other method.
I think that when I use page.get(force=True) and it raises NoPageError, page.pageid should be set to 0 to notify other methods about the change.

In the first "Steps to replicate the issue" block at the top of this page, the last response should be False, not True.
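The suggestion in this comment (reset the page id to 0 when a forced get() hits NoPageError) might look roughly like the sketch below. It is a toy model only: ToyPage and the dict-based backend are hypothetical, and this is not the patch that was merged.

```python
class NoPageError(Exception):
    """Toy stand-in for pywikibot.exceptions.NoPageError."""


class ToyPage:
    """Hypothetical page object; backend is a dict of title -> (pageid, text)."""

    def __init__(self, backend, title):
        self.backend = backend
        self.title = title
        self._pageid = None  # None means "not loaded yet"
        self._text = None

    def get(self, force=False):
        if force or self._text is None:
            try:
                self._pageid, self._text = self.backend[self.title]
            except KeyError:
                # Proposed behaviour: drop the stale cache before raising,
                # so that exists() sees the deletion afterwards.
                self._pageid = 0
                self._text = None
                raise NoPageError(self.title) from None
        return self._text

    def exists(self):
        if self._pageid is None:
            self._pageid = self.backend.get(self.title, (0, None))[0]
        return self._pageid > 0


backend = {'Teszt': (1, 'Teszt')}
page = ToyPage(backend, 'Teszt')
page.get()
exists_before = page.exists()   # page is present
del backend['Teszt']            # deleted by "another process"
exists_stale = page.exists()    # still answered from the cache

try:
    page.get(force=True)
except NoPageError:
    pass

exists_after = page.exists()    # cache was reset by the failed forced get()
print(exists_before, exists_stale, exists_after)
```

With this change the last exists() call returns False, which is the result requested for the first reproduction block.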