Page MenuHomePhabricator

Site() can request info unnecessarily
Closed, ResolvedPublic

Description

Since {891a720}, initializing APISite will attempt to login using cookies; therefore, any Site() can cause API requests for info.

Example situation:

  1. Do not configure pywikibot, and ensure the cache is clear.
    • This will default to test:wikipedia.
  2. Run t.py
t.py
import pywikibot
from pywikibot.comms.eventstreams import EventStreams

site = pywikibot.Site("en", "wikipedia")  # site other than the default
stream = EventStreams("revision-create", site=site)  # unnecessarily requests data from the default site
trace
pywikibot/comms/eventstreams.py:146: in __init__
    self._site = kwargs.pop('site', Site())
pywikibot/__init__.py:1339: in Site
    _sites[key] = interface(code=code, fam=fam, user=user)
pywikibot/site/_apisite.py:127: in __init__
    self.login(cookie_only=True)
pywikibot/site/_apisite.py:384: in login
    if self.userinfo['name'] == self.user():
pywikibot/site/_apisite.py:543: in userinfo
    uidata = uirequest.submit()

Event Timeline

We can apply patches like this, but I'm not sure that unconditionally logging in when initializing APISite is a good idea.

diff --git a/pywikibot/comms/eventstreams.py b/pywikibot/comms/eventstreams.py
index cd9c280c3..aacc4be54 100644
--- a/pywikibot/comms/eventstreams.py
+++ b/pywikibot/comms/eventstreams.py
@@ -143,7 +143,10 @@ class EventStreams(GeneratorWrapper):
                               'install it with "pip install sseclient"\n')
         self.filter = {'all': [], 'any': [], 'none': []}
         self._total = None
-        self._site = kwargs.pop('site', Site())
+        if 'site' in kwargs:
+            self._site = kwargs.pop('site')
+        else:
+            self._site = Site()

The problem is that the pop default is evaluated even if it is not needed:

x = {'foo': 'bar'}
def baz(): print('quez')

x.pop('foo', baz())
quez
'bar'

Here some other samples:
https://codesearch.wmcloud.org/pywikibot/?q=%28get%7Cpop%29%5C%28%5B%5E+%2C%5C%29%5D%2B%2C+%5B%5E%5C%28%5C%29%5D%2B%5C%28&files=&excludeFiles=&repos=

dict should have been implemented like that instead:

class lazy_dict(dict):
    sentinel = object()
    def pop(value, default=sentinel):
        try:
            r = super.pop(value)
        except KeyError:
            if default is sentinel:
                raise
            r = default
        return r

but it is a C function

Xqt triaged this task as Medium priority.May 4 2023, 7:19 AM

Change 915392 had a related patch set uploaded (by Xqt; author: Xqt):

[pywikibot/core@master] [IMPR] do not evaluate Site() with dict.pop() as default value

https://gerrit.wikimedia.org/r/915392

The problem is that the pop default is evaluated even if it is not needed:

x = {'foo': 'bar'}
def baz(): print('quez')

x.pop('foo', baz())
quez
'bar'

Here some other samples:
https://codesearch.wmcloud.org/pywikibot/?q=%28get%7Cpop%29%5C%28%5B%5E+%2C%5C%29%5D%2B%2C+%5B%5E%5C%28%5C%29%5D%2B%5C%28&files=&excludeFiles=&repos=

dict should have been implemented like that instead:

class lazy_dict(dict):
    sentinel = object()
    def pop(value, default=sentinel):
        try:
            r = super.pop(value)
        except KeyError:
            if default is sentinel:
                raise
            r = default
        return r

but it is a C function

See https://bugs.python.org/issue39083

Change 915392 merged by jenkins-bot:

[pywikibot/core@master] [IMPR] do not evaluate Site() with dict.pop() as default value

https://gerrit.wikimedia.org/r/915392