Page MenuHomePhabricator

pywikibot library should not attach logging handlers
Open, Needs TriagePublicFeature

Description

Steps to replicate the issue (include links if applicable):

Run the following code:

from pywikibot import Site, Page

site = Site("test", "wikipedia")
page = Page(site, "Wikipedia:Sandbox")
page.put("foo")

What happens?:

It prints to stderr:

Sleeping for 9.6 seconds, 2023-01-10 10:35:00
Page [[test:Wikipedia:Sandbox]] saved

What should have happened instead?:

It should not produce anything on either stdout or stderr

Software version (skip for WMF-hosted wikis like Wikipedia):

pywikibot==7.7.1
Python 3.9.13

Other information (browser name/version, screenshots, etc.):

Poking around you can see that pywikibot has installed its own logging handler:

o   "pywiki"
|   Level Level 11
|   Propagate OFF
|   Handler <TerminalHandler <stderr> (INFO)>
|     Level INFO
|     Filter <pywikibot.userinterfaces.terminal_interface_base.MaxLevelFilter object at 0x7fc485a43e80>
|     Formatter fmt='%(message)s%(newline)s' datefmt=None
|   Handler <TerminalHandler <stdout> (STDOUT)>
|     Level STDOUT
|     Filter <pywikibot.userinterfaces.terminal_interface_base.MaxLevelFilter object at 0x7fc485a43fd0>
|     Formatter fmt='%(message)s%(newline)s' datefmt=None
|   Handler <TerminalHandler <stderr> (WARNING)>
|     Level WARNING
|     Formatter fmt='%(levelname)s: %(message)s%(newline)s' datefmt=None
|   |
|   o<--[pywiki.wiki]
|       |
|       o<--"pywiki.wiki.family"
|           Level NOTSET so inherits level Level 11

Not only does this produce unwanted output to stderr/stdout, the handler is set to not propagate up to the root logger, so I can't even get the output into my application's log file, where I want it. It should not be installing logging handlers when just called as a library. Installing handlers is the purview of the application layer. As stated in the Python language docs:

Note It is strongly advised that you do not add any handlers other than NullHandler to your library’s loggers. This is because the configuration of handlers is the prerogative of the application developer who uses your library. The application developer knows their target audience and what handlers are most appropriate for their application: if you add handlers ‘under the hood’, you might well interfere with their ability to carry out unit tests and deliver logs which suit their requirements.

The handlers should be installed by the individual scripts, or perhaps in the pwb wrapper script. You could provide a pywikibot.logging.init_handlers() which a user could call to get the current behavior. Or, at the very least, pywikibot.logging.inhibit_handlers() which could be called by a developer using the library to prevent the handlers from being installed. Note that once a handler is installed, there's no good way for an application to remove it.

Event Timeline

Xqt changed the subtype of this task from "Bug Report" to "Feature Request".EditedJan 10 2023, 5:32 PM
Xqt subscribed.

Logging was added more than 10 years ago. Using it is intentional and not a bug. The logging system is also used by the scripts. Removing it would lead to more problems than it solves. The related modules are

There are several methods yet which can modify the handlers:

  • set_interface() to set the UI. Currently only a terminal interface is definded. the buffer interface is usefull for tests. On compat release we also had a cgi, wxpython, and tkinter interface.
  • init_handlers() to initialize the handlers

OK, so it seems like what has to happen is to define a new subclass of ABUIC, called null_interface. It looks like it could be almost the same as buffer_interface, except that it wouldn't attach any handers at all. Maybe instead of:

self.log_handler = logging.handlers.QueueHandler(self._buffer)

it would have

self.log_handler = logging.NullHandler()

Then somebody who wanted the normal logging behavior could just call pywikibot.bot.set_interface("null") before calling pywikibot.Site(). At least that's how it looks from my reading of the code.

I see that pywikibot.Site() has an "interface" parameter. That's not related, is it? From experimentation, calling Site() is when the handlers get attached, but I've been unable to trace through the code how that happens. I only see 3 places in the codebase where set_interface() is called, and they're all in tests.

The last piece I'm not grokking is where the "pywiki" logger has its propagate attribute set to False. That would need to get changed to True, to let everything flow up the the root logger. pywikibot/bot.py has:

root_logger = logging.getLogger('pywiki')

which is confusing at best. The root logger is what you get by calling logging.getLogger() with no arguments.

Hi @RoySmith, looks like a way to go. I'll make some proposals in this matter.

set_interface() is to set the UI where the output stream goes through. Currently only 'terminal' is defined and 'buffer' interface is usually used by tests. In past we had other interfaces too, see my remarks above.

The propagate attribute set to False due to T281643. If you have a better solution this would be fine.

'pywiki' wasn't the only logger in past. I've cleaned a lot of loggers where each module had it's own logger. (see T85620). With layer argument for each of the logging functions a new logger can be defined.

The Site() interface is the Site class to be used, usually APISite but also DataSite for wikidata or commons.

Huji renamed this task from pywikibot library should not attach logging handers to pywikibot library should not attach logging handlers.Jan 17 2024, 3:07 AM
Huji updated the task description. (Show Details)