RuntimeError: dictionary changed size during iteration
Closed, Resolved | Public | BUG REPORT

Description

Run the interactive replace script (pwb.py replace) and confirm an edit.

What happens?:

RuntimeError: dictionary changed size during iteration
Traceback (most recent call last):
  File "C:\Users\matej\Dokumenty\core\pwb.py", line 495, in <module>
    main()
  File "C:\Users\matej\Dokumenty\core\pwb.py", line 479, in main
    if not execute():
  File "C:\Users\matej\Dokumenty\core\pwb.py", line 463, in execute
    run_python_file(filename, script_args, module)
  File "C:\Users\matej\Dokumenty\core\pwb.py", line 143, in run_python_file
    exec(compile(source, filename, 'exec', dont_inherit=True),
  File ".\scripts\replace.py", line 1096, in <module>
    main()
  File ".\scripts\replace.py", line 1092, in main
    bot.run()
  File "C:\Users\matej\Dokumenty\core\pywikibot\bot.py", line 1565, in run
    self.treat(page)
  File ".\scripts\replace.py", line 705, in treat
    choice = pywikibot.input_choice(
  File "C:\Users\matej\Dokumenty\core\pywikibot\bot.py", line 536, in wrapper
    init_handlers()
  File "C:\Users\matej\Dokumenty\core\pywikibot\bot.py", line 447, in init_handlers
    writelogheader()
  File "C:\Users\matej\Dokumenty\core\pywikibot\bot.py", line 506, in writelogheader
    for module in sys.modules.values():
RuntimeError: dictionary changed size during iteration
CRITICAL: Exiting due to uncaught exception <class 'RuntimeError'>

Software version (if not a Wikimedia wiki), browser information, screenshots, other information, etc.:

Pywikibot: [ssh] pywikibot-core.git (eccb6ab, g16023, 2022/02/28, 18:44:18, master)
Release version: 7.1.0.dev0
Python: 3.10.2 (tags/v3.10.2:a58ebcc, Jan 17 2022, 14:12:15) [MSC v.1929 64 bit (AMD64)]

Event Timeline

Xqt triaged this task as High priority. Edited Mar 1 2022, 12:06 PM
Xqt subscribed.

I can't reproduce it:

C:\pwb\GIT\core>pwb replace -page:user:xqt/Test a b -simulate -log
The summary message for the command line replacements will be something like: Bot: Automatisierte Textersetzung  (-a +b)
Press Enter to use this automatic message, or enter a description of the
changes your bot will make:
Retrieving 1 pages from wikipedia:de.


>>> Benutzer:Xqt/Test <<<
@@ -1 +1 @@
- Foo\nbar<code>&amp;</code>
+ Foo\nbbr<code>&bmp;</code>

Do you want to accept these changes? ([y]es, [N]o, [e]dit original, edit
[l]atest, open in [b]rowser, [m]ore context, [a]ll, [q]uit): y
Edit summary: Bot: Automatisierte Textersetzung  (-a +b)
SIMULATION: edit action blocked.
Page [[Benutzer:Xqt/Test]] saved

Anyway, this is strange because I have no clue where or when the sys.modules dict is changed. We only have:

for module in sys.modules.values():
    filename = version.get_module_filename(module)
    if not filename:
        continue

    param = {'sep': ' '}
    if PYTHON_VERSION >= (3, 6, 0):
        param['timespec'] = 'seconds'
    mtime = version.get_module_mtime(module).isoformat(**param)

with

def get_module_filename(module) -> Optional[str]:
    if hasattr(module, '__file__'):
        filename = module.__file__
        if not filename or not os.path.exists(filename):
            return None

        program_dir = _get_program_dir()
        if filename[:len(program_dir)] == program_dir:
            return filename
    return None

and

def get_module_mtime(module):
    filename = get_module_filename(module)
    if filename:
        return datetime.datetime.fromtimestamp(os.stat(filename).st_mtime)
    return None

It is obvious that this will fail:

x = {1:1, 2:2, 3:3}
for i in x.values():
    print(i)
    if i == 2:
        x[4] = 5

        
1
2
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    for i in x.values():
RuntimeError: dictionary changed size during iteration
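A common way to make such a loop tolerant of insertions is to iterate over a snapshot instead of the live dictionary. The following is only a minimal sketch of that idea using the toy dict from above; it is not the change that was made in this task, and applying it to the loop in writelogheader (for example via list(sys.modules.values())) is an assumption, not something confirmed in this thread.

```python
# Sketch: iterate over a copy of the values so that insertions made
# during the loop cannot invalidate the iterator.
x = {1: 1, 2: 2, 3: 3}
seen = []

for i in list(x.values()):  # list() copies the values before the loop starts
    seen.append(i)
    if i == 2:
        x[4] = 5  # safe here: the loop walks the copy, not the live dict

print(seen)
print(x)
```

The trade-off is that entries added while the loop is running are not visited, which seems acceptable for a log header that only reports what was loaded at that moment.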

I completely forgot to mention it's on Python 3.10...

I also use Python 3.10.2 for developing. Do you have the complete command line for me?

I've made up one that currently crashes for me (even with -simulate):
pwb.py replace -regex "ť[eě]\b" "tě" -summary:"oprava překlepu" -ns:0 -search:"insource:/ť[eě][ .,]/" -lang:cs -family:wikipedia -simulate

Worked for me:

...
Edit summary: oprava překlepu
SIMULATION: edit action blocked.
Page [[Šest stupňů odloučení]] saved

28 pages read
28 pages written
0 pages skipped
Execution time: 28 seconds
Read operation time: 1.0 seconds
Write operation time: 1.0 seconds
Script terminated successfully.

Hm???

After some debugging, I suspect this could be related to the asynchronous requests spawned when you confirm a replacement in replace.py.
It only happens sometimes, and I think it did not happen when I had cosmetic changes turned off.

Just an idea to find out which module was changed: could you please change these lines in bot.py around line 504 and below:

# imported modules
log('MODULES:')
_old = sys.modules.copy()
try:
    for module in sys.modules.values():
        filename = version.get_module_filename(module)
        if not filename:
            continue

        param = {'sep': ' '}
        if PYTHON_VERSION >= (3, 6, 0):
            param['timespec'] = 'seconds'
        mtime = version.get_module_mtime(module).isoformat(**param)

        log('  {} {}'.format(mtime, filename))
except RuntimeError:
    print(_old.keys() ^ sys.modules.keys())  # should print the difference when the RuntimeError occurs
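The diagnostic above relies on the symmetric difference of two dict key views. As a standalone illustration, the sketch below registers a placeholder module by hand to simulate a late import; the name _hypothetical_late_module is made up for this example and does not exist in Pywikibot.

```python
import sys
import types

# Snapshot of the module table before anything changes.
before = sys.modules.copy()

# Simulate a late import by registering a placeholder module by hand.
# The module name is hypothetical and used only for this illustration.
sys.modules['_hypothetical_late_module'] = types.ModuleType(
    '_hypothetical_late_module')

# Key views support set operations; ^ yields the names present in
# exactly one of the two dicts, i.e. what was added or removed.
changed = before.keys() ^ sys.modules.keys()
print(changed)

# Clean up so the placeholder does not leak into the interpreter state.
del sys.modules['_hypothetical_late_module']
```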

Ah, yes, that would explain why it did not fail for me: I used "a" for always for the first choice.

It printed {'pydoc'}. (My debugging was a bit conservative.)

Uhh, pydoc isn't used anywhere in a framework run:
https://codesearch.wmcloud.org/pywikibot/?q=pydoc&i=nope&files=&excludeFiles=&repos=
https://docs.python.org/3.11/library/pydoc.html?highlight=pydoc#module-pydoc

It actually varies, I also got {'stdnum.ean'} or {'stdnum.exceptions'}.

$ pip list | grep stdnum
python-stdnum      1.17

Does this behaviour change if cosmetic_changes is disabled?

I believe it helps but I can't say for sure.

Now when I enabled them using -cc, the debugging yielded {'pywikibot.cosmetic_changes'}. I really think there is a race condition, caused by the import inside BasePage._cosmetic_changes_hook. Perhaps it could now be moved to the header?
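The suspected race can be sketched in isolation: one thread walks sys.modules while another thread adds an entry, as a deferred import inside a function would. This is only an illustration under assumptions. The worker below registers a placeholder module by hand instead of importing pywikibot.cosmetic_changes, the name _hypothetical_lazy_module is made up, and join() forces an interleaving that in the real bot depends on timing.

```python
import sys
import threading
import types


def late_import():
    """Stand-in for a deferred import performed by a worker thread."""
    # Hypothetical module name; a real 'import x' inside a function
    # adds an entry to sys.modules in the same way on first use.
    sys.modules['_hypothetical_lazy_module'] = types.ModuleType(
        '_hypothetical_lazy_module')


error = None
try:
    for index, module in enumerate(sys.modules.values()):
        if index == 0:
            worker = threading.Thread(target=late_import)
            worker.start()
            worker.join()  # force the interleaving that is otherwise timing-dependent
except RuntimeError as exc:
    error = exc
finally:
    # Remove the placeholder again so the interpreter state is unchanged.
    sys.modules.pop('_hypothetical_lazy_module', None)

print(type(error).__name__, error)
```

Importing the module at the top of the file means it is already in sys.modules before any iteration starts, so that particular import can no longer change the dictionary size mid-loop.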

Change 767210 had a related patch set uploaded (by Xqt; author: Xqt):

[pywikibot/core@master] [bugfix] import dependecies at top of a script file

https://gerrit.wikimedia.org/r/767210

I still cannot reproduce it but importing at top of the file could help.

Change 767210 merged by jenkins-bot:

[pywikibot/core@master] [bugfix] import dependecies at top of a script file

https://gerrit.wikimedia.org/r/767210

matej_suchanek assigned this task to Xqt.