Migrate wikisaurusbot from Toolforge GridEngine to Toolforge Kubernetes
Closed, Resolved, Public

Description

Kindly migrate your tool (https://grid-deprecation.toolforge.org/t/wikisaurusbot) from Toolforge GridEngine to Toolforge Kubernetes.

Toolforge GridEngine is getting deprecated.
See: https://techblog.wikimedia.org/2022/03/14/toolforge-and-grid-engine/

Please note that a volunteer may perform this migration if this has not been done after some time.
If you have already migrated this tool, kindly mark this as resolved.

If you would rather shut down this tool, kindly do so and mark this as resolved.

Useful Resources:
Migrating Jobs from GridEngine to Kubernetes
https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework#Grid_Engine_migration
Migrating Web Services from GridEngine to Kubernetes
https://wikitech.wikimedia.org/wiki/News/Toolforge_Stretch_deprecation#Move_a_grid_engine_webservice
Python
https://wikitech.wikimedia.org/wiki/News/Toolforge_Stretch_deprecation#Rebuild_virtualenv_for_python_users

Event Timeline

My apologies if this ticket comes as a surprise to you. In order to ensure WMCS can provide a stable, secure and supported platform, it’s important we migrate away from GridEngine. I want to assure you that while it is WMCS’s intention to shut down GridEngine as outlined in the blog post https://techblog.wikimedia.org/2022/03/14/toolforge-and-grid-engine/, a shutdown date for GridEngine has not yet been set. The goal of the migration is to move as many tools as possible onto Kubernetes and ensure as smooth a transition as possible for everyone. Once the majority of tools have migrated, a discussion on a shutdown date will be more appropriate. See T314664: [infra] Decommission the Grid Engine infrastructure.

As noted in https://techblog.wikimedia.org/2022/03/16/toolforge-gridengine-debian-10-buster-migration/ some use cases are already supported by Kubernetes and should be migrated. If your tool can migrate, please do plan a migration. Reach out if you need help or find you are blocked by missing features. Most of all, WMCS is here to support you.

However, it’s possible your tool needs a mixed runtime environment or some other features that aren't yet present in https://techblog.wikimedia.org/2022/03/18/toolforge-jobs-framework/. We’d love to hear of this or any other blocking issues so we can work with you once a migration path is ready. Thanks for your hard work as volunteers and help in this migration!

@komla @nskaggs We can't run Python bots on the "python3.11" image because of this error: ModuleNotFoundError: No module named 'pywikibot'

@komla, @nskaggs, hello, can you help me, please? I created a jobs.yaml file and tried to run Python scripts through it, but it gives an error:

import pywikibot
ModuleNotFoundError: No module named 'pywikibot'

I tried to do as described in https://wikitech.wikimedia.org/wiki/Help:Toolforge/Python#Jobs, with "pip install pywikibot", but it didn't help, unfortunately.

That means that you haven't installed pywikibot (correctly) or aren't using the venv you installed it in.

See Help:Toolforge/Running Pywikibot scripts or Help:Toolforge/Running Pywikibot scripts (advanced) for more specific guidance on running pywikibot scripts.
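As a quick sanity check (a minimal sketch, not part of the tool's code), something like the following can be run with the same interpreter the job uses, to confirm whether pywikibot is importable and which environment it comes from:

# Hypothetical check script; run it with the same Python the job uses.
import sys

print("interpreter:", sys.executable)
try:
    import pywikibot
    print("pywikibot", pywikibot.__version__, "from", pywikibot.__file__)
except ModuleNotFoundError:
    print("pywikibot is not importable from this environment")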

@Wikisaurus2 thanks for having your code in a public repo! :)

I'll give it a look and suggest some changes.

@Wikisaurus2 I have sent a PR https://github.com/wikisaurus/wikisaurusbot/pull/1 that should work, note the instructions in the README.md file.

With this you don't need to have any code in the tool home directory, as the code will be pulled from github and built into an image.

I have tested some, but not all of it, so please test (note that you can define the jobs one by one after building the image, so you can test each separately).

Cheers

I have merged your patch and successfully built the image; what should we do now? I tried to load jobs.yaml and got:

tools.wikisaurusbot@tools-sgebastion-10:~$ toolforge-jobs load jobs.yaml
/usr/bin/toolforge-jobs:15: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html from pkg_resources import load_entry_point

One of the scripts fails with this error:

Traceback (most recent call last):
  File "/data/project/wikisaurusbot/facenapalmscripts/sandbox.py", line 12, in <module>
    import pywikibot
ModuleNotFoundError: No module named 'pywikibot'

You need to remove

export PYTHONPATH=/data/project/shared/pywikibot/stable:/data/project/shared/pywikibot/stable/scripts

from .bash_profile.

It was not using the right image. The jobs.yaml file in the tool home is not the same as the one in the git repo:

tools.wikisaurusbot@tools-sgebastion-10:~$ cat jobs.yaml 
#every 5 minutes
- name: sandbox
  command: python3 $HOME/facenapalmscripts/sandbox.py
  image: python3.11
  schedule: "*/5  * * * *"
  emails: onfailure
...

I followed both instructions and loaded jobs.yaml again. Let's see if the bots will run...

There are some issues:

tools.wikisaurusbot@tools-sgebastion-10:~$ toolforge jobs logs sandbox
2024-03-13T12:37:37+00:00 [sandbox-1710333455-mfbtf] Traceback (most recent call last):
2024-03-13T12:37:37+00:00 [sandbox-1710333455-mfbtf]   File "/workspace/facenapalmscripts/sandbox.py", line 17, in <module>
2024-03-13T12:37:37+00:00 [sandbox-1710333455-mfbtf]     os.environ["PYWIKIBOT_DIR"] = str(curdir)
2024-03-13T12:37:37+00:00 [sandbox-1710333455-mfbtf]     ^^
2024-03-13T12:37:37+00:00 [sandbox-1710333455-mfbtf] NameError: name 'os' is not defined. Did you forget to import 'os'?

I think I might have missed adding an import somewhere.
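For reference, a minimal sketch of the fix the error message points at; curdir here is an illustrative stand-in, the real script defines its own:

import os  # the missing import
from pathlib import Path

curdir = Path(__file__).resolve().parent  # illustrative stand-in for the script's curdir
os.environ["PYWIKIBOT_DIR"] = str(curdir)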

Thanks. Could you set things up so that the .yaml file on Toolforge is updated automatically when the GitHub file is updated, or during the build process?

That's something we are looking into, yes :). It might take a bit to get there though, as we first want to be able to trigger a build when you do git push (so you don't need to build manually yourself).

* Job 'sandbox' (cronjob) (emails: onfailure) had 3 events:
  -- Pod 'sandbox-1710333455-mfbtf'. Phase: 'pending'. Container state: 'terminated'. Start timestamp 2024-03-13T12:37:37Z. Finish timestamp 2024-03-13T12:37:37Z. Exit code was '1'. With reason 'Error'. 
  -- Pod 'sandbox-1710333455-mfbtf'. Phase: 'failed'. Container state: 'terminated'. Start timestamp 2024-03-13T12:37:37Z. Finish timestamp 2024-03-13T12:37:37Z. Exit code was '1'. With reason 'Error'. 
  -- Pod 'sandbox-28505560-h7bcr'. Phase: 'failed'. Container state: 'terminated'. Start timestamp 2024-03-13T12:40:14Z. Finish timestamp 2024-03-13T12:40:14Z. Exit code was '1'. With reason 'Error'.

The sandbox process still fails, but its .err file isn't updated.

Yep, as I mention in the readme, the logs are not dumped to files by default; you have to add mount: all and filelogs: true to the entry in jobs.yaml for the files to get created. You can check the logs with toolforge jobs logs sandbox

I see this:

2024-03-13T12:56:14+00:00 [sandbox-1710334571-2hnv5] Skipped '/workspace/user-config.py': owned by someone else.

I think that's something @taavi told me yesterday and I forgot (it worked on my local machine xd), let me look.

you have to add mount: all and filelogs: true to the entry in jobs.yaml for the files to get created

It would be easier to use .err files like before; could you make the needed edits?

The sandbox job spams my mail about its failures every ~15 minutes; it would be good to fix that.

Got a working patch https://github.com/wikisaurus/wikisaurusbot/pull/3
It was kind of tricky to get pywikibot auth working though.

Let's see if that works.

Sleeping for 8.4 seconds, 2024-03-13 14:25:06
Page [[Обсуждение Википедии:Песочница]] saved
Sleeping for 9.5 seconds, 2024-03-13 14:25:15
Page [[Инкубатор:Песочница]] saved
Skipped '/workspace/user-config.py': owned by someone else.
WARNING: /workspace/facenapalmscripts/sandbox.py:38: FutureWarning: pywikibot.page._basepage.BasePage.editTime is deprecated since release 8.0.0; use latest_revision.timestamp instead.
  delta = time - page.editTime()

The script partially worked and partially didn't. I fixed line 38 and rebuilt the project, but I don't know how to solve the ownership issue on /workspace/.

That line, Skipped '/workspace/user-config.py': owned by someone else., is OK; it happens when you import pywikibot, and after that the code properly loads the user-config (that's the hack I had to make).

What did not work? What was supposed to happen?

As in, can you elaborate on what did not work?

Current error:

Traceback (most recent call last):
  File "/workspace/facenapalmscripts/sandbox.py", line 49, in <module>
    main()
  File "/workspace/facenapalmscripts/sandbox.py", line 38, in main
    delta = time - latest_revision.timestamp()
                   ^^^^^^^^^^^^^^^
NameError: name 'latest_revision' is not defined
CRITICAL: Exiting due to uncaught exception NameError: name 'latest_revision' is not defined

Looks like my fix of line 38 was wrong; latest_revision needs to be defined/constructed before calling its method, but I have no experience with Pywikibot or even Python and can't do it right now (I should read the docs). Maybe you know how to solve this?

Let me look

Merged, rebuilt, loaded. Current errors in two of the scripts:

NameError: name 'latest_revision' is not defined
CRITICAL: Exiting due to uncaught exception NameError: name 'latest_revision' is not defined
Skipped '/workspace/user-config.py': owned by someone else.
Traceback (most recent call last):
  File "/workspace/facenapalmscripts/sandbox.py", line 49, in <module>
    main()
  File "/workspace/facenapalmscripts/sandbox.py", line 38, in main
    delta = time - page.latest_revision.timestamp()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'Timestamp' object is not callable
CRITICAL: Exiting due to uncaught exception TypeError: 'Timestamp' object is not callable
Traceback (most recent call last):
  File "/workspace/facenapalmscripts/autopurge.py", line 83, in <module>
    main()
  File "/workspace/facenapalmscripts/autopurge.py", line 78, in main
    respond.append(KEYS[arg](site))
                   ^^^^^^^^^^^^^^^
  File "/workspace/facenapalmscripts/autopurge.py", line 33, in process_hourly
    return "срочных: " + process_purge(site, "К:Википедия:Страницы с ежечасно очищаемым кэшем")
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/facenapalmscripts/autopurge.py", line 27, in process_purge
    if not site.purgepages(members[i:i+limit]):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/site/_apisite.py", line 2897, in purgepages
    result = req.submit()
             ^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/data/api/_requests.py", line 993, in submit
    response, use_get = self._http_request(use_get, uri, body, headers,
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/data/api/_requests.py", line 684, in _http_request
    response = http.request(self.site, uri=uri,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/comms/http.py", line 283, in request
    r = fetch(baseuri, headers=headers, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/comms/http.py", line 457, in fetch
    callback(response)
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/comms/http.py", line 333, in error_handling_callback
    raise ServerError(response)
pywikibot.exceptions.ServerError: HTTPSConnectionPool(host='ru.wikipedia.org', port=443): Read timed out. (read timeout=45)
CRITICAL: Exiting due to uncaught exception ServerError: HTTPSConnectionPool(host='ru.wikipedia.org', port=443): Read timed out. (read timeout=45)

Ack, this should just be removing the (), give me a sec.
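For reference, a minimal sketch of the corrected usage (the page title is illustrative and a configured pywikibot environment is assumed): latest_revision.timestamp is an attribute that already holds a Timestamp, not a method, so the script's delta = time - page.latest_revision.timestamp works once the parentheses are dropped.

import pywikibot

site = pywikibot.Site("ru", "wikipedia")
page = pywikibot.Page(site, "Википедия:Песочница")  # illustrative page

# No parentheses: latest_revision.timestamp is already a Timestamp value.
last_edited = page.latest_revision.timestamp
print("last edited at:", last_edited)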

A ruwiki user on our Discord server noticed this change in your first PR; it looks like a syntax error:

image.png (330×491 px, 18 KB)

This file: https://github.com/wikisaurus/wikisaurusbot/blob/master/facenapalmscripts/validstats.py

Oops, yep, that should not be there

If Skipped '/workspace/user-config.py': owned by someone else. isn't an issue, I don't see any problems now with three bots.

Bot edits started: https://ru.wikipedia.org/wiki/Special:Contributions/WikisaurusBot

But one bot still has this error:

Traceback (most recent call last):
  File "/workspace/facenapalmscripts/autopurge.py", line 83, in <module>
    main()
  File "/workspace/facenapalmscripts/autopurge.py", line 78, in main
    respond.append(KEYS[arg](site))
                   ^^^^^^^^^^^^^^^
  File "/workspace/facenapalmscripts/autopurge.py", line 33, in process_hourly
    return "срочных: " + process_purge(site, "К:Википедия:Страницы с ежечасно очищаемым кэшем")
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/facenapalmscripts/autopurge.py", line 27, in process_purge
    if not site.purgepages(members[i:i+limit]):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/site/_apisite.py", line 2897, in purgepages
    result = req.submit()
             ^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/data/api/_requests.py", line 993, in submit
    response, use_get = self._http_request(use_get, uri, body, headers,
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/data/api/_requests.py", line 684, in _http_request
    response = http.request(self.site, uri=uri,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/comms/http.py", line 283, in request
    r = fetch(baseuri, headers=headers, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/comms/http.py", line 457, in fetch
    callback(response)
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/comms/http.py", line 333, in error_handling_callback
    raise ServerError(response)
pywikibot.exceptions.ServerError: HTTPSConnectionPool(host='ru.wikipedia.org', port=443): Read timed out. (read timeout=45)
CRITICAL: Exiting due to uncaught exception ServerError: HTTPSConnectionPool(host='ru.wikipedia.org', port=443): Read timed out. (read timeout=45)

That does not seem code-related though; it timed out trying to connect to ru.wikipedia.org :/

If you retry, does it time out again?

The autopurge-hourly.err file has contained this error many times (since we revived these bots), so I assume it could be related to the bot's code. Maybe some outdated login method?

Feels weird :/, maybe it makes too many requests at the same time and gets the worker IP banned (the limit per k8s worker node is 500 simultaneous requests).

I'll have to investigate more in depth, looking

It seems to be working sometimes, and sometimes it does not and just skips that run:

--- failure before this
Skipped '/workspace/user-config.py': owned by someone else.
--- here it worked
Sleeping for 5.0 seconds, 2024-03-13 17:00:19
--- here it worked
Sleeping for 8.2 seconds, 2024-03-13 17:00:26
--- failure
ERROR: Traceback (most recent call last):
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/data/api/_requests.py", line 684, in _http_request
    response = http.request(self.site, uri=uri,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/comms/http.py", line 283, in request
    r = fetch(baseuri, headers=headers, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/comms/http.py", line 457, in fetch
    callback(response)

Do you know if it's actually doing things? (I don't know any Russian, so I'm having a hard time navigating the wiki xd)

Its (English) name means it purges pages, i.e. does null edits, so its edits will not be visible in its contribs. Its code is written in Python, not Russian, so you can understand it better than I can.

Can you verify whether it purged any pages? I don't know what К:Википедия:Страницы с ежечасно очищаемым кэшем is or how to find that category (guessing it's a category from the code), or Шаблон:Очищать кэш/статус, where it seems to be doing some logging.

I don't know how to find out if it purged any pages.

Thanks, it's not that obvious to me. The latter does not seem to exist though? That's weird; from the code it seems that it should put stuff there (I think).

Oh, it's actually running with --nolog, which avoids logging there, so I don't know how to find out either.

I can try modifying it a bit tomorrow to add some logs and see where it's getting stuck, if it is.

You can open any page you have questions about just by entering its title into the search field on ruwiki and pressing Enter. "К" is ruwiki's alias for the Категория (Category) namespace.

The grid engine has been shut down, so I'm closing any remaining migration tasks as Declined. If you're still planning to migrate this tool, please re-open this task and add one or more active project tags to it. (If you need a project tag for your tool, those can be created via the Toolforge admin console.)

MBH changed the task status from Declined to Resolved. Mar 14 2024, 1:24 PM

We are in the process of migrating, and some scripts work now, so I think this is the more correct status.

The autopurge job is now passing using https://github.com/wikisaurus/wikisaurusbot/pull/7

I think the issue was that it tried to make many (500) parallel requests to the wikis and ended up getting banned for a few minutes; I reduced the parallelism there and it now seems to pass in a reasonable time.
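For illustration, a minimal sketch of that kind of change (the batch size and pause are made up, and the actual pull request may do it differently): purge the category members in small batches with a pause in between, instead of firing everything at once.

import time
import pywikibot

site = pywikibot.Site("ru", "wikipedia")
category = pywikibot.Category(site, "Категория:Википедия:Страницы с ежечасно очищаемым кэшем")
members = list(category.members())

BATCH = 50   # illustrative batch size, much smaller than one big burst
PAUSE = 2    # illustrative pause in seconds between batches

for i in range(0, len(members), BATCH):
    if not site.purgepages(members[i:i + BATCH]):
        print(f"purge failed for the batch starting at index {i}")
    time.sleep(PAUSE)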

New errors on autopurge-daily:

WARNING: API error protectedpage: This page has been protected to prevent editing or other actions.
Traceback (most recent call last):
  File "/workspace/facenapalmscripts/autopurge.py", line 77, in process_null
    temp.touch()
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/page/_basepage.py", line 1424, in touch
    self.save(summary=summary, watch='nochange',
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/page/_basepage.py", line 1276, in save
    self._save(summary=summary, watch=watch, minor=minor, botflag=botflag,
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/page/_decorators.py", line 55, in wrapper
    handle(func, self, *args, **kwargs)
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/page/_decorators.py", line 46, in handle
    raise err
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/page/_decorators.py", line 35, in handle
    func(self, *args, **kwargs)
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/page/_basepage.py", line 1288, in _save
    done = self.site.editpage(self, summary=summary, minor=minor,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/site/_decorators.py", line 86, in callee
    return fn(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/site/_apisite.py", line 2113, in editpage
    raise exception(page) from None
pywikibot.exceptions.LockedPageError: Page [[ru:Википедия:ИИ]] is locked.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/workspace/facenapalmscripts/autopurge.py", line 122, in <module>
    main()
  File "/workspace/facenapalmscripts/autopurge.py", line 114, in main
    respond.append(KEYS[arg](site))
                   ^^^^^^^^^^^^^^^
  File "/workspace/facenapalmscripts/autopurge.py", line 78, in process_null
    except pywikibot.exceptions.LockedPage:
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/layers/heroku_python/dependencies/lib/python3.12/site-packages/pywikibot/tools/_deprecate.py", line 659, in __getattr__
    return getattr(self._module, attr)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'pywikibot.exceptions' has no attribute 'LockedPage'
CRITICAL: Exiting due to uncaught exception AttributeError: module 'pywikibot.exceptions' has no attribute 'LockedPage'

Yes, the page https://ru.wikipedia.org/w/index.php?title=Project:ИИ&redirect=no is currently fully protected from editing, but could we handle this exception so that the process doesn't crash in such cases?

validation-stats.err:

WARNING: /workspace/scripts/../facenapalmscripts/validstats.py:78: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
  stats.append(datetime.strftime(datetime.utcnow(), "%Y-%m-%d"))

WARNING: /workspace/scripts/../facenapalmscripts/validstats.py:29: DeprecationWarning: Instead of using kwargs from Request.__init__, parameters for the request to the API should be added via the "parameters" parameter.
  request = Request(site=site,

WARNING: /workspace/scripts/../facenapalmscripts/validstats.py:47: DeprecationWarning: Instead of using kwargs from Request.__init__, parameters for the request to the API should be added via the "parameters" parameter.
  request = Request(site=site,
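For reference, a minimal sketch of the non-deprecated forms those warnings point at (the query parameters below are purely illustrative, not the ones validstats.py uses):

from datetime import datetime, timezone

import pywikibot
from pywikibot.data.api import Request

# Timezone-aware "now" instead of the deprecated datetime.utcnow():
today = datetime.now(timezone.utc).strftime("%Y-%m-%d")

# API parameters passed via "parameters" instead of raw keyword arguments:
site = pywikibot.Site("ru", "wikipedia")
request = Request(site=site, parameters={"action": "query", "meta": "siteinfo"})
data = request.submit()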

These bots also have a strange issue: they write messages about normal operation to the error stream; see the contents of techtasks.err and autopurge-daily.err, for example. Those messages should be redirected to the .out stream, and the .err files should contain only messages about crashes.
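One general way to get that separation in plain Python (a sketch using the standard logging module; pywikibot's own console output has separate settings, so this may not cover everything the bots print):

import logging
import sys

# Route INFO-level progress messages to stdout (the .out file) and only
# warnings and errors to stderr (the .err file).
stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.addFilter(lambda record: record.levelno < logging.WARNING)

stderr_handler = logging.StreamHandler(sys.stderr)
stderr_handler.setLevel(logging.WARNING)

logging.basicConfig(level=logging.INFO, handlers=[stdout_handler, stderr_handler])

logging.info("routine progress message")   # ends up in the .out file
logging.error("something actually broke")  # ends up in the .err file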

All you need to do is wrap the code that might throw the exception in a try/except, like:

from pywikibot.exceptions import LockedPageError  # at the top of the file

...

try:
    temp.touch()  # the call that can raise LockedPageError (see the traceback above)
except LockedPageError as error:
    print(f"An error occurred, continuing...\nError: {error}")