Create a "my first Pywikibot bot" tutorial for Toolforge
Closed, Resolved, Public

Description

Wanted: A detailed but not technical jargon heavy guide to setting up a new Pywikibot tool.

[10:25]  <theglobetrotter>	How do I set up a bot on toolforge that edits the wiki at regular intervals?

This question was asked in the #wikimedia-cloud IRC channel, unfortunately at a time when no one was providing answers. It is a good question though. There are many possible ways to answer, but a reasonably common solution would be to first create a pywikibot script to perform the edits and then to set up a cron job to run that script at some defined interval.
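A minimal sketch of the kind of script meant here, assuming Pywikibot is installed and a user-config.py already names the wiki and bot account. The function names, the appended line, and the edit summary are made up for illustration; the scheduling itself (cron or a scheduled job) is left to whatever the platform provides.

```python
from datetime import datetime, timezone


def build_status_line(now: datetime) -> str:
    """Return the wikitext line the bot appends on each run."""
    return f"* Bot run at {now.strftime('%Y-%m-%d %H:%M')} UTC"


def run(title: str) -> None:
    """Append a status line to the page `title` and save it."""
    # Imported here so the pure helper above can be used without Pywikibot.
    import pywikibot

    site = pywikibot.Site()  # wiki and account come from user-config.py
    page = pywikibot.Page(site, title)
    line = build_status_line(datetime.now(timezone.utc))
    # page.text is an empty string when the page does not exist yet.
    page.text = f"{page.text}\n{line}" if page.text else line
    page.save(summary="Bot: add run timestamp")  # hypothetical summary
```

Calling run() with a page title in the bot's own user space, from whatever job runs at the chosen interval, would then make one edit per run.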

Tutorial should cover:

  • Standard tutorial header material (prerequisites, overview of steps)
  • Setting up Owner-only OAuth for pywikibot (maybe something to document on mw.o if not already there)
  • Using shared pywikibot code on Toolforge
  • Example of running a pywikibot provided script
  • Example of running a trivial custom script
  • Example of Grid Engine cron job
  • Example of Kubernetes CronJob object (might be a good excuse to work on some helper script for this)
  • Standard tutorial footer material
    • Choosing a license
    • Publishing source code
    • Adding a Tool:* documentation page
    • Where to look next for help with more advanced use cases

Should include:

  • Screenshots!
  • Cut-n-paste instructions for everything

Event Timeline

This comment was removed by Xephyr826.

@bd808 or anyone, Two questions:

  • Should this tutorial use PAWS to introduce Pywikibot as suggested above?

Feel free to, though PAWS is quite different from Tool Labs in many aspects (eg. Grid), but users should be able to familiarize themselves with Pywikibot with PAWS.

Definitely not https://www.mediawiki.org/wiki/Manual:Pywikibot/My_first_Pywiki_bot; that does not show how the tutorial is related to tool labs. Personally I'd recommend somewhere inside https://wikitech.wikimedia.org/wiki/Special:PrefixIndex/Help:Tool_Labs/ (as it is where a few other "my first..." tutorials for tool labs live).

There is also a very old (and perhaps outdated) tutorial at https://www.mediawiki.org/wiki/Manual:Pywikibot/Installation/Labs. I'd recommend rewriting that on wikitech and redirecting that page to the new tutorial.

@Xephyr826: Thanks for working on this!
I see this got moved to https://wikitech.wikimedia.org/wiki/Help:Toolforge/My_first_Pywikibot_tool , great!
Is there more work needed on this task? Or is this task "resolved"?

I think there is quite a bit more work that can be done on this tutorial, but it is nice to have something for people to iterate on.

I'm back for another crack at this one :-)

Xephyr826 renamed this task from Create a "my first Pywikibot bot" tutorial for Tool Labs to Create a "my first Pywikibot bot" tutorial for Toolforge. Mar 24 2018, 8:41 PM
Xephyr826 removed a project: Cloud-Services.

https://wikitech.wikimedia.org/wiki/Help:Toolforge/My_first_Pywikibot_tool#Install_Pywikibot
^ I suggest using the shared pywikibot as documented at https://wikitech.wikimedia.org/wiki/Help:Toolforge/Developing#Using_the_shared_Pywikibot_files_(recommended_setup)

User:Russell Blau/Using pywikibot on Labs Might be outdated.

It looks like it was written in the pmtpa times, but it's not too bad as a current guide for a shared install. From a quick glance only minor details like the bash prompt and the pywikibot branch are different.

mw:Manual:Pywikibot/Installation/Toolforge Detailed instructions but some parts marked outdated.

Documents the method of having a tool running its own copy of pywikibot.
Section Setup the webspace is hopelessly outdated (pmtpa instructions, when we still had a global apache).
Section Setup the job submission describes the conversion from toolserver command to toolforge command and is not relevant. Syntax like -m 512M is preferred over -l h_vmem=512M as the former also specifies virtual_free in addition to h_vmem. And note that 512M is now the default after T120517.
Section Automatic updating git on Toolforge is really unnecessary. If someone wants to auto-update, they should use the shared pywikibot, which is updated twice a day AFAIK.

FWIW, because of issues like https://unix.stackexchange.com/q/67940, I do not personally use that export PYTHONPATH thing. It should be unnecessary anyways. If a custom environment must be specified, eg. for virtualenvs or C dependencies, I usually use workarounds, eg.

  • using /absolute/path/to/virtualenv/bin/python as the python interpreter
  • using .local as the virtualenv
  • using wrapper scripts like load.sh
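The first workaround can be sketched in Python as below. This is only an illustration of the idea: the helper name and the tool path in the usage note are hypothetical, and a plain shell wrapper works just as well.

```python
from pathlib import PurePosixPath


def venv_command(venv_dir: str, script: str, *args: str) -> list[str]:
    """Build an argv list that runs `script` with the venv's own interpreter.

    Calling the interpreter by absolute path picks up the packages installed
    in that virtualenv without touching PYTHONPATH or any shell startup file.
    """
    python = PurePosixPath(venv_dir) / "bin" / "python"
    return [str(python), script, *args]
```

For example, venv_command("/data/project/mytool/venv", "bot.py") gives a list that can be passed to subprocess.run(..., check=True) or written out as the command line of a job.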

For https://wikitech.wikimedia.org/wiki/Help:Toolforge/Developing:

  • compat isn't maintained anymore. I'd say don't even mention it... causes more confusion than good.
  • Section setup web-space & Setup job submission suffers from the same outdatedness as mw:Manual:Pywikibot/Installation/Toolforge.

Expanded on the instructions to request access to Toolforge: https://wikitech.wikimedia.org/wiki/Help:Toolforge/My_first_Pywikibot_tool#Request_access_to_Toolforge

Pulled in simplified instructions from a few places so all the instructions are in one section.

both pretty good

both inferior to the other in some aspects.

https://wikitech.wikimedia.org/wiki/Help:Toolforge/Pywikibot

  • Should use /shared/pywikibot/core (if grid-only) or /data/project/shared/pywikibot/core (works also in k8s, but very few run bots in k8s...)
  • Modifying PYTHONPATH is evil. There is no guarantee that bash_profile will be loaded at all. (T134495#4078967, T164277: jsub/jstart inconsistency: non-continuous jobs spawns a login bash shell that loads .bash_profile, but continuous jobs doesn't load either .bash_profile or .bashrc)
    • Highly recommend either using venv and/or the pwb.py in the shared nightly clone.
    • Honestly, some people tell you to modify env vars with .bashrc, some .bash_profile, some .profile, and some .bash_aliases. The outcome? A terrible mess. (My opinion: .bash_aliases should go into .bashrc and .bashrc should be loaded by .profile after testing if the shell is bash; .bash_profile should go into .profile. Keep login shell configuration in .profile and non-login shell configuration in .bashrc)
  • Doesn't explain T60784#3925224 (Granted, people are very unlikely to hit this bug if they use pywikibot's own scripts, but if some more advanced people start to code their own pywikibot scripts and start adding some print statements, oh, gotta expect some fun...)
  • generate_user_files.py is evil. (my opinion doesn't help does it? admittedly, the script does help newcomers a lot, so, ignore this complaint)
  • Should seriously stop talking about compat. It ain't supported.
  • Setup web-space is like 4 years outdated. I'd recommend just remove the section.
  • qcronsub?! Is this toolserver?
  • Why h_vmem when we have -mem?
  • 256MB was default limit... until 3 years ago
  • Should start talking about python3 rather than this aged python2

tl;dr: badly outdated.

https://wikitech.wikimedia.org/wiki/Help:Toolforge/My_first_Pywikibot_tool

  • Nice docs on how to access toolforge
  • Local clone is the only way? This should not be used unless you're developing pywikibot code itself.
  • The instruction for webpage... while it works, it has nothing to do with pywikibot.
  • Doesn't even talk about how to submit a job to either grid or k8s.
  • And yeah... empty sections

tl;dr: nice start, but bad finish, lacking so much that should be said.

https://wikitech.wikimedia.org/wiki/User:Dvorapa/Toolforge_for_beginners

Looking forward to it :)

@zhuyifei1999 Agree: python2, compat, 2.0, all these things should be eliminated by force from Pywikibot manuals (and by brute force for the still-occurring pywikipedia, trunk, or rewrite! Are we still at the beginning of the century?)

("screenshots"? of command line interface? wtf?)

Retaking this task.

Will be looking at the current tutorial and https://wikitech.wikimedia.org/wiki/Help:Toolforge/Pywikibot for updates and improvements.

See staging doc here: https://wikitech.wikimedia.org/wiki/User:SRodlund/myfirstpywikibot_(staging)

FYI, in case it helps prioritize this task, we just had a Bot development workshop with South Asia community members and https://wikitech.wikimedia.org/wiki/Help:Toolforge/Pywikibot was recommended and listed as the resource for them to consult.

Xqt raised the priority of this task from Low to Medium. Mar 3 2021, 5:42 PM

@srodlund: are you still working on it?

@Xqt to be honest, I started working on this a while ago when I was working on some pywikibot tutorials for PAWS, but it has been something I haven't made much progress on lately. I'm going to unassign myself from this. If you are interested in working on it, I'm happy to collaborate.

Here's the original page on Wikitech: https://wikitech.wikimedia.org/wiki/Help:Toolforge/My_first_Pywikibot_tool
And here's the document I started to work on: https://wikitech.wikimedia.org/wiki/User:SRodlund/myfirstpywikibot_(staging)

Hi, I just followed the guide and had a bit of a hard time getting everything in place. I'm happy to collaborate in improving it with my recent experience using the shared files, installing custom dependencies and submitting the job to toolforge.

My path here was writing a script to edit/create pages that runs for longer than PAWS is good for, trying WSL2, hitting 'writeable by others' error, not being able to find Pywikibot Docker documentation (my brief attempts gave me the same permission error), then resorting to Toolforge. If there's a simpler way to do this please let me know...

Please feel empowered to make edits to the documentation that you feel would be helpful to others @Martimpassos.

@bd808 before I make any edits I'd like to understand what was caused by technical shortcomings on my side and what could actually be improved:

  • My first issue was with the location of the pwb.py wrapper. Following step 5 doesn't work, even when using release 7.0.0 scripts location. What worked for me was calling python3 /data/project/shared/pywikibot/core/pwb.py generate_user_files. Since the files' location has changed and that's where new users will find them, should there still be a caution alert or should step 5 be updated with the new path and recommend calling pwb.py with arguments rather than the utility scripts directly?
  • What is the difference between /stable/ and /core/? The caution alert points to core whereas the rest of the guide points to stable
  • The guide states that using the shared files is recommended whenever possible, but the examples for setting up a virtual environment use the local installation scenario. This is where I had the most trouble: how do I install custom dependencies and use the shared files? Or should anyone with extra dependencies install pywikibot locally? In the virtual environment bash script example one cds to $HOME/pywikibot and pip3 installs their dependencies with the -e flag. What if the user does not have the local pywikibot folder? What worked for me was simply installing all my dependencies together in the first pip3 install line, and ignoring the change of directory and editable installs. Should there be two extra-dependencies examples, one for local installation and one for shared files?
  • Does one still need the .pywikibot folder? Running generate_user_files creates them in the $HOME directory, should they be moved?

Furthermore, I ended up here after understanding PAWS isn't appropriate for long-running scripts, but have tried using Pywikibot locally on WSL2 and Docker without success. Does documentation for either exist? I found a WSL hack but it looks outdated.

Sorry for the long rant!

@Martimpassos I'm going to start with updating Help:Toolforge/Pywikibot (T322364), then I'll circle back here with answers.

  • My first issue was with the location of the pwb.py wrapper. Following step 5 doesn't work, even when using release 7.0.0 scripts location. What worked for me was calling python3 /data/project/shared/pywikibot/core/pwb.py generate_user_files. Since the files' location has changed and that's where new users will find them, should there still be a caution alert or should step 5 be updated with the new path and recommend calling pwb.py with arguments rather than the utility scripts directly?

The doc now uses pwb.

  • What is the difference between /stable/ and /core/? The caution alert points to core whereas the rest of the guide points to stable

stable is the git branch that corresponds to the latest released version, and core is the git development branch. Both locations are updated daily from the corresponding git branch.

  • The guide states that using the shared files is recommended whenever possible, but the examples for setting up a virtual environment use the local installation scenario. This is where I had the most trouble: how do I install custom dependencies and use the shared files? Or should anyone with extra dependencies install pywikibot locally? In the virtual environment bash script example one cds to $HOME/pywikibot and pip3 installs their dependencies with the -e flag. What if the user does not have the local pywikibot folder? What worked for me was simply installing all my dependencies together in the first pip3 install line, and ignoring the change of directory and editable installs. Should there be two extra-dependencies examples, one for local installation and one for shared files?

At this point, there is less value in using the shared files since you will still have to install pywikibot's dependencies. These are available globally on the grid but aren't in k8s. Generally, T249787 is the solution.
If the local install directions were followed from the beginning, the first step was to git clone pywikibot into $HOME/pywikibot.

  • Does one still need the .pywikibot folder? Running generate_user_files creates them in the $HOME directory, should they be moved?

Not necessarily. The configuration files can go in multiple locations, including .pywikibot. See Manual:Pywikibot/user-config.py.
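As a rough illustration of "multiple locations", the lookup can be pictured like the sketch below. The order shown (an explicit -dir value, then the PYWIKIBOT_DIR environment variable, then the current directory if it contains user-config.py, then ~/.pywikibot) is a simplification written from memory, and guess_config_dir is a hypothetical helper, not Pywikibot code. The manual page is the authoritative description.

```python
from pathlib import Path
from typing import Mapping, Optional


def guess_config_dir(cli_dir: Optional[str], env: Mapping[str, str],
                     cwd: Path, home: Path) -> Path:
    """Guess which directory Pywikibot would read user-config.py from.

    Simplified: other fallbacks (for example platform-specific folders)
    are deliberately left out of this sketch.
    """
    if cli_dir:  # value passed as -dir:<path> on the command line
        return Path(cli_dir)
    if env.get("PYWIKIBOT_DIR"):  # environment variable override
        return Path(env["PYWIKIBOT_DIR"])
    if (cwd / "user-config.py").exists():  # config next to where you run
        return cwd
    return home / ".pywikibot"  # default folder under the home directory
```

Calling guess_config_dir(None, os.environ, Path.cwd(), Path.home()) from the directory where the job starts gives a quick hint about which copy of the configuration is likely to be picked up.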

I do not recommend using a .pth file.

It is for pywikibot. See Manual:Pywikibot/Page Generators.

Furthermore, I ended up here after understanding PAWS isn't appropriate for long-running scripts, but have tried using Pywikibot locally on WSL2 and Docker without success. Does documentation for either exist? I found a WSL hack but it looks outdated.

I don't use docker, but I do my development in Ubuntu 22.04 in WSL2. I'm guessing you are running into issues with the user-config.py file. It must be owned by the user running the script and not writable by other users. WSL used to have issues with default file permissions - not sure if that is still the case. I have the below in my ~/.profile plus I remember making some other adjustment for file permissions (don't recall if that was just for WSL1). https://learn.microsoft.com/en-us/windows/wsl/file-permissions has information about other ways to adjust file permissions.

if [[ "$(umask)" = "0000" ]]; then
    umask 0022
fi
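To see whether a given user-config.py would trip over the two conditions described above (owned by the user running the script, not writable by other users), a small check like the following can help. It is a sketch of a similar test using only the standard library, not Pywikibot's actual implementation, and the function name is made up.

```python
import os
import stat


def config_permission_problems(path: str) -> list[str]:
    """Return a list of problems with the ownership or mode of `path`."""
    info = os.stat(path)
    problems = []
    # os.getuid() is not available on native Windows, hence the guard.
    if hasattr(os, "getuid") and info.st_uid != os.getuid():
        problems.append("not owned by the current user")
    # S_IWOTH is the "write" bit for users who are neither owner nor group.
    if info.st_mode & stat.S_IWOTH:
        problems.append("writable by others")
    return problems
```

An empty list means the file passes both checks; on WSL, a file that reports "writable by others" can usually be fixed with chmod 600 once the umask is sane.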

Hi @JJMC89, thanks for the clarifications and edits, the page looks better. I'll look into the WSL permission when I have a chance.

KBach changed the task status from Open to In Progress. Jan 11 2023, 12:23 PM
KBach claimed this task.

I'll look into this as part of T322217.

KBach removed KBach as the assignee of this task. Jun 14 2023, 12:03 PM

I added three missing sections to https://wikitech.wikimedia.org/wiki/Help:Toolforge/Pywikibot - Prerequisites, OAuth, and Next steps. For the remaining content requested in this ticket:

  • Using shared Pywikibot code on Toolforge is, to my knowledge, no longer recommended.
  • Examples of running a provided and a custom script, and of using a scheduled Kubernetes job are already covered.
  • Grid Engine is deprecated so there is no need for an example there.

Unless I missed anything, I think this task is ready to be resolved.

KBach claimed this task.

Marking as resolved. Please feel free to reopen if you disagree.