Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes
Closed, Resolved; Public

Assigned To
Authored By
komla
Oct 6 2022, 8:05 PM
Referenced Files
F43340986: image.png
Mar 25 2024, 3:39 PM
F42577439: image.png
Mar 12 2024, 4:44 PM
F42446625: image.png
Mar 7 2024, 5:47 PM
F42398119: image.png
Mar 5 2024, 11:14 AM
F42395781: image.png
Mar 4 2024, 3:37 PM
F41942795: image.png
Feb 17 2024, 8:59 AM

Description

Kindly migrate your tool (https://grid-deprecation.toolforge.org/t/mbh) from Toolforge GridEngine to Toolforge Kubernetes.

Toolforge GridEngine is getting deprecated.
See: https://techblog.wikimedia.org/2022/03/14/toolforge-and-grid-engine/

Please note that a volunteer may perform this migration if this has not been done after some time.
If you have already migrated this tool, kindly mark this as resolved.

If you would rather shut down this tool, kindly do so and mark this as resolved.

Useful Resources:
Migrating Jobs from GridEngine to Kubernetes
https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework#Grid_Engine_migration
Migrating Web Services from GridEngine to Kubernetes
https://wikitech.wikimedia.org/wiki/News/Toolforge_Stretch_deprecation#Move_a_grid_engine_webservice
Python
https://wikitech.wikimedia.org/wiki/News/Toolforge_Stretch_deprecation#Rebuild_virtualenv_for_python_users

Event Timeline

This is my development environment. Could you explain, WHERE I should type dotnet sln add web-service/<project-name>/<project-name>.csproj here?

image.png (1×2 px, 497 KB)

Sorry for the confusion, you don't need to: Visual Studio executes the command itself when you click the right options, so you don't have to go to the shell and run it yourself.

Essentially, using Visual Studio you should be able to follow the tutorial and add projects to the solution file.
I don't have Windows and have zero experience with Visual Studio, so I can't help you more than that.

If you want to try using the command line, you can still download and install the dotnet SDK, which comes with a command line client for Windows too:
https://learn.microsoft.com/en-us/dotnet/core/install/windows?tabs=net80 - how to install
https://learn.microsoft.com/en-us/dotnet/core/tools/ - Windows command line utility documentation
https://www.wikihow.com/Open-Terminal-in-Windows - how to open a command line terminal in Windows
https://www.youtube.com/watch?v=Amvq1WbmFmA - first result when searching for a tutorial on YouTube (if you prefer videos, at around 00:30 is where they open the console and run dotnet)

Now I have more than 10 csproj files with exactly the same content as https://github.com/Saisengen/wikibots/blob/main/web-services/clusters5/clusters5.csproj

Could we use one file for all of them?

Unfortunately I don't think so. I'm no expert in dotnet, but from what I've been able to find out, you'll need one csproj file for each binary you want to generate (which currently matches each *.cs file, afaik).

An option would be to use a web framework instead of cgi-bin scripts; then you would be able to compile just one binary, with each endpoint implementing the functionality, but that requires a bigger refactor of the code. For example, using ASP.NET (ex. https://dotnet.microsoft.com/en-us/learn/aspnet/blazor-tutorial/intro).

Oh, it looks like I understand. I should use the native .sln files from the project folders on my PC, since they contain these IDs. But I have many dozens of .sln files, one per project/bot, and you use one for all bots. I'll try to gather the info about all bots into one file.

Since I have a .sln file in every bot/project folder on my PC, just like the .csproj file, maybe it would be better to use these native .sln files, one for every tool, instead of combining them into one big file? Could you modify the build process to use the .sln files from the project folders, like the .csproj files? I will put the .sln file from each project folder on my PC into the matching GitHub folder, like https://github.com/Saisengen/wikibots/tree/main/web-services/pages-wo-iwiki

image.png (230×168 px, 6 KB)

That complicates the build process a bit, as it now has to find all the .sln files and compile them one by one, while before a single command would compile everything (dotnet build at the root of the repo), and that seems to be the way dotnet 'groups' projects.
Is there any reason not to use the global all.sln file? (you can even use both)

Btw, you can do whatever you want, but I might not have time to help you tweak the build script, or it might take me a while to get to it, especially if it's not blocking anything.

@MBH let me know if/when you have a patch moving all the .sln files, and I'll try to help you with the compile.sh script.
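
For illustration, the "find every .sln and compile them one by one" variant discussed above could be sketched roughly as follows. This is only a sketch under assumptions: the function name build_all_solutions and the BUILD_CMD override are hypothetical (they are not taken from the repository's compile.sh), and solution paths are assumed to contain no newlines.

```shell
#!/bin/sh
# Build every .sln file found under a directory, one at a time.
# BUILD_CMD is a hypothetical override so the loop can be dry-run without
# the .NET SDK installed; by default the real "dotnet build" is used.
build_all_solutions() {
    root="${1:-.}"
    find "$root" -name '*.sln' | sort | (
        status=0
        while IFS= read -r sln; do
            echo "==> building $sln"
            # Remember a failure but keep going, so one broken project
            # does not hide the build results of the others.
            ${BUILD_CMD:-dotnet build} "$sln" || status=1
        done
        exit "$status"
    )
}
```

Running build_all_solutions web-services would then build each solution in turn and return a non-zero status if any of them failed; BUILD_CMD=echo build_all_solutions web-services only lists what would be built.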

I might not have time to help you tweak the build script or might take me a while to get to it, specially if it's not blocking anything.

I understand it. I really appreciate you spending a lot of time to help me deal with my very specific case.

But now I have a problem again. I executed a command from your manual on Toolforge:
toolforge build start https://github.com/Saisengen/wikibots
Build was done without errors, as far as I can see.

But after that, nothing changed in the cgi-bin folder on Toolforge: it contains the three old projects you compiled on 03.03.2024 and no new files. You added 10 new projects to the all.sln file in my repo, and I used this file, but none of these projects were compiled.

Yep, with the latest changes it does not use anything from NFS anymore, so the cgi-bin on the tool home is not used at all. If you look at the logs for the build, you should see the compilation logs there.

Restarting the webservice should pull in the new image (toolforge webservice restart), I can take a look if that does not work.

I did it

tools.mbh@tools-sgebastion-10:~$ toolforge webservice restart
Restarting..............................
Your webservice is taking quite while to restart. If it isn't up shortly, run a 'webservice stop' and the start command used to run this webservice to begin with.
tools.mbh@tools-sgebastion-10:~$ toolforge webservice restart
Restarting......................

After that I did ws stop and ws start.
Before that, I renamed the category-pathfinder project to cpf, fixed the path to the executable in the html form, and fixed the project name in all.sln. After that

it will not use anything from the NFS anymore

So the only working way to put something into my public_html (I mean not the NFS folder, but the space visible from the Internet on my Toolforge subdomain) is now through GitHub? Previously I sometimes generated a new version of the data file for the cluster analysis of ruwiki's elections and just moved it to my public_html folder on Toolforge (several web tools read this file). And sometimes I put big .txt files for ruwiki into my public_html folder. Should I now add every update of these files to GitHub and build an image with the new versions every time? If yes, it would be great if you could improve the compile script to pick up all .txt files from my GitHub root folder and add them to the image.

For static files, you can still use your public_html folder, but you'll have to access it through tools-static (https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web#Serving_static_files), for example https://tools-static.wmflabs.org/mbh/test.txt (just created the www/static directory on your tool with a test file).

Looking....

It's running the non-buildservice webservice, starting it with the buildservice image fails with crashloop, looking...

Found it: there's a bug in the script, where it does exec 2 >>... It should not have a space between 2 and >>, and the same goes for exec 1 >>...

There you go: https://github.com/Saisengen/wikibots/pull/4

Tested that with the envvar set too (so it goes over the exec statements).
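
The effect of that stray space can be reproduced in isolation. A minimal sketch, assuming a POSIX shell; the function names and the log file argument are hypothetical and are not taken from the actual script:

```shell
#!/bin/sh
# Correct form: "2>>" is a single redirection operator, so exec runs no
# command and only re-points stderr of the current shell to the file.
redirect_stderr_to() {
    exec 2>>"$1"
}

# Broken form, run in a child shell so the caller survives: with the space,
# "2" is parsed as the command to exec and ">>" redirects stdout instead.
# No program named "2" is assumed to be on PATH, so the child shell exits
# at that line and never reaches the echo.
run_broken_redirect() {
    sh -c 'exec 2 >>/dev/null; echo "still running"' 2>/dev/null
}
```

Calling redirect_stderr_to in a subshell and then writing to stderr appends to the given file, while run_broken_redirect prints nothing and returns a non-zero status. That matches a start script that dies as soon as it reaches the exec line.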

For static files, you can still use your public_html folder, but you'll have to access it through tools-static

And will my tools be able to read this file from the new folder? They contain code like this: var reader = new StreamReader("elections.txt");, and the elections.txt file is simply in the old public_html folder. Should I move it to the folder you pointed to, and will the tools read it from there?

They will not, but you can do something like (untested):

// this points to the tool home
var dataDir = Environment.GetEnvironmentVariable("TOOL_DATA_DIR")
    ?? throw new InvalidOperationException("TOOL_DATA_DIR is not set");
// now to the static directory
var electionsPath = Path.Combine(dataDir, "www", "static", "elections.txt");
var reader = new StreamReader(electionsPath);

Hmmm, now both links https://mbh.toolforge.org/cgi-bin/cpf and https://mbh.toolforge.org/cgi-bin/category-pathfinder respond with

image.png (158×647 px, 20 KB)

Toolforge acts like this when webservice is not running, but

tools.mbh@tools-sgebastion-10:~/public_html$ webservice start
Your job is already running
tools.mbh@tools-sgebastion-10:~/public_html$ toolforge webservice start
Your job is already running

Works now, may be a random failure.

No, the problem persists: the three tools you initially added to the .sln file work now, but the ~seven new ones you added to the .sln later respond with No webservice.

From the error.log file:

192.168.36.67 - - [08/Mar/2024 11:27:17] "GET /cgi-bin/pages-wo-iwiki HTTP/1.1" 200 -
Unhandled exception. System.IO.FileNotFoundException: Could not find file '/workspace/pages-wo-iwiki.html'.
File name: '/workspace/pages-wo-iwiki.html'
   at Interop.ThrowExceptionForIoErrno(ErrorInfo errorInfo, String path, Boolean isDirError)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String path, OpenFlags flags, Int32 mode, Boolean failForSymlink, Boolean& wasSymlink, Func`4 createOpenException)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, UnixFileMode openPermissions, Int64& fileLength, UnixFileMode& filePermissions, Boolean failForSymlink, Boolean& wasSymlink, Func`4 createOpenException)
   at System.IO.Strategies.OSFileStreamStrategy..ctor(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable`1 unixCreateMode)
   at System.IO.StreamReader.ValidateArgsAndOpenPath(String path, Encoding encoding, Int32 bufferSize)
   at System.IO.StreamReader..ctor(String path)
   at Program.sendresponse(String sourcewiki, String category, String template, String targetwiki, String type, String pagetype, String sort, Boolean wikilist, Boolean wikitable, Int32 depth, Int32 miniwiki, String result) in /workspace/web-services/pages-wo-iwiki/pages-wo-iwiki.cs:line 54
   at Program.Main() in /workspace/web-services/pages-wo-iwiki/pages-wo-iwiki.cs:line 154
192.168.36.67 - - [08/Mar/2024 11:27:17] CGI script exit code -6

It seems you forgot to change the code to load the html file relative to the script:

// replace this
        var sr = new StreamReader("pages-wo-iwiki.html");
// with
        // Get the path of the executable, and load the html file from that same directory
        string strExeFilePath = System.Reflection.Assembly.GetExecutingAssembly().Location;
        string? strWorkPath = Path.GetDirectoryName(strExeFilePath);
        string strHtmlPath = Path.Combine(strWorkPath!, "pages-wo-iwiki.html");
        var sr = new StreamReader(strHtmlPath);

From the error.log file too:

192.168.36.67 - - [08/Mar/2024 11:26:39] "GET /cgi-bin/patstats HTTP/1.1" 200 -
Unhandled exception. System.IO.FileNotFoundException: Could not find file '/workspace/patstats.html'.
File name: '/workspace/patstats.html'
   at Interop.ThrowExceptionForIoErrno(ErrorInfo errorInfo, String path, Boolean isDirError)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String path, OpenFlags flags, Int32 mode, Boolean failForSymlink, Boolean& wasSymlink, Func`4 createOpenException)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, UnixFileMode openPermissions, Int64& fileLength, UnixFileMode& filePermissions, Boolean failForSymlink, Boolean& wasSymlink, Func`4 createOpenException)
   at System.IO.Strategies.OSFileStreamStrategy..ctor(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable`1 unixCreateMode)
   at System.IO.StreamReader.ValidateArgsAndOpenPath(String path, Encoding encoding, Int32 bufferSize)
   at System.IO.StreamReader..ctor(String path)
   at Program.Sendresponse(String type, String project, String startdate, String enddate, String sort, String result) in /workspace/web-services/patstats/patstats.cs:line 57
   at Program.Main() in /workspace/web-services/patstats/patstats.cs:line 88
192.168.36.67 - - [08/Mar/2024 11:26:40] CGI script exit code -6

Same as pages-wo-iwiki.

Same as pages-wo-iwiki for loading the html (checking the code, no logs here).

I think it hangs because it expects input:

string input = Console.ReadLine();

If I use the url with parameters https://mbh.toolforge.org/cgi-bin/cite2template?something=something I get the error for the html:

Unhandled exception. System.IO.FileNotFoundException: Could not find file '/workspace/cite2template.html'.
File name: '/workspace/cite2template.html'
   at Interop.ThrowExceptionForIoErrno(ErrorInfo errorInfo, String path, Boolean isDirError)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String path, OpenFlags flags, Int32 mode, Boolean failForSymlink, Boolean& wasSymlink, Func`4 createOpenException)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, UnixFileMode openPermissions, Int64& fileLength, UnixFileMode& filePermissions, Boolean failForSymlink, Boolean& wasSymlink, Func`4 createOpenException)
   at System.IO.Strategies.OSFileStreamStrategy..ctor(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable`1 unixCreateMode)
   at System.IO.StreamReader.ValidateArgsAndOpenPath(String path, Encoding encoding, Int32 bufferSize)
   at System.IO.StreamReader..ctor(String path)
   at Program.Sendresponse(String source, Boolean addauthor, String author, String result) in /workspace/web-services/cite2template/cite2template.cs:line 11
   at Program.Main() in /workspace/web-services/cite2template/cite2template.cs:line 24
192.168.166.48 - - [08/Mar/2024 11:33:39] CGI script exit code -6
192.168.254.194 - - [08/Mar/2024 11:33:57] "GET /cgi-bin/cite2template?something=something HTTP/1.1" 200 -

So yep, that script is not able to handle requests without parameters (that's not new though). You can use the same strategy as in the compare script: read the QUERY_STRING environment variable instead of using ReadLine.

Same as pages-wo-iwiki, using the wrong path for the html:

192.168.36.67 - - [08/Mar/2024 11:37:23] "GET /cgi-bin/compare HTTP/1.1" 200 -
Unhandled exception. System.IO.FileNotFoundException: Could not find file '/workspace/compare.html'.
File name: '/workspace/compare.html'
   at Interop.ThrowExceptionForIoErrno(ErrorInfo errorInfo, String path, Boolean isDirError)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String path, OpenFlags flags, Int32 mode, Boolean failForSymlink, Boolean& wasSymlink, Func`4 createOpenException)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, UnixFileMode openPermissions, Int64& fileLength, UnixFileMode& filePermissions, Boolean failForSymlink, Boolean& wasSymlink, Func`4 createOpenException)
   at System.IO.Strategies.OSFileStreamStrategy..ctor(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable`1 unixCreateMode)
   at System.IO.StreamReader.ValidateArgsAndOpenPath(String path, Encoding encoding, Int32 bufferSize)
   at System.IO.StreamReader..ctor(String path)
   at Program.Sendresponse(String page, String result, Boolean loadfromtool) in /workspace/web-services/compare/compare.cs:line 11
   at Program.Main() in /workspace/web-services/compare/compare.cs:line 29
192.168.36.67 - - [08/Mar/2024 11:37:23] CGI script exit code -6

And same here too:

192.168.254.194 - - [08/Mar/2024 11:39:11] "GET /cgi-bin/unreviewed-pages HTTP/1.1" 200 -
Unhandled exception. System.IO.FileNotFoundException: Could not find file '/workspace/unreviewed-pages.html'.
File name: '/workspace/unreviewed-pages.html'
   at Interop.ThrowExceptionForIoErrno(ErrorInfo errorInfo, String path, Boolean isDirError)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String path, OpenFlags flags, Int32 mode, Boolean failForSymlink, Boolean& wasSymlink, Func`4 createOpenException)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, UnixFileMode openPermissions, Int64& fileLength, UnixFileMode& filePermissions, Boolean failForSymlink, Boolean& wasSymlink, Func`4 createOpenException)
   at System.IO.Strategies.OSFileStreamStrategy..ctor(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable`1 unixCreateMode)
   at System.IO.StreamReader.ValidateArgsAndOpenPath(String path, Encoding encoding, Int32 bufferSize)
   at System.IO.StreamReader..ctor(String path)
   at Program.sendresponse(String wiki, String cat, String template, Int32 depth, String result) in /workspace/web-services/unreviewed-pages/unreviewed-pages.cs:line 24
   at Program.Main() in /workspace/web-services/unreviewed-pages/unreviewed-pages.cs:line 89

I think that's all the errors :)

Note that if you have podman or docker on your laptop, you should be able to test the built image locally with:

dcaro@urcuchillay$ podman pull tools-harbor.wmcloud.org/tool-mbh/tool-mbh:latest  # just making sure you have the latest
dcaro@urcuchillay$ podman run --rm -ti --publish 8000:8000 --env PORT=8000 tools-harbor.wmcloud.org/tool-mbh/tool-mbh:latest

Thank you very much, but it doesn't solve some of the errors.

I did the relative path replacement you suggested in two tools

I built the projects again, but after that both tools return the same error: Could not find file '/workspace/pages-wo-iwiki.html'. After several tries, I stopped and started the webservice; it's running now, but both tools now return 404.

Also, I can't move the data file elections.txt from my PC to the www\static folder due to a permission error, and I can't change the owner of this folder from tools.mbh to mbh due to a permission error. Could you change the owner or group of this folder to mbh? I moved these files from the old public_html folder using mv, but it would be easier for me to upload files to the new folder directly from my PC (I use WinSCP).

You can use the same strategy as you use in the compare script of using QUERY_STRING environment variable instead of ReadLine.

This tool (and several others) uses ReadLine because it receives big input strings and thus uses the POST method instead of GET. This was intentional (GET fails, because the URL string can't be longer than 4096 bytes). How could our tools receive POST request data?

OK, I figured it out: I should use webservice buildservice start --mount all instead of just webservice start.

A tool that uses a DB replica connection can't locate the password file, so it looks like I'm forced to use a password envvar...

OK, I have solved most of the errors, but got a new one. I added the last 5 tools to all.sln manually

ran the build and got this

[step-build] 2024-03-10T14:40:31.167168720Z Script ./compile.sh handling the post-update-cmd event returned with error code 1
[step-build] 2024-03-10T14:40:31.175133748Z
[step-build] 2024-03-10T14:40:31.175211160Z [Error: Failed to install dependencies]
[step-build] 2024-03-10T14:40:31.175231152Z Dependency installation failed!
[step-build] 2024-03-10T14:40:31.175234821Z
[step-build] 2024-03-10T14:40:31.175238924Z The 'composer install' process failed with an error. The cause
[step-build] 2024-03-10T14:40:31.175242196Z may be the download or installation of packages, or a pre- or
[step-build] 2024-03-10T14:40:31.175244954Z post-install hook (e.g. a 'post-install-cmd' item in 'scripts')
[step-build] 2024-03-10T14:40:31.175248276Z in your 'composer.json'.
[step-build] 2024-03-10T14:40:31.175250938Z
[step-build] 2024-03-10T14:40:31.175253559Z Typical error cases are out-of-date or missing parts of code,
[step-build] 2024-03-10T14:40:31.175256277Z timeouts when making external connections, or memory limits.
[step-build] 2024-03-10T14:40:31.175258681Z
[step-build] 2024-03-10T14:40:31.175261938Z Check the above error output closely to determine the cause of
[step-build] 2024-03-10T14:40:31.175264691Z the problem, ensure the code you're pushing is functioning
[step-build] 2024-03-10T14:40:31.175267344Z properly, and that all local changes are committed correctly.
[step-build] 2024-03-10T14:40:31.175269779Z
[step-build] 2024-03-10T14:40:31.175272474Z For more information on builds for PHP on Heroku, refer to
[step-build] 2024-03-10T14:40:31.175282592Z https://devcenter.heroku.com/articles/php-support
[step-build] 2024-03-10T14:40:31.177132777Z ERROR: failed to build: exit status 1

I built the projects again, but after that both tools return the same error: Could not find file '/workspace/pages-wo-iwiki.html'. After several tries, I stopped and started the webservice; it's running now, but both tools now return 404.

That script seems to be running well now, I guess you were able to make it work?

Also, I can't move the data file elections.txt from my PC to the www\static folder due to a permission error, and I can't change the owner of this folder from tools.mbh to mbh due to a permission error. Could you change the owner or group of this folder to mbh? I moved these files from the old public_html folder using mv, but it would be easier for me to upload files to the new folder directly from my PC (I use WinSCP).

All you need is to give the group write permissions, as your user account (mbh) is part of the tool group account (tools.mbh). I just did it myself from the tools.mbh account (chmod g+w www/static), can you try again?

This tool (and several others) uses ReadLine because it receives big input strings and thus uses the POST method instead of GET. This was intentional (GET fails, because the URL string can't be longer than 4096 bytes). How could our tools receive POST request data?

Usually web frameworks take care of all that xd
In this specific case, it seems that the python CGIHTTPServer defines the REQUEST_METHOD environment variable (https://github.com/enthought/Python-2.7.3/blob/master/Lib/CGIHTTPServer.py#L164C14-L164C28), so you can read that one, check if it's POST, and if it is, then use ReadLine.
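
To make the CGI contract behind that suggestion concrete, here is a small shell sketch. It is illustrative only: the function name read_cgi_input is hypothetical, it assumes the server also sets the standard CGI variable CONTENT_LENGTH for POST requests, and it relies on head -c, which is widely available but not strictly POSIX. The same branching applies in the C# tools, reading REQUEST_METHOD from the environment and falling back to QUERY_STRING when the request is not a POST.

```shell
#!/bin/sh
# Return the raw request parameters the way a CGI server hands them over:
# GET data arrives in QUERY_STRING, POST data arrives on standard input.
read_cgi_input() {
    if [ "${REQUEST_METHOD:-GET}" = "POST" ]; then
        # CONTENT_LENGTH says how many bytes of request body to read;
        # reading exactly that many avoids blocking on an open stdin.
        head -c "${CONTENT_LENGTH:-0}"
    else
        printf '%s' "${QUERY_STRING:-}"
    fi
}
```

With this shape a request without any parameters yields an empty string instead of hanging on a read from stdin, which is the symptom described above for cite2template.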

OK, I figured it out: I should use webservice buildservice start --mount all instead of just webservice start.

A tool that uses a DB replica connection can't locate the password file, so it looks like I'm forced to use a password envvar...

When using --mount=all, the replica password file is under $TOOL_DATA_DIR/replica.my.cnf, note the $TOOL_DATA_DIR instead of a hardcoded path or $HOME. In any case, using the envvars is the recommended way (avoiding NFS, even if you mount it, the less usage of NFS the better).

OK, I have solved most of the errors, but got a new one. I added the last 5 tools to all.sln manually

The last build passed, were you able to sort it out?

tools.mbh@tools-sgebastion-10:~$ toolforge build list
build_id                          status    start_time            end_time              source_url                             ref    envvars    destination_image
mbh-buildpacks-pipelinerun-l8ggw  ok        2024-03-10T22:54:30Z  2024-03-10T22:57:37Z  https://github.com/Saisengen/wikibots  N/A    N/A        tools-harbor.wmcloud.org/tool-mbh/tool-mbh:latest

OK, that's what I know about all of this now.

I have also written a very simple tool that just displays the keys and values of all envvars seen by the tool: https://github.com/Saisengen/wikibots/blob/main/web-services/test/test.cs . Earlier I ran the same tool from the grid and it worked. Now it doesn't compile due to the same dependency error, although it doesn't use any external libraries. Copying my local .csproj to https://github.com/Saisengen/wikibots/blob/main/web-services/test/test.csproj doesn't help.

When using --mount=all, the replica password file is under $TOOL_DATA_DIR/replica.my.cnf

I mean my own password file, ..\..\p, in my own format. I don't use replica.my.cnf because my file format is much easier to parse. I understand that it doesn't work this way on k8s, so I switched to an envvar. But using an envvar doesn't simplify password file management for me; on the contrary, it complicates matters, because the only way I know to update a multi-line envvar is to load it from a file (cat p | envvar), so I still need to have the password file in my NFS folder.

The last build passed, were you able to sort it out?

No, I just removed the problematic projects from all.sln so that the working projects compile, to make at least some of the tools work.

(chmod g+w www/static), can you try again?

It works, thanks for explanation.

OK, that's what I know about all of this now.

  • After building, the webservice should be stopped and started again; otherwise the old compiled version of the tools keeps running.

Yep, it has to pull the new image.

I have also written a very simple tool that just displays the keys and values of all envvars seen by the tool: https://github.com/Saisengen/wikibots/blob/main/web-services/test/test.cs . Earlier I ran the same tool from the grid and it worked. Now it doesn't compile due to the same dependency error, although it doesn't use any external libraries. Copying my local .csproj to https://github.com/Saisengen/wikibots/blob/main/web-services/test/test.csproj doesn't help.

Careful with that, it might expose passwords to the DBs and secrets, try to show only a few vars instead of showing all. Let me have a look at the build logs...
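
One way to follow that advice is to print an explicit allow-list rather than the whole environment. A rough shell sketch, assuming printenv is available (it is part of coreutils but not POSIX); the function name show_selected_vars is hypothetical, and the same idea carries over to the C# tool by looking up a fixed list of names instead of enumerating everything:

```shell
#!/bin/sh
# Print only the environment variables named on the command line, so that
# secrets present in the environment are never dumped by accident.
show_selected_vars() {
    for name in "$@"; do
        # printenv prints the value and fails if the variable is not set
        # (it only sees exported variables).
        if value=$(printenv "$name"); then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is not set\n' "$name"
        fi
    done
}
```

For example, show_selected_vars PORT TOOL_DATA_DIR would report just those two names and nothing else.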

When using --mount=all, the replica password file is under $TOOL_DATA_DIR/replica.my.cnf

I mean my own password file, ..\..\p, in my own format. I don't use replica.my.cnf because my file format is much easier to parse. I understand that it doesn't work this way on k8s, so I switched to an envvar. But using an envvar doesn't simplify password file management for me; on the contrary, it complicates matters, because the only way I know to update a multi-line envvar is to load it from a file (cat p | envvar), so I still need to have the password file in my NFS folder.

Oh true xd, I forgot about that one. I would recommend splitting it into different envvar variables for ease of usage and maintenance (so it would not be multiline).

Found the issue with the compilation, the logs are a bit hidden in the pile of logs as the colors get stripped out too :/

[step-build] 2024-03-11T17:26:31.159806837Z /layers/fagiani_apt/apt/usr/lib/dotnet/sdk/8.0.102/Microsoft.Common.CurrentVersion.targets(1241,5): error MSB3644: The reference assemblies for .NETFramework,Version=v4.8 were not found. To resolve this, install the Developer Pack (SDK/Targeting Pack) for this framework version or retarget your application. You can download .NET Framework Developer Packs at https://aka.ms/msbuild/developerpacks [/workspace/web-services/clusters5/clusters5.csproj]

The issue is that you are forcing it to be .NET Framework v4.x, while the one installed in the image is the latest stable (v8). You can try just removing the restrictions in the csproj files:

06:28 PM ~/Work/wikimedia/user_tools/wikibots (main|✔)
dcaro@urcuchillay$ git grep TargetFrameworkVersion
web-services/clusters5/clusters5.csproj:    <TargetFrameworkVersion>v4.8</TargetFrameworkVersion>
web-services/test/test.csproj:    <TargetFrameworkVersion>v4.7.2</TargetFrameworkVersion>

That does not seem to be enough though :/, I recreated the full csproj file bit by bit, and this seems to work:

dcaro@urcuchillay$ cat web-services/clusters5/clusters5.csproj 
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Newtonsoft.Json" Version="13.0.0.0"/>
  </ItemGroup>
</Project>

Thanks. But I added clusters5 and test to all.sln, replaced the .csproj of both projects with the code from your last message, ran the build, and it failed with the same error.

This time the error is different:

[step-build] 2024-03-11T18:01:42.461081953Z ## Gathering web-services/test
[step-build] 2024-03-11T18:01:43.007834016Z cp: missing destination file operand after '/layers/heroku_php/wikibots/public_html/cgi-bin/'

This is because the compile script expects each webservice to have some html files. I just sent a PR that drops that expectation and adapts the test.cs script so it doesn't leak credentials (https://github.com/Saisengen/wikibots/pull/5).

Thanks. I merged your patch, successfully built all projects, stopped and started the webservice (buildservice, mount=all), and now all tools are inaccessible with ERR_EMPTY_RESPONSE or No webservice.

It seems something went wrong, looking:

tools.mbh@tools-sgebastion-10:~$ toolforge webservice logs
2024-03-12T13:30:07+00:00 [mbh-7cd857496-x88cw] Unable to find cgi-bin directory at /layers/heroku_php/wikibots/public_html, something went wrong.

I see, the compile script did not work as expected xd

tools.mbh@tools-sgebastion-10:~$ toolforge build logs
...
[step-build] 2024-03-12T13:06:00.986898628Z Local build, you can find the output at /tmp/wikibots_build/wikibots
...

It thinks it's working in a local development environment.

@MBH this is the fix https://github.com/Saisengen/wikibots/pull/6

I rebuilt your webservice from that branch, and it's working as expected (I think, please verify). Make sure to rebuild and restart the webservice once you merge it to verify the build still works for you (it should).

Oh, and be very very careful with the test.cs webservice, I'd recommend removing it as soon as you finish testing, as it might expose secrets.

Thanks, it looks like it works now (I still have some questions, but I will ask them later).

Could you also help another ruwiki user with transferring his tool (T320164)? We don't understand what exactly we should do to run his pywikibot scripts.

Remind me, how do I rebuild only one tool? Will it be faster than rebuilding all tools, and will the other tools remain in the image after that?

Another question: why does https://github.com/Saisengen/wikibots/blob/main/all.sln contain a web-services project when it doesn't have a .csproj file?

And about the tool that receives POST data: I changed its code to use ReadLine only when it receives a POST ( https://github.com/Saisengen/wikibots/commit/f10cf30c355cc30dd24445f9b6392e4301b2aafe ). Now it opens initially (https://mbh.toolforge.org/cgi-bin/cite2template), but when I enter anything and run it, it freezes again; it looks like it still can't read the POST data.

And what is the difference between the TOOL_REPLICA and TOOL_TOOLSDB envvars? They are currently equal for my tool.

And (this is a screenshot from the build log) why are some tools built into a .dll and others into /publish? What's the difference?

image.png (162×966 px, 27 KB)

Remind me, how do I rebuild only one tool? Will it be faster than rebuilding all tools, and will the other tools remain in the image after that?

Currently there's no way to rebuild only one tool; all of them are bundled in the image (the slowest step is not the build step though, it's the apt installation iirc, so it would save only a little time).

Another question: why does https://github.com/Saisengen/wikibots/blob/main/all.sln contain a web-services project when it doesn't have a .csproj file?

I don't know, I think that entry could be removed (untested though).

And about the tool that receives POST data: I changed its code to use ReadLine only when it receives a POST ( https://github.com/Saisengen/wikibots/commit/f10cf30c355cc30dd24445f9b6392e4301b2aafe ). Now it opens initially (https://mbh.toolforge.org/cgi-bin/cite2template), but when I enter anything and run it, it freezes again; it looks like it still can't read the POST data.

I would have to look into this in detail, are you sending a newline at the end of your POST data? (readline might be waiting for it)

And what is the difference between the TOOL_REPLICA and TOOL_TOOLSDB envvars? They are currently equal for my tool.

The replicas and toolsdb are two different installations. Currently we create the same user and password on both, but that might change in the future. If you make sure to use the right one (the replica one for the replicas, and the toolsdb one for toolsdb), you will not need to change your code if that ever happens (it might not).

And (this is a screenshot from the build log) why are some tools built into a .dll and others into /publish? What's the difference?

I think it's just that, for projects that have dependencies, dotnet prints the publish directory instead of listing each dll file (one for the project and one per dependency), and for projects without dependencies it prints the project's dll file.

And what is the difference between the TOOL_REPLICA and TOOL_TOOLSDB envvars? They are currently equal for my tool.

Future proofing is the only difference today. Currently all tools have the same username and password for the two separate database services. This is not guaranteed to be the case 6, 12, 18, 24 months from now so it was decided that we should expose separate envvars for each environment. The intent is to allow tool maintainers to defend against needing to update their code when the credentials change for one database service or the other.
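
As a rough sketch of that pattern, the credentials can be selected by service name instead of being hard-coded once. The full variable names below (TOOL_REPLICA_USER, TOOL_REPLICA_PASSWORD, TOOL_TOOLSDB_USER, TOOL_TOOLSDB_PASSWORD) are an assumption built from the two prefixes mentioned above, and db_credentials is a made-up helper name; check the names your tool actually sees:

```shell
#!/bin/sh
# db_credentials SERVICE prints "user<TAB>password" for that database service.
# Assumption: the envvars are named TOOL_<SERVICE>_USER and
# TOOL_<SERVICE>_PASSWORD; the helper name is hypothetical.
db_credentials() {
    case "$1" in
        replica) printf '%s\t%s\n' "$TOOL_REPLICA_USER" "$TOOL_REPLICA_PASSWORD" ;;
        toolsdb) printf '%s\t%s\n' "$TOOL_TOOLSDB_USER" "$TOOL_TOOLSDB_PASSWORD" ;;
        *) echo "unknown database service: $1" >&2; return 1 ;;
    esac
}

# Usage: code that talks to the wiki replicas always asks for "replica",
# so it keeps working if the two credential pairs ever diverge.
# db_credentials replica
```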

are you sending a newline at the end of your POST data?

The first time I didn't add it; now I add it and the tool still freezes.

As far as I know, all of my db connections are to wiki db replicas. If replicas and toolsdb are two different things, what is toolsdb?

Thank you very much for reviving my web tools. Could you explain what you did? Are they still run through mono? If so, why do they run on a python image?

Sure! A summary:

Two requirements of this tool made it tricky to set up:

  • Needing a cgi server
  • Building more than one dotnet binary

The simplest solution I could think of to get a cgi server is the one included with python: python -m http.server --cgi.

Then, to generate the many binaries, given that dotnet does not have much support for that (they seem to want a single binary per compilation), we created the compile.sh script.

Now, to be able to run this script at build time, I had to add a composer.json pointing to it, so that the php buildpack would run it (there's no direct way to run a custom build script, and doing so is strongly discouraged if you can avoid it).

So we end up running the php buildpack to run the compile.sh script, and we have to install dotnet to compile the binaries and python to run the cgi server.

To install the extra packages, you list them in the Aptfile in your repo.

In essence, it's not using the python image; it's using a custom-built image that includes python so that the compiled binaries can be served by python's built-in cgi server.

If the tool moved to use a dotnet web framework and were served all from that single binary (ex. https://dotnet.microsoft.com/en-us/apps/aspnet), all you would need would be something like the sample app (essentially, just the Procfile, https://gitlab.wikimedia.org/toolforge-repos/sample-dotnet-buildpack-app) and it would not use all the tricks it uses now, but the upstream dotnet buildpack.

But that's a bigger refactor :)

I tried to rewrite my tools using the standard dotnet way of creating web tools, but got stuck at the first steps.

But why is it impossible to run these tools on k8s exactly the same way they ran on the grid? On the grid, all they needed was two lines in .lighttpd.conf:

static-file.exclude-extensions += ( ".cgi" )
cgi.assign += ( ".cgi" => "/usr/bin/mono" )

Why isn't it possible to run the webservice on the tf-mono68 image, created for Toolforge mono users (Hawkeye7 and me), and just execute my .exe files, renamed to .cgi, on mono, like before?

That was not the only thing you needed: you also needed the whole build process that you were doing locally, the exact same versions of the tools that you used during the build, etc.
All of that process was hidden from view. Once you had set it up, you did not have to set it up again, but anyone trying to fork this tool would have found it quite hard to get started.

With the current build process, anyone can fork the tool, and get it running without having to change anything. And we have an image, that can run by itself, without the need of special files or user-provided input.

Essentially, with the current setup, if you don't touch your tool for several years, we could still be running it without problems (as everything needed for both building and running the tool is included in the image).

It would have been possible to keep allowing the same old patterns that were used on the grid, but after several years of maintaining it, we learnt that they are not sustainable in the long run, so we preferred to avoid them and make tools more sustainable for both tool creators and toolforge admins.

My build process was just clicking the green arrow on the Visual Studio toolbar; "the tools installed that you used during the build" were just the plain free (as in "free beer") Visual Studio, regardless of its version (2016 or 2019 or 2022...); my deployment process was just compiling these programs like my non-web bots (which still run on mono on k8s), moving them to Toolforge and renaming .exe to .cgi (the last step was tiring, yes). The small problem is that now I have absolutely no idea how this build process works or how all these .sh files work, and I can't fix anything without asking the developers. And if you become inactive for any reason, will any other developer understand and fix this process? Earlier I was able to solve almost all problems with these tools myself; now I can't solve anything and am forced to ask here about every little issue, because I don't understand how this, excuse me, Goldberg machine works.

And that is fair, I agree that now the process for you has become trickier. We can work on that little by little both to make the current code match the current supported standards (dotnet standards, like using a webapp instead of cgi-bin scripts, that would get rid of 99% of the Goldberg setup), and to simplify the deployment process on toolforge (we are working on that).

Essentially, some time ago the industry shifted and cgi-bin setups have become less used, and dropped from supported paths everywhere so it becomes harder to maintain them (you can find many places around the web on why, ex. https://www.quora.com/Why-is-CGI-not-used-anymore-for-web-applications).

If you could rewrite any one of my tools to follow the modern dotnet way of building web apps, it would be very helpful to me, and I think I could then rewrite all the other tools using yours as a model. For example, https://github.com/Saisengen/wikibots/blob/main/web-services/likes/likes.cs : it's a very small tool of about 70 loc that queries the DB replicas, which is almost all I do in my web tools.

I can try, might take me a bit (I'm also not familiar with dotnet).

I started trying to rewrite my tools in January and did some work; it looks like the new code should be something like this:

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Serve the tool's HTML page at the site root.
app.MapGet("/", (HttpRequest rqst) => Results.Content(Response.Text(rqst), "text/html"));
app.Run();

class Response
{
    public static string Sendresponse(string page, string result, bool loadfromtool)
    {
        // The body of the Sendresponse method from the old code goes here.
        throw new NotImplementedException();
    }
    public static string Text(HttpRequest rqst)
    {
        // The body of Main() from the old code goes here.
        // GET parameters are read from the query string, for example:
        //string page = rqst.Query["page"][0];
        //bool loadfromtool = rqst.Query["loadfromtool"][0] == "on";
        throw new NotImplementedException();
    }
}

That looks promising yes :), note that there should only be one app, and each of the subscripts would become a new app.MapGet("/myscript", (HttpRequest rqst) => Results.Content(MyscriptResponse.Text(rqst), "text/html")), where MyscriptResponse is the class that implements that specific script code (ex. LikesResponse).

taavi subscribed.

The grid engine has been shut down, so I'm closing any remaining migration tasks as Declined. If you're still planning to migrate this tool, please re-open this task and add one or more active project tags to it. (If you need a project tag for your tool, those can be created via the Toolforge admin console.)

MBH changed the task status from Declined to Resolved. Mar 14 2024, 1:13 PM

We are in the process of migrating and have almost completed it, so this status is more accurate.

After the transition to k8s, reading data from the DB replicas changed: I now get type errors when reading fields that I previously read as strings. In one case the field can no longer be read as a string and must be read as an int, so I changed the reading method. In another case I got the error Can't convert byte[] to string when reading one text field, while another text field is still read successfully as a string. I found a way to convert that byte stream into a string, but I'm curious why these errors appeared after the transition to k8s, and why the "username" field can be read as a string, as before, but the "action type" field must now be read as a byte stream (the "action type" field, unlike the username, contains only 7-bit ASCII characters and can take only three values: "approve", "approve-i" or "unapprove"). I'm talking about this tool: https://github.com/Saisengen/wikibots/blob/main/web-services/patstats/patstats.cs , see today's changelog.

I would recommend splitting it into different envvar variables for ease of usage and maintenance (so it would not be multiline).

It turns out I need only one secret for the web tools, the replica DB connection string, so I created a single-line envvar CONN_STRING, excluded it from the output of the test tool (I'm planning to keep this tool) and deleted the multi-line envvar CREDS that I had created earlier from my password file. Non-web bots will still use the password file; they need multiple strings from it.

After the transition to k8s, reading data from the DB replicas changed: I now get type errors when reading fields that I previously read as strings. In one case the field can no longer be read as a string and must be read as an int, so I changed the reading method. In another case I got the error Can't convert byte[] to string when reading one text field, while another text field is still read successfully as a string. I found a way to convert that byte stream into a string, but I'm curious why these errors appeared after the transition to k8s, and why the "username" field can be read as a string, as before, but the "action type" field must now be read as a byte stream (the "action type" field, unlike the username, contains only 7-bit ASCII characters and can take only three values: "approve", "approve-i" or "unapprove"). I'm talking about this tool: https://github.com/Saisengen/wikibots/blob/main/web-services/patstats/patstats.cs , see today's changelog.

I strongly suspect that this is because of the move from dotnet 3.x to 8.x, which uses a newer set of libraries that might do stricter type validation/parsing when extracting things from the DB.

Let me have a look, it's probably part of the DB schema.

In the DB, the types are as you describe, yes:

MariaDB [ruwiki_p]> desc logging;
+----------------+---------------------+------+-----+----------------+-------+
| Field          | Type                | Null | Key | Default        | Extra |
+----------------+---------------------+------+-----+----------------+-------+
| log_id         | int(10) unsigned    | NO   |     | 0              |       |
| log_type       | varbinary(32)       | NO   |     |                |       |
| log_action     | varbinary(32)       | YES  |     | NULL           |       |
| log_timestamp  | binary(14)          | NO   |     | 19700101000000 |       |
| log_actor      | decimal(20,0)       | NO   |     | 0              |       |
| log_namespace  | int(11)             | YES  |     | NULL           |       |
| log_title      | varbinary(255)      | YES  |     | NULL           |       |
| log_comment_id | decimal(20,0)       | NO   |     | 0              |       |
| log_params     | blob                | YES  |     | NULL           |       |
| log_deleted    | tinyint(3) unsigned | NO   |     | 0              |       |
| log_page       | int(10) unsigned    | YES  |     | NULL           |       |
+----------------+---------------------+------+-----+----------------+-------+
11 rows in set (0.002 sec)

MariaDB [ruwiki_p]> desc actor;
+------------+---------------------+------+-----+---------+-------+
| Field      | Type                | Null | Key | Default | Extra |
+------------+---------------------+------+-----+---------+-------+
| actor_id   | bigint(20) unsigned | NO   |     | 0       |       |
| actor_user | int(10) unsigned    | YES  |     | NULL    |       |
| actor_name | varbinary(255)      | NO   |     | NULL    |       |
+------------+---------------------+------+-----+---------+-------+
3 rows in set (0.002 sec)

Note that for user the query does cast(actor_name as char) user, so it's being cast to a string at the sql level. That makes me think that you might be able to do the same with the other fields you want as strings, for example:

MariaDB [ruwiki_p]> select cast(log_action as char) from logging limit 1;
+--------------------------+
| cast(log_action as char) |
+--------------------------+
| upload                   |
+--------------------------+
1 row in set (0.004 sec)

Thanks for the explanation. user is cast to char because otherwise non-Latin usernames can't be read correctly.

Could you say something about the POST tools (T319883#9624004)? It looks like it's the last unfixed issue with my tools.

Sorry, I missed the comment xd

This should do the trick. The issue is that under CGI 1.1 the server is not required to signal EOF after the POST data, so ReadLine just hangs; you have to read exactly $CONTENT_LENGTH bytes:
https://github.com/Saisengen/wikibots/pull/7
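
For reference, the principle can be sketched in a few lines of shell. This is only an illustration of the CGI rule, not the content of the PR, and read_post_body is a made-up helper name:

```shell
#!/bin/sh
# read_post_body prints exactly CONTENT_LENGTH bytes from stdin.
# Under CGI the server may leave stdin open after the request body, so
# reading "until EOF" or "until newline" can block forever.
# read_post_body is a hypothetical helper name used for illustration.
read_post_body() {
    if [ "$REQUEST_METHOD" = "POST" ] && [ "${CONTENT_LENGTH:-0}" -gt 0 ]; then
        # head -c stops after the requested number of bytes
        head -c "$CONTENT_LENGTH"
    fi
}

# Usage inside a CGI script:
# body=$(read_post_body)
```

The same idea applies in C#: read CONTENT_LENGTH bytes from the standard input stream instead of calling ReadLine.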

One of my continuous jobs running on k8s fails with an error, but its .err file isn't updated, so I can't read what the error is. This problem began today; earlier the .err file worked correctly. Maybe something is broken in the redirection of the error stream to the file?

Job name - rollbacker

And the webservice's error.log now contains only 200 responses; they should be written to access.log instead.

@dcaro could you take a look? I also reported the same issue for the wikisaurusbot tool account last weekend.

I also have a question. As far as I understand, the image we built, which now runs my tools on k8s, contains a virtual filesystem, and I can't browse it the way I used to browse my public_html folder on Toolforge. Can I view the contents of this virtual filesystem in the k8s image somehow?

There are several ways:

  • you can run toolforge webservice shell, which opens a shell in a new container. Note that it's not the same container as the one running the webservice, so any changes to the code will not be reflected there, and any changes made inside the container are lost when the container restarts.
  • you can run the container locally with docker or podman: docker run --rm -ti --entrypoint launcher tools-harbor.wmcloud.org/tool-mbh/tool-mbh:latest bash will open a shell in the container for you to explore.

Note that the image url is the destination image shown by the toolforge build list or toolforge build show commands:

tools.mbh@tools-bastion-12:~$ toolforge build list
build_id                          status    start_time            end_time              source_url                             ref    envvars    destination_image
mbh-buildpacks-pipelinerun-vmfm9  ok        2024-03-16T04:53:26Z  2024-03-16T04:56:11Z  https://github.com/Saisengen/wikibots  N/A    N/A        tools-harbor.wmcloud.org/tool-mbh/tool-mbh:latest
mbh-buildpacks-pipelinerun-r2tqp  ok        2024-03-16T04:47:16Z  2024-03-16T04:50:01Z  https://github.com/Saisengen/wikibots  N/A    N/A        tools-harbor.wmcloud.org/tool-mbh/tool-mbh:latest
...

tools.mbh@tools-bastion-12:~$ toolforge build show
Build ID: mbh-buildpacks-pipelinerun-vmfm9
Start Time: 2024-03-16T04:53:26Z
End Time: 2024-03-16T04:56:11Z
Status: ok
Message: Tasks Completed: 1 (Failed: 0, Cancelled 0), Skipped: 0
Parameters:
    Source URL: https://github.com/Saisengen/wikibots
    Ref: N/A
    Envvars: N/A
Destination Image: tools-harbor.wmcloud.org/tool-mbh/tool-mbh:latest
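
If you want to script this, a small sketch like the following pulls the image url out of the toolforge build show output shown above. The helper name latest_image is made up, and it assumes the output keeps the "Destination Image: ..." line format:

```shell
#!/bin/sh
# latest_image reads `toolforge build show` output on stdin and prints the
# value of its "Destination Image:" line.
# latest_image is a hypothetical helper name; the line format is taken from
# the sample output in this thread and may change.
latest_image() {
    sed -n 's/^Destination Image: //p'
}

# Usage on the bastion:
#   toolforge build show | latest_image
# and then locally, with the printed url:
#   docker run --rm -ti --entrypoint launcher <image url> bash
```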

One of my continuous jobs running on k8s fails with an error, but its .err file isn't updated, so I can't read what the error is. This problem began today; earlier the .err file worked correctly. Maybe something is broken in the redirection of the error stream to the file?

Job name - rollbacker

There's some buffering between the execution of the command and the writing to the file (the default you would get when running sh -c "mycommand > file.log"). You can try running the command stdbuf -o0 mono /data/project/mbh/bots/vand_rollbacker.exe, though depending on how mono buffers it might not help.
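
A small sketch of what that looks like with both streams covered: -o0 disables buffering on stdout and -e0 on stderr. The wrapper name run_unbuffered is made up, and stdbuf only influences programs that rely on C stdio buffering, so it may change nothing for mono:

```shell
#!/bin/sh
# run_unbuffered CMD [ARGS...] runs CMD with unbuffered stdout and stderr
# when stdbuf (GNU coreutils) is available, and unchanged otherwise.
# run_unbuffered is a hypothetical wrapper name used for illustration.
run_unbuffered() {
    if command -v stdbuf >/dev/null 2>&1; then
        stdbuf -o0 -e0 "$@"
    else
        "$@"
    fi
}

# Demo: the two streams land in separate files, like a job's .out and .err.
tmpdir=$(mktemp -d)
run_unbuffered sh -c 'echo "normal output"; echo "error output" >&2' \
    >"$tmpdir/demo.out" 2>"$tmpdir/demo.err"
cat "$tmpdir/demo.err"
```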

I will try this, but the .err file is still empty many hours after the crashes, so it doesn't look like a buffering/caching problem.

Thank you very much. Is there no way to automatically remove deleted and renamed tool files from the "cgi-bin" folder? Do I have to delete them manually?

And another question. My previous web server, lighttpd, was configured to log incoming requests to an access.log file, but that file hasn't been updated for the last several hours, and it was a very useful file. Can I enable this logging on k8s?

This should do the trick: https://github.com/Saisengen/wikibots/pull/8

Though it starts complicating the cgi-bin setup some.

I will try this, but the .err file is still empty many hours after the crashes, so it doesn't look like a buffering/caching problem.

The job is not running right now. Let me know if you still have issues, and if so, keep the job running and/or let me know how I can run it without breaking anything to test the errors.

I stopped the bot to deploy a new version of it; now I have restarted it.

It seems to be running, no crashes so far, but no errors at all either, should it be outputting anything?

No, by default it doesn't output anything since I stopped using the DotNetWikiBot framework. The new version may simply be stable and not produce errors.

But sometimes I get an email about a crash of this process (not because of code bugs, just because it works 24/7/365 and sometimes crashes because of a broken API answer, 5xx errors from the API and similar reasons). On the grid this event was written to the .err file. Will records about such crashes be written to the .err file now?

They should yes, for example the last time it wrote to the error log was this morning:

tools.mbh@tools-bastion-12:~$ ls -lrt rollbacker.err 
-rw-rw---- 1 tools.mbh tools.mbh 0 Mar 18 12:25 rollbacker.err

Thank you very much, I will try to rewrite my tools as a dotnet app in the coming weeks. But since you updated the web server config, people have been complaining that https://mbh.toolforge.org/cgi-bin/cpf doesn't work. The page doesn't open and the browser loads it endlessly, but the access is written to access.log with a 200 code. Could you take a look?

Found it, the issue is that I was enabling http 1.1 in the python cgi-bin server, while cgi-bin scripts don't handle keep-alive connections well, this should fix it (tested locally):

https://github.com/Saisengen/wikibots/pull/9

@dcaro I can open https://mbh.toolforge.org/clusters.html , but not https://mbh.toolforge.org/elections.txt (404), although both files are in the new static files folder. What's the reason? Is it due to the file extension?

image.png (724×1 px, 172 KB)

The static files are under https://tools-static.wmflabs.org/mbh/, not https://mbh.toolforge.org (mbh.toolforge.org serves the files that are bundled with the binaries at build time, which currently means just the html files, see https://github.com/david-caro/wikibots/blob/main/compile.sh#L42C21-L42C49).

@dcaro Visual Studio currently offers two types of console C# apps: the old kind, which works on Windows only (or under mono), and the new "dotnet" kind, claimed to work natively on Linux. All of my bots are written the old way and run under mono. I tried to rewrite one of my (non-web) bots the new way; there is also a Discord wiki bot, written by a ruwiki user, that is written the new way too. But I can't run either app on k8s on Toolforge, even on the tf-mono68 image. It's said that such programs should be run with the command dotnet run appname.exe, but Toolforge says Unknown command: dotnet. They also can't be run through mono because This is not a valid CLI image. There is a Phab task in which dotnet was installed on Toolforge for the user Hawkeye7; the task is completed, but where is dotnet?

@MBH dotnet is enabled only through buildpacks, similar to what we did in https://github.com/Saisengen/wikibots. That means that the process is not:

"build locally -> upload binary"

but:

"push changes to repository -> run `toolforge build ...` -> run `toolforge webservice restart`"

as the binaries are built from source by toolforge.
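As a rough sketch of that flow, the two server-side steps can be scripted so they always run in the same order after a push. The commands are the ones quoted above; the function names `deploy_steps` and `deploy` and the `dry_run` switch are hypothetical, and by default the script only prints what it would run.

```python
import shlex
import subprocess


def deploy_steps(repo_url: str) -> list[list[str]]:
    """Commands to run on Toolforge after pushing changes to the repository."""
    return [
        ["toolforge", "build", "start", repo_url],  # build the image from source
        ["toolforge", "webservice", "restart"],     # restart onto the new image
    ]


def deploy(repo_url: str, dry_run: bool = True) -> list[str]:
    """Print each step; execute it only when dry_run is False."""
    lines = []
    for cmd in deploy_steps(repo_url):
        line = shlex.join(cmd)
        lines.append(line)
        print(line)
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step
    return lines


if __name__ == "__main__":
    deploy("https://github.com/Saisengen/wikibots")
```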

Do you have a repository with the bot code I can check?

I have. But how do I run such bots with buildpacks? Right now I use jobs.yaml and specify the path to the exe file; how do I do that with buildpacks?

And why is dotnet inaccessible without buildpacks?

I have. But how do I run such bots with buildpacks?

If you have a single binary from the source code, you can follow: https://gitlab.wikimedia.org/repos/cloud/toolforge/dotnetcore-buildpack

If you have many binaries out of the source code, you'll have to do something similar to what you did in https://github.com/Saisengen/wikibots, specifically, write a 'compile.sh' script that builds the binaries for you (similar to https://github.com/Saisengen/wikibots/blob/main/compile.sh) and adds them to the Procfile, and then add a composer.json file to run it at build time (https://github.com/Saisengen/wikibots/blob/main/composer.json).

Then once you build the code (toolforge build start https://path.to.repo) you can run any of the built binaries on toolforge jobs with toolforge jobs run --command <generatedbinaryname> --image tool-test/tool-test:latest ... as described here https://wikitech.wikimedia.org/wiki/Help:Toolforge/Build_Service#Job

Right now I use jobs.yaml and specify the path to the exe file; how do I do that with buildpacks?

See https://wikitech.wikimedia.org/wiki/Help:Toolforge/Build_Service#Job on how to run jobs using buildpack images.

If you have a repository, I can help a bit more, as it will depend on how you build and structure your code (dotnet is quite opinionated on how the project should be organized).

And why is dotnet inaccessible without buildpacks?

The old mono6.8 image has only the old mono installation, so dotnet is not available there (and that image will be deprecated eventually).

Keeping a new static dotnet image up to date requires extra time and resources from the Toolforge roots, and we are already stretched thin.

Letting users decide which version of dotnet and which libraries to use, and self-service their environments, allows far more flexibility than pre-built images, and it does not require the Toolforge roots to spend time keeping any image up to date, so it's a win-win :)

This does mean that users will have to start using the buildpacks, so there's some effort needed up front. But once that's done, it continuously saves time on our side, gives you control over which versions of the runtime you are using, and lets you migrate to newer versions at your own pace.

How do I specify whether a run should be one-time or scheduled, or whether the bot should stay online permanently?

That one is not being compiled, it's not in the all.sln solution file.

You will have to do the same as we did for the ones under web-services, that is, add a .csproj project file for it and add that project to the solutions file: https://github.com/Saisengen/wikibots?tab=readme-ov-file#adding-a-new-project-to-a-solutions-file

I created an initial MR for you to review and see how to do it: https://github.com/Saisengen/wikibots/pull/10

I had to change the compile script a bit to avoid putting the built binaries under cgi-bin.

An example of entry in jobs.yaml for it:

- name: checking-new-edits
  command: checking-new-edits
  image: tool-mbh/tool-mbh:latest
  mount: all   # only if you need NFS, otherwise you can remove this line
  schedule: "29 10 * * *"
  emails: onfailure
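On the question of one-time, scheduled, or permanently running jobs: as far as I know, the jobs framework tells them apart by whether the job has a schedule, is marked continuous, or has neither. Below is a minimal sketch that builds the matching toolforge jobs run command line for each mode. The helper name `jobs_run_command` is hypothetical, and the flags should be checked against toolforge jobs run --help. In jobs.yaml the corresponding keys would be schedule (as in the entry above) and continuous.

```python
import shlex
from typing import Optional


def jobs_run_command(
    name: str,
    command: str,
    image: str,
    schedule: Optional[str] = None,
    continuous: bool = False,
) -> str:
    """Build a `toolforge jobs run` command line for one of three modes."""
    if schedule and continuous:
        raise ValueError("a job is either scheduled or continuous, not both")
    args = ["toolforge", "jobs", "run", name,
            "--command", command, "--image", image]
    if schedule:
        args += ["--schedule", schedule]  # cron-style schedule: runs regularly
    elif continuous:
        args.append("--continuous")       # restarted whenever it exits
    # neither option: the job runs once
    return shlex.join(args)


if __name__ == "__main__":
    image = "tool-mbh/tool-mbh:latest"
    print(jobs_run_command("checking-new-edits", "checking-new-edits", image))
    print(jobs_run_command("checking-new-edits", "checking-new-edits", image,
                           schedule="29 10 * * *"))
    print(jobs_run_command("checking-new-edits", "checking-new-edits", image,
                           continuous=True))
```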

Thank you, but I prefer the traditional "build locally - transfer to server - run exe file" way, as it is much more convenient and simple. Another reason is that I really don't like pushing every minor change to the repository, including test code and intentionally broken code; it makes for a very dirty repository. I prefer to push stable, tested, working code to the repo, but your way forces me to push even test code written for debugging, code with which the program is supposed to break. Maybe I would need two repos for every program: one "dirty" and one with meaningful, tested changes. Since the old way works for traditional bots, unlike web services, I'm not planning to switch to your way.

But there is the Discord bot: it was written only for dotnet by another user, and I plan to run it on Toolforge. I created a fork of it, https://github.com/Saisengen/DiscordWikiBot; could you set it up to work on TF? It's also a waaay more complex program than my bots.

Another reason is that I really don't like pushing every minor change to the repository, including test code and intentionally broken code; it makes for a very dirty repository. I prefer to push stable, tested, working code to the repo, but your way forces me to push even test code written for debugging, code with which the program is supposed to break. Maybe I would need two repos for every program: one "dirty" and one with meaningful, tested changes.

This is one of the best features of git: you can use two branches, one with the production code (usually called main on gitlab/github/etc.) and one with the development code, and only move code from development to production once you have tested it:
https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging

Then you can build one or the other by passing --ref <nameofyourbranch> to toolforge build start.