
Raise tool memory limit for Depictor / Hay's tools
Closed, Invalid · Public


I'm getting out-of-memory errors with the Depictor tool. This is mainly due to the unforeseen popularity of the tool and a system that wasn't designed for the >250,000 files it now needs to check for 'depicted' status. This will probably mean I'll need to rewrite parts of the tool, which will take time. In the meantime, I think raising the memory limit might fix some of these issues.

Depictor is part of my regular set of tools, hosted under the hay project.

Some more background for those interested (and to document this bug for other people):

I was originally notified about this issue by user PMG (the top contributor to Depictor). He noticed the tool was crashing during the cultural heritage monuments in Poland challenge.

What happens at the moment of the 'crash' is that the system checks whether the files of the next item are already depicted. Depictor doesn't know in advance which items it's going to get, so it needs to do this every time you load a new item. While developing, I noticed that those database calls get *really* slow when checking lots of files at once (some categories have hundreds or even thousands of files). So, instead of checking each file individually, it fetches all already-depicted files *and* the ones that you as a user have skipped (because you don't want to be bothered with those again). It then needs to combine the results of those two queries.
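In rough terms, the check works like the following minimal Python sketch. The function names and data shapes are invented for illustration; the real tool queries its own database tables instead:

```python
# Hypothetical sketch of Depictor's "fetch everything, then combine" check.
# All names here are assumptions, not Depictor's actual code.

def combine_checked(depicted, skipped):
    """Union of every already-depicted file and the user's skipped files.

    Both inputs can hold hundreds of thousands of IDs; materialising them
    and the combined set in memory at once is what triggers the OOM error.
    """
    return set(depicted) | set(skipped)

def is_already_handled(file_id, depicted, skipped):
    # Each file of the next item is tested against the combined set,
    # instead of issuing one slow database call per file.
    return file_id in combine_checked(depicted, skipped)
```

The trade-off is clear: one membership test per file is fast once the set exists, but building the set requires loading both full lists into memory first.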

At this point the script fails, because there are now close to a quarter of a million files in the database, and PMG, as the most active user, also has a huge list of skipped files. Fetching those two giant lists and trying to combine them gives an out-of-memory error. That's what you're seeing when the next image is stuck: the script has halted and the whole system fails. The reason this is not consistently reproducible, I believe, is that Toolforge is a shared hosting environment, so memory might simply be in use by other processes. When I was testing your heritage challenge I didn't see any problems at all, but that's probably just because not many other processes were running at the same time.

Obviously the main problem here is that this system worked fine when Depictor was still dealing with just a couple of thousand files, but now that it has gotten so popular the system can no longer cope. Unfortunately I don't think there are any quick fixes, short of redesigning parts of the system (which will take time). The monuments in Poland challenge also takes ages to load because it needs to check for the existence of so many items.

Event Timeline

Husky renamed this task from "Raise tool memory limit" to "Raise tool memory limit for Depictor / Hay's tools". (Nov 24 2021, 11:22 PM)
Husky updated the task description.

I assume you are using the Toolforge Kubernetes backend.

Here are your current quotas:

tools.hay@tools-sgebastion-08:~$ kubectl describe resourcequotas
Name:                   tool-hay
Namespace:              tool-hay
Resource                Used   Hard
--------                ----   ----
configmaps              2      10
count/cronjobs          0      50
count/deployments.apps  1      3
count/jobs              0      15
limits.cpu              500m   2
limits.memory           512Mi  8Gi
persistentvolumeclaims  0      3
pods                    1      10
replicationcontrollers  0      1
requests.cpu            150m   2
requests.memory         256Mi  6Gi
secrets                 1      10
services                1      1
services.nodeports      0      0

How much memory do you need?

Hey @aborrero, I'm only now reading the whole section about increasing CPU and memory usage with the webservice command, so I'll try that first and raise the memory to 4 and maybe 8 GB. If I still get out-of-memory errors I'll post a message here again.
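For anyone landing on this task later, raising the limit is done with the `--mem` (and optionally `--cpu`) flags of the `webservice` command on the Kubernetes backend. The runtime name below is an assumption; substitute whatever your tool actually runs on:

```shell
# Restart the webservice with a higher memory limit on the Kubernetes
# backend. "php7.4" is an assumed runtime; adjust to your tool's actual one.
# The requested value must stay within the tool's quota (8Gi above).
webservice --backend=kubernetes --mem 4Gi php7.4 restart
```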

Ok, thanks!

Please reopen the task if you need to.