Page MenuHomePhabricator

VRTS: System log error: Can't write '/opt/otrs/var/tmp/CacheFileStorable/User/e/6/e67fedfc345b750ef9c4746480bb051d': No such file or directory
Closed, ResolvedPublic

Description

VRTS reports once in a while in it's system log:

Can't write '/opt/otrs/var/tmp/CacheFileStorable/User/e/6/e67fedfc345b750ef9c4746480bb051d': No such file or directory

Hash values vary. Please check.

Details

Event Timeline

Jelto triaged this task as Low priority.
Jelto moved this task from Incoming to Backlog on the collaboration-services board.

@Krd I haven't seen this on the logs for some time now.

Checked the permissions for the directory and they look okay. The only odd entry is the one owned by the www-data user:

aokoth@vrts1001:~$ sudo ls  -l /opt/otrs/var/tmp/CacheFileStorable/User/e/6/
total 32
-rw-rw---- 1 otrs     www-data 1629 Jan 12 13:50 e62b1ee00a5471117329915970109df4
-rw-rw---- 1 otrs     www-data 1311 Jan 12 13:50 e62cffa2b2613dbc61dc6f8a095e281b
-rw-rw---- 1 otrs     www-data 3425 Jan 12 13:50 e642b68f024251e90c62f2abe2ea841b
-rw-rw---- 1 otrs     www-data 2030 Jan 12 13:50 e64431b342368cab18f51901e237bc8e
-rw-rw---- 1 otrs     www-data 1701 Jan 12 13:50 e67fedfc345b750ef9c4746480bb051d
-rw-rw---- 1 otrs     www-data 3270 Jan 12 13:50 e691e19a7f7ad0e398120de219332997
-rw-rw---- 1 www-data www-data   34 Jan 12 13:53 e6b2aecd59c5efa5b6c3f9a411ca1b5b
-rw-rw---- 1 otrs     www-data 1662 Jan 12 13:50 e6f08f0bf401f752c0714281d5d88dd7
aokoth@vrts1001:~$ id otrs
uid=498(otrs) gid=498(otrs) groups=498(otrs),33(www-data)

Though there does seem to be some varying permissions within our Puppet code compared to the ones expected after running /opt/otrs/bin/otrs.SetPermissions.pl (noted as one of the potential fixes for this issue) so I'll maybe try reconcile those.

aokoth@vrts1001:/opt/otrs$ sudo ./bin/otrs.SetPermissions.pl
Setting permissions on /opt/znuny-6.3.4
aokoth@vrts1001:/opt/otrs$ sudo puppet agent -tv
Info: Using environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for vrts1001.eqiad.wmnet
Info: Applying configuration version '(77e66b3b6b) Stevemunene - Remove puppet references for druid1004_6'
Notice: /Stage[main]/Vrts/File[/opt/otrs/Kernel/Config.pm]/mode: mode changed '0660' to '0440' (corrective)
Notice: /Stage[main]/Vrts/File[/opt/otrs/bin/otrs.TicketExport2Mbox.pl]/mode: mode changed '0770' to '0755' (corrective)
Notice: /Stage[main]/Vrts/File[/opt/otrs/var/httpd/htdocs/skins/Agent/default/img/icons/product.ico]/mode: mode changed '0660' to '0664' (corrective)
Notice: /Stage[main]/Vrts/File[/opt/otrs/var/httpd/htdocs/skins/Agent/default/img/logo_bg_wmf.png]/mode: mode changed '0660' to '0664' (corrective)
Notice: /Stage[main]/Vrts/File[/opt/otrs/var/httpd/htdocs/skins/Agent/default/img/loginlogo_wmf.png]/mode: mode changed '0660' to '0664' (corrective)
Notice: Applied catalog in 25.23 seconds

Last occurence: Sat Jan 13 07:00:02 2024 (UTC)

@Krd I'm curious... Do you see any error on your end as a result? I've seen a bunch of POST requests coming from your machine around the same time on the apache access log. Curious as well as to know what action you were performing that results in this error?

There is a likelihood that this issue might just be in the current version running but I'd like to try and see if I can replicate it on a test host running a different version. If that's the case we'll know it's transient and it will likely disappear with the next update.

Good idea, I didn't even consider that this may be triggered by myself. It appears to be always a few seconds after full and half hours, so it's unlikely one of my scripts, but perhaps one of the GenericAgent tasks.

There is also a message I didn't notice before, which seems to be related:

Mon Jan 15 14:30:22 2024 (UTC) error OTRS-otrs.Daemon.pl - Daemon Kernel::System::Daemon::DaemonModules::SchedulerTaskWorker-10 There was an error executing RunAsynchronous() in Kernel::System::SupportDataCollector::PluginAsynchronous::OTRS::ConcurrentUsers: Can't opendir(/opt/otrs/var/tmp/CacheFileStorable/User/5/9): No such file or directory

@Krd Just as suspected this seems to be a reported issue with versions 6.3.x. I haven't found a fix yet but will update once I do.

Another course of action is we could proceed with the version upgrades (landing us in a stable and supported version) within a week or two (T340027) and this issue will potentially go away (will also update once this is decided).

@Krd We are currently on version 6.5.5, so this error should not re-occur. There is a separate one that seems related so will open a new ticket to track that.

Fri Mar 1 11:00:15 2024: error: OTRS-CGI-10: Can't write '/opt/otrs/var/tmp/CacheFileStorable/User/5/2/52f851d8292246f80d324eacb84f5bc5': No such file or directory

@Krd We performed the following actions in an attempt to resolve this (recommended by Znuny):

  • Deleted sessions with no users in the admin interface
  • Running bin/otrs.Console.pl Maint::Cache::Delete on the server

There is also a scheduled job that performs the second step hourly so at this point I think we'll just report this as a bug.

Change 1009303 had a related patch set uploaded (by AOkoth; author: AOkoth):

[operations/puppet@production] vrts: disable vrts-cache-cleanup timer

https://gerrit.wikimedia.org/r/1009303

Change 1009303 merged by AOkoth:

[operations/puppet@production] vrts: disable vrts-cache-cleanup timer

https://gerrit.wikimedia.org/r/1009303

The clean cache systemd timer is disabled. Last occurrence of this error on the log was at Wed Mar 6 16:05:13 2024 UTC. Will check if it re-occurs.

Closing this as it's expected behaviour when the cache is deleted. Though it's important to now that it's now occurring infrequently after we got rid of the hourly timer.