
Spin up virtualized NFS server strictly for Grid Engine database and management
Open, High, Public

Description

In the current arrangement, the Grid Engine database lives on the same NFS share as the project and tools data. It is possible to remove the dependence on NFS, but without a shared storage platform, shadow master functionality degrades or becomes impossible, even with an external DB server.

However, sharing that NFS means that a malfunctioning tool is able to corrupt the gridengine database, or cause the entire grid to collapse for some time even after the NFS problem has been resolved.

The only thing that really needs to be preserved separately is the spooling database and related files in the /var/spool/gridengine directory. Most of Toolforge will collapse if NFS is in bad enough shape anyway, but the database needs to be kept in order.

Since this would encompass only files in the .system_sge directory, or even a subdirectory of it, the dedicated NFS server could simply be a VM in the tools project.
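
To confirm which filesystem actually backs the spool directory before and after such a move, something like the following sketch could help. It parses text in the /proc/mounts format and reports whether a given path sits on an NFS mount. The function names, the sample mount table, the NFS server name, and the /data/project mount point are all hypothetical and chosen only for illustration. Escaped characters in mount points (such as \040 for a space) are not handled.

```python
import os

# Filesystem types treated as NFS for this check (an assumption of this sketch).
NFS_TYPES = {"nfs", "nfs4"}


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def backing_mount(path, mounts):
    """Return the (mount_point, fs_type) with the longest mount point
    that contains path, or None if nothing matches."""
    path = os.path.normpath(path)
    best = None
    for mount_point, fs_type in mounts:
        mp = os.path.normpath(mount_point)
        if mp == "/" or path == mp or path.startswith(mp + "/"):
            # ">=" lets a later entry win for the same mount point,
            # matching how a later mount shadows an earlier one.
            if best is None or len(mp) >= len(best[0]):
                best = (mp, fs_type)
    return best


def is_on_nfs(path, mounts):
    """True if the mount backing path has an NFS filesystem type."""
    found = backing_mount(path, mounts)
    return found is not None and found[1] in NFS_TYPES


# Hypothetical mount table, for illustration only.
SAMPLE_MOUNTS = """\
/dev/vda1 / ext4 rw,relatime 0 0
nfs.example:/project/tools /data/project nfs4 rw,relatime 0 0
"""
```

On a real host, the mount table would come from reading /proc/mounts, and the path should first be passed through os.path.realpath() so that a spool directory that is a symlink into the NFS share is resolved to its real location.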

NOTE: It is also possible to spin up a BerkeleyDB Spooling Server, but the packaging scheme in our OS doesn't make it terribly easy. Besides, that may be an even less stable method of preserving the database than NFS.

Event Timeline

Bstorm created this task. Mar 12 2019, 6:38 PM
Bstorm triaged this task as High priority.
Bstorm removed bd808 as the assignee of this task. Mar 12 2019, 6:50 PM