Spin up virtualized NFS server strictly for Grid Engine database and management
Closed, Declined; Public

Assigned To

None

Authored By

	• Bstorm
	Mar 12 2019, 6:38 PM

Description

In the current arrangement, the Grid Engine database lives on the same NFS share as the project and tools data. It is possible to remove the dependence on NFS, but without a shared storage platform, shadow master functionality degrades or becomes impossible, even with an external DB server.

However, sharing that NFS means a malfunctioning tool can corrupt the gridengine database, or cause the entire grid to collapse for some time even after the NFS problem has been resolved.

The only thing that really needs to be preserved separately is the spooling database and related files in the /var/spool/gridengine directory. Most of Toolforge will collapse if NFS is in bad enough shape anyway, but the database needs to be kept in order.

Since this would encompass only files in the .system_sge directory or even a subdirectory of that, this really could be a VM in the tools project.
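As a rough illustration of the goal, here is a minimal sketch of a check that the spool directory sits on a different filesystem from the shared tools NFS. The function name and the example tools path are hypothetical and not part of any existing tooling. Comparing `st_dev` is only a heuristic: two bind mounts of the same filesystem would report the same device.

```python
import os


def on_separate_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if the two paths are on different devices (mounts).

    Heuristic only: compares the st_dev field reported by os.stat().
    """
    return os.stat(path_a).st_dev != os.stat(path_b).st_dev


# Hypothetical usage; "/path/to/tools/nfs" is a placeholder, not a real mount:
# on_separate_filesystem("/var/spool/gridengine", "/path/to/tools/nfs")
```

If the check returned False on the grid master, the spool database would still be sharing storage with tool data, which is the situation this task wants to avoid.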

NOTE: It is also possible to spin up a BerkeleyDB Spooling Server, but the packaging scheme in our OS doesn't make that terribly easy. Besides, that may be an even less stable way of preserving the database than NFS.

Related Objects

		Status	Subtype	Assigned	Task
		Resolved		bd808	T218038 NFS issue affecting Toolforge SGE master
		Declined		None	T218141 Spin up virtualized NFS server strictly for Grid Engine database and management

Event Timeline

• Bstorm triaged this task as High priority. Mar 12 2019, 6:38 PM

• Bstorm created this task.

• Bstorm removed bd808 as the assignee of this task. Mar 12 2019, 6:50 PM

• bd808 lowered the priority of this task from High to Low. Aug 8 2019, 4:03 AM

This may not even be the approach we take in the end. The grid may end up outside of tools first. Beyond that, we are likely to rebuild NFS servers to work differently first as well. As we are stabilizing our NFS design, this whole issue is much less scary.

We may think about this when the next iteration of NFS rebuild happens.
