We're seeing large numbers of open file handles, which pushes file descriptor numbers past 1023, the highest value Python's select.select() accepts on Linux (FD_SETSIZE is 1024). Ideally, we can fix this problem by cutting down on the number of open files; less ideally, we can work around it by using select.poll instead of select.select, since poll has no such limit.
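A minimal sketch of the select.poll workaround, assuming the affected code only waits for readability. The helper name wait_readable and its timeout_ms parameter are hypothetical, not taken from our codebase.

```python
import select


def wait_readable(fds, timeout_ms=1000):
    """Return the file descriptors in `fds` that are ready to read.

    Uses select.poll(), which does not have the FD_SETSIZE restriction
    that select.select() has. `wait_readable` is a hypothetical helper.
    """
    poller = select.poll()
    for fd in fds:
        # POLLIN: there is data to read on this descriptor.
        poller.register(fd, select.POLLIN)
    # poll() takes its timeout in milliseconds, unlike select.select(),
    # which takes seconds.
    events = poller.poll(timeout_ms)
    return [fd for fd, event in events if event & select.POLLIN]
```

Note the unit change: select.select takes its timeout in seconds, while poll takes milliseconds, so any call sites we convert need their timeouts adjusted.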
Use lsof on a cluster worker to determine which type of process is hogging the file handles.
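If lsof is unavailable on a worker, or its output is too noisy, a rough equivalent can be sketched in Python by counting the entries under /proc/<pid>/fd and grouping them by process name. The function name open_fd_counts is hypothetical; the sketch assumes a Linux /proc filesystem and silently skips processes we lack permission to inspect.

```python
import os
from collections import Counter


def open_fd_counts(proc_root="/proc"):
    """Count open file descriptors per process name (hypothetical helper).

    Reads <proc_root>/<pid>/comm for the process name and counts the
    entries in <proc_root>/<pid>/fd.
    """
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        pid_dir = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(pid_dir, "comm")) as f:
                name = f.read().strip()
            fd_count = len(os.listdir(os.path.join(pid_dir, "fd")))
        except OSError:
            # Process exited while we were scanning, or access was denied.
            continue
        counts[name] += fd_count
    return counts


if __name__ == "__main__":
    for name, total in open_fd_counts().most_common(10):
        print(f"{total:6d}  {name}")
```

Run as root on the worker to see every process; as an ordinary user it only reports processes owned by that user.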
Review the code for correct file handling. Make sure we use context managers to close handles regardless of conditional branching. Forked full-weight processes should close all unnecessary file handles from the parent.
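The patterns to look for during the review can be sketched as follows. Both function names are hypothetical illustrations rather than code from our repository, and the second assumes child processes are launched through the subprocess module rather than a bare os.fork.

```python
import subprocess


def first_data_line(path):
    """Return the first non-comment line of a file, or None.

    The `with` block closes the file on every exit path: the early
    return, the fall-through at end of file, and any exception.
    """
    with open(path) as f:
        for line in f:
            if line.startswith("#"):
                continue
            return line.rstrip("\n")  # early return; file is still closed
    return None


def run_child(args):
    """Run a child process without leaking the parent's descriptors.

    close_fds=True (the default in Python 3) closes every inherited
    descriptor except stdin, stdout, and stderr in the child.
    """
    return subprocess.run(
        args,
        close_fds=True,
        stdin=subprocess.DEVNULL,
        capture_output=True,
        check=True,
    )
```

Code that calls os.fork directly gets none of this for free; there the child has to close inherited descriptors itself, for example with os.closerange.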