My immediate reaction was "why not Parallel?", but then I realized I'm not sure offhand how it would deal with multiple servers. Off the top of my head: have Parallel manage threads that execute remote scripts over SSH, and use SCP to copy files to the servers as needed. You would also have to figure out how to manage utilization of the servers so you don't overload them.
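To make the threads-plus-SSH idea a bit more concrete, here is a rough sketch using only the Python standard library rather than Parallel itself. The host names, the per-host limit, and the `job.py` script name are all made up for illustration, and it assumes key-based SSH logins already work. A semaphore per host is one simple way to keep from overloading a server.

```python
import shlex
import subprocess
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical host names and per-host job limit; adjust for your servers.
HOSTS = ["server1", "server2"]
MAX_JOBS_PER_HOST = 2

# One semaphore per host caps how many jobs run there at the same time.
_slots = {host: threading.BoundedSemaphore(MAX_JOBS_PER_HOST) for host in HOSTS}


def ssh_command(host, remote_script, arg):
    """Build the ssh invocation for one job, quoting for the remote shell."""
    return ["ssh", host, shlex.join(["python3", remote_script, arg])]


def run_job(host, remote_script, arg, runner=subprocess.run):
    """Run one remote job, waiting for a free slot on that host first."""
    with _slots[host]:
        result = runner(ssh_command(host, remote_script, arg),
                        capture_output=True, text=True)
    return host, arg, result.returncode


def run_all(args, remote_script, runner=subprocess.run):
    """Assign jobs to hosts round-robin; threads just wait on the ssh calls."""
    jobs = [(HOSTS[i % len(HOSTS)], arg) for i, arg in enumerate(args)]
    with ThreadPoolExecutor(max_workers=len(HOSTS) * MAX_JOBS_PER_HOST) as pool:
        futures = [pool.submit(run_job, host, remote_script, arg, runner)
                   for host, arg in jobs]
        return [f.result() for f in futures]
```

Something like `run_all(["a.txt", "b.txt"], "job.py")` would then run `python3 job.py a.txt` on server1 and `python3 job.py b.txt` on server2. Copying the input files over with SCP beforehand is left out here.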
Using Dask is also possible. I built a Dask-based system that farms out jobs/objects to the members of a simple cluster I set up by hand. A Python class is responsible for managing an external process (such as the Java-based Stanford NLP toolkit) or a Python script that uses spaCy. Each job gets sent a block of text, which is then turned into features by whatever tool is being used. The class uses Python's 'subprocess' module to deal with the external processes. It's dead simple multiprocessing on a cluster without the complexities of SLURM. Setting up the venv on each server in the cluster is the main hassle, but Dask works fine. Send me a PM if you want a copy of this class to get an idea of how it works.
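For a rough idea of the shape of it, here is a minimal sketch (not my actual class). The `ExternalTool` class, the function names, and the scheduler address are placeholders, and the "external tool" is just a Python one-liner that counts tokens, standing in for something like the Stanford toolkit or a spaCy script. It assumes a Dask scheduler and workers are already running and that `dask.distributed` is installed in the venv on every machine.

```python
import subprocess
import sys

# Stand-in for the external tool: reads text on stdin, prints a token count.
# A real setup would launch the NLP tool here; this command is an assumption.
TOOL_CMD = [sys.executable, "-c",
            "import sys; print(len(sys.stdin.read().split()))"]


class ExternalTool:
    """Hypothetical wrapper that sends a block of text to an external process."""

    def __init__(self, cmd=TOOL_CMD):
        self.cmd = cmd

    def features(self, text):
        # Feed the text on stdin and collect whatever the tool prints.
        done = subprocess.run(self.cmd, input=text, capture_output=True,
                              text=True, check=True)
        return done.stdout.strip()


def process_block(text):
    """The job a worker runs: one block of text in, features out."""
    return ExternalTool().features(text)


def run_on_cluster(blocks, scheduler_address):
    """Farm the blocks out to Dask workers and collect the results."""
    # Imported here so the rest of the sketch runs without Dask installed.
    from dask.distributed import Client

    client = Client(scheduler_address)  # e.g. "tcp://scheduler-host:8786" (placeholder)
    try:
        futures = client.map(process_block, blocks)  # one task per block
        return client.gather(futures)
    finally:
        client.close()
```

You can try `process_block("some text here")` locally first, then hand a list of blocks to `run_on_cluster` once the workers are up. Note that the interpreter path in `TOOL_CMD` comes from the machine that submits the jobs, so this only works as written if the venv lives at the same path on every server.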