# GKS2slurm

**Unstable development**
GKS2slurm is a playbook that installs a multinode slurm cluster on top of a GalaxyKickStart single-node installation.

The GKS2slurm playbook `galaxyToSlurmCluster.yml` was tested with multiple virtual machines (VMs) in the StratusLab, Google Compute Engine (GCE) and Amazon Web Services (AWS) clouds.
## Installation of a Galaxy slurm cluster with GKS2slurm
### Step 1: Install a Galaxy server with GalaxyKickStart
- Refer to the Getting Started section of this manual for the basics of GalaxyKickStart installation.
- Install any GalaxyKickStart "flavor" by configuring the inventory file (in the `inventory_files` folder) and the group_vars file (in the `group_vars` folder) of your choice. Flavors currently available are `kickstart`, `artimed` and `metavisitor`, but others will come soon. Alternatively, you can build your own flavor by customizing a group_vars file, an extra-files file and an inventory file, which will install your Galaxy tools and workflows.
- In Step 1, the most important thing to keep track of is configuring your target machine with an extra volume.
Indeed, GKS2slurm has been designed so that the Galaxy slurm cluster can accumulate large amounts of data in the long term, data that can be more easily shared with the cluster nodes and, more importantly, backed up.
Thus, in addition to all the adaptations you will make for your own purposes (tools, workflows, etc.), edit the `group_vars/all` file and point the `galaxy_persistent_directory` variable to your extra volume, which should already be formatted and mounted. Change

```yaml
#persistent data
galaxy_persistent_directory: /export # for IFB it's /root/mydisk, by default, /export
```

to

```yaml
#persistent data
galaxy_persistent_directory: /pathto/mounted/extravolume
```
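If your extra volume is not yet formatted and mounted, something along these lines should work on a typical Debian/Ubuntu VM; the device name `/dev/vdb` and the mount point `/mnt/extravolume` are placeholders you must adapt to your own cloud setup:

```bash
# Identify the attached extra volume (device name varies by cloud provider)
lsblk

# Format the volume (this DESTROYS any existing data on /dev/vdb)
mkfs.ext4 /dev/vdb

# Create a mount point and mount the volume
mkdir -p /mnt/extravolume
mount /dev/vdb /mnt/extravolume

# Make the mount persistent across reboots
echo '/dev/vdb /mnt/extravolume ext4 defaults 0 2' >> /etc/fstab
```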
- Having configured your GalaxyKickStart installation, import the extra roles (if not already done) and run the `galaxy.yml` playbook:

```bash
ansible-galaxy install -r requirements_roles.yml -p roles
ansible-playbook --inventory-file inventory_files/<your_inventory_file> galaxy.yml
```
### Step 2: Check the single node Galaxy installation
If the playbook ran successfully, connect to your Galaxy instance through http and check that you can log in (admin@galaxy.org:admin) and that tools and workflows are correctly installed.
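As a quick sanity check from the command line, you can also query the Galaxy API; a minimal sketch, assuming your server answers on port 80 at `<galaxy_ip>`:

```bash
# Should return a JSON payload with the running Galaxy version
curl http://<galaxy_ip>/api/version
```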
### Step 3: Moving your single node configuration to a multinode slurm configuration
- Start as many compute nodes as you want for the slurm cluster and gather the following information from each node (see the sketch after this list):
  - IP address (all slurm nodes must be accessible on the same network, i.e. every node can be pinged from any other node)
  - hostname
  - number of CPUs
  - memory (in MB)
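One way to collect these values is to run a few standard commands on each node; a minimal sketch (the exact output of `free` and `hostname -I` may vary slightly with your distribution):

```bash
hostname                              # node hostname
hostname -I                           # node IP address(es)
nproc                                 # number of CPUs
free -m | awk '/^Mem:/ {print $2}'   # total memory in MB
```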
#### Step 3-1
Adapt the inventory file `slurm-kickstart` in the `inventory_files` folder:
```ini
[slurm_master]
# adapt the following line to the IP address and ssh user of the slurm master node
192.54.201.102 ansible_ssh_user=root ansible_ssh_private_key_file="~/.ssh/mysshkey"

[slurm_slave]
# adapt the following lines to the IP addresses and ssh users of the slurm slave nodes
192.54.201.98 ansible_ssh_user=root ansible_ssh_private_key_file="~/.ssh/mysshkey"
192.54.201.99 ansible_ssh_user=root ansible_ssh_private_key_file="~/.ssh/mysshkey"
192.54.201.101 ansible_ssh_user=root ansible_ssh_private_key_file="~/.ssh/mysshkey"
```
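Before going further, it is worth verifying that Ansible can actually reach every host in this inventory; a minimal check, run from the GalaxyKickStart directory:

```bash
# Every host should answer with "pong"
ansible all -i inventory_files/slurm-kickstart -m ping
```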
#### Step 3-2

Adapt the group_vars file `slurm_master` in the `group_vars` folder, using the information gathered in Step 3:
```yaml
# nfs sharing
cluster_ip_range: "0.0.0.0/24" # replace with your IP network range

# slave node specifications, adapt to your set of slave nodes
slave_node_dict:
  - {hostname: "slave-1", CPUs: "2", RealMemory: "7985"}
  - {hostname: "slave-2", CPUs: "2", RealMemory: "7985"}
  - {hostname: "slave-3", CPUs: "2", RealMemory: "7985"}
```
#### Step 3-3

Adapt the group_vars file `slurm_slave` in the `group_vars` folder:
```yaml
# adapt the following variable to the master slurm node IP address
master_slurm_node_ip: "192.54.201.102"
```
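Before running the playbook, it is worth confirming that each slave can actually reach this master address (the network requirement stated at the top of Step 3); for example:

```bash
# Run from each slave node; replace with your master node IP
ping -c 3 192.54.201.102
```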
#### Step 3-4

Run the `galaxyToSlurmCluster.yml` playbook from the GalaxyKickStart directory:

```bash
ansible-playbook -i inventory_files/slurm-kickstart galaxyToSlurmCluster.yml
```
- Note that if you configure multiple slave nodes without prior ssh key authentication, you can run the same command with the variable `ANSIBLE_HOST_KEY_CHECKING` set to `False`:

```bash
ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -i inventory_files/slurm-kickstart galaxyToSlurmCluster.yml
```
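If you prefer not to prefix every command, the same behaviour can be set persistently in an `ansible.cfg` file; this is a standard Ansible option, not something specific to GKS2slurm:

```ini
# ansible.cfg, in the GalaxyKickStart directory
[defaults]
host_key_checking = False
```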
## Checking slurm installation

Connect to your master node as root and type `sinfo`.

Refer to the slurm documentation for further investigation and control.
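With the three-slave example above, a healthy cluster would show all nodes in the `idle` state, and a trivial test job should run on every node; a sketch of what to expect (the partition name depends on the slurm.conf generated by the playbook, shown here as a hypothetical `debug` partition):

```bash
sinfo
# PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
# debug*       up   infinite      3   idle slave-[1-3]

# Run `hostname` on all three nodes through slurm
srun -N 3 hostname
```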