= {{PAGENAME}} =

__TOC__

==Quick Links==

* Main Site - [http://biocluster.igb.illinois.edu http://biocluster.igb.illinois.edu]
* Request Account - [http://www.igb.illinois.edu/content/biocluster-account-form http://www.igb.illinois.edu/content/biocluster-account-form]
* Cluster Accounting - [https://bioapps3.igb.illinois.edu/accounting/ https://bioapps3.igb.illinois.edu/accounting/]
* Cluster Monitoring - [https://bioapps3.igb.illinois.edu/ganglia/ https://bioapps3.igb.illinois.edu/ganglia/]
* SLURM Script Generator - [http://www-app.igb.illinois.edu/tools/slurm/ http://www-app.igb.illinois.edu/tools/slurm/]
* Biocluster Applications - [https://help.igb.illinois.edu/Biocluster_Applications https://help.igb.illinois.edu/Biocluster_Applications]
* Biocluster Introduction Presentation - [https://help.igb.illinois.edu/images/d/df/Intro_to_Biocluster_Spring_2022.pptx Intro_to_Biocluster_Spring_2022.pptx]

==Description==

Biocluster is the High Performance Computing (HPC) resource for the Carl R. Woese Institute for Genomic Biology (IGB) at the University of Illinois at Urbana-Champaign (UIUC). With 2824 cores and over 27.7 TB of RAM, Biocluster offers a mix of RAM and CPU configurations across its nodes to serve the varied computation needs of the IGB and the bioinformatics community at UIUC. For storage, Biocluster has 1.3 Petabytes on its GPFS filesystem for reliable, high-speed data transfers within the cluster. Networking in Biocluster is 1, 10, or 40 Gigabit Ethernet, depending on the class of node and its data transfer needs.

* The Biocluster is not an authorized location to store '''HIPAA''' data.
* '''If you need to update the CFOP associated with your account, please send an email with the new CFOP to help@igb.illinois.edu.'''

==Cluster Specifications==

{| border="1" class='wikitable' width="1200" cellspacing="1" cellpadding="0" align="center"
|-
!|Queue Name
!|Nodes
!|Cores (CPUs) per Node
!|Memory
!|Networking
!|Scratch Space (/scratch)
!|GPUs
|-
||normal (default)
||6 Supermicro
||128 AMD EPYC 7543
||2TB
||10 Gigabit Ethernet
||7TB NVMe
||
|-
||gpu
||1 Supermicro
||28 Intel Xeon E5-2680 @ 2.4GHz
||256GB
||1 Gigabit Ethernet
||1TB SSD
||4 NVIDIA GeForce GTX 1080 Ti
|-
||classroom
||5 Supermicro
||72 Intel Xeon Gold 6150 @ 2.70GHz
||1.2TB
||10 Gigabit Ethernet
||8TB SSD
||
|}

== Storage ==
=== Information ===

* The storage system is a GPFS filesystem with 1.3 Petabytes of total disk space, holding 2 copies of the data. This data is '''NOT''' backed up.
* The data is spread across 8 GPFS storage nodes.

===Cost===
On April 1, 2021, CNRG was informed by campus that we are required to bill external users who pay with a credit card an external rate. This rate was given to us by campus and is obtained by adding the 31.7% F&A rate and the standard 2.3% credit card fee. The external rate is only charged to users paying with a credit card.

{| border="1" class='wikitable' width="600" cellspacing="1" cellpadding="0" align="center"
|-
!|Internal Cost (Per Terabyte Per Month)
!|External Cost (Per Terabyte Per Month)
|-
||$8.75
||$11.73
|}

=== Calculate Usage (/home) ===
* Each month, you will receive a bill for your monthly usage. We take a snapshot of usage daily, then average the 95th percentile of the daily snapshots to get an average usage for the month.
* You can calculate your usage using the du command; an example is below. The result will be double what you are billed, since there are 2 copies of the data, so make sure to divide by 2.
<pre>du -h /home/a-m/username</pre>
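A quick worked example of the billing arithmetic (the figures below are hypothetical, and the -s flag just prints the total):

<pre>$ du -sh /home/a-m/username
4.0T    /home/a-m/username

# Billed usage  = 4.0 TB / 2 copies        = 2.0 TB
# Internal cost = 2.0 TB x $8.75 per month = $17.50 per month
</pre>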

=== Calculate Usage (/private_stores) ===
* These are private data storage nodes. They are not billed monthly.
* The filesystems are XFS shared over NFS.
* To calculate usage, use the du command:
<pre>du -h /private_stores/shared/directory</pre>

==Queue Costs==
The cost of each job depends on which queue it is submitted to. The queues and their rates are listed below. Usage is billed by the second, but the rates below are expressed as the cost per day of use so that they are easier to understand. For standard computation, the CPU cost and the memory cost are compared and the higher of the two is billed. For GPU jobs, the GPU cost is added to the higher of the CPU or memory cost.

{| border="1" class='wikitable' width="1400" cellspacing="1" cellpadding="0" align="center"
|-
!|Queue Name
!|CPU Cost (per CPU per day)
!|External CPU Cost
!|Memory Cost (per GB per day)
!|External Memory Cost
!|GPU Cost
!|External GPU Cost
|-
||normal (default)
||$1.19
||$1.59
||$0.08
||$0.09
||NA
||NA
|-
||gpu
||$2.00
||$2.68
||$0.44
||$0.59
||$2.00
||$2.68
|}

On April 1, 2021, CNRG was informed by campus that we are required to bill external users who pay with a credit card an external rate. This rate was given to us by campus and is obtained by adding the 31.7% F&A rate and the standard 2.3% credit card fee. The external rate is only charged to users paying with a credit card.
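A worked example of the billing rule (the job parameters are hypothetical): a job on the normal queue reserves 4 CPUs and 64 GB of RAM for 12 hours (0.5 day).

<pre>CPU cost    = 4 CPUs x $1.19/day x 0.5 day = $2.38
Memory cost = 64 GB  x $0.08/day x 0.5 day = $2.56

# The larger of the two is billed, so this job costs $2.56.
</pre>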

==Gaining Access==

* Please fill out the form at [http://www.igb.illinois.edu/content/biocluster-account-form http://www.igb.illinois.edu/content/biocluster-account-form] to request access to the Biocluster.

==Cluster Rules==

* '''Running jobs on the head node or login nodes is strictly prohibited.''' Running jobs on the head node could crash the entire cluster and affect everyone's jobs. Any program found running on the head node will be stopped immediately and your account could be locked. You can start an interactive session to log into a node and run programs manually.
* '''Installing Software:''' Please email help@igb.illinois.edu for any software requests. Compiled software will be installed in /home/apps. If it is a standard Red Hat package (rpm), it will be installed in its default location on the nodes.
* '''Creating or Moving over Programs:''' Programs you create or move to the cluster should first be tested by you outside the cluster for stability. Once your program is stable, it can be moved over to the cluster for use. Unstable programs that cause problems with the cluster can result in your account being locked. Programs should only be added by CNRG personnel and not compiled in your home directory.
* '''Reserving Memory:''' SLURM allows you to specify the amount of memory your program will use. If your job tries to use more memory than you have reserved, the job will run out of memory and die, so make sure to specify the correct amount of memory.
* '''Reserving Nodes and Processors:''' For each job, you must reserve the correct number of nodes and processors. By default you are reserved 1 processor on 1 node. If you are running a multi-processor or MPI job, you need to reserve the appropriate amount; see the sketch after this list. If you do not reserve the correct amount, the cluster will confine your job to that limit, increasing its runtime.
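A minimal sketch of the #SBATCH lines used to reserve resources (the values are placeholders; the full list of options is in the SLURM Parameters Explanations section below):

<pre>#!/bin/bash
#SBATCH -N 1          # reserve 1 node
#SBATCH -n 4          # reserve 4 CPUs
#SBATCH --mem=8G      # reserve 8 GB of RAM

# your multi-threaded command goes here, run with 4 threads to match the reservation
</pre>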

==How To Log Into The Cluster==

* You will need to use an SSH client to connect.
* '''NOTICE:''' The login hostname is '''biologin.igb.illinois.edu'''

===On Windows===

* You can download PuTTY from [http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html]
* Install PuTTY and run it. In the Host Name input box, enter '''biologin.igb.illinois.edu'''

[[File:PUTTYbiologin.PNG|400px]]

* Hit Open and log in using your IGB account credentials.

===On Mac OS X===

* Simply open the terminal under Go >> Utilities >> Terminal
* Type in '''ssh username@biologin.igb.illinois.edu''' where username is your NetID.
* Hit the Enter key and type in your IGB password.

==How To Submit A Cluster Job==

* The cluster runs the '''SLURM''' queuing and resource management system.
* All jobs are submitted to SLURM, which distributes them automatically to the nodes.
* You can find all of the parameters that SLURM uses at [https://slurm.schedmd.com/quickstart.html https://slurm.schedmd.com/quickstart.html]
* You can use our SLURM Script Generator to help you learn to generate job scripts: [http://www-app.igb.illinois.edu/tools/slurm/ http://www-app.igb.illinois.edu/tools/slurm/]

===Create a Job Script===

* You must first create a SLURM job script file in order to tell SLURM how and what to execute on the nodes.
* Type the following into a text editor and save the file as '''test.sh'''

<pre>#!/bin/bash
#SBATCH -p normal
#SBATCH --mem=1g
#SBATCH -N 1
#SBATCH -n 1

sleep 20
echo "Test Script"
</pre>
* You just created a simple SLURM job script.
* To submit the script to the cluster, use the sbatch command:
<pre>sbatch test.sh</pre>
* Line-by-line explanation:
** '''#!/bin/bash''' - tells Linux this is a bash program and it should use a bash interpreter to execute it.
** '''#SBATCH''' - lines are SLURM parameters; for explanations of these, please scroll down to the SLURM Parameters Explanations section.
** '''sleep 20''' - sleep for 20 seconds (only used to simulate processing time for this example)
** '''echo "Test Script"''' - output some text (written to the job's output file) when the job completes (simulates output for this example)
* For example, if you would like to run a BLAST job, you can simply replace the last two lines with the following:

<pre>module load BLAST
blastall -p blastn -d nt -i input.fasta -e 10 -o output.result -v 10 -b 5 -a 5
</pre>
* Note: the module commands are explained under the '''Environment Modules''' section.
 
  
====SLURM Parameters Explanations:====

* To view all possible parameters:
** Run '''man sbatch''' at the command line
** Go to [https://slurm.schedmd.com/sbatch.html https://slurm.schedmd.com/sbatch.html] to view the man page online

{| border="1" class='wikitable'
|-
!|Command
!|Description
|-
||#SBATCH -p PARTITION
||Run the job on a specific queue/partition. This defaults to the "normal" queue.
|-
||#SBATCH -D /tmp/working_dir
||Run the script from the /tmp/working_dir directory. This defaults to the directory you are currently in.
|-
||#SBATCH -J ExampleJobName
||Name the job ExampleJobName.
|-
||#SBATCH -e /path/to/errorfile
||Split off the error stream to this file. By default, output and error streams are placed in the same file.
|-
||#SBATCH -o /path/to/outputfile
||Split off the output stream to this file. By default, output and error streams are placed in the same file.
|-
||#SBATCH --mail-user username@illinois.edu
||Send an e-mail to the specified address with job information.
|-
||#SBATCH --mail-type BEGIN,END,FAIL
||Specifies when to send an e-mail. You can select multiple of these with a comma-separated list; many other options exist.
|-
||#SBATCH -N X
||Reserve X number of nodes.
|-
||#SBATCH -n X
||Reserve X number of CPUs.
|-
||#SBATCH --mem=XG
||Reserve X gigabytes of RAM for the job.
|-
||#SBATCH --gres=gpu:X
||Reserve X NVIDIA GPUs. (Only on GPU queues)
|}
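A sketch that combines several of the options above into a single job script (the job name, e-mail address, and file paths are placeholders; blastall's '''-a 4''' matches the 4 reserved CPUs):

<pre>#!/bin/bash
#SBATCH -p normal
#SBATCH -J ExampleJobName
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --mem=8G
#SBATCH -o /home/a-m/username/example.out
#SBATCH -e /home/a-m/username/example.err
#SBATCH --mail-user username@illinois.edu
#SBATCH --mail-type BEGIN,END,FAIL

module load BLAST
blastall -p blastn -d nt -i input.fasta -e 10 -o output.result -v 10 -b 5 -a 4
</pre>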
  

===Create a Job Array Script===
Making a new copy of the script and submitting it once for every input data file is time-consuming. An alternative is to create a job array using the '''#SBATCH --array''' option in your job script. The '''#SBATCH --array''' option queues many copies of the same script at once, and you can use '''$SLURM_ARRAY_TASK_ID''' to differentiate between the different jobs in the array; a minimal sketch follows. A detailed example on how to do this is available at [[Job Arrays]].
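A minimal job array sketch (the array range and input file naming are hypothetical; see [[Job Arrays]] for a full example):

<pre>#!/bin/bash
#SBATCH -p normal
#SBATCH --mem=1g
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --array=1-10

# Each of the 10 array tasks handles its own input file,
# e.g. input_1.fasta through input_10.fasta
echo "Processing input_${SLURM_ARRAY_TASK_ID}.fasta"
</pre>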
  

===Start An Interactive Session===

* Use the '''srun''' command if you would like to run a job interactively.

<pre>srun --pty /bin/bash
</pre>
* This will automatically reserve you a slot on one of the compute nodes and start a terminal session on it.
* Closing your terminal window will also kill the processes running in your interactive srun session, so it is better to submit large jobs via non-interactive sbatch.

=== X11 Graphical Applications ===
* To run an application with a graphical interface, you will need to set up an X server on your computer; see [[Xserver Setup]].
* Then add the '''--x11''' parameter to your srun command:
<pre>
srun --x11 --pty /bin/bash
</pre>
  

==View/Delete Submitted Jobs==
===Viewing Job Status===

* To get a simple view of your current running jobs, type:

<pre>squeue -u userid
</pre>
* This command brings up a list of your currently running jobs.
* The first number is the job's ID number.
* Jobs may have different status flags, for example:
** '''R''' = job is currently running
** '''PD''' = job is pending, waiting for resources

* For a more detailed view, type:

<pre>squeue -l</pre>
* This will return a list of all nodes, their slot availability, and your current jobs.

===List Queues===

* Simple view:

<pre>sinfo</pre>
This will show all queues as well as which nodes in those queues are fully used (alloc), partially used (mix), unused (idle), or unavailable (down).
  

===List All Jobs on Cluster With Nodes===
<pre>squeue</pre>

===Deleting Jobs===

* Note: You can only delete jobs which are owned by you.
* To delete a job by job ID number, use '''scancel'''. For example, to delete a job with ID number 5523 you would type:

<pre>scancel 5523
</pre>
* To delete all of your jobs:

<pre>scancel -u userid
</pre>

===Troubleshooting job errors===

* To view detailed information about a job, for example when troubleshooting a job that failed or will not start, use '''scontrol show job''' with the job ID number:

<pre>scontrol show job 23451
</pre>
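If SLURM job accounting is available on the cluster (an assumption), the '''sacct''' command can also summarize a finished job's state and exit code:

<pre>sacct -j 23451 --format=JobID,JobName,State,ExitCode
</pre>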

==Applications==
===Application Lists===

* View a list of installed applications at [[Biocluster Applications]]
* To list the currently installed applications from the command line, run '''module avail'''

===Application Installation===

* Please email '''help@igb.illinois.edu''' to request a new application or a version upgrade.
* The Biocluster uses EasyBuild to build and install software. You can read more about EasyBuild at [https://github.com/easybuilders/easybuild https://github.com/easybuilders/easybuild]
* The Biocluster EasyBuild scripts are located at [https://github.com/IGB-UIUC/easybuild https://github.com/IGB-UIUC/easybuild]
  

===Environment Modules===
* The Biocluster uses the Lmod modules package to manage the software that is installed. You can read more about Lmod at [https://lmod.readthedocs.io/en/latest/ https://lmod.readthedocs.io/en/latest/]
* To use an application, you need to use the '''module''' command to load the settings for that application.
* To load a particular version, for example QIIME/1.9.1, simply run this command:

<pre>module load QIIME/1.9.1
</pre>
* If you would simply like to load the latest version, run the command without the /1.9.1 (version number):

<pre>module load QIIME
</pre>
* To view which modules you have loaded, simply run '''module list''':

<pre>bash-4.1$ module list
Currently Loaded Modules:
  1) BLAST/2.2.26-Linux_x86_64  2) QIIME/1.9.1
</pre>
* When submitting a job using an sbatch script, you will have to add the '''module load QIIME/1.9.1''' line before running QIIME in the script.
* To unload a module, simply run '''module unload''':

<pre>module unload QIIME
</pre>
* To unload all modules:

<pre>module purge
</pre>

=== Containers ===
* The Biocluster supports Singularity for running containers.
* The guide on how to use them is at [[Biocluster Singularity]]; a minimal sketch follows.
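A minimal sketch, assuming you already have a container image file (the image name my_image.sif is a placeholder; see [[Biocluster Singularity]] for the full instructions):

<pre># run a command inside an existing container image
singularity exec my_image.sif cat /etc/os-release
</pre>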
  

=== R Packages ===
* We have a local mirror of [https://cran.r-project.org/ CRAN] and [https://www.bioconductor.org/ Bioconductor]. This allows you to install packages into your home folder through an interactive session.
* To install a package, start an interactive session:
<pre>
srun --pty /bin/bash
</pre>
* Load the R module:
<pre>
module load R/4.4.0-IGB-gcc-8.2.0
</pre>
* Run R:
<pre>
R
</pre>
* For CRAN packages, run install.packages():
<pre>
install.packages('shape')
</pre>
* For Bioconductor packages, the BiocManager package is already installed. You just need to run BiocManager::install() to install a package:
<pre>
BiocManager::install('dada2')
</pre>
* If the package requires external dependencies, you should email us to have it installed centrally.
  

=== Python Packages ===
* Most Python packages on PyPI, [https://pypi.org/ https://pypi.org/], are now precompiled.
* If the package is on PyPI, from the biologin nodes, module load the Python version you want to use:
<pre>
module load Python/3.10.1-IGB-gcc-8.2.0
</pre>
* Then run pip install to install the package:
<pre>
pip install package_name
</pre>
* This might not work if the package, or one of its dependencies, needs to be compiled. If that is the case, email [mailto:help@igb.illinois.edu help@igb.illinois.edu] and we can get it installed.

=== Jupyter ===
* The Biocluster has JupyterHub installed to allow you to run Jupyter notebooks at https://bioapps3.igb.illinois.edu/jupyter
* We have a guide on how to set up custom Jupyter conda environments at [[Biocluster Jupyter]]

=== Alphafold ===
* The Biocluster has Alphafold installed. There are specific instructions that need to be followed to run it on the Biocluster. The guide is at [[Biocluster Alphafold]]

== Mirror Service - Genomic Databases ==
* Biocluster provides mirrors of publicly accessible genomic databases.
* If a database is not installed and it is publicly accessible, email [mailto:help@igb.illinois.edu help@igb.illinois.edu] and we can get it installed.
* If it is a private database, it must be placed in your home folder or a private group or lab folder.
* A list of databases is located at [[Biocluster Mirrors]]
  

==Transferring data files==
===Transferring using SFTP/SCP===
====Using WinSCP====
* Download the WinSCP installation package from http://winscp.net/eng/download.php#download2 and install it.
* Once installed, run WinSCP >> enter biologin.igb.illinois.edu for the Host name >> enter your IGB user name and password and click Login.
[[File:Bioclustertransfer.png|400px]]
* Once you hit "Login", you should be connected to your Biocluster home folder, as shown below.
[[File:Bioclustertransfer2.png|400px]]
* From here you should be able to download or transfer your files.

====Using CyberDuck====
* To download Cyberduck, go to [http://cyberduck.ch http://cyberduck.ch] and click on the large Zip icon to download.
* Once Cyberduck is installed on OS X, you may start the program.
* Click on '''Open Connection.'''
* From the drop-down menu at the top of the popup window, select '''SFTP (SSH File Transfer Protocol)'''
[[File:Cyberduck screenshot sftp.png|400px]]
* Now in the '''Server:''' input box enter '''biologin.igb.illinois.edu''' and for Username and Password enter your IGB credentials.
[[File:Cyberduckbiologin.png|400px]]
* Click '''Connect.'''
* You may now download or transfer your files.
* '''NOTICE:''' Cyberduck by default wants to open multiple connections for transferring files. The Biocluster firewall limits you to 4 connections a minute, which can cause transfers to time out. You can change Cyberduck to only use 1 connection by going to '''Preferences->Transfers->Transfer Files''' and selecting '''Open Single Connection'''.
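
====Using the command line====
On Linux or Mac OS X you can also use the standard scp command. A minimal sketch (the file name is a placeholder and username is your NetID):

<pre>scp myfile.fasta username@biologin.igb.illinois.edu:/home/a-m/username/
</pre>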

===Transferring using Globus===
* The Biocluster has a Globus endpoint set up. The '''Collection Name''' is '''biocluster.igb.illinois.edu'''
* Globus allows the transfer of very large files reliably.
* A guide on how to use Globus is [[Globus|here]].

===Core-Server===
* The core-server is mounted on the biologin nodes at /private_stores/core-server.
* It is read-only, meaning you can only transfer data from the core-server to Biocluster. You cannot transfer any data from Biocluster to the core-server.
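A minimal sketch of copying data from the core-server mount into your home directory (the source path is a placeholder):

<pre>cp -r /private_stores/core-server/path/to/data /home/a-m/username/
</pre>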
  
== References ==

* OpenHPC - [https://openhpc.community/ https://openhpc.community/]
* SLURM Job Scheduler Documentation - [https://slurm.schedmd.com/ https://slurm.schedmd.com/]
* Rosetta Stone of Schedulers - [https://slurm.schedmd.com/rosetta.pdf https://slurm.schedmd.com/rosetta.pdf]
* SLURM Quick Reference - [https://slurm.schedmd.com/pdfs/summary.pdf https://slurm.schedmd.com/pdfs/summary.pdf]
* GPFS Filesystem - [https://en.wikipedia.org/wiki/IBM_Spectrum_Scale https://en.wikipedia.org/wiki/IBM_Spectrum_Scale]
* Lmod Module Homepage - [https://www.tacc.utexas.edu/research-development/tacc-projects/lmod https://www.tacc.utexas.edu/research-development/tacc-projects/lmod]
* Lmod Documentation - [https://lmod.readthedocs.io/en/latest/ https://lmod.readthedocs.io/en/latest/]
