TP 1.2: Single execution on a cluster
Goal: Identify the genes in a transcript FASTA file with the BLAST alignment software (NCBI), running it on the cluster compute nodes.
Simple submission command
We will use sequence alignment with NCBI_Blast+ as a use case.
Interactive job
Question
Connect to a node in interactive mode.
Correct social behavior expected
Never run a calculation on a login node! Use an interactive job or a batch job.
Solution
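A minimal sketch of one possible answer, assuming the cluster is managed by Slurm (the job listing and the slurm-xxxxx.out trace files used later in this tutorial suggest so). The guard around the command is only there to keep the snippet harmless on a machine without Slurm.

```shell
# Run from the login node: ask Slurm for an interactive shell on a compute node.
# --pty attaches your terminal to the remote bash process.
if command -v srun >/dev/null 2>&1; then
    srun --pty bash
else
    echo "srun not found: connect to the cluster login node first" >&2
fi
```

Once the prompt comes back, hostname should print the name of a compute node rather than the login node.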
Prerequisite
Load the NCBI_Blast+ module
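A hedged sketch of the module step. The exact module name is site-specific: bioinfo/NCBI_Blast+ below is a hypothetical name, so list the available modules first and load the one your cluster actually provides.

```shell
# Environment Modules and Lmod both provide the `module` shell function.
if command -v module >/dev/null 2>&1; then
    # `module avail` often prints on stderr, hence the redirection.
    module avail 2>&1 | grep -i blast
    # Hypothetical module name: replace it with one printed by the line above.
    module load bioinfo/NCBI_Blast+
    module list            # the BLAST module should now appear here
else
    echo "module command not found: run this on the cluster" >&2
fi
```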
Run blast
Question
Launch a blast comparing the file contigs.fasta against the ensembl_danio_rerio_pep databank in interactive mode on the cluster.
Your query is a nucleotide sequence and your databank contains proteins, so you need to use the blastx program.
Tip
For more help on blast, type blastx -help
Solution
    # The databank is in /bank/blastdb,
    # however the cluster was configured in a way that you don't need to specify the path.
    blastx -query contigs.fasta -db ensembl_danio_rerio_pep \
        -evalue 10e-10 -out contigs.blastx_dr

Look for running jobs
Question
Open a new terminal and connect to the cluster again. Now check all the jobs running or waiting on the cluster. In particular, check your own job.
Solution
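A minimal sketch, assuming Slurm: squeue lists the queue, and the -u option restricts the listing to one user.

```shell
if command -v squeue >/dev/null 2>&1; then
    squeue                 # every job on the cluster: R = running, PD = pending
    squeue -u "$USER"      # only the jobs that belong to you
else
    echo "squeue not found: run this on the cluster login node" >&2
fi
```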
Question
On which node is your job running?
Solution
1232823 workq bash toto R 0:05 1 node129
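The node name is the last column of the squeue line above. The sketch below extracts it from that sample line with awk, assuming the default squeue column order.

```shell
# Sample line copied from the listing above; the default columns are
# JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON).
line='1232823 workq bash toto R 0:05 1 node129'

# The job identifier is the first column, the node list the last one.
jobid=$(printf '%s\n' "$line" | awk '{print $1}')
node=$(printf '%s\n' "$line" | awk '{print $NF}')
echo "job $jobid runs on $node"
```

On the cluster itself, squeue -h -u "$USER" -o "%i %N" should print the same two fields directly.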
Stop a running job
Question
Kill your job.
Solution
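A minimal sketch, assuming Slurm: scancel takes the job identifier shown in the first column of squeue. The identifier below is only a placeholder taken from the sample listing.

```shell
# Placeholder: replace with the JOBID that squeue shows for your own job.
jobid=1232823

if command -v scancel >/dev/null 2>&1; then
    scancel "$jobid"
    squeue -u "$USER"      # the job should leave the list after a few seconds
else
    echo "scancel not found: run this on the cluster login node" >&2
fi
```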
Batch mode
Question
Use a text editor to create a command file blastn.sh with the same module load and almost the same blast command line (replace blastx with blastn and ensembl_danio_rerio_pep with ensembl_danio_rerio_cdna). The first line of the file must be a shebang line naming the shell, such as #!/bin/bash; sbatch rejects a script that does not start with #!.
Launch it in batch mode.
Solution
File blastn.sh contains three lines: the shebang line, the module load command, and the blastn command line.
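A possible version of the file, written as a sketch. The heredoc only stands in for the text editor; the module name bioinfo/NCBI_Blast+ and the output file name contigs.blastn_dr are assumptions (the latter chosen by analogy with contigs.blastx_dr), so adapt both.

```shell
# Write the three-line command file; the quoted 'EOF' keeps the text verbatim.
cat > blastn.sh <<'EOF'
#!/bin/bash
module load bioinfo/NCBI_Blast+
blastn -query contigs.fasta -db ensembl_danio_rerio_cdna -evalue 10e-10 -out contigs.blastn_dr
EOF

# Display the result to check it before submitting.
cat blastn.sh
```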
Launch it with sbatch blastn.sh
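As a sketch, assuming Slurm and a blastn.sh file in the current directory:

```shell
if command -v sbatch >/dev/null 2>&1; then
    sbatch blastn.sh       # prints "Submitted batch job <jobid>"
else
    echo "sbatch not found: run this on the cluster login node" >&2
fi
```

Keep the job identifier that sbatch prints: it names the trace file (slurm-<jobid>.out) and lets you follow the job with squeue.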
Check running job
Question
Check the execution. When it's over, look at the blast output file and the 2 execution trace files slurm-xxxxx.out.
Has the job finished correctly?
Solution
Batch mode with inline command
Question
Launch the same command without using a file (option --wrap='command').
Check the execution.
When it's over, look at the blast output file and the execution trace file (slurm-xxxxx.out).
Has the job finished correctly?
Solution
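A sketch of the inline form, assuming Slurm. The module name and the output file name are the same assumptions as before, and whether module load works inside the wrapped command depends on how the cluster sets up its shells.

```shell
# Hypothetical module name and output file; adapt both to your cluster.
cmd='module load bioinfo/NCBI_Blast+; blastn -query contigs.fasta -db ensembl_danio_rerio_cdna -evalue 10e-10 -out contigs.blastn_dr'

if command -v sbatch >/dev/null 2>&1; then
    # sbatch wraps the string in a small shell script and submits it.
    sbatch --wrap="$cmd"
else
    echo "sbatch not found: run this on the cluster login node" >&2
fi
```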
Look at the trace file
If you didn't have any error until now, redo the previous submission with an error in the command. Have a look at the trace file.
How many resources
Question
Look at the resources used by previous jobs. In particular, pay attention to CPU and memory usage.
Solution
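One way to answer, assuming Slurm accounting is enabled on the cluster: sacct reports the resources recorded for a finished job. Some sites also install the contributed seff script (seff followed by the job identifier), which may or may not be available here. The job identifier below is a placeholder.

```shell
# Placeholder job identifier: use the one printed by sbatch for your job.
jobid=1232823

if command -v sacct >/dev/null 2>&1; then
    # Elapsed = wall-clock time, TotalCPU = CPU time consumed, MaxRSS = peak memory.
    sacct -j "$jobid" --format=JobID,JobName,Elapsed,TotalCPU,MaxRSS,State
else
    echo "sacct not found: run this on the cluster login node" >&2
fi
```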
where you replace the