Using BLAST on Stampede

The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.

The BLAST suite contains the following programs:

  • nucleotide blast: search a nucleotide database using a nucleotide query
    Algorithms: blastn, megablast, discontiguous megablast
  • protein blast: search a protein database using a protein query
    Algorithms: blastp, psi-blast, phi-blast, delta-blast
  • blastx: search a protein database using a translated nucleotide query
  • tblastn: search a translated nucleotide database using a protein query
  • tblastx: search a translated nucleotide database using a translated nucleotide query

BLAST Installations

There are currently two versions of BLAST (2.2.28 and 2.2.29) and one version of mpiBLAST (1.6.0) installed on Stampede.

For more information about a specific version, use "module spider":

login1$ module spider blast/2.2.28

Running BLAST

Below we detail how to run both implementations of BLAST: ncbi-blast and mpiBLAST. mpiBLAST is the recommended method for large-scale searches, but it does not offer as many input options as ncbi-blast. Users who need that flexibility at large scale should run ncbi-blast instead.

Running with ncbi-blast

The first approach runs BLAST with threads on several nodes at once. The input file must be split into as many pieces as there are nodes in the job, one piece per node. Because having every node read the database from the shared filesystem would create a very high I/O load, the database must be copied to the local disk of each node (/tmp).
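The splitting step can be sketched as follows. This is a minimal sketch, not a TACC-provided tool: the function name split_fasta and the round-robin policy are our own choices, and it assumes the pieces should be named input.0, input.1, and so on (the naming used in the example that follows).

```shell
#!/bin/bash
# split_fasta FILE N: distribute the sequences of FILE round-robin into
# N pieces named input.0 ... input.(N-1) in the current directory.
# The function name and the round-robin policy are illustrative choices.
split_fasta() {
    local fasta="$1" pieces="$2" i=0
    # Remove stale pieces so that a previous run cannot leak into this one.
    while [ "$i" -lt "$pieces" ]; do
        rm -f "input.$i"
        i=$((i + 1))
    done
    # Every header line (starting with ">") begins a new record. Record k
    # goes to piece k mod N, so no sequence is ever cut in half.
    awk -v n="$pieces" '
        /^>/ { piece = "input." (count % n); count++ }
        { if (piece != "") print > piece }
    ' "$fasta"
}
```

For the 10-node example, calling "split_fasta all_queries.fasta 10" (all_queries.fasta is a hypothetical file name) would produce input.0 through input.9. Round-robin assignment balances the number of sequences per piece, not their total length, so pieces can still differ in run time.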

Load the BLAST module to set up the environment:

login1$ module load blast/2.2.29

Let's say we want to use 10 nodes (10 nodes at 16 cores per node equals 160 cores). After splitting the input file, we have 10 files named input.0, input.1, input.2, …, input.9. Rank 0 will work on input.0, rank 1 on input.1, and so on. We need an auxiliary executable file, "task.sh", to do this:

(file: task.sh)

#!/bin/bash

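# Each process determines its own rank (variables the launcher does not set count as 0)
rank=$(( ${PMI_RANK-0} + ${PMI_ID-0} + ${MPIRUN_RANK-0} + ${OMPI_COMM_WORLD_RANK-0} + ${OMPI_MCA_ns_nds_vpid-0} ))

# Copy the database to the local disk of this node
mkdir -p /tmp/DATABASE
lfs cp DATABASE/* /tmp/DATABASE

# Search this rank's piece of the input with 16 threads, writing tabular output
blastp -query input.$rank -db uniref90.fasta -num_threads 16 -outfmt 6 > blast_results.$rank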

First, each task (there are 10 in this example) determines its rank. Next, it creates a folder on the local disk of its node and copies the database there. Once the copy has finished, it runs blastp on the file input.$rank. Since the rank runs from 0 to 9 in this example, the input files must be named exactly input.0, …, input.9. Zero-padded names such as input.00, input.01, …, input.09 will fail. Note that the "-num_threads 16" option tells blastp to use 16 threads, one per core of the node.
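The rank expression in task.sh relies on the shell expansion ${VAR-0}, which yields 0 when VAR is unset. The sketch below isolates that step in a function (get_rank is a name we introduce for illustration). It assumes that the MPI launcher sets exactly one of the five variables; that is an assumption about the launcher, and the sum equals the rank only when it holds.

```shell
#!/bin/bash
# get_rank: the same sum as in task.sh. ${VAR-0} expands to 0 when VAR is
# unset, so variables the launcher does not define contribute nothing.
# Assumption: only one of these variables is set in any given job.
get_rank() {
    echo $(( ${PMI_RANK-0} + ${PMI_ID-0} + ${MPIRUN_RANK-0} + ${OMPI_COMM_WORLD_RANK-0} + ${OMPI_MCA_ns_nds_vpid-0} ))
}
```

With none of the variables set (for example, when testing task.sh on a login node), get_rank prints 0 and the script reads input.0. In a job where the launcher sets PMI_RANK=3, it prints 3 and the script reads input.3.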

Now all we need is to run this script with ibrun. Since each BLAST execution uses 16 threads, we need only one instance of "task.sh" per node. The job submission script must therefore request one parallel task (those started by ibrun) per node, which we achieve by specifying the same value for -n and -N (10 in our example). The batch script for this example is:

#!/bin/bash
#SBATCH -J blast_10
#SBATCH -o blast_10.o%j
#SBATCH -e blast_10.e%j
#SBATCH -n 10
#SBATCH -N 10 
#SBATCH -p normal
#SBATCH -t 0:30:00
#SBATCH -A TG-STA110019S

module load intel
module load blast

export Shared=/tmp/DATABASE/
export BLASTDB=/tmp/DATABASE/
export BLASTMAT=FOO
export Data=FOO
export Local=/tmp
ibrun tacc_affinity sh ./task.sh
cat blast_results.* > blast_results_out_10.txt
rm blast_results.*

Note that we request 10 nodes ("-N") and 10 processes ("-n"); each of these 10 processes uses 16 threads, as stated above. Note also that the environment variables now point to "/tmp/DATABASE", which is on the local disk of each node. Once all 10 tasks have finished, their individual output files are concatenated into "blast_results_out_10.txt".
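The batch script deletes the per-rank files right after concatenating them, so a task that failed without writing its output would go unnoticed. A more defensive merge could look like the sketch below. The function merge_results is hypothetical, not part of BLAST or the TACC tools, and it assumes the blast_results.N naming used in task.sh.

```shell
#!/bin/bash
# merge_results N OUT: concatenate blast_results.0 ... blast_results.(N-1)
# into OUT in numeric rank order, deleting the pieces only if all exist.
merge_results() {
    local ntasks="$1" out="$2" rank=0
    # First pass: refuse to merge if any task left no output file.
    # An empty file is accepted, since a query set may have no hits.
    while [ "$rank" -lt "$ntasks" ]; do
        if [ ! -e "blast_results.$rank" ]; then
            echo "missing blast_results.$rank" >&2
            return 1
        fi
        rank=$((rank + 1))
    done
    # Second pass: append each piece in rank order, then remove it.
    : > "$out"
    rank=0
    while [ "$rank" -lt "$ntasks" ]; do
        cat "blast_results.$rank" >> "$out"
        rm "blast_results.$rank"
        rank=$((rank + 1))
    done
}
```

In the example above, "merge_results 10 blast_results_out_10.txt" would take the place of the cat and rm lines. Unlike the shell glob, which sorts blast_results.10 before blast_results.2, the loop keeps numeric rank order for jobs with more than 10 tasks.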

Running with mpiBLAST

The second approach uses mpiBLAST instead of BLAST. mpiBLAST uses its own database format, so the database must be reformatted before it can be searched. Load the following modules:

login1$ module load intel
login1$ module load mpiblast

TACC staff strongly recommends increasing the number of stripes of the database folder to at least 60 in order to decrease the impact of large parallel jobs on the Lustre filesystem.

Assuming the database will be stored in a folder named DATABASE, run these commands:

login1$ mkdir DATABASE
login1$ lfs setstripe -c 60 DATABASE
login1$ cp original_database.fasta DATABASE/uniref90.fasta

Once the mpiBLAST module has been loaded, reformat the database as follows:

login1$ mpiformatdb -N 200 -i DATABASE/uniref90.fasta -o T

This command formats the database into 200 fragments. It will put those fragments in the same folder as the FASTA database. To specify a different location, use the "-n" option.

Typically, both BLAST and mpiBLAST require a set of variables to be defined in the ".ncbirc" file. However, using this file with a large number of cores leads to timeouts, because all of those cores try to access the file at the same time. The timeouts cause MPI to abort, and the computation fails. To avoid this, define the variables as environment variables instead. Since the job is going to be submitted to the queues, it is a good idea to define them in the batch script. The following variables must be defined:

  • MPIBLAST_Shared
  • Shared
  • BLASTDB
  • Data
  • Local
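Because a missing variable only shows up as a failure once the MPI job is running, it can help to check the variables listed above before launching. The following is a minimal sketch; check_blast_env is a hypothetical helper, not part of mpiBLAST.

```shell
#!/bin/bash
# check_blast_env: return non-zero if any of the variables listed above
# is unset or empty in the exported environment.
check_blast_env() {
    local name missing=0
    for name in MPIBLAST_Shared Shared BLASTDB Data Local; do
        # printenv only sees exported variables, which are the ones that
        # child processes such as mpiblast will see as well.
        if [ -z "$(printenv "$name")" ]; then
            echo "environment variable $name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Placing "check_blast_env || exit 1" in the batch script just before the ibrun line would stop the job immediately, with a readable message, instead of letting it fail inside MPI.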

Finally, the batch script to submit the job is as follows:

#!/bin/bash
#SBATCH -J blast_960
#SBATCH -o blast_960.o%j
#SBATCH -e blast_960.e%j
#SBATCH -n 960 
#SBATCH -p normal
#SBATCH -t 4:00:00
#SBATCH -A iprojectnumber

module load intel
module load mpiblast

export MPIBLAST_Shared=$SCRATCH/00_data/large_dataset/DATABASE/
export Shared=$SCRATCH/00_data/large_dataset/DATABASE/
export BLASTDB=$SCRATCH/00_data/large_dataset/DATABASE/
export Data=/opt/apps/intel13/mvapich2_1_9/mpiblast/1.6.0/ncbi/data
export Local=/tmp
export BLASTMAT=/opt/apps/intel13/mvapich2_1_9/mpiblast/1.6.0/ncbi/data 

ibrun mpiblast -p blastp -d uniref90.fasta -i input_seq.fasta -o blast_results_960.txt --partition-size=200 -m 8

Last update: September 23, 2015