Difference between revisions of "Biocluster Alphafold"
Revision as of 08:33, 17 February 2022
About
- AlphaFold is a highly accurate protein structure prediction program.
- More information at https://github.com/deepmind/alphafold/
How to Run
- Load the alphafold module. This loads AlphaFold, Singularity, and the AlphaFold databases.
module load alphafold/2.1.1
- Run run_singularity.py
run_singularity.py
- The --data-dir parameter should be set to $BIODB, which points to the location of the AlphaFold databases.
- The --
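Because --data-dir depends on $BIODB being defined by the module, a job can check the variable before starting a long prediction. The following is a minimal sketch, not part of the alphafold module: check_biodb is a hypothetical helper name, and it assumes only that $BIODB should be a non-empty path to an existing directory.

```shell
#!/bin/bash
# Hypothetical helper (not provided by the alphafold module):
# verify that $BIODB is set and points to an existing directory.
check_biodb() {
    if [ -z "${BIODB:-}" ]; then
        echo "BIODB is not set; run 'module load alphafold/2.1.1' first" >&2
        return 1
    fi
    if [ ! -d "$BIODB" ]; then
        echo "BIODB does not point to a directory: $BIODB" >&2
        return 1
    fi
    echo "Using AlphaFold databases in $BIODB"
}
```

Calling check_biodb right after the module load line lets a job stop early with a readable message instead of failing inside the container.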
Example Job Script
#!/bin/bash
# ----------------SLURM Parameters----------------
#SBATCH -n 4
#SBATCH -N 1
#SBATCH -p gpu
#SBATCH --gres=gpu:2
#SBATCH --mem 80G
# ----------------Load Modules--------------------
module load alphafold/2.1.1
# ----------------Commands------------------------
run_singularity.py --data-dir $BIODB --cpus $SLURM_NTASKS --use-gpu --db-preset full_dbs --output-dir output \
    --fasta-paths example.fasta
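Before submitting, it can help to see the exact command line the job script above will run. The sketch below prints that command without executing it. print_alphafold_cmd is a hypothetical helper name, and the fallback to 1 CPU when $SLURM_NTASKS is unset is an assumption made so the function also works outside a SLURM allocation. Only flags that appear in the job script are used.

```shell
#!/bin/bash
# Hypothetical dry-run helper: prints the run_singularity.py command
# from the example job script instead of executing it.
print_alphafold_cmd() {
    local fasta="$1"
    local outdir="${2:-output}"      # same default directory as the job script
    # Assumption: fall back to 1 CPU when not inside a SLURM allocation.
    local cpus="${SLURM_NTASKS:-1}"
    printf 'run_singularity.py --data-dir %s --cpus %s --use-gpu --db-preset full_dbs --output-dir %s --fasta-paths %s\n' \
        "$BIODB" "$cpus" "$outdir" "$fasta"
}
```

For example, after loading the module, print_alphafold_cmd example.fasta shows the command with $BIODB and the CPU count already substituted.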
Parameters
- These are all the parameters for run_singularity.py. The same list can be displayed by running run_singularity.py --help
--fasta-paths FASTA_PATHS [FASTA_PATHS ...], -f FASTA_PATHS [FASTA_PATHS ...]
    Paths to FASTA files, each containing one sequence. All FASTA paths must have a unique basename as the basename is used to name the output directories for each prediction.
--is-prokaryote-list IS_PROKARYOTE_LIST [IS_PROKARYOTE_LIST ...]
    Optional for multimer system, not used by the single chain system. This list should contain a boolean for each fasta specifying true where the target complex is from a prokaryote, and false where it is not, or where the origin is unknown. These values determine the pairing method for the MSA.
--max-template-date MAX_TEMPLATE_DATE, -t MAX_TEMPLATE_DATE
    Maximum template release date to consider (ISO-8601 format - i.e. YYYY-MM-DD). Important if folding historical test sets.
--db-preset {reduced_dbs,full_dbs}
    Choose preset model configuration - no ensembling with uniref90 + bfd + uniclust30 (full_dbs), or 8 model ensemblings with uniref90 + bfd + uniclust30 (casp14).
--model-preset {monomer,monomer_casp14,monomer_ptm,multimer}
    Choose preset model configuration - the monomer model, the monomer model with extra ensembling, monomer model with pTM head, or multimer model
--benchmark, -b
    Run multiple JAX model evaluations to obtain a timing that excludes the compilation time, which should be more indicative of the time required for inferencing many proteins.
--use-precomputed-msas
    Whether to read MSAs that have been written to disk. WARNING: This will not check if the sequence, database or configuration have changed.
--data-dir DATA_DIR, -d DATA_DIR
    Path to directory with supporting data: AlphaFold parameters and genetic and template databases. Set to the target of download_all_databases.sh.
--docker-image DOCKER_IMAGE
    Alphafold docker image.
--output-dir OUTPUT_DIR, -o OUTPUT_DIR
    Output directory for results.
--use-gpu
    Enable NVIDIA runtime to run with GPUs.
--gpu-devices GPU_DEVICES
    Comma separated list of devices to pass to NVIDIA_VISIBLE_DEVICES.
--cpus CPUS, -c CPUS
    Number of CPUs to use.
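The --fasta-paths description requires every FASTA path to have a unique basename, since the basename names the output directory. When several files are passed at once, a quick check like the one below can catch collisions before the job starts. This is a sketch under assumptions: check_unique_basenames is a hypothetical helper, and it compares file names with the final extension removed, which is a conservative reading of "basename" (so x.fasta and x.fa are treated as a clash).

```shell
#!/bin/bash
# Hypothetical helper: fail if two FASTA paths share a basename.
# Assumption: names are compared with the final extension removed.
check_unique_basenames() {
    local dups
    dups=$(for path in "$@"; do
               name=$(basename "$path")
               echo "${name%.*}"          # strip the last extension, if any
           done | sort | uniq -d)         # keep only names seen more than once
    if [ -n "$dups" ]; then
        echo "Duplicate FASTA basenames:" >&2
        echo "$dups" >&2
        return 1
    fi
}
```

A job script could call check_unique_basenames with the same file list it passes to --fasta-paths and exit if the check fails.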