Authored by Nikos Pappas
phap - Phage Host Analysis Pipeline
A snakemake workflow that wraps various phage-host prediction tools.
- Uses Singularity containers for execution of all tools. When possible (i.e. the image is not larger than a few GBs), tools and their dependencies are bundled in the same container. This means you do not need to get models or any other external databases.
- Calculates the Last Common Ancestor (LCA) of all tools' predictions per contig.
Current tools
Tool (source) | Publication/Preprint | Comments |
---|---|---|
HTP | Gałan W. et al., 2019 | ok |
RaFAH | Coutinho F. H. et al., 2020 | ok |
vHULK | Amgarten D. et al., 2020 | ok |
VirHostMatcher-Net | Wang W. et al., 2020 | ok |
WIsH | Galiez C. et al., 2017 | ok (unnecessary?) |
Installation
Dependencies
To run the workflow you will need:
- `snakemake > 5.x` (developed with `5.30.1`)
- `singularity >= 3.6` (developed with `3.6.3`)
The following Python packages must also be installed and available in the execution environment:
- `biopython >= 1.78` (developed with `1.78`)
- `ete3 >= 3.1.2` (developed with `3.1.2`)
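One way to confirm that the execution environment satisfies the minimum versions listed above is a small check script. The following is a minimal sketch, not part of phap: the helper names (`as_tuple`, `meets_minimum`, `check_environment`) are hypothetical, and the version comparison is deliberately simplified (pre-release tags such as `rc1` are ignored).

```python
import re
from importlib import metadata

# Minimum versions copied from the list above.
MINIMUMS = {"biopython": "1.78", "ete3": "3.1.2"}


def as_tuple(version):
    """Turn '3.1.2' into (3, 1, 2), keeping only leading numeric components."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
        if match.group() != piece:
            # Stop at suffixes such as 'rc1' or 'post2' (simplification).
            break
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return as_tuple(installed) >= as_tuple(minimum)


def check_environment(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, ok)} for each requirement."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        status = "OK" if ok else "missing or too old"
        print(f"{name}: {installed or 'not installed'} ({status})")
```

Running the script inside the activated environment prints one line per package. It only covers the Python packages; snakemake and singularity can be checked with their own `--version` flags.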
The `ete3.NCBITaxa` class is used to get taxonomy information and calculate the LCA of all predictions, when possible. This requires a `taxa.sqlite` to be available either in its default location (`~/.etetoolkit/taxa.sqlite`) or provided in the config. See more at http://etetoolkit.org/docs/latest/tutorial/tutorial_ncbitaxonomy.html
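To illustrate what an LCA over several predictions means, here is a minimal, database-free sketch. It is not necessarily how phap implements the step: it assumes each tool's prediction has already been expanded into a root-to-taxon lineage (the ordering that `NCBITaxa.get_lineage()` returns), and the abbreviated lineages and the function name `lowest_common_ancestor` are illustrative assumptions.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all lineages, or None.

    Each lineage is a list ordered from the root down to the predicted taxon.
    Empty lineages (e.g. a tool that gave no prediction) are skipped.
    """
    lineages = [lineage for lineage in lineages if lineage]
    if not lineages:
        return None
    lca = None
    # Walk down all lineages in parallel until they diverge.
    for taxa_at_depth in zip(*lineages):
        if len(set(taxa_at_depth)) != 1:
            break
        lca = taxa_at_depth[0]
    return lca


# Illustrative, abbreviated lineages for three hypothetical tool predictions.
ESCHERICHIA = ["root", "Bacteria", "Proteobacteria", "Gammaproteobacteria",
               "Enterobacterales", "Enterobacteriaceae", "Escherichia"]
SALMONELLA = ["root", "Bacteria", "Proteobacteria", "Gammaproteobacteria",
              "Enterobacterales", "Enterobacteriaceae", "Salmonella"]
BACILLUS = ["root", "Bacteria", "Firmicutes", "Bacilli",
            "Bacillales", "Bacillaceae", "Bacillus"]

if __name__ == "__main__":
    print(lowest_common_ancestor([ESCHERICHIA, SALMONELLA]))
    print(lowest_common_ancestor([ESCHERICHIA, SALMONELLA, BACILLUS]))
```

When two tools agree down to the family level, the LCA is that family; adding a prediction from a different phylum pushes the LCA up to the domain. With a real `taxa.sqlite`, the lineages would be lists of NCBI taxids rather than names.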
Conda environment
It is recommended to use a conda environment. The file `environment.txt` can be used to recreate the complete environment used during development.
The provided `environment.txt` contains an explicit list of all packages, produced with
$ conda list -n phap --explicit > environment.txt
This ensures all packages are exactly the same versions/builds, which minimizes the risk of running into dependency issues.
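An explicit list pins each package as a full download URL whose file name encodes `name-version-build`. As a rough sketch of how such a file could be inspected before creating the environment, the snippet below extracts the pinned versions. The helper names and the sample URLs are made up for illustration and are not taken from the actual `environment.txt`.

```python
def parse_explicit_line(line):
    """Return (name, version, build) for a package URL line, else None."""
    line = line.strip()
    if not line or line.startswith("#") or line.startswith("@"):
        return None  # comments, blank lines and the @EXPLICIT marker
    # Drop an optional '#<md5>' suffix, then keep only the file name.
    filename = line.split("#", 1)[0].rsplit("/", 1)[-1]
    for suffix in (".tar.bz2", ".conda"):
        if filename.endswith(suffix):
            filename = filename[: -len(suffix)]
            break
    else:
        return None
    parts = filename.rsplit("-", 2)
    if len(parts) != 3:
        return None
    return tuple(parts)


def pinned_versions(text):
    """Map package name to pinned version for every package line in text."""
    pinned = {}
    for line in text.splitlines():
        parsed = parse_explicit_line(line)
        if parsed is not None:
            name, version, _build = parsed
            pinned[name] = version
    return pinned


# Made-up sample in the shape of an explicit list (URLs are illustrative).
SAMPLE = """\
# platform: linux-64
@EXPLICIT
https://conda.anaconda.org/conda-forge/linux-64/example-lib-1.2.3-h0000000_0.tar.bz2
https://conda.anaconda.org/bioconda/noarch/snakemake-5.30.1-0.tar.bz2
"""

if __name__ == "__main__":
    for name, version in sorted(pinned_versions(SAMPLE).items()):
        print(name, version)
```

To inspect the real file, read it with `open("environment.txt").read()` and pass the text to `pinned_versions`.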
To get a working environment:
# Clone this repo and get in there
$ git clone https://git.science.uu.nl/papanikos/phap.git
$ cd phap
# Note the long notation --file flag; -f will not work.
$ conda create -n phap --file=environment.txt
# Activate it - use the name you gave above, if it is different
$ conda activate phap
# The (phap) prefix shows we have activated it
# Check the snakemake version
(phap) $ snakemake --version
5.30.1