From 3e57f718ea7d919fcb1f43661387f910ded62d67 Mon Sep 17 00:00:00 2001
From: nikos <n.pappas@uu.nl>
Date: Thu, 7 Jan 2021 16:00:26 +0100
Subject: [PATCH] some updates to play well

---
 README.md          | 36 ++++++++++++++++++++++++++----------
 config/config.yaml |  5 ++++-
 2 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/README.md b/README.md
index 2c0c47f..1388632 100644
--- a/README.md
+++ b/README.md
@@ -7,10 +7,12 @@ When possible, tools **and** their dependencies are bundled in
 
 ## Current tools
 
-|Tool name (links to source) | Publication/Preprint |
+|Tool (source) | Publication/Preprint |
 |:------|:------|
-[RaFAh](https://sourceforge.net/projects/rafah/)|[Coutinho FH. et al. 2020](https://www.biorxiv.org/content/10.1101/2020.09.25.313155v1?rss=1)
-[vHuLK](https://github.com/soedinglab/wish)|[Amgarten D, et al., 2020](https://www.biorxiv.org/content/10.1101/2020.12.06.413476v1)
+[RaFAh](https://sourceforge.net/projects/rafah/)|[Coutinho F. H. et al. 2020](https://www.biorxiv.org/content/10.1101/2020.09.25.313155v1?rss=1)
+[vHuLK](https://github.com/LaboratorioBioinformatica/vHULK)|[Amgarten D. et al., 2020](https://www.biorxiv.org/content/10.1101/2020.12.06.413476v1)
+[VirHostMatcher-Net](https://github.com/WeiliWw/VirHostMatcher-Net)|[Wang W. et al., 2020](https://doi.org/10.1093/nargab/lqaa044])
+[WIsH](https://github.com/soedinglab/WIsH)|[Galiez G. et al., 2017](https://academic.oup.com/bioinformatics/article/33/19/3113/3964377)
 
 ## Installation
 
@@ -102,13 +104,19 @@ You can
 - Use `snakemake`'s `--config samplesheet=/path/to/my_samples.csv` when
 executing the wofkflow.
 
-### Models and other data dependencies
+### Models and data dependencies
+* RaFaH, vHULK
 
 For these tools there is no need to pre-download and setup anything - all
-data and software dependencies are pulled with the singularity image.
+data and software dependencies required for running them are bundled within
+the singularity image.
 
-* RaFaH
-* vHuLK
+* VirHostMatcher-Net, WIsH
+
+Databases and models need to be downloaded from the VirHostMatcher data repo
+([see here](https://github.com/WeiliWw/VirHostMatcher-Net#downloading)).
+WIsH models for the 62,493 host genomes used in their paper are also provided
+and are used here for WIsH predictions.
 
 ## Usage
 
@@ -116,9 +124,18 @@ Basic:
 ```
 # From within this directory
 # Make sure you have defined a samplesheet
-(hp)$ snakemake --use-singularity -j16
+(hp)$ snakemake --use-singularity -j16 \
+    --singularity-args "-B /path/to/databases/:/data"
 ```
 
+where `/path/to/database/` is the directory containing tables, WIsH models and
+CRISPR blasts databases
+
+> Note
+>
+> Binding the dir like this is required if the files are stored in some
+> shared location and not on the local filesystem.
+
 ## Output
 
 All output is stored under a `results` directory within the main workdir.
@@ -163,8 +180,7 @@ NC_023719.1 Bacillus 0.0012575098 Bacillus 0.55
 ### Per tool
 * `tmp` directory
   * Contains one fasta file per input genome, along with other intermediate
-files necessary for a smooth execution of the workflow **and the raw output
-of vhulk (under genomes/results/)**.
+files necessary for a smooth execution of the workflow.
 
 * `rafah`
   * All files prefixed with `<sample_id>_` are the rafah's raw output
diff --git a/config/config.yaml b/config/config.yaml
index d2af963..f54b07b 100644
--- a/config/config.yaml
+++ b/config/config.yaml
@@ -1 +1,4 @@
-samplesheet: "/path/to/samples.csv"
+samplesheet: "/home/nikos/Projects/phap/samples.csv"
+
+vhmnet:
+  data_dir: "/home/nikos/nikos2/databases/vhm_net/data"
-- 
GitLab