Subject: [PATCH] added procedures for building containers

 resources/singularity/rafah/         |  57 ++
 resources/singularity/rafah/rafah.def         |  58 ++
 resources/singularity/vhmnet/        |  53 ++
 resources/singularity/vhmnet/environment.txt  | 112 ++++
 resources/singularity/vhulk/         |  73 +++
 resources/singularity/vhulk/     | 516 ++++++++++++++++++
 resources/singularity/vhulk/vhulk.def         |  56 ++
 .../singularity/vhulk/vhulk_explicit.txt      | 375 +++++++++++++
 resources/singularity/wish/          |  32 ++
 resources/singularity/wish/wish.def           |  45 ++
 10 files changed, 1377 insertions(+)
 create mode 100644 resources/singularity/rafah/
 create mode 100644 resources/singularity/rafah/rafah.def
 create mode 100644 resources/singularity/vhmnet/
 create mode 100644 resources/singularity/vhmnet/environment.txt
 create mode 100644 resources/singularity/vhulk/
 create mode 100755 resources/singularity/vhulk/
 create mode 100644 resources/singularity/vhulk/vhulk.def
 create mode 100644 resources/singularity/vhulk/vhulk_explicit.txt
 create mode 100644 resources/singularity/wish/
 create mode 100644 resources/singularity/wish/wish.def

+# RaFaH
+Available from `library://papanikos_182/default/rafah:0.1`
+## Procedure
+0. Create a new dir for the build context and get in there
+1. Grab the necessary dependencies from the RaFaH repo
+$ wget
+$ tar -xzvf whatever_is_downloaded
+> Downloading with wget is slow; I used another copy that was available locally
+2. Edit RaFaH script to point to appropriate locations in the container
+  - Replace shebang (line 1) with `#!/usr/bin/env perl`
+  - Change line 37 to `my $valid_domains_file = "/opt/resources/HP_Ranger_Model_3_Valid_Cols.txt";`
+  - Change line 38 to `my $hmm_models_prefix = "/opt/resources/HP_Ranger_Model_3_Filtered_0.9_Valids.hmm";`
+  - Change line 39 to `my $r_script_file_name = "/src/Predict_Host_RF.R";`
+  - Change line 40 to `my $r_model_file_name = "/opt/resources/MMSeqs_Clusters_Ranger_Model_1+2+3_Clean.RData";`
+3. Make executable
+$ chmod +x
+4. Bundle the tables and models in a tar archive. Leave out the perl and R
+scripts and `HP_Ranger_Model_3_Valid_Cols.txt`; the definition file copies those separately
+$ tar -czvf rafah_resources.tar.gz ./*hmm* MMSeqs_Clusters_Ranger_Model_1+2+3_Clean.RData
+5. Build the image with the definition file
+$ sudo singularity build rafah.sif rafah.def
+6. [Optional] Sign the image
+$ singularity sign rafah.sif
+7. Push it to the cloud
+$ singularity push rafah.sif library://papanikos_182/default/rafah:0.1
+## Usage
+$ singularity run library://papanikos_182/default/rafah:0.1 -h
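Step 2 of the procedure can be scripted instead of edited by hand. The sketch below is one way to do it in Python, assuming the line numbers listed in step 2 match the copy of the RaFaH script that was downloaded. `REPLACEMENTS`, `patch_lines` and `patch_script` are hypothetical names used for illustration; they are not part of RaFaH.

```python
from pathlib import Path

# 1-based line number -> replacement text, copied from step 2 of the procedure.
REPLACEMENTS = {
    1: "#!/usr/bin/env perl",
    37: 'my $valid_domains_file = "/opt/resources/HP_Ranger_Model_3_Valid_Cols.txt";',
    38: 'my $hmm_models_prefix = "/opt/resources/HP_Ranger_Model_3_Filtered_0.9_Valids.hmm";',
    39: 'my $r_script_file_name = "/src/Predict_Host_RF.R";',
    40: 'my $r_model_file_name = "/opt/resources/MMSeqs_Clusters_Ranger_Model_1+2+3_Clean.RData";',
}


def patch_lines(text, replacements):
    """Return text with the given 1-based lines replaced."""
    lines = text.splitlines()
    for number, new_line in replacements.items():
        if not 1 <= number <= len(lines):
            raise ValueError(f"script has no line {number}")
        lines[number - 1] = new_line
    return "\n".join(lines) + "\n"


def patch_script(path, replacements=REPLACEMENTS):
    """Rewrite the script at path in place (assumption: line numbers match)."""
    script = Path(path)
    script.write_text(patch_lines(script.read_text(), replacements))
```

`patch_script` overwrites the file, so it is worth running `diff` against the untouched copy before building: a different release of the script may have shifted the lines.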
+Bootstrap: docker
+From: continuumio/miniconda3
+
+%labels
+    Author "Felipe Coutinho"
+    Maintainer papanikos_182
+    Version 0.1
+    Source
+    Preprint
+
+%files
+    /opt/conda/bin/
+    Predict_Host_RF.R /src/
+    HP_Ranger_Model_3_Valid_Cols.txt /opt/HP_Ranger_Model_3_Valid_Cols.txt
+    rafah_resources.tar.gz /opt/rafah_resources.tar.gz
+
+%environment
+    export PATH=/src:$PATH
+
+%post
+    # Update OS
+    apt update && apt upgrade -y
+    # Set up resources dirs for running RaFAH
+    mkdir -p /opt/resources
+    mv /opt/HP_Ranger_Model_3_Valid_Cols.txt /opt/resources
+    tar -xzvf /opt/rafah_resources.tar.gz -C /opt/resources && rm /opt/rafah_resources.tar.gz
+    # Install dependencies
+    conda config --add channels conda-forge
+    conda config --add channels defaults
+    conda config --add channels bioconda
+    conda config --add channels r
+    conda update -y conda
+    conda install -y mamba
+    mamba install -y r=3.6 r-ranger perl-bioperl hmmer=3.1b2 prodigal=2.6
+    conda clean --all -y
+
+%help
+    A container for RaFAH v0.1.
+    The main perl script is in /src.
+    The helper R script for the models is in /src.
+    Required data dependencies are stored in /opt/resources.
+
+    To show the RaFAH help menu from this container, execute
+    $ singularity exec library://papanikos_182/default/rafah:0.1 perl --help
+
+    To run an analysis for all genomes stored in /path/to/genomes/ (the trailing slash is required),
+    with all files ending in .fasta, and store the results under
+    /path/to/outdir/prefix (several files will be written in /path/to/outdir, prefixed with prefix_):
+    $ singularity exec library://papanikos_182/default/rafah:0.1 \
+    perl \
+    --genomes_dir /path/to/genomes/ \
+    --extension fasta \
+    --file_prefix /path/to/outdir/prefix
+# VirHostMatcher-Net
+Available from `library://papanikos_182/default/vhmnet:0.1`
+* Note that data dependencies are not included in the container.
+You need to get them with
+$ wget -c
+$ tar xf data_VirHostMatcher-Net_both_modes.tar.gz
+Unpacked, the models and genomes take up 125G.
+* My fork is used as the source. Its main change is that it lets you define
+the directory where the data are located.
+## Procedure
+0. Create a new dir for build context and get in there
+1. Create a conda env with its requirements, verify that it works and export
+it explicitly
+$ conda create -n vhmnet numpy pandas biopython blast
+...Test it runs...
+$ conda list -n vhmnet --explicit > environment.txt
+2. Build the image with the definition file
+$ sudo singularity build vhmnet.sif vhmnet.def
+3. [Optional] Sign the image
+$ singularity sign vhmnet.sif
+4. Push it to the cloud
+$ singularity push vhmnet.sif library://papanikos_182/default/vhmnet:0.1
+## Usage
+$ singularity run library://papanikos_182/default/vhmnet:0.1 \
+ -h
+# vHULK
+Available from `library://papanikos_182/default/vhulk:0.1`
+## Procedure
+0. Create a new dir for build context and get in there
+1. Grab the source from the vHULK repo
+$ git clone
+2. Create a conda env as per their suggestions and export it explicitly with
+$ conda list -n vhulk --explicit > vhulk_explicit.txt
+The definition file uses this list to recreate the env inside the container
+3. Grab its data dependencies with the auxiliary script they provide 
+$ conda activate vhulk
+(vhulk) $ python ./
+4. Extract (for testing) and rebundle under `vhulk_resources.tar.gz`. This 
+archive contains
+$ tar -tvf vhulk_resources.tar.gz
+drwxr-xr-x nikos/binf        0 2020-12-15 12:21 models/
+-rw-r----- nikos/binf 70143964 2020-09-15 00:52 models/model_species_total_fixed_relu_08mar_2020.h5
+-rw-r----- nikos/binf 70165848 2020-09-15 00:52 models/model_genus_total_fixed_relu_08mar_2020.h5
+-rw-r--r-- nikos/binf 175382277 2020-12-15 12:19 models/all_vogs_hmm_profiles_feb2018.hmm.h3f
+-rw-r----- nikos/binf  70165848 2020-09-15 00:52 models/model_genus_total_fixed_softmax_01mar_2020.h5
+-rw-r----- nikos/binf  70143964 2020-09-15 00:52 models/model_species_total_fixed_softmax_01mar_2020.h5
+-rw-r--r-- nikos/binf    333265 2020-12-15 12:19 models/all_vogs_hmm_profiles_feb2018.hmm.h3i
+-rw-r--r-- nikos/binf 374382899 2020-12-15 12:19 models/all_vogs_hmm_profiles_feb2018.hmm.h3p
+-rw-r--r-- nikos/binf 767855419 2020-12-15 12:19 models/all_vogs_hmm_profiles_feb2018.hmm
+-rw-r--r-- nikos/binf 317721714 2020-12-15 12:19 models/all_vogs_hmm_profiles_feb2018.hmm.h3m
+5. Modify the main script so that its paths point to the appropriate locations
+in the container. The remaining changes are formatting done with black.
+Also make it executable.
+The copy in here is the one used in the container.
+6. Build the image with the definition file
+$ sudo singularity build vhulk.sif vhulk.def
+7. [Optional] Sign the image
+$ singularity sign vhulk.sif
+8. Push it to the cloud
+$ singularity push vhulk.sif library://papanikos_182/default/vhulk:0.1
+## Usage
+$ singularity run library://papanikos_182/default/vhulk:0.1 -h
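vHULK treats every FASTA file in the input directory as a separate genome or bin, so a single multi-FASTA of assembled contigs has to be split into one file per sequence first. The following is a minimal sketch of that preprocessing step using only the standard library. `split_fasta` is a hypothetical helper, not part of vHULK, and it assumes that the first word of each header is unique (a repeated ID would overwrite the earlier file).

```python
import re
from pathlib import Path


def split_fasta(multi_fasta, out_dir):
    """Write each record of a multi-FASTA file to its own .fasta file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    handle = None
    with open(multi_fasta) as source:
        for line in source:
            if line.startswith(">"):
                if handle is not None:
                    handle.close()
                # Name the file after the first word of the header,
                # replacing characters that are awkward in file names.
                words = line[1:].split()
                record_id = words[0] if words else "unnamed"
                safe_id = re.sub(r"[^A-Za-z0-9_-]", "_", record_id)
                target = out_dir / f"{safe_id}.fasta"
                handle = open(target, "w")
                written.append(target)
            if handle is not None:
                handle.write(line)
    if handle is not None:
        handle.close()
    return written
```

The resulting directory can then be passed to the container with `-i`. Note that the script skips any genome shorter than 5000 bp, so short contigs will not get a prediction.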
+#!/usr/bin/env python
+# coding: utf-8
+# Edited May, 27th 2020
+## This is vHULK: viral Host Unveiling Kit
+# Developed by Deyvid Amgarten and Bruno Iha
+# Creative commons
+# Import required Python modules
+import numpy as np
+import pandas as pd
+from Bio import SeqIO
+import re
+import sys
+import os
+os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
+import subprocess
+import datetime
+import argparse
+import warnings
+import csv
+warnings.filterwarnings("ignore", category=DeprecationWarning)
+warnings.simplefilter(action="ignore", category=FutureWarning)
+from time import gmtime, strftime
+from tensorflow.keras.layers import Dense, Activation, LeakyReLU, ReLU
+from tensorflow.keras.models import load_model
+from scipy.special import entr
+# Function declarations
+# Run prokka
+def run_prokka(binn, input_folder, threads):
+    # Check the fasta format
+    prefix = get_prefix(binn)
+    # Filehandle where the output of prokka will be saved
+    # output_prokka = open(str(prefix)+'prokka.output', mode='w')
+    # Full command line for prokka
+    command_line = (
+        "prokka --kingdom Viruses --centre X --compliant --gcode 11 --cpus "
+        + threads
+        + " --force --quiet --prefix prokka_results_"
+        + str(prefix)
+        + " --fast --norrna --notrna --outdir "
+        + input_folder
+        + "results/prokka/"
+        + str(prefix)
+        + " --cdsrnaolap --noanno "
+        + input_folder
+        + str(binn)
+    ).split()
+    return_code = subprocess.call(command_line, stderr=subprocess.PIPE)
+    # Check whether prokka ran smoothly
+    if return_code == 1:
+        print("Prokka may not be correctly installed. Please check that.")
+        sys.exit(1)
+# Get prefix from bins
+def get_prefix(binn):
+    # Strip only a trailing .fasta/.fa extension; the dot is escaped so that
+    # it does not match arbitrary characters elsewhere in the file name
+    return re.sub(r"\.(fasta|fa)$", "", binn, flags=re.IGNORECASE)
+# Extract Matrix
+### Main code
+# Set arguments
+# Modification to use argparse
+parser = argparse.ArgumentParser(
+    description="Predict phage draft genomes in metagenomic bins."
+)
+parser.add_argument(
+    "-i",
+    action="store",
+    required=True,
+    dest="input_folder",
+    help="Path to a folder containing metagenomic bins in .fa or .fasta format (required!)",
+)
+parser.add_argument(
+    "-t",
+    action="store",
+    dest="threads",
+    default="1",
+    help="Number of CPU threads to be used by Prokka and hmmscan (default=1)",
+)
+args = parser.parse_args()
+# Greeting message
+print("\n**Welcome v.HULK, a toolkit for phage host prediction!\n")
+# Verify databases
+if not os.path.isfile("/opt/vHULK/models/all_vogs_hmm_profiles_feb2018.hmm"):
+    print(
+        "**Your database and models are not set. Please, run: python \n"
+    )
+    sys.exit(1)
+# Create Filehandle for warnings
+# warnings_handle = open('marvel-warnings.txt', 'w')
+# Important variables
+input_folder = args.input_folder
+threads = args.threads
+# Fix input folder path if missing '/'
+if not re.search("/$", input_folder):
+    input_folder = input_folder + "/"
+# Take the input folder and list all multifasta (bins) contained inside it
+# print(input_folder)
+list_bins_temp = os.listdir(input_folder)
+list_bins = []
+count_bins = 0
+# Empty folder
+if list_bins_temp == []:
+    print("**Input folder is empty. Exiting...\n")
+    sys.exit(1)
+else:
+    for each_bin in list_bins_temp:
+        if re.search(r"\.fasta$", each_bin, re.IGNORECASE):
+            list_bins.append(each_bin)
+            count_bins += 1
+        elif re.search(r"\.fa$", each_bin, re.IGNORECASE):
+            list_bins.append(each_bin)
+            count_bins += 1
+if count_bins == 0:
+    print(
+        "**There is no valid genome inside the input folder (%s).\n\
+        Genome or bins should be in '.fasta' or '.fa' format.\nExiting..."
+        % input_folder
+    )
+    sys.exit(1)
+print(
+    "**Arguments are OK. Checked the input folder and found %d genomes.\n"
+    % count_bins
+)
+print("**" + str(datetime.datetime.now()))
+# Create results folder
+try:
+    os.stat(input_folder + "results/")
+except OSError:
+    os.mkdir(input_folder + "results/")
+# Running prokka for all the bins multfasta files in input folder
+# Perform a check in each bin, then call the execute_prokka function individually
+# It may take awhile
+count_prokka = 0
+print("**Prokka has started, this may take awhile. Be patient.\n")
+for binn in list_bins:
+    # Verify bin/Genome size
+    len_bin = 0
+    for record in SeqIO.parse(input_folder + binn, "fasta"):
+        len_bin += len(record.seq)
+    # If a bin/genome is too short, skip it
+    if len_bin < 5000:
+        print(
+            "**v.HULK has found a genome or bin, which is too short to code \
+            proteins (<5000pb). As CDSs are an important feature for v.HULK, \
+            we will be skipping this: "
+            + binn
+        )
+        continue
+    run_prokka(binn, input_folder, threads)
+    count_prokka += 1
+    if count_prokka % 10 == 0:
+        print("**Done with %d genomes..." % count_prokka)
+print("**Prokka tasks have finished!\n")
+print("**" + str(datetime.datetime.now()))
+print("**Starting HMM scan, this may take awhile. Be patient.\n")
+# print(str(datetime.datetime.now()))
+# Create a new results folder for hmmscan output
+try:
+    os.stat(input_folder + "results/hmmscan/")
+except OSError:
+    os.mkdir(input_folder + "results/hmmscan/")
+# Call HMMscan to all genomes
+dic_matrices_by_genome = {}
+prop_hmms_hits = {}
+count_hmm = 0
+for binn in list_bins:
+    # Prefix for naming results
+    prefix = get_prefix(binn)
+    command_line_hmmscan = (
+        "hmmscan -o "
+        + input_folder
+        + "results/hmmscan/"
+        + prefix
+        + "_hmmscan.out --cpu "
+        + threads
+        + " --tblout "
+        + input_folder
+        + "results/hmmscan/"
+        + prefix
+        + "_hmmscan.tbl --noali /opt/vHULK/models/all_vogs_hmm_profiles_feb2018.hmm "
+        + input_folder
+        + "results/prokka/"
+        + prefix
+        + "/prokka_results_"
+        + prefix
+        + ".faa"
+    )
+    # print(command_line_hmmscan)
+    # Use -E 1 for next time running HMMscan or leave the fix down there
+    # In case hmmscan returns an error - Added only because it stopped in half
+    # if os.path.exists(input_folder + 'results/hmmscan/' + prefix + '_hmmscan.tbl'):
+    # 	continue
+    try:
+        subprocess.call(command_line_hmmscan, shell=True)
+        # Comment line above and uncomment line below in case you want to run v.HULK without running hmmscan all over again
+        # True
+    except:
+        print("**Error calling HMMscan:", command_line_hmmscan)
+        sys.exit(1)
+    count_hmm += 1
+    # Iteration control
+    print("**Done with %d bins HMM searches..." % count_hmm)
+    ## Create dictionary as ref of columns - pVOGs
+    dic_vogs_headers = {}
+    with open("/opt/vHULK/files/VOGs_header.txt", "r") as file2:
+        for line2 in file2:
+            key = re.match("(.+)\n", line2).group(1)
+            dic_vogs_headers[key] = np.float32(0.0)
+    #
+    # Parse hmmscan results by gene
+    num_proteins_bin = 0
+    with open(
+        input_folder
+        + "results/prokka/"
+        + prefix
+        + "/prokka_results_"
+        + prefix
+        + ".faa",
+        "r",
+    ) as faa:
+        for line in faa:
+            if re.search("^>", line):
+                num_proteins_bin += 1
+                # Get gene name here
+                gene_name = re.search("^>(.*)", line).group(1)
+    dic_matches = {}
+    # Parse hmmout
+    with open(
+        input_folder + "results/hmmscan/" + prefix + "_hmmscan.tbl", "r"
+    ) as hmmscan_out:
+        dic_genes_scores = {}
+        for line in hmmscan_out:
+            vog = ""
+            gene = ""
+            evalue = np.float32(0.0)
+            score = np.float32(0.0)
+            bias = np.float32(0.0)
+            if re.match("^VOG", line):
+                matches = re.match(
+                    "^(VOG[\d\w]+)\s+-\s+([^\s]+)[^\d]+([^\s]+)\s+([^\s]+)\s+([^\s]+)",
+                    line,
+                )
+                vog = matches[1]
+                gene = matches[2]
+                evalue = float(matches[3])
+                score = float(matches[4])
+                bias = float(matches[5])
+                if gene in dic_genes_scores:
+                    dic_genes_scores[gene].append([vog, evalue, score, bias])
+                else:
+                    dic_genes_scores[gene] = [[vog, evalue, score, bias]]
+                # Here goes the continuation
+    # Create a matrix by accession
+    dic_matrices_by_genome[prefix] = pd.DataFrame(
+        index=dic_genes_scores.keys(),
+        columns=dic_vogs_headers.keys(),
+        dtype=float,
+    )
+    dic_matrices_by_genome[prefix].fillna(value=np.float32(0.0), inplace=True)
+    # Fill in evalue values
+    for gene in dic_genes_scores:
+        for each_match in dic_genes_scores[gene]:
+            # print(each_match[1], gene)
+            # Fix for evalue values greater than 1
+            if each_match[1] > 1:
+                # print(each_match[1])
+                each_match[1] = 1
+                # print(each_match[1])
+            dic_matrices_by_genome[prefix][each_match[0]][gene] = np.float32(
+                1.0
+            ) - np.float32(each_match[1])
+print("\n**HMMscan has finished.")
+# Condense matrices to array by summing up columns
+list_condensed_matrices = []
+list_file_names = []
+for matrix in dic_matrices_by_genome:
+    temp = list(dic_matrices_by_genome[matrix].sum(axis=0, skipna=True))
+    list_file_names.append(matrix)
+    # Parse tag
+    # if re.search('^NC_.*', matrix):
+    #    matrix = matrix.replace("NC_", "NC")
+    # [0]accession [1]genus [2]species
+    # tags = matrix.split("_")
+    # For Genus
+    # temp.append(tags[1])
+    # temp.append(tags[0])
+    # For Species
+    # temp.append(tag[1]+"_"+tag[2])
+    # temp.append(tag[0])
+    list_condensed_matrices.append(temp)
+# Convert to array
+# import numpy as np
+array = np.array(list_condensed_matrices)
+# print("ARRAY-SHAPE: ", len(array))
+# Predictions
+print("\n**Starting deeplearning predictions...")
+# load models
+model_genus_relu = load_model(
+    "/opt/vHULK/models/model_genus_total_fixed_relu_08mar_2020.h5",
+    custom_objects={"LeakyReLU": LeakyReLU, "ReLU": ReLU},
+)
+model_genus_sm = load_model(
+    "/opt/vHULK/models/model_genus_total_fixed_softmax_01mar_2020.h5",
+    custom_objects={"LeakyReLU": LeakyReLU, "ReLU": ReLU},
+)
+model_species_relu = load_model(
+    "/opt/vHULK/models/model_species_total_fixed_relu_08mar_2020.h5",
+    custom_objects={"LeakyReLU": LeakyReLU, "ReLU": ReLU},
+)
+model_species_sm = load_model(
+    "/opt/vHULK/models/model_species_total_fixed_softmax_01mar_2020.h5",
+    custom_objects={"LeakyReLU": LeakyReLU, "ReLU": ReLU},
+)
+with open(input_folder + "results/results.csv", "w") as file:
+    file.write(
+        "BIN/genome,pred_genus_relu,score_genus_relu,Pred_genus_softmax,score_genus_softmax,pred_species_relu,score_species_relu,pred_species_softmax,score_species_softmax,final_prediction,entropy\n"
+    )
+for i in range(0, len(array)):
+    # Genus ReLu
+    # print(list_file_names[i])
+    pred_gen_relu = model_genus_relu.predict(np.array([array[i]]))
+    # print("Genus:ReLu")
+    # print(pred_gen_relu)
+    position_pred_gen_relu = np.argmax(pred_gen_relu)
+    if not pred_gen_relu.any():
+        name_pred_gen_relu = "None"
+        score_pred_gen_relu = "0"
+    else:
+        list_hosts_genus = [
+            line.rstrip("\n") for line in open("/opt/vHULK/files/list_hosts_genus.txt")
+        ]
+        name_pred_gen_relu = list_hosts_genus[position_pred_gen_relu]
+        score_pred_gen_relu = str(pred_gen_relu[0][position_pred_gen_relu])
+        # print(list_hosts_genus[position_pred_gen_relu])
+        # print(position_pred_gen_relu, pred_gen_relu[0][position_pred_gen_relu])
+    # Genus softmax
+    pred_gen_sm = model_genus_sm.predict(np.array([array[i]]))
+    # print("Genus:Softmax")
+    # print(pred_gen_sm)
+    position_pred_gen_sm = np.argmax(pred_gen_sm)
+    list_hosts_genus = [
+        line.rstrip("\n") for line in open("/opt/vHULK/files/list_hosts_genus.txt")
+    ]
+    name_pred_gen_sm = list_hosts_genus[position_pred_gen_sm]
+    score_pred_gen_sm = str(pred_gen_sm[0][position_pred_gen_sm])
+    # print(list_hosts_genus[position_pred_gen_sm])
+    # print(position_pred_gen_sm, pred_gen_sm[0][position_pred_gen_sm])
+    # Species Relu
+    pred_sp_relu = model_species_relu.predict(np.array([array[i]]))
+    # print("Species:ReLu")
+    # print(pred_sp_relu)
+    position_pred_sp_relu = np.argmax(pred_sp_relu)
+    if not pred_sp_relu.any():
+        name_pred_sp_relu = "None"
+        score_pred_sp_relu = "0"
+    else:
+        list_hosts_sp = [
+            line.rstrip("\n") for line in open("/opt/vHULK/files/list_hosts_species.txt")
+        ]
+        # print(list_hosts_sp)
+        name_pred_sp_relu = list_hosts_sp[position_pred_sp_relu]
+        score_pred_sp_relu = str(pred_sp_relu[0][position_pred_sp_relu])
+        # print(list_hosts_sp[position_pred_sp_relu])
+        # print(position_pred_sp_relu, pred_sp_relu[0][position_pred_sp_relu])
+    # Species softmax
+    pred_sp_sm = model_species_sm.predict(np.array([array[i]]))
+    # print("Species:Softmax")
+    # print(pred_sp_sm)
+    position_pred_sp_sm = np.argmax(pred_sp_sm)
+    list_hosts_sp = [
+        line.rstrip("\n") for line in open("/opt/vHULK/files/list_hosts_species.txt")
+    ]
+    # print(list_hosts_sp)
+    name_pred_sp_sm = list_hosts_sp[position_pred_sp_sm]
+    score_pred_sp_sm = str(pred_sp_sm[0][position_pred_sp_sm])
+    # print(list_hosts_sp[position_pred_sp_sm])
+    # print(position_pred_sp_sm, pred_sp_sm[0][position_pred_sp_sm])
+    ##
+    # Calculate entropy
+    entropy_genus_sm = entr(pred_gen_sm).sum(axis=1)
+    # entropy_genus_sm = "{:.7f}".format(entr(pred_gen_sm).sum(axis=1))
+    #
+    # Apply decision tree
+    #
+    final_decision = "None"
+    # Relu sp
+    if float(score_pred_sp_relu) > 0.9:
+        final_decision = name_pred_sp_relu
+    # SM sp
+    if float(score_pred_sp_sm) > 0.6 and name_pred_sp_sm != final_decision:
+        final_decision = name_pred_sp_sm
+    # Couldn't predict species
+    if final_decision == "None":
+        # Put here sm sp
+        if float(score_pred_sp_sm) > 0.6:
+            final_decision = name_pred_sp_sm
+            # relu genus
+            if float(score_pred_gen_relu) >= 0.7:
+                final_decision = name_pred_gen_relu
+            # sm genus
+            if (
+                float(score_pred_gen_sm) >= 0.5
+                and name_pred_gen_sm != final_decision
+            ):
+                final_decision = name_pred_gen_sm
+        else:
+            # relu genus
+            if float(score_pred_gen_relu) >= 0.9:
+                final_decision = name_pred_gen_relu
+            # sm genus
+            if (
+                float(score_pred_gen_sm) >= 0.4
+                and name_pred_gen_sm != final_decision
+            ):
+                final_decision = name_pred_gen_sm
+    # Predicted species.
+    # Verify if genus is the same
+    else:
+        if re.search(name_pred_gen_relu, final_decision) or re.search(
+            name_pred_gen_sm, final_decision
+        ):
+            pass
+        else:
+            # relu genus
+            if float(score_pred_gen_relu) >= 0.9:
+                final_decision = name_pred_gen_relu
+            # sm genus
+            if (
+                float(score_pred_gen_sm) >= 0.5
+                and name_pred_gen_sm != final_decision
+            ):
+                final_decision = name_pred_gen_sm
+    # Print CSV
+    with open(input_folder + "results/results.csv", "a") as file:
+        file.write(
+            list_file_names[i]
+            + ","
+            + name_pred_gen_relu
+            + ","
+            + score_pred_gen_relu
+            + ","
+            + name_pred_gen_sm
+            + ","
+            + score_pred_gen_sm
+            + ","
+            + name_pred_sp_relu
+            + ","
+            + score_pred_sp_relu
+            + ","
+            + name_pred_sp_sm
+            + ","
+            + score_pred_sp_sm
+            + ","
+            + final_decision
+            + ","
+            + str(entropy_genus_sm[0])
+            + "\n"
+        )
+    # print(list_file_names[i]+","+name_pred_gen_relu+":"+score_pred_gen_relu+","+name_pred_gen_sm+":"+score_pred_gen_sm+","+name_pred_sp_relu+":"+score_pred_sp_relu+","+name_pred_sp_sm+":"+score_pred_sp_sm+","+final_decision+","+str(entropy_genus_sm))
+print(
+    '\n**Deep learning predictions have finished. Results are in file "results.csv" inside input_folder/results/.\n**Thank you for using v.HULK'
+)
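The script writes one row per genome to `results/results.csv`, using the header written just before the prediction loop. The following is a minimal sketch for reading that file back, assuming host names contain no commas (the script joins the fields without any quoting). `read_predictions` is a hypothetical helper, not part of vHULK.

```python
import csv


def read_predictions(csv_path):
    """Map each bin/genome name to (final_prediction, entropy)."""
    predictions = {}
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            predictions[row["BIN/genome"]] = (
                row["final_prediction"],
                float(row["entropy"]),
            )
    return predictions
```

A `final_prediction` of `None` means that none of the four models passed the score thresholds of the decision tree for that genome.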
+Bootstrap: docker
+From: continuumio/miniconda3
+
+%labels
+    Maintainer papanikos_182
+    Version 0.1
+    Source
+    Preprint
+
+%files
+    vhulk_explicit.txt /opt/vHULK/
+    vHULK /opt
+    vhulk_resources.tar.gz /opt/vHULK
+
+%environment
+    export PATH=/opt/vHULK:$PATH
+
+%post
+    apt update && apt upgrade -y
+    conda update -y conda
+    conda create -n vhulk --file=/opt/vHULK/vhulk_explicit.txt
+    conda clean -ya
+    echo ". /opt/conda/etc/profile.d/" >> $SINGULARITY_ENVIRONMENT
+    echo "conda activate vhulk" >> $SINGULARITY_ENVIRONMENT
+    tar -xvzf /opt/vHULK/vhulk_resources.tar.gz -C /opt/vHULK && rm /opt/vHULK/vhulk_resources.tar.gz
+
+%help
+    A container for vHULK v0.1.
+    Required source scripts, models and data are stored in /opt/vHULK.
+    The main script has been modified with
+      - Portable shebang
+      - All hardcoded paths are now hardcoded for this container
+
+    To show the help menu for vHULK from this container, execute
+    $ singularity exec library://papanikos_182/default/vhulk:0.1 python --help
+
+    vHULK takes an input directory with one or more fasta files that are assumed to
+    be bins. It makes predictions based on its models for each bin separately.
+    That is, if you have assembled contigs you need to split them into separate files
+    in a directory and provide that as input.
+
+    It outputs a dir named `results`, stored within the input dir (nobody knows why..).
+    Its major output is a `results.csv` file with host predictions for each genome.
+    The results dir also contains hmmscan results and prokka annotations for all input
+    genomes/bins.
+
+    To run an analysis for all genomes stored in /path/to/genomes, with 8 threads
+    $ singularity exec library://papanikos_182/default/vhulk:0.1 \
+    python -i /path/to/genomes -t 8
+# WIsH 
+Available from `library://papanikos_182/default/wish:1.0`
+* Note that data dependencies are not included in the container.
+Models from VirHostMatcher-Net are used.
+## Procedure
+1. Build the image with the definition file
+$ sudo singularity build wish.sif wish.def
+2. [Optional] Sign the image
+$ singularity sign wish.sif
+3. Push it to the cloud
+$ singularity push wish.sif library://papanikos_182/default/wish:0.1
+## Usage
+$ singularity run library://papanikos_182/default/wish:1.0 \
+  WIsH -h
+Bootstrap: docker
+From: debian:latest
+
+%labels
+    Maintainer papanikos_182
+    Version 0.1
+    Source
+    Publication
+
+%environment
+    export PATH=/opt/wish:$PATH
+
+%post
+    # Update stuff
+    apt update && apt upgrade -y
+    # Install compile tools for compiling wish
+    apt install -y build-essential cmake make git
+    # get source
+    git clone /opt/wish
+    # Get in there
+    cd /opt/wish
+    # Compile it
+    cmake . && make
+
+%help
+    A container for WIsH.
+    Source:
+    Models for host genomes come from VirHostMatcher-Net.
+
+    Example:
+    # You probably need to bind the path/to/data/host_wish_model
+    $ singularity exec -B /path/to/data/host_wish_model:/data \
+      WIsH -c predict -m /data \
+      -g /path/to/phage/genomes/fastas \
+      -r /path/to/results \
+      -b