Installation¶
Required software¶
MAKER
BUSCO >= 4.0.2
If you want to annotate the predicted proteins against the Pfam database:
hmmer
Tips for installing the required software¶
The easiest and most strongly recommended way to install the required software is through conda, using isolated environments. Below is an example of how to install Miniconda3 (on Linux) and the pipeline:
- Installing Miniconda (Linux)
Download Miniconda:
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
Then, execute the script:
bash Miniconda3-latest-Linux-x86_64.sh
Answer “yes” when prompted. Then close and re-open your terminal window. The conda command should now work.
If it does not, you may need to reload your shell configuration with the command:
source ~/.bashrc
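To confirm that conda is reachable after reopening the terminal, you can check whether the command is on the PATH. The helper below is a minimal sketch; the function name `check_tool` is hypothetical and not part of the pipeline.

```shell
# Hypothetical helper: report whether a command is available on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Example: check conda after installing Miniconda.
check_tool conda || echo "conda not found; try: source ~/.bashrc"
```

The same helper can be reused later for the other tools (for example `check_tool busco` inside the BUSCO environment).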
- Software
To avoid package conflicts, we recommend creating independent virtual environments, as shown below:
MAKER¶
Create a virtual environment for MAKER:
conda create -n maker_env -c bioconda maker
It may be necessary to change the RepeatMasker configuration. To do so, go to the RepeatMasker directory:
cd maker_env/share/RepeatMasker/
and follow the instructions in the link. In particular, please watch the video.
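The RepeatMasker path above is relative to wherever conda stores the environment. As a rough sketch, the full path can be built from the conda base directory. This assumes the default layout, in which named environments live under `<base>/envs/`; the function name `repeatmasker_dir` is hypothetical, and your installation may place environments elsewhere.

```shell
# Hypothetical helper: build the RepeatMasker path for a named conda environment.
# $1 = conda base directory, $2 = environment name.
# Assumes the default layout, where named environments live in <base>/envs/.
repeatmasker_dir() {
    printf '%s/envs/%s/share/RepeatMasker\n' "$1" "$2"
}

# Example usage (requires conda on the PATH):
#   cd "$(repeatmasker_dir "$(conda info --base)" maker_env)"
# Here the base directory is passed explicitly (an assumed install location):
repeatmasker_dir "$HOME/miniconda3" maker_env
```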
BUSCO¶
The BUSCO version must be 4.0.2 or higher. Create a virtual environment for BUSCO:
conda create -n busco_env -c bioconda -c conda-forge busco
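The minimum-version requirement can be checked with a small comparison function. This is a sketch only: `version_ge` is a hypothetical helper, it relies on the `-V` (version sort) option of GNU `sort`, and the version string in the example is hard-coded for illustration.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example with a hard-coded version string; inside busco_env you could
# substitute the number reported by your BUSCO installation.
installed="4.1.4"
if version_ge "$installed" "4.0.2"; then
    echo "BUSCO version is recent enough"
else
    echo "BUSCO version is too old"
fi
```

Version sort is used rather than a plain string comparison so that, for instance, 4.0.10 is correctly treated as newer than 4.0.2.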
hmmer¶
Create a virtual environment for hmmer:
conda create -n annot_env -c bioconda -c conda-forge hmmer python
Download EMAGC¶
Clone the repository from GitHub:
git clone https://github.com/lfdelzam/EMAGC
The cloned directory contains the following scripts:
- EMAGCpoly.sh
- EMAGCsingle.sh
- src/extract_genes_with_pfam_best_hit.py
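After cloning, a quick check that the three scripts listed above are present might look like the following. This is a minimal sketch; `check_emagc_files` is a hypothetical helper, and the example assumes the repository was cloned into a directory named `EMAGC` in the current working directory.

```shell
# Hypothetical helper: verify that the expected EMAGC scripts exist in a directory.
check_emagc_files() {
    dir="$1"
    status=0
    for f in EMAGCpoly.sh EMAGCsingle.sh src/extract_genes_with_pfam_best_hit.py; do
        if [ -f "$dir/$f" ]; then
            echo "ok: $f"
        else
            echo "missing: $f"
            status=1
        fi
    done
    return "$status"
}

# Example: run from the directory where the repository was cloned.
check_emagc_files EMAGC || echo "some files are missing; check the clone"
```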