Software - genotoul-bioinfo

The GenoToul bioinformatics platform provides access to high-performance computing resources with software already installed to ease their use. An exhaustive list is provided below. Software is updated only upon user request. If you need additional software or an update, fill in the software installation form.

All software

Application	Description	Availability/Use
3D-DNA	3D de novo assembly (3D-DNA) pipeline.	Genobioinfo Cluster: How to use
3rdChimeraMiner	Exploration of whole genome amplification generated chimeric sequences in long-read sequencing data.	Genobioinfo Cluster: How to use
AAF	This is a package for constructing phylogeny without doing alignment or assembly.	Genobioinfo Cluster: Ask for Install
ABCtoolbox	ABCtoolbox is a general-purpose program to perform Approximate Bayesian Computation. ABCtoolbox can be used for ABC inference on almost any type of model, including models arising in physics, biology or engineering.	Genobioinfo Cluster: Ask for Install
ABySS	ABySS (Assembly By Short Sequences) is a de novo, parallel, paired-end sequence assembler that is designed for short reads.	Genobioinfo Cluster: How to use
AC-DIAMOND	AC-DIAMOND attempts to speed up DIAMOND via better SIMD parallelization and compressed indexing. Experimental results show that AC-DIAMOND was about 6 to 7 times faster than DIAMOND on aligning DNA reads or contigs while retaining essentially the same sensitivity. AC-DIAMOND was developed based on DIAMOND v0.7.9.	Genobioinfo Cluster: Ask for Install
ACACIA	Allele CAlling proCedure for Illumina Amplicon sequencing data: This workflow aims at extracting allele information out of paired-end Illumina FASTQ files.	Genobioinfo Cluster: How to use
Accel-align	Accel-align is a fast alignment tool implemented in C++ programming language.	Genobioinfo Cluster: Ask for Install
ACFS	Accurate CircRNA Finder Suite. Discovering circRNAs from RNA-Seq data.	Genobioinfo Cluster: Ask for Install
AdamaJava	The AdamaJava project holds code for variant callers and pipeline tools related to next-generation sequencing (NGS).	Genobioinfo Cluster: How to use
AdapterRemoval	This program was developed to remove residual adapter sequences from next generation sequencing reads. The program handles both single end and paired end data.	Genobioinfo Cluster: How to use
adegenet	R package dedicated to the exploratory analysis of genetic data. It implements a set of tools ranging from multivariate methods to spatial genetics and genome-wide SNP data analysis.	Genobioinfo Cluster: Ask for Install
ADMIXTOOLS	ADMIXTOOLS (Patterson et al. 2012) is a software package that supports formal tests of whether admixture occurred, and makes it possible to infer admixture proportions and dates.	Genobioinfo Cluster: How to use
Admixture	ADMIXTURE is a software tool for maximum likelihood estimation of individual ancestries from multilocus SNP genotype datasets. It uses the same statistical model as STRUCTURE but calculates estimates much more rapidly using a fast numerical optimization algorithm.	Genobioinfo Cluster: How to use
AGAT	Another Gff Analysis Toolkit: suite of tools to handle gene annotations in any GTF/GFF format. Some examples of what AGAT can do: standardise any GTF/GFF file into a comprehensive GFF3 format (script with agat_sp prefix); add missing parent features (e.g. gene and mRNA if only CDS/exon exist); add missing features (e.g. exon and UTR); add missing mandatory attributes (i.e. ID, Parent); fix identifiers to be unique; fix feature locations; remove duplicated features; group related features (if spread in different places in the file); sort features; merge overlapping loci into one single locus (only if option activated).	Genobioinfo Cluster: How to use
AGC	Assembled Genomes Compressor (AGC) is a tool designed to compress collections of de-novo assembled genomes. It can be used for various types of datasets: short genomes (viruses) as well as long (humans).	Genobioinfo Cluster: How to use
ALDER	The ALDER software computes the weighted linkage disequilibrium (LD) statistic for making inference about population admixture	Genobioinfo Cluster: Ask for Install
ALFATClust	ALignment-Free Adaptive Threshold Clustering: biological sequence clustering tool with dynamic threshold for individual clusters. Suitable for clustering multiple groups of homologous sequences.	Genobioinfo Cluster: How to use
Alfred	BAM Statistics, Feature Counting and Annotation	Genobioinfo Cluster: Ask for Install
AlleleSeq	A pipeline that constructs a diploid personal genome from genomic sequence variants of a family trio, including SNPs, indels and structural variants, and maps functional genomic data onto this personal genome.	Genobioinfo Cluster: Ask for Install
ALLHiC	Phasing and scaffolding polyploid genomes based on Hi-C data	Genobioinfo Cluster: How to use
AllPaths-LG	ALLPATHS-LG is a whole genome shotgun assembler that can generate high quality genome assemblies using short reads (~100bp) such as those produced by the new generation of sequencers. The significant difference between ALLPATHS and traditional assemblers such as Arachne is that ALLPATHS assemblies are not necessarily linear, but instead are presented in the form of a graph. This graph representation retains ambiguities, such as those arising from polymorphism, uncorrected read errors, and unresolved repeats, thereby providing information that has been absent from previous genome assemblies.	Genobioinfo Cluster: Ask for Install
Alphafold2-Pytorch	An unofficial working Pytorch implementation of Alphafold2, a 3D protein predictor.	Genobioinfo Cluster: Ask for Install
AlphaImpute	AlphaImpute is a software package for imputing and phasing genotype data in diploid populations with pedigree information.	Genobioinfo Cluster: Ask for Install
AMAS	Calculate summary statistics and manipulate multiple sequence alignments.	Genobioinfo Cluster: Ask for Install
AMOS	A Modular, Open-Source whole genome assembler.	Genobioinfo Cluster: Ask for Install
Ampliconnoise	AmpliconNoise is a collection of programs for the removal of noise from 454 sequenced PCR amplicons. It involves two steps: the removal of noise from the sequencing itself and the removal of PCR point errors. This project also includes the Perseus algorithm for chimera removal.	Genobioinfo Cluster: Ask for Install
AmpliSAT	AmpliSAT (Amplicon Sequencing Analysis Tools) is a set of online tools that make it easy to analyze amplicon sequencing experiments.	Genobioinfo Cluster: How to use
ancIBD	Identify IBD segments between pairs of individuals in ancient human DNA data. The software package `ancIBD` detects Identity-by-Descent (IBD) segments in typical human aDNA data, implementing an algorithm described in a preprint. The input data are imputed and phased genotype data. The default parameters of `ancIBD` are optimized for data imputed with the software GLIMPSE using the 1000 Genomes haplotype reference panel.	Genobioinfo Cluster: How to use
ANGEL	Robust Open Reading Frame prediction (ANGLE re-implementation)	Genobioinfo Cluster: Ask for Install
ANGSD	ANGSD is a software for analyzing next generation sequencing data. The software can handle a number of different input types from mapped reads to imputed genotype probabilities.	Genobioinfo Cluster: How to use
AnnotSV	An integrated tool for Structural Variations annotation and ranking.	Genobioinfo Cluster: How to use
ANNOVAR	ANNOVAR is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes (including human genome hg18, hg19, hg38, as well as mouse, worm, fly, yeast and many others).	Genobioinfo Cluster: Ask for Install
Anvio	Anvi’o is an analysis and visualization platform for ‘omics data. It brings together many aspects of today’s cutting-edge genomic, metagenomic, and metatranscriptomic analysis practices to address a wide array of needs.	Genobioinfo Cluster: How to use
ApoplastP	ApoplastP is a machine learning method for predicting localization of proteins to the plant apoplast. ApoplastP can distinguish non-apoplastic proteins from apoplastic proteins for both plant proteins and pathogen proteins. In particular, ApoplastP can predict if an effector localizes to the plant apoplast.	Genobioinfo Cluster: How to use
Apptainer	Apptainer is an open source container platform designed to be simple, fast, and secure.	Genobioinfo Cluster: How to use
Apscale	Advanced Pipeline for Simple yet Comprehensive AnaLysEs of DNA metabarcoding data	Genobioinfo Cluster: How to use
Aquila	Diploid personal genome assembly and comprehensive variant detection based on linked-reads.	Genobioinfo Cluster: Ask for Install
ARAGORN	ARAGORN is a program to detect tRNA genes and tmRNA genes in nucleotide sequence	Genobioinfo Cluster: Ask for Install
ARBitR	ARBitR is an overlap aware genome assembly scaffolder for linked sequencing reads.	Genobioinfo Cluster: Ask for Install
ARC	ARC is a pipeline which facilitates iterative, reference guided de novo assemblies with the intent of: - Reducing time in analysis and increasing accuracy of results by only considering those reads which should assemble together. - Reducing/removing reference bias as compared to mapping based approaches.	Genobioinfo Cluster: Ask for Install
ARCS	Scaffolding genome sequence assemblies using 10X Genomics GemCode/Chromium data	Genobioinfo Cluster: Ask for Install
ARGweaver	The ARGweaver/ARGweaver-D software package contains programs and libraries for sampling and manipulating ancestral recombination graphs (ARGs).	Genobioinfo Cluster: How to use
ARKS	Scaffolding genome sequence assemblies using 10X Genomics GemCode/Chromium data. This project is a new kmer-based (alignment free) implementation of ARCS. It provides improved runtime performance over the original ARCS implementation by removing the requirement to perform alignments with bwa mem.	Genobioinfo Cluster: Ask for Install
Armatus	Multiresolution domain calling software for chromosome conformation capture interaction matrices. Armatus is a Topologically Associated Domain caller. Follow the Web page to know more about Armatus.	Genobioinfo Cluster: Ask for Install
Arriba	Arriba is a command-line tool for the detection of gene fusions from RNA-Seq data.	Genobioinfo Cluster: How to use
ArrowGrid	The distribution is a parallel wrapper around the Arrow consensus framework within the SMRT Analysis Software	Genobioinfo Cluster: How to use
ART	ART is a set of simulation tools to generate synthetic next-generation sequencing reads.	Genobioinfo Cluster: Ask for Install
art_modern	A modern re-implementation of the popular ART simulator with enhanced performance and functionality.	Genobioinfo Cluster: How to use
ASGART	ASGART (A Segmental duplications Gathering and Refinement Tool) is a multiplatform (GNU/Linux, macOS, Windows) tool designed to search for large duplications amongst one or two DNA strands.	Genobioinfo Cluster: Ask for Install
ASHURE	Python-based pipeline for analyzing Nanopore sequencing metabarcoding data. ASHURE can take a reference database in order to improve accuracy.	Genobioinfo Cluster: Ask for Install
ASMC	Ascertained Sequentially Markovian Coalescent (contains ASMC and an extension, FastSMC, together with python bindings for both)	Genobioinfo Cluster: How to use
assemblathon2	This repo contains a motley assortment of unpublished scripts and commands used by Ian Korf, Keith Bradnam, and Joe Fass in the analysis of Assemblathon 2 competition entries (assemblies).	Genobioinfo Cluster: How to use
assembly-stats	Get assembly statistics from FASTA and FASTQ files.	Genobioinfo Cluster: Ask for Install
Assemblytics	Assemblytics is a bioinformatics tool to detect and analyze structural variants from a genome assembly by comparing it to a reference genome.	Genobioinfo Cluster: Ask for Install
Assexon	Assembling Exon Using Gene Capture Data	Genobioinfo Cluster: Ask for Install
ASTER	A family of ASTRAL-like algorithms.	Genobioinfo Cluster: How to use
ASTRAL	ASTRAL is a tool for estimating an unrooted species tree given a set of unrooted gene trees.	Genobioinfo Cluster: How to use
ASTRAL-Pro	ASTRAL-Pro stands for ASTRAL for PaRalogs and Orthologs. ASTRAL is a tool for estimating an unrooted species tree given a set of unrooted gene trees.	Genobioinfo Cluster: Ask for Install
atac dnase pipelines	ATAC-seq and DNase-seq processing pipeline. This pipeline is designed for automated end-to-end quality control and processing of ATAC-seq or DNase-seq data.	Genobioinfo Cluster: Ask for Install
Atropos	Atropos is a tool for specific, sensitive, and speedy trimming of NGS reads. It is a fork of the Cutadapt read trimmer.	Genobioinfo Cluster: Ask for Install
ATTRACT	ATTRACT program suite for macromolecular docking (protein-protein, protein-nucleic acid, protein-peptide).	Genobioinfo Cluster: How to use
Augustus	Augustus is a program that predicts genes in eukaryotic genomic sequences	Genobioinfo Cluster: How to use
AutoHiC	AutoHiC is a deep learning tool that uses Hi-C data to support genome assembly. It can automatically correct errors during genome assembly and generate genomes at the chromosome level.	Genobioinfo Cluster: How to use
AvP	AvP performs automatic detection of HGT candidates within a phylogenetic framework.	Genobioinfo Cluster: How to use
awscli	The AWS Command Line Interface (AWS CLI) is an open source tool that enables you to interact with AWS services using commands in your command-line shell.	Genobioinfo Cluster: How to use
Back_to_sequences	Given a set of kmers (fasta / fastq <.gz> format) and a set of sequences (fasta / fastq <.gz> format), this tool will extract the sequences containing some of those kmers.	Genobioinfo Cluster: How to use
Badread	Badread is a long-read simulator tool that makes – you guessed it – bad reads! It can imitate many kinds of problems one might encounter in real long-read sets: chimeras, low-quality regions, systematic basecalling errors and more.	Genobioinfo Cluster: How to use
BAli-Phy	BAli-Phy is software by Ben Redelings that estimates multiple sequence alignments and evolutionary trees from DNA, amino acid, or codon sequences. It uses likelihood-based evolutionary models of substitutions and insertions and deletions to place gaps.	Genobioinfo Cluster: Ask for Install
bam2plot	Make coverage plots from bam files.	Genobioinfo Cluster: How to use
BamBam	several simple-to-use tools to facilitate NGS analysis	Genobioinfo Cluster: Ask for Install
BAMM	A program for multimodel inference on speciation and trait evolution.	Genobioinfo Cluster: Ask for Install
BAMscorer	BAMscorer can be used to conduct genomic assignment tests from BAM files. Assignments can be done on genomic regions, inversions, and whole-genome datasets.	Genobioinfo Cluster: Ask for Install
Bamstats (not the same as BAMstats)	Bamstats is a command line tool written in Go for computing mapping statistics from a BAM file.	Genobioinfo Cluster: Ask for Install
bamtofastq	Tool for converting 10x BAMs produced by Cell Ranger, Space Ranger, Cell Ranger ATAC, Cell Ranger DNA, and Long Ranger back to FASTQ files.	Genobioinfo Cluster: How to use
Bamtools	BamTools provides both a programmer's API and an end-user's toolkit for handling BAM files.	Genobioinfo Cluster: How to use
bamUtil	bamUtil is a repository that contains several programs that perform operations on SAM/BAM files. All of these programs are built into a single executable, bam.	Genobioinfo Cluster: How to use
Bandage-NG	Bandage-NG is a GUI program that allows users to interact with the assembly graphs made by de novo assemblers such as SPAdes, MEGAHIT and others.	Genobioinfo Cluster: How to use
Barrnap	Barrnap predicts the location of ribosomal RNA genes in genomes. It supports bacteria (5S,23S,16S), archaea (5S,5.8S,23S,16S), mitochondria (12S,16S) and eukaryotes (5S,5.8S,28S,18S).	Genobioinfo Cluster: Ask for Install
BayesAss3-SNPs	Modification of BayesAss 3.0.4 to allow handling of large SNP datasets.	Genobioinfo Cluster: How to use
BayeScan	Detecting natural selection from population-based genetic data using differences in allele frequencies between populations.	Genobioinfo Cluster: How to use
BayeScEnv	BayeScEnv is a Fst-based, genome-scan method that uses environmental variables to detect local adaptation.	Genobioinfo Cluster: Ask for Install
BayesTraits	BayesTraits is a computer package for performing analyses of trait evolution among groups of species for which a phylogeny or sample of phylogenies is available. This new package incorporates our earlier, separate programs Multistate, Discrete and Continuous. BayesTraits can be applied to the analysis of traits that adopt a finite number of discrete states, or to the analysis of continuously varying traits. Hypotheses can be tested about models of evolution, about ancestral states and about correlations among pairs of traits.	Genobioinfo Cluster: Ask for Install
BayPass	The package BayPass is a population genomics software which is primarily aimed at identifying genetic markers subjected to selection and/or associated to population-specific covariates (e.g., environmental variables, quantitative or categorical phenotypic characteristics).	Genobioinfo Cluster: How to use
BBMap	a short read aligner, as well as various other bioinformatic tools.	Genobioinfo Cluster: How to use
BCALM2	A bioinformatics tool for constructing the compacted de Bruijn graph from sequencing data.	Genobioinfo Cluster: Ask for Install
BCFtools	utilities for variant calling and manipulating VCFs and BCFs.	Genobioinfo Cluster: How to use
bcl2fastq	The Bcl2FastQ conversion software is a new tool to handle bcl conversion and demultiplexing of both unzipped and zipped bcl files, which have reduced footprint and were introduced as an optional output of the HCS Software version 2.0	Genobioinfo Cluster: How to use
BCOOL	BCOOL is a read corrector for NGS data that aligns reads on a de Bruijn graph.	Genobioinfo Cluster: Ask for Install
Beagle	BEAGLE is a state of the art software package for analysis of large-scale genetic data sets with hundreds of thousands of markers genotyped on thousands of samples.	Genobioinfo Cluster: How to use
Beagle_Utilities	Simple utility programs for manipulating text files, especially VCF files.	Genobioinfo Cluster: Ask for Install
Beagle-lib	BEAGLE-lib is a high-performance library that can perform the core calculations at the heart of most Bayesian and Maximum Likelihood phylogenetics packages	Genobioinfo Cluster: How to use
BEAST	BEAST is a software package for phylogenetic analysis with an emphasis on time-scaled trees. BEAST is a cross-platform program for Bayesian analysis of molecular sequences using MCMC. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. BEAST uses MCMC to average over tree space, so that each tree is weighted proportional to its posterior probability. We include a simple-to-use user-interface program for setting up standard analyses and a suite of programs for analysing the results.	Genobioinfo Cluster: How to use
BEAST2	BEAST 2 is a cross-platform program for Bayesian phylogenetic analysis of molecular sequences.	Genobioinfo Cluster: How to use
BEDOPS	BEDOPS is an open-source command-line toolkit that performs highly efficient and scalable Boolean and other set operations, statistical calculations, archiving, conversion and other management of genomic data of arbitrary scale.	Genobioinfo Cluster: How to use
bedtools	The BEDTools utilities allow one to address common genomics tasks such as finding feature overlaps and computing coverage.	Genobioinfo Cluster: How to use
BELLA	A computationally-efficient and highly-accurate long-read to long-read aligner and overlapper.	Genobioinfo Cluster: Ask for Install
BeXY	BeXY is a tool to jointly infer sex karyotypes and sex-linked scaffolds from read count data. It can also be used to genetically sex single individuals. BeXY is a command-line tool, and we provide an easy-to-use R package to visualize and parse the results.	Genobioinfo Cluster: How to use
bgc	bgc implements Bayesian estimation of genomic clines to quantify introgression at many loci.	Genobioinfo Cluster: Ask for Install
Bifrost	Highly parallel construction and indexing of colored and compacted de Bruijn graphs.	Genobioinfo Cluster: How to use
BIG-SCAPE	Biosynthetic Genes Similarity Clustering and Prospecting Engine. Defines a distance metric between Gene Clusters using a combination of three indices (Jaccard Index of domain types, Domain Sequence Similarity and the Adjacency Index).	Genobioinfo Cluster: Ask for Install
BigDataScript	BigDataScript is intended as a scripting language for big data pipelines.	Genobioinfo Cluster: Ask for Install
Bioawk	Bioawk is an extension to Brian Kernighan's awk, adding the support of several common biological data formats, including optionally gzip'ed BED, GFF, SAM, VCF, FASTA/Q and TAB-delimited formats with column names.	Genobioinfo Cluster: How to use
biohazard-tools	This is a collection of command line utilities that do useful stuff involving BAM files for Next Generation Sequencing data.	Genobioinfo Cluster: Ask for Install
Biopieces	The Biopieces are a collection of bioinformatics tools that can be pieced together in a very easy and flexible manner to perform both simple and complex tasks.	Genobioinfo Cluster: Ask for Install
BIOPYTHON	Biopython is a set of freely available tools for biological computation written in Python by an international team of developers.	Genobioinfo Cluster: in Python-3.11.1 (see the "search_Python_module" script to search in other Python versions)
BiSCoT	BiSCoT is a tool that aims to improve the contiguity of scaffolds and contigs generated after a Bionano scaffolding.	Genobioinfo Cluster: Ask for Install
BISCUIT	BISulfite-seq CUI Toolkit (BISCUIT) is a utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data.	Genobioinfo Cluster: How to use
Bismark	A tool to map bisulfite converted sequence reads and determine cytosine methylation states	Genobioinfo Cluster: How to use
BisSNP	Accurate combined SNP/Methylation calling.	Genobioinfo Cluster: Ask for Install
Blasr	Reference-based alignment	Genobioinfo Cluster: (see SMRTLink)
blat	The BLAST-Like Alignment Tool: similarity search in databanks. BLAT on DNA is designed to quickly find sequences of 95% and greater similarity of length 25 bases or more. BLAT on proteins finds sequences of 80% and greater similarity of length 20 amino acids or more.	Genobioinfo Cluster: How to use
BLAZE	Barcode identification from Long reads for AnalyZing single-cell gene Expression. SingleCell Nanopore sequencing data analysis.	Genobioinfo Cluster: How to use
BlobTools	A modular command-line solution for visualisation, quality control and taxonomic partitioning of genome datasets.	Genobioinfo Cluster: How to use
BlockClust	BlockClust is an efficient approach to detect transcripts with similar processing patterns. We propose a novel way to encode expression profiles in compact discrete structures, which can then be processed using fast graph-kernel techniques. BlockClust allows both clustering and classification of small non-coding RNAs.	Genobioinfo Cluster: Ask for Install
blue-crab	blue-crab is a conversion tool to convert from ONT's POD5 format to the community maintained SLOW5/BLOW5 format.	Genobioinfo Cluster: How to use
BOLDigger	Python program to query .fasta files against the different databases of www.boldsystems.org	Genobioinfo Cluster: How to use
BOLDigger2	An even better Python program to query .fasta files against the COI database of www.boldsystems.org	Genobioinfo Cluster: How to use
BOLT-LMM	The BOLT-LMM software package consists of two main algorithms, the BOLT-LMM algorithm for mixed model association testing, and the BOLT-REML algorithm for variance components analysis (i.e., partitioning of SNP-heritability and estimation of genetic correlations).	Genobioinfo Cluster: How to use
bonsaitree	Algorithm for automatically building pedigrees using IBD, Age, and Sex information.	Genobioinfo Cluster: How to use
Bowtie	Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).	Genobioinfo Cluster: How to use
Bowtie2	Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.	Genobioinfo Cluster: How to use
BPP	Bayesian analysis of genomic sequence data under the multispecies coalescent model.	Genobioinfo Cluster: Ask for Install
Bracken	Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.	Genobioinfo Cluster: How to use
Braker	BRAKER(1,2,3) is a tool for fully automated genome annotation with GeneMark-ET and AUGUSTUS	Genobioinfo Cluster: How to use
BRAKER4	BRAKER4 is a complete rewrite in Snakemake of the BRAKER pipeline (a tool for fully automated genome annotation with GeneMark-ET and AUGUSTUS). The gene prediction logic is the same: GeneMark trains on extrinsic evidence, AUGUSTUS is trained on GeneMark predictions, and TSEBRA merges the results. What changed is how this logic is orchestrated.	Genobioinfo Cluster: How to use
BreakDancer	SV detection from paired end reads mapping.	Genobioinfo Cluster: How to use
breseq	breseq is a computational pipeline for finding mutations relative to a reference sequence in short-read DNA re-sequencing data for haploid microbial-sized genomes.	Genobioinfo Cluster: How to use
Bridger	Bridger is an efficient de novo transcriptome assembler for RNA-Seq data.	Genobioinfo Cluster: How to use
BS-Seeker2	BS-Seeker2 is a seamless and versatile pipeline for fast and accurate mapping of bisulfite-treated reads.	Genobioinfo Cluster: Ask for Install
BS-SNPer	BS-SNPer is an ultrafast and memory-efficient package, a program for BS-Seq variation detection from alignments in standard BAM/SAM format using approximate Bayesian modeling.	Genobioinfo Cluster: Ask for Install
BSMAP	BSMAP is a short reads mapping software for bisulfite sequencing reads. Bisulfite treatment converts unmethylated Cytosines into Uracils (sequenced as Thymine) and leaves methylated Cytosines unchanged, and hence provides a way to study DNA cytosine methylation at single nucleotide resolution. BSMAP aligns the Ts in the reads to both Cs and Ts in the reference.	Genobioinfo Cluster: Ask for Install
Btrim	A fast and accurate adapter, barcode, and low-quality region trimming and binning program written in C for next-generation sequencing reads. The search algorithm is based on Eugene Myers' fast bit-vector algorithm.	Genobioinfo Cluster: Ask for Install
BUCKy	BUCKy is a free program to combine molecular data from multiple loci. BUCKy estimates the dominant history of sampled individuals, and how much of the genome supports each relationship, using Bayesian concordance analysis.	Genobioinfo Cluster: Ask for Install
BUSCO	BUSCO v2 provides quantitative measures for the assessment of genome assembly, gene set, and transcriptome completeness, based on evolutionarily-informed expectations of gene content from near-universal single-copy orthologs selected from OrthoDB v9. BUSCO assessments are implemented in open-source software, with a large selection of lineage-specific sets of Benchmarking Universal Single-Copy Orthologs. These conserved orthologs are ideal candidates for large-scale phylogenomics studies, and the annotated BUSCO gene models built during genome assessments provide a comprehensive gene predictor training set for use as part of genome annotation pipelines.	Genobioinfo Cluster: How to use
BUSCO_phylogenomics	This is a Python pipeline to construct species phylogenies using BUSCO proteins. It works directly from BUSCO output and can generate concatenated supermatrix alignments and also gene trees of BUSCO families. The pipeline identifies BUSCO proteins that are complete and single-copy in all input samples. Alternatively, you can account for missing data and choose to include BUSCO proteins that are complete and single-copy in a certain percentage of input samples. Each BUSCO family is individually aligned, trimmed, and then concatenated together to generate a supermatrix alignment. The pipeline also identifies BUSCO proteins that are complete and single-copy in at least 4 input samples, and generates gene trees for each of these families.	Genobioinfo Cluster: How to use
busco2fasta	A script to turn a set of BUSCO results into a directory of multisequence FASTA files.	Genobioinfo Cluster: Ask for Install
bustools	bustools is a program for manipulating BUS files for single cell RNA-Seq datasets. It can be used to error correct barcodes, collapse UMIs, produce gene count or transcript compatibility count matrices, and is useful for many other tasks.	Genobioinfo Cluster: Ask for Install
bwa	Burrows-Wheeler Aligner (BWA) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such as the human genome. It implements two algorithms, bwa-short and BWA-SW. The former works for query sequences shorter than 200bp and the latter for longer sequences up to around 100kbp. Both algorithms do gapped alignment. They are usually more accurate and faster on queries with low error rates.	Genobioinfo Cluster: How to use
bwa-mem2	Bwa-mem2 is the next version of the bwa-mem algorithm in bwa. It produces alignments identical to bwa and is ~80% faster.	Genobioinfo Cluster: How to use
bwa-meth	Fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome.	Genobioinfo Cluster: How to use
bwtools	bwtool is a command-line utility for bigWig files.	Genobioinfo Cluster: Ask for Install
Cabal	Cabal is the standard package system for Haskell software. It helps people to configure, build and install Haskell software and to distribute it easily to other users and developers.	Genobioinfo Cluster: Ask for Install
Cactus	Cactus is a reference-free whole-genome alignment program, as well as a pangenome graph construction toolkit.	Genobioinfo Cluster: How to use
CAFE	Software for Computational Analysis of gene Family Evolution. The purpose of CAFE is to analyze changes in gene family size in a way that accounts for phylogenetic history and provides a statistical foundation for evolutionary inferences.	Genobioinfo Cluster: How to use
CAID	The CAID software produces all outputs necessary for Critical Assessment of Intrinsic Disorder (CAID) edition, including baselines, references, metrics and plots, starting from predictions and a reference (see Data Availability section to know how to obtain this data).	Genobioinfo Cluster: How to use
CAMI-AMBER	AMBER is an evaluation package for the comparative assessment of genome reconstructions and taxonomic assignments from metagenome benchmark datasets.	Genobioinfo Cluster: Ask for Install
CAMISIM	CAMISIM is a software to model abundance distributions of microbial communities and to simulate corresponding shotgun metagenome datasets.	Genobioinfo Cluster: How to use
canu	A single molecule sequence assembler for genomes large and small.	Genobioinfo Cluster: How to use
CAP3	A DNA Sequence Assembly Program	Genobioinfo Cluster: How to use
CARNAC-LR	Clustering coefficient-based Acquisition of RNA Communities in Long Reads.	Genobioinfo Cluster: Ask for Install
CarpeDeam	CarpeDeam is a damage-aware metagenome assembler for ancient metagenomic DNA datasets. It takes (merged) reads and a damage matrix as input and has proved to work best for heavily damaged datasets.	Genobioinfo Cluster: How to use
Carthagene	CarthaGene is a genetic/radiated hybrid mapping software. CarthaGene looks for multiple populations maximum likelihood consensus maps using a fast EM algorithm for maximum likelihood estimation and powerful ordering algorithms. CarthaGene can handle data made up of several distinct populations which may each be either F2 backcross, recombinant inbred lines, F2 intercross, phase known outbreds and/or radiated hybrids (haploid and diploid data).	Genobioinfo Cluster: Ask for Install
Cas-OFFinder	Cas-OFFinder is an OpenCL-based, ultrafast and versatile program that searches for potential off-target sites of CRISPR/Cas-derived RNA-guided endonucleases (RGEN).	Genobioinfo Cluster: How to use
CAT	This project aims to provide a straightforward end-to-end pipeline that takes as input a HAL-format multiple whole genome alignment as well as a GFF3 file representing annotations on one high quality assembly in the HAL alignment, and produces an output GFF3 annotation on all target genomes chosen.	Genobioinfo Cluster: Ask for Install
CATCH	A package for designing compact and comprehensive capture probe sets.	Genobioinfo Cluster: How to use
CCMetagen	CCMetagen processes sequence alignments produced with KMA, which implements the ConClave sorting scheme to achieve highly accurate read mappings. CCMetagen produces ranked taxonomic results in user-friendly formats that are ready for publication or downstream statistical analyses.	Genobioinfo Cluster: Ask for Install
cctools	The Cooperative Computing Tools (cctools) enable large scale distributed computations to harness hundreds to thousands of machines from clusters, clouds, and grids.	Genobioinfo Cluster: Ask for Install
cd-hit	CD-HIT stands for Cluster Database at High Identity with Tolerance. The program (cd-hit) takes a fasta format sequence database as input and produces a set of 'non-redundant' (nr) representative sequences as output. In addition cd-hit outputs a cluster file, documenting the sequence 'groupies' for each nr sequence representative.	Genobioinfo Cluster: How to use
cdbfasta	This is a brief introduction to a couple of platform independent file-based hashing tools (cdbfasta and cdbyank) that can be used for creating indices for quick retrieval of any particular sequences from large multi-FASTA files.	Genobioinfo Cluster: How to use
cDNA_Cupcake	cDNA_Cupcake is a miscellaneous collection of Python and R scripts used for analyzing sequencing data.	Genobioinfo Cluster: in Cogent module
CEGMA	CEGMA (Core Eukaryotic Genes Mapping Approach) is a pipeline for building a highly reliable set of gene annotations in virtually any eukaryotic genome.	Genobioinfo Cluster: How to use
CellRanger	Cell Ranger is a set of analysis pipelines that processes Chromium single cell 3’ RNA-seq output to align reads, generate gene-cell matrices and perform clustering and gene expression analysis.	Genobioinfo Cluster: How to use
CellRanger ARC	Cell Ranger ARC's pipelines analyze sequencing data produced from Chromium Single Cell Multiome ATAC + Gene Expression.	Genobioinfo Cluster: How to use
CellRanger ATAC	Cell Ranger ATAC is a set of analysis pipelines that process Chromium Single Cell ATAC data.	Genobioinfo Cluster: How to use
CellRanger DNA	Cell Ranger DNA is a set of analysis pipelines that process Chromium single cell DNA sequencing output to align reads, identify copy number variation (CNV), and compare heterogeneity among cells.	Genobioinfo Cluster: Ask for Install
Cellsnp-lite	Cellsnp-lite is a C/C++ tool for efficient genotyping of bi-allelic SNPs on single cells. You can use cellsnp-lite after read alignment to obtain the SNP x cell pileup UMI or read count matrices for each allele of given or detected SNPs.	Genobioinfo Cluster: How to use
CENSOR	CENSOR compares and masks protein or nucleotide sequences.	Genobioinfo Cluster: How to use
Centrifuge	Classifier for metagenomic sequences. Centrifuge is a novel microbial classification engine that enables rapid, accurate and sensitive labeling of reads and quantification of species on desktop computers.	Genobioinfo Cluster: How to use
centroAnno	centroAnno is a prior-independent tool for automatic and efficient centromere/tandem repeat structural analysis across multiple species. centroAnno supports the analysis of repeat units and higher-order tandem repeat units (HORs) in genome/assembly, centromere sequence, and single sequencing long read.	Genobioinfo Cluster: How to use
cgMLSTFinder	Core genome Multi-Locus Sequence Typing. cgMLSTFinder runs KMA against a chosen core genome MLST (cgMLST) database and outputs the detected alleles in a matrix file.	Genobioinfo Cluster: Ask for Install
CheckM	Assess the quality of microbial genomes recovered from isolates, single cells, and metagenomes.	Genobioinfo Cluster: How to use
CheckM2	Assessing the quality of metagenome-derived genome bins using machine learning.	Genobioinfo Cluster: How to use
CHEUI	CHEUI (Methylation (CH₃) Estimation Using Ionic current) is an RNA modification detection software for Oxford Nanopore direct RNA sequencing data. CHEUI can be used to detect m6A and m5C in individual reads at single-nucleotide resolution from any sample (e.g. single condition), or detect differential m6A or m5C between any two conditions. CHEUI uses a two-stage deep learning method to detect m6A and m5C transcriptome-wide at single-read and single-site resolution in any sequence context (i.e. without any sequence constraints).	Genobioinfo Cluster: How to use
chewBBACA	chewBBACA is a software suite for the creation and evaluation of core genome and whole genome MultiLocus Sequence Typing (cg/wgMLST) schemas and results. The "BBACA" stands for "BSR-Based Allele Calling Algorithm". BSR stands for BLAST Score Ratio as proposed by Rasko DA et al. The "chew" part adds extra coolness to the name and could be thought of as "Comprehensive and Highly Efficient Workflow".	Genobioinfo Cluster: How to use
chimerascan	chimerascan is a software package that detects gene fusions in paired-end RNA sequencing (RNA-Seq) datasets.	Genobioinfo Cluster: Ask for Install
ChimPipe	ChimPipe is a computational method for the detection of novel transcription-induced chimeric transcripts and fusion genes from Illumina Paired-End RNA-seq data. It combines junction spanning and paired-end read information to accurately detect chimeric splice junctions at base-pair resolution.	Genobioinfo Cluster: Ask for Install
chopper	This tool, intended for long read sequencing such as PacBio or ONT, filters and trims a fastq file. chopper combines the now outdated tools NanoFilt and NanoLyse. It filters QC files and runs faster than NanoFilt and NanoLyse.	Genobioinfo Cluster: How to use
ChopStitch	Exon annotation and splice graph construction using transcriptome assembly and whole genome sequencing data.	Genobioinfo Cluster: Ask for Install
chromeister	A dotplot generator for large chromosomes.	Genobioinfo Cluster: How to use
Chromonomer	Chromonomer is a program designed to integrate a genome assembly with a genetic map.	Genobioinfo Cluster: Ask for Install
Chromosight	Python package to detect chromatin loops (and other patterns) in Hi-C contact maps.	Genobioinfo Cluster: How to use
CIRCexplorer2	CIRCexplorer2 is a comprehensive and integrative circular RNA analysis toolset	Genobioinfo Cluster: How to use
circfull	A tool to detect and quantify full-length circRNA isoforms from circFL-seq.	Genobioinfo Cluster: How to use
Circlator	A tool to circularize genome assemblies	Genobioinfo Cluster: How to use
circos	Circos is a software package for visualizing data and information.	Genobioinfo Cluster: How to use
circtools	A modular, python-based framework for circRNA-related tools that unifies several functionalities in a single, command line driven software.	Genobioinfo Cluster: Ask for Install
Circuitscape	Circuitscape borrows algorithms from electronic circuit theory to predict patterns of movement, gene flow, and genetic differentiation among plant and animal populations in heterogeneous landscapes.	Genobioinfo Cluster: How to use
CIRI	CIRI (circRNA identifier) is a novel chiastic clipping signal based algorithm, which can unbiasedly and accurately detect circRNAs from transcriptome data by employing multiple filtration strategies.	Genobioinfo Cluster: Ask for Install
CIRI-long	Circular RNA Identification for Long-Reads Nanopore Sequencing Data.	Genobioinfo Cluster: How to use
CITE-seq-Count	A tool for obtaining UMI counts from a single cell protein assay.	Genobioinfo Cluster: Ask for Install
Clair3	Clair3 is a germline small variant caller for long-reads. Clair3 makes the best of two major method categories: pileup calling handles most variant candidates with speed, and full-alignment tackles complicated candidates to maximize precision and recall. Clair3 runs fast and has superior performance, especially at lower coverage. Clair3 is simple and modular for easy deployment and integration.	Genobioinfo Cluster: How to use
CLARC	Connected Linkage and Alignment Redefinition of COGs: a tool that uses sequence identity, linkage patterns and functional annotations to identify and reduce the over-splitting of accessory genes into multiple clusters of orthologous genes (COGs) in a pangenome analysis. In summary, CLARC is meant to complement existing bacterial pangenome tools by polishing their COG definitions. As input, the pipeline currently takes the presence/absence matrix generated with Roary (but can also accept inputs from Panaroo, PPanGGOLiN and RIBAP). We believe CLARC is particularly helpful for researchers that plan to perform downstream analyses that rely on COG frequencies, such as studying the evolutionary dynamics of accessory genes or running a panGWAS.	Genobioinfo Cluster: How to use
CleaveLand4	Analysis of degradome data to find sliced miRNA and siRNA targets	Genobioinfo Cluster: How to use
ClinSV		Genobioinfo Cluster: How to use
ClipAndMerge	Clip&Merge is a tool to clip off adapters from sequencing reads and merge overlapping paired end reads together.	Genobioinfo Cluster: How to use
ClonalFrameML	A software package that performs efficient inference of recombination in bacterial genomes.	Genobioinfo Cluster: How to use
CLUMPAK	Clustering Markov Packager Across K: developed to help users analyse the results of STRUCTURE-like programs. The software offers a few alternative modes of action; please go to the Help section for details about these modes.	Genobioinfo Cluster: How to use
Clumppling	CLUster Matching and Permutation Program that uses integer Linear programmING: a framework for aligning mixed-membership clustering results of population structure analysis.	Genobioinfo Cluster: How to use
Clustal Omega	Clustal Omega is the latest addition to the Clustal family. It offers a significant increase in scalability over previous versions, allowing hundreds of thousands of sequences to be aligned in only a few hours. It will also make use of multiple processors, where present. In addition, the quality of alignments is superior to previous versions, as measured by a range of popular benchmarks	Genobioinfo Cluster: How to use
ClustalW	Multiple sequence alignment program for DNA or proteins.	Genobioinfo Cluster: Ask for Install
CMfinder	CMfinder is an RNA motif prediction tool.	Genobioinfo Cluster: Ask for Install
cnD	cnD is a program to detect copy number variants from short-read sequence data.	Genobioinfo Cluster: How to use
CNVkit	A command-line toolkit and Python library for detecting copy number variants and alterations genome-wide from high-throughput sequencing. CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent. Read the full documentation at: http://cnvkit.readthedocs.io	Genobioinfo Cluster: How to use
CNVnator	A tool for CNV discovery and genotyping from depth of read mapping.	Genobioinfo Cluster: Ask for Install
Cnvpipelines	A pipeline to detect copy number variations (CNV) on several samples.	Genobioinfo Cluster: Ask for Install
code-server	Run VSCode on any machine anywhere and access it in the browser.	Genobioinfo Cluster: How to use
Cogent	Cogent is a tool for reconstructing the coding genome using high-quality full-length transcriptome sequences. It is designed to be used on Iso-Seq data and in cases where there is no reference genome or the ref genome is highly incomplete.	Genobioinfo Cluster: How to use
ColabFold	ColabFold is an easy-to-use Notebook based environment for fast and convenient protein structure predictions. Its structure prediction is powered by AlphaFold2 and RoseTTAFold combined with a fast multiple sequence alignment generation stage using MMseqs2.	Genobioinfo Cluster: How to use
COMEBin	COMEBin allows effective binning of metagenomic contigs using COntrastive Multi-viEw representation learning.	Genobioinfo Cluster: How to use
Comp-D	A program for comprehensive computation of D-statistics and population summaries (serial version).	Genobioinfo Cluster: Ask for Install
COMPADRE	COMPADRE integrates genome-wide IBD sharing estimates from PRIMUS and shared segments length and distribution data from ERSA to improve relationship estimation accuracy in family networks ahead of pedigree generation. COMPADRE aims to extend the number and variety of constructed pedigrees derived from populations with increased data heterogeneity.	Genobioinfo Cluster: How to use
Computel	Computel is designed for measuring mean telomere length and abundance of canonical and variant telomeric repeats from Illumina Whole Genome NGS Sequencing data.	Genobioinfo Cluster: How to use
CONCOCT	A program for unsupervised binning of metagenomic contigs by using nucleotide composition, coverage data in multiple samples and linkage data from paired end reads.	Genobioinfo Cluster: How to use
Concrete Autoencoders	The concrete autoencoder is an end-to-end differentiable method for global feature selection, which efficiently identifies a subset of the most informative features and simultaneously learns a neural network to reconstruct the input data from the selected features.	Genobioinfo Cluster: Ask for Install
Consensify	Consensify is a method for generating a consensus pseudohaploid genome sequence with greatly reduced error rates compared to standard pseudohaploidisation.	Genobioinfo Cluster: Ask for Install
CONSENT	CONSENT (sCalable self-cOrrectioN of long reads with multiple SEquence alignmeNT) is a self-correction method for long reads.	Genobioinfo Cluster: Ask for Install
Conterminator	Conterminator is an efficient method for detecting incorrectly labeled sequences across kingdoms by an exhaustive all-against-all sequence comparison.	Genobioinfo Cluster: Ask for Install
ContextMap2	Fast and accurate context-based RNA-seq mapping. ContextMap determines the most likely origin of a read by evaluating the context of the read in the form of alignments of other reads to the same genomic region. In the original implementation, the focus was on improving initial mappings provided by other mapping tools.	Genobioinfo Cluster: Ask for Install
CONTRAST	CONTRAST predicts protein-coding genes from a multiple genomic alignment using a combination of discriminative machine learning techniques.	Genobioinfo Cluster: Ask for Install
ContScout	ContScout is a pipeline developed for the identification and removal of contaminating sequences in draft genomes.	Genobioinfo Cluster: How to use
Cooler	Cooler is a support library for a sparse, compressed, binary persistent storage format, also called cooler, used to store genomic interaction data, such as Hi-C contact matrices.	Genobioinfo Cluster: How to use
coolpuppy	A versatile tool to perform pile-up analysis on Hi-C data in .cool format.	Genobioinfo Cluster: Ask for Install
CRABS	CRABS (Creating Reference databases for Amplicon-Based Sequencing) is a versatile software program that generates curated reference databases for metagenomic analysis.	Genobioinfo Cluster: How to use
CREPE	CREPE is a batch primer design and specificity analysis tool. It uses Primer3 (https://primer3.org/) to create primers from an input CSV of target sites. It then uses UCSC's In-Silico PCR (https://genome.ucsc.edu/cgi-bin/hgPcr) to identify off-target enrichment sites for each primer pair. Lastly, a custom Python evaluation script (E-script) performs specificity analysis to determine the quality of predicted off-target sites from ISPCR.	Genobioinfo Cluster: How to use
CRISP	CRISP is a software program to detect SNPs and short indels from pooled sequencing data.	Genobioinfo Cluster: Ask for Install
CRISPOR	CRISPOR predicts off-targets in the genome, ranks guides, highlights problematic guides, designs primers and helps with cloning.	Genobioinfo Cluster: How to use
CRISPR-broad	CRISPR-broad is a standalone tool that enables users to scan a genome for regions that have a high frequency of gRNAs with user-supplied variation. The package was developed for the design of gRNAs for targeted epigenome modifications over a broader region.	Genobioinfo Cluster: How to use
CRISPR-HAWK	CRISPR-HAWK is a comprehensive and scalable tool for designing guide RNAs (gRNAs) and assessing the impact of genetic variants on on-target sites in CRISPR-Cas systems. This makes CRISPR-HAWK particularly suitable for both personalized and population-wide gRNA design. CRISPR-HAWK automates the entire workflow, from variant-aware preprocessing to gRNA discovery, delivering comprehensive outputs including ranked tables, annotated sequences, and high-quality figures. Its modular design ensures easy integration with existing pipelines and tools, such as CRISPRme or CRISPRitz, for subsequent off-target prediction and analysis of prioritized gRNAs.	Genobioinfo Cluster: How to use
CRISPResso2	CRISPResso is a software pipeline designed to enable rapid and intuitive interpretation of genome editing experiments. Briefly, CRISPResso: aligns sequencing reads to a reference sequence, quantifies insertions, mutations and deletions to determine whether a read is modified or unmodified by genome editing, and summarizes editing results in intuitive plots and datasets.	Genobioinfo Cluster: How to use
CRISPRme	CRISPRme is a comprehensive tool designed for thorough off-target assessment in CRISPR-Cas systems. CRISPRme accounts for single-nucleotide variants (SNVs) and indels, considers bona fide haplotypes, and allows for spacer:protospacer mismatches and bulges, making it well-suited for both population-wide and personal genome analyses. CRISPRme automates the entire workflow, from data download to executing the search, and delivers detailed reports complete with tables and figures through an interactive web-based interface.	Genobioinfo Cluster: How to use
CroCo	A program to detect potential cross contaminations in HTS assembled transcriptomes using expression level quantification.	Genobioinfo Cluster: Ask for Install
csem	CSEM is a ChIP-Seq multi-read allocator. CSEM stands for ChIP-Seq multi-read allocation using Expectation-Maximization.	Genobioinfo Cluster: How to use
csvtk	A cross-platform, efficient and practical CSV/TSV toolkit in Golang.	Genobioinfo Cluster: Ask for Install
csvtk	A cross-platform, efficient and practical CSV/TSV toolkit in Golang.	Genobioinfo Cluster: How to use
Cufflinks	Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one, taking into account biases in library preparation protocols.	Genobioinfo Cluster: How to use
currentNE	Estimation of current effective population using artificial neural networks.	Genobioinfo Cluster: How to use
cutadapt	Cutadapt removes adapter sequences from DNA high-throughput sequencing data. This is usually necessary when the read length of the machine is longer than the molecule that is sequenced, such as in microRNA data.	Genobioinfo Cluster: How to use
cuteSV	Long read based human genomic structural variation detection with cuteSV.	Genobioinfo Cluster: How to use
cyvcf2	cyvcf2 is a cython wrapper around htslib built for fast parsing of Variant Call Format (VCF) files.	Genobioinfo Cluster: How to use
d2SBin	Improving the binning of metagenomic contigs on d2S oligonucleotide frequency dissimilarity	Genobioinfo Cluster: Ask for Install
dadi	dadi implements a method for demographic inference from genetic data, based on a diffusion approximation to the allele frequency spectrum.	Genobioinfo Cluster: How to use
DALIGNER	The commands below permit one to find all significant local alignments between reads encoded in a Dazzler database. The assumption is that the reads are from a PACBIO RS II long read sequencer.	Genobioinfo Cluster: Ask for Install
DamageProfiler	A Java based tool to determine damage patterns on ancient DNA as a replacement for mapDamage. DamageProfiler calculates damage profiles of mapped reads and provides a graphical as well as text based representation. It creates damage plots, a fragment length distribution, a read identity distribution, a base frequency table of the reference, and a table of different base misincorporations and their occurrences.	Genobioinfo Cluster: How to use
DAmar	Long read QC, assembly and scaffolding pipeline for PacBio or Oxford Nanopore long-read sequencing data. The pipeline produces a number of QC metrics at various stages as well as incorporating further technologies including Bionano, 10x and HiC data to scaffold the created contigs. DAmar is a hybrid of the earlier Marvel, Dazzler, and Daccord systems of the Eugene Myers lab.	Genobioinfo Cluster: Ask for Install
DANPOS2	A toolkit for Dynamic Analysis of Nucleosome and Protein Occupancy by Sequencing, version 2	Genobioinfo Cluster: Ask for Install
DARIC	A complete framework for identifying quantitatively differential compartments from Hi-C and Micro-C data. `DARIC`, or Differential Analysis for genomic Regions' Interaction with Compartments, is a computational framework to identify the quantitatively differential compartments from Hi-C-like data. For more details about the design and implementation of the framework, please check our paper published at BMC Genomics.	Genobioinfo Cluster: How to use
DAS_Tool	An automated method that integrates the results of a flexible number of binning algorithms to calculate an optimized, non-redundant set of bins from a single assembly.	Genobioinfo Cluster: Ask for Install
Dasel	Dasel (short for data-selector) allows you to query and modify data structures using selector strings.	Genobioinfo Cluster: Ask for Install
datamash	GNU datamash is a command-line program which performs basic numeric, textual and statistical operations on input textual data files.	Genobioinfo Cluster: default system
DATES	DATES (Distribution of Ancestry Tracts of Evolutionary Signals) is a method to estimate the time of admixture in ancient DNA samples described in Narasimhan, Patterson et al. 2018	Genobioinfo Cluster: How to use
DAZZ_DB	To facilitate the multiple phases of the dazzler assembler, we organize all the read data into what is effectively a "database" of the reads and their meta-information.	Genobioinfo Cluster: Ask for Install
dbcAmplicons	Analysis of Double Barcoded Illumina Amplicon Data.	Genobioinfo Cluster: Ask for Install
dbCAN3	run_dbcan (dbCAN3) is the standalone version of the dbCAN3 annotation tool for automated CAZyme annotation.	Genobioinfo Cluster: How to use
DBG2OLC	The genome assembler that reduces the computational time of human genome assembly from 400,000 CPU hours to 2,000 CPU hours, utilizing long erroneous 3GS sequencing reads and short accurate NGS sequencing reads.	Genobioinfo Cluster: Ask for Install
DBSCAN-SWA	An integrated tool for rapid prophage detection and annotation.	Genobioinfo Cluster: Ask for Install
DeChat	Repeat and haplotype aware error correction in nanopore sequencing reads with DeChat.	Genobioinfo Cluster: How to use
decOM	decOM is a high-accuracy microbial source tracking method that is suitable for contamination quantification in paleogenomics, namely the analysis of collections of possibly contaminated ancient oral metagenomic data sets.	Genobioinfo Cluster: How to use
DeconSeq	Detect and remove contaminations from your sequence data.	Genobioinfo Cluster: Ask for Install
DECX	This is the DECX (DEC eXtended) model for historical biogeographic inference	Genobioinfo Cluster: Ask for Install
DeDup	A merged read deduplication tool capable of performing merged read deduplication on single end data.	Genobioinfo Cluster: Ask for Install
Deepbinner	Deepbinner is a tool for demultiplexing barcoded Oxford Nanopore sequencing reads. It does this with a deep convolutional neural network classifier, using many of the architectural advances that have proven successful in image classification. Unlike other demultiplexers (e.g. Albacore and Porechop), Deepbinner identifies barcodes from the raw signal (a.k.a. squiggle) which gives it greater sensitivity and fewer unclassified reads.	Genobioinfo Cluster: Ask for Install
DeepSignal	Detecting methylation using signal-level features from Nanopore sequencing reads.	Genobioinfo Cluster: Ask for Install
DeepTMHMM	A Deep Learning Model for Transmembrane Topology Prediction and Classification	Genobioinfo Cluster: How to use
deepTools	Tools to process and analyze deep sequencing data.	Genobioinfo Cluster: How to use
DeepVariant	DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.	Genobioinfo Cluster: How to use
Delly	DELLY is an integrated structural variant prediction method that can detect deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data. It uses paired-ends and split-reads to sensitively and accurately delineate genomic rearrangements throughout the genome.	Genobioinfo Cluster: How to use
demuxlet	Genetic multiplexing of barcoded single cell RNA-seq	Genobioinfo Cluster: How to use
DENTIST	DENTIST is a sensitive, highly-accurate and automated pipeline method to close gaps in (short read) assemblies with long reads.	Genobioinfo Cluster: Ask for Install
DESMAN	De novo Extraction of Strains from MetAgeNomes.	Genobioinfo Cluster: Ask for Install
detettore	A program to detect transposable element polymorphisms	Genobioinfo Cluster: Ask for Install
devCellPy	devCellPy is a Python package designed for hierarchical multilayered classification of cells based on single-cell RNA-sequencing (scRNA-seq).	Genobioinfo Cluster: How to use
DFE-alpha	DFE-alpha was initially written to estimate the distribution of fitness effects (DFE) of new deleterious mutations using within-species nucleotide polymorphism data.	Genobioinfo Cluster: How to use
DIAMOND	Accelerated BLAST compatible local sequence aligner.	Genobioinfo Cluster: How to use
DIAMOND2GO	Diamond2GO is a set of tools that can rapidly assign gene ontology and perform enrichment for functional genomics.	Genobioinfo Cluster: How to use
diffTF	Quantification of Differential Transcription Factor Activity and Multiomics-Based Classification into Activators and Repressors.	Genobioinfo Cluster: How to use
DinuQ	The DinuQ (Dinucleotide Quantification) Python3 package provides a range of metrics for quantifying nucleotide, dinucleotide and synonymous codon representation in genetic sequences.	Genobioinfo Cluster: Ask for Install
Discovar	Assemble genomes and find variants with DISCOVAR & DISCOVAR de novo	Genobioinfo Cluster: Ask for Install
DIYABC	A user-friendly approach to Approximate Bayesian Computation for inference on population history using molecular markers.	Genobioinfo Cluster: Ask for Install
DLCpar	DLCpar is a reconciliation method for inferring gene duplications, losses, and coalescence (accounting for incomplete lineage sorting).	Genobioinfo Cluster: How to use
drap	De novo RNA-seq Assembly Pipeline	Genobioinfo Cluster: How to use
dRep	dRep is a python program for rapidly comparing large numbers of genomes. dRep can also "de-replicate" a genome set by identifying groups of highly similar genomes and choosing the best representative genome for each genome set.	Genobioinfo Cluster: How to use
DSK	DSK is a k-mer counting software, similar to Jellyfish. DSK supports large values of k, and runs with (almost-)arbitrarily low memory usage and reasonably low temporary disk usage. DSK can count k-mers of large Illumina datasets on laptops and desktop computers.	Genobioinfo Cluster: Ask for Install
Dsuite	Fast calculation of Patterson's D (ABBA-BABA) and the f4-ratio statistics across many populations/species	Genobioinfo Cluster: How to use
dysgu	dysgu-SV is a collection of tools for calling structural variants using short or long reads.	Genobioinfo Cluster: How to use
E2P2	Ensemble Enzyme Prediction Pipeline.	Genobioinfo Cluster: How to use
Eagle	The Eagle software estimates haplotype phase either within a genotyped cohort or using a phased reference panel.	Genobioinfo Cluster: Ask for Install
EarlGrey	A fully automated TE curation and annotation pipeline.	Genobioinfo Cluster: How to use
ecoPCR	ecoPCR is an electronic PCR software developed by LECA and Helix-Project. It helps you to estimate barcode primer quality. In conjunction with OBItools, you can postprocess ecoPCR output to compute barcode coverage and barcode specificity.	Genobioinfo Cluster: How to use
ecoPrimers	ecoPrimer is a barcoding software which is written in C language. It finds universal primers from a set of input DNA sequences by finding conserved regions without "a priori" on candidate sequences. It also evaluates the quality of the primers and barcode regions by measuring the "barcode specificity" and "barcode coverage" indices	Genobioinfo Cluster: How to use
EDirect	Entrez Direct (EDirect) provides access to the NCBI's suite of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a UNIX terminal window	Genobioinfo Cluster: How to use
EDTA	This package is developed for automated whole-genome de-novo TE annotation and benchmarking the annotation performance of TE libraries.	Genobioinfo Cluster: How to use
EEMS	EEMS method for analyzing and visualizing spatial population structure from geo-referenced genetic samples.	Genobioinfo Cluster: How to use
EffectorP	EffectorP is a machine learning method for fungal effector prediction in secretomes and has been trained to distinguish secreted proteins from secreted effectors in plant-pathogenic fungi.	Genobioinfo Cluster: How to use
EGA_download_client	The EgaDemoClient is a JAVA based data streamer that enables EGA account holders to securely download files and datasets, either through an interactive shell (IS) or using direct command line mode (DCLM).	Genobioinfo Cluster: Ask for Install
EggLib	EggLib is a C++/Python library and program package for evolutionary genetics and genomics.	Genobioinfo Cluster: Ask for Install
eggNog-mapper	eggnog-mapper is a tool for fast functional annotation of novel sequences. It uses precomputed orthologous groups and phylogenies from the eggNOG database to transfer functional information from fine-grained orthologs only.	Genobioinfo Cluster: How to use
Eigen	Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.	Genobioinfo Cluster: Ask for Install
Eigensoft	The EIGENSOFT package combines functionality from our population genetics methods (Patterson et al. 2006) and our EIGENSTRAT stratification correction method (Price et al. 2006).	Genobioinfo Cluster: How to use
ELAI	The software performs local ancestry inference for admixed individuals.	Genobioinfo Cluster: How to use
elPrep	elPrep is a high-performance tool for analyzing .sam/.bam files (up to and including variant calling) in sequencing pipelines.	Genobioinfo Cluster: How to use
eLSA	Extended Local Similarity Analysis -- Finding Time-Dependent Associations in Time Series Datasets	Genobioinfo Cluster: Ask for Install
EMA	EMA uses a latent variable model to align barcoded short-reads (such as those produced by 10x Genomics' sequencing platform).	Genobioinfo Cluster: Ask for Install
emacs	Universal text editor.	Genobioinfo Cluster: How to use
EMBOSS	EMBOSS is "The European Molecular Biology Open Software Suite". EMBOSS is a free Open Source software analysis package specially developed for the needs of the molecular biology (e.g. EMBnet) user community. The software automatically copes with data in a variety of formats and even allows transparent retrieval of sequence data from the web. Also, as extensive libraries are provided with the package, it is a platform to allow other scientists to develop and release software in true open source spirit. EMBOSS also integrates a range of currently available packages and tools for sequence analysis into a seamless whole.	Genobioinfo Cluster: How to use
EMMAX	EMMAX is a statistical test for large-scale human or model organism association mapping that accounts for sample structure. In addition to the computational efficiency obtained by the EMMA algorithm, EMMAX takes advantage of the fact that each locus explains only a small fraction of complex traits, which makes it possible to avoid repeating the variance component estimation procedure and results in a significant reduction in the computational time of association mapping using mixed models.	Genobioinfo Cluster: How to use
Emu	Emu is a relative abundance estimator for 16S genomic sequences. The method is optimized for error-prone full-length reads, but can also be utilized for short-read data.	Genobioinfo Cluster: How to use
enaBrowserTools	enaBrowserTools is a set of scripts that interface with the ENA web services to download data from ENA easily, without any knowledge of scripting required.	Genobioinfo Cluster: How to use
Ensembl-API	Ensembl uses MySQL relational databases to store its information. A comprehensive set of Application Programme Interfaces (APIs) serve as a middle-layer between underlying database schemes and more specific application programmes. The APIs aim to encapsulate the database layout by providing efficient high-level access to data tables and isolate applications from data layout changes. Ensembl's API is written in Perl	Genobioinfo Cluster: Ask for Install
EPIK	EPIK is a program for rapid alignment-free phylogenetic placement.	Genobioinfo Cluster: How to use
ErmineJ	ErmineJ performs analyses of gene sets in high-throughput genomics data such as gene expression profiling studies.	Genobioinfo Cluster: Ask for Install
ERPIN	ERPIN (Easy RNA Profile IdentificatioN) is an RNA motif search program developed by Daniel Gautheret and André Lambert.	Genobioinfo Cluster: Ask for Install
ERVmap	ERVmap is one part curated database of human proviral ERV loci and one part a stringent algorithm to determine which ERVs are transcribed in their RNA seq data.	Genobioinfo Cluster: Ask for Install
ESM-2	This repository contains code and pre-trained weights for Transformer protein language models from the Meta Fundamental AI Research Protein Team (FAIR), including our state-of-the-art ESM-2 and ESMFold, as well as MSA Transformer, ESM-1v for predicting variant effects and ESM-IF1 for inverse folding	Genobioinfo Cluster: How to use
est-sfs	est-sfs implements a maximum likelihood method to infer the unfolded site frequency spectrum (the uSFS) and ancestral state probabilities for DNA sequence data.	Genobioinfo Cluster: Ask for Install
ETE	A Python framework for the analysis and visualization of trees.	Genobioinfo Cluster: in Python-3.11.1 and How to use
EuGeneEP	EuGene is an open integrative gene finder for eukaryotic and prokaryotic genomes. EuGene-EP (Eukaryote Pipeline) facilitates the application of EuGene on eukaryote genomes.	Genobioinfo Cluster: Ask for Install
EukCC	EukCC is a completeness and contamination estimator for metagenomic assembled microbial eukaryotic genomes.	Genobioinfo Cluster: Ask for Install
EukRep	Classification of Eukaryotic and Prokaryotic sequences from metagenomic datasets.	Genobioinfo Cluster: How to use
EUPAN	Toolkit that integrates various software in order to build eukaryotic pangenomes.	Genobioinfo Cluster: Ask for Install
eva-sub-cli	The eva-sub-cli tool is a command line interface tool for data validation and upload.	Genobioinfo Cluster: How to use
evalAdmix	evalAdmix evaluates the results of an admixture analysis (i.e. the result of applying ADMIXTURE, STRUCTURE, NGSadmix and similar).	Genobioinfo Cluster: How to use
EVE	EVE is a set of protein-specific models providing for any single amino acid mutation of interest a score reflecting the propensity of the resulting protein to be pathogenic.	Genobioinfo Cluster: How to use
EviAnn	EviAnn (Evidence Annotation) is novel genome annotation software. It is purely evidence-based. EviAnn derives protein-coding gene and long non-coding RNA annotations from RNA-seq data and/or transcripts, and alignments of proteins from related species. EviAnn outputs annotations in GFF3 format. EviAnn does not require genome repeats to be soft-masked prior to running annotation.	Genobioinfo Cluster: How to use
EVidenceModeler (EVM)	The EVidenceModeler (aka EVM) software combines ab initio gene predictions and protein and transcript alignments into weighted consensus gene structures. EVM provides a flexible and intuitive framework for combining diverse evidence types into a single automated gene structure annotation system.	Genobioinfo Cluster: How to use
EvoBind	EvoBind (v2) designs novel peptide binders based only on a protein target sequence. It is not necessary to specify any target residues within the protein sequence or the length of the binder (although this is possible). Cyclic binder design is also possible.	Genobioinfo Cluster: How to use
ExaML	Exascale Maximum Likelihood (ExaML) code for phylogenetic inference using MPI.	Genobioinfo Cluster: How to use
Exonerate	A generic tool for sequence alignment.	Genobioinfo Cluster: How to use
eXpress	eXpress is a streaming tool for quantifying the abundances of a set of target sequences from sampled subsequences.	Genobioinfo Cluster: How to use
f5c	Ultra-fast methylation calling and event alignment tool for nanopore sequencing data.	Genobioinfo Cluster: How to use
FABuLOUS	A gap-closing software tool that uses error-prone long reads generated by third-generation sequencing techniques (PacBio, Oxford Nanopore, etc.) or preassembled contigs to fill N-gaps in the genome assembly. Initially called TGS-GapCloser.	Genobioinfo Cluster: Ask for Install
FALCON	Falcon: a set of tools for fast aligning long reads for consensus and assembly	Genobioinfo Cluster: Ask for Install
FALCON_unzip	Making diploid assembly common practice for genomic studies.	Genobioinfo Cluster: Ask for Install
FALCON-Phase	FALCON-Phase integrates PacBio long-read assemblies with Phase Genomics Hi-C data to create phased, diploid, chromosome-scale scaffolds.	Genobioinfo Cluster: Ask for Install
FaMoz	FaMoz, a software written in the C language and in TclTk, uses likelihood calculation and simulation to perform parentage studies with codominant, dominant, cytoplasmic markers or combinations of the different types.	Genobioinfo Cluster: Ask for Install
FAMSA	Algorithm for large-scale multiple sequence alignments (400k proteins in 2 hours and 8 GB of RAM).	Genobioinfo Cluster: Ask for Install
FAN-C	Framework for the ANalysis of C-like data.	Genobioinfo Cluster: How to use
FaST-LMM	FaST-LMM (Factored Spectrally Transformed Linear Mixed Models) is a program for performing genome-wide association studies (GWAS) on large data sets.	Genobioinfo Cluster: Ask for Install
Fast-Plast	Fast-Plast is a pipeline that leverages existing and novel programs to quickly assemble, orient, and verify whole chloroplast genome sequences.	Genobioinfo Cluster: Ask for Install
FASTA	FASTA is a sequence similarity search tool which uses heuristics for fast local alignment searching.	Genobioinfo Cluster: How to use
FASTA Composition	FASTA Composition finds the overall composition of sequences in a FASTA file.	Genobioinfo Cluster: How to use
FASTA_Length	FASTA Length finds the lengths of sequences in a FASTA file.	Genobioinfo Cluster: How to use
fasta_validator	C code to validate a fasta file.	Genobioinfo Cluster: How to use
FastaGrep	FastaGrep is a tool for searching oligonucleotide binding sites from FastA genomic sequences. It can do both match/mismatch based and thermodynamic binding energy searches.	Genobioinfo Cluster: Ask for Install
FastANI	FastANI is developed for fast alignment-free computation of whole-genome Average Nucleotide Identity (ANI)	Genobioinfo Cluster: How to use
fastGLOBETROTTER	fastGLOBETROTTER is an efficient method to identify, date and describe admixture events using haplotype information. It is an updated version of the GLOBETROTTER model that uses the same input but is ~4-20 times faster than GLOBETROTTER without sacrificing accuracy.	Genobioinfo Cluster: How to use
fastix	A simple command line tool to add prefixes to FASTA headers.	Genobioinfo Cluster: How to use
fastk-medians	A set of utilities to calculate the median number of times the k-mers in a sequence of interest occur across the whole set.	Genobioinfo Cluster: How to use
FastME	FastME provides distance algorithms to infer phylogenies. FastME is based on balanced minimum evolution, which is the very principle of NJ. FastME improves over NJ by performing topological moves using fast, sophisticated algorithms.	Genobioinfo Cluster: How to use
fastNGSadmix	Program for inferring admixture proportions and doing PCA with a single NGS sample. Inferences are based on a reference panel.	Genobioinfo Cluster: Ask for Install
fastp	A tool designed to provide fast all-in-one preprocessing for FastQ files.	Genobioinfo Cluster: How to use
fastPHASE	A tool for genotype imputation and estimating missing haplotypes.	Genobioinfo Cluster: Ask for Install
fastplong	Ultra-fast preprocessing and quality control for long-read sequencing data.	Genobioinfo Cluster: How to use
fastprofkernel	fastprofkernel is a Debian package that uses an accelerated version of the original profile kernel to automatically train SVM-based classification models. It can assign user-defined classes to so far uncharacterized proteins.	Genobioinfo Cluster: How to use
FastQ Screen	FastQ Screen allows you to screen a library of sequences in FastQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.	Genobioinfo Cluster: How to use
fastq_illumina_filter	This program can filter FASTQ files produced by CASAVA 1.8, and keep/discard reads based on this filter flag.	Genobioinfo Cluster: How to use
fastq-tools	A collection of small and efficient programs for performing some common and uncommon tasks with FASTQ files.	Genobioinfo Cluster: Ask for Install
FastQC	A Quality Control application for FastQ files. FastQC is an application which takes a FastQ file and runs a series of tests on it to generate a comprehensive QC report.	Genobioinfo Cluster: How to use
fastQValidator	The fastQValidator validates the format of fastq files	Genobioinfo Cluster: Ask for Install
FastSimBac	FastSimBac is a simulator of the coalescent process with bacterial recombination that simulates genealogies spatially across chromosomes as a Markov process.	Genobioinfo Cluster: Ask for Install
fastsimcoal2	Fast sequential Markov coalescent simulation of genomic data under complex evolutionary models	Genobioinfo Cluster: How to use
fastStructure	fastStructure is an algorithm for inferring population structure from large SNP genotype data. It is based on a variational Bayesian framework for posterior inference and is written in Python 2.x.	Genobioinfo Cluster: How to use
FastTree	FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences.	Genobioinfo Cluster: How to use
FASTX-Toolkit	The FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.	Genobioinfo Cluster: How to use
FBAT	FBAT is an acronym for Family-Based Association Tests in genetic analyses. Family-based association designs, as opposed to case-control study designs, are particularly attractive, since they test for linkage as well as association, avoid spurious associations caused by admixture of populations, and are convenient for investigators interested in refining linkage findings in family samples.	Genobioinfo Cluster: Ask for Install
FCS	The NCBI Foreign Contamination Screen (FCS) is a tool suite (FCS-adaptor and FCS-GX) for identifying and removing contaminant sequences in genome assemblies.	Genobioinfo Cluster: How to use
FCS-GX	FCS-GX detects contamination from foreign organisms in genome sequences. This tool is one module within the NCBI Foreign Contamination Screening (FCS) program suite.	Genobioinfo Cluster: How to use
FEELnc	FlExible Extraction of LncRNA.	Genobioinfo Cluster: How to use
FFmpeg	FFmpeg is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much anything that humans and machines have created.	Genobioinfo Cluster: How to use
fgbio	A set of tools to analyze genomic data with a focus on Next Generation Sequencing.	Genobioinfo Cluster: Ask for Install
FigTree	FigTree is designed as a graphical viewer of phylogenetic trees and as a program for producing publication-ready figures.	Genobioinfo Cluster: How to use
Filtlong	Filtlong is a tool for filtering long reads by quality.	Genobioinfo Cluster: How to use
FindingOverCovRegions	FindOverCovRegions.py searches for genomic regions with abnormal read coverage (e.g. depth). To do so, this program requires a bedGraph-like file (e.g. bedtools genomecov per-base reports) in which the coverage depth is reported for each position of the genome (including 0 values).	Genobioinfo Cluster: Ask for Install
fineRADstructure	A complete, easy to use, and fast population inference package for RAD-seq data.	Genobioinfo Cluster: Ask for Install
fineSTRUCTURE	fineSTRUCTURE is a fast and powerful algorithm for identifying population structure using dense sequencing data.	Genobioinfo Cluster: How to use
FLAIR	FLAIR (Full-Length Alternative Isoform analysis of RNA) for the correction, isoform definition, and alternative splicing analysis of noisy reads. FLAIR has primarily been used for nanopore cDNA, native RNA, and PacBio sequencing reads.	Genobioinfo Cluster: Ask for Install
FLAMES	Full-length transcriptome splicing and mutation analysis.	Genobioinfo Cluster: How to use
flare	The flare program uses a set of reference haplotypes to infer the ancestry of each allele in a set of admixed study samples. The flare program is fast, accurate, and memory-efficient.	Genobioinfo Cluster: How to use
FLAS	FLAS is software that makes self-correction for PacBio long reads with fast speed and high throughput.	Genobioinfo Cluster: Ask for Install
FLASH	FLASH, Fast Length Adjustment of SHort reads, is a very accurate fast tool to merge paired-end reads from fragments that are shorter than twice the length of reads. The extended length of reads has a significant positive impact on improvement of genome assemblies.	Genobioinfo Cluster: How to use
Flexbar	Flexbar preprocesses high-throughput sequencing data efficiently. It demultiplexes barcoded runs and removes adapter sequences. Moreover, trimming and filtering features are provided. Flexbar increases read mapping rates and improves genome and transcriptome assemblies. It supports next-generation sequencing data in fasta/q and csfasta/q format from Illumina, Roche 454, and the SOLiD platform.	Genobioinfo Cluster: How to use
Flye	Flye is a de novo assembler for long and noisy reads, such as those produced by PacBio and Oxford Nanopore Technologies.	Genobioinfo Cluster: How to use
FMLRC	FMLRC, or FM-index Long Read Corrector, is a tool for performing hybrid correction of long read sequencing using the BWT and FM-index of short-read sequencing data.	Genobioinfo Cluster: Ask for Install
FMLRC2	FMLRC2 performs error correction/polishing of long erroneous sequences with accurate short reads. As such, it can be used as both an error-correction tool for raw long reads (e.g. Oxford Nanopore) and a polishing tool for de novo assemblies.	Genobioinfo Cluster: How to use
fpa	Filter Pairwise Alignment	Genobioinfo Cluster: Ask for Install
fqtools	fqtools is a software suite for fast processing of FASTQ files; Various file manipulations are supported.	Genobioinfo Cluster: Ask for Install
FragGeneScan	FragGeneScan is an application for finding (fragmented) genes in short reads. It can also be applied to predict prokaryotic genes in incomplete assemblies or complete genomes.	Genobioinfo Cluster: Ask for Install
FragGeneScanRs	FragGeneScanRs is an application for finding (fragmented) genes in short reads. It can also be applied to predict prokaryotic genes in incomplete assemblies or complete genomes. It is a re-implementation of FragGeneScan in Rust.	Genobioinfo Cluster: How to use
fragmatic	Simple program for in silico restriction digest of genomic sequences, to simulate RAD-family NGS library prep methods.	Genobioinfo Cluster: Ask for Install
FrameBot	RDP FrameBot is a tool for correcting frameshift errors caused by insertions and deletions in DNA sequences.	Genobioinfo Cluster: Ask for Install
FrameDP	Sensitive peptide detection on noisy matured sequences. Available with command line interface on the cluster.	Genobioinfo Cluster: Ask for Install
FreeBayes	FreeBayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment.	Genobioinfo Cluster: How to use
Funannotate	Funannotate is a genome prediction, annotation, and comparison software package.	Genobioinfo Cluster: How to use
G-PhoCS	G-PhoCS is a software package for inferring ancestral population sizes, population divergence times, and migration rates from individual genome sequences.	Genobioinfo Cluster: How to use
g3d	Genomics 3D visualizer tool sets. `g3d` is a binary file format for storing genomic 3D structure data, `g3d` is short for genomic 3D format.	Genobioinfo Cluster: How to use
G4Hunter	Re-evaluation of G-quadruplex propensity with G4Hunter. G4-Hunter: a new algorithm for the prediction of G-quadruplexes. G-quadruplexes are involved in gene expression regulation, DNA replication, RNA processing, and genome maintenance.
GAAS	Genome Assembly Annotation Service: Suite of tools related to Genome Assembly Annotation Service tasks.	Genobioinfo Cluster: Ask for Install
GALBA	GALBA is a pipeline for fully automated prediction of protein coding gene structures with AUGUSTUS in novel eukaryotic genomes for the scenario where high quality proteins from a closely related species are available.	Genobioinfo Cluster: Ask for Install
Gamma-SMC	This is an alternative and an upgrade of the widely used PSMC method, which infers population size trajectories from VCF files.	Genobioinfo Cluster: How to use
GapCloser	The GapCloser is designed to close the gaps emerging during the scaffolding process by SOAPdenovo, using the abundant pair relationships of short reads.	Genobioinfo Cluster: How to use
gappa	A toolkit for analyzing and visualizing phylogenetic (placement) data.	Genobioinfo Cluster: How to use
gapseq	Informed prediction and analysis of bacterial metabolic pathways and genome-scale networks.	Genobioinfo Cluster: How to use
gargammel	gargammel is an ancient DNA simulator	Genobioinfo Cluster: How to use
GARLI	GARLI, Genetic Algorithm for Rapid Likelihood Inference is a program for inferring phylogenetic trees.	Genobioinfo Cluster: Ask for Install
GATK	GATK is, first, a structured software library that makes it easy to write efficient analysis tools for next-generation sequencing data and, second, a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include depth-of-coverage analyzers, a quality score recalibrator, a SNP/indel caller and a local realigner.	Genobioinfo Cluster: How to use
Gblocks	Gblocks is a computer program written in ANSI C language that eliminates poorly aligned positions and divergent regions of an alignment of DNA or protein sequences. These positions may not be homologous or may have been saturated by multiple substitutions and it is convenient to eliminate them prior to phylogenetic analysis.	Genobioinfo Cluster: How to use
GCI	Genome Continuity Inspector (GCI) is an assembly assessment tool for high-quality genomes (e.g. T2T genomes), in base resolution.	Genobioinfo Cluster: How to use
gcloud	gcloud CLI is a set of tools for creating and managing Google Cloud resources.	Genobioinfo Cluster: How to use
gCluster	The gCluster algorithm is a general clustering method that predicts clusters of any biological word or combination of them, relying only on the DNA sequence and the statistical significance. When using CG as word, gCluster works similarly to CpGcluster, our method to predict CpG islands. More broadly, gCluster has much in common with wordCluster but uses an improved distance model.	Genobioinfo Cluster: How to use
GCTA	GCTA (Genome-wide Complex Trait Analysis) was originally designed to estimate the proportion of phenotypic variance explained by genome- or chromosome-wide SNPs for complex traits (the GREML method), and has subsequently been extended for many other analyses to better understand the genetic architecture of complex traits.	Genobioinfo Cluster: How to use
GDAL	GDAL is a translator library for raster and vector geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation.	Genobioinfo Cluster: Ask for Install
GEM	GEM is a scientific software for studying protein-DNA interaction at high resolution using ChIP-seq/ChIP-exo data. It can also be applied to CLIP-seq and Branch-seq data.	Genobioinfo Cluster: How to use
GEM-library	A set of very optimized tools for indexing/querying huge genomes/files.	Genobioinfo Cluster: How to use
GEM-Tools	GEM-Tools is a C API and a Python module to support and simplify usage of the GEM Mapper.	Genobioinfo Cluster: How to use
gemBS	gemBS is a high-performance bioinformatic pipeline designed for high-throughput analysis of DNA methylation data from whole genome bisulfite sequencing (WGBS). It combines GEM3, a high-performance read aligner, and bs_call, a high-performance variant and methylation caller, into a streamlined and efficient pipeline for bisulfite sequence analysis.	Genobioinfo Cluster: How to use
GEMMA	GEMMA is the software implementing the Genome-wide Efficient Mixed Model Association algorithm for a standard linear mixed model and some of its close relatives for genome-wide association studies (GWAS).	Genobioinfo Cluster: How to use
GeMoMa	Gene Model Mapper (GeMoMa) is a homology-based gene prediction program.	Genobioinfo Cluster: Ask for Install
geneid	geneid is a program to predict genes in anonymous genomic sequences designed with a hierarchical structure.	Genobioinfo Cluster: How to use
GeneMark-ES	Unsupervised training is an important feature of the GeneMark-ES algorithm that identifies protein coding genes in eukaryotic genomes. This is the only eukaryotic gene finder that can perform gene prediction without curated training sets.	Genobioinfo Cluster: How to use
GeneMark-ET	GeneMark-ET is a semi-supervised version of GeneMark-ES that uses RNA-Seq reads to improve training.	Genobioinfo Cluster: How to use
Genepop	Population genetics software that computes estimates of F-statistics.	Genobioinfo Cluster: Ask for Install
GeneSeqer	Sensitive spliced-alignment of cDNAs or proteins.	Genobioinfo Cluster: Ask for Install
Genewise	Wise2 is a package focused on comparisons of biopolymers, commonly DNA sequence and protein sequence.	Genobioinfo Cluster: How to use
GenMap	Fast and Exact Computation of Genome Mappability.	Genobioinfo Cluster: How to use
genomegaMap	Within-species genome-wide dN/dS estimation from very many genomes.	Genobioinfo Cluster: How to use
GenomeScope	Fast genome analysis from unassembled short reads	Genobioinfo Cluster: How to use
GenomeScope2.0	Reference-free profiling of polyploid genomes	Genobioinfo Cluster: How to use
GenomeSTRiP	Genome STRiP (Genome STRucture In Populations) is a suite of tools for discovering and genotyping structural variations using sequencing data. The methods are designed to detect shared variation using data from multiple individuals.	Genobioinfo Cluster: Ask for Install
GenomeThreader	GenomeThreader is a software tool to compute gene structure predictions. The gene structure predictions are calculated using a similarity-based approach where additional cDNA/EST and/or protein sequences are used to predict gene structures via spliced alignments.	Genobioinfo Cluster: How to use
GenomeTools	Collection of bioinformatics tools (in the realm of genome informatics) combined into a single binary named "gt".	Genobioinfo Cluster: How to use
GenomOrder	GenomOrder is a Nextflow pipeline that reorders and renames scaffolds from up to 5 assemblies using a reference. It can also produce D-Genies back-up files, allowing rapid visual comparison of the chromosomes of the assemblies against the reference. These files can be uploaded and visualized with the online tool D-Genies (http://dgenies.toulouse.inra.fr/). The assemblies are mapped against the reference with minimap2. The assemblies may be scaffolded or not; if they are not, an option allows scaffolding them according to the reference. The pipeline produces a D-Genies back-up file for a user-defined list of reference chromosomes. The chromosome file contains one reference chromosome name per line.	Genobioinfo Cluster: Ask for Install
Gerbil	A basic task in bioinformatics is the counting of k-mers in genome strings.	Genobioinfo Cluster: Ask for Install
GEVA	Genealogical Estimation of Variant Age (GEVA) is a method for estimating the age of genetic variants; that is, the time of origin of an allele through mutation at a single locus. The approach is similar to existing methods that involve coalescent modeling to infer the time to the most recent common ancestor (TMRCA) between individual genomes. However, these methods typically operate on a discretized timescale, utilize only a fraction of the information available in larger sample data, or employ approximations to overcome computational complexity.	Genobioinfo Cluster: Ask for Install
GFAffix		Genobioinfo Cluster: How to use
gfatools	gfatools is a set of tools for manipulating sequence graphs in the GFA or the rGFA format. It has implemented parsing, subgraph and conversion to FASTA/BED.	Genobioinfo Cluster: How to use
GfaViz	Graphical interactive tool for the visualization of sequence graphs in GFA format.	Genobioinfo Cluster: How to use
gff3sort	A Perl Script to sort gff3 files and produce suitable results for tabix tools	Genobioinfo Cluster: Ask for Install
gff3toembl	Converts Prokka GFF3 files to EMBL files for uploading annotated assemblies to EBI	Genobioinfo Cluster: Ask for Install
gffcompare	gffcompare can be used to compare, merge, annotate and estimate accuracy of one or more GFF files (the “query” files), when compared with a reference annotation (also provided as GFF).	Genobioinfo Cluster: How to use
gffread	GFF/GTF utility providing format conversions, region filtering, FASTA sequence extraction and more.	Genobioinfo Cluster: How to use
ggCaller	A de Bruijn graph-based gene-caller and pangenome analysis tool.	Genobioinfo Cluster: How to use
gget	gget enables efficient querying of genomic reference databases.	Genobioinfo Cluster: Ask for Install
gh-cli	gh is GitHub on the command line. It brings pull requests, issues, and other GitHub concepts to the terminal next to where you are already working with git and your code.	Genobioinfo Cluster: Ask for Install
GHC	GHC is a state-of-the-art, open source, compiler and interactive environment for the functional language Haskell	Genobioinfo Cluster: Ask for Install
gIMble	A genome-wide IM blockwise likelihood estimation toolkit	Genobioinfo Cluster: Ask for Install
GINGER	GINGER is a tool that implements an integrated method for gene structure prediction in higher eukaryotes.	Genobioinfo Cluster: How to use
GlimmerHMM	GlimmerHMM is a new gene finder based on a Generalized Hidden Markov Model (GHMM). Although the gene finder conforms to the overall mathematical framework of a GHMM, it additionally incorporates splice site models adapted from the GeneSplicer program and a decision tree adapted from GlimmerM. It also utilizes Interpolated Markov Models for the coding and noncoding models. Currently, GlimmerHMM's GHMM structure includes introns of each phase, intergenic regions, and four types of exons (initial, internal, final, and single).	Genobioinfo Cluster: How to use
GLIMPSE	GLIMPSE is a phasing and imputation method for large-scale low-coverage sequencing studies.	Genobioinfo Cluster: How to use
GMAP-GSNAP	GMAP: A Genomic Mapping and Alignment Program for mRNA and EST Sequences. GSNAP: Genomic Short-read Nucleotide Alignment Program.	Genobioinfo Cluster: How to use
Gmove	Gmove is a genome annotation tool. This combiner takes as input mapping of RNA-seq or protein or ab initio data.	Genobioinfo Cluster: Ask for Install
Goalign	Goalign is a set of command line tools to manipulate multiple alignments.	Genobioinfo Cluster: How to use
goatools	Python library to handle Gene Ontology (GO) terms.	Genobioinfo Cluster: How to use
GONE	This program calculates and uses linkage disequilibrium at genomic marker loci to infer effective population size trajectories over a period of about 100-200 generations back in time.	Genobioinfo Cluster: How to use
Gotree	Gotree is a set of command line tools and an API to manipulate phylogenetic trees.	Genobioinfo Cluster: How to use
Gradle	Gradle is the open source build system of choice for Java, Android, and Kotlin developers.	Genobioinfo Cluster: How to use
GraffiTE	GraffiTE is a pipeline that finds polymorphic transposable elements in genome assemblies and/or long reads, and genotypes the discovered polymorphisms in read sets using genome-graphs.	Genobioinfo Cluster: How to use
GrAnnoT	GrAnnoT is an annotation transfer tool for pangenome graphs.	Genobioinfo Cluster: How to use
GraphAligner	Seed-and-extend program for aligning long error-prone reads to genome graphs.	Genobioinfo Cluster: How to use
GraPhlAn	GraPhlAn is a software tool for producing high-quality circular representations of taxonomic and phylogenetic trees. It focuses on concise, integrative, informative, and publication-ready representations of phylogenetically- and taxonomically-driven investigation.	Genobioinfo Cluster: Ask for Install
GraphMap	A highly sensitive and accurate mapper for long, error-prone reads.	Genobioinfo Cluster: Ask for Install
graphtyper	graphtyper is a graph-based variant caller capable of genotyping population-scale short read data sets.	Genobioinfo Cluster: How to use
GraphUnzip	Unzip assembly graphs with Hi-C data and/or long reads.	Genobioinfo Cluster: Ask for Install
grenedalf	grenedalf is a collection of commands for working with pool sequencing population genetic data.	Genobioinfo Cluster: How to use
GRIDSS	GRIDSS is a modular software suite containing tools useful for the detection of genomic rearrangements.	Genobioinfo Cluster: How to use
Grinder	Grinder is a versatile open-source bioinformatic tool to create simulated omic shotgun and amplicon sequence libraries for all main sequencing platforms.	Genobioinfo Cluster: Ask for Install
GROMACS	GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.	Genobioinfo Cluster: How to use
GSAlign	An ultra-fast sequence alignment algorithm for intra-species genome comparison.	Genobioinfo Cluster: How to use
GTDB-Tk	GTDB-Tk is a software toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB).	Genobioinfo Cluster: How to use
GTFtools	GTFtools provides a set of functions to analyze various modes of gene models.	Genobioinfo Cluster: How to use
Gtools	GTOOL is a program for transforming sets of genotype data for use with the programs SNPTEST and IMPUTE.	Genobioinfo Cluster: Ask for Install
Gubbins	Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences. Gubbins (Genealogies Unbiased By recomBinations In Nucleotide Sequences) is an algorithm that iteratively identifies loci containing elevated densities of base substitutions while concurrently constructing a phylogeny based on the putative point mutations outside of these regions.	Genobioinfo Cluster: How to use
hal2vg	Convert HAL to vg-compatible sequence graph.	Genobioinfo Cluster: How to use
HANNO	Efficient High-throughput ANNOtation of protein coding genes in eukaryote genomes.	Genobioinfo Cluster: How to use
hap-ibd	The hap-ibd program detects identity-by-descent (IBD) segments and homozygosity-by-descent (HBD) segments in phased genotype data.	Genobioinfo Cluster: How to use
hap.py		Genobioinfo Cluster: How to use
Hap10	The goal is to reconstruct accurate and long haplotypes of polyploid genomes using linked reads.	Genobioinfo Cluster: Ask for Install
HapCUT2	Software tools for haplotype assembly from sequence data	Genobioinfo Cluster: How to use
hapflk	hapflk is a software implementing the hapFLK and FLK tests for the detection of selection signatures based on multiple population genotyping data.	Genobioinfo Cluster: How to use
HapHiC	HapHiC is an allele-aware scaffolding tool that uses Hi-C data to scaffold haplotype-phased genome assemblies into chromosome-scale pseudomolecules.	Genobioinfo Cluster: How to use
HAPHPIPE	NGS viral assembly and population genetics.	Genobioinfo Cluster: How to use
Haplogrep3	Haplogrep is a command-line tool for mtDNA haplogroup classification.	Genobioinfo Cluster: How to use
HaploHiC	Comprehensive haplotype division of Hi-C PE-reads based on local contacts ratio.	Genobioinfo Cluster: Ask for Install
HaploMerger2	HaploMerger2 (HM2) is an important upgrade over HaploMerger. HM2 is an easy-to-use automated pipeline for improving genome assembly in the post-assembly stage. It consists of a set of executables as well as wrappers for several third-party software.	Genobioinfo Cluster: Ask for Install
Haplostrips	Haplostrips produces plots that depict variants in a genomic window among different samples. It visualizes similarities between haplotypes with respect to a reference haplotype through haplotype clustering and sorting, which is useful for revealing hidden population structure.	Genobioinfo Cluster: How to use
Haploview	Haploview is designed to simplify and expedite the process of haplotype analysis by providing a common interface to several tasks relating to such analyses.	Genobioinfo Cluster: Ask for Install
Hapo-G	Hapo-G is a tool that aims to improve the quality of genome assemblies by polishing the consensus with accurate reads.	Genobioinfo Cluster: How to use
HarvestTools	HarvestTools is a utility for creating and interfacing with Gingr files, which are efficient archives that the Harvest Suite uses to store reference-compressed multi-alignments, phylogenetic trees, filtered variants and annotations.	Genobioinfo Cluster: How to use
HASLR	HASLR is a tool for rapid genome assembly of long sequencing reads. HASLR is a hybrid tool which means it requires long reads generated by Third Generation Sequencing technologies (such as PacBio or Oxford Nanopore) together with Next Generation Sequencing reads (such as Illumina) from the same sample.	Genobioinfo Cluster: Ask for Install
HAT-phasing	HAT is a haplotype assembly tool that uses NGS and TGS data along with a reference genome to reconstruct haplotypes.	Genobioinfo Cluster: Ask for Install
Hclust2	Hclust2 is a handy tool for plotting heat-maps with several useful options to produce high quality figures that can be used in publication.	Genobioinfo Cluster: How to use
hcluster_sg	A hierarchical clustering software for sparse graphs	Genobioinfo Cluster: Ask for Install
HDFView	HDFView is a visual tool for browsing and editing HDF4 and HDF5 files.	Genobioinfo Cluster: Ask for Install
HECIL	Hybrid Error Correction of Long Reads using Iterative Learning	Genobioinfo Cluster: Ask for Install
HELEN	HELEN (Homopolymer Encoded Long-read Error-corrector for Nanopore) uses a Recurrent-Neural-Network (RNN) based Multi-Task Learning (MTL) model that can predict a base and a run-length for each genomic position using the weights generated by MarginPolish. This installation includes MarginPolish.	Genobioinfo Cluster: Ask for Install
hexamer	Find likely coding segments in DNA using composition-normalised hexamer tables.	Genobioinfo Cluster: How to use
HG-CoLoR	HG-CoLoR (Hybrid method based on a variable-order de bruijn Graph for the error Correction of Long Reads) is a hybrid method for the error correction of long reads that both aligns the short reads to the long reads, and uses a variable-order de Bruijn graph, in a seed-and-extend approach.	Genobioinfo Cluster: Ask for Install
HGT-ID	An efficient and sensitive program for detecting viral insertion sequences from known viral reference genome in the genome of human cancers.	Genobioinfo Cluster: Ask for Install
HGTector	Genome-wide prediction of horizontal gene transfer based on distribution of sequence homology patterns.	Genobioinfo Cluster: Ask for Install
HiCAssembler	Software to assemble contigs/scaffolds into chromosomes using Hi-C data.	Genobioinfo Cluster: How to use
HiCExplorer	HiCExplorer is a powerful and easy to use set of tools to process, normalize and visualize Hi-C data.	Genobioinfo Cluster: How to use
Hickit	Hickit is a set of tools initially developed to process diploid single-cell Hi-C data. It extracts contact pairs from read alignment, identifies phases of contacts overlapping with SNPs of known phases, imputes missing phases, infers the 3D structure of a single cell and visualizes the structure.	Genobioinfo Cluster: How to use
HiCLift	A fast and efficient tool for converting chromatin interaction data between genome assemblies.	Genobioinfo Cluster: Ask for Install
HiFiAdapterFilt	Convert .bam to .fastq and remove reads with remnant PacBio adapter sequences.	Genobioinfo Cluster: How to use
hifiasm	Hifiasm is a fast haplotype-resolved de novo assembler for PacBio HiFi reads. Unlike most existing assemblers, hifiasm starts from an uncollapsed genome.	Genobioinfo Cluster: How to use
hifiasm-meta	De novo metagenome assembler, based on hifiasm, a haplotype-resolved de novo assembler for PacBio HiFi reads.	Genobioinfo Cluster: How to use
HISAT2	HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes (as well as to a single reference genome).	Genobioinfo Cluster: How to use
HMMER	HMMER is a package used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs).	Genobioinfo Cluster: How to use
Homer	HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and next-gen sequencing analysis. It is a collection of command line programs for unix-style operating systems written in Perl and C++.	Genobioinfo Cluster: How to use
hssp	Create DSSP and HSSP files. A series of PDB-related databanks for everyday needs.	Genobioinfo Cluster: How to use
HTSeq	HTSeq is a Python package that provides infrastructure to process data from high-throughput sequencing assays.	Genobioinfo Cluster: How to use
HUMAnN	HUMAnN is a method for efficiently and accurately profiling the abundance of microbial metabolic pathways and other molecular functions from metagenomic or metatranscriptomic sequencing data.	Genobioinfo Cluster: How to use
HyPhy	HyPhy is an open-source software package for the analysis of genetic sequences using techniques in phylogenetics, molecular evolution, and machine learning. It features a complete graphical user interface (GUI) and a rich scripting language for limitless customization of analyses.	Genobioinfo Cluster: How to use
HyPO	HyPo (a Hybrid Polisher): Super Fast & Accurate Polisher for Long Read Genome Assemblies	Genobioinfo Cluster: How to use
i-ADHoRe	i-ADHoRe is a highly sensitive software tool to detect degenerated homology relations within and between different genomes.	Genobioinfo Cluster: How to use
IBA	Python script to assemble AHE (Anchored Hybrid Enrichment) data loci by loci. To summarize, raw sequencing reads for each species were filtered for quality using Trim Galore! v. 0.4.0 (Krueger, 2015), and assembled using the iterative baited assembly (IBA) Python script which employs USEARCH v. 7.0 (Edgar, 2010) and Bridger v. 2014-12-01 (Chang et al., 2015) to assemble loci for each taxon. MAFFT v. 7.245 (Katoh and Standley, 2013) was used to align assembled sequences, and the probe and flanking regions were separated with the Python script Extract_probe_region.py (Breinholt et al., 2018).	Genobioinfo Cluster: Ask for Install
IBDNe	The IBDNe program estimates ancestry-specific historical effective population size.	Genobioinfo Cluster: How to use
ICEscreen	ICEscreen is a bioinformatic pipeline for the detection and annotation of ICEs (Integrative and Conjugative Elements) and IMEs (Integrative and Mobilizable Elements) in Bacillota genomes.	Genobioinfo Cluster: How to use
IDR	The IDR (Irreproducible Discovery Rate) framework is a unified approach to measure the reproducibility of findings identified from replicate experiments and provide highly stable thresholds based on reproducibility.	Genobioinfo Cluster: How to use
IGV	The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.	Genobioinfo Cluster: How to use
IMPUTE5	IMPUTE 5 is a genotype imputation method that can scale to reference panels with millions of samples.	Genobioinfo Cluster: How to use
Infernal	Infernal ("INFERence of RNA ALignment") is for searching DNA sequence databases for RNA structure and sequence similarities. It is an implementation of a special case of profile stochastic context-free grammars called covariance models (CMs).	Genobioinfo Cluster: How to use
Infomap	Multi-level network clustering based on the Map Equation.	Genobioinfo Cluster: Ask for Install
InSilicoSeq	InSilicoSeq is a sequencing simulator producing realistic Illumina reads. Primarily intended for simulating metagenomic samples, it can also be used to produce sequencing data from a single genome.	Genobioinfo Cluster: How to use
Inspector	A tool for evaluating long-read de novo assembly results.	Genobioinfo Cluster: How to use
inStrain	inStrain is a Python program for analysis of co-occurring genome populations from metagenomes that allows highly accurate genome comparisons, analysis of coverage, microdiversity, and linkage, and sensitive SNP detection with gene localization and synonymous/non-synonymous identification.	Genobioinfo Cluster: How to use
InterProScan	InterProScan is a tool that combines different protein signature recognition methods into one resource. No less than 14 pattern/profiles databanks can be interrogated.	Genobioinfo Cluster: How to use
IPA	Improved Phased Assembler (IPA) is the official PacBio software for HiFi genome assembly. IPA was designed to utilize the accuracy of PacBio HiFi reads to produce high-quality phased genome assemblies. IPA is an end-to-end solution, starting with input reads and resulting in a polished assembly.	Genobioinfo Cluster: How to use
IPK	IPK is a tool for computing phylo-k-mers for a fixed phylogeny.	Genobioinfo Cluster: How to use
ipyrad	An interactive toolkit for assembly and analysis of restriction-site associated genomic data sets (e.g., RAD, ddRAD, GBS) for population genetic and phylogenetic studies.	Genobioinfo Cluster: How to use
IQ-TREE	Efficient phylogenomic software by maximum likelihood	Genobioinfo Cluster: How to use
iREAD	iREAD (intron REtention Analysis and Detector) is a tool to detect intron retention (IR) events from RNA-seq datasets.	Genobioinfo Cluster: Ask for Install
IRFinder	Detecting intron retention from RNA-Seq experiments	Genobioinfo Cluster: How to use
Iris	A module which corrects the sequences of structural variant calls (currently only insertions).	Genobioinfo Cluster: How to use
IRMA	IRMA was designed for the robust assembly, variant calling, and phasing of highly variable RNA viruses. Currently IRMA is deployed with modules for influenza and ebolavirus. IRMA is free to use and parallelizes computations for both cluster computing and single computer multi-core setups.	Genobioinfo Cluster: How to use
ISEScan	A Python pipeline to identify IS (Insertion Sequence) elements in genomes and metagenomes. ISEScan can be used to identify/annotate full-length or non-full-length IS elements in any DNA sequence, but it was only tested on prokaryotic genomes, including draft genomes and metagenomes. Among the existing tools identifying IS elements, ISEScan might be the only one that gives TIR (Terminal Inverted Repeat) sequences.	Genobioinfo Cluster: How to use
iSMC	This software extends the sequentially Markovian coalescent model to jointly infer the spatial variation in recombination rate (rho) from a single pair of unphased genomes.	Genobioinfo Cluster: Ask for Install
IsoLasso	IsoLasso is an algorithm to assemble transcripts and estimate their expression levels from RNA-Seq reads.	Genobioinfo Cluster: Ask for Install
IsoSeq	Scalable De Novo Isoform Discovery from Single-Molecule PacBio Reads.	Genobioinfo Cluster: How to use
ITSx	Improved software detection and extraction of ITS1 and ITS2 from ribosomal ITS sequences of fungi and other eukaryotes for use in environmental sequencing	Genobioinfo Cluster: How to use
ITSxpress	Software to trim the ITS region of FASTQ sequences for amplicon sequencing analysis.	Genobioinfo Cluster: How to use
IVA	Iterative Virus Assembler is a de novo assembler designed to assemble virus genomes that have no repeat sequences, using Illumina read pairs sequenced from mixed populations at extremely high and variable depth	Genobioinfo Cluster: Ask for Install
iVar	iVar is a computational package that contains functions broadly useful for viral amplicon-based sequencing. Additional tools for metagenomic sequencing are actively being incorporated into iVar. While each of these functions can be accomplished using existing tools, iVar contains an intersection of functionality from multiple tools that are required to call iSNVs and consensus sequences from viral sequencing data across multiple replicates. We implemented the following functions in iVar: (1) trimming of primers and low-quality bases, (2) consensus calling, (3) variant calling - both iSNVs and insertions/deletions, and (4) identifying mismatches to primer sequences and excluding the corresponding reads from alignment files.	Genobioinfo Cluster: How to use
Jabba	A hybrid error correction tool for sequencing reads.	Genobioinfo Cluster: Ask for Install
JAGS	JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation not wholly unlike BUGS.	Genobioinfo Cluster: How to use
Jasmine	JASMINE: Jointly Accurate Sv Merging with Intersample Network Edges. This tool is used to merge structural variants (SVs) across samples. Each sample has a number of SV calls, consisting of position information (chromosome, start, end, length), type and strand information, and a number of other values. Jasmine represents the set of all SVs across samples as a network, and uses a modified minimum spanning forest algorithm to determine the best way of merging the variants such that each merged variant represents a set of analogous variants occurring in different samples. Jasmine also includes a module for automating the creation of IGV screenshots of variants of interest. Manual: Jasmine User Manual (mkirsche/Jasmine wiki on GitHub).	Genobioinfo Cluster: How to use
JASPER	JASPER (Jellyfish based Assembly Sequence Polisher for Error Reduction) is an efficient polishing tool for draft genomes. It uses accurate reads (PacBio HiFi or Illumina) to evaluate consensus quality and correct consensus errors in genome assemblies. JASPER is substantially faster than polishing methods based on sequence alignment, and more accurate than currently available k-mer based methods. The efficiency and scalability of JASPER allows one to use it to create personalized reference genomes for specific populations very efficiently, even for large sequenced populations, by polishing the reference genome, such as GRCh38 or chm13v2.0 for human, with Illumina reads sequenced from many individuals from the population. Please see this manuscript for more details: Guo A, Salzberg SL, Zimin AV. JASPER: A fast genome polishing tool that improves accuracy of genome assemblies. PLoS Comput Biol. 2023 Mar 31;19(3):e1011032. doi: 10.1371/journal.pcbi.1011032. PMID: 37000853; PMCID: PMC10096238.	Genobioinfo Cluster: How to use
JCVI	Collection of Python libraries to parse bioinformatics files, or perform computation related to assembly, annotation, and comparative genomics.	Genobioinfo Cluster: How to use
Jellyfish	JELLYFISH is a tool for fast, memory-efficient counting of k-mers in DNA.	Genobioinfo Cluster: How to use
jModeltest	jModelTest is a tool to carry out statistical selection of best-fit models of nucleotide substitution.	Genobioinfo Cluster: Ask for Install
jpHMM	jpHMM (jumping profile Hidden Markov Model) is a probabilistic approach to compare a sequence to a multiple alignment of a sequence family. The jpHMM web server at GOBICS is a tool for the detection of recombinations in HIV-1 and hepatitis B virus (HBV) genomes. For a query sequence, phylogenetic recombination breakpoints are predicted and each region of the sequence is assigned to one HIV-1 subtype/HBV genotype. This prediction is based on a pre-calculated multiple alignment of the major HIV-1 subtypes/HBV genotypes.	Genobioinfo Cluster: How to use
Juicebox	Software for visualizing data from Hi-C and other proximity mapping experiments	Genobioinfo Cluster: Ask for Install
juicebox_scripts	A collection of scripts for working with Hi-C data, Juicebox, and other genomic file formats	Genobioinfo Cluster: Ask for Install
Juicer	A One-Click System for Analyzing Loop-Resolution Hi-C Experiments	Genobioinfo Cluster: How to use
Julia	Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library.	Genobioinfo Cluster: How to use
JustOrthologs	A Fast, Accurate, and User-Friendly Ortholog-Finding Algorithm	Genobioinfo Cluster: Ask for Install
Jvarkit	Java utilities for Bioinformatics (only requested tools are compiled)	Genobioinfo Cluster: Ask for Install
KAD	KAD is designed for evaluating the accuracy of nucleotide base quality of genome assemblies.	Genobioinfo Cluster: Ask for Install
Kaiju	Fast taxonomic classification of metagenomic sequencing reads using a protein reference database	Genobioinfo Cluster: How to use
kallisto	kallisto is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads.	Genobioinfo Cluster: How to use
KAT	KAT (The K-mer Analysis Toolkit) is a suite of tools that generate, analyse and compare k-mer spectra produced from sequence files.	Genobioinfo Cluster: How to use
KCOSS	A fast and space-saving multi-threaded k-mer frequency statistics algorithm	Genobioinfo Cluster: Ask for Install
kentUtils	UCSC command line bioinformatic utilities	Genobioinfo Cluster: How to use
khmer	In-memory nucleotide sequence k-mer counting, filtering, graph traversal and more.	Genobioinfo Cluster: How to use
klocate	Standalone tool based on the bwa index to locate a set of k-mers along a reference genome. klocate searches for each k-mer (full and perfect match) in the index and outputs all positions the k-mer maps to (output to stdout in BED format).	Genobioinfo Cluster: Ask for Install
klumpy	Klumpy is a bioinformatic tool for identifying possibly incorrectly assembled regions in a long-read based assembly, with the additional capabilities of annotating sequences given a set of query sequences.	Genobioinfo Cluster: How to use
KMA	KMA is a mapping method designed to map raw reads directly against redundant databases, in an ultra-fast manner using seed and extend.	Genobioinfo Cluster: How to use
kmap	Standalone tool based on the bwa index to locate a set of kmers along a reference genome.	Genobioinfo Cluster: Ask for Install
KMC		Genobioinfo Cluster: How to use
kmdiff	kmdiff provides differential k-mers analysis between two populations (control and case). Each population is represented by a set of short-read sequencing data. Outputs are differentially represented k-mers between controls and cases.	Genobioinfo Cluster: How to use
kmer-counter	A fast k-mer counter written in Rust.	Genobioinfo Cluster: How to use
KmerGenie	KmerGenie estimates the best k-mer length for genome de novo assembly.	Genobioinfo Cluster: How to use
KmerGO	KmerGO is a user-friendly tool to identify the group-specific sequences on two groups or trait-associated sequences of high throughput sequencing datasets.	Genobioinfo Cluster: How to use
kmersGWAS	A library for running k-mers based GWAS.	Genobioinfo Cluster: How to use
KofamScan	KofamScan is a gene function annotation tool based on KEGG Orthology and hidden Markov model	Genobioinfo Cluster: How to use
komplexity	A command-line tool built in Rust to quickly calculate and/or mask low-complexity sequences from a FASTA/FASTQ file. This uses the number of unique k-mers over a sequence divided by the length to assess complexity.	Genobioinfo Cluster: How to use
Kraken	Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.	Genobioinfo Cluster: Ask for Install
Kraken2	Kraken2 is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.	Genobioinfo Cluster: How to use
KrakenTools	KrakenTools provides individual scripts to analyze Kraken/Kraken2/Bracken/KrakenUniq output files.	Genobioinfo Cluster: How to use
KrakenUniq	KrakenUniq (formerly KrakenHLL) is a novel metagenomics classifier that combines the fast k-mer-based classification of Kraken with an efficient algorithm for assessing the coverage of unique k-mers found in each species in a dataset.	Genobioinfo Cluster: How to use
Krona	Krona allows hierarchical data to be explored with zoomable pie charts. Krona charts can be created using an Excel template or KronaTools, which includes support for several bioinformatics tools and raw data formats.	Genobioinfo Cluster: How to use
kSNP4	kSNP4 identifies the pan-genome SNPs in a set of genome sequences, and estimates phylogenetic trees based upon those SNPs.	Genobioinfo Cluster: How to use
LACHESIS	Software that uses Hi-C data for ultra-long-range scaffolding of de novo genome assemblies.	Genobioinfo Cluster: Ask for Install
lamassemble	Merge overlapping "long" DNA reads into a consensus sequence.	Genobioinfo Cluster: Ask for Install
LAST	LAST finds similar regions between sequences.	Genobioinfo Cluster: How to use
lastp_aai	A simple Python script for calculating pairwise amino acid identity (AAI) between protein files (extension .faa)	Genobioinfo Cluster: Ask for Install
LASTZ	A tool for aligning two DNA sequences, and inferring appropriate scoring parameters automatically.	Genobioinfo Cluster: How to use
lcMLkin	lcMLkin is a C++ program that allows users to infer biological relatedness from low coverage 2nd generation sequencing data	Genobioinfo Cluster: Ask for Install
LDhat	LDhat is a package written in the C and C++ languages for the analysis of recombination rates from population genetic data.	Genobioinfo Cluster: Ask for Install
LDhelmet	LDhelmet performs statistical inference for fine-scale variable recombination rate estimation.	Genobioinfo Cluster: How to use
leeHom	A program for the Bayesian reconstruction of ancient DNA.	Genobioinfo Cluster: Ask for Install
LEfSe	LEfSe (Linear discriminant analysis effect size) is a tool developed by the Huttenhower group to find biomarkers between 2 or more groups using relative abundances.	Genobioinfo Cluster: How to use
Lep-MAP3	Lep-MAP3 is a novel and free software for linkage mapping. It can construct linkage maps from a very large number of markers and individuals, in single or multiple families.	Genobioinfo Cluster: How to use
libplinkio	This is a small C and Python library for reading Plink genotype files.	Genobioinfo Cluster: How to use
libstree	libstree is a generic suffix tree implementation, written in C.	Genobioinfo Cluster: How to use
Liftoff	Liftoff is a tool that accurately maps annotations in GFF or GTF between assemblies of the same, or closely-related species.	Genobioinfo Cluster: How to use
lima	Demultiplex Barcoded PacBio Samples.	Genobioinfo Cluster: Ask for Install
Linker	Linker is a suite of C++ tools useful for interpreting long and linked read sequencing of cancer genomes.	Genobioinfo Cluster: Ask for Install
LINKS	LINKS is a scalable genomics application for scaffolding or re-scaffolding genome assembly drafts with long reads, such as those produced by Oxford Nanopore Technologies Ltd and Pacific Biosciences.	Genobioinfo Cluster: How to use
LIQA	LIQA (Long-read Isoform Quantification and Analysis) is an Expectation-Maximization based statistical method to quantify isoform expression and detect differential alternative splicing (DAS) events using long-read RNA-seq data.	Genobioinfo Cluster: Ask for Install
LJA	La Jolla Assembler (LJA) is a tool for genome assembly from HiFi reads based on de Bruijn graphs.	Genobioinfo Cluster: How to use
llvm	The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.	Genobioinfo Cluster: Ask for Install
LOCALIZER	LOCALIZER is a machine learning method for predicting the subcellular localization of both plant proteins and pathogen effectors in the plant cell.	Genobioinfo Cluster: How to use
LocARNA	LocARNA is a tool for multiple alignment of RNA molecules. LocARNA requires only RNA sequences as input and will simultaneously fold and align the input sequences.	Genobioinfo Cluster: Ask for Install
locator	A supervised machine learning method for predicting the geographic origin of a sample from genotype or sequencing data.	Genobioinfo Cluster: How to use
loco-pipe	loco-pipe is an automated Snakemake pipeline that streamlines a set of essential population genomic analyses for low-coverage whole genome sequencing (lcWGS) data.	Genobioinfo Cluster: How to use
Loctree3	Protein Subcellular Localization Sequence-Based Predictor	Genobioinfo Cluster: How to use
LoFreq	A sequence-quality aware, ultra-sensitive variant caller for NGS data.	Genobioinfo Cluster: How to use
longdust	Longdust identifies long highly repetitive STRs, VNTRs, satellite DNA and other low-complexity regions (LCRs) in a genome.	Genobioinfo Cluster: How to use
LongQC	LongQC is a tool for the data quality control of the PacBio and ONT long reads, and it has two functionalities: sample qc and platform qc.	Genobioinfo Cluster: How to use
LongRanger	Long Ranger is a set of analysis pipelines that processes Chromium sequencing output to align reads and call and phase SNPs, indels, and structural variants.	Genobioinfo Cluster: How to use
longshot	Longshot is a variant calling tool for diploid genomes using long error-prone reads such as Pacific Biosciences (PacBio) SMRT and Oxford Nanopore Technologies (ONT).	Genobioinfo Cluster: Ask for Install
LongStitch	A genome assembly correction and scaffolding pipeline using long reads. Basically runs Tigmint, ntLink, ARKS.	Genobioinfo Cluster: Ask for Install
Look4TRs	A de-novo tool for detecting simple tandem repeats using self-supervised hidden Markov models.	Genobioinfo Cluster: How to use
LoRDEC	LoRDEC is a program to correct sequencing errors in long reads from 3rd generation sequencing with high error rate, and is especially intended for PacBio reads. It uses a hybrid strategy, meaning that it uses two sets of reads: the reference read set, whose error rate is assumed to be small, and the PacBio read set, which is then corrected using the reference set. Typically, the reference set contains Illumina reads.	Genobioinfo Cluster: How to use
LR_Gapcloser	LR_Gapcloser is a gap closing tool using uncorrected or corrected long reads generated from the PacBio or Nanopore platforms.	Genobioinfo Cluster: Ask for Install
LRez	Standalone tool and library allowing to work with barcoded linked-reads.	Genobioinfo Cluster: Ask for Install
LRScaf	TGS scaffolding. Improving draft genomes using long noisy reads.	Genobioinfo Cluster: Ask for Install
LRSIM	Simulator for Linked Reads: this package simulates whole genome sequencing using 10X Genomics Linked Read technology.	Genobioinfo Cluster: Ask for Install
LSx	LS^X is a script in R that runs the LS³ and LS⁴ algorithms of data subsampling for multigene phylogenetic inference. Both of these algorithms do a gene-by-gene inspection of the heterogeneity of evolutionary rates among user-defined lineages of interest (LOI). Then, using criteria that differ in both algorithms (see the papers for details), they try to find a subsample of sequences that evolve at a homogeneous rate across all LOIs. If this subset is found, an alignment of the gene is produced with only the sequences that evolve homogeneously. At the same time, a table is also produced showing which sequences were “flagged” (the sequences that were removed), and which sequences were kept. If a subset of sequences that evolve at a homogeneous rate is not found, the gene is flagged entirely.	Genobioinfo Cluster: How to use
LtrDetector	A tool-suite for detecting long terminal repeat retrotransposons de-novo on the genomic scale.	Genobioinfo Cluster: How to use
LUMPY	A general probabilistic framework for structural variant discovery	Genobioinfo Cluster: How to use
MACH	MACH is a Markov Chain based haplotyper that can resolve long haplotypes or infer missing genotypes in samples of unrelated individuals.	Genobioinfo Cluster: Ask for Install
MACS	We present Model-based Analysis of ChIP-Seq (MACS) on short reads sequencers such as Genome Analyzer (Illumina / Solexa). MACS empirically models the length of the sequenced ChIP fragments, which tends to be shorter than sonication or library construction size estimates, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome sequence, allowing for more sensitive and robust prediction. MACS compares favorably to existing ChIP-Seq peak-finding algorithms, is publicly available open source, and can be used for ChIP-Seq with or without control samples.	Genobioinfo Cluster: How to use
MACSE	Multiple Alignment of Coding SEquences Accounting for Frameshifts and Stop Codons: a wide range of molecular analyses relies on multiple sequence alignments (MSA).	Genobioinfo Cluster: How to use
MacSyFinder	Detection of macromolecular systems in protein datasets using systems modelling and similarity search. Complex cellular functions are usually encoded by a set of genes in one or a few organized genetic loci in microbial genomes. Macromolecular System Finder (MacSyFinder) is a program that uses these properties to model and then annotate cellular functions in microbial genomes. This is done by integrating the identification of each individual gene at the level of the molecular system.	Genobioinfo Cluster: How to use
MAESTRO	A Multi AgEnt STability pRedictiOn tool for changes in unfolding free energy upon point mutation. MAESTRO is structure based and differs from similar approaches in the following points: (i) MAESTRO implements a multi-agent machine learning system. (ii) It provides predicted ΔΔG values along with a corresponding prediction quality measure. (iii) MAESTRO is applicable to biological assemblies. (iv) It provides high throughput scanning for multi-point mutations where sites and types of mutation can be comprehensively controlled. (v) Finally, the software provides a specific mode for the prediction of stabilizing disulfide bonds.	Genobioinfo Cluster: How to use
MAFFT	MAFFT is a multiple sequence alignment program for unix-like operating systems. It offers a range of multiple alignment methods, L-INS-i (accurate; for alignment of <~200 sequences), FFT-NS-2 (fast; for alignment of <~10,000 sequences), etc.	Genobioinfo Cluster: How to use
MAGeCK	Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout (MAGeCK) is a computational tool to identify important genes from the recent genome-scale CRISPR-Cas9 knockout screens technology. MAGeCK-VISPR is a comprehensive quality control, analysis and visualization workflow for CRISPR/Cas9 screens. MAGeCKFlute (R package) is an integrative analysis pipeline for pooled CRISPR functional genetic screens. Manual and video tutorials: liulab / mageck-vispr (Bitbucket).	Genobioinfo Cluster: How to use
MAGIC	A tool for predicting transcription factors and cofactors driving gene sets using ENCODE data.	Genobioinfo Cluster: Ask for Install
Magic-BLAST	Magic-BLAST is a tool for mapping large next-generation RNA or DNA sequencing runs against a whole genome or transcriptome.	Genobioinfo Cluster: How to use
MAKER	MAKER is a portable and easily configurable genome annotation pipeline. Its purpose is to allow smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases. MAKER identifies repeats, aligns ESTs and proteins to a genome, produces ab-initio gene predictions and automatically synthesizes these data into gene annotations having evidence-based quality values.	Genobioinfo Cluster: How to use
MALDER	This is a version of ALDER (http://groups.csail.mit.edu/cb/alder/) that has been modified to allow multiple admixture events.	Genobioinfo Cluster: How to use
MALT	MALT (MEGAN alignment tool) is an extension of MEGAN (metagenome analyzer). MALT performs alignment of metagenomic reads against a database of reference sequences (such as NR, GenBank or Silva) and produces a MEGAN RMA file as output. The software is currently under development.	Genobioinfo Cluster: How to use
Manta	Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads.	Genobioinfo Cluster: How to use
mapDamage	Tracking and quantifying damage patterns in ancient DNA sequences.	Genobioinfo Cluster: How to use
Mapsembler2	Mapsembler2 is a targeted assembly software. It takes as input any number of NGS raw read set(s) (fasta or fastq, gzipped or not) and a set of input sequences (starters).	Genobioinfo Cluster: Ask for Install
MapSplice	Accurate mapping of RNA-seq reads for splice junction discovery.	Genobioinfo Cluster: Ask for Install
MapThin	Reduce the number of SNPs in a gene marker dense map computed by PLINK. First, by eliminating linked SNPs. Then, by applying different criteria.	Genobioinfo Cluster: Ask for Install
MARVEL	MARVEL consists of a set of tools that facilitate the overlapping, patching, correction and assembly of noisy (not so noisy ones as well) long reads.	Genobioinfo Cluster: Ask for Install
MARVEL_bins	MARVEL (Metagenomic Analysis and Retrieval of Viral Elements) is a tool for recovery of draft phage genomes from whole community shotgun metagenomic sequencing data.	Genobioinfo Cluster: Ask for Install
Mash		Genobioinfo Cluster: How to use
Mash	Fast genome and metagenome distance estimation using MinHash. Documentation: Publications (Mash 2.0 documentation).	Genobioinfo Cluster: How to use
MashMap	MashMap implements a fast and approximate algorithm for computing local alignment boundaries between long DNA sequences. It can be useful for mapping genome assembly or long reads (PacBio/ONT) to reference genome(s). Given a minimum alignment length and an identity threshold for the desired local alignments, Mashmap computes alignment boundaries and identity estimates using k-mers. It does not compute the alignments explicitly, but rather estimates a k-mer based Jaccard similarity using a combination of Minimizers and MinHash. This is then converted to an estimate of sequence identity using the Mash distance.	Genobioinfo Cluster: Ask for Install
mashtree	Create a tree using Mash distances.	Genobioinfo Cluster: How to use
MaSuRCA	MaSuRCA is whole genome assembly software. It combines the efficiency of the de Bruijn graph and Overlap-Layout-Consensus (OLC) approaches. MaSuRCA can assemble data sets containing only short reads from Illumina sequencing or a mixture of short reads and long reads (Sanger, 454)	Genobioinfo Cluster: How to use
MAtCHap	An ultra fast algorithm for solving the single individual haplotype assembly problem.	Genobioinfo Cluster: Ask for Install
Mauve	Mauve is a system for efficiently constructing multiple genome alignments in the presence of large-scale evolutionary events such as rearrangement and inversion. Multiple genome alignment provides a basis for research into comparative genomics and the study of evolutionary dynamics. Aligning whole genomes is a fundamentally different problem than aligning short sequences.	Genobioinfo Cluster: How to use
MaxBin2	MaxBin is a software for binning assembled metagenomic sequences based on an Expectation-Maximization algorithm.	Genobioinfo Cluster: Ask for Install
mCaller	This program is designed to call m6A from nanopore data using the differences between measured and expected currents.	Genobioinfo Cluster: Ask for Install
MCHelper	An automatic tool to curate transposable element libraries.	Genobioinfo Cluster: How to use
MCL	The MCL algorithm is short for the Markov Cluster Algorithm, a fast and scalable unsupervised cluster algorithm for graphs (also known as networks) based on simulation of (stochastic) flow in graphs.	Genobioinfo Cluster: How to use
MCScanX	MCScan is an algorithm to scan multiple genomes or subgenomes to identify putative homologous chromosomal regions, then align these regions using genes as anchors.	Genobioinfo Cluster: How to use
MECAT	MECAT is an ultra-fast Mapping, Error Correction and de novo Assembly Tool for single-molecule sequencing (SMRT) reads.	Genobioinfo Cluster: Ask for Install
Medaka	Medaka demonstrates a framework for error correcting sequencing data, particularly aimed at nanopore sequencing. Tools are provided for both training and inference. The code exploits the keras deep learning library.	Genobioinfo Cluster: How to use
MEGA-CC	Software suite for analyzing DNA and protein sequence data from species and populations.	Genobioinfo Cluster: Ask for Install
MEGAHIT	An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph	Genobioinfo Cluster: How to use
Megalodon	Megalodon provides "basecalling augmentation" for raw nanopore sequencing reads, including direct, reference-guided SNP and modified base calling.	Genobioinfo Cluster: Ask for Install
MeGAMerge	A tool to merge assembled contigs, long reads from metagenomic sequencing runs	Genobioinfo Cluster: Ask for Install
MEGAN	MEtaGenome ANalyzer: metagenomic data analysis, including taxonomic and functional (SEED and KEGG classification) analysis.	Genobioinfo Cluster: How to use
MegaTools	Open-source command line tools for accessing Mega.co.nz cloud storage.	Genobioinfo Cluster: How to use
MEME	The MEME Suite allows you to: (1) discover motifs using MEME or GLAM2 on groups of related DNA or protein sequences, (2) search sequence databases using motifs, (3) compare a motif to all motifs in a database of motifs, and (4) associate motifs with Gene Ontology terms via their putative target genes.	Genobioinfo Cluster: How to use
Merfin	Evaluate variant calls and its combination with k-mer multiplicity.	Genobioinfo Cluster: Ask for Install
merge-ibd-segments	Remove any breaks and short gaps in IBD segments. Haplotype phase errors and genotype errors can cause breaks and gaps in the detected IBD segments. You can use this program to remove any breaks and short gaps in IBD segments. We usually remove gaps between IBD segments that have at most one discordant homozygote and that are less than 0.6 cM in length.	Genobioinfo Cluster: How to use
MeroX	MeroX is based on StavroX. It is specialized for cleavable cross-linkers. In addition to peptide backbone fragments, MeroX identifies cross-linker specific fragments in MS-MS data.	Genobioinfo Cluster: How to use
Merqury	Evaluate genome assemblies with k-mers and more	Genobioinfo Cluster: How to use
Meryl	A genomic k-mer counter (and sequence utility) with nice features.	Genobioinfo Cluster: How to use
Met4j	Met4J is an open-source Java library dedicated to the structural analysis of metabolic networks. It also comes with a toolbox of command-line tools for several analyses relevant to metabolism-related research.	Genobioinfo Cluster: Ask for Install
MetaBat	An adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies.	Genobioinfo Cluster: How to use
metabinkit	From metagenomic or metabarcoding data, it is often necessary to assign taxonomy to DNA sequences. This is generally performed by aligning sequences to a reference database, usually resulting in multiple database alignments for each query sequence. Using these alignment results, metabinkit assigns a single taxon to each query sequence, based on user-defined percentage identity thresholds. In essence, for each query, the alignments are filtered based on the percentage identity thresholds and the lowest common ancestor for all alignments passing the filters is determined. The metabin program is not limited to BLAST alignments, and can accept alignment results produced using any program, provided the input format is correct. However, functionality is also available to create BLAST databases and to perform BLAST alignments, which can be passed directly to metabin.	Genobioinfo Cluster: How to use
metaBIT	An integrative and automated metagenomic pipeline for analysing microbial profiles from high-throughput sequencing shotgun data.	Genobioinfo Cluster: How to use
METABOLIC	METabolic And BiogeOchemistry anaLyses In miCrobes	Genobioinfo Cluster: How to use
MetaCHIP	Horizontal gene transfer (HGT) identification pipeline among prokaryotes.	Genobioinfo Cluster: How to use
metaDMG	A fast and accurate ancient DNA damage toolkit for metagenomic data.	Genobioinfo Cluster: How to use
MetaEuk	MetaEuk is a modular toolkit designed for large-scale gene discovery and annotation in eukaryotic metagenomic contigs.	Genobioinfo Cluster: How to use
METAL	The METAL software is designed to facilitate meta-analysis of large datasets (such as several whole genome scans) in a convenient, rapid and memory efficient manner.	Genobioinfo Cluster: Ask for Install
MetaMaps	MetaMaps is a tool specifically developed for the analysis of long-read (PacBio/ONT) metagenomic datasets. It simultaneously carries out read assignment and sample composition estimation. It is faster than classical exact alignment-based approaches, and its output is more information-rich than that of kmer-spectra-based methods. For example, each MetaMaps alignment comes with an approximate alignment location, an estimated alignment identity and a mapping quality.	Genobioinfo Cluster: Ask for Install
MetaMDBG	A lightweight assembler for long and accurate metagenomics reads.	Genobioinfo Cluster: How to use
MetaPhlAn	MetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.	Genobioinfo Cluster: How to use
MetaPhlAn2	MetaPhlAn2 is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea, Eukaryotes and Viruses) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level resolution. With the newly added StrainPhlAn module, it is now possible to perform accurate strain-level microbial profiling.	Genobioinfo Cluster: How to use
MetaPhlAn3	MetaPhlAn is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea and Eukaryotes) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level resolution. With the newly added StrainPhlAn module, it is now possible to perform accurate strain-level microbial profiling.	Genobioinfo Cluster: Ask for Install
MetaPhlAn4	MetaPhlAn is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea and Eukaryotes) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level resolution. With StrainPhlAn, it is possible to perform accurate strain-level microbial profiling.	Genobioinfo Cluster: How to use
MetaWRAP	A flexible pipeline for genome-resolved metagenomic data analysis.	Genobioinfo Cluster: How to use
Metaxa2	Improved Identification and Taxonomic Classification of Small and Large Subunit rRNA in Metagenomic Data.	Genobioinfo Cluster: How to use
MethyLasso	A segmentation approach to analyze DNA methylation patterns and identify differentially methylated regions from whole-genome datasets.	Genobioinfo Cluster: How to use
MethylDackel	MethylDackel (formerly named PileOMeth, which was a temporary name derived due to it using a PILEup to extract METHylation metrics) will process a coordinate-sorted and indexed BAM or CRAM file containing some form of BS-seq alignments and extract per-base methylation metrics from them. MethylDackel requires an indexed fasta file containing the reference genome as well.	Genobioinfo Cluster: How to use
methylKit	DNA methylation analysis from high-throughput bisulfite sequencing results	Genobioinfo Cluster: Ask for Install
MethylScore	Identification of differentially methylated regions between multiple epigenomes from BS-treated read mappings via methylated region calling.	Genobioinfo Cluster: How to use
MFA	The Montreal Forced Aligner is a command line utility for performing forced alignment of speech datasets using Kaldi (http://kaldi-asr.org/).	Genobioinfo Cluster: How to use
mgatk	A mitochondrial genome analysis toolkit.	Genobioinfo Cluster: How to use
MGSE	MGSE can harness the power of files generated in genome sequencing projects to predict the genome size. Required are the FASTA file containing a high continuity assembly and a BAM file with all available reads mapped to this assembly.	Genobioinfo Cluster: Ask for Install
micromamba	micromamba is a single-file executable that is statically linked and can be dropped anywhere on the operating system to get started with powerful package management and virtual environments.	Genobioinfo Cluster: How to use
Migraine	Migraine implements coalescent algorithms for maximum likelihood analysis of population genetic data. It considers Allelic counts and DNA sequences.	Genobioinfo Cluster: How to use
Migrate	Migrate estimates effective population sizes, past migration rates between n populations assuming a migration matrix model with asymmetric migration rates and different subpopulation sizes, and population divergences or admixture.	Genobioinfo Cluster: How to use
MiMiC2	MiMiC2 is a bioinformatic pipeline for the selection of a few microbial genomes that functionally represent an entire ecosystem, termed a synthetic community (SynCom).	Genobioinfo Cluster: How to use
MinCED	MinCED is a program to find Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) in full genomes or environmental datasets such as metagenomes, in which sequence size can be anywhere from 100 to 800 bp. MinCED runs from the command-line and was derived from CRT	Genobioinfo Cluster: Ask for Install
Miniasm	Ultrafast de novo assembly for long noisy reads (though having no consensus step).	Genobioinfo Cluster: How to use
minibar	Dual barcode and primer demultiplexing for MinION sequenced reads.	Genobioinfo Cluster: How to use
Miniforge	Miniforge is a minimal installer for Conda specific to conda-forge.	Genobioinfo Cluster: How to use
Minigraph	Minigraph is a sequence-to-graph mapper and graph constructor.	Genobioinfo Cluster: How to use
Minimac4	Minimac4 is the latest version in the series of genotype imputation software, preceded by Minimac3 (2015), Minimac2 (2014), minimac (2012) and MaCH (2010). Minimac4 is a lower memory and more computationally efficient implementation of the original algorithms with comparable imputation quality.	Genobioinfo Cluster: How to use
Minimap	Experimental tool to find approximate mapping positions between long sequences	Genobioinfo Cluster: How to use
MinIONQC	Fast and effective quality control for MinION and PromethION sequencing data	Genobioinfo Cluster: Ask for Install
Minipolish	A tool for Racon polishing of miniasm assemblies.	Genobioinfo Cluster: How to use
Miniprot	Aligning proteins to genomes with splicing and frameshift.	Genobioinfo Cluster: How to use
MiniScrub	MiniScrub is a de novo long sequencing read preprocessing method that improves read quality by predicting and removing ("scrubbing") read segments that have a high concentration of errors.	Genobioinfo Cluster: Ask for Install
MiPepid	MiPepid is a software specifically for predicting the coding capabilities of sORFs. sORFs / smORFs are short open reading frames with length <= 303 bp (including the stop codon), and if translated, they encode micropeptides that are <= 100 amino acids.	Genobioinfo Cluster: How to use
MIPgen	Use MIPgen to design custom MIP panels for target enrichment of moderate to high complexity DNA targets ranging from 120 to 250 bp in size.	Genobioinfo Cluster: How to use
MIRA	Whole genome shotgun and EST sequence assembler for Sanger, 454, and Solexa / Illumina.	Genobioinfo Cluster: How to use
miRDeep2	miRDeep2 is a software package for identification of novel and known miRNAs in deep sequencing data. Furthermore, it can be used for miRNA expression profiling across samples. Last, a new module for preprocessing of raw Illumina sequencing data produces files for downstream analysis with the miRDeep2 or quantifier module.	Genobioinfo Cluster: How to use
MiRfold	MiRfold searches for a good miRNA-like folding in the sequence surrounding a putative miRNA. It was optimized on plant miRNAs.	Genobioinfo Cluster: Ask for Install
MISO	MISO (Mixture-of-Isoforms) is a probabilistic framework that quantitates the expression level of alternatively spliced genes from RNA-Seq data, and identifies differentially regulated isoforms or exons across samples.	Genobioinfo Cluster: How to use
MITGARD	MITGARD (Mitochondrial Genome Assembly from RNA-seq Data) is a computational tool designed to recover the mitochondrial genome from RNA-seq data of any Eukaryote species.	Genobioinfo Cluster: How to use
MITObim	The MITObim procedure (mitochondrial baiting and iterative mapping) represents a highly efficient approach to assembling novel mitochondrial genomes of non-model organisms directly from total genomic DNA derived NGS reads.	Genobioinfo Cluster: How to use
MitoHPC	MitoHPC : Mitochondrial High Performance Caller. For Calling Mitochondrial Homoplasmies and Heteroplasmies.	Genobioinfo Cluster: How to use
MitoSeek	MitoSeek is an open-source software tool to reliably and easily extract mitochondrial genome information from exome sequencing data. MitoSeek evaluates mitochondrial genome alignment quality, estimates relative mitochondrial copy numbers, and detects heteroplasmy, somatic mutation, and structural variance of the mitochondrial genome.	Genobioinfo Cluster: Ask for Install
MitoZ	MitoZ is a Python3-based toolkit which aims to automatically filter paired-end raw data (fastq files), assemble the genome, search for mitogenome sequences in the genome assembly result, annotate the mitogenome (genbank file as result), and visualize the mitogenome.	Genobioinfo Cluster: How to use
mity	A highly sensitive mitochondrial variant analysis pipeline for whole genome sequencing data	Genobioinfo Cluster: How to use
MLST	Multi-Locus sequence Typing. The method enables investigators to determine the ST based on WGS data.	Genobioinfo Cluster: Ask for Install
mmannot	mmannot annotates reads, or quantifies the features. For instance, suppose that you have sequenced your organism of interest with sRNA-Seq (RNA-Seq works too), and you want to know how many times you have sequenced miRNAs, rRNAs, tRNAs, etc. This is what mmannot does. A huge proportion of the reads may actually map at several locations. These multi-mapping reads are usually handled poorly by similar quantification tools. In our method, when a read maps at several locations, all these locations are inspected: (1) if all these locations belong to the same feature (e.g. miRNAs, in the case of a duplicated gene family), the read is still annotated as a miRNA; (2) if the locations belong to different features (e.g. 3'UTR and miRNA), the read is ambiguous, and is flagged as 3'UTR--miRNA. In case 1, we say we have rescued a read.	Genobioinfo Cluster: Ask for Install
mmquant	This tool counts the number of reads (produced by RNA-Seq) per gene, much like HTSeq-count and featureCounts. The main difference with other tools is that multi-mapping reads are counted differently: if a read is mapped to gene A, gene B, and gene C, the tool will create a new feature, "geneA--geneB--geneC", that will be counted once.	Genobioinfo Cluster: How to use
MMseqs2	MMseqs2 (Many-against-Many sequence searching) is a software suite to search and cluster huge protein/nucleotide sequence sets.	Genobioinfo Cluster: How to use
MOB-suite	Software tools for clustering, reconstruction and typing of plasmids from draft assemblies. The MOB-suite is designed to be a modular set of tools for the typing and reconstruction of plasmid sequences from WGS assemblies.	Genobioinfo Cluster: How to use
MobileElementFinder	MobileElementFinder is a tool for identifying Mobile Genetic Elements (MGEs) in Whole Genome Shotgun sequence data.	Genobioinfo Cluster: How to use
Mobster	Mobster is used to detect novel (non-reference) Mobile Element Insertion (MEI) events in BAM files and uses both a discordant read pair method and a split-read method.	Genobioinfo Cluster: Ask for Install
modbam2bed	A program to aggregate modified base counts stored in a modified-base BAM file to a bedMethyl file.	Genobioinfo Cluster: How to use
ModelTest-NG	ModelTest-NG is a tool for selecting the best-fit model of evolution for DNA and protein alignments. ModelTest-NG supersedes jModelTest and ProtTest in one single tool, with graphical and command console interfaces.	Genobioinfo Cluster: How to use
Modkit	A bioinformatics tool for working with modified bases from Oxford Nanopore. Specifically for converting modBAM to bedMethyl files using best practices, but also manipulating modBAM files and generating summary statistics.	Genobioinfo Cluster: How to use
MOSAIK	MOSAIK is a reference-guided assembler comprising two main modular programs.	Genobioinfo Cluster: Ask for Install
mosdepth	Fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing. mosdepth can output: per-base depth, about 2x as fast as samtools depth (about 25 minutes of CPU time for a 30X genome); mean per-window depth given a window size, as would be used for CNV calling; the mean per-region depth given a BED file of regions; a distribution of the proportion of bases covered at or above a given threshold for each chromosome and genome-wide; quantized output that merges adjacent bases as long as they fall in the same coverage bins, e.g. (10-20); threshold output to indicate how many bases in each region are covered at the given thresholds. When appropriate, the output files are bgzipped and indexed for ease of use.	Genobioinfo Cluster: How to use
mothur	The one-stop source for your computational microbial ecology needs. mothur offers the ability to go from raw sequences to the generation of visualization tools to describe alpha and beta diversity.	Genobioinfo Cluster: How to use
mPTP	A tool for single-locus species delimitation.	Genobioinfo Cluster: How to use
MrBayes	MrBayes is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. MrBayes uses Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of model parameters.	Genobioinfo Cluster: How to use
mreps	Software for tandem repeat identification in DNA.	Genobioinfo Cluster: How to use
ms	A program for generating samples under neutral models.	Genobioinfo Cluster: How to use
msamtools	msamtools provides useful functions that are commonly used in microbiome data analysis, especially when analyzing shotgun metagenomics or metatranscriptomics data.	Genobioinfo Cluster: How to use
msBayes	msBayes allows complex and flexible comparative phylogeographic inference.	Genobioinfo Cluster: Ask for Install
MSI	MSI was designed for sequencing reads with higher error rates (e.g., as the ones produced by Nanopore's sequencers) but also works with reads with lower error rates (e.g., Illumina).	Genobioinfo Cluster: Ask for Install
MSIsensor-pro	MSIsensor-pro is an updated version of msisensor. MSIsensor-pro evaluates Microsatellite Instability (MSI) for cancer patients with next generation sequencing data.	Genobioinfo Cluster: How to use
msmc	This software implements MSMC, a method to infer population size and gene flow from multiple genome sequences	Genobioinfo Cluster: How to use
msmc2	This program implements MSMC2, a method to infer population size history and population separation history from whole genome sequencing data	Genobioinfo Cluster: How to use
MSPC	Improve Sensitivity and Specificity of Peak Calling, and Identify Consensus Regions	Genobioinfo Cluster: How to use
msprime	msprime is a population genetics simulator based on tskit. msprime can simulate random ancestral histories for a sample of individuals (consistent with a given demographic model) under a range of different models and evolutionary processes. msprime can also simulate mutations on a given ancestral history (which can be produced by msprime or other programs supporting tskit) under a variety of genome sequence evolution models. Please see the documentation for more details.	Genobioinfo Cluster: How to use
msums	A program for the efficient computation of a number of population genetics summary statistics. msums can read ms-format data on (nearly) arbitrary numbers of populations.	Genobioinfo Cluster: Ask for Install
MTG-Link	MTG-Link is a novel gap-filling tool for draft genome assemblies, dedicated to linked read data.	Genobioinfo Cluster: Ask for Install
Mugsy	Mugsy is a multiple whole genome aligner. Mugsy uses Nucmer for pairwise alignment, a custom graph based segmentation procedure for identifying collinear regions, and the segment-based progressive multiple alignment strategy from Seqan::TCoffee. Mugsy accepts draft genomes in the form of multi-FASTA files and does not require a reference genome. Angiuoli SV and Salzberg SL. Mugsy: Fast multiple alignment of closely related whole genomes. Bioinformatics 2011 27(3):334-4	Genobioinfo Cluster: How to use
MultAlin	Multiple sequence alignment with hierarchical clustering.	Genobioinfo Cluster: Ask for Install
MultiQC	Aggregate results from bioinformatics analyses across many samples into a single report.	Genobioinfo Cluster: How to use
MUMmer	MUMmer is a package for rapidly aligning entire genomes, whether in complete or draft form.	Genobioinfo Cluster: How to use
MuMRescueLite	MuMRescueLite is software that makes it possible to use sequence tags mapping to multiple loci in the genome for expression analysis. When short sequence tags from CAGE or ChIP-Seq are mapped to the genome, tags that map to multiple genomic loci (multi-mapping tags or MuMs) are routinely omitted from further analysis, leading to experimental bias and reduced coverage. MuMRescueLite probabilistically reincorporates multi-mapping tags into mapped short read data with acceptable computational requirements.	Genobioinfo Cluster: Ask for Install
MUSCLE	Multiple sequence alignment (nucleic or proteic).	Genobioinfo Cluster: How to use
Musket	Musket is a well-established leading next-generation sequencing read error correction algorithm targeting Illumina sequencing.	Genobioinfo Cluster: Ask for Install
MVP	MVP stands for Multi-choice Viromics Pipeline. It is a simplified pipeline that utilizes a suite of state-of-art tools to easily get from a set of contigs to a vOTU heatmap (and more).	Genobioinfo Cluster: How to use
MyCC	Accurate binning of metagenomic contigs via automated clustering sequences using information of genomic signatures and marker genes.	Genobioinfo Cluster: Ask for Install
myloasm	Myloasm is a de novo metagenome assembler for long-read sequencing data. It takes sequencing reads and outputs polished contigs in a single command.	Genobioinfo Cluster: How to use
MZmine	MZmine is an open-source software for mass-spectrometry data processing.	Genobioinfo Cluster: How to use
NAMD	NAMD, recipient of a 2002 Gordon Bell Award and a 2012 Sidney Fernbach Award, is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.	Genobioinfo Cluster: How to use
NaMeco	Pipeline for the Nanopore 16S long read clustering and taxonomy classification.	Genobioinfo Cluster: How to use
Nano-Q	Python script for conservatively cleaning ONT reads from BAM files and estimating variant frequencies.	Genobioinfo Cluster: Ask for Install
NanoASV	Nanopore full-length 16S metabarcoding amplicon data analysis	Genobioinfo Cluster: How to use
NanoCaller	NanoCaller is a computational method that integrates long reads in deep convolutional neural network for the detection of SNPs/indels from long-read sequencing data.	Genobioinfo Cluster: How to use
NanoCLUST	NanoCLUST is an analysis pipeline for UMAP-based classification of amplicon-based full-length 16S rRNA nanopore reads.	Genobioinfo Cluster: How to use
NanoComp	Compare multiple runs of long read sequencing data and alignments.	Genobioinfo Cluster: in Python-3.11.1
NanoCount	NanoCount estimates transcript abundance from Oxford Nanopore direct-RNA sequencing datasets, using an expectation-maximization approach like RSEM, Kallisto, salmon, etc. to handle the uncertainty of multi-mapping reads.	Genobioinfo Cluster: Ask for Install
NanoFilt	Filtering and trimming of Oxford Nanopore sequencing data	Genobioinfo Cluster: Ask for Install
NanoLyse	Remove reads mapping to the lambda phage genome from a fastq file	Genobioinfo Cluster: How to use
NanoPlot	Plotting tool for Oxford Nanopore sequencing data and alignments.	Genobioinfo Cluster: How to use
Nanopolish	A nanopore consensus algorithm using a signal-level hidden Markov model. Signal-level algorithms for MinION data.	Genobioinfo Cluster: How to use
NanoSim	NanoSim is a fast and scalable read simulator that captures the technology-specific features of ONT data, and allows for adjustments upon improvement of nanopore sequencing technology.	Genobioinfo Cluster: How to use
NanoSPC	NanoSPC is a scalable, portable and cloud compatible pipeline for analyzing Nanopore sequencing data.	Genobioinfo Cluster: How to use
NanoStat	Creates a statistical summary of an Oxford Nanopore read dataset.	Genobioinfo Cluster: How to use
NaS	NaS is a hybrid approach developed to take advantage of data generated using the MinION device. It combines Illumina and Oxford Nanopore technologies to produce NaS (Nanopore Synthetic-long) reads.	Genobioinfo Cluster: Ask for Install
natsort	Simple yet flexible natural sorting in Python	Genobioinfo Cluster: Ask for Install
NCBI_Blast	Similarity search against databanks.	Genobioinfo Cluster: How to use
NCBI_Blast+	Similarity search against databanks.	Genobioinfo Cluster: How to use
NCBI_datasets	NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases. You can use it to find and download sequence, annotation, and metadata for genes and genomes using our command-line interface (CLI) tools or NCBI Datasets web interface.	Genobioinfo Cluster: How to use
NCBI_tools	NCBI portable software toolkit	Genobioinfo Cluster: How to use
NCBI_tools++	NCBI C++ Toolkit provides free, portable, public domain libraries.	Genobioinfo Cluster: How to use
NECAT	NECAT is an error correction and de-novo assembly tool for Nanopore long noisy reads.	Genobioinfo Cluster: How to use
NetLogo	NetLogo is a multi-agent programmable modeling environment.	Genobioinfo Cluster: Ask for Install
Neural-ADMIXTURE	Neural ADMIXTURE is an unsupervised global ancestry inference technique based on ADMIXTURE. By using neural networks, Neural ADMIXTURE offers high quality ancestry assignments with a running time which is much faster than ADMIXTURE's.	Genobioinfo Cluster: How to use
Newbler	Newbler is a software package for de novo DNA sequence assembly. It is designed specifically for assembling sequence data generated by the 454 GS-series of pyrosequencing platforms sold by 454 Life Sciences, a Roche Diagnostics company.	Genobioinfo Cluster: Ask for Install
Newick_Utilities	The Newick Utilities are a suite of Unix shell tools for processing phylogenetic trees. We distribute the package under the BSD License. Functions include re-rooting, extracting subtrees, trimming, pruning, condensing, drawing (ASCII graphics or SVG).	Genobioinfo Cluster: How to use
NextCloudcmd	A command line client that can be used to synchronize Nextcloud files to client machines.	Genobioinfo Cluster: Ask for Install
NextDenovo	NextDenovo is a string graph-based de novo assembler for TGS long reads.	Genobioinfo Cluster: How to use
Nextflow	Nextflow enables scalable and reproducible scientific workflows using software containers. It allows the adaptation of pipelines written in the most common scripting languages.	Genobioinfo Cluster: How to use
NextPolish	NextPolish is used to fix base errors (SNV/Indel) in genomes generated from noisy long reads. It can be used with short read data only, long read data only, or a combination of both.	Genobioinfo Cluster: How to use
nf-core workflows	This module provides access to nf-core workflows, which are automatically downloaded into your home directory. More info at the nf-core/config page.	Genobioinfo Cluster: How to use
NGMLR	NGMLR is a long-read mapper designed to align PacBio or Oxford Nanopore (standard and ultra-long) reads to a reference genome, with a focus on reads that span structural variations.	Genobioinfo Cluster: How to use
ngsLD	ngsLD is a program to estimate pairwise linkage disequilibrium (LD) taking the uncertainty of genotype assignment into account.	Genobioinfo Cluster: How to use
ngsRelate	NgsRelate can be used to infer relatedness coefficients for pairs of individuals from low coverage Next Generation Sequencing (NGS) data by using genotype likelihoods instead of called genotypes.	Genobioinfo Cluster: How to use
ngsTools	ngsTools is a collection of programs for population genetics analyses from NGS data, taking into account its statistical uncertainty. The methods implemented in these programs do not rely on SNP or genotype calling, and are particularly suitable for low sequencing depth data.	Genobioinfo Cluster: How to use
ngsutils	Tools for next-generation sequencing analysis.	Genobioinfo Cluster: How to use
NINJA	Nearly Infinite Neighbor Joining Application	Genobioinfo Cluster: How to use
NLR-Annotator	Disease resistance genes encoding nucleotide-binding and leucine-rich repeat (NLR) intracellular immune receptor proteins detect pathogens by the presence of pathogen effectors. Although developed for wheat, we demonstrate the universal applicability of NLR-Annotator across diverse plant taxa. NLR-Annotator is a tool to annotate loci associated with NLRs in large sequences.	Genobioinfo Cluster: Ask for Install
NLRtracker	NLRtracker extracts and functionally annotates NLRs from protein or transcript files based on the core features found in the RefPlantNLR dataset.	Genobioinfo Cluster: How to use
non-B_gfa	gfa programs for Non-B site at NCI/FNLCR. gfa is a Suite of programs developed at NCI-Frederick/Frederick National Lab to find sequences associated with non-B DNA forming motifs.	Genobioinfo Cluster: Ask for Install
NOVOLoci	NOVOLoci is a haplotype-aware assembler for targeted assembly or whole genome assembly of small genomes. We currently recommend limiting the assembly size to regions <20 Mb in targeted mode and to diploid genomes <250 Mb in WG mode, with a minimum sequencing depth of 10x per haplotype. If you do need to phase accurately and you have HiFi or R10 ONT data, it is advised to use Hifiasm, as it has a much shorter runtime. Currently it is only available for Nanopore; PacBio and hybrid options will be available soon.	Genobioinfo Cluster: How to use
NOVOPlasty	NOVOPlasty is a de novo assembler and variance caller for short circular genomes.	Genobioinfo Cluster: How to use
nPhase	nPhase is a ploidy agnostic tool developed in python which predicts the haplotypes of a sample that was sequenced by both long and short reads by aligning them to a reference. It should work with any ploidy.	Genobioinfo Cluster: Ask for Install
nQuire	A statistical framework for ploidy estimation using NGS short-read data.	Genobioinfo Cluster: How to use
NSEG	NSEG is used to mask nucleic acid sequences, needed by RepeatScout.	Genobioinfo Cluster: How to use
ntEdit	ntEdit is a fast and scalable genomics application for polishing genome assembly drafts. It simplifies polishing and "haploidization" of gene and genome sequences with its re-usable Bloom filter design.	Genobioinfo Cluster: Ask for Install
ntHits	ntHits is a method for identifying repeats in high-throughput DNA sequencing data.	Genobioinfo Cluster: Ask for Install
ntJoin	Scaffolding draft assemblies using reference assemblies and minimizer graphs	Genobioinfo Cluster: How to use
numpy	NumPy is a package needed for scientific computing with Python.	Genobioinfo Cluster: Ask for Install
oarfish	oarfish is a program, written in Rust, for quantifying transcript-level expression from long-read (i.e. Oxford Nanopore cDNA and direct RNA, and PacBio) sequencing technologies. `oarfish` requires a sample of sequencing reads aligned to the transcriptome (currently not to the genome). It handles multi-mapping reads through probabilistic allocation via an expectation-maximization (EM) algorithm.	Genobioinfo Cluster: How to use
Oases	Oases is a de novo transcriptome assembler designed to produce transcripts from short read sequencing technologies, such as Illumina, SOLiD, or 454 in the absence of any genomic assembly. It was developed by Marcel Schulz (MPI for Molecular Genomics) and Daniel Zerbino (previously at the European Bioinformatics Institute (EMBL-EBI), now at UC Santa Cruz). Oases uploads a preliminary assembly produced by Velvet, and clusters the contigs into small groups, called loci. It then exploits the paired-end read and long read information, when available, to construct transcript isoforms.	Genobioinfo Cluster: How to use
oatk	An organelle de novo genome assembly toolkit. (Installation includes OatkDB.)	Genobioinfo Cluster: How to use
OBITools	OBITools is a set of python programs developed to simplify the manipulation of sequence files in our labs. They were mainly designed to help us for analyzing Next Generation Sequencer outputs (454 or Illumina) in the context of DNA Metabarcoding.	Genobioinfo Cluster: How to use
odgi	odgi provides an efficient and succinct dynamic DNA sequence graph model, as well as a host of algorithms that allow the use of such graphs in bioinformatic analyses.	Genobioinfo Cluster: How to use
Ollama	Ollama is an open-source tool that allows you to run large language models (LLMs).	Genobioinfo Cluster: How to use
OMA	The OMA (Orthologous MAtrix) database is a well-established resource for identifying orthologs among publicly available complete genomes.	Genobioinfo Cluster: How to use
OMArk	OMArk is a software for proteome (protein-coding gene repertoire) quality assessment.	Genobioinfo Cluster: How to use
OMBlast	An alignment tool for optical mapping data.	Genobioinfo Cluster: Ask for Install
omegaplus	A scalable tool for rapid detection of selective sweeps in whole-genome datasets.	Genobioinfo Cluster: How to use
Onecodetofindthemall	One code to find them all is a set of perl scripts to extract useful information from RepeatMasker about transposable elements, retrieve their sequences and get some quantitative information.	Genobioinfo Cluster: Ask for Install
ont_fast5_api	ont_fast5_api is a simple interface to HDF5 files of the Oxford Nanopore fast5 file format.	Genobioinfo Cluster: How to use
OpenBabel	Open Babel is a chemical toolbox designed to speak the many languages of chemical data.	Genobioinfo Cluster: How to use
Openfold3	A fully open source biomolecular structure prediction model based on AlphaFold3.	Genobioinfo Cluster: How to use
openSMILE	Python package for openSMILE (open-source Speech and Music Interpretation by Large-space Extraction).	Genobioinfo Cluster: How to use
ORA	Bio::ORA is a featherweight object for identifying mammalian olfactory receptor genes. The sequences should not be longer than 40 kb. The returned array includes the location, sequence and statistics for the putative olfactory receptor gene. Fully functional with DNA and EST sequences; introns are not supported.	Genobioinfo Cluster: Ask for Install
ORFfinder	ORFfinder searches for open reading frames (ORFs) in the DNA sequence you enter.	Genobioinfo Cluster: How to use
ORG.asm	The ORGanelle ASseMbler aims to target the assembling of small sequences over-represented in a whole genome shotgun sequence dataset.	Genobioinfo Cluster: How to use
Organelle_PBA	OrganelleRef_PBA is a script to perform de novo PacBio assemblies of any organelle (chloroplast or mitochondrial genomes) using several programs.	Genobioinfo Cluster: Ask for Install
orthAgogue	A tool for high-speed estimation of homology relations within and between species in massive data sets. orthAgogue is easy to use and offers flexibility through a range of optional parameters.	Genobioinfo Cluster: Ask for Install
OrthoFinder	OrthoFinder is a fast, accurate and comprehensive analysis tool for comparative genomics. It finds orthologues and orthogroups, infers rooted gene trees for all orthogroups and infers a rooted species tree for the species being analysed. OrthoFinder also provides comprehensive statistics for comparative genomic analyses.	Genobioinfo Cluster: How to use
OrthoLoger	Standalone pipeline for delineation of orthologs.	Genobioinfo Cluster: Ask for Install
OrthoMCL	OrthoMCL is a genome-scale algorithm for grouping orthologous protein sequences.	Genobioinfo Cluster: How to use
Pacasus	Tool for detecting and cleaning PacBio / Nanopore long reads after whole genome amplification.	Genobioinfo Cluster: Ask for Install
pairtools	pairtools is a simple and fast command-line framework to process sequencing data from a Hi-C experiment.	Genobioinfo Cluster: How to use
PALEOMIX	The PALEOMIX pipeline is a set of free and open-source pipelines and tools designed to enable the rapid processing of Next Generation Sequencing (NGS) data, starting from de-multiplexed reads from one or more samples, through sequence processing and alignment, and ending with genotyping, phylogenetic inference on the samples, as well as metagenomic analysis of the samples.	Genobioinfo Cluster: How to use
palm_annot	Scripts, HMMs and search databases for identifying and classifying viral RdRp sequences	Genobioinfo Cluster: How to use
Palmscan	Palmscan is software to detect viral polymerase palmprint barcode sequences in longer sequences such as virus genomes and ORFs. Palmprints can be used to classify RNA viruses.	Genobioinfo Cluster: How to use
PAML	PAML is a package of programs for phylogenetic analyses of DNA or protein sequences using maximum likelihood.	Genobioinfo Cluster: How to use
PanACoTA	PANgenome with Annotations, COre identification, Tree and corresponding Alignments.	Genobioinfo Cluster: How to use
panacus		Genobioinfo Cluster: How to use
Pandoc	Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library.	Genobioinfo Cluster: How to use
PanGenie	A short-read genotyper for various types of genetic variants (such as SNPs, indels and structural variants) represented in a pangenome graph.	Genobioinfo Cluster: How to use
pantera	Identification of transposable element families from pangenome polymorphisms. A pangenome is a collection of genomes or haplotypes that can be aligned and stored as a variation graph in gfa format. pantera receives as input a list of gfa files of non overlapping variation graphs and produces a library of transposable elements found to be polymorphic on that pangenome.	Genobioinfo Cluster: How to use
PanTools	PanTools is a toolkit for comparative analysis of large numbers of genomes.	Genobioinfo Cluster: How to use
Paragraph	Graph realignment tools for structural variants.	Genobioinfo Cluster: How to use
parallel	GNU parallel is a shell tool for executing jobs in parallel using one or more computers.	Genobioinfo Cluster: default system
parallel-fastq-dump	NCBI fastq-dump can be very slow sometimes, even if you have the resources (network, IO, CPU) to go faster, and even if you have already downloaded the sra file. This tool speeds up the process by dividing the work into multiple threads.	Genobioinfo Cluster: How to use
ParGenes	A massively parallel tool for model selection and tree inference on thousands of genes.	Genobioinfo Cluster: Ask for Install
Parselmouth	Parselmouth aims to provide a complete and Pythonic interface to the internal Praat code.	Genobioinfo Cluster: How to use
parseRM	A few scripts facilitating the extraction of information from RepeatMasker .out files.	Genobioinfo Cluster: Ask for Install
Parsnp	Parsnp is a command-line-tool for efficient microbial core genome alignment and SNP detection.	Genobioinfo Cluster: How to use
PartitionFinder	PartitionFinder is free open source software to select best-fit partitioning schemes and models of molecular evolution for phylogenetic analyses.	Genobioinfo Cluster: How to use
PASApipeline	PASA, acronym for Program to Assemble Spliced Alignments, is a eukaryotic genome annotation tool that exploits spliced alignments of expressed transcript sequences to automatically model gene structures, and to maintain gene structure annotation consistent with the most recently available experimental sequence data. PASA also identifies and classifies all splicing variations supported by the transcript alignments.	Genobioinfo Cluster: How to use
PASTA	PASTA estimates alignments and ML trees from unaligned sequences using an iterative approach. In each iteration, it first estimates a multiple sequence alignment using the current tree as a guide and then estimates an ML tree on (a masked version of) the alignment.	Genobioinfo Cluster: How to use
PathoFact	PathoFact is an easy-to-use modular pipeline for the metagenomic analyses of toxins, virulence factors and antimicrobial resistance.	Genobioinfo Cluster: Ask for Install
pathPhynder	A workflow for integrating ancient lineages into present-day phylogenies.	Genobioinfo Cluster: How to use
PAUP	Tools for inferring and interpreting phylogenetic trees	Genobioinfo Cluster: How to use
pb-assembly	PacBio® tools: everything needed to run Falcon and Unzip.	Genobioinfo Cluster: Ask for Install
pblat	Parallelized blat with multi-threads support.	Genobioinfo Cluster: How to use
pbmm2	A minimap2 frontend for PacBio native data format.	Genobioinfo Cluster: How to use
PBSIM3	A simulator for all types of Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) long reads.	Genobioinfo Cluster: How to use
pbtk	PacBio BAM toolkit	Genobioinfo Cluster: How to use
PCAdmix	PCAdmix is a method that estimates local ancestry via principal components analysis (PCA) using phased haplotypes. The method considers data chromosome by chromosome.	Genobioinfo Cluster: Ask for Install
PCAngsd	Framework for analyzing low depth next-generation sequencing (NGS) data in heterogeneous populations using principal component analysis (PCA).	Genobioinfo Cluster: How to use
PEAR	PEAR is an ultrafast, memory-efficient and highly accurate pair-end read merger. It is fully parallelized and can run with as low as just a few kilobytes of memory.	Genobioinfo Cluster: How to use
PECAT	PECAT is a phased error correction and assembly tool for long reads. It includes a haplotype-aware correction method and an efficient diploid assembly method.	Genobioinfo Cluster: How to use
Pelican	Pelican is a reimplementation of the model of Tamuri et al. (2009) to identify sites undergoing different kinds of directional selection in different parts of a phylogenetic tree.	Genobioinfo Cluster: How to use
Peregrine	Peregrine is a fast genome assembler for accurate long reads (length > 10 kb, accuracy > 99%). It can assemble a human genome from 30x reads within 20 CPU hours, from reads to polished consensus.	Genobioinfo Cluster: How to use
PETfold	PETfold performs Probabilistic Evolutionary and Thermodynamic folding of a multiple alignment of RNA sequences.	Genobioinfo Cluster: Ask for Install
PfamScan	A program that searches a FASTA file against a library of Pfam HMMs.	Genobioinfo Cluster: How to use
pftools3	The pftools package contains all the software necessary to build protein and DNA generalized profiles and use them to scan and align sequences, and search databases	Genobioinfo Cluster: How to use
PGAP	The NCBI Prokaryotic Genome Annotation Pipeline is designed to annotate bacterial and archaeal genomes (chromosomes and plasmids).	Genobioinfo Cluster: How to use
PGDSpider	PGDSpider is a powerful automated data conversion tool for population genetic and genomics programs. It facilitates the data exchange possibilities between programs for a vast range of data types (e.g. DNA, RNA, NGS, microsatellite, SNP, RFLP, AFLP, multi-allelic data, allele frequency or genetic distances)	Genobioinfo Cluster: How to use
pggb	Pangenome graph builder.	Genobioinfo Cluster: How to use
PHASE	PHASE is a package that performs molecular phylogenetic inference. The software seeks to accurately compare molecular sequences to determine the likely evolutionary relationships between a group of species.	Genobioinfo Cluster: How to use
phasebook	phasebook is a novel approach for reconstructing the haplotypes of diploid genomes from long reads de novo, that is without the need for a reference genome.	Genobioinfo Cluster: Ask for Install
PhaseTank	To systemically characterize phasiRNAs/tasiRNAs and their regulatory cascades 'miRNA/phasiRNA ->	Genobioinfo Cluster: Ask for Install
PHAST	Phylogenetic Analysis with Space/Time models (PHAST) is a freely available software package consisting of a collection of command-line programs and supporting libraries for comparative and evolutionary genomics.	Genobioinfo Cluster: How to use
PHASTER_scripts	Small utility scripts to query the PHASTER API endpoint, to identify and annotate prophage sequences within bacterial genomes and plasmids.	Genobioinfo Cluster: How to use
PhiPack	The Phi Test is a simple, rapid, and statistically efficient test for recombination.	Genobioinfo Cluster: Ask for Install
PhiSpy	PhiSpy identifies prophages in Bacterial (and probably Archaeal) genomes. Given an annotated genome it will use several approaches to identify the most likely prophage regions.	Genobioinfo Cluster: Ask for Install
Phobius	A combined transmembrane topology and signal peptide predictor.	Genobioinfo Cluster: Ask for Install
Phy-Mer	A novel alignment-free and reference-independent mitochondrial haplogroup classifier.	Genobioinfo Cluster: Ask for Install
PhyKIT	PhyKIT is a UNIX shell toolkit for processing and analyzing phylogenomic data.	Genobioinfo Cluster: How to use
PHYLIP	PHYLIP (PHYLogeny Inference Package) is a package composed of 34 programs dedicated to phylogeny inference. Methods that are available in the package include parsimony, distance matrix, and likelihood methods, including bootstrapping and consensus trees. Data types that can be handled include molecular sequences, gene frequencies, restriction sites and fragments, distance matrices, and discrete characters.	Genobioinfo Cluster: How to use
PhyloBayes	PhyloBayes is a Bayesian Monte Carlo Markov Chain (MCMC) sampler for phylogenetic reconstruction and molecular dating using protein and nucleic acid alignments.	Genobioinfo Cluster: How to use
Phylobayes_MPI	PhyloBayes (Lartillot et al, 2009) is a Bayesian Monte Carlo Markov Chain (MCMC) sampler for phylogenetic reconstruction. With MPI.	Genobioinfo Cluster: How to use
phyloFlash	phyloFlash is a pipeline to rapidly reconstruct the SSU rRNAs and explore the phylogenetic composition of an Illumina (meta)genomic dataset.	Genobioinfo Cluster: How to use
PhyloNet		Genobioinfo Cluster: How to use
PhyloPhlAn	PhyloPhlAn is a computational pipeline for reconstructing highly accurate and resolved phylogenetic trees based on whole-genome sequence information. The pipeline is scalable to thousands of genomes and uses the most conserved 400 proteins for extracting the phylogenetic signal. PhyloPhlAn also implements taxonomic curation, estimation, and insertion operations.	Genobioinfo Cluster: Ask for Install
phyluce	phyluce (phy-loo-chee) is a software package that was initially developed for analyzing data collected from ultraconserved elements in organismal genomes.	Genobioinfo Cluster: How to use
PhyML	PhyML is a phylogeny software based on the maximum-likelihood principle.	Genobioinfo Cluster: How to use
phyx	phyx performs phylogenetics analyses on trees and sequences.	Genobioinfo Cluster: Ask for Install
picard-tools	Picard comprises Java-based command-line utilities that manipulate SAM files, and a Java API (SAM-JDK) for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported.	Genobioinfo Cluster: How to use
PICRUSt	PICRUSt (pronounced "pie crust") is a bioinformatics software package designed to predict metagenome functional content from marker gene (e.g., 16S rRNA) surveys and full genomes.	Genobioinfo Cluster: How to use
PILER	Genomic repeat analysis software.	Genobioinfo Cluster: How to use
PILERCR	PILERCR is public domain software for finding CRISPR repeats.	Genobioinfo Cluster: Ask for Install
Pilon	Pilon is an automated genome assembly improvement and variant detection tool.	Genobioinfo Cluster: How to use
piMASS	posterior inference via Model Averaging and Subset Selection: performs genome-wide joint analysis of all SNPs in association with a phenotype.	Genobioinfo Cluster: Ask for Install
Pindel	Pindel can detect breakpoints of large deletions, medium-sized insertions, inversions, tandem duplications and other structural variants at single-base resolution from next-gen sequence data. It uses a pattern growth approach to identify the breakpoints of these variants from paired-end short reads.	Genobioinfo Cluster: How to use
PingPongPro	Find ping-pong signatures in piRNA-Seq data like a pro.	Genobioinfo Cluster: Ask for Install
pixy	pixy is a command-line tool for painlessly and correctly estimating average nucleotide diversity within (π) and between (dxy) populations from a VCF.	Genobioinfo Cluster: How to use
Pizzly	A program for detecting gene fusions from RNA-Seq data of cancer samples.	Genobioinfo Cluster: Ask for Install
PlantiSMASH	PlantiSMASH is a specialized extension of antiSMASH for the identification and analysis of biosynthetic gene clusters (BGCs) in plant genomes. It supports advanced plant-specific detection rules and features for comparative genomics, visualization, and more.	Genobioinfo Cluster: How to use
PlasFlow	Software for prediction of plasmid sequences in metagenomic assemblies.	Genobioinfo Cluster: How to use
Plasmer	An accurate and sensitive bacterial plasmid identification tool based on deep machine-learning of shared k-mers and genomic features.	Genobioinfo Cluster: How to use
PlasmidFinder	The service identifies plasmids in total or partial sequenced isolates of bacteria.	Genobioinfo Cluster: How to use
Plass	Plass (Protein-Level ASSembler) is a software to assemble short read sequencing data on a protein level.	Genobioinfo Cluster: Ask for Install
plassembler	plassembler is a program that is designed for automated & fast assembly of plasmids in bacterial genomes that have been hybrid sequenced with long read & paired-end short read sequencing.	Genobioinfo Cluster: How to use
PLAST	PLAST is a fast, accurate and NGS scalable bank-to-bank sequence similarity search tool providing significant accelerations of seeds-based heuristic comparison methods, such as the Blast suite of algorithms.	Genobioinfo Cluster: Ask for Install
Platanus	Platanus is a novel de novo sequence assembler that can reconstruct genomic sequences of highly heterozygous diploids from massively parallel shotgun sequencing data.	Genobioinfo Cluster: Ask for Install
Platanus_trim	Platanus_trim is a tool for trimming adaptor sequences and low quality regions. In contrast, Platanus_internal_trim is a tool for trimming internal adaptor sequence, adaptor sequences, and low quality regions. Platanus_trim is designed for paired-end library and Platanus_internal_trim is for mate-pair library.	Genobioinfo Cluster: Ask for Install
Platanus2	Platanus-allee (Platanus2) is a de novo haplotype assembler (phasing tool), which assembles each haplotype sequence in a diploid genome.	Genobioinfo Cluster: Ask for Install
PLINK	PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.	Genobioinfo Cluster: How to use
ploidyNGS	A model-free, open source tool to visualize and explore ploidy levels in a newly sequenced genome, exploiting short read data.	Genobioinfo Cluster: Ask for Install
plotsr	Tool to plot synteny and structural rearrangements between genomes.	Genobioinfo Cluster: How to use
PMERGE	PMERGE is software that implements a new method for identifying candidate PSVs by building networks of loci that share high levels of nucleotide similarity. PMERGE is embedded in the analysis pipeline of the widely used Stacks software, and it is straightforward to apply it as an additional filter in population-genomic studies using RAD-seq data.	Genobioinfo Cluster: Ask for Install
POD5	The pod5 Python package contains the tools and python API wrapping the compiled bindings for the POD5 file format from lib_pod5.	Genobioinfo Cluster: How to use
Polypolish	Polypolish is a tool for polishing genome assemblies with short reads.	Genobioinfo Cluster: How to use
Pomoxis	Pomoxis comprises a set of basic bioinformatic tools tailored to nanopore sequencing.	Genobioinfo Cluster: Ask for Install
pong	pong is a freely available software package, released by Behr et al. (2016, Bioinformatics), for post-processing output from clustering inference using population genetic data.	Genobioinfo Cluster: Ask for Install
PopART	PopART (Population Analysis with Reticulate Trees) is free, open source population genetics software that was developed as part of the Allan Wilson Centre Imaging Evolution Initiative.	Genobioinfo Cluster: How to use
popins	Population-scale detection of novel-sequence insertions.	Genobioinfo Cluster: Ask for Install
PopIns2	Population-scale detection of non-reference sequence variants using colored de Bruijn Graphs.	Genobioinfo Cluster: Ask for Install
PoPoolation	PoPoolation is a pipeline for analysing pooled next generation sequencing data. Currently PoPoolation can calculate Tajima’s Pi, Watterson’s Theta and Tajima’s D with a sliding window approach for chromosomes or for a set of genes.	Genobioinfo Cluster: How to use
PoPoolation2	PoPoolation2 compares allele frequencies for SNPs between two or more populations and identifies significant differences. PoPoolation2 requires next generation sequencing data of pooled genomic DNA (Pool-Seq). It may be used for measuring differentiation between populations, for genome-wide association studies and for experimental evolution.	Genobioinfo Cluster: How to use
popPhylABC	Scripts used for ABC analysis with homo- and heterogeneity in Migration rates or/and Effective population sizes	Genobioinfo Cluster: Ask for Install
POPS	The POPS program performs inference of ancestry distribution models.	Genobioinfo Cluster: How to use
Porechop	Porechop is a tool for finding and removing adapters from Oxford Nanopore reads. Adapters on the ends of reads are trimmed off, and when a read has an adapter in its middle, it is treated as chimeric and chopped into separate reads. Porechop performs thorough alignments to effectively find adapters, even at low sequence identity.	Genobioinfo Cluster: How to use
Porechop_ABI	Porechop_abi (ab initio) is an extension of Porechop that is able to infer the adapter sequence from the Oxford Nanopore reads. It discovers the adapter sequence from the reads using approximate k-mers and assembly, and adds the sequence found to the adapter list (adapters.py file).	Genobioinfo Cluster: How to use
Portcullis	Splice junction analysis and filtering from BAM files.	Genobioinfo Cluster: Ask for Install
pp-popularity-contest	The pp-popularity-contest package sets up a cron job that periodically submits to the developers anonymous statistics on the usage of Rost Lab prediction methods installed on this system.	Genobioinfo Cluster: How to use
PPanGGOLiN	PPanGGOLiN (Gautreau et al. 2020) is a software suite used to create and manipulate prokaryotic pangenomes from a set of either genomic DNA sequences or provided genome annotations.	Genobioinfo Cluster: How to use
PRANK	PRANK is a probabilistic multiple alignment program for DNA, codon and amino-acid sequences.	Genobioinfo Cluster: How to use
PredGPI	Prediction of glycosylphosphatidylinositol-anchors in proteins based on HMMs and SVMs.	Genobioinfo Cluster: How to use
PredictHaplo	This software aims at reconstructing haplotypes from next-generation sequencing data.	Genobioinfo Cluster: How to use
preseq	Software for predicting library complexity and genome coverage in high-throughput sequencing.	Genobioinfo Cluster: How to use
PretextMap	Paired REad TEXTure Mapper. Converts SAM formatted read pairs into genome contact maps. Full suite of Pretext tools.	Genobioinfo Cluster: How to use
PretextView	OpenGL Powered Pretext Contact Map Viewer.	Genobioinfo Cluster: How to use
Primer3	Primer3 is a widely used program for designing PCR primers (PCR = "Polymerase Chain Reaction").	Genobioinfo Cluster: How to use
PRINSEQ	PRINSEQ is a tool that generates summary statistics of sequence and quality data and that is used to filter, reformat and trim next-generation sequence data. The standalone version is primarily designed for data preprocessing and does not generate summary statistics in graphical form.	Genobioinfo Cluster: How to use
PROBCONSRNA	PROBCONS is a tool for generating multiple alignments of protein sequences.	Genobioinfo Cluster: Ask for Install
Prodigal	Prodigal (Prokaryotic Dynamic Programming Genefinding Algorithm) is a microbial (bacterial and archaeal) gene finding program developed at Oak Ridge National Laboratory and the University of Tennessee.	Genobioinfo Cluster: How to use
ProgressiveCactus	Progressive Cactus is a whole-genome alignment package.	Genobioinfo Cluster: Ask for Install
PROJ4	Cartographic Projections Library	Genobioinfo Cluster: Ask for Install
PROKKA	Prokka is a software tool for the rapid annotation of prokaryotic genomes. A typical 4 Mbp genome can be fully annotated in less than 10 minutes on a quad-core computer, and scales well to 32 core SMP systems. It produces GFF3, GBK and SQN files that are ready for editing in Sequin and ultimately for submission to GenBank/DDBJ/ENA.	Genobioinfo Cluster: How to use
PromoTech	Promotech is a machine-learning-based classifier trained to generate a model that generalizes and detects promoters in a wide range of bacterial species.	Genobioinfo Cluster: How to use
ProphET	ProphET, Prophage Estimation Tool: a standalone prophage sequence prediction tool with self-updating reference database.	Genobioinfo Cluster: Ask for Install
Prost	Prost! (PRocessing Of Short Transcripts) can analyze smallRNA sequencing data generated on any sequencing platform. Prost! does not rely on existing annotation to filter sequencing reads but instead starts by aligning all the reads on a user-provided genomic reference, allowing the study of miRNAs in any species. Additionally, any number of samples can be studied together in a single Prost! run, allowing an accurate analysis of an entire dataset. After grouping the processed reads by genomic location, Prost! then annotates them using a user-defined annotation database (public or personal annotation database).	Genobioinfo Cluster: How to use
Proteinortho	Proteinortho is a tool to detect orthologous genes within different species.	Genobioinfo Cluster: How to use
proteinortho_curves	Draw pan- and core-genome curves from proteinortho output	Genobioinfo Cluster: How to use
ProtHint	ProtHint is a pipeline for predicting and scoring hints (in the form of introns, start and stop codons) in the genome of interest by mapping and spliced aligning predicted genes to a database of reference protein sequences.	Genobioinfo Cluster: How to use
ProtTrans	ProtTrans is providing state of the art pretrained language models for proteins.	Genobioinfo Cluster: How to use
PROVEAN	PROVEAN (Protein Variation Effect Analyzer) is a software tool which predicts whether an amino acid substitution or indel has an impact on the biological function of a protein.	Genobioinfo Cluster: How to use
PSAURON	PSAURON is a machine learning model for rapid assessment of protein coding gene annotation.	Genobioinfo Cluster: How to use
Pseudofinder	Detection of pseudogene candidates in bacterial and archaeal genomes.	Genobioinfo Cluster: Ask for Install
PSGInfer	PSGInfer is a software package for the analysis of RNA-Seq data with probabilistic splice graph (PSG) models of gene alternative processing (splicing, transcription initiation, and polyadenylation)	Genobioinfo Cluster: Ask for Install
PSI-Sigma	Percent Spliced-In (PSI) values are commonly used to report alternative pre-mRNA splicing (AS) changes. However, previous PSI-detection methods are limited to specific types of AS events. PSI-Sigma uses a new splicing index (PSIΣ) that is more flexible, can incorporate novel junctions, and can compute PSI values of individual exons in complex splicing events.	Genobioinfo Cluster: Ask for Install
psmc	This software package infers population size history from a diploid sequence using the Pairwise Sequentially Markovian Coalescent (PSMC) model.	Genobioinfo Cluster: How to use
Puffaligner	Puffaligner is a fast, sensitive and accurate aligner built on top of the Pufferfish index.	Genobioinfo Cluster: Ask for Install
Purge_Dups	purge_dups is designed to remove haplotigs and contig overlaps in a de novo assembly based on read depth.	Genobioinfo Cluster: How to use
Purge_Haplotigs	Pipeline to help with curating heterozygous diploid genome assemblies (for instance when assembling using FALCON or FALCON-unzip).	Genobioinfo Cluster: Ask for Install
pybam	Very simple, pure python, BAM file reader. If you do not need to use BAM indexes, pybam is probably the fastest and simplest BAM parser out there, particularly if run under PyPy.	Genobioinfo Cluster: How to use
PyCharm	PyCharm is a dedicated Python and Django IDE providing a wide range of essential tools for Python developers, tightly integrated together to create a convenient environment for productive Python development and Web development.	Genobioinfo Cluster: Ask for Install
Pychopper	Pychopper v2 is a tool to identify, orient and trim full-length Nanopore cDNA reads. The tool is also able to rescue fused reads.	Genobioinfo Cluster: How to use
pycoQC	pycoQC computes metrics and generates interactive QC plots from the sequencing summary report generated by Oxford Nanopore Technologies basecallers (Albacore/Guppy).	Genobioinfo Cluster: How to use
Pydub	Manipulate audio with a simple and easy high level interface.	Genobioinfo Cluster: How to use
pyMLST	Python Mlst Local Search Tool.	Genobioinfo Cluster: How to use
pypolca	pypolca is a standalone Python re-implementation of the POLCA polisher from the MaSuRCA genome assembly and analysis toolkit.	Genobioinfo Cluster: How to use
PyPy	A fast, compliant alternative implementation of Python.	Genobioinfo Cluster: How to use
pyrho	Fast demography-aware inference of fine-scale recombination rates based on fused-LASSO.	Genobioinfo Cluster: How to use
PyroCleaner	PyroCleaner is intended to clean reads coming from pyrosequencing in order to ease the assembly process.	Genobioinfo Cluster: Ask for Install
pysam	Pysam is a python module for reading and manipulating Samfiles. It's a lightweight wrapper of the samtools C-API.	Genobioinfo Cluster: Ask for Install
pysamstats	A Python utility for calculating statistics against genome positions based on sequence alignments from a SAM or BAM file.	Genobioinfo Cluster: How to use
pyseer	Sequence Element Enrichment Analysis (SEER), python implementation. Pyseer uses linear models with fixed or mixed effects to estimate the effect of genetic variation in a bacterial population on a phenotype of interest, while accounting for potentially very strong confounding population structure. This allows for genome-wide association studies (GWAS) to be performed in clonal organisms such as bacteria and viruses.	Genobioinfo Cluster: How to use
PySlurm	This module provides a low-level Python wrapper around the Slurm C-API using Cython.	Genobioinfo Cluster: Ask for Install
qgrs-cpp	C++ implementation of QGRS mapping algorithm (QGRS Mapper is a software program that generates information on composition and distribution of putative Quadruplex forming G-Rich Sequences (QGRS) in nucleotide sequences.)	Genobioinfo Cluster: Ask for Install
QIIME	QIIME (pronounced "chime") stands for Quantitative Insights Into Microbial Ecology. QIIME is an open source software package for comparison and analysis of microbial communities, primarily based on high-throughput amplicon sequencing data (such as SSU rRNA) generated on a variety of platforms, but also supporting analysis of other types of data (such as shotgun metagenomic data). QIIME takes users from their raw sequencing output through initial analyses such as OTU picking, taxonomic assignment, and construction of phylogenetic trees from representative sequences of OTUs, and through downstream statistical analysis, visualization, and production of publication-quality graphics. QIIME has been applied to single studies based on billions of sequences from thousands of samples.	Genobioinfo Cluster: How to use
QmRLFS-finder	QmRLFS-finder is the first R-loop finding tool that uses (unsupervised) QmRLFS (Quantitative Models of RLFS) models to predict RLFSs. This command line tool generates locations and detailed information of RLFSs as well as standards-compliant output files for further analysis and visualization.	Genobioinfo Cluster: How to use
qpWrapper	Tools for launching qpAdm analyses (Admixtools) in series on a list of individuals.	Genobioinfo Cluster: Ask for Install
Quake	Quake is a package to correct substitution sequencing errors in experiments with deep coverage (e.g. >15X), specifically intended for Illumina sequencing reads. Quake adopts the k-mer error correction framework, first introduced by the EULER genome assembly package. Unlike EULER and similar programs, Quake utilizes a robust mixture model of erroneous and genuine k-mer distributions to determine where errors are located. Then Quake uses read quality values and learns the nucleotide to nucleotide error rates to determine what types of errors are most likely. This leads to more corrections and greater accuracy, especially with respect to avoiding mis-corrections, which create false sequence dissimilar to anything in the original genome sequence from which the read was taken.	Genobioinfo Cluster: How to use
Qualimap	Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.	Genobioinfo Cluster: How to use
quant3p	A set of scripts for 3' RNA-seq quantification.	Genobioinfo Cluster: How to use
quarTeT	quarTeT is a collection of tools for T2T genome assembly and basic analysis in automatic workflow.	Genobioinfo Cluster: How to use
Quarto	Quarto is a publishing system that compiles Markdown documents to HTML, PDF, and many other formats. It is built on Pandoc.	Genobioinfo Cluster: How to use
QUAST	QUAST evaluates genome assemblies by computing various metrics	Genobioinfo Cluster: How to use
quickLD	High-performance Computation of Linkage Disequilibrium on CPUs and GPUs.	Genobioinfo Cluster: How to use
quickmerge	A simple and fast metassembler and assembly gap filler designed for long molecule based assemblies	Genobioinfo Cluster: Ask for Install
Quorum	QuorUM (Quality Optimized Reads from the University of Maryland) is an error corrector for Illumina reads.	Genobioinfo Cluster: Ask for Install
R	R is "GNU S", a freely available language and environment for statistical computing and graphics which provides a wide variety of statistical and graphical techniques: linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, etc.	Genobioinfo Cluster: How to use
R-scape	R-scape looks for evidence of a conserved RNA structure by measuring pairwise covariations observed in an input multiple sequence alignment. It analyzes all possible pairs, including those in your proposed structure (if you provide one).	Genobioinfo Cluster: Ask for Install
r8s	This package implements several methods to infer divergence times on a molecular phylogeny, using penalized likelihood, maximum likelihood and nonparametric rate smoothing methods. It also implements miscellaneous tree and character evolution models and tests.	Genobioinfo Cluster: Ask for Install
Ra	Ra is a fast and easy to use assembler for raw reads generated by third generation sequencing.	Genobioinfo Cluster: Ask for Install
Rebaler	Rebaler is a program for conducting reference-based assemblies using long reads. It relies mainly on minimap2 for alignment and Racon for making consensus sequences.	Genobioinfo Cluster: Ask for Install
RabbitUniq	Compute unique k-mer faster.	Genobioinfo Cluster: Ask for Install
RabbitV	RabbitV is a highly optimized and practical toolkit for the detection of viruses and microorganisms in sequencing data.	Genobioinfo Cluster: Ask for Install
Racon	Consensus module for raw de novo DNA assembly of long uncorrected reads.	Genobioinfo Cluster: How to use
RADIS	Analysis of RAD-seq data for InterSpecific phylogeny	Genobioinfo Cluster: Ask for Install
radsex	Find sex signal in RAD-Sequencing data.	Genobioinfo Cluster: How to use
RAFT	RAFT (Repeat Aware Fragmentation Tool) is an algorithm designed to improve assembly quality by rescuing contained reads. RAFT breaks long reads into smaller sub-reads by following an algorithm described in the authors' preprint. The read fragmentation allows an OLC assembler to retain contained reads during string graph construction. When input reads have non-uniform lengths, retaining contained reads improves assembly contiguity and base-level accuracy. The inputs to RAFT include an error-corrected read file in FASTA/FASTQ format and an all-vs-all alignment file in PAF format. It performs read fragmentation and outputs the fragmented reads in FASTA format. The authors recommend using hifiasm for the initial steps (read error correction, all-vs-all overlap computation) and also for the final step (assembly of fragmented reads). The RAFT-hifiasm workflow is recommended for long accurate reads with non-uniform length distribution (e.g., ONT Duplex, or a mixture of ONT Duplex and HiFi reads). ONT UL reads can optionally be integrated during the final assembly step.	Genobioinfo Cluster: How to use
RaGOO	A tool to order and orient genome assembly contigs via Minimap2 alignments to a reference genome.	Genobioinfo Cluster: Ask for Install
Ragout	Ragout (Reference-Assisted Genome Ordering UTility) is a tool for chromosome assembly using multiple references.	Genobioinfo Cluster: Ask for Install
RagTag	RagTag, the successor to RaGOO, is a command line tool for reference-guided genome assembly improvement.	Genobioinfo Cluster: How to use
RAiSD	RAiSD (Raised Accuracy in Sweep Detection) is a stand-alone software implementation of the μ statistic for selective sweep detection.	Genobioinfo Cluster: How to use
RAMPART	RAMPART is a configurable pipeline for de novo assembly of DNA sequence data. RAMPART is not a de novo assembler.	Genobioinfo Cluster: Ask for Install
Ranbow	Ranbow is a haplotype assembler for polyploid genomes.	Genobioinfo Cluster: Ask for Install
randfold	The software computes the probability that, for a given RNA sequence, the Minimum Free Energy (MFE) of the secondary structure is different from a distribution of MFE computed with random sequences.	Genobioinfo Cluster: How to use
rastair	Fast processing of TET-assisted pyridine borane sequencing (TAPS)-based sequencing data.	Genobioinfo Cluster: How to use
rasusa	Randomly subsample sequencing reads or alignments.	Genobioinfo Cluster: How to use
Ratatosk	Ratatosk is a phased error correction tool for erroneous long reads based on compacted and colored de Bruijn graphs built from accurate short reads.	Genobioinfo Cluster: Ask for Install
RATT	RATT is software to transfer annotation from a reference (annotated) genome to an unannotated query genome.	Genobioinfo Cluster: Ask for Install
RAxML	RAxML (Randomized Axelerated Maximum Likelihood) is a program for sequential and parallel Maximum Likelihood based inference of large phylogenetic trees. It can also be used for post-analyses of sets of phylogenetic trees, analyses of alignments, and evolutionary placement of short reads.	Genobioinfo Cluster: How to use
RAxML-NG	RAxML-NG is a phylogenetic tree inference tool which uses the maximum-likelihood (ML) optimality criterion.	Genobioinfo Cluster: How to use
raxtax	raxtax is a fast and efficient k-mer-based non-Bayesian taxonomic classifier for barcoding DNA sequences.	Genobioinfo Cluster: How to use
Ray	Assemble genomes in parallel using the message-passing interface	Genobioinfo Cluster: Ask for Install
RBCeq2	RBCeq2 reads in genomic variant data in the form of variant call files (VCF) and outputs blood group (BG) genotype and phenotype inference.	Genobioinfo Cluster: How to use
RDKit	The RDKit is a collection of cheminformatics and machine-learning software written in C++ and Python.	Genobioinfo Cluster: How to use
RDP Classifier	The RDP Classifier is a naive Bayesian classifier that can rapidly and accurately provide taxonomic assignments from domain to genus, with confidence estimates for each assignment.	Genobioinfo Cluster: Ask for Install
RDPTools	Collection of commonly used RDP Tools for easy building	Genobioinfo Cluster: How to use
RdRpCATCH	A community effort to create a shared resource for HMM-based RdRp discovery	Genobioinfo Cluster: How to use
RDXplorer	The RDXplorer (Read Depth eXplorer) is a computational tool for copy number variants (CNV) detection in whole human genome sequence data using read depth (RD) coverage.	Genobioinfo Cluster: How to use
READ	READ is a method to infer the degree of relationship (up to second degree, i.e. nephew/niece-uncle/aunt, grandparent-grandchild or half-siblings) for a pair of low-coverage individuals.	Genobioinfo Cluster: How to use
read2tree	read2tree is a software tool that produces alignment matrices for tree inference.	Genobioinfo Cluster: How to use
READv2	Relationship Estimation from Ancient DNA version 2.	Genobioinfo Cluster: How to use
REAPR	REAPR is a tool that evaluates the accuracy of a genome assembly using mapped paired end reads, without the use of a reference genome for comparison. It can be used in any stage of an assembly pipeline to automatically break incorrect scaffolds and flag other errors in an assembly for manual inspection. It reports mis-assemblies and other warnings, and produces a new broken assembly based on the error calls.	Genobioinfo Cluster: Ask for Install
Recentrifuge	Robust comparative analysis and contamination removal for metagenomics	Genobioinfo Cluster: How to use
RECON	A package for automated de novo identification of repeat families from genomic sequence.	Genobioinfo Cluster: How to use
REDItools	REDItools are Python scripts developed to study RNA editing at genomic scale using next generation sequencing data.	Genobioinfo Cluster: Ask for Install
Redundans	Redundans is a pipeline that assists an assembly of heterozygous/polymorphic genomes.	Genobioinfo Cluster: Ask for Install
REFMAKER	REFMAKER is a command-line and user-friendly pipeline providing different tools to create nuclear references from genomic assemblies of shotgun libraries.	Genobioinfo Cluster: How to use
regenie	regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.	Genobioinfo Cluster: How to use
RegTools	RegTools is a set of tools that integrate DNA-seq and RNA-seq data to help interpret mutations in a regulatory and splicing context.	Genobioinfo Cluster: Ask for Install
ReLERNN	Recombination Landscape Estimation using Recurrent Neural Networks	Genobioinfo Cluster: How to use
RepAHR	RepAHR is used to identify repeats (repetitive sequences) in genomes using Next-Generation Sequencing reads.	Genobioinfo Cluster: Ask for Install
REPdenovo	REPdenovo is designed for constructing repeats directly from sequence (paired-end) reads. It is based on the idea of frequent k-mer assembly. REPdenovo provides many functionalities, and can generate much longer repeats than existing tools. Internally, REPdenovo uses Jellyfish for k-mer counting, Velvet for assembly, and bwa to map reads on the Transposable Elements.	Genobioinfo Cluster: Ask for Install
RepeatExplorer	RepeatExplorer is a computational pipeline designed to identify and characterize repetitive DNA elements in next-generation sequencing data from plant and animal genomes.	Genobioinfo Cluster: Ask for Install
RepeatMasker	RepeatMasker is a program that screens DNA sequences for interspersed repeats (thanks to RepBase repeats databanks specially formatted) and low complexity DNA sequences.	Genobioinfo Cluster: How to use
RepeatModeler	RepeatModeler is a de-novo repeat family identification and modeling package.	Genobioinfo Cluster: How to use
RepeatScout	RepeatScout is a tool to discover repetitive substrings in DNA.	Genobioinfo Cluster: How to use
RepEnrich2	RepEnrich2 is an updated method to estimate repetitive element enrichment using high-throughput sequencing data.	Genobioinfo Cluster: Ask for Install
REPET	The REPET package (Flutre et al., 2011) integrates bioinformatics programs in order to tackle biological issues at the genomic scale.	Genobioinfo Cluster: How to use
RetroScan	RetroScan is an easy-to-use tool for retrocopy identification that integrates a series of bioinformatics tools (LAST, BEDtools, ClustalW2, KaKs_Calculator, HISAT2, StringTie, SAMtools and Shiny) and scripts.	Genobioinfo Cluster: How to use
RetroSeq	RetroSeq is a bioinformatics tool that searches for mobile element insertions from aligned reads in a BAM file and a library of reference transposable elements.	Genobioinfo Cluster: Ask for Install
RevBayes	RevBayes provides an interactive environment for statistical computation in phylogenetics. It is primarily intended for modeling, simulation, and Bayesian inference in evolutionary biology, particularly phylogenetics. However, the environment is quite general and can be useful for many complex modeling tasks.	Genobioinfo Cluster: Ask for Install
REViewer	A tool for visualizing alignments of reads in regions containing tandem repeats	Genobioinfo Cluster: How to use
RFMIX	A discriminative method for local ancestry inference	Genobioinfo Cluster: How to use
RFMix-reader	rfmix-reader is a Python package designed to efficiently read and process output files generated by RFMix, a popular tool for estimating local ancestry in admixed populations. It employs a lazy loading approach to minimize memory usage, and leverages GPU acceleration for major speedups when available.	Genobioinfo Cluster: How to use
Rfold	Rfold computes local base pairing probabilities for long DNA sequences.	Genobioinfo Cluster: Ask for Install
RFPlasmid	RFPlasmid predicts plasmid contigs from assemblies using single copy marker genes, plasmid genes, and kmers.	Genobioinfo Cluster: How to use
RGI	Software to predict resistomes from protein or nucleotide data, including metagenomics data, based on homology and SNP models.	Genobioinfo Cluster: How to use
RiboTaper	RiboTaper is a new analysis pipeline for Ribosome Profiling (Ribo-seq) experiments, which exploits the triplet periodicity of ribosomal footprints to call translated regions.	Genobioinfo Cluster: Ask for Install
ripgrep	ripgrep is a line-oriented search tool that recursively searches the current directory for a regex pattern.	Genobioinfo Cluster: How to use
rMATS	MATS is a computational tool to detect differential alternative splicing events from RNA-Seq data. From the RNA-Seq data, MATS can automatically detect and analyze alternative splicing events corresponding to all major types of alternative splicing patterns.	Genobioinfo Cluster: Ask for Install
rMATS turbo	rMATS turbo is the C/Cython version of rMATS (refer to http://rnaseq-mats.sourceforge.net): Multivariate Analysis of Transcript Splicing (MATS). The major difference between rMATS turbo and rMATS is speed and space usage.	Genobioinfo Cluster: How to use
RMBlast	RMBlast is a RepeatMasker compatible version of the standard NCBI BLAST suite. The primary difference between this distribution and the NCBI distribution is the addition of a new program "rmblastn" for use with RepeatMasker and RepeatModeler. RMBlast supports RepeatMasker searches by adding a few necessary features to the stock NCBI blastn program. These include support for custom matrices (without KA-Statistics), support for cross_match-like complexity adjusted scoring (cross_match is Phil Green's seeded Smith-Waterman search algorithm), and support for cross_match-like masklevel filtering.	Genobioinfo Cluster: How to use
RNAclust	RNAclust is a perl script summarizing all the single steps required for clustering of structured RNA motifs, i.e. identifying groups of RNA sequences sharing a secondary structure motif.	Genobioinfo Cluster: Ask for Install
RNAmmer	RNAmmer predicts 5s/8s, 16s/18s, and 23s/28s ribosomal RNA in full genome sequences. The program uses hidden Markov models trained on data from the 5S ribosomal RNA database and the European ribosomal RNA database project.	Genobioinfo Cluster: How to use
RNAscClust	RNAscClust is a pipeline to cluster a set of structured RNAs taking their respective structural conservation into account. The aim of RNAscClust is to aid the discovery of families and classes of ncRNAs.	Genobioinfo Cluster: Ask for Install
RNAz	RNAz detects stable and conserved RNA secondary structures in multiple sequence alignments.	Genobioinfo Cluster: Ask for Install
Roary	Roary is a high speed stand alone pan genome pipeline, which takes annotated assemblies in GFF3 format (produced by Prokka (Seemann, 2014)) and calculates the pan genome.	Genobioinfo Cluster: How to use
RogueNaRok	A versatile and scalable algorithm for rogue taxon identification.	Genobioinfo Cluster: Ask for Install
ROHan	ROHan is a Bayesian framework to estimate local rates of heterozygosity, infer runs of homozygosity (ROH) and compute global rates of heterozygosity.	Genobioinfo Cluster: How to use
RopeBWT2	RopeBWT2 is a tool for constructing the FM-index for a collection of DNA sequences.	Genobioinfo Cluster: How to use
rosbags	Rosbags is the pure python library for everything rosbag.	Genobioinfo Cluster: How to use
ROSE	To create stitched enhancers, and to separate super-enhancers from typical enhancers using sequencing data (.bam) given a file of previously identified constituent enhancers (.gff)	Genobioinfo Cluster: How to use
RSEG	The RSEG software package is aimed to analyze ChIP-Seq data, especially for identifying genomic regions and their boundaries marked by diffusive histone modification markers.	Genobioinfo Cluster: Ask for Install
RSEM	RSEM (RNA-Seq by Expectation-Maximization) is a software package for estimating gene and isoform expression levels from RNA-Seq data.	Genobioinfo Cluster: How to use
RSeQC	The RSeQC package provides a number of useful modules that can comprehensively evaluate high-throughput sequence data, especially RNA-seq data.	Genobioinfo Cluster: How to use
RTGTools	RTG Tools is a subset of RTG Core that includes several useful utilities for dealing with VCF files and sequence data. Probably the most interesting is the `vcfeval` command which performs sophisticated comparison of VCF files.	Genobioinfo Cluster: How to use
Ruby	A dynamic, open source programming language.	Genobioinfo Cluster: How to use
rush	A cross-platform command-line tool for executing jobs in parallel.	Genobioinfo Cluster: How to use
rust-mdbg	rust-mdbg is an ultra-fast minimizer-space de Bruijn graph (mdBG) implementation, geared towards the assembly of long and accurate reads such as PacBio HiFi.	Genobioinfo Cluster: How to use
S3V2_IDEAS_ESMP	A package for normalizing, denoising and integrating epigenomic datasets across different cell types.	Genobioinfo Cluster: Ask for Install
Sabre	A barcode demultiplexing and trimming tool for FastQ files.	Genobioinfo Cluster: Ask for Install
Salmon	Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using lightweight alignments	Genobioinfo Cluster: How to use
SALSA	A tool to scaffold long read assemblies with Hi-C data	Genobioinfo Cluster: How to use
sambamba	Sambamba is a high performance modern robust and fast tool (and library), written in the D programming language, for working with SAM and BAM files. Current functionality is an important subset of samtools functionality, including view, index, sort, markdup, and depth.	Genobioinfo Cluster: How to use
samblaster	samblaster is a fast and flexible program for marking duplicates in read-id grouped paired-end SAM files. It can also optionally output discordant read pairs and/or split read mappings to separate SAM files, and/or unmapped/clipped reads to a separate FASTQ file.	Genobioinfo Cluster: How to use
samclip	Filter SAM file for soft and hard clipped alignments	Genobioinfo Cluster: Ask for Install
samtools	SAM (Sequence Alignment/Map). SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format.	Genobioinfo Cluster: How to use
SARTools	SARTools is an R package dedicated to the differential analysis of RNA-seq data. It provides tools to generate descriptive and diagnostic graphs, to run the differential analysis with one of the well-known DESeq2 or edgeR packages and to export the results into easily readable tab-delimited files.	Genobioinfo Cluster: in R-4.5.0
Satsuma	Highly sensitive whole-genome synteny alignments.	Genobioinfo Cluster: Ask for Install
Saturn	A tool for assessing the library saturation without any reference genome.	Genobioinfo Cluster: Ask for Install
sbt	sbt is a build tool for Scala, Java, and more.	Genobioinfo Cluster: Ask for Install
Scaff10X	Pipeline for scaffolding and breaking a genome assembly using 10x genomics linked-reads	Genobioinfo Cluster: Ask for Install
Scan For Matches	scan_for_matches is a utility written in C for locating patterns in DNA or protein FASTA files.	Genobioinfo Cluster: Ask for Install
Scanpy	Scanpy is a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata. It includes preprocessing, visualization, clustering, trajectory inference and differential expression testing. The Python-based implementation efficiently deals with datasets of more than one million cells.	Genobioinfo Cluster: How to use
SCCmecFinder	SCCmecFinder identifies SCCmec elements in sequenced S. aureus isolates.	Genobioinfo Cluster: How to use
schmutzi	Bayesian maximum a posteriori contamination estimate for ancient samples.	Genobioinfo Cluster: How to use
scipy	SciPy (pronounced "Sigh Pie") is open-source software for mathematics, science, and engineering. The SciPy library depends on Numpy, which provides convenient and fast N-dimensional array manipulation.	Genobioinfo Cluster: In Python modules
Scoary	Scoary is designed to take the gene_presence_absence.csv file from Roary as well as a traits file created by the user and calculate the associations between all genes in the accessory genome and the traits.	Genobioinfo Cluster: How to use
scrm	A coalescent simulator for genome-scale sequences.	Genobioinfo Cluster: Ask for Install
SDA	Segmental Duplication Assembler	Genobioinfo Cluster: Ask for Install
SDpop	SDpop infers sex-linkage from genotyping data of several individuals of both sexes, collected in panmictic populations.	Genobioinfo Cluster: How to use
SEACells	Single-cEll Aggregation for High Resolution Cell States. SEACells algorithm for Inference of transcriptional and epigenomic cellular states from single-cell genomics data.	Genobioinfo Cluster: How to use
SEDEF	SEDEF is a quick tool to find all segmental duplications in the genome.	Genobioinfo Cluster: Ask for Install
segemehl	segemehl is a software to map short sequencer reads to reference genomes. segemehl implements a matching strategy based on enhanced suffix arrays (ESA).	Genobioinfo Cluster: Ask for Install
SELFISH	SELFISH is a tool for finding differential chromatin interactions between two Hi-C contact maps.	Genobioinfo Cluster: How to use
SelNeTime	The selnetime python package implements methods for statistical analysis of genetic data collected for a same population at different times.	Genobioinfo Cluster: How to use
selscan	A program to calculate EHH-based scans for positive selection in genomes.	Genobioinfo Cluster: How to use
Seq	Seq is a programming language for computational genomics and bioinformatics. With a Python-compatible syntax and a host of domain-specific features and optimizations, Seq makes writing high-performance genomics software as easy as writing Python code, and achieves performance comparable to (and in many cases better than) C/C++.	Genobioinfo Cluster: How to use
Seq-Gen	Seq-Gen is a program that will simulate the evolution of nucleotide or amino acid sequences along a phylogeny, using common models of the substitution process.	Genobioinfo Cluster: Ask for Install
SeqAn	SeqAn is an open source C++ library of efficient algorithms and data structures for the analysis of sequences with the focus on biological data.	Genobioinfo Cluster: How to use
seqclean	SeqClean is a tool for validation and trimming of DNA sequences from a flat file database (FASTA format).	Genobioinfo Cluster: How to use
seqfilter	Filter fasta/fastq(.gz) files by ID and/or sequence length	Genobioinfo Cluster: Ask for Install
SeqFu	A general-purpose program to manipulate and parse information from FASTA/FASTQ files, supporting gzipped input files. Includes functions to interleave and de-interleave FASTQ files, to rename sequences and to count and print statistics on sequence lengths.	Genobioinfo Cluster: How to use
SeqKit	A cross-platform and ultrafast toolkit for FASTA/Q file manipulation. Common manipulations of FASTA/Q file include converting, searching, filtering, deduplication, splitting, shuffling, and sampling.	Genobioinfo Cluster: How to use
Seqtk	Toolkit for processing sequences in FASTA/Q formats	Genobioinfo Cluster: How to use
SequenceTools	Tools for population genetics on sequencing data	Genobioinfo Cluster: How to use
seqwish	seqwish implements a lossless conversion from pairwise alignments between sequences to a variation graph encoding the sequences and their alignments.	Genobioinfo Cluster: How to use
SGA	SGA is a de novo genome assembler based on the concept of string graphs. The major goal of SGA is to be very memory efficient, which is achieved by using a compressed representation of DNA sequence reads.	Genobioinfo Cluster: Ask for Install
SGSGeneLoss	Gene presence/absence variation discovery.	Genobioinfo Cluster: Ask for Install
SHAPEIT	SHAPEIT is a fast and accurate method for estimation of haplotypes (aka phasing) from genotype or sequencing data.	Genobioinfo Cluster: How to use
Shasta	The goal of the Shasta long read assembler is to rapidly produce accurate assembled sequence using as input DNA reads generated by Oxford Nanopore flow cells.	Genobioinfo Cluster: How to use
Shennong	A Python toolbox for speech features extraction.	Genobioinfo Cluster: How to use
SHERPAS	A new, alignment-free genome recombination detection tool exploiting the idea of phylo-kmers (originally developed in RAPPAS, Linard et al. 2019) to accelerate the process by several orders of magnitude while keeping comparable accuracy.	Genobioinfo Cluster: How to use
shiver	shiver is a tool for mapping paired-end short reads to a custom reference sequence constructed using de novo assembled contigs, in order to minimise the biased loss of information that occurs from mapping to a reference that differs from the sample.	Genobioinfo Cluster: Ask for Install
ShortStack	ShortStack is a tool developed to process and analyze smallRNA-seq data with respect to a reference genome, and output a comprehensive and informative annotation of all discovered small RNA genes.	Genobioinfo Cluster: How to use
SHRiMP	SHRiMP is a software package for aligning genomic reads against a target genome. It was primarily developed with the multitudinous short reads of next generation sequencing machines in mind, as well as Applied Biosystems' colourspace genomic representation.	Genobioinfo Cluster: Ask for Install
SibeliaZ	SibeliaZ is a whole-genome alignment and locally-collinear blocks construction pipeline. The block coordinates are output in GFF format and the alignment is in MAF.	Genobioinfo Cluster: How to use
SICER2	Redesigned and improved version of the original ChIP-seq broad peak calling tool SICER.	Genobioinfo Cluster: How to use
sickle	Sickle is a tool that uses sliding windows along with quality and length thresholds to determine when quality is sufficiently low to trim the 3'-end of reads and also determines when the quality is sufficiently high to trim the 5'-end of reads.	Genobioinfo Cluster: How to use
SIFT4G	Sorting Intolerant From Tolerant For Genomes.	Genobioinfo Cluster: How to use
SIFT4G_Annotator	Annotating VCF files using the SIFT4G databases.	Genobioinfo Cluster: Ask for Install
SignalP	SignalP 4.0 server predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes.	Genobioinfo Cluster: How to use
Silix	The software package SiLiX implements an ultra-efficient algorithm for the clustering of homologous sequences, based on single transitive links (single linkage) with alignment coverage constraints.	Genobioinfo Cluster: How to use
Simka	Simka is a de novo comparative metagenomics tool. Simka represents each dataset as a k-mer spectrum and computes several classical ecological distances between them.	Genobioinfo Cluster: How to use
simuPOP	simuPOP is a general-purpose individual-based forward-time population genetics simulation environment.	Genobioinfo Cluster: Ask for Install
singlem	SingleM is a tool to find the abundances of discrete operational taxonomic units (OTUs) directly from shotgun metagenome data, without heavy reliance on reference sequence databases. It is able to differentiate closely related species even if those species are from lineages new to science.	Genobioinfo Cluster: Ask for Install
Singularity	Singularity enables users to have full control of their environment. Singularity containers can be used to package entire scientific workflows, software and libraries, and even data.	Genobioinfo Cluster: How to use
sleuth	sleuth is a program for analysis of RNA-Seq experiments for which transcript abundances have been quantified with kallisto.	Genobioinfo Cluster: Ask for Install
SLICEMBLER	SLICEMBLER is a meta-assembler designed for ultra-deep sequencing data.	Genobioinfo Cluster: Ask for Install
SLiM	SLiM is an evolutionary simulation framework that combines a powerful engine for population genetic simulations with the capability of modeling arbitrarily complex evolutionary scenarios.	Genobioinfo Cluster: How to use
SLR	SLR is a program to detect sites in coding DNA that are unusually conserved and/or unusually variable (that is, evolving under purifying or positive selection) by analysing the pattern of changes for an alignment of sequences on an evolutionary tree.	Genobioinfo Cluster: How to use
SLR-superscaffolder	This is a scaffold assembler designed for stLFR reads. It uses the link-reads information from stLFR reads to assemble contigs to scaffolds.	Genobioinfo Cluster: Ask for Install
SMALT	SMALT aligns DNA sequencing reads with a reference genome. Reads from a wide range of sequencing platforms can be processed, for example Illumina, Roche-454, Ion Torrent, PacBio or ABI-Sanger. Paired reads are supported. There is no support for SOLiD reads.	Genobioinfo Cluster: Ask for Install
smartdenovo	SMARTdenovo is a de novo assembler for PacBio and Oxford Nanopore (ONT) data. It produces an assembly from all-vs-all raw read alignments without an error correction stage. It also provides tools to generate accurate consensus sequences, though platform-dependent consensus polishing tools (e.g. Quiver for PacBio or Nanopolish for ONT) are still required for higher accuracy.	Genobioinfo Cluster: How to use
SMC++	SMC++ is a program for estimating the size history of populations from whole genome sequence data.	Genobioinfo Cluster: How to use
SMRTLink	SMRT Link is the web-based end-to-end workflow manager for the Sequel™ System (installed in command-line mode on our cluster).	Genobioinfo Cluster: How to use
Smudgeplots	Inference of ploidy and heterozygosity structure using whole genome sequencing data. This tool extracts heterozygous kmer pairs from kmer dump files (from jellyfish or KMC) and performs gymnastics with them. We are able to disentangle genome structure by comparing the sum of kmer pair coverages (CovA + CovB) to their relative coverage (CovA / (CovA + CovB)). Smudgeplots are computed from raw/trimmed reads and show the haplotype structure using heterozygous kmer pairs.	Genobioinfo Cluster: How to use
Snakemake	Snakemake is a workflow management system that aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern specification language in python style.	Genobioinfo Cluster: How to use
snakePipes	Customizable workflows based on snakemake and python for the analysis of NGS data.	Genobioinfo Cluster: Ask for Install
SNAP	Gene prediction tool	Genobioinfo Cluster: How to use
snape-pooled	SNAPE-pooled computes the probability distribution for the frequency of the minor allele in a certain population, at a certain position in the genome.	Genobioinfo Cluster: Ask for Install
Sniffles	A fast structural variant caller for long-read sequencing, Sniffles2 accurately detects SVs at germline, somatic and population level for PacBio and Oxford Nanopore read data.	Genobioinfo Cluster: How to use
Snippy	Rapid haploid variant calling and core genome alignment. Snippy finds SNPs between a haploid reference genome and your NGS sequence reads. It will find both substitutions (snps) and insertions/deletions (indels). It will use as many CPUs as you can give it on a single computer (tested to 64 cores). It is designed with speed in mind, and produces a consistent set of output files in a single folder. It can then take a set of Snippy results using the same reference and generate a core SNP alignment (and ultimately a phylogenomic tree).	Genobioinfo Cluster: How to use
sNMF	A fast and efficient program for estimating individual admixture coefficients based on sparse non-negative matrix factorization and population genetics.	Genobioinfo Cluster: Ask for Install
snoGPS	Search for H/ACA snoRNA genes in a genomic sequence.	Genobioinfo Cluster: How to use
SnoReport	Computational identification of snoRNAs with unknown targets. Detecting novel or orphan snoRNAs in RNA sequence data using sequence and structure information only without relying on target information	Genobioinfo Cluster: Ask for Install
Snoscan	Search for C/D box methylation guide snoRNA genes in a genomic sequence.	Genobioinfo Cluster: How to use
SNP-sites	Rapidly extracts SNPs from a multi-FASTA alignment.	Genobioinfo Cluster: How to use
snpArcher	snpArcher is a reproducible workflow optimized for nonmodel organisms and comparisons across datasets, built on the Snakemake workflow management system, for dataset acquisition, variant calling, quality control, and downstream analysis.	Genobioinfo Cluster: How to use
SnpEff	SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of variants on genes (such as amino acid changes).	Genobioinfo Cluster: How to use
snpflip	Report reverse and ambiguous strand SNPs in GWAS data.	Genobioinfo Cluster: How to use
SNPGenie	SNPGenie is a collection of Perl scripts for estimating πN/πS, dN/dS, and gene diversity from next-generation sequencing (NGS) single-nucleotide polymorphism (SNP) variant data.	Genobioinfo Cluster: Ask for Install
SNPhylo	A pipeline to generate a phylogenetic tree from huge SNP data.	Genobioinfo Cluster: How to use
SNPsplit	SNPsplit is an allele-specific alignment sorter which is designed to read alignment files in SAM/BAM format and determine the allelic origin of reads that cover known SNP positions.	Genobioinfo Cluster: How to use
soap.coverage	Calculates sequencing coverage or physical coverage, as well as duplication rate and details of specific blocks, for each segment and for the whole genome, using SOAP, Blat, Blast, BlastZ, mummer and MAQ alignment results. Multi-threaded; gzip files are supported.	Genobioinfo Cluster: Ask for Install
SOAPdenovo	SOAPdenovo is a novel short-read assembly method that can build a de novo draft assembly for human-sized genomes. The program is specially designed to assemble Illumina GA short reads. It creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost effective way.	Genobioinfo Cluster: Ask for Install
SOAPdenovo-Trans	SOAPdenovo-Trans is a de novo transcriptome assembler based on the SOAPdenovo framework, adapted to alternative splicing and different expression levels among transcripts.	Genobioinfo Cluster: Ask for Install
sonic	Some Organism's Nucleotide Information Container.	Genobioinfo Cluster: Ask for Install
SortaDate	Scripts that you can use at different stages to attempt to find more clock-like genes. Generally, you would use these for dating analyses with another package	Genobioinfo Cluster: Ask for Install
SortMeRNA	SortMeRNA is a software designed to rapidly filter ribosomal RNA fragments from metatranscriptomic data produced by next-generation sequencers. It is capable of handling large RNA databases and sorting out all fragments matching the database with high accuracy and specificity.	Genobioinfo Cluster: How to use
souporcell	souporcell is a method for clustering mixed-genotype scRNAseq experiments by individual.	Genobioinfo Cluster: How to use
sourmash	sourmash is a command-line tool and Python library for computing hash sketches from DNA sequences, comparing them to each other, and plotting the results.	Genobioinfo Cluster: How to use
SpaceRanger	Space Ranger is a set of analysis pipelines that process Visium spatial RNA-seq output and brightfield and fluorescence microscope images in order to detect tissue, align reads, generate feature-spot matrices, perform clustering and gene expression analysis, and place spots in spatial context on the slide image.	Genobioinfo Cluster: How to use
SPAdes	SPAdes (St. Petersburg genome assembler) is intended for both standard isolates and single-cell MDA bacteria assemblies.	Genobioinfo Cluster: How to use
Spaln	Spaln (space-efficient spliced alignment) is a stand-alone program that maps and aligns a set of cDNA or protein sequences onto a whole genomic sequence in a single job.	Genobioinfo Cluster: How to use
spaTyper	Computational method for finding spa types. Staphylococcus aureus is a major human pathogen causing skin and tissue infections, pneumonia, septicemia, and device-associated infections. The emergence of strains resistant to methicillin (MRSA) and other antibacterial agents has become a major concern, especially in the hospital environment, because of the high mortality of the infections caused by these strains. Single locus DNA-sequencing of the repeat region of the Staphylococcus protein A gene (spa) can be used for reliable, accurate and discriminatory typing of MRSA. Repeats are assigned a numerical code and the spa-type is deduced from the order of specific repeats. However, spa-typing was hampered in the past by the lack of a consensus on assignments of new spa-repeats and -types.	Genobioinfo Cluster: Ask for Install
SPECTRE	A collection of Phylogenetics tools for creating and manipulating networks and trees.	Genobioinfo Cluster: Ask for Install
SpeedSeq	A flexible framework for rapid genome analysis and interpretation.	Genobioinfo Cluster: Ask for Install
spliced_bam2gff	A tool to convert spliced BAM alignments into GFF2 format.	Genobioinfo Cluster: Ask for Install
SpliceGrapher	SpliceGrapher predicts alternative splicing patterns and produces splice graphs that capture in a single structure the ways a gene's exons may be assembled. It enhances gene models using evidence from next-generation sequencing and EST alignments.	Genobioinfo Cluster: Ask for Install
SpliceTools	A suite of downstream RNA splicing analysis tools to investigate mechanisms and impact of alternative splicing, Nucleic Acids Research, 2023.	Genobioinfo Cluster: How to use
Spoa	A multiple sequence alignment tool/library that implements the POA (partial order alignment) algorithm using SIMD.	Genobioinfo Cluster: Ask for Install
Sprai	Sprai (single-pass read accuracy improver) is a tool to correct sequencing errors in single-pass reads for de novo assembly.	Genobioinfo Cluster: Ask for Install
spruceup	Tools to discover, visualize, and remove outlier sequences in large multiple sequence alignments.	Genobioinfo Cluster: Ask for Install
squeakr	Squeakr is a k-mer-counting and multiset-representation system using the recently-introduced counting quotient filter (CQF) Pandey et al. (2017), a feature-rich approximate membership query (AMQ) data structure.	Genobioinfo Cluster: Ask for Install
squid	A C library that is bundled with much of the above software. C function library for sequence analysis.	Genobioinfo Cluster: Ask for Install
SquiggleKit	A toolkit for manipulating nanopore signal data.	Genobioinfo Cluster: Ask for Install
SQuIRE	SQuIRE reveals locus-specific regulation of interspersed repeat expression, Nucleic Acids Research	Genobioinfo Cluster: How to use
SRAssembler	SRAssembler (Selective Recursive local Assembler) is a modular pipeline program that can assemble genomic DNA reads into contigs that are homologous to a query DNA or protein sequence.	Genobioinfo Cluster: Ask for Install
SRAToolkit	Toolkit to query Short Reads Archive at NCBI	Genobioinfo Cluster: How to use
srnaMapper	This tool maps reads produced by sRNA-Seq to a genome.	Genobioinfo Cluster: Ask for Install
sslHiC	sslHiC is a computational framework for comparative analyses of Hi-C data, including reproducibility measurement and differential chromatin interaction (DCI) detection.	Genobioinfo Cluster: How to use
SSPACE	SSPACE standard is a stand-alone program for scaffolding pre-assembled contigs using NGS paired-read data.	Genobioinfo Cluster: Ask for Install
SSPACE-LongRead	SSPACE-LongRead is a stand-alone program for scaffolding pre-assembled contigs using long reads (e.g. PacBio RS reads).	Genobioinfo Cluster: Ask for Install
SSU-ALIGN		Genobioinfo Cluster: Ask for Install
Stacks	Stacks is a software suite for analysing RAD Sequencing data by Julian Catchen at the University of Oregon. It will process raw Illumina RAD data or RAD data aligned to a reference genome, and produce genotypes that can be viewed and filtered via a web interface.	Genobioinfo Cluster: How to use
stairway_plot	The stairway plot is a method for inferring detailed population demographic history using the site frequency spectrum (SFS) from DNA sequence data. It does not need a pre-defined population model and can be applied to hundreds of unphased sequences.	Genobioinfo Cluster: How to use
STAR	RNA-seq aligner	Genobioinfo Cluster: How to use
STAR-Fusion	STAR-Fusion further processes the output generated by the STAR aligner to map junction reads and spanning reads to a reference annotation set (using a GTF file, ideally the same annotation file used during the STAR genome index building process during the initial STAR setup).	Genobioinfo Cluster: How to use
STITCH	STITCH is an R program for reference panel free, read aware, low coverage sequencing genotype imputation.	Genobioinfo Cluster: Ask for Install
Strelka	Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs.	Genobioinfo Cluster: How to use
StringTie	StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It uses a novel network flow algorithm as well as an optional de novo assembly step to assemble and quantitate full-length transcripts representing multiple splice variants for each gene locus.	Genobioinfo Cluster: How to use
StrobeAlign	Aligns short reads using dynamic seed size with strobemers.	Genobioinfo Cluster: Ask for Install
Structure	The program structure is a free software package for using multi-locus genotype data to investigate population structure. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed. It can be applied to most of the commonly-used genetic markers, including SNPs, microsatellites, RFLPs and AFLPs.	Genobioinfo Cluster: How to use
Sturgeon	Sturgeon is a CNS neural network classifier for tumour classification	Genobioinfo Cluster: How to use
Subread	A tool kit for processing next-gen sequencing data	Genobioinfo Cluster: How to use
subsampler	Small tool to subsample fasta and fastq files.	Genobioinfo Cluster: Ask for Install
Sumaclust	Fast and exact clustering of sequences.	Genobioinfo Cluster: How to use
Sumatra	Sumatra was developed by the LECA and aims to compute a great deal of sequence similarities in a fast and exact way, based on the length of the Longest Common Subsequence (LCS) between two sequences. Sequence clustering based on similarities is also available through Sumaclust.	Genobioinfo Cluster: How to use
SUPER-FOCUS	A tool for agile functional analysis of metagenomic data.	Genobioinfo Cluster: Ask for Install
SuperCRUNCH	A bioinformatics package for creating, filtering, and manipulating supermatrices and phylogenetic datasets using GenBank and/or local sequence data.	Genobioinfo Cluster: Ask for Install
Supernova	Supernova is a software package for de novo assembly from Chromium Linked-Reads that are made from a single whole-genome library from an individual DNA source.	Genobioinfo Cluster: Ask for Install
superstring	Greedy approximation of the shortest common superstring	Genobioinfo Cluster: How to use
SuperTAD	SuperTAD is an open-source command-line TAD detection package written in C++. It takes either raw or normalized Hi-C contact maps as inputs.	Genobioinfo Cluster: How to use
SUPPA	Fast, accurate, and uncertainty-aware differential splicing analysis across multiple conditions.	Genobioinfo Cluster: How to use
Surfboard	A Python package for modern audio feature extraction.	Genobioinfo Cluster: How to use
SURVIVOR	SURVIVOR is a tool set for simulating/evaluating SVs, merging and comparing SVs within and among samples, and includes various methods to reformat or summarize SVs.	Genobioinfo Cluster: How to use
SvABA	Structural variation and indel detection by local assembly	Genobioinfo Cluster: How to use
SVanalyzer	Tools for the analysis of structural variation in genomes.	Genobioinfo Cluster: How to use
SVDetect	A tool to detect genomic structural variations from paired-end and mate-pair sequencing data.	Genobioinfo Cluster: Ask for Install
SVIM	SVIM is a structural variant caller for long reads. It is able to detect, classify and genotype five different classes of structural variants.	Genobioinfo Cluster: How to use
SVIM-asm	SVIM-asm (pronounced SWIM-assem) is a structural variant caller for haploid or diploid genome-genome alignments.	Genobioinfo Cluster: How to use
svimmer	Merges similar SVs from multiple single sample VCF files. The tool was written for merging SVs discovered using Manta calls, but should support (almost) any SV VCFs.	Genobioinfo Cluster: Ask for Install
SVJedi	SVJedi is a structural variation (SV) genotyper for long read data.	Genobioinfo Cluster: Ask for Install
SVJedi-graph	SVJedi-graph is a structural variation (SV) genotyper for long read data.	Genobioinfo Cluster: How to use
SVMerge	A pipeline to detect structural variants (SVs) by integrating calls from several existing SV callers, which are then validated and the breakpoints refined using local de novo assembly.	Genobioinfo Cluster: How to use
svtyper	Bayesian genotyper for structural variants.	Genobioinfo Cluster: How to use
swarm	A robust and fast clustering method for amplicon-based studies.	Genobioinfo Cluster: How to use
SweeD	A parallel and checkpointable tool that implements a composite likelihood ratio test for detecting selective sweeps. SweeD is based on the SweepFinder algorithm (Nielsen et al. 2005). SweeD can calculate the theoretical SFS of a given demographic model (stepwise changes or with an exponential growth phase + stepwise changes) by using the method by Živković and Stephan (2011).	Genobioinfo Cluster: How to use
sylph	sylph is a program that performs ultrafast (1) ANI querying or (2) metagenomic profiling for metagenomic shotgun samples.	Genobioinfo Cluster: How to use
SYNY	The SYNY pipeline investigates gene collinearity (synteny) between genomes by reconstructing clusters from conserved pairs of protein-coding genes identified from DIAMOND homology searches. It also infers collinearity from pairwise genome alignments with minimap2 or MashMap3.	Genobioinfo Cluster: How to use
SyRI	SyRI is a comprehensive tool for predicting genomic differences between related genomes using whole-genome assemblies (WGA).	Genobioinfo Cluster: How to use
T-Coffee	T-Coffee is a multiple sequence alignment package. You can use T-Coffee to align sequences or to combine the output of your favorite alignment methods (Clustal, Mafft, Probcons, Muscle...) into one unique alignment.	Genobioinfo Cluster: How to use
T-lex	T-lex is a computational pipeline that detects presence and/or absence of annotated individual transposable elements (TEs) using next-generation sequencing (NGS) data.	Genobioinfo Cluster: Ask for Install
T1K	T1K (The ONE genotyper for KIR and HLA) is a computational tool to infer the alleles for polymorphic genes such as KIR and HLA. T1K calculates the allele abundances based on the RNA-seq/WES/WGS read alignments on the provided allele reference sequences. The abundances are used to pick the true alleles for each gene. T1K provides post-analysis steps, including novel SNP detection and single-cell representation. T1K supports both single-end and paired-end sequencing data with any read length.	Genobioinfo Cluster: How to use
tabix	TAB-delimited file IndeXer. Useful for vcfTools.	Genobioinfo Cluster: in bcftools and samtools
TACO	Multi-sample transcriptome assembly from RNA-Seq.	Genobioinfo Cluster: How to use
TACT	Adds tips to a backbone phylogeny using taxonomy simulated with birth-death models	Genobioinfo Cluster: Ask for Install
Tandem Repeats Finder	Tandem Repeats Finder is a program to locate and display tandem repeats in DNA sequences. A tandem repeat in DNA is two or more adjacent, approximate copies of a pattern of nucleotides.	Genobioinfo Cluster: How to use
TAPAS	Tool for Alternative Polyadenylation Site Analysis.	Genobioinfo Cluster: How to use
Tapestry	Tapestry is a tool to validate and edit small eukaryotic genome assemblies using long sequence reads. It is designed to help identify complete chromosomes, symbionts, haplotypes, complex features and errors in close-to-complete genome assemblies.	Genobioinfo Cluster: Ask for Install
TARDIS	Toolkit for automated and rapid discovery of structural variants.	Genobioinfo Cluster: Ask for Install
TargetP	TargetP-2.0 server predicts the presence of N-terminal presequences: signal peptide (SP), mitochondrial transit peptide (mTP), chloroplast transit peptide (cTP) or thylakoid luminal transit peptide (lTP). For the sequences predicted to contain an N-terminal presequence a potential cleavage site is also predicted.	Genobioinfo Cluster: How to use
TASSEL	Trait Analysis by aSSociation, Evolution and Linkage. TASSEL has multiple functions, including association study, evaluating evolutionary relationships, analysis of linkage disequilibrium, principal component analysis, cluster analysis, missing data imputation and data visualization for large sets of data.	Genobioinfo Cluster: How to use
TaxonKit	A Practical and Efficient NCBI Taxonomy Toolkit	Genobioinfo Cluster: How to use
TBtools-II	GUI/CommandLine Tool Box for biologists to utilize NGS data.	Genobioinfo Cluster: How to use
TE_finder	A suite of C++ programs developed for transposable element search and their annotation in large eukaryotic genome sequence. A part of the REPET package.	Genobioinfo Cluster: How to use
Telomerecat	Telomerecat is a tool for estimating the average telomere length (TL) for a paired end, whole genome sequencing (WGS) sample.	Genobioinfo Cluster: in Python-3.9.18
TelomereHunter	TelomereHunter extracts, sorts and analyses telomeric reads from WGS Data.	Genobioinfo Cluster: in Python-2.7.18
TEsorter	Originally written for LTR_retriever to classify long terminal repeat retrotransposons (LTR-RTs). It can also be used to classify any other TE sequences, including Class I and Class II elements which are covered by the REXdb database.	Genobioinfo Cluster: How to use
TETools	Dfam TE Tools includes RepeatMasker, RepeatModeler, and coseg. This container is an easy way to get a minimal yet fully functional installation of RepeatMasker and RepeatModeler and is additionally useful for testing or reproducibility purposes.	Genobioinfo Cluster: How to use
TexLive	TeX Live is intended to be a straightforward way to get up and running with the TeX document production system.	Genobioinfo Cluster: Ask for Install
TGICL	This package automates clustering and assembly of a large EST/mRNA dataset. The clustering is performed by a slightly modified version of NCBI's megablast, and the resulting clusters are then assembled using the CAP3 assembly program. TGICL starts with a large multi-FASTA file (and an optional peer quality values file) and outputs the assembly files as produced by CAP3.	Genobioinfo Cluster: How to use
TGSGapFiller	A gap filling tool that uses error-prone long reads generated by third-generation sequencing techniques (PacBio, Oxford Nanopore, etc.) or preassembled contigs to fill N-gaps in the genome assembly.	Genobioinfo Cluster: Ask for Install
Tiberius	Tiberius is a deep learning-based ab initio gene structure prediction tool that end-to-end integrates convolutional and long short-term memory layers with a differentiable HMM layer.	Genobioinfo Cluster: How to use
tidk	tidk is a toolkit to identify and visualise telomeric repeats for the Darwin Tree of Life genomes.	Genobioinfo Cluster: How to use
Tigmint	Tigmint identifies and corrects misassemblies using linked reads from 10x Genomics Chromium.	Genobioinfo Cluster: Ask for Install
TKGWV2	TKGWV2 is a pipeline to estimate biological relatedness (1st, 2nd, and unrelated degrees) between individuals specifically aimed at ultra-low coverage ancient DNA data obtained from whole genome sequencing.	Genobioinfo Cluster: How to use
TMHMM	Prediction of transmembrane helices in proteins.	Genobioinfo Cluster: How to use
Tn3+TA_finder	Tn3 Transposon/Toxin Finder (Tn3+TA_finder) is a program for the automatic prediction of transposable elements of the Tn3 family associated with type II toxin and antitoxin pairs in bacteria and archaea.	Genobioinfo Cluster: How to use
TnComp_finder	Composite Transposon Finder (TnComp_finder) is a program for the prediction of putative composite transposons in bacterial and archaeal genomes based on insertion sequence replicas in a relatively short span.	Genobioinfo Cluster: How to use
TOBIAS	TOBIAS is a collection of command-line bioinformatics tools for performing footprinting analysis on ATAC-seq data.	Genobioinfo Cluster: How to use
TOGA	TOGA is a new method that integrates gene annotation, inferring orthologs and classifying genes as intact or lost.	Genobioinfo Cluster: How to use
Tombo	Tombo is a suite of tools primarily for the identification of modified nucleotides from raw nanopore sequencing data.	Genobioinfo Cluster: Ask for Install
TOPALI-v2	A rich graphical interface for evolutionary analyses of multiple alignments on HPC clusters and multi-core desktops.	Genobioinfo Cluster: Ask for Install
Tophat	TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.	Genobioinfo Cluster: How to use
toulbar2	toulbar2 is an open-source black-box C++ optimizer for cost function networks and discrete additive graphical models. It can read a variety of formats.	Genobioinfo Cluster: How to use
TPMCalculator	TPMCalculator quantifies mRNA abundance directly from the alignments by parsing BAM files.	Genobioinfo Cluster: How to use
Tracy	Tracy is an efficient and versatile command-line application to basecall, align, assemble and deconvolute Sanger Chromatogram trace files.	Genobioinfo Cluster: Ask for Install
TransDecoder	TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.	Genobioinfo Cluster: How to use
Transposome	Transposome is a command line application to annotate transposable elements from paired-end whole genome shotgun data.	Genobioinfo Cluster: Ask for Install
transposon_annotation_tools	A set of bioconda packages for transposon annotation and transposon feature annotation in nucleotide sequences. transposon_annotation_tools is part of TransposonUltimate. The package includes a series of transposable element discovery tools, such as: MUSTv2, HelitronScanner, SineFinder, MiteTracker, MiteFinderII, SineScan, TirVish, LtrHarvest, RepeatModeler, TransposonPSI, and TransposonProteinNCBICDD1000. You can then use these tools independently.	Genobioinfo Cluster: How to use
transposon_classifier_rfsb	Transposon classification tool for nucleotide sequence classification, providing classification, model training and prediction evaluation. RFSB is part of TransposonUltimate.	Genobioinfo Cluster: How to use
Transrate	Transrate is software for de-novo transcriptome assembly quality analysis.	Genobioinfo Cluster: How to use
TRASH	Tandem Repeat Annotation and Structural Hierarchy: a package to identify and extract tandem repeats in genome sequences and investigate their higher order structures.	Genobioinfo Cluster: How to use
TreeBeST	TreeBeST, which stands for (gene) Tree Building guided by Species Tree, is a versatile program that builds, manipulates and displays phylogenetic trees. It is particularly designed for building gene trees with a known species tree and is highly efficient and accurate.	Genobioinfo Cluster: Ask for Install
TreeMix	TreeMix is a method for inferring the patterns of population splits and mixtures in the history of a set of populations. In the underlying model, the modern-day populations in a species are related to a common ancestor via a graph of ancestral populations. We use the allele frequencies in the modern populations to infer the structure of this graph.	Genobioinfo Cluster: How to use
treePL	treePL is a phylogenetic penalized likelihood program.	Genobioinfo Cluster: Ask for Install
TreeShrink	TreeShrink is an algorithm for detecting abnormally long branches in one or more phylogenetic trees.	Genobioinfo Cluster: How to use
TreeTime	Maximum likelihood inference of time stamped phylogenies and ancestral reconstruction.	Genobioinfo Cluster: Ask for Install
Trim Galore	A wrapper tool around Cutadapt and FastQC to consistently apply quality and adapter trimming to FastQ files, with some extra functionality for MspI-digested RRBS-type (Reduced Representation Bisulfite-Seq) libraries.	Genobioinfo Cluster: How to use
trimAl	trimAl: a tool for automated alignment trimming.	Genobioinfo Cluster: How to use
Trimmomatic	Trimmomatic performs a variety of useful trimming tasks for Illumina paired-end and single-ended data. The selection of trimming steps and their associated parameters are supplied on the command line.	Genobioinfo Cluster: How to use
trinityrnaseq	Trinity, developed at the Broad Institute and the Hebrew University of Jerusalem, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes.	Genobioinfo Cluster: How to use
Trinotate	Trinotate is a comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms.	Genobioinfo Cluster: Ask for Install
tRNAscan-SE	Search for tRNA genes in genomic sequence.	Genobioinfo Cluster: How to use
Truvari	Structural variant comparison tool for VCFs	Genobioinfo Cluster: How to use
Trycycler	Trycycler is a tool for generating consensus long-read assemblies for bacterial genomes.	Genobioinfo Cluster: How to use
TSEBRA	TSEBRA is a combiner tool that selects transcripts from gene predictions based on the support by extrinsic evidence in the form of introns and start/stop codons. It was developed to combine BRAKER1 and BRAKER2 predictions to increase their accuracies.	Genobioinfo Cluster: How to use
Twisst	Topology weighting by iterative sampling of sub-trees.	Genobioinfo Cluster: How to use
uLTRA	uLTRA is a tool for splice alignment of long transcriptomic reads to a genome, guided by a database of exon annotations.	Genobioinfo Cluster: How to use
Umap	The free umap software package efficiently identifies uniquely mappable regions of any genome. Its Bismap extension identifies mappability of the bisulfite converted genome (methylome).	Genobioinfo Cluster: Ask for Install
UMI-tools	Tools for handling Unique Molecular Identifiers in NGS data sets	Genobioinfo Cluster: How to use
Unicycler	Unicycler is an assembly pipeline for bacterial genomes.	Genobioinfo Cluster: How to use
Unimap	An EXPERIMENTAL fork of minimap2 optimized for assembly-to-reference alignment.	Genobioinfo Cluster: How to use
unique-kmer-counts	This program calculates the number of distinct k-mers for each sequence record in a fasta file and divides it by the total number of k-mers in that record.	Genobioinfo Cluster: How to use
unitig-caller	Methods to determine sequence element (unitig) presence/absence.	Genobioinfo Cluster: How to use
UnRAR	Easily extract RAR files.	Genobioinfo Cluster: How to use
USEARCH	USEARCH is a unique sequence analysis tool with thousands of users world-wide. USEARCH offers search and clustering algorithms that are often orders of magnitude faster than BLAST.	Genobioinfo Cluster: How to use
vAMPirus	Automated virus amplicon sequencing analysis program integrated with Nextflow pipeline manager.	Genobioinfo Cluster: How to use
VarDict	VarDict is an ultra sensitive variant caller for both single and paired sample variant calling from BAM files.	Genobioinfo Cluster: Ask for Install
Variabel	A novel approach and method for intrahost variant detection, which outperforms existing ONT variant callers.	Genobioinfo Cluster: Ask for Install
varigraph	An accurate and widely applicable pangenome graph-based variant genotyper for diploid and polyploid genomes.	Genobioinfo Cluster: How to use
VarScan	VarScan is a platform-independent software tool developed at the Genome Institute at Washington University to detect variants in NGS data.	Genobioinfo Cluster: How to use
VarTrix	VarTrix is a software tool for extracting single cell variant information from 10x Genomics single cell data.	Genobioinfo Cluster: Ask for Install
VAST-TOOLS	Vertebrate Alternative Splicing and Transcription Tools (VAST-TOOLS) is a toolset for profiling and comparing alternative splicing events in RNA-Seq data.	Genobioinfo Cluster: Ask for Install
vawk	An awk-like VCF parser	Genobioinfo Cluster: Ask for Install
VCF-kit	Assorted utilities for the variant call format.	Genobioinfo Cluster: Ask for Install
vcf2maf	Convert a VCF into a MAF, where each variant is annotated to only one of all possible gene isoforms.	Genobioinfo Cluster: How to use
vcflib	C++ library and cmdline tools for parsing and manipulating VCF files.	Genobioinfo Cluster: How to use
Vcfstats	Vcfstats is a tool that can generate metrics from a vcf file.	Genobioinfo Cluster: How to use
VCFtools	VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide methods for working with VCF files: validating, merging, comparing and calculating some basic population genetic statistics.	Genobioinfo Cluster: How to use
Velocyto	Package for the analysis of expression dynamics in single cell RNA seq data.	Genobioinfo Cluster: How to use
Velvet	Velvet is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454, developed by Daniel Zerbino and Ewan Birney at the European Bioinformatics Institute (EMBL-EBI), near Cambridge, in the United Kingdom. Velvet currently takes in short read sequences, removes errors then produces high quality unique contigs. It then uses paired-end read and long read information, when available, to retrieve the repeated areas between contigs.	Genobioinfo Cluster: How to use
vg	Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.	Genobioinfo Cluster: How to use
ViennaRNA	Vienna RNA package allows RNA Secondary Structure Prediction and Comparison	Genobioinfo Cluster: How to use
ViewBS	A powerful toolkit for visualization of high-throughput bisulfite sequencing data.	Genobioinfo Cluster: How to use
Vina-GPU-2.0	Vina-GPU 2.0 accelerates AutoDock Vina and its related commonly derived docking methods, such as QuickVina 2 and QuickVina-W with GPUs.	Genobioinfo Cluster: How to use
VinaLC	A parallel molecular docking program based on AutoDock Vina.	Genobioinfo Cluster: How to use
ViPER	Bioinformatics pipeline used in the Laboratory of Viral Metagenomics (KU Leuven) to trim and assemble paired-end Illumina reads, and classify resulting contigs.	Genobioinfo Cluster: How to use
ViQuaS	An improved reconstruction pipeline for viral quasispecies spectra generated by next-generation sequencing.	Genobioinfo Cluster: Ask for Install
vireoSNP	Vireo: Variational Inference for Reconstructing Ensemble Origin by expressed SNPs in multiplexed scRNA-seq data.	Genobioinfo Cluster: How to use
VIRify	VIRify is a pipeline for the detection, annotation, and taxonomic classification of viral contigs in metagenomic and metatranscriptomic assemblies.	Genobioinfo Cluster: How to use
VirSorter2	Customizable pipeline to identify viral sequences from (meta)genomic data.	Genobioinfo Cluster: How to use
VirulenceFinder	VirulenceFinder identifies virulence genes in totally or partially sequenced isolates of bacteria. At the moment only E. coli, Enterococcus, S. aureus and Listeria are available.	Genobioinfo Cluster: Ask for Install
VITAP	The viral taxonomic assignment pipeline	Genobioinfo Cluster:
viu	A small command-line application to view images from the terminal written in Rust.	Genobioinfo Cluster: How to use
Vmatch	A versatile software tool for efficiently solving large scale sequence matching tasks.	Genobioinfo Cluster: Ask for Install
Vosk	Vosk is an offline open source speech recognition toolkit. It enables speech recognition for 20+ languages and dialects.	Genobioinfo Cluster: How to use
Voyager	Rapid and efficient mapping algorithm for long sequencing reads with insertion and deletion errors. Mapping long reads in Sorted Motif Distance Space.	Genobioinfo Cluster: How to use
VSEARCH	Versatile open-source tool for metagenomics	Genobioinfo Cluster: How to use
Vt	A tool set for short variant discovery in genetic sequence data.	Genobioinfo Cluster: How to use
WASP	WASP is a suite of tools for unbiased allele-specific read mapping and discovery of molecular QTLs.	Genobioinfo Cluster: How to use
Wengan	An accurate and ultra-fast genome assembler	Genobioinfo Cluster: Ask for Install
WFA	The wavefront alignment (WFA) algorithm is an exact gap-affine algorithm that takes advantage of homologous regions between the sequences to accelerate the alignment process.	Genobioinfo Cluster: Ask for Install
wfmash	wfmash is an aligner for pangenomes based on sparse homology mapping and wavefront inception.	Genobioinfo Cluster: Ask for Install
wgd	Python package and CLI for whole genome duplication analysis.	Genobioinfo Cluster: How to use
WGDI	WGDI (Whole-Genome Duplication Integrated analysis), a Python-based command-line tool that facilitates comprehensive analysis of recursive polyploidizations and cross-species genome alignments.	Genobioinfo Cluster: How to use
WGS-Assembler	Celera Assembler is a de novo whole-genome shotgun (WGS) DNA sequence assembler.	Genobioinfo Cluster: Ask for Install
Wgsim	Wgsim is a small tool for simulating sequence reads from a reference genome.	Genobioinfo Cluster: Ask for Install
WhatsHap	WhatsHap is a software for phasing genomic variants using DNA sequencing reads, also called read-based phasing or haplotype assembly. It is especially suitable for long reads, but also works well with short reads.	Genobioinfo Cluster: How to use
Whippet	Lightweight and Fast RNA-seq quantification at the event-level	Genobioinfo Cluster: Ask for Install
Whisper	Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.	Genobioinfo Cluster: How to use
whokaryote	Classification of metagenomic contigs as eukaryotic/prokaryotic using biology-based features.	Genobioinfo Cluster: How to use
WiggleTools	The WiggleTools package allows genomewide data files to be manipulated as numerical functions, equipped with all the standard functional analysis operators (sum, product, product by a scalar, comparators), and derived statistics (mean, median, variance, stddev, t-test, Wilcoxon's rank sum test, etc).	Genobioinfo Cluster: How to use
Winnowmap	Winnowmap is a long-read mapping algorithm, and a result of our exploration into superior minimizer sampling techniques.	Genobioinfo Cluster: Ask for Install
wLogDate	Molecular Dating using logarithmic penalty function. wLogDate is a method for dating phylogenetic trees. Given a phylogeny and either sampling times for leaves or calibration points for internal nodes, wLogDate outputs a "dated" tree that conforms to the sampling times or calibration points. It can also work with no sampling time or calibration points where it would simply turn the tree into ultrametric, fixing its height to a given value. Its optimization criterion is to minimize the variance of the mutation rates in log scale (hence the term logDate).	Genobioinfo Cluster: Ask for Install
WoLFPSort	WoLF PSORT is an extension of the PSORT II program for protein subcellular localization prediction, which is based on the PSORT principle. WoLF PSORT converts a protein's amino acid sequences into numerical localization features; based on sorting signals, amino acid composition and functional motifs.	Genobioinfo Cluster: How to use
wtdbg	Wtdbg2 is a de novo sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT).	Genobioinfo Cluster: How to use
wu-blast	Similarity search against databanks, Washington University Blast. (OBSOLETE)	Genobioinfo Cluster: How to use
yacrd	Yet Another Chimeric Read Detector for long reads	Genobioinfo Cluster: Ask for Install
Yak	Yak is initially developed for two specific use cases: 1) to robustly estimate the base accuracy of CCS reads and assembly contigs, and 2) to investigate the systematic error rate of CCS reads.	Genobioinfo Cluster: Ask for Install
YASIM	Yet Another SIMulator for Alternative Splicing Events and Realistic Gene Expression Profile. YASIM is a tool that simulates Next- or Third-Generation bulk RNA-Sequencing raw FASTQ reads with ground truth genome annotation and realistic gene expression profile (GEP). It can be used to benchmark tools that are claimed to be able to detect isoforms (e.g., StringTie) or quantify reads on an isoform level (e.g., featureCounts).	Genobioinfo Cluster: How to use
Yleaf	Software for human Y-chromosomal haplogroup inference from next generation sequencing data.	Genobioinfo Cluster: How to use

Amplicon analysis

Application	Description	Availability/Use
CRISPResso2	CRISPResso is a software pipeline designed to enable rapid and intuitive interpretation of genome editing experiments. Briefly, CRISPResso aligns sequencing reads to a reference sequence, quantifies insertions, mutations and deletions to determine whether a read is modified or unmodified by genome editing, and summarizes editing results in intuitive plots and datasets.	Genobioinfo Cluster: How to use
ITSxpress	Software to trim the ITS region of FASTQ sequences for amplicon sequencing analysis.	Genobioinfo Cluster: How to use
metabinkit	From metagenomic or metabarcoding data, it is often necessary to assign taxonomy to DNA sequences. This is generally performed by aligning sequences to a reference database, usually resulting in multiple database alignments for each query sequence. Using these alignment results, metabinkit assigns a single taxon to each query sequence, based on user-defined percentage identity thresholds. In essence, for each query, the alignments are filtered based on the percentage identity thresholds and the lowest common ancestor for all alignments passing the filters is determined. The metabin program is not limited to BLAST alignments, and can accept alignment results produced using any program, provided the input format is correct. However, functionality is also available to create BLAST databases and to perform BLAST alignments, which can be passed directly to metabin.	Genobioinfo Cluster: How to use
NaMeco	Pipeline for the Nanopore 16S long read clustering and taxonomy classification.	Genobioinfo Cluster: How to use
raxtax	raxtax is a fast and efficient k-mer-based non-Bayesian taxonomic classifier for barcoding DNA sequences.	Genobioinfo Cluster: How to use

Annotation

Application	Description	Availability/Use
AGAT	Another Gff Analysis Toolkit: suite of tools to handle gene annotations in any GTF/GFF format. Some examples of what AGAT can do: standardise any GTF/GFF file into a comprehensive GFF3 format (script with agat_sp prefix); add missing parent features (e.g. gene and mRNA if only CDS/exon exist); add missing features (e.g. exon and UTR); add missing mandatory attributes (i.e. ID, Parent); fix identifiers to be unique; fix feature locations; remove duplicated features; group related features (if spread in different places in the file); sort features; merge overlapping loci into one single locus (only if option activated).	Genobioinfo Cluster: How to use
AnnotSV	An integrated tool for Structural Variations annotation and ranking.	Genobioinfo Cluster: How to use
ApoplastP	ApoplastP is a machine learning method for predicting localization of proteins to the plant apoplast. ApoplastP can distinguish non-apoplastic proteins from apoplastic proteins for both plant proteins and pathogen proteins. In particular, ApoplastP can predict if an effector localizes to the plant apoplast.	Genobioinfo Cluster: How to use
ARAGORN	ARAGORN is a program to detect tRNA genes and tmRNA genes in nucleotide sequence	Genobioinfo Cluster: Ask for Install
Augustus	Augustus is a program that predicts genes in eukaryotic genomic sequences	Genobioinfo Cluster: How to use
AvP	AvP performs automatic detection of HGT candidates within a phylogenetic framework.	Genobioinfo Cluster: How to use
Bakta	Rapid & standardized annotation of bacterial genomes, MAGs & plasmids.	Genobioinfo Cluster: How to use
Barrnap	Barrnap predicts the location of ribosomal RNA genes in genomes. It supports bacteria (5S,23S,16S), archaea (5S,5.8S,23S,16S), mitochondria (12S,16S) and eukaryotes (5S,5.8S,28S,18S).	Genobioinfo Cluster: Ask for Install
BaseSpace_CLI	Command line interface to work with BaseSpace Sequence Hub data.	Genobioinfo Cluster: How to use
Braker	BRAKER(1,2,3) is a tool for fully automated genome annotation with GeneMark-ET and AUGUSTUS	Genobioinfo Cluster: How to use
BRAKER4	BRAKER4 is a complete rewrite of the BRAKER pipeline in Snakemake (a tool for fully automated genome annotation with GeneMark-ET and AUGUSTUS). The gene prediction logic is the same: GeneMark trains on extrinsic evidence, AUGUSTUS is trained on GeneMark predictions, and TSEBRA merges the results. What changed is how this logic is orchestrated.	Genobioinfo Cluster: How to use
CAMISIM	CAMISIM is a software to model abundance distributions of microbial communities and to simulate corresponding shotgun metagenome datasets.	Genobioinfo Cluster: How to use
CAT	This project aims to provide a straightforward end-to-end pipeline that takes as input a HAL-format multiple whole genome alignment as well as a GFF3 file representing annotations on one high quality assembly in the HAL alignment, and produces an output GFF3 annotation on all target genomes chosen.	Genobioinfo Cluster: Ask for Install
CEGMA	CEGMA (Core Eukaryotic Genes Mapping Approach) is a pipeline for building a highly reliable set of gene annotations in virtually any eukaryotic genome.	Genobioinfo Cluster: How to use
CENSOR	CENSOR compares and masks protein or nucleotide sequences.	Genobioinfo Cluster: How to use
ChromHMM	ChromHMM is software for learning and characterizing chromatin states.	Genobioinfo Cluster: How to use
CIRCexplorer2	CIRCexplorer2 is a comprehensive and integrative circular RNA analysis toolset	Genobioinfo Cluster: How to use
Conterminator	Conterminator is an efficient method for detecting incorrectly labeled sequences across kingdoms by an exhaustive all-against-all sequence comparison.	Genobioinfo Cluster: Ask for Install
CONTRAST	CONTRAST predicts protein-coding genes from a multiple genomic alignment using a combination of discriminative machine learning techniques.	Genobioinfo Cluster: Ask for Install
dbCAN3	run_dbcan (dbCAN3) is the standalone version of the dbCAN3 annotation tool for automated CAZyme annotation.	Genobioinfo Cluster: How to use
DIAMOND	Accelerated BLAST compatible local sequence aligner.	Genobioinfo Cluster: How to use
DIAMOND2GO	Diamond2GO is a set of tools that can rapidly assign gene ontology and perform enrichment for functional genomics.	Genobioinfo Cluster: How to use
DRAM	DRAM (Distilled and Refined Annotation of Metabolism) is a tool for annotating metagenomic assembled genomes and VirSorter identified viral contigs.	Genobioinfo Cluster: Ask for Install
E2P2	Ensemble Enzyme Prediction Pipeline.	Genobioinfo Cluster: How to use
EDTA	This package is developed for automated whole-genome de-novo TE annotation and benchmarking the annotation performance of TE libraries.	Genobioinfo Cluster: How to use
EffectorP	EffectorP is a machine learning method for fungal effector prediction in secretomes and has been trained to distinguish secreted proteins from secreted effectors in plant-pathogenic fungi.	Genobioinfo Cluster: How to use
EGAPx	EGAPx is the publicly accessible version of the updated NCBI Eukaryotic Genome Annotation Pipeline.	Genobioinfo Cluster: How to use
eggNog-mapper	eggnog-mapper is a tool for fast functional annotation of novel sequences. It uses precomputed orthologous groups and phylogenies from the eggNOG database to transfer functional information from fine-grained orthologs only.	Genobioinfo Cluster: How to use
Enformer	This package provides an implementation of the Enformer model and examples on running the model.	Genobioinfo Cluster: How to use
EnTAP	EnTAP is a eukaryotic non-model annotation pipeline developed by Alexander Hart and Dr. Jill Wegrzyn of the Plant Computational Genomics Lab at the University of Connecticut.	Genobioinfo Cluster: Ask for Install
EuGeneEP	EuGene is an open integrative gene finder for eukaryotic and prokaryotic genomes. EuGene-EP (Eukaryote Pipeline) facilitates the application of EuGene on eukaryote genomes.	Genobioinfo Cluster: Ask for Install
EviAnn	EviAnn (Evidence Annotation) is novel genome annotation software. It is purely evidence-based. EviAnn derives protein-coding gene and long non-coding RNA annotations from RNA-seq data and/or transcripts, and alignments of proteins from related species. EviAnn outputs annotations in GFF3 format. EviAnn does not require genome repeats to be soft-masked prior to running annotation.	Genobioinfo Cluster: How to use
EVidenceModeler (EVM)	The EVidenceModeler (aka EVM) software combines ab initio gene predictions and protein and transcript alignments into weighted consensus gene structures. EVM provides a flexible and intuitive framework for combining diverse evidence types into a single automated gene structure annotation system.	Genobioinfo Cluster: How to use
Finder	Finder is a gene annotator pipeline which automates the process of downloading short reads, aligning them and using the assembled transcripts to generate gene annotations.	Genobioinfo Cluster: Ask for Install
fixchr	This package selects homologous chromosomes between two genomes by comparing whole-genome alignments between them. Additionally, it generates dotplots for quick checking of the output.	Genobioinfo Cluster: How to use
fpma	Fast Plant Mito Annotation.	Genobioinfo Cluster: Ask for Install
FragGeneScan	FragGeneScan is an application for finding (fragmented) genes in short reads. It can also be applied to predict prokaryotic genes in incomplete assemblies or complete genomes.	Genobioinfo Cluster: Ask for Install
FragGeneScanRs	FragGeneScanRs is an application for finding (fragmented) genes in short reads. It can also be applied to predict prokaryotic genes in incomplete assemblies or complete genomes. It is a re-implementation of FragGeneScan in Rust.	Genobioinfo Cluster: How to use
FrameDP	Sensitive peptide detection on noisy matured sequences. Available with command line interface on the cluster.	Genobioinfo Cluster: Ask for Install
Funannotate	Funannotate is a genome prediction, annotation, and comparison software package.	Genobioinfo Cluster: How to use
GAAS	Genome Assembly Annotation Service: Suite of tools related to Genome Assembly Annotation Service tasks.	Genobioinfo Cluster: Ask for Install
GALBA	GALBA is a pipeline for fully automated prediction of protein coding gene structures with AUGUSTUS in novel eukaryotic genomes for the scenario where high quality proteins from a closely related species are available.	Genobioinfo Cluster: Ask for Install
GeMoMa	Gene Model Mapper (GeMoMa) is a homology-based gene prediction program.	Genobioinfo Cluster: Ask for Install
geneid	geneid is a program to predict genes in anonymous genomic sequences designed with a hierarchical structure.	Genobioinfo Cluster: How to use
GeneMark-ES	Unsupervised training is an important feature of the GeneMark-ES algorithm that identifies protein coding genes in eukaryotic genomes. This is the only eukaryotic gene finder that can perform gene prediction without curated training sets.	Genobioinfo Cluster: How to use
GeneMark-ET	A semi-supervised version of GeneMark-ES, called GeneMark-ET, that uses RNA-Seq reads to improve training.	Genobioinfo Cluster: How to use
GeneMark-ETP	Gene finding in eukaryotic genomes supported by transcriptome sequencing and protein homology.	Genobioinfo Cluster: How to use
GeneMarkS	A self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions.	Genobioinfo Cluster: How to use
GenMap	Fast and Exact Computation of Genome Mappability.	Genobioinfo Cluster: How to use
GenomeThreader	GenomeThreader is a software tool to compute gene structure predictions. The gene structure predictions are calculated using a similarity-based approach where additional cDNA/EST and/or protein sequences are used to predict gene structures via spliced alignments.	Genobioinfo Cluster: How to use
gff3toembl	Converts Prokka GFF3 files to EMBL files for uploading annotated assemblies to EBI	Genobioinfo Cluster: Ask for Install
ggCaller	A de Bruijn graph-based gene-caller and pangenome analysis tool.	Genobioinfo Cluster: How to use
GINGER	GINGER is a tool that implements an integrated method for gene structure prediction in higher eukaryotes.	Genobioinfo Cluster: How to use
GlimmerHMM	GlimmerHMM is a new gene finder based on a Generalized Hidden Markov Model (GHMM). Although the gene finder conforms to the overall mathematical framework of a GHMM, additionally it incorporates splice site models adapted from the GeneSplicer program and a decision tree adapted from GlimmerM. It also utilizes Interpolated Markov Models for the coding and noncoding models. Currently, GlimmerHMM's GHMM structure includes introns of each phase, intergenic regions, and four types of exons (initial, internal, final, and single).	Genobioinfo Cluster: How to use
Gmove	Gmove is a genome annotation tool. This combiner takes as input mapping of RNA-seq or protein or ab initio data.	Genobioinfo Cluster: Ask for Install
GraffiTE	GraffiTE is a pipeline that finds polymorphic transposable elements in genome assemblies and/or long reads, and genotypes the discovered polymorphisms in read sets using genome-graphs.	Genobioinfo Cluster: How to use
GrAnnoT	GrAnnoT is an annotation transfer tool for pangenome graphs.	Genobioinfo Cluster: How to use
GUSHR	Assembly-free construction of UTRs from short read RNA-Seq data on the basis of coding sequence annotation. This tool has been adapted to the format needs of AUGUSTUS/BRAKER and employs GeMoMa for generating UTRs from RNA-Seq coverage data.	Genobioinfo Cluster: How to use
HANNO	Efficient High-throughput ANNOtation of protein coding genes in eukaryote genomes.	Genobioinfo Cluster: How to use
Helixer	Using Deep Learning to predict gene annotations.	Genobioinfo Cluster: How to use
HelixerPost	HelixerPost uses a sliding window assessment to determine regions of the genome which are likely gene containing.	Genobioinfo Cluster: How to use
hexamer	Find likely coding segments in DNA using composition-normalised hexamer tables.	Genobioinfo Cluster: How to use
ICEscreen	ICEscreen is a bioinformatic pipeline for the detection and annotation of ICEs (Integrative and Conjugative Elements) and IMEs (Integrative and Mobilizable Elements) in Bacillota genomes.	Genobioinfo Cluster: How to use
Infernal	Infernal ("INFERence of RNA ALignment") is for searching DNA sequence databases for RNA structure and sequence similarities. It is an implementation of a special case of profile stochastic context-free grammars called covariance models (CMs).	Genobioinfo Cluster: How to use
Integron Finder	Bioinformatics tool to find integrons in bacterial genomes.	Genobioinfo Cluster: How to use
iREAD	iREAD (intron REtention Analysis and Detector) is a tool to detect intron retention (IR) events from RNA-seq datasets.	Genobioinfo Cluster: Ask for Install
ISEScan	A python pipeline to identify IS (Insertion Sequence) elements in genome and metagenome. ISEScan can be used to identify/annotate full-length or non-full-length IS elements in any DNA sequence but ISEScan was only tested on prokaryotic genome including draft genome and meta-genome. Among the existing tools identifying IS elements, ISEScan might be the only one that gives TIR (Terminal Inverted Repeat) sequences.	Genobioinfo Cluster: How to use
KofamScan	KofamScan is a gene function annotation tool based on KEGG Orthology and hidden Markov model	Genobioinfo Cluster: How to use
Liftoff	Liftoff is a tool that accurately maps annotations in GFF or GTF between assemblies of the same, or closely-related species.	Genobioinfo Cluster: How to use
LiftOn	Accurate annotation mapping for GFF/GTF across assemblies.	Genobioinfo Cluster: How to use
LOCALIZER	LOCALIZER is a machine learning method for predicting the subcellular localization of both plant proteins and pathogen effectors in the plant cell.	Genobioinfo Cluster: How to use
Loctree3	Protein Subcellular Localization Sequence-Based Predictor	Genobioinfo Cluster: How to use
longdust	Longdust identifies long highly repetitive STRs, VNTRs, satellite DNA and other low-complexity regions (LCRs) in a genome.	Genobioinfo Cluster: How to use
Look4TRs	A de-novo tool for detecting simple tandem repeats using self-supervised hidden Markov models.	Genobioinfo Cluster: How to use
LTR_FINDER_parallel	A parallel wrapper for LTR_FINDER (LTR_Finder is an efficient program for finding full-length LTR retrotransposons in genome sequences.)	Genobioinfo Cluster: Ask for Install
LTR_retriever	LTR_retriever is a highly accurate and sensitive program for identification of LTR retrotransposons.	Genobioinfo Cluster: How to use
MacSyFinder	Detection of macromolecular systems in protein datasets using systems modelling and similarity search. Complex cellular functions are usually encoded by a set of genes in one or a few organized genetic loci in microbial genomes. Macromolecular System Finder (MacSyFinder) is a program that uses these properties to model and then annotate cellular functions in microbial genomes. This is done by integrating the identification of each individual gene at the level of the molecular system.	Genobioinfo Cluster: How to use
MAKER	MAKER is a portable and easily configurable genome annotation pipeline. Its purpose is to allow smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases. MAKER identifies repeats, aligns ESTs and proteins to a genome, produces ab-initio gene predictions and automatically synthesizes these data into gene annotations having evidence-based quality values.	Genobioinfo Cluster: How to use
MetaEuk	MetaEuk is a modular toolkit designed for large-scale gene discovery and annotation in eukaryotic metagenomic contigs.	Genobioinfo Cluster: How to use
Mfannot	MFannot is a program for the annotation of mitochondrial and plastid genomes. It is a PERL wrapper around a set of diverse, external independent tools.	Genobioinfo Cluster: Ask for Install
mgatk	A mitochondrial genome analysis toolkit.	Genobioinfo Cluster: How to use
MinCED	MinCED is a program to find Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) in full genomes or environmental datasets such as metagenomes, in which sequence size can be anywhere from 100 to 800 bp. MinCED runs from the command-line and was derived from CRT	Genobioinfo Cluster: Ask for Install
MiPepid	MiPepid is a software specifically for predicting the coding capabilities of sORFs. sORFs / smORFs are short open reading frames with length <= 303 bp (including the stop codon), and if translated, they encode micropeptides that are <= 100 amino acids.	Genobioinfo Cluster: How to use
MitoFinder	Mitofinder is a pipeline to assemble mitochondrial genomes and annotate mitochondrial genes from trimmed read sequencing data. MitoFinder is also designed to find and annotate mitochondrial sequences in existing genomic assemblies (generated from Hifi/PacBio/Nanopore/Illumina sequencing data...)	Genobioinfo Cluster: How to use
MitoHPC	MitoHPC : Mitochondrial High Performance Caller. For Calling Mitochondrial Homoplasmies and Heteroplasmies.	Genobioinfo Cluster: How to use
MitoSeek	MitoSeek is an open-source software tool to reliably and easily extract mitochondrial genome information from exome sequencing data. MitoSeek evaluates mitochondrial genome alignment quality, estimates relative mitochondrial copy numbers, and detects heteroplasmy, somatic mutation, and structural variance of the mitochondrial genome.	Genobioinfo Cluster: Ask for Install
MitoZ	MitoZ is a Python3-based toolkit which aims to automatically filter paired-end raw data (fastq files), assemble the genome, search for mitogenome sequences in the genome assembly result, annotate the mitogenome (genbank file as result), and visualize the mitogenome.	Genobioinfo Cluster: How to use
MiXCR	MiXCR is a universal software for fast and accurate analysis of raw T- or B- cell receptor repertoire sequencing data.	Genobioinfo Cluster: Ask for Install
MOB-suite	Software tools for clustering, reconstruction and typing of plasmids from draft assemblies. The MOB-suite is designed to be a modular set of tools for the typing and reconstruction of plasmid sequences from WGS assemblies.	Genobioinfo Cluster: How to use
MobileElementFinder	MobileElementFinder is a tool for identifying Mobile Genetic Elements (MGEs) in Whole Genome Shotgun sequence data.	Genobioinfo Cluster: How to use
MSIsensor-pro	MSIsensor-pro is an updated version of msisensor. MSIsensor-pro evaluates Microsatellite Instability (MSI) for cancer patients with next generation sequencing data.	Genobioinfo Cluster: How to use
NLR-Annotator	Disease resistance genes encoding nucleotide-binding and leucine-rich repeat (NLR) intracellular immune receptor proteins detect pathogens by the presence of pathogen effectors. Although developed for wheat, we demonstrate the universal applicability of NLR-Annotator across diverse plant taxa. NLR-Annotator is a tool to annotate loci associated with NLRs in large sequences.	Genobioinfo Cluster: Ask for Install
NLRtracker	NLRtracker extracts and functionally annotates NLRs from protein or transcript files based on the core features found in the RefPlantNLR dataset.	Genobioinfo Cluster: How to use
ORA	Bio::ORA is a featherweight object for identifying mammalian olfactory receptor genes. The sequences should not be longer than 40kb. The returned array includes location, sequence and statistics for the putative olfactory receptor gene. Fully functional with DNA and EST sequences; introns are not supported.	Genobioinfo Cluster: Ask for Install
ORFfinder	ORFfinder searches for open reading frames (ORFs) in the DNA sequence you enter.	Genobioinfo Cluster: How to use
orfipy	Fast and flexible ORF finder.	Genobioinfo Cluster: Ask for Install
OrthoFinder	OrthoFinder is a fast, accurate and comprehensive analysis tool for comparative genomics. It finds orthologues and orthogroups, infers rooted gene trees for all orthogroups, and infers a rooted species tree for the species being analysed. OrthoFinder also provides comprehensive statistics for comparative genomic analyses.	Genobioinfo Cluster: How to use
palm_annot	Scripts, HMMs and search databases for identifying and classifying viral RdRp sequences	Genobioinfo Cluster: How to use
Palmscan	Palmscan is software to detect viral polymerase palmprint barcode sequences in longer sequences such as virus genomes and ORFs. Palmprints can be used to classify RNA viruses.	Genobioinfo Cluster: How to use
PanACoTA	PANgenome with Annotations, COre identification, Tree and corresponding Alignments.	Genobioinfo Cluster: How to use
PanTools	PanTools is a toolkit for comparative analysis of large numbers of genomes.	Genobioinfo Cluster: How to use
PASApipeline	PASA, acronym for Program to Assemble Spliced Alignments, is a eukaryotic genome annotation tool that exploits spliced alignments of expressed transcript sequences to automatically model gene structures, and to maintain gene structure annotation consistent with the most recently available experimental sequence data. PASA also identifies and classifies all splicing variations supported by the transcript alignments.	Genobioinfo Cluster: How to use
PathoFact	PathoFact is an easy-to-use modular pipeline for the metagenomic analyses of toxins, virulence factors and antimicrobial resistance.	Genobioinfo Cluster: Ask for Install
peaks2utr	peaks2utr is a Python command-line tool that annotates 3' untranslated regions (UTR) for a given set of aligned sequencing reads in BAM format, and canonical annotation in GFF or GTF format.	Genobioinfo Cluster: How to use
PGAP	The NCBI Prokaryotic Genome Annotation Pipeline is designed to annotate bacterial and archaeal genomes (chromosomes and plasmids).	Genobioinfo Cluster: How to use
PHASTER_scripts	Small utility scripts to query the PHASTER API endpoint, to identify and annotate prophage sequences within bacterial genomes and plasmids.	Genobioinfo Cluster: How to use
PhiSpy	PhiSpy identifies prophages in Bacterial (and probably Archaeal) genomes. Given an annotated genome it will use several approaches to identify the most likely prophage regions.	Genobioinfo Cluster: Ask for Install
Phobius	A combined transmembrane topology and signal peptide predictor.	Genobioinfo Cluster: Ask for Install
PlantiSMASH	PlantiSMASH is a specialized extension of antiSMASH for the identification and analysis of biosynthetic gene clusters (BGCs) in plant genomes. It supports advanced plant-specific detection rules and features for comparative genomics, visualization, and more.	Genobioinfo Cluster: How to use
PlasFlow	Software for prediction of plasmid sequences in metagenomic assemblies.	Genobioinfo Cluster: How to use
PlasmidFinder	The service identifies plasmids in total or partial sequenced isolates of bacteria.	Genobioinfo Cluster: How to use
Portcullis	Splice junction analysis and filtering from BAM files.	Genobioinfo Cluster: Ask for Install
PredGPI	Prediction of glycosylphosphatidylinositol-anchors in proteins based on HMMs and SVMs.	Genobioinfo Cluster: How to use
Prodigal	Prodigal (Prokaryotic Dynamic Programming Genefinding Algorithm) is a microbial (bacterial and archaeal) gene finding program developed at Oak Ridge National Laboratory and the University of Tennessee.	Genobioinfo Cluster: How to use
PROKKA	Prokka is a software tool for the rapid annotation of prokaryotic genomes. A typical 4 Mbp genome can be fully annotated in less than 10 minutes on a quad-core computer, and scales well to 32 core SMP systems. It produces GFF3, GBK and SQN files that are ready for editing in Sequin and ultimately submitted to Genbank/DDBJ/ENA.	Genobioinfo Cluster: How to use
PSAURON	PSAURON is a machine learning model for rapid assessment of protein coding gene annotation.	Genobioinfo Cluster: How to use
Pseudofinder	Detection of pseudogene candidates in bacterial and archaeal genomes.	Genobioinfo Cluster: Ask for Install
RabbitV	RabbitV is a highly optimized and practical toolkit for the detection of viruses and microorganisms in sequencing data.	Genobioinfo Cluster: Ask for Install
RATT	RATT is software to transfer annotation from a reference (annotated) genome to an unannotated query genome.	Genobioinfo Cluster: Ask for Install
RdRpCATCH	A community effort to create a shared resource for HMM-based RdRp discovery	Genobioinfo Cluster: How to use
REPET	The REPET package (Flutre et al., 2011) integrates bioinformatics programs in order to tackle biological issues at the genomic scale.	Genobioinfo Cluster: How to use
ResFinder	ResFinder identifies acquired antimicrobial resistance genes in total or partial sequenced isolates of bacteria.	Genobioinfo Cluster: How to use
Resistify	Resistify is a program which rapidly identifies and classifies plant resistance genes from protein sequences. It is designed to be lightweight and easy to use.	Genobioinfo Cluster: How to use
RFPlasmid	RFPlasmid predicts plasmid contigs from assemblies using single copy marker genes, plasmid genes, and kmers.	Genobioinfo Cluster: How to use
RGAugury	A pipeline consisting of several scripts for genome-wide RGA prediction; most scripts in this package can work independently or together.	Genobioinfo Cluster: How to use
RGI	Software to predict resistomes from protein or nucleotide data, including metagenomics data, based on homology and SNP models.	Genobioinfo Cluster: How to use
SIFT4G	Sorting Intolerant From Tolerant For Genomes.	Genobioinfo Cluster: How to use
SIFT4G_Annotator	Annotating VCF files using the SIFT4G databases.	Genobioinfo Cluster: Ask for Install
SNAP	Gene prediction tool	Genobioinfo Cluster: How to use
snoGPS	Search for H/ACA snoRNA genes in a genomic sequence.	Genobioinfo Cluster: How to use
Snoscan	Search for C/D box methylation guide snoRNA genes in a genomic sequence.	Genobioinfo Cluster: How to use
SOBAcl	The Sequence Ontology Bioinformatics Analysis command line tool (SOBAcl) will generate a variety of tables, graphs and reports from the data in GFF3 files and format the output in a variety of ways.	Genobioinfo Cluster: Ask for Install
Sturgeon	Sturgeon is a CNS neural network classifier for tumour classification	Genobioinfo Cluster: How to use
T-lex	T-lex is a computational pipeline that detects presence and/or absence of annotated individual transposable elements (TEs) using next-generation sequencing (NGS) data.	Genobioinfo Cluster: Ask for Install
TAMA	Transcriptome Annotation by Modular Algorithms: this software was designed for processing Iso-Seq data and other long read transcriptome data.	Genobioinfo Cluster: How to use
TargetP	TargetP-2.0 server predicts the presence of N-terminal presequences: signal peptide (SP), mitochondrial transit peptide (mTP), chloroplast transit peptide (cTP) or thylakoid luminal transit peptide (lTP). For the sequences predicted to contain an N-terminal presequence a potential cleavage site is also predicted.	Genobioinfo Cluster: How to use
TEnest	TEnest is a tool for finding and annotating transposable element (TE) insertions.	Genobioinfo Cluster: How to use
Tiberius	Tiberius is a deep learning-based ab initio gene structure prediction tool that end-to-end integrates convolutional and long short-term memory layers with a differentiable HMM layer.	Genobioinfo Cluster: How to use
Tn3+TA_finder	Tn3 Transposon/Toxin Finder (Tn3+TA_finder) is a program for the automatic prediction of transposable elements of the Tn3 family associated with type II toxin and antitoxin pairs in bacteria and archaea.	Genobioinfo Cluster: How to use
TnComp_finder	Composite Transposon Finder (TnComp_finder) is a program for the prediction of putative composite transposons in bacterial and archaeal genomes based on insertion sequence replicas in a relatively short span.	Genobioinfo Cluster: How to use
TOGA	TOGA is a new method that integrates gene annotation, inferring orthologs and classifying genes as intact or lost.	Genobioinfo Cluster: How to use
TransDecoder	TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.	Genobioinfo Cluster: How to use
Transposome	Transposome is a command line application to annotate transposable elements from paired-end whole genome shotgun data.	Genobioinfo Cluster: Ask for Install
transposon_annotation_tools	A set of bioconda packages for transposon annotation and transposon feature annotation in nucleotide sequences. transposon_annotation_tools is part of TransposonUltimate. The package includes a series of transposable element discovery tools, such as: MUSTv2, HelitronScanner, SineFinder, MiteTracker, MiteFinderII, SineScan, TirVish, LtrHarvest, RepeatModeler, TransposonPSI, and TransposonProteinNCBICDD1000. You can then use these tools independently.	Genobioinfo Cluster: How to use
transposon_classifier_rfsb	Transposon classification tool for nucleotide sequence classification, providing classification, model training and prediction evaluation. RFSB is part of TransposonUltimate.	Genobioinfo Cluster: How to use
TRASH	Tandem Repeat Annotation and Structural Hierarchy: a package to identify and extract tandem repeats in genome sequences and investigate their higher order structures.	Genobioinfo Cluster: How to use
Trinotate	Trinotate is a comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms.	Genobioinfo Cluster: Ask for Install
tRNAscan-SE	Search for tRNA genes in genomic sequence.	Genobioinfo Cluster: How to use
TSEBRA	TSEBRA is a combiner tool that selects transcripts from gene predictions based on the support by extrinsic evidence in form of introns and start/stop codons. It was developed to combine BRAKER1 and BRAKER2 predictions to increase their accuracies.	Genobioinfo Cluster: How to use
VARUS	VARUS automates the selection and download of a limited number of RNA-seq reads from NCBI's Sequence Read Archive (SRA) targeting a sufficiently high coverage for many genes for the purpose of gene-finder training and genome annotation.	Genobioinfo Cluster: Ask for Install
vcf2maf	Convert a VCF into a MAF, where each variant is annotated to only one of all possible gene isoforms.	Genobioinfo Cluster: How to use
VDJ-insights	VDJ-Insights is a robust software package for accurate annotation of the V, D, and J gene segments within immunoglobulin (IG) and T-cell receptor (TCR) genomic regions.	Genobioinfo Cluster: How to use
VIRify	VIRify is a pipeline for the detection, annotation, and taxonomic classification of viral contigs in metagenomic and metatranscriptomic assemblies.	Genobioinfo Cluster: How to use
VirulenceFinder	VirulenceFinder identifies virulence genes in total or partial sequenced isolates of bacteria - at the moment only E. coli, Enterococcus, S. aureus and Listeria are available.	Genobioinfo Cluster: Ask for Install
WoLFPSort	WoLF PSORT is an extension of the PSORT II program for protein subcellular localization prediction, which is based on the PSORT principle. WoLF PSORT converts a protein's amino acid sequences into numerical localization features; based on sorting signals, amino acid composition and functional motifs.	Genobioinfo Cluster: How to use

Assembly

Application	Description	Availability/Use
3D-DNA	3D de novo assembly (3D-DNA) pipeline.	Genobioinfo Cluster: How to use
ABySS	ABySS (Assembly By Short Sequences) is a de novo, parallel, paired-end sequence assembler that is designed for short reads.	Genobioinfo Cluster: How to use
AGC	Assembled Genomes Compressor (AGC) is a tool designed to compress collections of de-novo assembled genomes. It can be used for various types of datasets: short genomes (viruses) as well as long (humans).	Genobioinfo Cluster: How to use
ALLHiC	Phasing and scaffolding polyploid genomes based on Hi-C data	Genobioinfo Cluster: How to use
AllPaths-LG	ALLPATHS-LG is a whole genome shotgun assembler that can generate high quality genome assemblies using short reads (~100bp) such as those produced by the new generation of sequencers. The significant difference between ALLPATHS and traditional assemblers such as Arachne is that ALLPATHS assemblies are not necessarily linear, but instead are presented in the form of a graph. This graph representation retains ambiguities, such as those arising from polymorphism, uncorrected read errors, and unresolved repeats, thereby providing information that has been absent from previous genome assemblies.	Genobioinfo Cluster: Ask for Install
AMOS	A Modular, Open-Source whole genome assembler.	Genobioinfo Cluster: Ask for Install
AnchorWave	AnchorWave (Anchored Wavefront Alignment) identifies collinear regions via conserved anchors (full-length CDS and full-length exon have been implemented currently) and breaks collinear regions into shorter fragments, i.e., anchor and inter-anchor intervals.	Genobioinfo Cluster: Ask for Install
Aquila	Diploid personal genome assembly and comprehensive variant detection based on linked-reads.	Genobioinfo Cluster: Ask for Install
ARBitR	ARBitR is an overlap aware genome assembly scaffolder for linked sequencing reads.	Genobioinfo Cluster: Ask for Install
ARC	ARC is a pipeline which facilitates iterative, reference guided de novo assemblies with the intent of: - Reducing time in analysis and increasing accuracy of results by only considering those reads which should assemble together. - Reducing/removing reference bias as compared to mapping based approaches.	Genobioinfo Cluster: Ask for Install
ARCS	Scaffolding genome sequence assemblies using 10X Genomics GemCode/Chromium data	Genobioinfo Cluster: Ask for Install
ARKS	Scaffolding genome sequence assemblies using 10X Genomics GemCode/Chromium data. This project is a new kmer-based (alignment free) implementation of ARCS. It provides improved runtime performance over the original ARCS implementation by removing the requirement to perform alignments with bwa mem.	Genobioinfo Cluster: Ask for Install
ASAPy	Assemble species by automatic partitioning. This is a Python wrapper for ASAP.
ASGART	ASGART (A Segmental duplications Gathering and Refinement Tool) is a multiplatform (GNU/Linux, macOS, Windows) tool designed to search for large duplications amongst one or two DNA strands.	Genobioinfo Cluster: Ask for Install
assemblathon2	This repo contains a motley assortment of unpublished scripts and commands used by Ian Korf, Keith Bradnam, and Joe Fass in the analysis of Assemblathon 2 competition entries (assemblies).	Genobioinfo Cluster: How to use
assembly-stats	Get assembly statistics from FASTA and FASTQ files.	Genobioinfo Cluster: Ask for Install
Assemblytics	Assemblytics is a bioinformatics tool to detect and analyze structural variants from a genome assembly by comparing it to a reference genome.	Genobioinfo Cluster: Ask for Install
Assexon	Assembling Exon Using Gene Capture Data	Genobioinfo Cluster: Ask for Install
aTRAM	aTRAM ("automated target restricted assembly method") is an iterative assembler that performs reference-guided local de novo assemblies using a variety of available methods.	Genobioinfo Cluster: Ask for Install
Autocycler	A tool for generating consensus long-read assemblies for bacterial genomes.	Genobioinfo Cluster: How to use
AutoHiC	AutoHiC is a deep learning tool that uses Hi-C data to support genome assembly. It can automatically correct errors during genome assembly and generate genomes at the chromosome level.	Genobioinfo Cluster: How to use
Bandage-NG	Bandage-NG is a GUI program that allows users to interact with the assembly graphs made by de novo assemblers such as SPAdes, MEGAHIT and others.	Genobioinfo Cluster: How to use
BCALM2	A bioinformatics tool for constructing the compacted de Bruijn graph from sequencing data.	Genobioinfo Cluster: Ask for Install
Bifrost	Highly parallel construction and indexing of colored and compacted de Bruijn graphs.	Genobioinfo Cluster: How to use
BiSCoT	BiSCoT is a tool that aims to improve the contiguity of scaffolds and contigs generated after a Bionano scaffolding.	Genobioinfo Cluster: Ask for Install
Bridger	Bridger is an efficient de novo transcriptome assembler for RNA-Seq data.	Genobioinfo Cluster: How to use
BUSCO	BUSCO v2 provides quantitative measures for the assessment of genome assembly, gene set, and transcriptome completeness, based on evolutionarily-informed expectations of gene content from near-universal single-copy orthologs selected from OrthoDB v9. BUSCO assessments are implemented in open-source software, with a large selection of lineage-specific sets of Benchmarking Universal Single-Copy Orthologs. These conserved orthologs are ideal candidates for large-scale phylogenomics studies, and the annotated BUSCO gene models built during genome assessments provide a comprehensive gene predictor training set for use as part of genome annotation pipelines.	Genobioinfo Cluster: How to use
BUSCO_phylogenomics	This is a Python pipeline to construct species phylogenies using BUSCO proteins. It works directly from BUSCO output and can generate concatenated supermatrix alignments and also gene trees of BUSCO families. The pipeline identifies BUSCO proteins that are complete and single-copy in all input samples. Alternatively, you can account for missing data and choose to include BUSCO proteins that are complete and single-copy in a certain percentage of input samples. Each BUSCO family is individually aligned, trimmed, and then concatenated together to generate a supermatrix alignment. The pipeline also identifies BUSCO proteins that are complete and single-copy in at least 4 input samples, and generates gene trees for each of these families.	Genobioinfo Cluster: How to use
BWISE	Software for genome assembly using short-reads.	Genobioinfo Cluster: How to use
CAMI-AMBER	AMBER is an evaluation package for the comparative assessment of genome reconstructions and taxonomic assignments from metagenome benchmark datasets.	Genobioinfo Cluster: Ask for Install
CAMISIM	CAMISIM is a software to model abundance distributions of microbial communities and to simulate corresponding shotgun metagenome datasets.	Genobioinfo Cluster: How to use
canu	A single molecule sequence assembler for genomes large and small.	Genobioinfo Cluster: How to use
CAP3	A DNA Sequence Assembly Program	Genobioinfo Cluster: How to use
CARNAC-LR	Clustering coefficient-based Acquisition of RNA Communities in Long Reads.	Genobioinfo Cluster: Ask for Install
CheckM	Assess the quality of microbial genomes recovered from isolates, single cells, and metagenomes.	Genobioinfo Cluster: How to use
chromeister	A dotplot generator for large chromosomes.	Genobioinfo Cluster: How to use
Chromonomer	Chromonomer is a program designed to integrate a genome assembly with a genetic map.	Genobioinfo Cluster: Ask for Install
Circlator	A tool to circularize genome assemblies	Genobioinfo Cluster: How to use
CliqueSNV	Scalable Reconstruction of Intra-Host Viral Populations from NGS Reads.	Genobioinfo Cluster: Ask for Install
COMEBin	COMEBin allows effective binning of metagenomic contigs using COntrastive Multi-viEw representation learning.	Genobioinfo Cluster: How to use
compleasm	A genome completeness evaluation tool based on miniprot.	Genobioinfo Cluster: How to use
CulebrONT	An open-source, scalable, modular and traceable Snakemake pipeline, able to launch multiple assembly tools in parallel, giving you the possibility to circularise, polish, and correct assemblies, and to check their quality. CulebrONT can help to choose the best assembly among all possibilities.	Genobioinfo Cluster: Ask for Install
DAmar	Long read QC, assembly and scaffolding pipeline for PacBio or Oxford Nanopore long-read sequencing data. The pipeline produces a number of QC metrics at various stages as well as incorporating further technologies including Bionano, 10x and HiC data to scaffold the created contigs. DAmar is a hybrid of the earlier Marvel, Dazzler, and Daccord systems of the Eugene Myers lab.	Genobioinfo Cluster: Ask for Install
DAZZ_DB	To facilitate the multiple phases of the dazzler assembler, we organize all the read data into what is effectively a "database" of the reads and their meta-information.	Genobioinfo Cluster: Ask for Install
DBG2OLC	The genome assembler that reduces the computational time of human genome assembly from 400,000 CPU hours to 2,000 CPU hours, utilizing long erroneous 3GS sequencing reads and short accurate NGS sequencing reads.	Genobioinfo Cluster: Ask for Install
DENTIST	DENTIST is a sensitive, highly-accurate and automated pipeline method to close gaps in (short read) assemblies with long reads.	Genobioinfo Cluster: Ask for Install
Discovar	Assemble genomes and find variants with DISCOVAR & DISCOVAR de novo	Genobioinfo Cluster: Ask for Install
drap	De novo RNA-seq Assembly Pipeline	Genobioinfo Cluster: How to use
dRep	dRep is a python program for rapidly comparing large numbers of genomes. dRep can also "de-replicate" a genome set by identifying groups of highly similar genomes and choosing the best representative genome for each genome set.	Genobioinfo Cluster: How to use
EUPAN	Toolkit that integrates various software in order to build eukaryotic pangenomes.	Genobioinfo Cluster: Ask for Install
FABuLOUS	A gap-closing software tool that uses error-prone long reads generated by third-generation-sequence techniques (Pacbio, Oxford Nanopore, etc.) or preassembled contigs to fill N-gap in the genome assembly. Initially called TGS-GapCloser.	Genobioinfo Cluster: Ask for Install
FALCON	Falcon: a set of tools for fast aligning long reads for consensus and assembly	Genobioinfo Cluster: Ask for Install
FALCON_unzip	Making diploid assembly becomes common practice for genomic study	Genobioinfo Cluster: Ask for Install
FALCON-Phase	FALCON-Phase integrates PacBio long-read assemblies with Phase Genomics Hi-C data to create phased, diploid, chromosome-scale scaffolds.	Genobioinfo Cluster: Ask for Install
Fast-Plast	Fast-Plast is a pipeline that leverages existing and novel programs to quickly assemble, orient, and verify whole chloroplast genome sequences.	Genobioinfo Cluster: Ask for Install
FCS	The NCBI Foreign Contamination Screen (FCS) is a tool suite (FCS-adaptor and FCS-gx) for identifying and removing contaminant sequences in genome assemblies.	Genobioinfo Cluster: How to use
FLASH	FLASH, Fast Length Adjustment of SHort reads, is a very accurate fast tool to merge paired-end reads from fragments that are shorter than twice the length of reads. The extended length of reads has a significant positive impact on improvement of genome assemblies.	Genobioinfo Cluster: How to use
Flye	Flye is a de novo assembler for long and noisy reads, such as those produced by PacBio and Oxford Nanopore Technologies.	Genobioinfo Cluster: How to use
FMLRC2	FMLRC2 performs error correction/polishing of long erroneous sequences with accurate short reads. As such, it can be used as both an error-correction tool for raw long reads (ex. Oxford Nanopore) and a polishing tool for de novo assemblies.	Genobioinfo Cluster: How to use
GAAS	Genome Assembly Annotation Service: Suite of tools related to Genome Assembly Annotation Service tasks.	Genobioinfo Cluster: Ask for Install
GapCloser	The GapCloser is designed to close the gaps emerging during the scaffolding process by SOAPdenovo, using the abundant pair relationships of short reads.	Genobioinfo Cluster: How to use
GCI	Genome Continuity Inspector (GCI) is an assembly assessment tool for high-quality genomes (e.g. T2T genomes), in base resolution.	Genobioinfo Cluster: How to use
GenomeScope2.0	Reference-free profiling of polyploid genomes	Genobioinfo Cluster: How to use
GetOrganelle	This toolkit assembles organelle genomes from genome skimming data.	Genobioinfo Cluster: How to use
gfastats	A single fast and exhaustive tool for summary statistics and simultaneous fa (fasta, fastq, gfa [.gz]) genome assembly file manipulation.	Genobioinfo Cluster: Ask for Install
GLIMPSE	GLIMPSE is a phasing and imputation method for large-scale low-coverage sequencing studies.	Genobioinfo Cluster: How to use
GraphUnzip	Unzip assembly graphs with Hi-C data and/or long reads.	Genobioinfo Cluster: Ask for Install
Hairsplitter	Software that separates very close sequences that have been collapsed during assembly. Uses only long reads.	Genobioinfo Cluster: How to use
Hap10	The goal is to reconstruct accurate and long haplotypes of polyploid genomes using linked reads.	Genobioinfo Cluster: Ask for Install
HapHiC	HapHiC is an allele-aware scaffolding tool that uses Hi-C data to scaffold haplotype-phased genome assemblies into chromosome-scale pseudomolecules.	Genobioinfo Cluster: How to use
HAPHPIPE	NGS viral assembly and population genetics.	Genobioinfo Cluster: How to use
HaploHiC	Comprehensive haplotype division of Hi-C PE-reads based on local contacts ratio.	Genobioinfo Cluster: Ask for Install
HaploMerger2	HaploMerger2 (HM2) is an important upgrade over HaploMerger. HM2 is an easy-to-use automated pipeline for improving genome assembly in the post-assembly stage. It consists of a set of executables as well as wrappers for several third-party software.	Genobioinfo Cluster: Ask for Install
Hapo-G	Hapo-G is a tool that aims to improve the quality of genome assemblies by polishing the consensus with accurate reads.	Genobioinfo Cluster: How to use
HASLR	HASLR is a tool for rapid genome assembly of long sequencing reads. HASLR is a hybrid tool which means it requires long reads generated by Third Generation Sequencing technologies (such as PacBio or Oxford Nanopore) together with Next Generation Sequencing reads (such as Illumina) from the same sample.	Genobioinfo Cluster: Ask for Install
HiCAssembler	Software to assemble contigs/scaffolds into chromosomes using Hi-C data.	Genobioinfo Cluster: How to use
hifiasm	Hifiasm is a fast haplotype-resolved de novo assembler for PacBio Hifi reads. Unlike most existing assemblers, hifiasm starts from uncollapsed genome.	Genobioinfo Cluster: How to use
hifiasm-meta	De novo metagenome assembler, based on hifiasm, a haplotype-resolved de novo assembler for PacBio Hifi reads.	Genobioinfo Cluster: How to use
IBA	Python script to assemble AHE (Anchored Hybrid Enrichment) data loci by loci. To summarize, raw sequencing reads for each species were filtered for quality using Trim Galore! v. 0.4.0 (Krueger, 2015), and assembled using the iterative baited assembly (IBA) Python script which employs USEARCH v. 7.0 (Edgar, 2010) and Bridger v. 2014-12-01 (Chang et al., 2015) to assemble loci for each taxon. MAFFT v. 7.245 (Katoh and Standley, 2013) was used to align assembled sequences, and the probe and flanking regions were separated with the Python script Extract_probe_region.py (Breinholt et al., 2018).	Genobioinfo Cluster: Ask for Install
Inspector	A tool for evaluating long-read de novo assembly results.	Genobioinfo Cluster: How to use
IPA	Improved Phased Assembler (IPA) is the official PacBio software for HiFi genome assembly. IPA was designed to utilize the accuracy of PacBio HiFi reads to produce high-quality phased genome assemblies. IPA is an end-to-end solution, starting with input reads and resulting in a polished assembly.	Genobioinfo Cluster: How to use
IRMA	IRMA was designed for the robust assembly, variant calling, and phasing of highly variable RNA viruses. Currently IRMA is deployed with modules for influenza and ebolavirus. IRMA is free to use and parallelizes computations for both cluster computing and single computer multi-core setups.	Genobioinfo Cluster: How to use
IsoLasso	IsoLasso is an algorithm to assemble transcripts and estimate their expression levels from RNA-Seq reads.	Genobioinfo Cluster: Ask for Install
IVA	Iterative Virus Assembler is a de novo assembler designed to assemble virus genomes that have no repeat sequences, using Illumina read pairs sequenced from mixed populations at extremely high and variable depth	Genobioinfo Cluster: Ask for Install
JASPER	JASPER (Jellyfish based Assembly Sequence Polisher for Error Reduction) is an efficient polishing tool for draft genomes. It uses accurate reads (PacBio HiFi or Illumina) to evaluate consensus quality and correct consensus errors in genome assemblies. JASPER is substantially faster than polishing methods based on sequence alignment, and more accurate than currently available k-mer based methods. The efficiency and scalability of JASPER allows one to use it to create personalized reference genomes for specific populations very efficiently, even for large sequenced populations, by polishing the reference genome, such as GRCh38 or chm13v2.0 for human, with Illumina reads sequenced from many individuals from the population. Please see this manuscript for more details: Guo A, Salzberg SL, Zimin AV. JASPER: A fast genome polishing tool that improves accuracy of genome assemblies. PLoS Comput Biol. 2023 Mar 31;19(3):e1011032. doi: 10.1371/journal.pcbi.1011032. PMID: 37000853; PMCID: PMC10096238.	Genobioinfo Cluster: How to use
JTK	A regional diploid genome assembler.	Genobioinfo Cluster: How to use
KAD	KAD is designed for evaluating the accuracy of nucleotide base quality of genome assemblies.	Genobioinfo Cluster: Ask for Install
khmer	In-memory nucleotide sequence k-mer counting, filtering, graph traversal and more.	Genobioinfo Cluster: How to use
klumpy	Klumpy is a bioinformatic tool for identifying possibly incorrectly assembled regions in a long-read based assembly, with the additional capabilities of annotating sequences given a set of query sequences.	Genobioinfo Cluster: How to use
KMC		Genobioinfo Cluster: How to use
KmerGenie	KmerGenie estimates the best k-mer length for genome de novo assembly.	Genobioinfo Cluster: How to use
LACHESIS	Software that uses Hi-C data for ultra-long-range scaffolding of de novo genome assemblies.	Genobioinfo Cluster: Ask for Install
lamassemble	Merge overlapping "long" DNA reads into a consensus sequence.	Genobioinfo Cluster: Ask for Install
Linker	Linker is a suite of C++ tools useful for interpreting long and linked read sequencing of cancer genomes.	Genobioinfo Cluster: Ask for Install
LINKS	LINKS is a scalable genomics application for scaffolding or re-scaffolding genome assembly drafts with long reads, such as those produced by Oxford Nanopore Technologies Ltd and Pacific Biosciences.	Genobioinfo Cluster: How to use
LJA	La Jolla Assembler (LJA) is a tool for genome assembly from HiFi reads based on de Bruijn graphs.	Genobioinfo Cluster: How to use
LongStitch	A genome assembly correction and scaffolding pipeline using long reads. Basically runs Tigmint, ntLink, ARKS.	Genobioinfo Cluster: Ask for Install
LR_Gapcloser	LR_Gapcloser is a gap closing tool using uncorrected or corrected long reads generated from Pacbio platform or Nanopore platform.	Genobioinfo Cluster: Ask for Install
LRScaf	TGS scaffolding. Improves draft genomes using long noisy reads.	Genobioinfo Cluster: Ask for Install
LRSIM	Simulator for Linked Reads: this package simulates whole genome sequencing using 10X Genomics Linked Read technology.	Genobioinfo Cluster: Ask for Install
LTR_retriever	LTR_retriever is a highly accurate and sensitive program for identification of LTR retrotransposons.	Genobioinfo Cluster: How to use
Mapsembler2	Mapsembler2 is a targeted assembly software. It takes as input any number of NGS raw read set(s) (fasta or fastq, gzipped or not) and a set of input sequences (starters).	Genobioinfo Cluster: Ask for Install
MARVEL	MARVEL consists of a set of tools that facilitate the overlapping, patching, correction and assembly of noisy (not so noisy ones as well) long reads.	Genobioinfo Cluster: Ask for Install
MaSuRCA	MaSuRCA is whole genome assembly software. It combines the efficiency of the de Bruijn graph and Overlap-Layout-Consensus (OLC) approaches. MaSuRCA can assemble data sets containing only short reads from Illumina sequencing or a mixture of short reads and long reads (Sanger, 454)	Genobioinfo Cluster: How to use
MAtCHap	An ultra fast algorithm for solving the single individual haplotype assembly problem.	Genobioinfo Cluster: Ask for Install
MECAT	MECAT is an ultra-fast Mapping, Error Correction and de novo Assembly Tool for single molecule sequencing (SMRT) reads.	Genobioinfo Cluster: Ask for Install
MEGAHIT	An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph	Genobioinfo Cluster: How to use
MeGAMerge	A tool to merge assembled contigs, long reads from metagenomic sequencing runs	Genobioinfo Cluster: Ask for Install
Merqury	Evaluate genome assemblies with k-mers and more	Genobioinfo Cluster: How to use
Meryl	A genomic k-mer counter (and sequence utility) with nice features.	Genobioinfo Cluster: How to use
MetaMDBG	A lightweight assembler for long and accurate metagenomics reads.	Genobioinfo Cluster: How to use
MetaWRAP	A flexible pipeline for genome-resolved metagenomic data analysis.	Genobioinfo Cluster: How to use
MGSE	MGSE can harness the power of files generated in genome sequencing projects to predict the genome size. Required are the FASTA file containing a high continuity assembly and a BAM file with all available reads mapped to this assembly.	Genobioinfo Cluster: Ask for Install
Miniasm	Ultrafast de novo assembly for long noisy reads (though having no consensus step).	Genobioinfo Cluster: How to use
MiniBUSCO	A genome completeness evaluation tool based on miniprot.	Genobioinfo Cluster: How to use
Minipolish	A tool for Racon polishing of miniasm assemblies.	Genobioinfo Cluster: How to use
MIRA	Whole genome shotgun and EST sequence assembler for Sanger, 454, and Solexa / Illumina.	Genobioinfo Cluster: How to use
MITGARD	MITGARD (Mitochondrial Genome Assembly from RNA-seq Data) is a computational tool designed to recover the mitochondrial genome from RNA-seq data of any Eukaryote species.	Genobioinfo Cluster: How to use
MITObim	The MITObim procedure (mitochondrial baiting and iterative mapping) represents a highly efficient approach to assembling novel mitochondrial genomes of non-model organisms directly from total genomic DNA derived NGS reads.	Genobioinfo Cluster: How to use
MitoFinder	Mitofinder is a pipeline to assemble mitochondrial genomes and annotate mitochondrial genes from trimmed read sequencing data. MitoFinder is also designed to find and annotate mitochondrial sequences in existing genomic assemblies (generated from Hifi/PacBio/Nanopore/Illumina sequencing data...)	Genobioinfo Cluster: How to use
MitoHiFi	MitoHiFi circularises, cuts and annotates the mitogenome from contigs assembled with PacBio HiFi reads and software such as HiCanu or Hifiasm.	Genobioinfo Cluster: How to use
mitoVGP	The Vertebrate Genomes Project Mitogenome Assembly Pipeline.	Genobioinfo Cluster: Ask for Install
MitoZ	MitoZ is a Python3-based toolkit which aims to automatically filter paired-end raw data (fastq files), assemble the genome, search for mitogenome sequences in the genome assembly result, annotate the mitogenome (genbank file as result), and visualize the mitogenome.	Genobioinfo Cluster: How to use
MiXCR	MiXCR is a universal software for fast and accurate analysis of raw T- or B- cell receptor repertoire sequencing data.	Genobioinfo Cluster: Ask for Install
MSI	MSI was designed for sequencing reads with higher error rates (e.g., as the ones produced by Nanopore's sequencers) but also works with reads with lower error rates (e.g., Illumina).	Genobioinfo Cluster: Ask for Install
MTG-Link	MTG-Link is a novel gap-filling tool for draft genome assemblies, dedicated to linked read data.	Genobioinfo Cluster: Ask for Install
myloasm	Myloasm is a de novo metagenome assembler for long-read sequencing data. It takes sequencing reads and outputs polished contigs in a single command.	Genobioinfo Cluster: How to use
Nanopolish	A nanopore consensus algorithm using a signal-level hidden Markov model. Signal-level algorithms for MinION data.	Genobioinfo Cluster: How to use
NanoSeq	Pipeline used at the LIPME to assemble plasmid and PCR sequences with Nanopore.	Genobioinfo Cluster: How to use
NaS	NaS is a hybrid approach developed to take advantage of data generated using the MinION device. It combines Illumina and Oxford Nanopore technologies to produce NaS (Nanopore Synthetic-long) reads.	Genobioinfo Cluster: Ask for Install
NECAT	NECAT is an error correction and de-novo assembly tool for Nanopore long noisy reads.	Genobioinfo Cluster: How to use
Newbler	Newbler is a software package for de novo DNA sequence assembly. It is designed specifically for assembling sequence data generated by the 454 GS-series of pyrosequencing platforms sold by 454 Life Science, a Roche diagnostic.	Genobioinfo Cluster: Ask for Install
NextDenovo	NextDenovo is a string graph-based de novo assembler for TGS long reads.	Genobioinfo Cluster: How to use
NOVOLoci	NOVOLoci is a haplotype-aware assembler for targeted assembly or whole genome assembly of small genomes. We currently recommend limiting the assembly size to regions <20 Mb in targeted-mode and diploid genomes that are <250 Mb in WG-mode, with a minimum sequencing depth of 10x per haplotype. If you do need to phase accurately and you have HiFi or R10 ONT data, it is advised to use Hifiasm, as it has a much shorter runtime. Currently it is only available for Nanopore; PacBio and hybrid options will be available soon.	Genobioinfo Cluster: How to use
NOVOPlasty	NOVOPlasty is a de novo assembler and variance caller for short circular genomes.	Genobioinfo Cluster: How to use
nQuire	A statistical framework for ploidy estimation using NGS short-read data.	Genobioinfo Cluster: How to use
ntEdit	ntEdit is a fast and scalable genomics application for polishing genome assembly drafts. It simplifies polishing and "haploidization" of gene and genome sequences with its re-usable Bloom filter design.	Genobioinfo Cluster: Ask for Install
ntJoin	Scaffolding draft assemblies using reference assemblies and minimizer graphs	Genobioinfo Cluster: How to use
Oases	Oases is a de novo transcriptome assembler designed to produce transcripts from short read sequencing technologies, such as Illumina, SOLiD, or 454 in the absence of any genomic assembly. It was developed by Marcel Schulz (MPI for Molecular Genomics) and Daniel Zerbino (previously at the European Bioinformatics Institute (EMBL-EBI), now at UC Santa Cruz). Oases uploads a preliminary assembly produced by Velvet, and clusters the contigs into small groups, called loci. It then exploits the paired-end read and long read information, when available, to construct transcript isoforms.	Genobioinfo Cluster: How to use
oatk	An organelle de novo genome assembly toolkit. (The installation includes OatkDB.)	Genobioinfo Cluster: How to use
ORG.asm	The ORGanelle ASseMbler aims to target the assembling of small sequences over-represented in a whole genome shotgun sequence dataset.	Genobioinfo Cluster: How to use
Organelle_PBA	OrganelleRef_PBA is a script to perform de novo PacBio assemblies of any organelle (chloroplast or mitochondrial genomes) using several programs.	Genobioinfo Cluster: Ask for Install
PASApipeline	PASA, acronym for Program to Assemble Spliced Alignments, is a eukaryotic genome annotation tool that exploits spliced alignments of expressed transcript sequences to automatically model gene structures, and to maintain gene structure annotation consistent with the most recently available experimental sequence data. PASA also identifies and classifies all splicing variations supported by the transcript alignments.	Genobioinfo Cluster: How to use
pb-assembly	PacBio® tools : everything needed to run Falcon and Unzip	Genobioinfo Cluster: Ask for Install
PECAT	PECAT is a phased error correction and assembly tool for long reads. It includes a haplotype-aware correction method and an efficient diploid assembly method.	Genobioinfo Cluster: How to use
Peregrine	Peregrine is a fast genome assembler for accurate long reads (length > 10kb, accuracy > 99%). It can assemble a human genome from 30x reads within 20 cpu hours from reads to polished consensus.	Genobioinfo Cluster: How to use
phasebook	phasebook is a novel approach for reconstructing the haplotypes of diploid genomes from long reads de novo, that is without the need for a reference genome.	Genobioinfo Cluster: Ask for Install
Pilon	Pilon is an automated genome assembly improvement and variant detection tool.	Genobioinfo Cluster: How to use
PlasForest	A random forest classifier of contigs to identify contigs of plasmid origin in contig and scaffold genomes.	Genobioinfo Cluster: How to use
Plass	Plass (Protein-Level ASSembler) is a software to assemble short read sequencing data on a protein level.	Genobioinfo Cluster: Ask for Install
plassembler	plassembler is a program that is designed for automated & fast assembly of plasmids in bacterial genomes that have been hybrid sequenced with long read & paired-end short read sequencing.	Genobioinfo Cluster: How to use
Platanus	Platanus is a novel de novo sequence assembler that can reconstruct genomic sequences of highly heterozygous diploids from massively parallel shotgun sequencing data.	Genobioinfo Cluster: Ask for Install
Platanus2	Platanus-allee (Platanus2) is a de novo haplotype assembler (phasing tool), which assembles each haplotype sequence in a diploid genome.	Genobioinfo Cluster: Ask for Install
plotsr	Tool to plot synteny and structural rearrangements between genomes.	Genobioinfo Cluster: How to use
Polypolish	Polypolish is a tool for polishing genome assemblies with short reads.	Genobioinfo Cluster: How to use
Purge_Dups	purge_dups is designed to remove haplotigs and contig overlaps in a de novo assembly based on read depth.	Genobioinfo Cluster: How to use
Purge_Haplotigs	Pipeline to help with curating heterozygous diploid genome assemblies (for instance when assembling using FALCON or FALCON-unzip).	Genobioinfo Cluster: Ask for Install
pypolca	pypolca is a Standalone Python re-implementation of the POLCA polisher from the MaSuRCA genome assembly and analysis toolkit.	Genobioinfo Cluster: How to use
quarTeT	quarTeT is a collection of tools for T2T genome assembly and basic analysis in automatic workflow.	Genobioinfo Cluster: How to use
QUAST	QUAST evaluates genome assemblies by computing various metrics	Genobioinfo Cluster: How to use
quickmerge	A simple and fast metassembler and assembly gap filler designed for long molecule based assemblies	Genobioinfo Cluster: Ask for Install
Ra	Ra is a fast and easy to use assembler for raw reads generated by third generation sequencing.	Genobioinfo Cluster: Ask for Install
Rebaler	Rebaler is a program for conducting reference-based assemblies using long reads. It relies mainly on minimap2 for alignment and Racon for making consensus sequences.	Genobioinfo Cluster: Ask for Install
Racon	Consensus module for raw de novo DNA assembly of long uncorrected reads.	Genobioinfo Cluster: How to use
RAFT	RAFT (Repeat Aware Fragmentation Tool) is an algorithm designed to improve assembly quality by rescuing contained reads. RAFT breaks long reads into smaller sub-reads by following an algorithm described in our preprint. The read fragmentation allows an OLC assembler to retain contained reads during string graph construction. When input reads have non-uniform lengths, retaining contained reads improves assembly contiguity and base-level accuracy. The inputs to RAFT include an error-corrected read file in FASTA/FASTQ format and an all-vs-all alignment file in PAF format. It performs read fragmentation and outputs the fragmented reads in FASTA format. We recommend users to use hifiasm for the initial steps (read error correction, all-vs-all overlap computation) and also for the final step (assembly of fragmented reads). The assembly output format of hifiasm is described here. The RAFT-hifiasm workflow is recommended for long accurate reads with non-uniform length distribution (e.g., ONT Duplex, or a mixture of ONT Duplex and HiFi reads). ONT UL reads can optionally be integrated during the final assembly step.	Genobioinfo Cluster: How to use
RaGOO	A tool to order and orient genome assembly contigs via Minimap2 alignments to a reference genome.	Genobioinfo Cluster: Ask for Install
Ragout	Ragout (Reference-Assisted Genome Ordering UTility) is a tool for chromosome assembly using multiple references.	Genobioinfo Cluster: Ask for Install
RagTag	RagTag, the successor to RaGOO, is a command line tool for reference-guided genome assembly improvement.	Genobioinfo Cluster: How to use
RAMPART	RAMPART is a configurable pipeline for de novo assembly of DNA sequence data. RAMPART is not a de novo assembler.	Genobioinfo Cluster: Ask for Install
Ranbow	Ranbow is a haplotype assembler for polyploid genomes.	Genobioinfo Cluster: Ask for Install
Ratatosk	Ratatosk is a phased error correction tool for erroneous long reads based on compacted and colored de Bruijn graphs built from accurate short reads.	Genobioinfo Cluster: Ask for Install
Raven	Raven is a de novo genome assembler for long uncorrected reads.	Genobioinfo Cluster: How to use
Ray	Assemble genomes in parallel using the message-passing interface	Genobioinfo Cluster: Ask for Install
REAPR	REAPR is a tool that evaluates the accuracy of a genome assembly using mapped paired end reads, without the use of a reference genome for comparison. It can be used in any stage of an assembly pipeline to automatically break incorrect scaffolds and flag other errors in an assembly for manual inspection. It reports mis-assemblies and other warnings, and produces a new broken assembly based on the error calls.	Genobioinfo Cluster: Ask for Install
Redundans	Redundans is a pipeline that assists an assembly of heterozygous/polymorphic genomes.	Genobioinfo Cluster: Ask for Install
REFMAKER	REFMAKER is a command-line and user-friendly pipeline providing different tools to create nuclear references from genomic assemblies of shotgun libraries.	Genobioinfo Cluster: How to use
REPdenovo	REPdenovo is designed for constructing repeats directly from sequence (paired-end) reads. It is based on the idea of frequent k-mer assembly. REPdenovo provides many functionalities, and can generate much longer repeats than existing tools. Internally, REPdenovo uses Jellyfish for k-mer counting, Velvet for assembly, and bwa to map reads on the Transposable Elements.	Genobioinfo Cluster: Ask for Install
riboSeed	Pipeline for using ribosomal flanking regions to improve bacterial genome assembly.	Genobioinfo Cluster: How to use
runBNG	An easy way to run BioNano genomic analysis.	Genobioinfo Cluster: Ask for Install
rust-mdbg	rust-mdbg is an ultra-fast minimizer-space de Bruijn graph (mdBG) implementation, geared towards the assembly of long and accurate reads such as PacBio HiFi.	Genobioinfo Cluster: How to use
SALSA	A tool to scaffold long read assemblies with Hi-C data	Genobioinfo Cluster: How to use
Scaff10X	Pipeline for scaffolding and breaking a genome assembly using 10x genomics linked-reads	Genobioinfo Cluster: Ask for Install
Scallop	Scallop is an accurate reference-based transcript assembler.	Genobioinfo Cluster: How to use
SDA	Segmental Duplication Assembler	Genobioinfo Cluster: Ask for Install
SECAPR	Used to process targeted sequencing (or Gene capture) data by applying assembly and subsequent mapping algorithms (reducing paralogs unlike Hybpiper).	Genobioinfo Cluster: Ask for Install
SGA	SGA is a de novo genome assembler based on the concept of string graphs. The major goal of SGA is to be very memory efficient, which is achieved by using a compressed representation of DNA sequence reads.	Genobioinfo Cluster: Ask for Install
Shasta	The goal of the Shasta long read assembler is to rapidly produce accurate assembled sequence using as input DNA reads generated by Oxford Nanopore flow cells.	Genobioinfo Cluster: How to use
shiver	shiver is a tool for mapping paired-end short reads to a custom reference sequence constructed using de novo assembled contigs, in order to minimise the biased loss of information that occurs from mapping to a reference that differs from the sample.	Genobioinfo Cluster: Ask for Install
SLICEMBLER	SLICEMBLER is a meta-assembler designed for ultra-deep sequencing data.	Genobioinfo Cluster: Ask for Install
SLR-superscaffolder	This is a scaffold assembler designed for stLFR reads. It uses the link-reads information from stLFR reads to assemble contigs to scaffolds.	Genobioinfo Cluster: Ask for Install
smartdenovo	SMARTdenovo is a de novo assembler for PacBio and Oxford Nanopore (ONT) data. It produces an assembly from all-vs-all raw read alignments without an error correction stage. It also provides tools to generate accurate consensus sequences, though platform-dependent consensus polishing tools (e.g. Quiver for PacBio or Nanopolish for ONT) are still required for higher accuracy.	Genobioinfo Cluster: How to use
SMRTLink	SMRT Link is the web-based end-to-end workflow manager for the Sequel™ System. (Installed in command-line mode on our cluster.)	Genobioinfo Cluster: How to use
soap.coverage	Calculates sequencing coverage or physical coverage, as well as duplication rate and details of specific blocks, for each segment and for the whole genome, using SOAP, Blat, Blast, BlastZ, mummer and MAQ alignment results, with multi-threading. Gzip files are supported.	Genobioinfo Cluster: Ask for Install
SOAPdenovo	SOAPdenovo is a novel short-read assembly method that can build a de novo draft assembly for human-sized genomes. The program is specially designed to assemble Illumina GA short reads. It creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.	Genobioinfo Cluster: Ask for Install
SOAPdenovo-Trans	SOAPdenovo-Trans is a de novo transcriptome assembler based on the SOAPdenovo framework, adapted to alternative splicing and different expression levels among transcripts.	Genobioinfo Cluster: Ask for Install
SPAdes	SPAdes (St. Petersburg genome assembler) is intended for both standard isolates and single-cell MDA bacteria assemblies.	Genobioinfo Cluster: How to use
SRAssembler	SRAssembler (Selective Recursive local Assembler) is a modular pipeline program that can assemble genomic DNA reads into contigs that are homologous to a query DNA or protein sequence.	Genobioinfo Cluster: Ask for Install
SSPACE	SSPACE standard is a stand-alone program for scaffolding pre-assembled contigs using NGS paired-read data.	Genobioinfo Cluster: Ask for Install
SSPACE-LongRead	SSPACE-LongRead is a stand-alone program for scaffolding pre-assembled contigs using long reads (e.g. PacBio RS reads).	Genobioinfo Cluster: Ask for Install
StringTie	StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It uses a novel network flow algorithm as well as an optional de novo assembly step to assemble and quantitate full-length transcripts representing multiple splice variants for each gene locus.	Genobioinfo Cluster: How to use
Supernova	Supernova is a software package for de novo assembly from Chromium Linked-Reads that are made from a single whole-genome library from an individual DNA source.	Genobioinfo Cluster: Ask for Install
SuperTAD	SuperTAD is an open-source command-line TAD detection package written in C++. It takes either raw or normalized Hi-C contact maps as inputs.	Genobioinfo Cluster: How to use
TACO	Multi-sample transcriptome assembly from RNA-Seq.	Genobioinfo Cluster: How to use
Tapestry	Tapestry is a tool to validate and edit small eukaryotic genome assemblies using long sequence reads. It is designed to help identify complete chromosomes, symbionts, haplotypes, complex features and errors in close-to-complete genome assemblies.	Genobioinfo Cluster: Ask for Install
TGICL	This package automates clustering and assembly of a large EST/mRNA dataset. The clustering is performed by a slightly modified version of NCBI's megablast, and the resulting clusters are then assembled using the CAP3 assembly program. TGICL starts with a large multi-FASTA file (and an optional peer quality values file) and outputs the assembly files as produced by CAP3.	Genobioinfo Cluster: How to use
TGSGapFiller	A gap filling tool that uses error-prone long reads generated by third-generation-sequence techniques (Pacbio, Oxford Nanopore, etc.) or preassembled contigs to fill N-gap in the genome assembly.	Genobioinfo Cluster: Ask for Install
Tigmint	Tigmint identifies and corrects misassemblies using linked reads from 10x Genomics Chromium.	Genobioinfo Cluster: Ask for Install
Transrate	Transrate is software for de-novo transcriptome assembly quality analysis.	Genobioinfo Cluster: How to use
trinityrnaseq	Trinity, developed at the Broad Institute and the Hebrew University of Jerusalem, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes.	Genobioinfo Cluster: How to use
Trycycler	Trycycler is a tool for generating consensus long-read assemblies for bacterial genomes.	Genobioinfo Cluster: How to use
Unicycler	Unicycler is an assembly pipeline for bacterial genomes.	Genobioinfo Cluster: How to use
Unimap	An EXPERIMENTAL fork of minimap2 optimized for assembly-to-reference alignment.	Genobioinfo Cluster: How to use
Velvet	Velvet is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454, developed by Daniel Zerbino and Ewan Birney at the European Bioinformatics Institute (EMBL-EBI), near Cambridge, in the United Kingdom. Velvet currently takes in short read sequences, removes errors then produces high quality unique contigs. It then uses paired-end read and long read information, when available, to retrieve the repeated areas between contigs.	Genobioinfo Cluster: How to use
Verkko	Verkko is a hybrid genome assembly pipeline developed for telomere-to-telomere assembly of PacBio HiFi and Oxford Nanopore reads.	Genobioinfo Cluster: How to use
verkko-fillet	verkko-fillet is an easy-to-use toolkit for cleaning Verkko assemblies.	Genobioinfo Cluster: How to use
ViennaRNA	Vienna RNA package allows RNA Secondary Structure Prediction and Comparison	Genobioinfo Cluster: How to use
ViQuaS	An improved reconstruction pipeline for viral quasispecies spectra generated by next-generation sequencing.	Genobioinfo Cluster: Ask for Install
Wengan	An accurate and ultra-fast genome assembler	Genobioinfo Cluster: Ask for Install
wgd	Python package and CLI for whole genome duplication analysis.	Genobioinfo Cluster: How to use
WGS-Assembler	Celera Assembler is a de novo whole-genome shotgun (WGS) DNA sequence assembler.	Genobioinfo Cluster: Ask for Install
wtdbg	Wtdbg2 is a de novo sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT).	Genobioinfo Cluster: How to use
YaHS	YaHS is a scaffolding tool using Hi-C data.	Genobioinfo Cluster: How to use

ATAC-Seq

Application	Description	Availability/Use
CellRanger ARC	Cell Ranger ARC's pipelines analyze sequencing data produced from Chromium Single Cell Multiome ATAC + Gene Expression.	Genobioinfo Cluster: How to use
CellRanger ATAC	Cell Ranger ATAC is a set of analysis pipelines that process Chromium Single Cell ATAC data.	Genobioinfo Cluster: How to use
diffTF	Quantification of Differential Transcription Factor Activity and Multiomics-Based Classification into Activators and Repressors.	Genobioinfo Cluster: How to use
NucleoATAC	Python package for calling nucleosomes using ATAC-seq data. Also includes general scripts for working with paired-end ATAC-seq data (or potentially other paired-end data).	Genobioinfo Cluster: How to use
ROSE	To create stitched enhancers, and to separate super-enhancers from typical enhancers using sequencing data (.bam) given a file of previously identified constituent enhancers (.gff)	Genobioinfo Cluster: How to use
seqOutATACBias	A CLI that corrects the sequence bias of Tn5 transposase in ATAC-seq data using a rule ensemble model.	Genobioinfo Cluster: How to use
TOBIAS	TOBIAS is a collection of command-line bioinformatics tools for performing footprinting analysis on ATAC-seq data.	Genobioinfo Cluster: How to use

BS-seq analysis

Application	Description	Availability/Use
BamBam	Several simple-to-use tools to facilitate NGS analysis.	Genobioinfo Cluster: Ask for Install
BISCUIT	BISulfite-seq CUI Toolkit (BISCUIT) is a utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data.	Genobioinfo Cluster: How to use
Bismark	A tool to map bisulfite converted sequence reads and determine cytosine methylation states	Genobioinfo Cluster: How to use
BisSNP	Accurate combined SNP/Methylation calling.	Genobioinfo Cluster: Ask for Install
BS-Seeker2-	BS-Seeker2 is a seamless and versatile pipeline for fast and accurate mapping of bisulfite-treated reads.	Genobioinfo Cluster: Ask for Install
BS-SNPer	BS-SNPer is an ultrafast and memory-efficient package, a program for BS-Seq variation detection from alignments in standard BAM/SAM format using approximate Bayesian modeling.	Genobioinfo Cluster: Ask for Install
BSMAP	BSMAP is a short-read mapping software for bisulfite sequencing reads. Bisulfite treatment converts unmethylated cytosines into uracils (sequenced as thymine) and leaves methylated cytosines unchanged, hence providing a way to study DNA cytosine methylation at single-nucleotide resolution. BSMAP aligns the Ts in the reads to both Cs and Ts in the reference.	Genobioinfo Cluster: Ask for Install
bwa-meth	Fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome.	Genobioinfo Cluster: How to use
MethylScore	Identification of differentially methylated regions between multiple epigenomes from BS-treated read mappings via methylated region calling.	Genobioinfo Cluster: How to use
nf-core workflows	This module provides access to nf-core workflows, which are automatically downloaded into your home directory. More info on the nf-core/config page.	Genobioinfo Cluster: How to use
rastair	Fast processing of TET-assisted pyridine borane sequencing (TAPS)-based sequencing data.	Genobioinfo Cluster: How to use
segemehl	segemehl is a software to map short sequencer reads to reference genomes. segemehl implements a matching strategy based on enhanced suffix arrays (ESA).	Genobioinfo Cluster: Ask for Install
Trim Galore	A wrapper tool around Cutadapt and FastQC to consistently apply quality and adapter trimming to FastQ files, with some extra functionality for MspI-digested RRBS-type (Reduced Representation Bisulfite-Seq) libraries.	Genobioinfo Cluster: How to use
ViewBS	A powerful toolkit for visualization of high-throughput bisulfite sequencing data.	Genobioinfo Cluster: How to use
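
The BSMAP and bwa-meth entries above describe the core idea behind bisulfite read mapping: unmethylated cytosines are read as thymines, so a T in a read may correspond to either a C or a T in the reference. The following is a minimal Python sketch of that rule only, not the implementation used by either tool. The function names `bisulfite_convert` and `matches_bisulfite` are hypothetical, and the sketch ignores the reverse strand, mismatches, and indels.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate bisulfite treatment of one DNA strand.

    Unmethylated C is read as T; a methylated C (given by its 0-based
    index in methylated_positions) stays C.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def matches_bisulfite(read, ref):
    """Return True if the read is compatible with the reference under
    bisulfite rules: a T in the read may align to C or T in the
    reference, and every other base must match exactly."""
    if len(read) != len(ref):
        return False
    for r, g in zip(read.upper(), ref.upper()):
        if r == g:
            continue
        if r == "T" and g == "C":
            continue
        return False
    return True


reference = "ACGTCCGA"
# Assume the C at index 1 is methylated; the Cs at indices 4 and 5 are not.
read = bisulfite_convert(reference, methylated_positions={1})
print(read)
print(matches_bisulfite(read, reference))
```

Positions where the read keeps a C against a reference C are the ones a methylation caller would report as methylated.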

ChIP-seq analysis

Application	Description	Availability/Use
Alfred	BAM Statistics, Feature Counting and Annotation	Genobioinfo Cluster: Ask for Install
AlleleSeq	pipeline which constructs a diploid personal genome from genomic sequence variants of a family trio, including SNPs, indels and structural variants and maps functional genomic data onto this personal genome.	Genobioinfo Cluster: Ask for Install
BroadPeak	BroadPeak broad peak calling algorithm for diffuse ChIP-seq datasets.	Genobioinfo Cluster: Ask for Install
csem	CSEM is a ChIP-Seq multi-read allocator. CSEM stands for ChIP-Seq multi-read allocation using Expectation-Maximization.	Genobioinfo Cluster: How to use
DANPOS2	A toolkit for Dynamic Analysis of Nucleosome and Protein Occupancy by Sequencing, version 2	Genobioinfo Cluster: Ask for Install
diffTF	Quantification of Differential Transcription Factor Activity and Multiomics-Based Classification into Activators and Repressors.	Genobioinfo Cluster: How to use
epic2	epic2 is an ultraperformant reimplementation of SICER. It focuses on speed, low memory overhead and ease of use.	Genobioinfo Cluster: How to use
GEM	GEM is a scientific software for studying protein-DNA interaction at high resolution using ChIP-seq/ChIP-exo data. It can also be applied to CLIP-seq and Branch-seq data.	Genobioinfo Cluster: How to use
IDR	The IDR (Irreproducible Discovery Rate) framework is a unified approach to measure the reproducibility of findings identified from replicate experiments and provide highly stable thresholds based on reproducibility.	Genobioinfo Cluster: How to use
MACS	We present Model-based Analysis of ChIP-Seq (MACS) on short reads sequencers such as Genome Analyzer (Illumina / Solexa). MACS empirically models the length of the sequenced ChIP fragments, which tends to be shorter than sonication or library construction size estimates, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome sequence, allowing for more sensitive and robust prediction. MACS compares favorably to existing ChIP-Seq peak-finding algorithms, is publicly available open source, and can be used for ChIP-Seq with or without control samples.	Genobioinfo Cluster: How to use
MAGIC	A tool for predicting transcription factors and cofactors driving gene sets using ENCODE data.	Genobioinfo Cluster: Ask for Install
MSPC	Improve Sensitivity and Specificity of Peak Calling, and Identify Consensus Regions	Genobioinfo Cluster: How to use
MuMRescueLite	MuMRescueLite is software that enables the use of sequence tags mapped to multiple genomic loci in expression analysis. When mapping short sequence tags from CAGE or ChIP-Seq to the genome, tags that map to multiple genomic loci (multi-mapping tags, or MuMs) are routinely omitted from further analysis, leading to experimental bias and reduced coverage. MuMRescueLite probabilistically reincorporates multi-mapping tags into mapped short read data with acceptable computational requirements.	Genobioinfo Cluster: Ask for Install
ROSE	To create stitched enhancers, and to separate super-enhancers from typical enhancers using sequencing data (.bam) given a file of previously identified constituent enhancers (.gff)	Genobioinfo Cluster: How to use
RSEG	The RSEG software package is aimed to analyze ChIP-Seq data, especially for identifying genomic regions and their boundaries marked by diffusive histone modification markers.	Genobioinfo Cluster: Ask for Install
SICER2	Redesigned and improved version of the original ChIP-seq broad peak calling tool SICER.	Genobioinfo Cluster: How to use

Database querying

Application	Description	Availability/Use
Back_to_sequences	Given a set of kmers (fasta / fastq <.gz> format) and a set of sequences (fasta / fastq <.gz> format), this tool will extract the sequences containing some of those kmers.	Genobioinfo Cluster: How to use
BaseSpace_CLI	Command line interface to work with BaseSpace Sequence Hub data.	Genobioinfo Cluster: How to use
BOLDigger2	An even better Python program to query .fasta files against the COI database of www.boldsystems.org	Genobioinfo Cluster: How to use
DIAMOND2GO	Diamond2GO is a set of tools that can rapidly assign gene ontology and perform enrichment for functional genomics.	Genobioinfo Cluster: How to use
EMBLmyGFF3	An efficient way to convert gff3 annotation files into EMBL format ready to submit.	Genobioinfo Cluster: How to use
enaBrowserTools	enaBrowserTools is a set of scripts that interface with the ENA web services to download data from ENA easily, without any knowledge of scripting required.	Genobioinfo Cluster: How to use
ganon	ganon classifies DNA sequences against large sets of genomic reference sequences efficiently.	Genobioinfo Cluster: Ask for Install
gdc-client	The gdc-client provides several convenience functions over the GDC API which provides general download/upload via HTTPS.	Genobioinfo Cluster: How to use
gget	gget enables efficient querying of genomic reference databases.	Genobioinfo Cluster: Ask for Install
Infernal	Infernal ("INFERence of RNA ALignment") is for searching DNA sequence databases for RNA structure and sequence similarities. It is an implementation of a special case of profile stochastic context-free grammars called covariance models (CMs).	Genobioinfo Cluster: How to use
NCBI_datasets	NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases. You can use it to find and download sequence, annotation, and metadata for genes and genomes using our command-line interface (CLI) tools or NCBI Datasets web interface.	Genobioinfo Cluster: How to use
OMA	The OMA (Orthologous MAtrix) database is a well-established resource for identifying orthologs among publicly available complete genomes.	Genobioinfo Cluster: How to use
OrthoLoger	Standalone pipeline for delineation of orthologs.	Genobioinfo Cluster: Ask for Install
SIFT4G_Annotator	Annotating VCF files using the SIFT4G databases.	Genobioinfo Cluster: Ask for Install

DNA and protein language models

Application	Description	Availability/Use
art_modern	A modern re-implementation of the popular ART simulator with enhanced performance and functionality.	Genobioinfo Cluster: How to use
Badread	Badread is a long-read simulator tool that makes – you guessed it – bad reads! It can imitate many kinds of problems one might encounter in real long-read sets: chimeras, low-quality regions, systematic basecalling errors and more.	Genobioinfo Cluster: How to use
DTVF	Virulence Factor Prediction using Deep Learning.	Genobioinfo Cluster: How to use
Enformer	This package provides an implementation of the Enformer model and examples on running the model.	Genobioinfo Cluster: How to use
ESM-2	This repository contains code and pre-trained weights for Transformer protein language models from the Meta Fundamental AI Research Protein Team (FAIR), including our state-of-the-art ESM-2 and ESMFold, as well as MSA Transformer, ESM-1v for predicting variant effects and ESM-IF1 for inverse folding	Genobioinfo Cluster: How to use
Evo2	Evo 2 is a state of the art DNA language model for long context modeling and design.	Genobioinfo Cluster: How to use
PlantCAD	PlantCaduceus, with its short name of PlantCAD, is a plant DNA LM based on the Caduceus architecture, which extends the efficient Mamba linear-time sequence modeling framework to incorporate bi-directionality and reverse complement equivariance, specifically designed for DNA sequences.	Genobioinfo Cluster: How to use
ProtTrans	ProtTrans is providing state of the art pretrained language models for proteins.	Genobioinfo Cluster: How to use

Epigenetic

Application	Description	Availability/Use
BisSNP	Accurate combined SNP/Methylation calling.	Genobioinfo Cluster: Ask for Install
BS-SNPer	BS-SNPer is an ultrafast and memory-efficient package, a program for BS-Seq variation detection from alignments in standard BAM/SAM format using approximate Bayesian modeling.	Genobioinfo Cluster: Ask for Install
cgmaptools	Command-line Toolset for Bisulfite Sequencing Data Analysis.	Genobioinfo Cluster: How to use
CHEUI	CHEUI (Methylation (CH₃) Estimation Using Ionic current) is an RNA modification detection software for Oxford Nanopore direct RNA sequencing data. CHEUI can be used to detect m6A and m5C in individual reads at single-nucleotide resolution from any sample (e.g. single condition), or detect differential m6A or m5C between any two conditions. CHEUI uses a two-stage deep learning method to detect m6A and m5C transcriptome-wide at single-read and single-site resolution in any sequence context (i.e. without any sequence constraints).	Genobioinfo Cluster: How to use
CRISPR-broad	CRISPR-broad is a standalone tool that enables users to scan a genome for regions that have a high frequency of gRNA with user-supplied variation. The package is developed for the design of gRNA for targeted epigenome modifications over a broader region.	Genobioinfo Cluster: How to use
DeepSignal	Detecting methylation using signal-level features from Nanopore sequencing reads.	Genobioinfo Cluster: Ask for Install
mCaller	This program is designed to call m6A from nanopore data using the differences between measured and expected currents.	Genobioinfo Cluster: Ask for Install
MethyLasso	A segmentation approach to analyze DNA methylation patterns and identify differentially methylated regions from whole-genome datasets.	Genobioinfo Cluster: How to use
MethylDackel	MethylDackel (formerly named PileOMeth, which was a temporary name derived due to it using a PILEup to extract METHylation metrics) will process a coordinate-sorted and indexed BAM or CRAM file containing some form of BS-seq alignments and extract per-base methylation metrics from them. MethylDackel requires an indexed fasta file containing the reference genome as well.	Genobioinfo Cluster: How to use
MethylScore	Identification of differentially methylated regions between multiple epigenomes from BS-treated read mappings via methylated region calling.	Genobioinfo Cluster: How to use
modbam2bed	A program to aggregate modified base counts stored in a modified-base BAM file to a bedMethyl file.	Genobioinfo Cluster: How to use
nf-core workflows	This module provides access to nf-core workflows, which are automatically downloaded into your home directory. More info on the nf-core/config page.	Genobioinfo Cluster: How to use
rastair	Fast processing of TET-assisted pyridine borane sequencing (TAPS)-based sequencing data.	Genobioinfo Cluster: How to use
S3V2_IDEAS_ESMP	A package for normalizing, denoising and integrating epigenomic datasets across different cell types.	Genobioinfo Cluster: Ask for Install
Tombo	Tombo is a suite of tools primarily for the identification of modified nucleotides from raw nanopore sequencing data.	Genobioinfo Cluster: Ask for Install

Expression analysis: RNA-Seq, ncRNA expression ...

Application	Description	Availability/Use
ACFS	Accurate CircRNA Finder Suite. Discovering circRNAs from RNA-Seq data.	Genobioinfo Cluster: Ask for Install
Alfred	BAM Statistics, Feature Counting and Annotation	Genobioinfo Cluster: Ask for Install
AlleleSeq	pipeline which constructs a diploid personal genome from genomic sequence variants of a family trio, including SNPs, indels and structural variants and maps functional genomic data onto this personal genome.	Genobioinfo Cluster: Ask for Install
Arriba	Arriba is a command-line tool for the detection of gene fusions from RNA-Seq data.	Genobioinfo Cluster: How to use
BLAZE	Barcode identification from Long reads for AnalyZing single-cell gene Expression. SingleCell Nanopore sequencing data analysis.	Genobioinfo Cluster: How to use
BlockClust	BlockClust is an efficient approach to detect transcripts with similar processing patterns. We propose a novel way to encode expression profiles in compact discrete structures, which can then be processed using fast graph-kernel techniques. BlockClust allows both clustering and classification of small non-coding RNAs.	Genobioinfo Cluster: Ask for Install
bustools	bustools is a program for manipulating BUS files for single cell RNA-Seq datasets. It can be used to error correct barcodes, collapse UMIs, produce gene count or transcript compatibility count matrices, and is useful for many other tasks.	Genobioinfo Cluster: Ask for Install
canu	A single molecule sequence assembler for genomes large and small.	Genobioinfo Cluster: How to use
CellBender	CellBender is a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.	Genobioinfo Cluster: How to use
CellRanger	Cell Ranger is a set of analysis pipelines that processes Chromium single cell 3’ RNA-seq output to align reads, generate gene-cell matrices and perform clustering and gene expression analysis.	Genobioinfo Cluster: How to use
CellRanger ARC	Cell Ranger ARC's pipelines analyze sequencing data produced from Chromium Single Cell Multiome ATAC + Gene Expression.	Genobioinfo Cluster: How to use
chimerascan	chimerascan is a software package that detects gene fusions in paired-end RNA sequencing (RNA-Seq) datasets.	Genobioinfo Cluster: Ask for Install
ChimPipe	ChimPipe is a computational method for the detection of novel transcription-induced chimeric transcripts and fusion genes from Illumina Paired-End RNA-seq data. It combines junction spanning and paired-end read information to accurately detect chimeric splice junctions at base-pair resolution.	Genobioinfo Cluster: Ask for Install
ChopStitch	Exon annotation and splice graph construction using transcriptome assembly and whole genome sequencing data.	Genobioinfo Cluster: Ask for Install
circfull	A tool to detect and quantify full-length circRNA isoforms from circFL-seq.	Genobioinfo Cluster: How to use
circtools	A modular, python-based framework for circRNA-related tools that unifies several functionalities in a single, command line driven software.	Genobioinfo Cluster: Ask for Install
CIRI	CIRI (circRNA identifier) is a novel chiastic clipping signal based algorithm, which can unbiasedly and accurately detect circRNAs from transcriptome data by employing multiple filtration strategies.	Genobioinfo Cluster: Ask for Install
CleaveLand4	Analysis of degradome data to find sliced miRNA and siRNA targets	Genobioinfo Cluster: How to use
ContextMap2	Fast and accurate context-based RNA-seq mapping. ContextMap determines the most likely origin of a read by evaluating the context of the read in the form of alignments of other reads to the same genomic region. In the original implementation, the focus was on improving initial mappings provided by other mapping tools.	Genobioinfo Cluster: Ask for Install
CroCo	A program to detect potential cross contaminations in HTS assembled transcriptomes using expression level quantification.	Genobioinfo Cluster: Ask for Install
Cufflinks	Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one, taking into account biases in library preparation protocols.	Genobioinfo Cluster: How to use
devCellPy	devCellPy is a Python package designed for hierarchical multilayered classification of cells based on single-cell RNA-sequencing (scRNA-seq).	Genobioinfo Cluster: How to use
diffTF	Quantification of Differential Transcription Factor Activity and Multiomics-Based Classification into Activators and Repressors.	Genobioinfo Cluster: How to use
drap	De novo RNA-seq Assembly Pipeline	Genobioinfo Cluster: How to use
Enformer	This package provides an implementation of the Enformer model and examples on running the model.	Genobioinfo Cluster: How to use
ErmineJ	ErmineJ performs analyses of gene sets in high-throughput genomics data such as gene expression profiling studies.	Genobioinfo Cluster: Ask for Install
ERVmap	ERVmap is one part curated database of human proviral ERV loci and one part a stringent algorithm to determine which ERVs are transcribed in their RNA seq data.	Genobioinfo Cluster: Ask for Install
eXpress	eXpress is a streaming tool for quantifying the abundances of a set of target sequences from sampled subsequences.	Genobioinfo Cluster: How to use
FEELnc	FlExible Extraction of LncRNA.	Genobioinfo Cluster: How to use
FLAIR	FLAIR (Full-Length Alternative Isoform analysis of RNA) for the correction, isoform definition, and alternative splicing analysis of noisy reads. FLAIR has primarily been used for nanopore cDNA, native RNA, and PacBio sequencing reads.	Genobioinfo Cluster: Ask for Install
FLAMES	Full-length transcriptome splicing and mutation analysis.	Genobioinfo Cluster: How to use
G4Hunter	Re-evaluation of G-quadruplex propensity with G4Hunter. G4-Hunter: a new algorithm for predicting G-quadruplexes. G-quadruplexes are involved in gene expression regulation, DNA replication, RNA processing, and genome maintenance.
GUSHR	Assembly-free construction of UTRs from short read RNA-Seq data on the basis of coding sequence annotation. This tool has been adapted to the format needs of AUGUSTUS/BRAKER and employs GeMoMa for generating UTRs from RNA-Seq coverage data.	Genobioinfo Cluster: How to use
IRFinder	Detecting intron retention from RNA-Seq experiments	Genobioinfo Cluster: How to use
IsoLasso	IsoLasso is an algorithm to assemble transcripts and estimate their expression levels from RNA-Seq reads.	Genobioinfo Cluster: Ask for Install
kallisto	kallisto is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads.	Genobioinfo Cluster: How to use
LIQA	LIQA (Long-read Isoform Quantification and Analysis) is an Expectation-Maximization based statistical method to quantify isoform expression and detect differential alternative splicing (DAS) events using long-read RNA-seq data.	Genobioinfo Cluster: Ask for Install
MISO	MISO (Mixture-of-Isoforms) is a probabilistic framework that quantitates the expression level of alternatively spliced genes from RNA-Seq data, and identifies differentially regulated isoforms or exons across samples.	Genobioinfo Cluster: How to use
mmannot	mmannot annotates reads, or quantifies the features. For instance, suppose that you have sequenced your organism of interest with sRNA-Seq (RNA-Seq works too), and you want to know how many times you have sequenced miRNAs, rRNAs, tRNAs, etc. This is what mmannot does. A huge proportion of the reads may actually map at several locations. These multi-mapping reads are usually handled poorly by similar quantification tools. In our method, when a read maps at several locations, all these locations are inspected: if all these locations belong to the same feature (e.g. miRNAs, in the case of a duplicated gene family), the read is still annotated as a miRNA. If the locations belong to different features (e.g. 3'UTR and miRNA), the read is ambiguous, and is flagged as 3'UTR--miRNA. In the first case, we say we have rescued a read.	Genobioinfo Cluster: Ask for Install
mmquant	This tool counts the number of reads (produced by RNA-Seq) per gene, much like HTSeq-count and featureCounts. The main difference with other tools is that multi-mapping reads are counted differently: if a read is mapped to gene A, gene B, and gene C, the tool will create a new feature, "geneA--geneB--geneC", that will be counted once.	Genobioinfo Cluster: How to use
MuMRescueLite	MuMRescueLite is software that enables the use of sequence tags mapped to multiple genomic loci in expression analysis. When mapping short sequence tags from CAGE or ChIP-Seq to the genome, tags that map to multiple genomic loci (multi-mapping tags, or MuMs) are routinely omitted from further analysis, leading to experimental bias and reduced coverage. MuMRescueLite probabilistically reincorporates multi-mapping tags into mapped short read data with acceptable computational requirements.	Genobioinfo Cluster: Ask for Install
NanoCount	NanoCount estimates transcript abundance from Oxford Nanopore direct-RNA sequencing datasets, using an expectation-maximization approach like RSEM, Kallisto, salmon, etc. to handle the uncertainty of multi-mapping reads.	Genobioinfo Cluster: Ask for Install
nf-core workflows	This module provides access to nf-core workflows, which are automatically downloaded into your home directory. More info on the nf-core/config page.	Genobioinfo Cluster: How to use
oarfish	oarfish is a program, written in Rust, for quantifying transcript-level expression from long-read (i.e. Oxford nanopore cDNA and direct RNA and PacBio) sequencing technologies. `oarfish` requires a sample of sequencing reads aligned to the transcriptome (currently not to the genome). It handles multi-mapping reads through the use of probabilistic allocation via an expectation-maximization (EM) algorithm.	Genobioinfo Cluster: How to use
Pizzly	A program for detecting gene fusions from RNA-Seq data of cancer samples.	Genobioinfo Cluster: Ask for Install
Portcullis	Splice junction analysis and filtering from BAM files.	Genobioinfo Cluster: Ask for Install
PSGInfer	PSGInfer is a software package for the analysis of RNA-Seq data with probabilistic splice graph (PSG) models of gene alternative processing (splicing, transcription initiation, and polyadenylation)	Genobioinfo Cluster: Ask for Install
PSI-Sigma	Percent Spliced-In (PSI) values are commonly used to report alternative pre-mRNA splicing (AS) changes. However, previous PSI-detection methods are limited to specific types of AS events. PSI-Sigma uses a new splicing index (PSIΣ) that is more flexible, can incorporate novel junctions, and can compute PSI values of individual exons in complex splicing events.	Genobioinfo Cluster: Ask for Install
RdRpCATCH	A community effort to create a shared resource for HMM-based RdRp discovery	Genobioinfo Cluster: How to use
REDItools	REDItools are python scripts developed with the aim to study RNA editing at genomic scale by next generation sequencing data.	Genobioinfo Cluster: Ask for Install
REINDEER	Efficient indexing of k-mer presence and abundance in sequencing datasets.	Genobioinfo Cluster: How to use
rMATS	MATS is a computational tool to detect differential alternative splicing events from RNA-Seq data. From the RNA-Seq data, MATS can automatically detect and analyze alternative splicing events corresponding to all major types of alternative splicing patterns.	Genobioinfo Cluster: Ask for Install
rMATS turbo	rMATS turbo is the C/Cython version of rMATS (refer to http://rnaseq-mats.sourceforge.net) : Multivariate Analysis of Transcript Splicing (MATS). The major difference between rMATS turbo and rMATS is speed and space usage.	Genobioinfo Cluster: How to use
RSEM	RSEM (RNA-Seq by Expectation-Maximization) is a software package for estimating gene and isoform expression levels from RNA-Seq data.	Genobioinfo Cluster: How to use
RSeQC	RSeQC package provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data	Genobioinfo Cluster: How to use
Salmon	Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using lightweight alignments	Genobioinfo Cluster: How to use
SAMap	Mapping single-cell RNA sequencing datasets from evolutionarily distant organisms.	Genobioinfo Cluster: How to use
SARTools	SARTools is an R package dedicated to the differential analysis of RNA-seq data. It provides tools to generate descriptive and diagnostic graphs, to run the differential analysis with one of the well-known DESeq2 or edgeR packages and to export the results into easily readable tab-delimited files.	Genobioinfo Cluster: in R-4.5.0
sleuth	sleuth is a program for analysis of RNA-Seq experiments for which transcript abundances have been quantified with kallisto.	Genobioinfo Cluster: Ask for Install
SOAPdenovo-Trans	SOAPdenovo-Trans is a de novo transcriptome assembler based on the SOAPdenovo framework, adapted to alternative splicing and different expression levels among transcripts.	Genobioinfo Cluster: Ask for Install
SortMeRNA	SortMeRNA is a software designed to rapidly filter ribosomal RNA fragments from metatranscriptomic data produced by next-generation sequencers. It is capable of handling large RNA databases and sorting out all fragments matching the database with high accuracy and specificity.	Genobioinfo Cluster: How to use
souporcell	souporcell is a method for clustering mixed-genotype scRNAseq experiments by individual.	Genobioinfo Cluster: How to use
SpaceRanger	Space Ranger is a set of analysis pipelines that process Visium spatial RNA-seq output and brightfield and fluorescence microscope images in order to detect tissue, align reads, generate feature-spot matrices, perform clustering and gene expression analysis, and place spots in spatial context on the slide image.	Genobioinfo Cluster: How to use
SpliceGrapher	SpliceGrapher predicts alternative splicing patterns and produces splice graphs that capture in a single structure the ways a gene's exons may be assembled. It enhances gene models using evidence from next-generation sequencing and EST alignments.	Genobioinfo Cluster: Ask for Install
SpliceTools	A suite of downstream RNA splicing analysis tools to investigate mechanisms and impact of alternative splicing, Nucleic Acids Research, 2023.	Genobioinfo Cluster: How to use
SQANTI3	SQANTI3 is the newest version of the SQANTI tool, merging features from SQANTI and SQANTI2 together with new additions. SQANTI3 will continue as an integrated development aiming to provide the best possible characterization of your new long-read-defined transcriptome. SQANTI3 is the first module of the Functional IsoTranscriptomics (FIT) framework, which also includes IsoAnnot and tappAS.	Genobioinfo Cluster: How to use
SQuIRE	SQuIRE reveals locus-specific regulation of interspersed repeat expression, Nucleic Acids Research	Genobioinfo Cluster: How to use
STAR	RNA-seq aligner	Genobioinfo Cluster: How to use
STAR-Fusion	STAR-Fusion further processes the output generated by the STAR aligner to map junction reads and spanning reads to a reference annotation set (using a GTF file, ideally the same annotation file used during the STAR genome index building process during the initial STAR setup).	Genobioinfo Cluster: How to use
StringTie	StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It uses a novel network flow algorithm as well as an optional de novo assembly step to assemble and quantitate full-length transcripts representing multiple splice variants for each gene locus.	Genobioinfo Cluster: How to use
Subread	A tool kit for processing next-gen sequencing data	Genobioinfo Cluster: How to use
SUPPA	Fast, accurate, and uncertainty-aware differential splicing analysis across multiple conditions.	Genobioinfo Cluster: How to use
TACO	Multi-sample transcriptome assembly from RNA-Seq.	Genobioinfo Cluster: How to use
TAMA	Transcriptome Annotation by Modular Algorithms: this software was designed for processing Iso-Seq data and other long read transcriptome data.	Genobioinfo Cluster: How to use
TAPAS	Tool for Alternative Polyadenylation Site Analysis.	Genobioinfo Cluster: How to use
TPMCalculator	TPMCalculator quantifies mRNA abundance directly from the alignments by parsing BAM files.	Genobioinfo Cluster: How to use
Transrate	Transrate is software for de-novo transcriptome assembly quality analysis.	Genobioinfo Cluster: How to use
Trinotate	Trinotate is a comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms.	Genobioinfo Cluster: Ask for Install
Whippet	Lightweight and Fast RNA-seq quantification at the event-level	Genobioinfo Cluster: Ask for Install
YASIM	Yet Another SIMulator for Alternative Splicing Events and Realistic Gene Expression Profile. YASIM is the tool that simulates Next- or Third-Generation bulk RNA-Sequencing raw FASTQ reads with ground truth genome annotation and realistic gene expression profile (GEP). It can be used to benchmark tools that are claimed to be able to detect isoforms (e.g., StringTie) or quantify reads on an isoform level (e.g., featureCounts).	Genobioinfo Cluster: How to use
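
The mmquant entry above describes how multi-mapping reads are counted: a read that maps to several genes is assigned to a merged feature such as "geneA--geneB--geneC", which is counted once. The following minimal Python sketch illustrates that counting rule only and is not mmquant's actual code. The function name `count_with_merged_features`, the input format (a dictionary from read name to the genes it maps to), and the alphabetical ordering of gene names inside a merged feature are all assumptions made for illustration.

```python
from collections import Counter


def count_with_merged_features(read_to_genes):
    """Count reads per feature, merging the genes hit by a multi-mapping read.

    read_to_genes maps a read name to the list of genes it aligns to
    (hypothetical input format). A read hitting several genes is counted
    once, under a merged feature name joined with "--".
    """
    counts = Counter()
    for genes in read_to_genes.values():
        if not genes:
            continue  # unassigned read: nothing to count
        # Sorting is an assumption here, so that the same gene set always
        # yields the same merged feature name.
        feature = "--".join(sorted(set(genes)))
        counts[feature] += 1
    return counts


alignments = {
    "read1": ["geneA"],
    "read2": ["geneA", "geneB", "geneC"],
    "read3": ["geneB", "geneA", "geneC"],
    "read4": ["geneB"],
}
counts = count_with_merged_features(alignments)
for feature, n in sorted(counts.items()):
    print(feature, n)
```

With this rule, no read is discarded or counted more than once, at the cost of producing extra merged features that downstream analysis has to interpret.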

Gene fusion

Application	Description	Availability/Use
chimerascan	chimerascan is a software package that detects gene fusions in paired-end RNA sequencing (RNA-Seq) datasets.	Genobioinfo Cluster: Ask for Install
Pizzly	A program for detecting gene fusions from RNA-Seq data of cancer samples.	Genobioinfo Cluster: Ask for Install
STAR-Fusion	STAR-Fusion further processes the output generated by the STAR aligner to map junction reads and spanning reads to a reference annotation set (using a GTF file, ideally the same annotation file used during the STAR genome index building process during the initial STAR setup).	Genobioinfo Cluster: How to use

Genetic map

Application	Description	Availability/Use
Carthagene	CarthaGene is a genetic/radiated hybrid mapping software. CarthaGene looks for multiple populations maximum likelihood consensus maps using a fast EM algorithm for maximum likelihood estimation and powerful ordering algorithms. CarthaGene can handle data made up of several distinct populations which may each be either F2 backcross, recombinant inbred lines, F2 intercross, phase known outbreds and/or radiated hybrids (haploid and diploid data).	Genobioinfo Cluster: Ask for Install
Chromonomer	Chromonomer is a program designed to integrate a genome assembly with a genetic map.	Genobioinfo Cluster: Ask for Install
OMBlast	An alignment tool for optical mapping data.	Genobioinfo Cluster: Ask for Install

Genetics

Application	Description	Availability/Use
ASMC	Ascertained Sequentially Markovian Coalescent (contains ASMC and an extension, FastSMC, together with python bindings for both)	Genobioinfo Cluster: How to use
BAMscorer	BAMscorer can be used to conduct genomic assignment tests from BAM files. Assignments can be done on genomic regions, inversions, and whole-genome datasets.	Genobioinfo Cluster: Ask for Install
Beagle_Utilities	Simple utility programs for manipulating text files, especially VCF files.	Genobioinfo Cluster: Ask for Install
BeXY	BeXY is a tool to jointly infer sex karyotypes and sex-linked scaffolds from read count data. It can also be used to genetically sex single individuals. BeXY is a command-line tool, and we provide an easy-to-use R package to visualize and parse the results.	Genobioinfo Cluster: How to use
bonsaitree	Algorithm for automatically building pedigrees using IBD, Age, and Sex information.	Genobioinfo Cluster: How to use
Cas-OFFinder	Cas-OFFinder is an OpenCL-based, ultrafast and versatile program that searches for potential off-target sites of CRISPR/Cas-derived RNA-guided endonucleases (RGEN).	Genobioinfo Cluster: How to use
demuxlet	Genetic multiplexing of barcoded single cell RNA-seq	Genobioinfo Cluster: How to use
EMMAX	EMMAX is a statistical test for large scale human or model organism association mapping accounting for the sample structure. In addition to the computational efficiency obtained by the EMMA algorithm, EMMAX takes advantage of the fact that each locus explains only a small fraction of complex traits, which allows us to avoid a repetitive variance component estimation procedure, resulting in a significant reduction in the computational time of association mapping using mixed models.	Genobioinfo Cluster: How to use
evalAdmix	evalAdmix evaluates the results of an admixture analysis (i.e. the result of applying ADMIXTURE, STRUCTURE, NGSadmix and similar).	Genobioinfo Cluster: How to use
gamevar.f90	A software package for calculating individual genetic diversity.	Genobioinfo Cluster: How to use
GARLIC	GARLIC is a program for calling runs of homozygosity in genotype data. It implements the ROH calling method of Pemberton et al. AJHG (2012) and Blant et al. (2017)	Genobioinfo Cluster: How to use
GONE	This program calculates and uses linkage disequilibrium at genomic marker loci to infer the effective population size trajectories over a period of about 100-200 generations back in time.	Genobioinfo Cluster: How to use
hap-ibd	The hap-ibd program detects identity-by-descent (IBD) segments and homozygosity-by-descent (HBD) segments in phased genotype data.	Genobioinfo Cluster: How to use
loco-pipe	loco-pipe is an automated Snakemake pipeline that streamlines a set of essential population genomic analyses for low-coverage whole genome sequencing (lcWGS) data.	Genobioinfo Cluster: How to use
moments	moments implements methods for inferring demographic history and patterns of selection from genetic data, based on solutions to the diffusion approximations to the site-frequency spectrum (SFS).	Genobioinfo Cluster: How to use
PopART	PopART (Population Analysis with Reticulate Trees) is free, open source population genetics software that was developed as part of the Allan Wilson Centre Imaging Evolution Initiative.	Genobioinfo Cluster: How to use
PoPoolation	PoPoolation is a pipeline for analysing pooled next generation sequencing data. Currently PoPoolation can calculate Tajima’s Pi, Watterson’s Theta and Tajima’s D with a sliding window approach for chromosomes or for sets of genes.	Genobioinfo Cluster: How to use
POPS	The POPS program performs inference of ancestry distribution models.	Genobioinfo Cluster: How to use
pyrho	Fast demography-aware inference of fine-scale recombination rates based on fused-LASSO.	Genobioinfo Cluster: How to use
quickLD	High-performance Computation of Linkage Disequilibrium on CPUs and GPUs.	Genobioinfo Cluster: How to use
READv2	Relationship Estimation from Ancient DNA version 2.	Genobioinfo Cluster: How to use
SelNeTime	The selnetime python package implements methods for statistical analysis of genetic data collected from the same population at different times.	Genobioinfo Cluster: How to use
Vt	A tool set for short variant discovery in genetic sequence data.	Genobioinfo Cluster: How to use

Genomic association studies

Application	Description	Availability/Use
AlphaImpute	AlphaImpute is a software package for imputing and phasing genotype data in diploid populations with pedigree information.	Genobioinfo Cluster: Ask for Install
bgc	bgc implements Bayesian estimation of genomic clines to quantify introgression at many loci.	Genobioinfo Cluster: Ask for Install
BOLT-LMM	The BOLT-LMM software package consists of two main algorithms, the BOLT-LMM algorithm for mixed model association testing, and the BOLT-REML algorithm for variance components analysis (i.e., partitioning of SNP-heritability and estimation of genetic correlations).	Genobioinfo Cluster: How to use
coevol	Correlated evolution of substitution rates and quantitative traits.	Genobioinfo Cluster: How to use
Eagle	The Eagle software estimates haplotype phase either within a genotyped cohort or using a phased reference panel.	Genobioinfo Cluster: Ask for Install
EMMAX	EMMAX is a statistical test for large scale human or model organism association mapping accounting for the sample structure. In addition to the computational efficiency obtained by the EMMA algorithm, EMMAX takes advantage of the fact that each locus explains only a small fraction of complex traits, which allows us to avoid a repetitive variance component estimation procedure, resulting in a significant reduction in the computational time of association mapping using mixed models.	Genobioinfo Cluster: How to use
FaST-LMM	FaST-LMM (Factored Spectrally Transformed Linear Mixed Models) is a program for performing genome-wide association studies (GWAS) on large data sets.	Genobioinfo Cluster: Ask for Install
fineRADstructure	A complete, easy to use, and fast population inference package for RAD-seq data.	Genobioinfo Cluster: Ask for Install
GCTA	GCTA (Genome-wide Complex Trait Analysis) was originally designed to estimate the proportion of phenotypic variance explained by genome- or chromosome-wide SNPs for complex traits (the GREML method), and has subsequently been extended for many other analyses to better understand the genetic architecture of complex traits.	Genobioinfo Cluster: How to use
GEMMA	GEMMA is the software implementing the Genome-wide Efficient Mixed Model Association algorithm for a standard linear mixed model and some of its close relatives for genome-wide association studies (GWAS).	Genobioinfo Cluster: How to use
KING	KING is a toolset that makes use of high-throughput SNP data typically seen in a genome-wide association study (GWAS) or a sequencing project.	Genobioinfo Cluster: How to use
kmersGWAS	A library for running k-mers based GWAS.	Genobioinfo Cluster: How to use
lcMLkin	lcMLkin is a C++ program that allows users to infer biological relatedness from low coverage 2nd generation sequencing data	Genobioinfo Cluster: Ask for Install
Lep-MAP3	Lep-MAP3 is a novel and free software for linkage mapping. It can construct linkage maps on a very large number of markers and individuals on single or multiple families.	Genobioinfo Cluster: How to use
loco-pipe	loco-pipe is an automated Snakemake pipeline that streamlines a set of essential population genomic analyses for low-coverage whole genome sequencing (lcWGS) data.	Genobioinfo Cluster: How to use
MACH	MACH is a Markov Chain based haplotyper that can resolve long haplotypes or infer missing genotypes in samples of unrelated individuals.	Genobioinfo Cluster: Ask for Install
MapThin	Reduce the number of SNPs in a gene marker dense map computed by PLINK. First, by eliminating linked SNPs. Then, by applying different criteria.	Genobioinfo Cluster: Ask for Install
Minimac4	Minimac4 is the latest version in the series of genotype imputation software - preceded by Minimac3 (2015), Minimac2 (2014), minimac (2012) and MaCH (2010). Minimac4 is a lower memory and more computationally efficient implementation of the original algorithms with comparable imputation quality.	Genobioinfo Cluster: How to use
piMASS	Posterior inference via Model Averaging and Subset Selection: performs genome-wide joint analysis of all SNPs in association with a phenotype.	Genobioinfo Cluster: Ask for Install
PLINK	PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.	Genobioinfo Cluster: How to use
pyseer	Sequence Element Enrichment Analysis (SEER), python implementation. Pyseer uses linear models with fixed or mixed effects to estimate the effect of genetic variation in a bacterial population on a phenotype of interest, while accounting for potentially very strong confounding population structure. This allows for genome-wide association studies (GWAS) to be performed in clonal organisms such as bacteria and viruses.	Genobioinfo Cluster: How to use
regenie	regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.	Genobioinfo Cluster: How to use
Scoary	Scoary is designed to take the gene_presence_absence.csv file from Roary as well as a traits file created by the user and calculate the associations between all genes in the accessory genome and the traits.	Genobioinfo Cluster: How to use
snpflip	Report reverse and ambiguous strand SNPs in GWAS data.	Genobioinfo Cluster: How to use
TASSEL	Trait Analysis by aSSociation, Evolution and Linkage. TASSEL has multiple functions, including association study, evaluating evolutionary relationships, analysis of linkage disequilibrium, principal component analysis, cluster analysis, missing data imputation and data visualization for large sets of data.	Genobioinfo Cluster: How to use

Genotyping

Application	Description	Availability/Use
ACACIA	Allele CAlling proCedure for Illumina Amplicon sequencing data: This workflow aims at extracting allele information out of paired-end Illumina FASTQ files.	Genobioinfo Cluster: How to use
AmpliSAT	AmpliSAT (Amplicon Sequencing Analysis Tools) is a set of online tools that make the analysis of Amplicon Sequencing experiments easy.	Genobioinfo Cluster: How to use
Beagle_Utilities	Simple utility programs for manipulating text files, especially VCF files.	Genobioinfo Cluster: Ask for Install
Cellsnp-lite	Cellsnp-lite is a C/C++ tool for efficient genotyping of bi-allelic SNPs on single cells. You can use cellsnp-lite after read alignment to obtain the SNP x cell pileup UMI or read count matrices for each allele of given or detected SNPs.	Genobioinfo Cluster: How to use
cgMLSTFinder	Core genome Multi-Locus Sequence Typing. cgMLSTFinder runs KMA against a chosen core genome MLST (cgMLST) database and outputs the detected alleles in a matrix file.	Genobioinfo Cluster: Ask for Install
chewBBACA	chewBBACA is a software suite for the creation and evaluation of core genome and whole genome MultiLocus Sequence Typing (cg/wgMLST) schemas and results. The "BBACA" stands for "BSR-Based Allele Calling Algorithm". BSR stands for BLAST Score Ratio as proposed by Rasko DA et al. The "chew" part adds extra coolness to the name and could be thought of as "Comprehensive and Highly Efficient Workflow".	Genobioinfo Cluster: How to use
evalAdmix	evalAdmix evaluates the results of an admixture analysis (i.e. the result of applying ADMIXTURE, STRUCTURE, NGSadmix and similar).	Genobioinfo Cluster: How to use
fastPHASE	A tool for genotype imputation and estimating missing haplotypes.	Genobioinfo Cluster: Ask for Install
flare	The flare program uses a set of reference haplotypes to infer the ancestry of each allele in a set of admixed study samples. The flare program is fast, accurate, and memory-efficient.	Genobioinfo Cluster: How to use
GLIMPSE	GLIMPSE is a phasing and imputation method for large-scale low-coverage sequencing studies.	Genobioinfo Cluster: How to use
graphtyper	graphtyper is a graph-based variant caller capable of genotyping population-scale short read data sets.	Genobioinfo Cluster: How to use
locator	A supervised machine learning method for predicting the geographic origin of a sample from genotype or sequencing data.	Genobioinfo Cluster: How to use
LRez	Standalone tool and library allowing to work with barcoded linked-reads.	Genobioinfo Cluster: Ask for Install
merge-ibd-segments	Remove any breaks and short gaps in IBD segments. Haplotype phase errors and genotype errors can cause breaks and gaps in the detected IBD segments. You can use this program to remove any breaks and short gaps in IBD segments. We usually remove gaps between IBD segments that have at most one discordant homozygote and that are less than 0.6 cM in length.	Genobioinfo Cluster: How to use
mgatk	A mitochondrial genome analysis toolkit.	Genobioinfo Cluster: How to use
PanGenie	A short-read genotyper for various types of genetic variants (such as SNPs, indels and structural variants) represented in a pangenome graph.	Genobioinfo Cluster: How to use
pixy	pixy is a command-line tool for painlessly and correctly estimating average nucleotide diversity within (π) and between (dxy) populations from a VCF.	Genobioinfo Cluster: How to use
RBCeq2	RBCeq2 reads in genomic variant data in the form of variant call files (VCF) and outputs blood group (BG) genotype and phenotype inference.	Genobioinfo Cluster: How to use
READ	READ is a method to infer the degree of relationship (up to second degree, i.e. nephew/niece-uncle/aunt, grandparent-grandchild or half-siblings) for a pair of low-coverage individuals.	Genobioinfo Cluster: How to use
SDpop	SDpop infers sex-linkage from genotyping data of several individuals of both sexes, collected in panmictic populations.	Genobioinfo Cluster: How to use
STITCH	STITCH is an R program for reference panel free, read aware, low coverage sequencing genotype imputation.	Genobioinfo Cluster: Ask for Install
SVJedi	SVJedi is a structural variation (SV) genotyper for long read data.	Genobioinfo Cluster: Ask for Install
SVJedi-graph	SVJedi-graph is a structural variation (SV) genotyper for long read data.	Genobioinfo Cluster: How to use
T1K	T1K (The ONE genotyper for Kir and HLA) is a computational tool to infer the alleles for the polymorphic genes such as KIR and HLA. T1K calculates the allele abundances based on the RNA-seq/WES/WGS read alignments on the provided allele reference sequences. The abundances are used to pick the true alleles for each gene. T1K provides the post analysis steps, including novel SNP detection and single-cell representation. T1K supports both single-end and paired-end sequencing data with any read length.	Genobioinfo Cluster: How to use
varigraph	An accurate and widely applicable pangenome graph-based variant genotyper for diploid and polyploid genomes.	Genobioinfo Cluster: How to use
WASP	WASP is a suite of tools for unbiased allele-specific read mapping and discovery of molecular QTLs.	Genobioinfo Cluster: How to use

Haplotypes

Application	Description	Availability/Use
Beagle_Utilities	Simple utility programs for manipulating text files, especially VCF files.	Genobioinfo Cluster: Ask for Install
ChromoPainter	ChromoPainter is a tool for finding haplotypes in sequence data.	Genobioinfo Cluster: How to use
fastPHASE	A tool for genotype imputation and estimating missing haplotypes.	Genobioinfo Cluster: Ask for Install
flare	The flare program uses a set of reference haplotypes to infer the ancestry of each allele in a set of admixed study samples. The flare program is fast, accurate, and memory-efficient.	Genobioinfo Cluster: How to use
hap.py		Genobioinfo Cluster: How to use
HapHiC	HapHiC is an allele-aware scaffolding tool that uses Hi-C data to scaffold haplotype-phased genome assemblies into chromosome-scale pseudomolecules.	Genobioinfo Cluster: How to use
HaploHiC	Comprehensive haplotype division of Hi-C PE-reads based on local contacts ratio.	Genobioinfo Cluster: Ask for Install
Haplostrips	Haplostrips produce plots that depict variants in a genomic window among different samples. Visualize similarities between haplotypes with respect to a reference haplotype through haplotype clustering and sorting, useful for revealing hidden population structure.	Genobioinfo Cluster: How to use
Haploview	Haploview is designed to simplify and expedite the process of haplotype analysis by providing a common interface to several tasks relating to such analyses.	Genobioinfo Cluster: Ask for Install
harpy	Process raw haplotagging data, from raw sequences to phased haplotypes, batteries included.	Genobioinfo Cluster: How to use
HAT-phasing	HAT is a haplotype assembly tool that uses NGS and TGS data along with a reference genome to reconstruct haplotypes.	Genobioinfo Cluster: Ask for Install
merge-ibd-segments	Remove any breaks and short gaps in IBD segments. Haplotype phase errors and genotype errors can cause breaks and gaps in the detected IBD segments. You can use this program to remove any breaks and short gaps in IBD segments. We usually remove gaps between IBD segments that have at most one discordant homozygote and that are less than 0.6 cM in length.	Genobioinfo Cluster: How to use
Migrate	Migrate estimates effective population sizes, past migration rates between n populations assuming a migration matrix model with asymmetric migration rates and different subpopulation sizes, and population divergences or admixture.	Genobioinfo Cluster: How to use
nPhase	nPhase is a ploidy agnostic tool developed in python which predicts the haplotypes of a sample that was sequenced by both long and short reads by aligning them to a reference. It should work with any ploidy.	Genobioinfo Cluster: Ask for Install
PredictHaplo	This software aims at reconstructing haplotypes from next-generation sequencing data.	Genobioinfo Cluster: How to use
READ	READ is a method to infer the degree of relationship (up to second degree, i.e. nephew/niece-uncle/aunt, grandparent-grandchild or half-siblings) for a pair of low-coverage individuals.	Genobioinfo Cluster: How to use
SNPsplit	SNPsplit is an allele-specific alignment sorter which is designed to read alignment files in SAM/BAM format and determine the allelic origin of reads that cover known SNP positions.	Genobioinfo Cluster: How to use
WGDI	WGDI (Whole-Genome Duplication Integrated analysis), a Python-based command-line tool that facilitates comprehensive analysis of recursive polyploidizations and cross-species genome alignments.	Genobioinfo Cluster: How to use

HiC

Application	Description	Availability/Use
ALLHiC	Phasing and scaffolding polyploid genomes based on Hi-C data	Genobioinfo Cluster: How to use
Armatus	Multiresolution domain calling software for chromosome conformation capture interaction matrices. Armatus is a Topologically Associated Domain caller.	Genobioinfo Cluster: Ask for Install
AutoHiC	AutoHiC is a deep learning tool that uses Hi-C data to support genome assembly. It can automatically correct errors during genome assembly and generate genomes at the chromosome level.	Genobioinfo Cluster: How to use
Chromosight	Python package to detect chromatin loops (and other patterns) in Hi-C contact maps.	Genobioinfo Cluster: How to use
Cooler	Cooler is a support library for a sparse, compressed, binary persistent storage format, also called cooler, used to store genomic interaction data, such as Hi-C contact matrices.	Genobioinfo Cluster: How to use
coolpuppy	A versatile tool to perform pile-up analysis on Hi-C data in .cool format.	Genobioinfo Cluster: Ask for Install
DARIC	A complete framework for identifying quantitatively differential compartments from Hi-C and Micro-C data. DARIC, or Differential Analysis for genomic Regions' Interaction with Compartments, is a computational framework to identify the quantitatively differential compartments from Hi-C-like data. For more details about the design and implementation of the framework, please check our paper published in BMC Genomics.	Genobioinfo Cluster: How to use
FALCON-Phase	FALCON-Phase integrates PacBio long-read assemblies with Phase Genomics Hi-C data to create phased, diploid, chromosome-scale scaffolds.	Genobioinfo Cluster: Ask for Install
FAN-C	Framework for the ANalysis of C-like data.	Genobioinfo Cluster: How to use
GraphUnzip	Unzip assembly graphs with Hi-C data and/or long reads.	Genobioinfo Cluster: Ask for Install
HapHiC	HapHiC is an allele-aware scaffolding tool that uses Hi-C data to scaffold haplotype-phased genome assemblies into chromosome-scale pseudomolecules.	Genobioinfo Cluster: How to use
HiCAssembler	Software to assemble contigs/scaffolds into chromosomes using Hi-C data.	Genobioinfo Cluster: How to use
HiCExplorer	HiCExplorer is a powerful and easy to use set of tools to process, normalize and visualize Hi-C data.	Genobioinfo Cluster: How to use
Hickit	Hickit is a set of tools initially developed to process diploid single-cell Hi-C data. It extracts contact pairs from read alignment, identifies phases of contacts overlapping with SNPs of known phases, imputes missing phases, infers the 3D structure of a single cell and visualizes the structure.	Genobioinfo Cluster: How to use
HiCLift	A fast and efficient tool for converting chromatin interaction data between genome assemblies.	Genobioinfo Cluster: Ask for Install
juicebox_scripts	A collection of scripts for working with Hi-C data, Juicebox, and other genomic file formats	Genobioinfo Cluster: Ask for Install
Linker	Linker is a suite of C++ tools useful for interpreting long and linked read sequencing of cancer genomes.	Genobioinfo Cluster: Ask for Install
Orca	A deep learning sequence modeling framework for multiscale genome structure prediction.	Genobioinfo Cluster: How to use
PretextMap	Paired REad TEXTure Mapper. Converts SAM formatted read pairs into genome contact maps. Full suite of Pretext tools.	Genobioinfo Cluster: How to use
PretextView	OpenGL Powered Pretext Contact Map Viewer.	Genobioinfo Cluster: How to use
SALSA	A tool to scaffold long read assemblies with Hi-C data	Genobioinfo Cluster: How to use
SELFISH	SELFISH is a tool for finding differential chromatin interactions between two Hi-C contact maps.	Genobioinfo Cluster: How to use
sslHiC	sslHiC is a computational framework for comparative analyses of Hi-C data, including reproducibility measurement and differential chromatin interaction (DCI) detection.	Genobioinfo Cluster: How to use
SuperTAD	SuperTAD is an open-source command-line TAD detection package written in C++. It takes either raw or normalized Hi-C contact maps as inputs.	Genobioinfo Cluster: How to use
YaHS	YaHS is a scaffolding tool using Hi-C data.	Genobioinfo Cluster: How to use

Iso-seq analysis

Application	Description	Availability/Use
ANGEL	Robust Open Reading Frame prediction (ANGLE re-implementation)	Genobioinfo Cluster: Ask for Install
CARNAC-LR	Clustering coefficient-based Acquisition of RNA Communities in Long Reads.	Genobioinfo Cluster: Ask for Install
cDNA_Cupcake	cDNA_Cupcake is a miscellaneous collection of Python and R scripts used for analyzing sequencing data.	Genobioinfo Cluster: in Cogent module
Cogent	Cogent is a tool for reconstructing the coding genome using high-quality full-length transcriptome sequences. It is designed to be used on Iso-Seq data and in cases where there is no reference genome or the ref genome is highly incomplete.	Genobioinfo Cluster: How to use
IsoSeq	Scalable De Novo Isoform Discovery from Single-Molecule PacBio Reads.	Genobioinfo Cluster: How to use
SQANTI3	SQANTI3 is the newest version of the SQANTI tool that merges features from SQANTI and SQANTI2, together with new additions. SQANTI3 will continue as an integrated development aiming to provide you with the best characterization possible for your new long read-defined transcriptome. SQANTI3 is the first module of the Functional IsoTranscriptomics (FIT) framework, which also includes IsoAnnot and tappAS.	Genobioinfo Cluster: How to use
TAMA	Transcriptome Annotation by Modular Algorithms: this software was designed for processing Iso-Seq data and other long read transcriptome data.	Genobioinfo Cluster: How to use

Libraries and other tools

Application	Description	Availability/Use
AGAT	Another Gff Analysis Toolkit: suite of tools to handle gene annotations in any GTF/GFF format. Some examples of what AGAT can do: standardise any GTF/GFF file into a comprehensive GFF3 format (script with agat_sp prefix); add missing parent features (e.g. gene and mRNA if only CDS/exon exist); add missing features (e.g. exon and UTR); add missing mandatory attributes (i.e. ID, Parent); fix identifiers to be unique; fix feature location; remove duplicated features; group related features (if spread in different places in the file); sort features; merge overlapping loci into one single locus (only if option activated).	Genobioinfo Cluster: How to use
AGC	Assembled Genomes Compressor (AGC) is a tool designed to compress collections of de-novo assembled genomes. It can be used for various types of datasets: short genomes (viruses) as well as long (humans).	Genobioinfo Cluster: How to use
Alfred	BAM Statistics, Feature Counting and Annotation	Genobioinfo Cluster: Ask for Install
AMAS	Calculate summary statistics and manipulate multiple sequence alignments.	Genobioinfo Cluster: Ask for Install
Apptainer	Apptainer is an open source container platform designed to be simple, fast, and secure.	Genobioinfo Cluster: How to use
ArrowGrid	The distribution is a parallel wrapper around the Arrow consensus framework within the SMRT Analysis Software	Genobioinfo Cluster: How to use
ART	ART is a set of simulation tools to generate synthetic next-generation sequencing reads.	Genobioinfo Cluster: Ask for Install
art_modern	A modern re-implementation of the popular ART simulator with enhanced performance and functionality.	Genobioinfo Cluster: How to use
atac dnase pipelines	ATAC-seq and DNase-seq processing pipeline. This pipeline is designed for automated end-to-end quality control and processing of ATAC-seq or DNase-seq data.	Genobioinfo Cluster: Ask for Install
awscli	The AWS Command Line Interface (AWS CLI) is an open source tool that enables you to interact with AWS services using commands in your command-line shell.	Genobioinfo Cluster: How to use
Badread	Badread is a long-read simulator tool that makes – you guessed it – bad reads! It can imitate many kinds of problems one might encounter in real long-read sets: chimeras, low-quality regions, systematic basecalling errors and more.	Genobioinfo Cluster: How to use
bamaddrg	Adds read groups to input BAM files, streams BAM output on stdout.	Genobioinfo Cluster: Ask for Install
BamBam	Several simple-to-use tools to facilitate NGS analysis.	Genobioinfo Cluster: Ask for Install
Bamstats (not the same as BAMstats)	Bamstats is a command line tool written in Go for computing mapping statistics from a BAM file.	Genobioinfo Cluster: Ask for Install
bamtofastq	Tool for converting 10x BAMs produced by Cell Ranger, Space Ranger, Cell Ranger ATAC, Cell Ranger DNA, and Long Ranger back to FASTQ files.	Genobioinfo Cluster: How to use
Bamtools	BamTools provides both a programmer's API and an end-user's toolkit for handling BAM files.	Genobioinfo Cluster: How to use
bamUtil	bamUtil is a repository that contains several programs that perform operations on SAM/BAM files. All of these programs are built into a single executable, bam.	Genobioinfo Cluster: How to use
BBMap	A short read aligner, as well as various other bioinformatic tools.	Genobioinfo Cluster: How to use
bcl2fastq	The Bcl2FastQ conversion software is a new tool to handle bcl conversion and demultiplexing of both unzipped and zipped bcl files, which have reduced footprint and were introduced as an optional output of the HCS Software version 2.0	Genobioinfo Cluster: How to use
BEDOPS	BEDOPS is an open-source command-line toolkit that performs highly efficient and scalable Boolean and other set operations, statistical calculations, archiving, conversion and other management of genomic data of arbitrary scale.	Genobioinfo Cluster: How to use
bedtools	The BEDTools utilities allow one to address common genomics tasks such as finding feature overlaps and computing coverage.	Genobioinfo Cluster: How to use
BigDataScript	BigDataScript is intended as a scripting language for big data pipelines.	Genobioinfo Cluster: Ask for Install
Bioawk	Bioawk is an extension to Brian Kernighan's awk, adding the support of several common biological data formats, including optionally gzip'ed BED, GFF, SAM, VCF, FASTA/Q and TAB-delimited formats with column names.	Genobioinfo Cluster: How to use
biohazard-tools	This is a collection of command line utilities that do useful stuff involving BAM files for Next Generation Sequencing data.	Genobioinfo Cluster: Ask for Install
Biopieces	The Biopieces are a collection of bioinformatics tools that can be pieced together in a very easy and flexible manner to perform both simple and complex tasks.	Genobioinfo Cluster: Ask for Install
BIOPYTHON	Biopython is a set of freely available tools for biological computation written in Python by an international team of developers.	Genobioinfo Cluster: in Python-3.11.1 (see "search_Python_module" script to search in other Python versions)
busco2fasta	A script to turn a set of BUSCO results into a directory of multisequence FASTA files.	Genobioinfo Cluster: Ask for Install
bwtools	bwtool is a command-line utility for bigWig files.	Genobioinfo Cluster: Ask for Install
Cabal	Cabal is the standard package system for Haskell software. It helps people to configure, build and install Haskell software and to distribute it easily to other users and developers.	Genobioinfo Cluster: Ask for Install
cctools	The Cooperative Computing Tools (cctools) enable large scale distributed computations to harness hundreds to thousands of machines from clusters, clouds, and grids.	Genobioinfo Cluster: Ask for Install
cd-hit	CD-HIT stands for Cluster Database at High Identity with Tolerance. The program (cd-hit) takes a fasta format sequence database as input and produces a set of 'non-redundant' (nr) representative sequences as output. In addition cd-hit outputs a cluster file, documenting the sequence 'groupies' for each nr sequence representative.	Genobioinfo Cluster: How to use
cdbfasta	A couple of platform-independent file-based hashing tools (cdbfasta and cdbyank) that can be used for creating indices for quick retrieval of any particular sequences from large multi-FASTA files.	Genobioinfo Cluster: How to use
circos	Circos is a software package for visualizing data and information.	Genobioinfo Cluster: How to use
clustermq	ClusterMQ: send R function calls as cluster jobs.	Genobioinfo Cluster: Ask for Install
code-server	Run VSCode on any machine anywhere and access it in the browser.	Genobioinfo Cluster: How to use
Computel	Computel is designed for measuring mean telomere length and abundance of canonical and variant telomeric repeats from Illumina Whole Genome NGS Sequencing data.	Genobioinfo Cluster: How to use
Concrete Autoencoders	The concrete autoencoder is an end-to-end differentiable method for global feature selection, which efficiently identifies a subset of the most informative features and simultaneously learns a neural network to reconstruct the input data from the selected features.	Genobioinfo Cluster: Ask for Install
csvtk	A cross-platform, efficient and practical CSV/TSV toolkit in Golang.	Genobioinfo Cluster: Ask for Install
csvtk	A cross-platform, efficient and practical CSV/TSV toolkit in Golang.	Genobioinfo Cluster: How to use
DamageProfiler	A Java-based tool to determine damage patterns on ancient DNA as a replacement for mapDamage. DamageProfiler calculates damage profiles of mapped reads and provides a graphical as well as text based representation. It creates damage plots, fragment length distribution, read identity distribution, base frequency table of reference, and a table of different base misincorporations and their occurrences.	Genobioinfo Cluster: How to use
Dasel	Dasel (short for data-selector) allows you to query and modify data structures using selector strings.	Genobioinfo Cluster: Ask for Install
datamash	GNU datamash is a command-line program which performs basic numeric, textual and statistical operations on input textual data files.	Genobioinfo Cluster: default system
DAZZ_DB	To facilitate the multiple phases of the dazzler assembler, we organize all the read data into what is effectively a "database" of the reads and their meta-information.	Genobioinfo Cluster: Ask for Install
DBSCAN-SWA	An integrated tool for rapid prophage detection and annotation.	Genobioinfo Cluster: Ask for Install
deepTools	Tools to process and analyze deep sequencing data.	Genobioinfo Cluster: How to use
DSK	DSK is a k-mer counting software, similar to Jellyfish. DSK supports large values of k, and runs with (almost-)arbitrarily low memory usage and reasonably low temporary disk usage. DSK can count k-mers of large Illumina datasets on laptops and desktop computers.	Genobioinfo Cluster: Ask for Install
ecoPCR	ecoPCR is an electronic PCR software developed by LECA and Helix-Project. It helps you estimate the quality of barcode primers. In conjunction with OBITools, you can postprocess ecoPCR output to compute barcode coverage and barcode specificity.	Genobioinfo Cluster: How to use
EDirect	Entrez Direct (EDirect) provides access to the NCBI's suite of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a UNIX terminal window	Genobioinfo Cluster: How to use
EGA_download_client	The EgaDemoClient is a JAVA based data streamer that enables EGA account holders to securely download files and datasets, either through an interactive shell (IS) or using direct command line mode (DCLM).	Genobioinfo Cluster: Ask for Install
EggLib	EggLib is a C++/Python library and program package for evolutionary genetics and genomics.	Genobioinfo Cluster: Ask for Install
Eigen	Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.	Genobioinfo Cluster: Ask for Install
emacs	Universal text editor.	Genobioinfo Cluster: How to use
EMBOSS	EMBOSS is "The European Molecular Biology Open Software Suite". EMBOSS is a free Open Source software analysis package specially developed for the needs of the molecular biology (e.g. EMBnet) user community. The software automatically copes with data in a variety of formats and even allows transparent retrieval of sequence data from the web. Also, as extensive libraries are provided with the package, it is a platform to allow other scientists to develop and release software in true open source spirit. EMBOSS also integrates a range of currently available packages and tools for sequence analysis into a seamless whole.	Genobioinfo Cluster: How to use
Ensembl-API	Ensembl uses MySQL relational databases to store its information. A comprehensive set of Application Programme Interfaces (APIs) serve as a middle-layer between underlying database schemes and more specific application programmes. The APIs aim to encapsulate the database layout by providing efficient high-level access to data tables and isolate applications from data layout changes. Ensembl's API is written in Perl	Genobioinfo Cluster: Ask for Install
ERPIN	ERPIN (Easy RNA Profile IdentificatioN) is an RNA motif search program developed by Daniel Gautheret and André Lambert.	Genobioinfo Cluster: Ask for Install
EUPAN	Toolkit that integrates various software in order to build eukaryotic pangenomes.	Genobioinfo Cluster: Ask for Install
FASTA Composition	Finds the overall composition of sequences in a FASTA file.	Genobioinfo Cluster: How to use
FASTA_Length	FASTA Length finds the lengths of sequences in a FASTA file.	Genobioinfo Cluster: How to use
fasta_validator	C code to validate a fasta file.	Genobioinfo Cluster: How to use
fastk-medians	A set of utilities to calculate the median number of times the k-mers in a sequence of interest occur across the whole set.	Genobioinfo Cluster: How to use
fastprofkernel	fastprofkernel is a Debian package that uses an accelerated version of the original profile kernel to automatically train SVM-based classification models. It can assign user-defined classes to so far uncharacterized proteins.	Genobioinfo Cluster: How to use
FastQC	A Quality Control application for FastQ files. FastQC is an application which takes a FastQ file and runs a series of tests on it to generate a comprehensive QC report.	Genobioinfo Cluster: How to use
FASTX-Toolkit	The FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.	Genobioinfo Cluster: How to use
FFmpeg	FFmpeg is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much anything that humans and machines have created.	Genobioinfo Cluster: How to use
fgbio	A set of tools to analyze genomic data with a focus on Next Generation Sequencing.	Genobioinfo Cluster: Ask for Install
FindingOverCovRegions	FindOverCovRegions.py searches for genomic regions with abnormal read coverage (e.g. depth). To do so, this program requires a bedgraph-like file (e.g. bedtools genomecov per-base reports) where the coverage depth is reported for each position of the genome (including 0 values).	Genobioinfo Cluster: Ask for Install
fqtools	fqtools is a software suite for fast processing of FASTQ files; Various file manipulations are supported.	Genobioinfo Cluster: Ask for Install
gargammel	gargammel is an ancient DNA simulator	Genobioinfo Cluster: How to use
gcloud	gcloud CLI is a set of tools for creating and managing Google Cloud resources.	Genobioinfo Cluster: How to use
gCluster	The gCluster algorithm is a general clustering method that predicts clusters of any biological word or combination of them, relying only on the DNA sequence and the statistical significance. When using CG as word, gCluster works similarly to CpGcluster, our method to predict CpG islands. More broadly, gCluster has much in common with wordCluster but uses an improved distance model.	Genobioinfo Cluster: How to use
GDAL	GDAL is a translator library for raster and vector geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation.	Genobioinfo Cluster: Ask for Install
gdc-client	The gdc-client provides several convenience functions over the GDC API which provides general download/upload via HTTPS.	Genobioinfo Cluster: How to use
GEM-library	A set of very optimized tools for indexing/querying huge genomes/files.	Genobioinfo Cluster: How to use
GEM-Tools	GEM-Tools is a C API and a Python module to support and simplify usage of the GEM Mapper.	Genobioinfo Cluster: How to use
GenomeScope	Fast genome analysis from unassembled short reads	Genobioinfo Cluster: How to use
GenomeTools	Collection of bioinformatics tools (in the realm of genome informatics) combined into a single binary named "gt".	Genobioinfo Cluster: How to use
Gerbil	A basic task in bioinformatics is the counting of k-mers in genome strings.	Genobioinfo Cluster: Ask for Install
gfatools	gfatools is a set of tools for manipulating sequence graphs in the GFA or the rGFA format. It has implemented parsing, subgraph and conversion to FASTA/BED.	Genobioinfo Cluster: How to use
gff3sort	A Perl Script to sort gff3 files and produce suitable results for tabix tools	Genobioinfo Cluster: Ask for Install
gff3toembl	Converts Prokka GFF3 files to EMBL files for uploading annotated assemblies to EBI	Genobioinfo Cluster: Ask for Install
gffcompare	gffcompare can be used to compare, merge, annotate and estimate accuracy of one or more GFF files (the “query” files), when compared with a reference annotation (also provided as GFF).	Genobioinfo Cluster: How to use
gffread	GFF/GTF utility providing format conversions, region filtering, FASTA sequence extraction and more.	Genobioinfo Cluster: How to use
gget	gget enables efficient querying of genomic reference databases.	Genobioinfo Cluster: Ask for Install
gh-cli	gh is GitHub on the command line. It brings pull requests, issues, and other GitHub concepts to the terminal next to where you are already working with git and your code.	Genobioinfo Cluster: Ask for Install
GHC	GHC is a state-of-the-art, open source, compiler and interactive environment for the functional language Haskell	Genobioinfo Cluster: Ask for Install
Gradle	Gradle is the open source build system of choice for Java, Android, and Kotlin developers.	Genobioinfo Cluster: How to use
Grinder	Grinder is a versatile open-source bioinformatic tool to create simulated omic shotgun and amplicon sequence libraries for all main sequencing platforms.	Genobioinfo Cluster: Ask for Install
GTFtools	GTFtools provides a set of functions to analyze various modes of gene models.	Genobioinfo Cluster: How to use
hal2vg	Convert HAL to vg-compatible sequence graph.	Genobioinfo Cluster: How to use
Hclust2	Hclust2 is a handy tool for plotting heat-maps with several useful options to produce high quality figures that can be used in publication.	Genobioinfo Cluster: How to use
hcluster_sg	A hierarchical clustering software for sparse graphs	Genobioinfo Cluster: Ask for Install
HDFView	HDFView is a visual tool for browsing and editing HDF4 and HDF5 files.	Genobioinfo Cluster: Ask for Install
HGT-ID	An efficient and sensitive program for detecting viral insertion sequences from known viral reference genome in the genome of human cancers.	Genobioinfo Cluster: Ask for Install
HiCExplorer	HiCExplorer is a powerful and easy to use set of tools to process, normalize and visualize Hi-C data.	Genobioinfo Cluster: How to use
HiCLift	A fast and efficient tool for converting chromatin interaction data between genome assemblies.	Genobioinfo Cluster: Ask for Install
hssp	Create DSSP and HSSP files. A series of PDB-related databanks for everyday needs.	Genobioinfo Cluster: How to use
IDR	The IDR (Irreproducible Discovery Rate) framework is a unified approach to measure the reproducibility of findings identified from replicate experiments and provide highly stable thresholds based on reproducibility.	Genobioinfo Cluster: How to use
iVar	iVar is a computational package that contains functions broadly useful for viral amplicon-based sequencing. Additional tools for metagenomic sequencing are actively being incorporated into iVar. While each of these functions can be accomplished using existing tools, iVar contains an intersection of functionality from multiple tools that are required to call iSNVs and consensus sequences from viral sequencing data across multiple replicates. The following functions are implemented in iVar: (1) trimming of primers and low-quality bases, (2) consensus calling, (3) variant calling, both iSNVs and insertions/deletions, and (4) identifying mismatches to primer sequences and excluding the corresponding reads from alignment files.	Genobioinfo Cluster: How to use
JAGS	JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation not wholly unlike BUGS.	Genobioinfo Cluster: How to use
JCVI	Collection of Python libraries to parse bioinformatics files, or perform computation related to assembly, annotation, and comparative genomics.	Genobioinfo Cluster: How to use
Jellyfish	JELLYFISH is a tool for fast, memory-efficient counting of k-mers in DNA.	Genobioinfo Cluster: How to use
Julia	Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library.	Genobioinfo Cluster: How to use
Jvarkit	Java utilities for Bioinformatics (only requested tools are compiled).	Genobioinfo Cluster: Ask for Install
KAT	KAT (The K-mer Analysis Toolkit) is a suite of tools that generate, analyse and compare k-mer spectra produced from sequence files.	Genobioinfo Cluster: How to use
KCOSS	A fast and space-saving multi-threaded k-mer frequency statistics algorithm	Genobioinfo Cluster: Ask for Install
kentUtils	UCSC command line bioinformatic utilities	Genobioinfo Cluster: How to use
klocate	Standalone tool based on the bwa index to locate a set of kmers along a reference genome. klocate searches each kmer (full and perfect match) in the index and outputs all positions the kmer maps to (output to stdout in BED format).	Genobioinfo Cluster: Ask for Install
kmap	Standalone tool based on the bwa index to locate a set of kmers along a reference genome.	Genobioinfo Cluster: Ask for Install
kmdiff	kmdiff provides differential k-mer analysis between two populations (control and case). Each population is represented by a set of short-read sequencing data. Outputs are differentially represented k-mers between controls and cases.	Genobioinfo Cluster: How to use
kmer-counter	A fast k-mer counter written in Rust.	Genobioinfo Cluster: How to use
KmerGO	KmerGO is a user-friendly tool to identify the group-specific sequences on two groups or trait-associated sequences of high throughput sequencing datasets.	Genobioinfo Cluster: How to use
KrakenTools	KrakenTools provides individual scripts to analyze Kraken/Kraken2/Bracken/KrakenUniq output files.	Genobioinfo Cluster: How to use
Krona	Krona allows hierarchical data to be explored with zoomable pie charts. Krona charts can be created using an Excel template or KronaTools, which includes support for several bioinformatics tools and raw data formats.	Genobioinfo Cluster: How to use
lastp_aai	A simple Python script for calculating pairwise amino acid identity (AAI) between protein files (extension .faa)	Genobioinfo Cluster: Ask for Install
libplinkio	This is a small C and Python library for reading Plink genotype files.	Genobioinfo Cluster: How to use
libstree	libstree is a generic suffix tree implementation, written in C.	Genobioinfo Cluster: How to use
Liftoff	Liftoff is a tool that accurately maps annotations in GFF or GTF between assemblies of the same, or closely-related species.	Genobioinfo Cluster: How to use
llvm	The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.	Genobioinfo Cluster: Ask for Install
LRez	Standalone tool and library allowing to work with barcoded linked-reads.	Genobioinfo Cluster: Ask for Install
Mash	Fast genome and metagenome distance estimation using MinHash.	Genobioinfo Cluster: How to use
MEGA-CC	Software suite for analyzing DNA and protein sequence data from species and populations.	Genobioinfo Cluster: Ask for Install
MegaTools	Open-source command line tools for accessing Mega.co.nz cloud storage.	Genobioinfo Cluster: How to use
Met4j	Met4J is an open-source Java library dedicated to the structural analysis of metabolic networks. It also comes with a toolbox of command-line tools for several analyses relevant to metabolism-related research.	Genobioinfo Cluster: Ask for Install
MFA	The Montreal Forced Aligner is a command line utility for performing forced alignment of speech datasets using Kaldi (http://kaldi-asr.org/).	Genobioinfo Cluster: How to use
micromamba	micromamba is a statically linked, single-file executable that can be dropped anywhere on the operating system to get started with powerful package management and virtual environments.	Genobioinfo Cluster: How to use
Miniforge	Miniforge is a minimal installer for Conda specific to conda-forge.	Genobioinfo Cluster: How to use
mosdepth	Fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing. mosdepth can output: per-base depth about 2x as fast as samtools depth (about 25 minutes of CPU time for a 30X genome); mean per-window depth given a window size, as would be used for CNV calling; the mean per-region depth given a BED file of regions; a distribution of the proportion of bases covered at or above a given threshold for each chromosome and genome-wide; quantized output that merges adjacent bases as long as they fall in the same coverage bins, e.g. (10-20); threshold output to indicate how many bases in each region are covered at the given thresholds. When appropriate, the output files are bgzipped and indexed for ease of use.	Genobioinfo Cluster: How to use
msamtools	msamtools provides useful functions that are commonly used in microbiome data analysis, especially when analyzing shotgun metagenomics or metatranscriptomics data.	Genobioinfo Cluster: How to use
MultiQC	Aggregate results from bioinformatics analyses across many samples into a single report.	Genobioinfo Cluster: How to use
NAMD	NAMD, recipient of a 2002 Gordon Bell Award and a 2012 Sidney Fernbach Award, is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.	Genobioinfo Cluster: How to use
NanoPlot	Plotting tool for Oxford Nanopore sequencing data and alignments.	Genobioinfo Cluster: How to use
natsort	Simple yet flexible natural sorting in Python	Genobioinfo Cluster: Ask for Install
NCBI_tools	NCBI portable software toolkit	Genobioinfo Cluster: How to use
NCBI_tools++	NCBI C++ Toolkit provides free, portable, public domain libraries.	Genobioinfo Cluster: How to use
NetLogo	NetLogo is a multi-agent programmable modeling environment.	Genobioinfo Cluster: Ask for Install
NextCloudcmd	A command line client that can be used to synchronize Nextcloud files to client machines.	Genobioinfo Cluster: Ask for Install
Nextflow	Nextflow enables scalable and reproducible scientific workflows using software containers. It allows the adaptation of pipelines written in the most common scripting languages.	Genobioinfo Cluster: How to use
ngsutils	Tools for next-generation sequencing analysis.	Genobioinfo Cluster: How to use
numpy	NumPy is a package needed for scientific computing with Python.	Genobioinfo Cluster: Ask for Install
OBITools	OBITools is a set of python programs developed to simplify the manipulation of sequence files in our labs. They were mainly designed to help us for analyzing Next Generation Sequencer outputs (454 or Illumina) in the context of DNA Metabarcoding.	Genobioinfo Cluster: How to use
Ollama	Ollama is an open-source tool that allows you to run large language models (LLMs).	Genobioinfo Cluster: How to use
OpenBabel	Open Babel is a chemical toolbox designed to speak the many languages of chemical data.	Genobioinfo Cluster: How to use
openSMILE	Python package for openSMILE (open-source Speech and Music Interpretation by Large-space Extraction).	Genobioinfo Cluster: How to use
Pandoc	Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library.	Genobioinfo Cluster: How to use
parallel	GNU parallel is a shell tool for executing jobs in parallel using one or more computers.	Genobioinfo Cluster: default system
parallel-fastq-dump	NCBI fastq-dump can be very slow sometimes, even if you have the resources (network, IO, CPU) to go faster, even if you already downloaded the sra file. This tool speeds up the process by dividing the work into multiple threads.	Genobioinfo Cluster: How to use
Parselmouth	Parselmouth aims to provide a complete and Pythonic interface to the internal Praat code.	Genobioinfo Cluster: How to use
pbtk	PacBio BAM toolkit	Genobioinfo Cluster: How to use
PEAR	PEAR is an ultrafast, memory-efficient and highly accurate paired-end read merger. It is fully parallelized and can run with as little as a few kilobytes of memory.	Genobioinfo Cluster: How to use
PfamScan	A program that searches a FASTA file against a library of Pfam HMMs.	Genobioinfo Cluster: How to use
Pomoxis	Pomoxis comprises a set of basic bioinformatic tools tailored to nanopore sequencing.	Genobioinfo Cluster: Ask for Install
pong	pong is a freely available software package, released by Behr et al. (2016, Bioinformatics), for post-processing output from clustering inference using population genetic data.	Genobioinfo Cluster: Ask for Install
PoPoolation2	PoPoolation2 compares allele frequencies for SNPs between two or more populations and identifies significant differences. PoPoolation2 requires next generation sequencing data of pooled genomic DNA (Pool-Seq). It may be used for measuring differentiation between populations, for genome wide association studies and for experimental evolution.	Genobioinfo Cluster: How to use
pp-popularity-contest	The pp-popularity-contest package sets up a cron job that periodically submits to the developers anonymous statistics on the usage of Rost Lab prediction methods installed on this system.	Genobioinfo Cluster: How to use
preseq	Software for predicting library complexity and genome coverage in high-throughput sequencing.	Genobioinfo Cluster: How to use
Primer3	Primer3 is a widely used program for designing PCR primers (PCR = "Polymerase Chain Reaction").	Genobioinfo Cluster: How to use
PRINSEQ	PRINSEQ is a tool that generates summary statistics of sequence and quality data and that is used to filter, reformat and trim next-generation sequence data. The standalone version is primarily designed for data preprocessing and does not generate summary statistics in graphical form.	Genobioinfo Cluster: How to use
PROJ4	Cartographic Projections Library	Genobioinfo Cluster: Ask for Install
pybam	Very simple, pure python, BAM file reader. If you do not need to use BAM indexes, pybam is probably the fastest and simplest BAM parser out there, particularly if run under PyPy.	Genobioinfo Cluster: How to use
PyCharm	PyCharm is a dedicated Python and Django IDE providing a wide range of essential tools for Python developers, tightly integrated together to create a convenient environment for productive Python development and Web development.	Genobioinfo Cluster: Ask for Install
Pydub	Manipulate audio with a simple and easy high level interface.	Genobioinfo Cluster: How to use
PyPy	A fast, compliant alternative implementation of Python.	Genobioinfo Cluster: How to use
pysamstats	A Python utility for calculating statistics against genome positions based on sequence alignments from a SAM or BAM file.	Genobioinfo Cluster: How to use
PySlurm	This module provides a low-level Python wrapper around the Slurm C-API using Cython.	Genobioinfo Cluster: Ask for Install
Quake	Quake is a package to correct substitution sequencing errors in experiments with deep coverage (e.g. >15X), specifically intended for Illumina sequencing reads. Quake adopts the k-mer error correction framework, first introduced by the EULER genome assembly package. Unlike EULER and similar programs, Quake utilizes a robust mixture model of erroneous and genuine k-mer distributions to determine where errors are located. Then Quake uses read quality values and learns the nucleotide to nucleotide error rates to determine what types of errors are most likely. This leads to more corrections and greater accuracy, especially with respect to avoiding mis-corrections, which create false sequence dissimilar to anything in the original genome sequence from which the read was taken.	Genobioinfo Cluster: How to use
Quarto	Quarto is software that compiles Markdown documents to HTML, PDF, or many other formats. It is built on Pandoc.	Genobioinfo Cluster: How to use
R	R is "GNU S", a freely available language and environment for statistical computing and graphics which provides a wide variety of statistical and graphical techniques: linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, etc.	Genobioinfo Cluster: How to use
RabbitUniq	Compute unique k-mer faster.	Genobioinfo Cluster: Ask for Install
RabbitV	RabbitV is a highly optimized and practical toolkit for the detection of viruses and microorganisms in sequencing data.	Genobioinfo Cluster: Ask for Install
rasusa	Randomly subsample sequencing reads or alignments.	Genobioinfo Cluster: How to use
RBCeq2	RBCeq2 reads in genomic variant data in the form of variant call files (VCF) and outputs blood group (BG) genotype and phenotype inference.	Genobioinfo Cluster: How to use
RetroScan	RetroScan is an easy-to-use tool for retrocopy identification that integrates a series of bioinformatics tools (LAST, BEDtools, ClustalW2, KaKs_Calculator, HISAT2, StringTie, SAMtools and Shiny) and scripts.	Genobioinfo Cluster: How to use
ripgrep	ripgrep is a line-oriented search tool that recursively searches the current directory for a regex pattern.	Genobioinfo Cluster: How to use
Roary	Roary is a high speed stand alone pan genome pipeline, which takes annotated assemblies in GFF3 format (produced by Prokka (Seemann, 2014)) and calculates the pan genome.	Genobioinfo Cluster: How to use
RopeBWT2	RopeBWT2 is a tool for constructing the FM-index for a collection of DNA sequences.	Genobioinfo Cluster: How to use
rosbags	Rosbags is the pure python library for everything rosbag.	Genobioinfo Cluster: How to use
ROSE	To create stitched enhancers, and to separate super-enhancers from typical enhancers using sequencing data (.bam) given a file of previously identified constituent enhancers (.gff)	Genobioinfo Cluster: How to use
RTGTools	RTG Tools is a subset of RTG Core that includes several useful utilities for dealing with VCF files and sequence data. Probably the most interesting is the `vcfeval` command which performs sophisticated comparison of VCF files.	Genobioinfo Cluster: How to use
Ruby	A dynamic, open source programming language.	Genobioinfo Cluster: How to use
rush	A cross-platform command-line tool for executing jobs in parallel.	Genobioinfo Cluster: How to use
sambamba	Sambamba is a high performance modern robust and fast tool (and library), written in the D programming language, for working with SAM and BAM files. Current functionality is an important subset of samtools functionality, including view, index, sort, markdup, and depth.	Genobioinfo Cluster: How to use
samclip	Filter SAM file for soft and hard clipped alignments	Genobioinfo Cluster: Ask for Install
Saturn	A tool for assessing the library saturation without any reference genome.	Genobioinfo Cluster: Ask for Install
sbt	sbt is a build tool for Scala, Java, and more.	Genobioinfo Cluster: Ask for Install
scipy	SciPy (pronounced "Sigh Pie") is open-source software for mathematics, science, and engineering. The SciPy library depends on Numpy, which provides convenient and fast N-dimensional array manipulation.	Genobioinfo Cluster: In Python modules
selscan	A program to calculate EHH-based scans for positive selection in genomes.	Genobioinfo Cluster: How to use
Seq	Seq is a programming language for computational genomics and bioinformatics. With a Python-compatible syntax and a host of domain-specific features and optimizations, Seq makes writing high-performance genomics software as easy as writing Python code, and achieves performance comparable to (and in many cases better than) C/C++.	Genobioinfo Cluster: How to use
SeqAn	SeqAn is an open source C++ library of efficient algorithms and data structures for the analysis of sequences with the focus on biological data.	Genobioinfo Cluster: How to use
seqfilter	Filter fasta/fastq(.gz) files by ID and/or sequence length	Genobioinfo Cluster: Ask for Install
SeqFu	A general-purpose program to manipulate and parse information from FASTA/FASTQ files, supporting gzipped input files. Includes functions to interleave and de-interleave FASTQ files, to rename sequences and to count and print statistics on sequence lengths.	Genobioinfo Cluster: How to use
SeqKit	A cross-platform and ultrafast toolkit for FASTA/Q file manipulation. Common manipulations of FASTA/Q file include converting, searching, filtering, deduplication, splitting, shuffling, and sampling.	Genobioinfo Cluster: How to use
Seqtk	Toolkit for processing sequences in FASTA/Q formats	Genobioinfo Cluster: How to use
SequenceTools	Tools for population genetics on sequencing data.	Genobioinfo Cluster: How to use
Shennong	A Python toolbox for speech features extraction.	Genobioinfo Cluster: How to use
Singularity	Singularity enables users to have full control of their environment. Singularity containers can be used to package entire scientific workflows, software and libraries, and even data.	Genobioinfo Cluster: How to use
Smudgeplots	Inference of ploidy and heterozygosity structure using whole genome sequencing data. This tool extracts heterozygous kmer pairs from kmer dump files (from jellyfish or KMC) and performs gymnastics with them. We are able to disentangle genome structure by comparing the sum of kmer pair coverages (CovA + CovB) to their relative coverage (CovA / (CovA + CovB)). Smudgeplots are computed from raw/trimmed reads and show the haplotype structure using heterozygous kmer pairs.	Genobioinfo Cluster: How to use
Snakemake	Snakemake is a workflow management system that aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern specification language in python style.	Genobioinfo Cluster: How to use
soap.coverage	Calculates sequencing coverage or physical coverage, as well as duplication rate and details of specific blocks, for each segment and for the whole genome, using SOAP, Blat, Blast, BlastZ, mummer and MAQ alignment results. Multi-threaded. Gzip files are supported.	Genobioinfo Cluster: Ask for Install
sourmash	sourmash is a command-line tool and Python library for computing hash sketches from DNA sequences, comparing them to each other, and plotting the results.	Genobioinfo Cluster: How to use
squeakr	Squeakr is a k-mer-counting and multiset-representation system using the recently-introduced counting quotient filter (CQF; Pandey et al., 2017), a feature-rich approximate membership query (AMQ) data structure.	Genobioinfo Cluster: Ask for Install
squid	A C library that is bundled with much of the above software. C function library for sequence analysis.	Genobioinfo Cluster: Ask for Install
SRAToolkit	Toolkit to query Short Reads Archive at NCBI	Genobioinfo Cluster: How to use
subsampler	Small tool to subsample fasta and fastq files.	Genobioinfo Cluster: Ask for Install
Sumaclust	Fast and exact clustering of sequences.	Genobioinfo Cluster: How to use
Sumatra	Sumatra was developed by the LECA and aims to compute a great deal of sequence similarities in a fast and exact way, based on the length of the Longest Common Subsequence (LCS) between two sequences. Sequence clustering based on similarities is also available through Sumaclust.	Genobioinfo Cluster: How to use
superstring	Greedy approximation of the shortest common superstring	Genobioinfo Cluster: How to use
Surfboard	A Python package for modern audio feature extraction.	Genobioinfo Cluster: How to use
tabix	TAB-delimited file IndeXer. Useful for vcfTools.	Genobioinfo Cluster: in bcftools and samtools
Telomerecat	Telomerecat is a tool for estimating the average telomere length (TL) for a paired end, whole genome sequencing (WGS) sample.	Genobioinfo Cluster: in Python-3.9.18
TelomereHunter	TelomereHunter extracts, sorts and analyses telomeric reads from WGS Data.	Genobioinfo Cluster: in Python-2.7.18
TexLive	TeX Live is intended to be a straightforward way to get up and running with the TeX document production system.	Genobioinfo Cluster: Ask for Install
toulbar2	toulbar2 is an open-source black-box C++ optimizer for cost function networks and discrete additive graphical models. It can read a variety of formats.	Genobioinfo Cluster: How to use
Ultralytics	Ultralytics creates cutting-edge, state-of-the-art (SOTA) YOLO models built on years of foundational research in computer vision and AI. Includes SAHI (https://obss.github.io/sahi/).	Genobioinfo Cluster: How to use
Umap	The free umap software package efficiently identifies uniquely mappable regions of any genome. Its Bismap extension identifies mappability of the bisulfite converted genome (methylome).	Genobioinfo Cluster: Ask for Install
unique-kmer-counts	This program calculates the number of distinct k-mers for each sequence record in a fasta file and divides it by the total number of k-mers in that record.	Genobioinfo Cluster: How to use
unitig-caller	Methods to determine sequence element (unitig) presence/absence.	Genobioinfo Cluster: How to use
UnRAR	Easily extract RAR files.	Genobioinfo Cluster: How to use
vawk	An awk-like VCF parser	Genobioinfo Cluster: Ask for Install
VCF-kit	Assorted utilities for the variant call format.	Genobioinfo Cluster: Ask for Install
vcflib	C++ library and cmdline tools for parsing and manipulating VCF files.	Genobioinfo Cluster: How to use
VCFtools	VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide methods for working with VCF files: validating, merging, comparing and calculating some basic population genetic statistics.	Genobioinfo Cluster: How to use
Vmatch	A versatile software tool for efficiently solving large scale sequence matching tasks.	Genobioinfo Cluster: Ask for Install
Vosk	Vosk is an offline open source speech recognition toolkit. It enables speech recognition for 20+ languages and dialects.	Genobioinfo Cluster: How to use
VSEARCH	Versatile open-source tool for metagenomics	Genobioinfo Cluster: How to use
Wgsim	Wgsim is a small tool for simulating sequence reads from a reference genome.	Genobioinfo Cluster: Ask for Install
Whisper	Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.	Genobioinfo Cluster: How to use
WiggleTools	The WiggleTools package allows genomewide data files to be manipulated as numerical functions, equipped with all the standard functional analysis operators (sum, product, product by a scalar, comparators), and derived statistics (mean, median, variance, stddev, t-test, Wilcoxon's rank sum test, etc).	Genobioinfo Cluster: How to use

Long reads

Application	Description	Availability/Use
3rdChimeraMiner	Exploration of whole genome amplification generated chimeric sequences in long-read sequencing data.	Genobioinfo Cluster: How to use
ANGEL	Robust Open Reading Frame prediction (ANGLE re-implementation)	Genobioinfo Cluster: Ask for Install
ArrowGrid	The distribution is a parallel wrapper around the Arrow consensus framework within the SMRT Analysis Software	Genobioinfo Cluster: How to use
Autocycler	A tool for generating consensus long-read assemblies for bacterial genomes.	Genobioinfo Cluster: How to use
Badread	Badread is a long-read simulator tool that makes – you guessed it – bad reads! It can imitate many kinds of problems one might encounter in real long-read sets: chimeras, low-quality regions, systematic basecalling errors and more.	Genobioinfo Cluster: How to use
BELLA	A computationally-efficient and highly-accurate long-read to long-read aligner and overlapper.	Genobioinfo Cluster: Ask for Install
chopper	This tool, intended for long read sequencing such as PacBio or ONT, filters and trims a fastq file. chopper combines the functionality of the now-outdated tools NanoFilt and NanoLyse. It can filter QC files and runs faster than NanoFilt and NanoLyse.	Genobioinfo Cluster: How to use
CIRI-long	Circular RNA Identification for Long-Reads Nanopore Sequencing Data.	Genobioinfo Cluster: How to use
Clair3	Clair3 is a germline small variant caller for long-reads. Clair3 makes the best of two major method categories: pileup calling handles most variant candidates with speed, and full-alignment tackles complicated candidates to maximize precision and recall. Clair3 runs fast and has superior performance, especially at lower coverage. Clair3 is simple and modular for easy deployment and integration.	Genobioinfo Cluster: How to use
CONSENT	CONSENT (sCalable self-cOrrectioN of long reads with multiple SEquence alignmeNT) is a self-correction method for long reads.	Genobioinfo Cluster: Ask for Install
cuteSV	Long read based human genomic structural variation detection with cuteSV.	Genobioinfo Cluster: How to use
DAmar	Long read QC, assembly and scaffolding pipeline for PacBio or Oxford Nanopore long-read sequencing data. The pipeline produces a number of QC metrics at various stages as well as incorporating further technologies including Bionano, 10x and HiC data to scaffold the created contigs. DAmar is a hybrid of the earlier Marvel, Dazzler, and Daccord systems of the Eugene Myers lab.	Genobioinfo Cluster: Ask for Install
DAZZ_DB	To facilitate the multiple phases of the dazzler assembler, we organize all the read data into what is effectively a "database" of the reads and their meta-information.	Genobioinfo Cluster: Ask for Install
DBG2OLC	The genome assembler that reduces the computational time of human genome assembly from 400,000 CPU hours to 2,000 CPU hours, utilizing long erroneous 3GS sequencing reads and short accurate NGS sequencing reads.	Genobioinfo Cluster: Ask for Install
DeChat	Repeat and haplotype aware error correction in nanopore sequencing reads with DeChat.	Genobioinfo Cluster: How to use
Deepbinner	Deepbinner is a tool for demultiplexing barcoded Oxford Nanopore sequencing reads. It does this with a deep convolutional neural network classifier, using many of the architectural advances that have proven successful in image classification. Unlike other demultiplexers (e.g. Albacore and Porechop), Deepbinner identifies barcodes from the raw signal (a.k.a. squiggle) which gives it greater sensitivity and fewer unclassified reads.	Genobioinfo Cluster: Ask for Install
DeepSignal	Detecting methylation using signal-level features from Nanopore sequencing reads.	Genobioinfo Cluster: Ask for Install
DENTIST	DENTIST is a sensitive, highly-accurate and automated pipeline method to close gaps in (short read) assemblies with long reads.	Genobioinfo Cluster: Ask for Install
FABuLOUS	A gap-closing software tool that uses error-prone long reads generated by third-generation-sequence techniques (Pacbio, Oxford Nanopore, etc.) or preassembled contigs to fill N-gap in the genome assembly. Initially called TGS-GapCloser.	Genobioinfo Cluster: Ask for Install
FALCON	Falcon: a set of tools for fast aligning long reads for consensus and assembly	Genobioinfo Cluster: Ask for Install
FALCON-Phase	FALCON-Phase integrates PacBio long-read assemblies with Phase Genomics Hi-C data to create phased, diploid, chromosome-scale scaffolds.	Genobioinfo Cluster: Ask for Install
fastplong	Ultra-fast preprocessing and quality control for long-read sequencing data.	Genobioinfo Cluster: How to use
Filtlong	Filtlong is a tool for filtering long reads by quality.	Genobioinfo Cluster: How to use
FLAIR	FLAIR (Full-Length Alternative Isoform analysis of RNA) for the correction, isoform definition, and alternative splicing analysis of noisy reads. FLAIR has primarily been used for nanopore cDNA, native RNA, and PacBio sequencing reads.	Genobioinfo Cluster: Ask for Install
FLAS	FLAS is software that makes self-correction for PacBio long reads with fast speed and high throughput.	Genobioinfo Cluster: Ask for Install
Flye	Flye is a de novo assembler for long and noisy reads, such as those produced by PacBio and Oxford Nanopore Technologies.	Genobioinfo Cluster: How to use
FMLRC	FMLRC, or FM-index Long Read Corrector, is a tool for performing hybrid correction of long read sequencing using the BWT and FM-index of short-read sequencing data.	Genobioinfo Cluster: Ask for Install
FMLRC2	FMLRC2 performs error correction/polishing of long erroneous sequences with accurate short reads. As such, it can be used as both an error-correction tool for raw long reads (e.g. Oxford Nanopore) and a polishing tool for de novo assemblies.	Genobioinfo Cluster: How to use
GCI	Genome Continuity Inspector (GCI) is an assembly assessment tool for high-quality genomes (e.g. T2T genomes), in base resolution.	Genobioinfo Cluster: How to use
GraphAligner	Seed-and-extend program for aligning long error-prone reads to genome graphs.	Genobioinfo Cluster: How to use
GraphMap	A highly sensitive and accurate mapper for long, error-prone reads.	Genobioinfo Cluster: Ask for Install
Hairsplitter	Software that separates very close sequences that have been collapsed during assembly. Uses only long reads.	Genobioinfo Cluster: How to use
Hap10	The goal is to reconstruct accurate and long haplotypes of polyploid genomes using linked reads.	Genobioinfo Cluster: Ask for Install
Hapo-G	Hapo-G is a tool that aims to improve the quality of genome assemblies by polishing the consensus with accurate reads.	Genobioinfo Cluster: How to use
HASLR	HASLR is a tool for rapid genome assembly of long sequencing reads. HASLR is a hybrid tool which means it requires long reads generated by Third Generation Sequencing technologies (such as PacBio or Oxford Nanopore) together with Next Generation Sequencing reads (such as Illumina) from the same sample.	Genobioinfo Cluster: Ask for Install
HECIL	Hybrid Error Correction of Long Reads using Iterative Learning	Genobioinfo Cluster: Ask for Install
HELEN	HELEN (Homopolymer Encoded Long-read Error-corrector for Nanopore) uses a Recurrent-Neural-Network (RNN) based Multi-Task Learning (MTL) model that can predict a base and a run-length for each genomic position using the weights generated by MarginPolish. This installation includes MarginPolish.	Genobioinfo Cluster: Ask for Install
HG-CoLoR	HG-CoLoR (Hybrid method based on a variable-order de bruijn Graph for the error Correction of Long Reads) is a hybrid method for the error correction of long reads that both aligns the short reads to the long reads, and uses a variable-order de Bruijn graph, in a seed-and-extend approach.	Genobioinfo Cluster: Ask for Install
HiFiAdapterFilt	Convert .bam to .fastq and remove reads with remnant PacBio adapter sequences.	Genobioinfo Cluster: How to use
hifiasm	Hifiasm is a fast haplotype-resolved de novo assembler for PacBio HiFi reads. Unlike most existing assemblers, hifiasm starts from an uncollapsed genome.	Genobioinfo Cluster: How to use
hifiasm-meta	De novo metagenome assembler, based on hifiasm, a haplotype-resolved de novo assembler for PacBio HiFi reads.	Genobioinfo Cluster: How to use
IPA	Improved Phased Assembler (IPA) is the official PacBio software for HiFi genome assembly. IPA was designed to utilize the accuracy of PacBio HiFi reads to produce high-quality phased genome assemblies. IPA is an end-to-end solution, starting with input reads and resulting in a polished assembly.	Genobioinfo Cluster: How to use
IsoSeq	Scalable De Novo Isoform Discovery from Single-Molecule PacBio Reads.	Genobioinfo Cluster: How to use
Jabba	A hybrid error correction tool for sequencing reads.	Genobioinfo Cluster: Ask for Install
lamassemble	Merge overlapping "long" DNA reads into a consensus sequence.	Genobioinfo Cluster: Ask for Install
lima	Demultiplex Barcoded PacBio Samples.	Genobioinfo Cluster: Ask for Install
Linker	Linker is a suite of C++ tools useful for interpreting long and linked read sequencing of cancer genomes.	Genobioinfo Cluster: Ask for Install
LIQA	LIQA (Long-read Isoform Quantification and Analysis) is an Expectation-Maximization based statistical method to quantify isoform expression and detect differential alternative splicing (DAS) events using long-read RNA-seq data.	Genobioinfo Cluster: Ask for Install
LongQC	LongQC is a tool for the data quality control of the PacBio and ONT long reads, and it has two functionalities: sample qc and platform qc.	Genobioinfo Cluster: How to use
longshot	Longshot is a variant calling tool for diploid genomes using long error prone reads such as Pacific Biosciences (PacBio) SMRT and Oxford Nanopore Technologies (ONT).	Genobioinfo Cluster: Ask for Install
LoRDEC	LoRDEC is a program to correct sequencing errors in long reads from 3rd generation sequencing with high error rate, and is especially intended for PacBio reads. It uses a hybrid strategy, meaning that it uses two sets of reads: the reference read set, whose error rate is assumed to be small, and the PacBio read set, which is then corrected using the reference set. Typically, the reference set contains Illumina reads.	Genobioinfo Cluster: How to use
LR_Gapcloser	LR_Gapcloser is a gap closing tool using uncorrected or corrected long reads generated from Pacbio platform or Nanopore platform.	Genobioinfo Cluster: Ask for Install
LRScaf	TGS scaffolding. Improving draft genomes using long noisy reads.	Genobioinfo Cluster: Ask for Install
MARVEL	MARVEL consists of a set of tools that facilitate the overlapping, patching, correction and assembly of noisy (not so noisy ones as well) long reads.	Genobioinfo Cluster: Ask for Install
MashMap	MashMap implements a fast and approximate algorithm for computing local alignment boundaries between long DNA sequences. It can be useful for mapping genome assembly or long reads (PacBio/ONT) to reference genome(s). Given a minimum alignment length and an identity threshold for the desired local alignments, Mashmap computes alignment boundaries and identity estimates using k-mers. It does not compute the alignments explicitly, but rather estimates a k-mer based Jaccard similarity using a combination of Minimizers and MinHash. This is then converted to an estimate of sequence identity using the Mash distance.	Genobioinfo Cluster: Ask for Install
MaSuRCA	MaSuRCA is whole genome assembly software. It combines the efficiency of the de Bruijn graph and Overlap-Layout-Consensus (OLC) approaches. MaSuRCA can assemble data sets containing only short reads from Illumina sequencing or a mixture of short reads and long reads (Sanger, 454)	Genobioinfo Cluster: How to use
mCaller	This program is designed to call m6A from nanopore data using the differences between measured and expected currents.	Genobioinfo Cluster: Ask for Install
Megalodon	Megalodon provides "basecalling augmentation" for raw nanopore sequencing reads, including direct, reference-guided SNP and modified base calling.	Genobioinfo Cluster: Ask for Install
MeGAMerge	A tool to merge assembled contigs, long reads from metagenomic sequencing runs	Genobioinfo Cluster: Ask for Install
MetaMaps	MetaMaps is a tool specifically developed for the analysis of long-read (PacBio/ONT) metagenomic datasets. It simultaneously carries out read assignment and sample composition estimation. It is faster than classical exact alignment-based approaches, and its output is more information-rich than that of kmer-spectra-based methods. For example, each MetaMaps alignment comes with an approximate alignment location, an estimated alignment identity and a mapping quality.	Genobioinfo Cluster: Ask for Install
Miniasm	Ultrafast de novo assembly for long noisy reads (though having no consensus step).	Genobioinfo Cluster: How to use
minibar	Dual barcode and primer demultiplexing for MinION sequenced reads.	Genobioinfo Cluster: How to use
MinIONQC	Fast and effective quality control for MinION and PromethION sequencing data	Genobioinfo Cluster: Ask for Install
Minipolish	A tool for Racon polishing of miniasm assemblies.	Genobioinfo Cluster: How to use
MiniScrub	MiniScrub is a de novo long sequencing read preprocessing method that improves read quality by predicting and removing ("scrubbing") read segments that have a high concentration of errors.	Genobioinfo Cluster: Ask for Install
modbam2bed	A program to aggregate modified base counts stored in a modified-base BAM file to a bedMethyl file.	Genobioinfo Cluster: How to use
NanoCaller	NanoCaller is a computational method that integrates long reads in deep convolutional neural network for the detection of SNPs/indels from long-read sequencing data.	Genobioinfo Cluster: How to use
NanoComp	Compare multiple runs of long read sequencing data and alignments.	Genobioinfo Cluster: in Python-3.11.1
NanoSeq	Pipeline used at the LIPME to assemble plasmid and PCR sequences with Nanopore.	Genobioinfo Cluster: How to use
NanoSim	NanoSim is a fast and scalable read simulator that captures the technology-specific features of ONT data, and allows for adjustments upon improvement of nanopore sequencing technology.	Genobioinfo Cluster: How to use
NanoSPC	NanoSPC is a scalable, portable and cloud compatible pipeline for analyzing Nanopore sequencing data.	Genobioinfo Cluster: How to use
NaS	NaS is a hybrid approach developed to take advantage of data generated using MinION device. It combines Illumina and Oxford Nanopore technologies to produce NaS (Nanopore Synthetic-long) reads	Genobioinfo Cluster: Ask for Install
NECAT	NECAT is an error correction and de-novo assembly tool for Nanopore long noisy reads.	Genobioinfo Cluster: How to use
NextDenovo	NextDenovo is a string graph-based de novo assembler for TGS long reads.	Genobioinfo Cluster: How to use
NGMLR	NGMLR is a long-read mapper designed to align PacBio or Oxford Nanopore reads (standard and ultra-long) to a reference genome, with a focus on reads that span structural variations.	Genobioinfo Cluster: How to use
NOVOLoci	NOVOLoci is a haplotype aware assembler for targeted assembly or whole genome assembly of small genomes. We currently recommend limiting the assembly size to regions <20 Mb in targeted-mode and diploid genomes that are <250 Mb in WG-mode, with a minimum sequencing depth of 10x per haplotype. If you do need to phase accurately and you have HiFi or R10 ONT data, it is advised to use Hifiasm, as it has a much shorter runtime. Currently it is only available for Nanopore; PacBio and hybrid options will be available soon.	Genobioinfo Cluster: How to use
oarfish	oarfish is a program, written in Rust, for quantifying transcript-level expression from long-read (i.e. Oxford Nanopore cDNA and direct RNA and PacBio) sequencing technologies. oarfish requires a sample of sequencing reads aligned to the transcriptome (currently not to the genome). It handles multi-mapping reads through the use of probabilistic allocation via an expectation-maximization (EM) algorithm.	Genobioinfo Cluster: How to use
Oases	Oases is a de novo transcriptome assembler designed to produce transcripts from short read sequencing technologies, such as Illumina, SOLiD, or 454 in the absence of any genomic assembly. It was developed by Marcel Schulz (MPI for Molecular Genomics) and Daniel Zerbino (previously at the European Bioinformatics Institute (EMBL-EBI), now at UC Santa Cruz). Oases uploads a preliminary assembly produced by Velvet, and clusters the contigs into small groups, called loci. It then exploits the paired-end read and long read information, when available, to construct transcript isoforms.	Genobioinfo Cluster: How to use
oatk	An organelle de novo genome assembly toolkit. (Installation includes OatkDB.)	Genobioinfo Cluster: How to use
ont_fast5_api	ont_fast5_api is a simple interface to HDF5 files of the Oxford Nanopore fast5 file format.	Genobioinfo Cluster: How to use
Organelle_PBA	OrganelleRef_PBA is a script to perform de novo PacBio assemblies of any organelle (chloroplast or mitochondrial genomes) using several programs.	Genobioinfo Cluster: Ask for Install
Pacasus	Tool for detecting and cleaning PacBio / Nanopore long reads after whole genome amplification.	Genobioinfo Cluster: Ask for Install
pbmm2	A minimap2 frontend for PacBio native data format.	Genobioinfo Cluster: How to use
PBSIM3	A simulator for all types of Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) long reads.	Genobioinfo Cluster: How to use
Peregrine	Peregrine is a fast genome assembler for accurate long reads (length > 10kb, accuracy > 99%). It can assemble a human genome from 30x reads within 20 cpu hours from reads to polished consensus.	Genobioinfo Cluster: How to use
Pomoxis	Pomoxis comprises a set of basic bioinformatic tools tailored to nanopore sequencing.	Genobioinfo Cluster: Ask for Install
Porechop_ABI	Porechop_abi (ab initio) is an extension of Porechop that is able to infer the adapter sequence from the Oxford Nanopore reads. It discovers the adapter sequence from the reads using approximate k-mers and assembly, and adds the sequence found to the adapter list (adapters.py file).	Genobioinfo Cluster: How to use
Pychopper	Pychopper v2 is a tool to identify, orient and trim full-length Nanopore cDNA reads. The tool is also able to rescue fused reads.	Genobioinfo Cluster: How to use
pycoQC	pycoQC computes metrics and generates interactive QC plots from the sequencing summary report generated by Oxford Nanopore Technologies basecallers (Albacore/Guppy).	Genobioinfo Cluster: How to use
Rebaler	Rebaler is a program for conducting reference-based assemblies using long reads. It relies mainly on minimap2 for alignment and Racon for making consensus sequences.	Genobioinfo Cluster: Ask for Install
Racon	Consensus module for raw de novo DNA assembly of long uncorrected reads.	Genobioinfo Cluster: How to use
RAFT	RAFT (Repeat Aware Fragmentation Tool) is an algorithm designed to improve assembly quality by rescuing contained reads. RAFT breaks long reads into smaller sub-reads by following an algorithm described in our preprint. The read fragmentation allows an OLC assembler to retain contained reads during string graph construction. When input reads have non-uniform lengths, retaining contained reads improves assembly contiguity and base-level accuracy. The inputs to RAFT include an error-corrected read file in FASTA/FASTQ format and an all-vs-all alignment file in PAF format. It performs read fragmentation and outputs the fragmented reads in FASTA format. We recommend users to use hifiasm for the initial steps (read error correction, all-vs-all overlap computation) and also for the final step (assembly of fragmented reads). The assembly output format of hifiasm is described here. The RAFT-hifiasm workflow is recommended for long accurate reads with non-uniform length distribution (e.g., ONT Duplex, or a mixture of ONT Duplex and HiFi reads). ONT UL reads can optionally be integrated during the final assembly step.	Genobioinfo Cluster: How to use
Ratatosk	Ratatosk is a phased error correction tool for erroneous long reads based on compacted and colored de Bruijn graphs built from accurate short reads.	Genobioinfo Cluster: Ask for Install
rust-mdbg	rust-mdbg is an ultra-fast minimizer-space de Bruijn graph (mdBG) implementation, geared towards the assembly of long and accurate reads such as PacBio HiFi.	Genobioinfo Cluster: How to use
Shasta	The goal of the Shasta long read assembler is to rapidly produce accurate assembled sequence using as input DNA reads generated by Oxford Nanopore flow cells.	Genobioinfo Cluster: How to use
SLR-superscaffolder	This is a scaffold assembler designed for stLFR reads. It uses the link-reads information from stLFR reads to assemble contigs to scaffolds.	Genobioinfo Cluster: Ask for Install
spliced_bam2gff	A tool to convert spliced BAM alignments into GFF2 format.	Genobioinfo Cluster: Ask for Install
SQANTI3	SQANTI3 is the newest version of the SQANTI tool. It merges features from SQANTI and SQANTI2, together with new additions. SQANTI3 will continue as an integrated development aiming to provide the best possible characterization of your new long read-defined transcriptome. SQANTI3 is the first module of the Functional IsoTranscriptomics (FIT) framework, which also includes IsoAnnot and tappAS.	Genobioinfo Cluster: How to use
SquiggleKit	A toolkit for manipulating nanopore signal data.	Genobioinfo Cluster: Ask for Install
SSPACE-LongRead	SSPACE-LongRead is a stand-alone program for scaffolding pre-assembled contigs using long reads (e.g. PacBio RS reads).	Genobioinfo Cluster: Ask for Install
Sturgeon	Sturgeon is a CNS neural network classifier for tumour classification	Genobioinfo Cluster: How to use
SVIM	SVIM is a structural variant caller for long reads. It is able to detect, classify and genotype five different classes of structural variants.	Genobioinfo Cluster: How to use
SVJedi	SVJedi is a structural variation (SV) genotyper for long read data.	Genobioinfo Cluster: Ask for Install
SVJedi-graph	SVJedi-graph is a structural variation (SV) genotyper for long read data.	Genobioinfo Cluster: How to use
Tapestry	Tapestry is a tool to validate and edit small eukaryotic genome assemblies using long sequence reads. It is designed to help identify complete chromosomes, symbionts, haplotypes, complex features and errors in close-to-complete genome assemblies.	Genobioinfo Cluster: Ask for Install
TGSGapFiller	A gap filling tool that uses error-prone long reads generated by third-generation-sequence techniques (Pacbio, Oxford Nanopore, etc.) or preassembled contigs to fill N-gap in the genome assembly.	Genobioinfo Cluster: Ask for Install
Tombo	Tombo is a suite of tools primarily for the identification of modified nucleotides from raw nanopore sequencing data.	Genobioinfo Cluster: Ask for Install
Trycycler	Trycycler is a tool for generating consensus long-read assemblies for bacterial genomes.	Genobioinfo Cluster: How to use
uLTRA	uLTRA is a tool for splice alignment of long transcriptomic reads to a genome, guided by a database of exon annotations.	Genobioinfo Cluster: How to use
VeChat	Correcting errors in noisy long reads using variation graphs.	Genobioinfo Cluster: How to use
Velvet	Velvet is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454, developed by Daniel Zerbino and Ewan Birney at the European Bioinformatics Institute (EMBL-EBI), near Cambridge, in the United Kingdom. Velvet currently takes in short read sequences, removes errors then produces high quality unique contigs. It then uses paired-end read and long read information, when available, to retrieve the repeated areas between contigs.	Genobioinfo Cluster: How to use
Voyager	Rapid and efficient mapping algorithm for long sequencing reads with insertion- and deletion errors. Mapping long reads in Sorted Motif Distance Space.	Genobioinfo Cluster: How to use
WhatsHap	WhatsHap is software for phasing genomic variants using DNA sequencing reads, also called read-based phasing or haplotype assembly. It is especially suitable for long reads, but also works well with short reads.	Genobioinfo Cluster: How to use
yacrd	Yet Another Chimeric Read Detector for long reads	Genobioinfo Cluster: Ask for Install
Yak	Yak is initially developed for two specific use cases: 1) to robustly estimate the base accuracy of CCS reads and assembly contigs, and 2) to investigate the systematic error rate of CCS reads.	Genobioinfo Cluster: Ask for Install

Mathematics

Application	Description	Availability/Use
Beagle-lib	BEAGLE-lib is a high-performance library that can perform the core calculations at the heart of most Bayesian and Maximum Likelihood phylogenetics packages	Genobioinfo Cluster: How to use
eLSA	Extended Local Similarity Analysis -- Finding Time-Dependent Associations in Time Series Datasets	Genobioinfo Cluster: Ask for Install
GenoML2	GenoML (genoml2) is an open source Python package. It is an automated machine learning (autoML) platform for genomics data.	Genobioinfo Cluster: How to use
GROMACS	GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.	Genobioinfo Cluster: How to use
JAGS	JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation not wholly unlike BUGS.	Genobioinfo Cluster: How to use
MCL	The MCL algorithm is short for the Markov Cluster Algorithm, a fast and scalable unsupervised cluster algorithm for graphs (also known as networks) based on simulation of (stochastic) flow in graphs.	Genobioinfo Cluster: How to use
ReLERNN	Recombination Landscape Estimation using Recurrent Neural Networks	Genobioinfo Cluster: How to use
snape-pooled	SNAPE-pooled computes the probability distribution for the frequency of the minor allele in a certain population, at a certain position in the genome.	Genobioinfo Cluster: Ask for Install
sNMF	A fast and efficient program for estimating individual admixture coefficients based on sparse non-negative matrix factorization and population genetics.	Genobioinfo Cluster: Ask for Install
toulbar2	toulbar2 is an open-source black-box C++ optimizer for cost function networks and discrete additive graphical models. It can read a variety of formats.	Genobioinfo Cluster: How to use

Metabolic Network Modeling

Application	Description	Availability/Use
gapseq	Informed prediction and analysis of bacterial metabolic pathways and genome-scale networks.	Genobioinfo Cluster: How to use

Metabolomics

Application	Description	Availability/Use
HUMAnN	HUMAnN is a method for efficiently and accurately profiling the abundance of microbial metabolic pathways and other molecular functions from metagenomic or metatranscriptomic sequencing data.	Genobioinfo Cluster: How to use
METABOLIC	METabolic And BiogeOchemistry anaLyses In miCrobes	Genobioinfo Cluster: How to use
PlantiSMASH	PlantiSMASH is a specialized extension of antiSMASH for the identification and analysis of biosynthetic gene clusters (BGCs) in plant genomes. It supports advanced plant-specific detection rules and features for comparative genomics, visualization, and more.	Genobioinfo Cluster: How to use
RDKit	The RDKit is a collection of cheminformatics and machine-learning software written in C++ and Python.	Genobioinfo Cluster: How to use

ncRNA

Application	Description	Availability/Use
ARAGORN	ARAGORN is a program to detect tRNA genes and tmRNA genes in nucleotide sequence	Genobioinfo Cluster: Ask for Install
Barrnap	Barrnap predicts the location of ribosomal RNA genes in genomes. It supports bacteria (5S,23S,16S), archaea (5S,5.8S,23S,16S), mitochondria (12S,16S) and eukaryotes (5S,5.8S,28S,18S).	Genobioinfo Cluster: Ask for Install
BlockClust	BlockClust is an efficient approach to detect transcripts with similar processing patterns. We propose a novel way to encode expression profiles in compact discrete structures, which can then be processed using fast graph-kernel techniques. BlockClust allows both clustering and classification of small non-coding RNAs.	Genobioinfo Cluster: Ask for Install
circtools	A modular, python-based framework for circRNA-related tools that unifies several functionalities in a single, command line driven software.	Genobioinfo Cluster: Ask for Install
CIRI-long	Circular RNA Identification for Long-Reads Nanopore Sequencing Data.	Genobioinfo Cluster: How to use
CleaveLand4	Analysis of degradome data to find sliced miRNA and siRNA targets	Genobioinfo Cluster: How to use
FEELnc	FlExible Extraction of LncRNA.	Genobioinfo Cluster: How to use
Infernal	Infernal ("INFERence of RNA ALignment") is for searching DNA sequence databases for RNA structure and sequence similarities. It is an implementation of a special case of profile stochastic context-free grammars called covariance models (CMs).	Genobioinfo Cluster: How to use
miRDeep2	miRDeep2 is a software package for identification of novel and known miRNAs in deep sequencing data. Furthermore, it can be used for miRNA expression profiling across samples. Last, a new module for preprocessing of raw Illumina sequencing data produces files for downstream analysis with the miRDeep2 or quantifier module.	Genobioinfo Cluster: How to use
MiRfold	MiRfold searches for a good miRNA-like folding in the sequence surrounding a putative miRNA. It was optimized on plant miRNAs.	Genobioinfo Cluster: Ask for Install
PHASE	PHASE is a package that performs molecular phylogenetic inference. The software seeks to accurately compare molecular sequences to determine the likely evolutionary relationships between a group of species.	Genobioinfo Cluster: How to use
PhaseTank	To systemically characterize phasiRNAs/tasiRNAs and their regulatory cascades 'miRNA/phasiRNA ->	Genobioinfo Cluster: Ask for Install
phyloFlash	phyloFlash is a pipeline to rapidly reconstruct the SSU rRNAs and explore phylogenetic composition of an Illumina (meta)genomic dataset.	Genobioinfo Cluster: How to use
PingPongPro	Find ping-pong signatures in piRNA-Seq data like a pro.	Genobioinfo Cluster: Ask for Install
Prost	Prost! (PRocessing Of Short Transcripts) can analyze smallRNA sequencing data generated on any sequencing platform. Prost! does not rely on existing annotation to filter sequencing reads but instead starts by aligning all the reads on a user-provided genomic reference, allowing the study of miRNAs in any species. Additionally, any number of samples can be studied together in a single Prost! run, allowing an accurate analysis of an entire dataset. After grouping the processed reads by genomic location, Prost! then annotates them using a user-defined annotation database (public or personal annotation database).	Genobioinfo Cluster: How to use
RNAclust	RNAclust is a perl script summarizing all the single steps required for clustering of structured RNA motifs, i.e. identifying groups of RNA sequences sharing a secondary structure motif.	Genobioinfo Cluster: Ask for Install
RNAmmer	RNAmmer predicts 5s/8s, 16s/18s, and 23s/28s ribosomal RNA in full genome sequences. The program uses hidden Markov models trained on data from the 5S ribosomal RNA database and the European ribosomal RNA database project.	Genobioinfo Cluster: How to use
RNAscClust	RNAscClust is a pipeline to cluster a set of structured RNAs taking their respective structural conservation into account. The aim of RNAscClust is to aid the discovery of families and classes of ncRNAs.	Genobioinfo Cluster: Ask for Install
ShortStack	ShortStack is a tool developed to process and analyze smallRNA-seq data with respect to a reference genome, and output a comprehensive and informative annotation of all discovered small RNA genes.	Genobioinfo Cluster: How to use
snoGPS	Search for H/ACA snoRNA genes in a genomic sequence.	Genobioinfo Cluster: How to use
Snoscan	Search for C/D box methylation guide snoRNA genes in a genomic sequence.	Genobioinfo Cluster: How to use
SortMeRNA	SortMeRNA is a software designed to rapidly filter ribosomal RNA fragments from metatranscriptomic data produced by next-generation sequencers. It is capable of handling large RNA databases and sorting out all fragments matching the database with high accuracy and specificity.	Genobioinfo Cluster: How to use
srnaMapper	This tool maps reads produced by sRNA-Seq to a genome.	Genobioinfo Cluster: Ask for Install
tRNAscan-SE	Search for tRNA genes in genomic sequence.	Genobioinfo Cluster: How to use

Nucleic acid folding

Application	Description	Availability/Use
Juicebox	Software for visualizing data from Hi-C and other proximity mapping experiments	Genobioinfo Cluster: Ask for Install
Juicer	A One-Click System for Analyzing Loop-Resolution Hi-C Experiments	Genobioinfo Cluster: How to use
LocARNA	LocARNA is a tool for multiple alignment of RNA molecules. LocARNA requires only RNA sequences as input and will simultaneously fold and align the input sequences.	Genobioinfo Cluster: Ask for Install
MiRfold	MiRfold searches for a good miRNA-like folding in the sequence surrounding a putative miRNA. It was optimized on plant miRNAs.	Genobioinfo Cluster: Ask for Install
non-B_gfa	gfa programs for Non-B site at NCI/FNLCR. gfa is a Suite of programs developed at NCI-Frederick/Frederick National Lab to find sequences associated with non-B DNA forming motifs.	Genobioinfo Cluster: Ask for Install
Openfold3	A fully open source biomolecular structure prediction model based on AlphaFold3.	Genobioinfo Cluster: How to use
PETfold	PETfold performs Probabilistic Evolutionary and Thermodynamic folding of a multiple alignment of RNA sequences.	Genobioinfo Cluster: Ask for Install
R-scape	R-scape looks for evidence of a conserved RNA structure by measuring pairwise covariations observed in an input multiple sequence alignment. It analyzes all possible pairs, including those in your proposed structure (if you provide one).	Genobioinfo Cluster: Ask for Install
randfold	The software computes the probability that, for a given RNA sequence, the Minimum Free Energy (MFE) of the secondary structure is different from a distribution of MFE computed with random sequences.	Genobioinfo Cluster: How to use
RFdiffusion	RFdiffusion is an open source method for structure generation, with or without conditional information (a motif, target etc).	Genobioinfo Cluster: How to use
Rfold	Rfold computes local base pairing probabilities for long DNA sequences.	Genobioinfo Cluster: Ask for Install
RNAclust	RNAclust is a perl script summarizing all the single steps required for clustering of structured RNA motifs, i.e. identifying groups of RNA sequences sharing a secondary structure motif.	Genobioinfo Cluster: Ask for Install
RNAz	RNAz detects stable and conserved RNA secondary structures in multiple sequence alignments.	Genobioinfo Cluster: Ask for Install

ONT

Application	Description	Availability/Use
ASHURE	Python-based pipeline for analyzing Nanopore sequencing metabarcoding data. ASHURE can take a reference database in order to improve accuracy.	Genobioinfo Cluster: Ask for Install
blue-crab	blue-crab is a conversion tool to convert from ONT's POD5 format to the community maintained SLOW5/BLOW5 format.	Genobioinfo Cluster: How to use
CARNAC-LR	Clustering coefficient-based Acquisition of RNA Communities in Long Reads.	Genobioinfo Cluster: Ask for Install
chopper	This tool, intended for long-read sequencing such as PacBio or ONT, filters and trims a fastq file. chopper combines the functions of the now outdated tools NanoFilt and NanoLyse. It can filter QC files and runs faster than NanoFilt and NanoLyse.	Genobioinfo Cluster: How to use
CONSENT	CONSENT (sCalable self-cOrrectioN of long reads with multiple SEquence alignmeNT) is a self-correction method for long reads.	Genobioinfo Cluster: Ask for Install
DAmar	Long read QC, assembly and scaffolding pipeline for PacBio or Oxford Nanopore long-read sequencing data. The pipeline produces a number of QC metrics at various stages and incorporates further technologies, including Bionano, 10x and HiC data, to scaffold the created contigs. DAmar is a hybrid of the earlier Marvel, Dazzler, and Daccord systems of the Eugene Myers lab.	Genobioinfo Cluster: Ask for Install
Deepbinner	Deepbinner is a tool for demultiplexing barcoded Oxford Nanopore sequencing reads. It does this with a deep convolutional neural network classifier, using many of the architectural advances that have proven successful in image classification. Unlike other demultiplexers (e.g. Albacore and Porechop), Deepbinner identifies barcodes from the raw signal (a.k.a. squiggle) which gives it greater sensitivity and fewer unclassified reads.	Genobioinfo Cluster: Ask for Install
DeepSignal	Detecting methylation using signal-level features from Nanopore sequencing reads.	Genobioinfo Cluster: Ask for Install
Dorado	Dorado is a high-performance, easy-to-use, open source basecaller for Oxford Nanopore reads.	Genobioinfo Cluster: How to use
f5c	Ultra-fast methylation calling and event alignment tool for nanopore sequencing data.	Genobioinfo Cluster: How to use
Flye	Flye is a de novo assembler for long and noisy reads, such as those produced by PacBio and Oxford Nanopore Technologies.	Genobioinfo Cluster: How to use
GCI	Genome Continuity Inspector (GCI) is an assembly assessment tool for high-quality genomes (e.g. T2T genomes), in base resolution.	Genobioinfo Cluster: How to use
LR_Gapcloser	LR_Gapcloser is a gap closing tool using uncorrected or corrected long reads generated from Pacbio platform or Nanopore platform.	Genobioinfo Cluster: Ask for Install
mCaller	This program is designed to call m6A from nanopore data using the differences between measured and expected currents.	Genobioinfo Cluster: Ask for Install
modbam2bed	A program to aggregate modified base counts stored in a modified-base BAM file to a bedMethyl file.	Genobioinfo Cluster: How to use
Modkit	A bioinformatics tool for working with modified bases from Oxford Nanopore. Specifically for converting modBAM to bedMethyl files using best practices, but also manipulating modBAM files and generating summary statistics.	Genobioinfo Cluster: How to use
NaMeco	Pipeline for the Nanopore 16S long read clustering and taxonomy classification.	Genobioinfo Cluster: How to use
Nano-Q	Python script for conservatively cleaning ONT reads from bam files and estimating variant frequencies.	Genobioinfo Cluster: Ask for Install
NanoASV	Nanopore full-length 16S metabarcoding amplicon data analysis	Genobioinfo Cluster: How to use
NanoCLUST	NanoCLUST is an analysis pipeline for UMAP-based classification of amplicon-based full-length 16S rRNA nanopore reads.	Genobioinfo Cluster: How to use
NanoCount	NanoCount estimates transcript abundance from Oxford Nanopore direct-RNA sequencing datasets, using an expectation-maximization approach like RSEM, Kallisto, salmon, etc., to handle the uncertainty of multi-mapping reads.	Genobioinfo Cluster: Ask for Install
NanoSim	NanoSim is a fast and scalable read simulator that captures the technology-specific features of ONT data, and allows for adjustments upon improvement of nanopore sequencing technology.	Genobioinfo Cluster: How to use
NECAT	NECAT is an error correction and de-novo assembly tool for Nanopore long noisy reads.	Genobioinfo Cluster: How to use
ont_fast5_api	ont_fast5_api is a simple interface to HDF5 files of the Oxford Nanopore fast5 file format.	Genobioinfo Cluster: How to use
POD5	The pod5 Python package contains the tools and python API wrapping the compiled bindings for the POD5 file format from lib_pod5.	Genobioinfo Cluster: How to use
Porechop_ABI	Porechop_abi (ab initio) is an extension of Porechop that is able to infer the adapter sequence from the Oxford Nanopore reads. It discovers the adapter sequence from the reads using approximate k-mers and assembly, and adds the sequence found to the adapter list (adapters.py file).	Genobioinfo Cluster: How to use
Ratatosk	Ratatosk is a phased error correction tool for erroneous long reads based on compacted and colored de Bruijn graphs built from accurate short reads.	Genobioinfo Cluster: Ask for Install
Shasta	The goal of the Shasta long read assembler is to rapidly produce accurate assembled sequence using as input DNA reads generated by Oxford Nanopore flow cells.	Genobioinfo Cluster: How to use
Sniffles	A fast structural variant caller for long-read sequencing, Sniffles2 accurately detects SVs at the germline, somatic and population levels for PacBio and Oxford Nanopore read data.	Genobioinfo Cluster: How to use
spliced_bam2gff	A tool to convert spliced BAM alignments into GFF2 format.	Genobioinfo Cluster: Ask for Install
SquiggleKit	A toolkit for manipulating nanopore signal data.	Genobioinfo Cluster: Ask for Install
Tombo	Tombo is a suite of tools primarily for the identification of modified nucleotides from raw nanopore sequencing data.	Genobioinfo Cluster: Ask for Install
Variabel	A novel approach and method for intrahost variant detection, which outperforms existing ONT variant callers.	Genobioinfo Cluster: Ask for Install

Pangenome

Application	Description	Availability/Use
Bandage-NG	Bandage-NG is a GUI program that allows users to interact with the assembly graphs made by de novo assemblers such as SPAdes, MEGAHIT and others.	Genobioinfo Cluster: How to use
Bifrost	Highly parallel construction and indexing of colored and compacted de Bruijn graphs.	Genobioinfo Cluster: How to use
Cactus	Cactus is a reference-free whole-genome alignment program, as well as a pangenome graph construction toolkit.	Genobioinfo Cluster: How to use
CLARC	Connected Linkage and Alignment Redefinition of COGs: a tool that uses sequence identity, linkage patterns and functional annotations to identify and reduce the over-splitting of accessory genes into multiple clusters of orthologous genes (COGs) in a pangenome analysis. In summary, CLARC is meant to complement existing bacterial pangenome tools by polishing their COG definitions. As input, the pipeline currently takes the presence absence matrix generated with Roary (but can also accept inputs from Panaroo, PPanGGOLiN and RIBAP). We believe CLARC is particularly helpful for researchers that plan to perform downstream analyses that rely on COG frequencies, such as studying the evolutionary dynamics of accessory genes or running a panGWAS.	Genobioinfo Cluster: How to use
fastix	A simple command line tool to add prefixes to FASTA headers.	Genobioinfo Cluster: How to use
gaftools	gaftools is a fast and comprehensive toolkit designed for processing pangenome alignments.	Genobioinfo Cluster: How to use
gfacpp	Library for common operations on GFA graphs and some processing algorithms.	Genobioinfo Cluster: How to use
GFAffix		Genobioinfo Cluster: How to use
GfaViz	Graphical interactive tool for the visualization of sequence graphs in GFA format.	Genobioinfo Cluster: How to use
ggCaller	A de Bruijn graph-based gene-caller and pangenome analysis tool.	Genobioinfo Cluster: How to use
GrAnnoT	GrAnnoT is an annotation transfer tool for pangenome graphs.	Genobioinfo Cluster: How to use
hal2vg	Convert HAL to vg-compatible sequence graph.	Genobioinfo Cluster: How to use
Jasmine	JASMINE: Jointly Accurate Sv Merging with Intersample Network Edges. This tool is used to merge structural variants (SVs) across samples. Each sample has a number of SV calls, consisting of position information (chromosome, start, end, length), type and strand information, and a number of other values. Jasmine represents the set of all SVs across samples as a network, and uses a modified minimum spanning forest algorithm to determine the best way of merging the variants such that each merged variant represents a set of analogous variants occurring in different samples. Manual: Jasmine User Manual (mkirsche/Jasmine wiki on GitHub). Jasmine also includes a module for automating the creation of IGV screenshots of variants of interest.	Genobioinfo Cluster: How to use
kSNP4	kSNP4 identifies the pan-genome SNPs in a set of genome sequences, and estimates phylogenetic trees based upon those SNPs.	Genobioinfo Cluster: How to use
odgi	odgi provides an efficient and succinct dynamic DNA sequence graph model, as well as a host of algorithms that allow the use of such graphs in bioinformatic analyses.	Genobioinfo Cluster: How to use
PanACoTA	PANgenome with Annotations, COre identification, Tree and corresponding Alignments.	Genobioinfo Cluster: How to use
panacus		Genobioinfo Cluster: How to use
PanGenie	A short-read genotyper for various types of genetic variants (such as SNPs, indels and structural variants) represented in a pangenome graph.	Genobioinfo Cluster: How to use
pantera	Identification of transposable element families from pangenome polymorphisms. A pangenome is a collection of genomes or haplotypes that can be aligned and stored as a variation graph in gfa format. pantera receives as input a list of gfa files of non overlapping variation graphs and produces a library of transposable elements found to be polymorphic on that pangenome.	Genobioinfo Cluster: How to use
PanTools	PanTools is a toolkit for comparative analysis of large number of genomes.	Genobioinfo Cluster: How to use
pggb	Pangenome graph builder.	Genobioinfo Cluster: How to use
PopIns2	Population-scale detection of non-reference sequence variants using colored de Bruijn Graphs.	Genobioinfo Cluster: Ask for Install
PPanGGOLiN	PPanGGOLiN (Gautreau et al. 2020) is a software suite used to create and manipulate prokaryotic pangenomes from a set of either genomic DNA sequences or provided genome annotations.	Genobioinfo Cluster: How to use
proteinortho_curves	Draw pan- and core-genome curves from proteinortho output	Genobioinfo Cluster: How to use
varigraph	An accurate and widely applicable pangenome graph-based variant genotyper for diploid and polyploid genomes.	Genobioinfo Cluster: How to use
vcfbub	Popping bubbles in vg deconstruct VCFs.	Genobioinfo Cluster: How to use

Patterns and profiles

Application	Description	Availability/Use
BlockClust	BlockClust is an efficient approach to detect transcripts with similar processing patterns. We propose a novel way to encode expression profiles in compact discrete structures, which can then be processed using fast graph-kernel techniques. BlockClust allows both clustering and classification of small non-coding RNAs.	Genobioinfo Cluster: Ask for Install
CMfinder	CMfinder is an RNA motif prediction tool.	Genobioinfo Cluster: Ask for Install
DinuQ	The DinuQ (Dinucleotide Quantification) Python3 package provides a range of metrics for quantifying nucleotide, dinucleotide and synonymous codon representation in genetic sequences.	Genobioinfo Cluster: Ask for Install
EMBOSS	EMBOSS is "The European Molecular Biology Open Software Suite". EMBOSS is a free Open Source software analysis package specially developed for the needs of the molecular biology (e.g. EMBnet) user community. The software automatically copes with data in a variety of formats and even allows transparent retrieval of sequence data from the web. Also, as extensive libraries are provided with the package, it is a platform to allow other scientists to develop and release software in true open source spirit. EMBOSS also integrates a range of currently available packages and tools for sequence analysis into a seamless whole.	Genobioinfo Cluster: How to use
FastaGrep	FastaGrep is a tool for searching oligonucleotide binding sites from FastA genomic sequences. It can do both match/mismatch based and thermodynamic binding energy searches.	Genobioinfo Cluster: Ask for Install
Gerbil	A basic task in bioinformatics is the counting of k-mers in genome strings.	Genobioinfo Cluster: Ask for Install
Homer	HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and next-gen sequencing analysis. It is a collection of command line programs for unix-style operating systems written in Perl and C++.	Genobioinfo Cluster: How to use
iFeature	iFeature is a comprehensive Python-based toolkit for generating various numerical feature representation schemes from protein or peptide sequences. Install with SPAAN model: https://github.com/nicolagulmini/spaan	Genobioinfo Cluster: How to use
kmap	Standalone tool based on the bwa index to locate a set of kmers along a reference genome.	Genobioinfo Cluster: Ask for Install
MEME	The MEME Suite allows you to: (1) discover motifs using MEME or GLAM2 on groups of related DNA or protein sequences, (2) search sequence databases using motifs, (3) compare a motif to all motifs in a database of motifs, and (4) associate motifs with Gene Ontology terms via their putative target genes.	Genobioinfo Cluster: How to use
pftools3	The pftools package contains all the software necessary to build protein and DNA generalized profiles and use them to scan and align sequences, and search databases	Genobioinfo Cluster: How to use
QmRLFS-finder	QmRLFS-finder, the first R-loop finding tool which uses (unsupervised) QmRLFS (Quantitative Models of RLFS) models to predict RLFSs. This command line tool generates locations and detailed information of RLFSs as well as standards-compliant output files for further analysis and visualization.	Genobioinfo Cluster: How to use
RNAclust	RNAclust is a perl script summarizing all the single steps required for clustering of structured RNA motifs, i.e. identifying groups of RNA sequences sharing a secondary structure motif.	Genobioinfo Cluster: Ask for Install
Scan For Matches	scan_for_matches is a utility written in C for locating patterns in DNA or protein FASTA files.	Genobioinfo Cluster: Ask for Install
VAST-TOOLS	Vertebrate Alternative Splicing and Transcription Tools (VAST-TOOLS) is a toolset for profiling and comparing alternative splicing events in RNA-Seq data.	Genobioinfo Cluster: Ask for Install

Phylogeny & selection / Metagenomic

Application	Description	Availability/Use
AAF	This is a package for constructing phylogeny without doing alignment or assembly.	Genobioinfo Cluster: Ask for Install
adegenet	R package dedicated to the exploratory analysis of genetic data. It implements a set of tools ranging from multivariate methods to spatial genetics and genome-wide SNP data analysis.	Genobioinfo Cluster: Ask for Install
ALFATClust	ALignment-Free Adaptive Threshold Clustering: a biological sequence clustering tool with a dynamic threshold for individual clusters. Suitable for clustering multiple groups of homologous sequences.	Genobioinfo Cluster: How to use
AmpliSAT	AmpliSAT (Amplicon Sequencing Analysis Tools) is a set of online tools that simplify the analysis of Amplicon Sequencing experiments.	Genobioinfo Cluster: How to use
Anvio	Anvi’o is an analysis and visualization platform for ‘omics data. It brings together many aspects of today’s cutting-edge genomic, metagenomic, and metatranscriptomic analysis practices to address a wide array of needs.	Genobioinfo Cluster: How to use
Apscale	Advanced Pipeline for Simple yet Comprehensive AnaLysEs of DNA metabarcoding data	Genobioinfo Cluster: How to use
ARGweaver	The ARGweaver/ARGweaver-D software package contains programs and libraries for sampling and manipulating ancestral recombination graphs (ARGs).	Genobioinfo Cluster: How to use
ASHURE	Python-based pipeline for analyzing Nanopore sequencing metabarcoding data. ASHURE can take a reference database in order to improve accuracy.	Genobioinfo Cluster: Ask for Install
ASTER	A family of ASTRAL-like algorithms.	Genobioinfo Cluster: How to use
ASTRAL	ASTRAL is a tool for estimating an unrooted species tree given a set of unrooted gene trees.	Genobioinfo Cluster: How to use
ASTRAL-Pro	ASTRAL-Pro stands for ASTRAL for PaRalogs and Orthologs. ASTRAL is a tool for estimating an unrooted species tree given a set of unrooted gene trees.	Genobioinfo Cluster: Ask for Install
Bakta	Rapid & standardized annotation of bacterial genomes, MAGs & plasmids.	Genobioinfo Cluster: How to use
BAli-Phy	BAli-Phy is software by Ben Redelings that estimates multiple sequence alignments and evolutionary trees from DNA, amino acid, or codon sequences. It uses likelihood-based evolutionary models of substitutions and insertions and deletions to place gaps.	Genobioinfo Cluster: Ask for Install
BayesTraits	BayesTraits is a computer package for performing analyses of trait evolution among groups of species for which a phylogeny or sample of phylogenies is available. This new package incorporates our earlier and separate programs Multistate, Discrete and Continuous. BayesTraits can be applied to the analysis of traits that adopt a finite number of discrete states, or to the analysis of continuously varying traits. Hypotheses can be tested about models of evolution, about ancestral states and about correlations among pairs of traits.	Genobioinfo Cluster: Ask for Install
Beagle-lib	BEAGLE-lib is a high-performance library that can perform the core calculations at the heart of most Bayesian and Maximum Likelihood phylogenetics packages	Genobioinfo Cluster: How to use
BEAST	BEAST is a software package for phylogenetic analysis with an emphasis on time-scaled trees. BEAST is a cross-platform program for Bayesian analysis of molecular sequences using MCMC. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. BEAST uses MCMC to average over tree space, so that each tree is weighted proportional to its posterior probability. We include a simple-to-use user-interface program for setting up standard analyses and a suite of programs for analysing the results.	Genobioinfo Cluster: How to use
BEAST2	BEAST 2 is a cross-platform program for Bayesian phylogenetic analysis of molecular sequences.	Genobioinfo Cluster: How to use
BIG-SCAPE	Biosynthetic Genes Similarity Clustering and Prospecting Engine. Defines a distance metric between Gene Clusters using a combination of three indices (Jaccard Index of domain types, Domain Sequence Similarity, and the Adjacency Index).	Genobioinfo Cluster: Ask for Install
BlobTools	A modular command-line solution for visualisation, quality control and taxonomic partitioning of genome datasets.	Genobioinfo Cluster: How to use
BOLDigger	Python program to query .fasta files against the different databases of www.boldsystems.org	Genobioinfo Cluster: How to use
BppSuite	BppSuite is a suite of ready-to-use programs for phylogenetic and sequence analysis.	Genobioinfo Cluster: How to use
Bracken	Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.	Genobioinfo Cluster: How to use
BUCKy	BUCKy is a free program to combine molecular data from multiple loci. BUCKy estimates the dominant history of sampled individuals, and how much of the genome supports each relationship, using Bayesian concordance analysis.	Genobioinfo Cluster: Ask for Install
CAFE	Software for Computational Analysis of gene Family Evolution. The purpose of CAFE is to analyze changes in gene family size in a way that accounts for phylogenetic history and provides a statistical foundation for evolutionary inferences.	Genobioinfo Cluster: How to use
CafePlotter	A tool for plotting CAFE5 gene family expansion/contraction result.	Genobioinfo Cluster: How to use
CAMI-AMBER	AMBER is an evaluation package for the comparative assessment of genome reconstructions and taxonomic assignments from metagenome benchmark datasets.	Genobioinfo Cluster: Ask for Install
CAMISIM	CAMISIM is a software to model abundance distributions of microbial communities and to simulate corresponding shotgun metagenome datasets.	Genobioinfo Cluster: How to use
CarpeDeam	CarpeDeam is a damage-aware metagenome assembler for ancient metagenomic DNA datasets. It takes (merged) reads and a damage matrix as input and has proved to work best for heavily damaged datasets.	Genobioinfo Cluster: How to use
CCMetagen	CCMetagen processes sequence alignments produced with KMA, which implements the ConClave sorting scheme to achieve highly accurate read mappings. CCMetagen produces ranked taxonomic results in user-friendly formats that are ready for publication or downstream statistical analyses.	Genobioinfo Cluster: Ask for Install
Centrifuge	Classifier for metagenomic sequences. Centrifuge is a novel microbial classification engine that enables rapid, accurate and sensitive labeling of reads and quantification of species on desktop computers.	Genobioinfo Cluster: How to use
CheckM	Assess the quality of microbial genomes recovered from isolates, single cells, and metagenomes.	Genobioinfo Cluster: How to use
CheckM2	Assessing the quality of metagenome-derived genome bins using machine learning.	Genobioinfo Cluster: How to use
ClonalFrameML	A software package that performs efficient inference of recombination in bacterial genomes.	Genobioinfo Cluster: How to use
COMEBin	COMEBin allows effective binning of metagenomic contigs using COntrastive Multi-viEw representation learning.	Genobioinfo Cluster: How to use
CONCOCT	A program for unsupervised binning of metagenomic contigs by using nucleotide composition, coverage data in multiple samples and linkage data from paired end reads.	Genobioinfo Cluster: How to use
CoverM	Read coverage calculator for metagenomics.	Genobioinfo Cluster: How to use
CRABS	CRABS (Creating Reference databases for Amplicon-Based Sequencing) is a versatile software program that generates curated reference databases for metagenomic analysis.	Genobioinfo Cluster: How to use
d2SBin	Improving the binning of metagenomic contigs on d2S oligonucleotide frequency dissimilarity	Genobioinfo Cluster: Ask for Install
DAS_Tool	An automated method that integrates the results of a flexible number of binning algorithms to calculate an optimized, non-redundant set of bins from a single assembly.	Genobioinfo Cluster: Ask for Install
DATES	DATES (Distribution of Ancestry Tracts of Evolutionary Signals) is a method to estimate the time of admixture in ancient DNA samples, described in Narasimhan, Patterson et al. 2018.	Genobioinfo Cluster: How to use
dbcAmplicons	Analysis of Double Barcoded Illumina Amplicon Data.	Genobioinfo Cluster: Ask for Install
decOM	decOM is a high-accuracy microbial source tracking method that is suitable for contamination quantification in paleogenomics, namely the analysis of collections of possibly contaminated ancient oral metagenomic data sets.	Genobioinfo Cluster: How to use
DECX	This is the DECX (DEC eXtended) model for historical biogeographic inference	Genobioinfo Cluster: Ask for Install
DESMAN	De novo Extraction of Strains from MetAgeNomes.	Genobioinfo Cluster: Ask for Install
DGINN	DGINN is a pipeline dedicated to the detection of genetic innovations, starting from a gene sequence.	Genobioinfo Cluster: How to use
DLCpar	DLCpar is a reconciliation method for inferring gene duplications, losses, and coalescence (accounting for incomplete lineage sorting).	Genobioinfo Cluster: How to use
dnabarcoder	Dnabarcoder is a tool to PREDICT global and local similarity cut-offs for fungal sequence identification for a reference dataset, and CLASSIFY unidentified sequences based on the predicted similarity cutoffs.	Genobioinfo Cluster: How to use
DRAM	DRAM (Distilled and Refined Annotation of Metabolism) is a tool for annotating metagenomic assembled genomes and VirSorter identified viral contigs.	Genobioinfo Cluster: Ask for Install
EggLib	EggLib is a C++/Python library and program package for evolutionary genetics and genomics.	Genobioinfo Cluster: Ask for Install
eMPRess	eMPRess is a software tool for reconciling pairs of phylogenetic trees such as host-parasite, host-symbiont, and species-gene trees under the Duplication-Transfer-Loss (DTL) model.	Genobioinfo Cluster: How to use
Emu	Emu is a relative abundance estimator for 16S genomic sequences. The method is optimized for error-prone full-length reads, but can also be utilized for short-read data.	Genobioinfo Cluster: How to use
EPIK	EPIK is a program for rapid alignment-free phylogenetic placement.	Genobioinfo Cluster: How to use
ETE	A Python framework for the analysis and visualization of trees.	Genobioinfo Cluster: in Python-3.11.1 and How to use
EukCC	EukCC is a completeness and contamination estimator for metagenomic assembled microbial eukaryotic genomes.	Genobioinfo Cluster: Ask for Install
EukRep	Classification of Eukaryotic and Prokaryotic sequences from metagenomic datasets.	Genobioinfo Cluster: How to use
Exabayes	ExaBayes is a software package for Bayesian tree inference. It is particularly suitable for large-scale analyses on computer clusters.	Genobioinfo Cluster: Ask for Install
ExaML	Exascale Maximum Likelihood (ExaML) code for phylogenetic inference using MPI.	Genobioinfo Cluster: How to use
FastME	FastME provides distance algorithms to infer phylogenies. FastME is based on balanced minimum evolution, which is the very principle of NJ. FastME improves over NJ by performing topological moves using fast, sophisticated algorithms.	Genobioinfo Cluster: How to use
FastTree	FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences.	Genobioinfo Cluster: How to use
FigTree	FigTree is designed as a graphical viewer of phylogenetic trees and as a program for producing publication-ready figures.	Genobioinfo Cluster: How to use
FROGS	FROGS is a CLI workflow designed to produce an OTU count matrix from high depth sequencing amplicon data.	Genobioinfo Cluster: How to use
G-PhoCS	G-PhoCS is a software package for inferring ancestral population sizes, population divergence times, and migration rates from individual genome sequences.	Genobioinfo Cluster: How to use
ganon	ganon classifies DNA sequences against large sets of genomic reference sequences efficiently.	Genobioinfo Cluster: Ask for Install
gappa	A toolkit for analyzing and visualizing phylogenetic (placement) data.	Genobioinfo Cluster: How to use
GARLI	GARLI (Genetic Algorithm for Rapid Likelihood Inference) is a program for inferring phylogenetic trees.	Genobioinfo Cluster: Ask for Install
Gblocks	Gblocks is a computer program written in ANSI C language that eliminates poorly aligned positions and divergent regions of an alignment of DNA or protein sequences. These positions may not be homologous or may have been saturated by multiple substitutions and it is convenient to eliminate them prior to phylogenetic analysis.	Genobioinfo Cluster: How to use
Gotree	Gotree is a set of command line tools and an API to manipulate phylogenetic trees.	Genobioinfo Cluster: How to use
GrapeTree	GrapeTree is a fully interactive, tree visualization program within EnteroBase, which supports facile manipulations of both tree layout and metadata.	Genobioinfo Cluster: How to use
GraPhlAn	GraPhlAn is a software tool for producing high-quality circular representations of taxonomic and phylogenetic trees. It focuses on concise, integrative, informative, and publication-ready representations of phylogenetically- and taxonomically-driven investigation.	Genobioinfo Cluster: Ask for Install
GTDB-Tk	GTDB-Tk is a software toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB).	Genobioinfo Cluster: How to use
Gubbins	Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences. Gubbins (Genealogies Unbiased By recomBinations In Nucleotide Sequences) is an algorithm that iteratively identifies loci containing elevated densities of base substitutions while concurrently constructing a phylogeny based on the putative point mutations outside of these regions.	Genobioinfo Cluster: How to use
hapflk	hapflk is a software implementing the hapFLK <1> and FLK <2> tests for the detection of selection signatures based on multiple population genotyping data.	Genobioinfo Cluster: How to use
HAPHPIPE	NGS viral assembly and population genetics.	Genobioinfo Cluster: How to use
hifiasm-meta	De novo metagenome assembler, based on hifiasm, a haplotype-resolved de novo assembler for PacBio HiFi reads.	Genobioinfo Cluster: How to use
HyPhy	HyPhy is an open-source software package for the analysis of genetic sequences using techniques in phylogenetics, molecular evolution, and machine learning. It features a complete graphical user interface (GUI) and a rich scripting language for limitless customization of analyses.	Genobioinfo Cluster: How to use
InSilicoSeq	InSilicoSeq is a sequencing simulator producing realistic Illumina reads. Primarily intended for simulating metagenomic samples, it can also be used to produce sequencing data from a single genome.	Genobioinfo Cluster: How to use
IPK	IPK is a tool for computing phylo-k-mers for a fixed phylogeny.	Genobioinfo Cluster: How to use
IQ-TREE	Efficient phylogenomic software by maximum likelihood	Genobioinfo Cluster: How to use
ITSx	Improved software detection and extraction of ITS1 and ITS2 from ribosomal ITS sequences of fungi and other eukaryotes for use in environmental sequencing	Genobioinfo Cluster: How to use
iVar	iVar is a computational package that contains functions broadly useful for viral amplicon-based sequencing. Additional tools for metagenomic sequencing are actively being incorporated into iVar. While each of these functions can be accomplished using existing tools, iVar contains an intersection of functionality from multiple tools that are required to call iSNVs and consensus sequences from viral sequencing data across multiple replicates. We implemented the following functions in iVar: (1) trimming of primers and low-quality bases, (2) consensus calling, (3) variant calling - both iSNVs and insertions/deletions, and (4) identifying mismatches to primer sequences and excluding the corresponding reads from alignment files.	Genobioinfo Cluster: How to use
jModeltest	jModelTest is a tool to carry out statistical selection of best-fit models of nucleotide substitution.	Genobioinfo Cluster: Ask for Install
jpHMM	jpHMM (jumping profile Hidden Markov Model) is a probabilistic approach to compare a sequence to a multiple alignment of a sequence family. The jpHMM web server at GOBICS is a tool for the detection of recombinations in HIV-1 and hepatitis B virus (HBV) genomes. For a query sequence phylogenetic recombination breakpoints are predicted and each region of the sequence is assigned to one HIV-1 subtype/HBV genotype. This prediction is based on a pre-calculated multiple alignment of the major HIV-1 subtypes/HBV genotypes. A detailed description of the algorithm and some information about the evaluation can be found here. For information about the output format please see the online submission page.	Genobioinfo Cluster: How to use
JustOrthologs	A Fast, Accurate, and User-Friendly Ortholog-Finding Algorithm	Genobioinfo Cluster: Ask for Install
Kaiju	Fast taxonomic classification of metagenomic sequencing reads using a protein reference database	Genobioinfo Cluster: How to use
Kraken	Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.	Genobioinfo Cluster: Ask for Install
Kraken2	Kraken2 is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.	Genobioinfo Cluster: How to use
KrakenTools	KrakenTools provides individual scripts to analyze Kraken/Kraken2/Bracken/KrakenUniq output files.	Genobioinfo Cluster: How to use
KrakenUniq	KrakenUniq (formerly KrakenHLL) is a novel metagenomics classifier that combines the fast k-mer-based classification of Kraken with an efficient algorithm for assessing the coverage of unique k-mers found in each species in a dataset.	Genobioinfo Cluster: How to use
kSNP4	kSNP4 identifies the pan-genome SNPs in a set of genome sequences, and estimates phylogenetic trees based upon those SNPs.	Genobioinfo Cluster: How to use
LEfSe	LEfSe (Linear discriminant analysis effect size) is a tool developed by the Huttenhower group to find biomarkers between 2 or more groups using relative abundances.	Genobioinfo Cluster: How to use
LSx	LS^X is a script in R that runs the LS³ and LS⁴ algorithms of data subsampling for multigene phylogenetic inference. Both of these algorithms do a gene-by-gene inspection of the heterogeneity of evolutionary rates among user-defined lineages of interest (LOI). Then, using criteria that differ in both algorithms (see details here or in the papers), they try to find a subsample of sequences that evolve at a homogeneous rate across all LOIs. If this subset is found, an alignment of the gene is produced with only the sequences that evolve homogeneously. At the same time, a table is also produced showing which sequences were “flagged” (the sequences that were removed), and which sequences were kept. If a subset of sequences that evolve at a homogeneous rate is not found, the gene is flagged entirely.	Genobioinfo Cluster: How to use
MALT	MALT (MEGAN alignment tool) is an extension of MEGAN (metagenome analyzer). MALT performs alignment of metagenomic reads against a database of reference sequences (such as NR, GenBank or Silva) and produces a MEGAN RMA file as output. The software is currently under development.	Genobioinfo Cluster: How to use
MARVEL_bins	MARVEL (Metagenomic Analysis and Retrieval of Viral Elements) is a tool for recovery of draft phage genomes from whole community shotgun metagenomic sequencing data.	Genobioinfo Cluster: Ask for Install
mashtree	Create a tree using Mash distances.	Genobioinfo Cluster: How to use
MaxBin2	MaxBin is a software for binning assembled metagenomic sequences based on an Expectation-Maximization algorithm.	Genobioinfo Cluster: Ask for Install
MEGAHIT	An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph	Genobioinfo Cluster: How to use
MEGAN	MEtaGenome ANalyzer: metagenomic data analysis, including taxonomic and functional (SEED and KEGG classification) analysis.	Genobioinfo Cluster: How to use
MetaBat	An adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies.	Genobioinfo Cluster: How to use
metabinkit	From metagenomic or metabarcoding data, it is often necessary to assign taxonomy to DNA sequences. This is generally performed by aligning sequences to a reference database, usually resulting in multiple database alignments for each query sequence. Using these alignment results, metabinkit assigns a single taxon to each query sequence, based on user-defined percentage identity thresholds. In essence, for each query, the alignments are filtered based on the percentage identity thresholds and the lowest common ancestor for all alignments passing the filters is determined. The metabin program is not limited to BLAST alignments, and can accept alignment results produced using any program, provided the input format is correct. However, functionality is also available to create BLAST databases and to perform BLAST alignments, which can be passed directly to metabin.	Genobioinfo Cluster: How to use
metaBIT	An integrative and automated metagenomic pipeline for analysing microbial profiles from high-throughput sequencing shotgun data.	Genobioinfo Cluster: How to use
metaDMG	A fast and accurate ancient DNA damage toolkit for metagenomic data.	Genobioinfo Cluster: How to use
MetaEuk	MetaEuk is a modular toolkit designed for large-scale gene discovery and annotation in eukaryotic metagenomic contigs.	Genobioinfo Cluster: How to use
METAL	The METAL software is designed to facilitate meta-analysis of large datasets (such as several whole genome scans) in a convenient, rapid and memory efficient manner.	Genobioinfo Cluster: Ask for Install
MetaMDBG	A lightweight assembler for long and accurate metagenomics reads.	Genobioinfo Cluster: How to use
MetaPhlAn	MetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.	Genobioinfo Cluster: How to use
MetaPhlAn2	MetaPhlAn2 is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea, Eukaryotes and Viruses) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level resolution. With the newly added StrainPhlAn module, it is now possible to perform accurate strain-level microbial profiling.	Genobioinfo Cluster: How to use
MetaPhlAn3	MetaPhlAn is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea and Eukaryotes) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level resolution. With the newly added StrainPhlAn module, it is now possible to perform accurate strain-level microbial profiling.	Genobioinfo Cluster: Ask for Install
MetaPhlAn4	MetaPhlAn is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea and Eukaryotes) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level resolution. With StrainPhlAn, it is possible to perform accurate strain-level microbial profiling.	Genobioinfo Cluster: How to use
MetaWRAP	A flexible pipeline for genome-resolved metagenomic data analysis.	Genobioinfo Cluster: How to use
Metaxa2	Improved Identification and Taxonomic Classification of Small and Large Subunit rRNA in Metagenomic Data.	Genobioinfo Cluster: How to use
meteor	Meteor (Metagenomic Explorator), a software for profiling metagenomic data at gene level.	Genobioinfo Cluster: How to use
Migrate	Migrate estimates effective population sizes and past migration rates between n populations, assuming a migration matrix model with asymmetric migration rates and different subpopulation sizes, as well as population divergences or admixture.	Genobioinfo Cluster: How to use
MiMiC2	MiMiC2 is a bioinformatic pipeline for the selection of a few microbial genomes that functionally represent an entire ecosystem, termed a synthetic community (SynCom).	Genobioinfo Cluster: How to use
ModelTest-NG	ModelTest-NG is a tool for selecting the best-fit model of evolution for DNA and protein alignments. ModelTest-NG supersedes jModelTest and ProtTest in one single tool, with graphical and command console interfaces.	Genobioinfo Cluster: How to use
mothur	The one-stop source for your computational microbial ecology needs. mothur offers the ability to go from raw sequences to the generation of visualization tools to describe alpha and beta diversity.	Genobioinfo Cluster: How to use
mPTP	A tool for single-locus species delimitation.	Genobioinfo Cluster: How to use
MrBayes	MrBayes is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. MrBayes uses Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of model parameters.	Genobioinfo Cluster: How to use
msamtools	msamtools provides useful functions that are commonly used in microbiome data analysis, especially when analyzing shotgun metagenomics or metatranscriptomics data.	Genobioinfo Cluster: How to use
MVP	MVP stands for Multi-choice Viromics Pipeline. It is a simplified pipeline that utilizes a suite of state-of-the-art tools to easily get from a set of contigs to a vOTU heatmap (and more).	Genobioinfo Cluster: How to use
MyCC	Accurate binning of metagenomic contigs via automated clustering sequences using information of genomic signatures and marker genes.	Genobioinfo Cluster: Ask for Install
myloasm	Myloasm is a de novo metagenome assembler for long-read sequencing data. It takes sequencing reads and outputs polished contigs in a single command.	Genobioinfo Cluster: How to use
NanoASV	Nanopore full-length 16S metabarcoding amplicon data analysis	Genobioinfo Cluster: How to use
NanoCLUST	NanoCLUST is an analysis pipeline for UMAP-based classification of amplicon-based full-length 16S rRNA nanopore reads.	Genobioinfo Cluster: How to use
Newick_Utilities	The Newick Utilities are a suite of Unix shell tools for processing phylogenetic trees. We distribute the package under the BSD License. Functions include re-rooting, extracting subtrees, trimming, pruning, condensing, drawing (ASCII graphics or SVG).	Genobioinfo Cluster: How to use
NINJA	Nearly Infinite Neighbor Joining Application	Genobioinfo Cluster: How to use
orthAgogue	a tool for high speed estimation of homology relations within and between species in massive data sets. orthAgogue is easy to use and offers flexibility through a range of optional parameters.	Genobioinfo Cluster: Ask for Install
OrthoFinder	OrthoFinder is a fast, accurate and comprehensive analysis tool for comparative genomics. It finds orthologues and orthogroups, infers rooted gene trees for all orthogroups, and infers a rooted species tree for the species being analysed. OrthoFinder also provides comprehensive statistics for comparative genomic analyses.	Genobioinfo Cluster: How to use
PALEOMIX	The PALEOMIX pipeline is a set of free and open-source pipelines and tools designed to enable the rapid processing of Next Generation Sequencing (NGS) data, starting from de-multiplexed reads from one or more samples, through sequence processing and alignment, and ending with genotyping, phylogenetic inference on the samples, as well as metagenomic analysis of the samples.	Genobioinfo Cluster: How to use
PAML	PAML is a package of programs for phylogenetic analyses of DNA or protein sequences using maximum likelihood.	Genobioinfo Cluster: How to use
PanTools	PanTools is a toolkit for comparative analysis of large numbers of genomes.	Genobioinfo Cluster: How to use
ParGenes	A massively parallel tool for model selection and tree inference on thousands of genes.	Genobioinfo Cluster: Ask for Install
PartitionFinder	PartitionFinder is free open source software to select best-fit partitioning schemes and models of molecular evolution for phylogenetic analyses.	Genobioinfo Cluster: How to use
PathoFact	PathoFact is an easy-to-use modular pipeline for the metagenomic analyses of toxins, virulence factors and antimicrobial resistance.	Genobioinfo Cluster: Ask for Install
pathPhynder	A workflow for integrating ancient lineages into present-day phylogenies.	Genobioinfo Cluster: How to use
PAUP	Tools for inferring and interpreting phylogenetic trees	Genobioinfo Cluster: How to use
Pelican	Pelican is a reimplementation of the model of Tamuri et al. (2009) to identify sites undergoing different kinds of directional selection in different parts of a phylogenetic tree.	Genobioinfo Cluster: How to use
PHASE	PHASE is a package that performs molecular phylogenetic inference. The software seeks to accurately compare molecular sequences to determine the likely evolutionary relationships between a group of species.	Genobioinfo Cluster: How to use
PHAST	Phylogenetic Analysis with Space/Time models (PHAST) is a freely available software package consisting of a collection of command-line programs and supporting libraries for comparative and evolutionary genomics.	Genobioinfo Cluster: How to use
PhiPack	The Phi Test is a simple, rapid, and statistically efficient test for recombination.	Genobioinfo Cluster: Ask for Install
PhyKIT	PhyKIT is a UNIX shell toolkit for processing and analyzing phylogenomic data.	Genobioinfo Cluster: How to use
PHYLIP	PHYLIP (PHYLogeny Inference Package) is a package composed of 34 programs dedicated to phylogeny inference. Methods that are available in the package include parsimony, distance matrix, and likelihood methods, including bootstrapping and consensus trees. Data types that can be handled include molecular sequences, gene frequencies, restriction sites and fragments, distance matrices, and discrete characters.	Genobioinfo Cluster: How to use
PhyloBayes	PhyloBayes is a Bayesian Monte Carlo Markov Chain (MCMC) sampler for phylogenetic reconstruction and molecular dating using protein and nucleic acid alignments.	Genobioinfo Cluster: How to use
Phylobayes_MPI	PhyloBayes (Lartillot et al, 2009) is a Bayesian Monte Carlo Markov Chain (MCMC) sampler for phylogenetic reconstruction. With MPI.	Genobioinfo Cluster: How to use
phyloFlash	phyloFlash is a pipeline to rapidly reconstruct the SSU rRNAs and explore the phylogenetic composition of an Illumina (meta)genomic dataset.	Genobioinfo Cluster: How to use
PhyloNet		Genobioinfo Cluster: How to use
PhyloPhlAn	PhyloPhlAn is a computational pipeline for reconstructing highly accurate and resolved phylogenetic trees based on whole-genome sequence information. The pipeline is scalable to thousands of genomes and uses the most conserved 400 proteins for extracting the phylogenetic signal. PhyloPhlAn also implements taxonomic curation, estimation, and insertion operations.	Genobioinfo Cluster: Ask for Install
phyluce	phyluce (phy-loo-chee) is a software package that was initially developed for analyzing data collected from ultraconserved elements in organismal genomes.	Genobioinfo Cluster: How to use
PhyML	PhyML is a phylogeny software based on the maximum-likelihood principle.	Genobioinfo Cluster: How to use
phyx	phyx performs phylogenetics analyses on trees and sequences.	Genobioinfo Cluster: Ask for Install
PICRUSt	PICRUSt (pronounced "pie crust") is a bioinformatics software package designed to predict metagenome functional content from marker gene (e.g., 16S rRNA) surveys and full genomes.	Genobioinfo Cluster: How to use
Plasmer	An accurate and sensitive bacterial plasmid identification tool based on deep machine-learning of shared k-mers and genomic features.	Genobioinfo Cluster: How to use
Plass	Plass (Protein-Level ASSembler) is a software to assemble short read sequencing data on a protein level.	Genobioinfo Cluster: Ask for Install
ProphET	ProphET, Prophage Estimation Tool: a standalone prophage sequence prediction tool with self-updating reference database.	Genobioinfo Cluster: Ask for Install
Proteinortho	Proteinortho is a tool to detect orthologous genes within different species.	Genobioinfo Cluster: How to use
pyMLST	Python Mlst Local Search Tool.	Genobioinfo Cluster: How to use
QIIME	QIIME (pronounced "chime") stands for Quantitative Insights Into Microbial Ecology. QIIME is an open source software package for comparison and analysis of microbial communities, primarily based on high-throughput amplicon sequencing data (such as SSU rRNA) generated on a variety of platforms, but also supporting analysis of other types of data (such as shotgun metagenomic data). QIIME takes users from their raw sequencing output through initial analyses such as OTU picking, taxonomic assignment, and construction of phylogenetic trees from representative sequences of OTUs, and through downstream statistical analysis, visualization, and production of publication-quality graphics. QIIME has been applied to single studies based on billions of sequences from thousands of samples.	Genobioinfo Cluster: How to use
r8s	This package implements several methods to infer divergence times on a molecular phylogeny, using penalized likelihood, maximum likelihood and nonparametric rate smoothing methods. It also implements miscellaneous tree and character evolution models and tests.	Genobioinfo Cluster: Ask for Install
RAiSD	RAiSD (Raised Accuracy in Sweep Detection) is a stand-alone software implementation of the μ statistic for selective sweep detection.	Genobioinfo Cluster: How to use
RAxML	RAxML (Randomized Axelerated Maximum Likelihood) is a program for sequential and parallel Maximum Likelihood based inference of large phylogenetic trees. It can also be used for post-analyses of sets of phylogenetic trees, analyses of alignments, and evolutionary placement of short reads.	Genobioinfo Cluster: How to use
RAxML-NG	RAxML-NG is a phylogenetic tree inference tool which uses maximum-likelihood (ML) optimality criterion.	Genobioinfo Cluster: How to use
raxtax	raxtax is a fast and efficient k-mer-based non-Bayesian taxonomic classifier for barcoding DNA sequences.	Genobioinfo Cluster: How to use
Ray	Assemble genomes in parallel using the message-passing interface	Genobioinfo Cluster: Ask for Install
RDP Classifier	The RDP Classifier is a naive Bayesian classifier that can rapidly and accurately provide taxonomic assignments from domain to genus, with confidence estimates for each assignment.	Genobioinfo Cluster: Ask for Install
RDPTools	Collection of commonly used RDP Tools for easy building	Genobioinfo Cluster: How to use
RdRpCATCH	A community effort to create a shared resource for HMM-based RdRp discovery	Genobioinfo Cluster: How to use
read2tree	read2tree is a software tool for obtaining alignment matrices for tree inference.	Genobioinfo Cluster: How to use
Recentrifuge	Robust comparative analysis and contamination removal for metagenomics	Genobioinfo Cluster: How to use
REFMAKER	REFMAKER is a command-line and user-friendly pipeline providing different tools to create nuclear references from genomic assemblies of shotgun libraries.	Genobioinfo Cluster: How to use
REINDEER	Efficient indexing of k-mer presence and abundance in sequencing datasets.	Genobioinfo Cluster: How to use
relate	Software to estimate genome-wide genealogies for thousands of samples	Genobioinfo Cluster: How to use
ResFinder	ResFinder identifies acquired antimicrobial resistance genes in total or partial sequenced isolates of bacteria.	Genobioinfo Cluster: How to use
RevBayes	RevBayes provides an interactive environment for statistical computation in phylogenetics. It is primarily intended for modeling, simulation, and Bayesian inference in evolutionary biology, particularly phylogenetics. However, the environment is quite general and can be useful for many complex modeling tasks.	Genobioinfo Cluster: Ask for Install
RFMIX	A discriminative method for local ancestry inference	Genobioinfo Cluster: How to use
RogueNaRok	A versatile and scalable algorithm for rogue taxon identification.	Genobioinfo Cluster: Ask for Install
RootDigger	RootDigger is a program that, when given an MSA and an unrooted tree with branch lengths, places a root on the given tree. For the foreseeable future, RootDigger will only support DNA data, as the method RootDigger uses is ineffective when using AA data.	Genobioinfo Cluster: How to use
Scoary	Scoary is designed to take the gene_presence_absence.csv file from Roary as well as a traits file created by the user and calculate the associations between all genes in the accessory genome and the traits.	Genobioinfo Cluster: How to use
Seq-Gen	Seq-Gen is a program that will simulate the evolution of nucleotide or amino acid sequences along a phylogeny, using common models of the substitution process.	Genobioinfo Cluster: Ask for Install
SGSGeneLoss	Gene presence/absence variation discovery.	Genobioinfo Cluster: Ask for Install
SHERPAS	A new, alignment-free genome recombination detection tool exploiting the idea of phylo-kmers (originally developed in RAPPAS, Linard et al. 2019) to accelerate the process by several orders of magnitude while keeping comparable accuracy.	Genobioinfo Cluster: How to use
Simka	Simka is a de novo comparative metagenomics tool. Simka represents each dataset as a k-mer spectrum and computes several classical ecological distances between them.	Genobioinfo Cluster: How to use
singlem	SingleM is a tool to find the abundances of discrete operational taxonomic units (OTUs) directly from shotgun metagenome data, without heavy reliance on reference sequence databases. It is able to differentiate closely related species even if those species are from lineages new to science.	Genobioinfo Cluster: Ask for Install
SLR	SLR is a program to detect sites in coding DNA that are unusually conserved and/or unusually variable (that is, evolving under purifying or positive selection) by analysing the pattern of changes for an alignment of sequences on an evolutionary tree.	Genobioinfo Cluster: How to use
SNPhylo	a pipeline to generate a phylogenetic tree from huge SNP data	Genobioinfo Cluster: How to use
SortaDate	Scripts that you can use at different stages to attempt to find more clock-like genes. Generally, you would use these for dating analyses with another package	Genobioinfo Cluster: Ask for Install
SPECTRE	A collection of Phylogenetics tools for creating and manipulating networks and trees.	Genobioinfo Cluster: Ask for Install
SqueezeMeta	A fully automated metagenomics pipeline, from reads to bins.	Genobioinfo Cluster: How to use
SSU-ALIGN		Genobioinfo Cluster: Ask for Install
Sumatra	Sumatra was developed by the LECA and aims to compute a great deal of sequence similarities in a fast and exact way, based on the length of the Longest Common Subsequence (LCS) between two sequences. Sequence clustering based on similarities is also available through Sumaclust.	Genobioinfo Cluster: How to use
SUPER-FOCUS	A tool for agile functional analysis of metagenomic data.	Genobioinfo Cluster: Ask for Install
SuperCRUNCH	A bioinformatics package for creating, filtering, and manipulating supermatrices and phylogenetic datasets using GenBank and/or local sequence data.	Genobioinfo Cluster: Ask for Install
swarm	A robust and fast clustering method for amplicon-based studies.	Genobioinfo Cluster: How to use
SweeD	A parallel and checkpointable tool that implements a composite likelihood ratio test for detecting selective sweeps. SweeD is based on the SweepFinder algorithm (Nielsen et al. 2005). SweeD can calculate the theoretical SFS of a given demographic model (stepwise changes or with an exponential growth phase + stepwise changes) by using the method by Živković and Stephan (2011).	Genobioinfo Cluster: How to use
sylph	sylph is a program that performs ultrafast (1) ANI querying or (2) metagenomic profiling for metagenomic shotgun samples.	Genobioinfo Cluster: How to use
TACT	Adds tips to a backbone phylogeny using taxonomy simulated with birth-death models	Genobioinfo Cluster: Ask for Install
TaxonKit	A Practical and Efficient NCBI Taxonomy Toolkit	Genobioinfo Cluster: How to use
TKGWV2	TKGWV2 is a pipeline to estimate biological relatedness (1st, 2nd, and unrelated degrees) between individuals specifically aimed at ultra-low coverage ancient DNA data obtained from whole genome sequencing.	Genobioinfo Cluster: How to use
TOGA	TOGA is a new method that integrates gene annotation, inferring orthologs and classifying genes as intact or lost.	Genobioinfo Cluster: How to use
TOPALI-v2	A rich graphical interface for evolutionary analyses of multiple alignments on HPC clusters and multi-core desktops.	Genobioinfo Cluster: Ask for Install
TreeBeST	TreeBeST, which stands for (gene) Tree Building guided by Species Tree, is a versatile program that builds, manipulates and displays phylogenetic trees. It is particularly designed for building gene trees with a known species tree and is highly efficient and accurate.	Genobioinfo Cluster: Ask for Install
treePL	treePL is a phylogenetic penalized likelihood program.	Genobioinfo Cluster: Ask for Install
TreeShrink	TreeShrink is an algorithm for detecting abnormally long branches in one or more phylogenetic trees.	Genobioinfo Cluster: How to use
TreeTime	Maximum likelihood inference of time stamped phylogenies and ancestral reconstruction.	Genobioinfo Cluster: Ask for Install
trimAl	trimAl: a tool for automated alignment trimming.	Genobioinfo Cluster: How to use
Twisst	Topology weighting by iterative sampling of sub-trees.	Genobioinfo Cluster: How to use
vAMPirus	Automated virus amplicon sequencing analysis program integrated with Nextflow pipeline manager.	Genobioinfo Cluster: How to use
ViPER	Bioinformatics pipeline used in the Laboratory of Viral Metagenomics (KU Leuven) to trim and assemble paired-end Illumina reads, and classify resulting contigs.	Genobioinfo Cluster: How to use
VIRify	VIRify is a pipeline for the detection, annotation, and taxonomic classification of viral contigs in metagenomic and metatranscriptomic assemblies.	Genobioinfo Cluster: How to use
VITAP	The viral taxonomic assignment pipeline	Genobioinfo Cluster:
Voyager	Rapid and efficient mapping algorithm for long sequencing reads with insertion- and deletion errors. Mapping long reads in Sorted Motif Distance Space.	Genobioinfo Cluster: How to use
WGDI	WGDI (Whole-Genome Duplication Integrated analysis), a Python-based command-line tool that facilitates comprehensive analysis of recursive polyploidizations and cross-species genome alignments.	Genobioinfo Cluster: How to use
whokaryote	Classification of metagenomic contigs as eukaryotic/prokaryotic using biology-based features.	Genobioinfo Cluster: How to use
wLogDate	Molecular Dating using logarithmic penalty function. wLogDate is a method for dating phylogenetic trees. Given a phylogeny and either sampling times for leaves or calibration points for internal nodes, wLogDate outputs a "dated" tree that conforms to the sampling times or calibration points. It can also work with no sampling time or calibration points where it would simply turn the tree into ultrametric, fixing its height to a given value. Its optimization criterion is to minimize the variance of the mutation rates in log scale (hence the term logDate).	Genobioinfo Cluster: Ask for Install

Population genetics

Application	Description	Availability/Use
ABCtoolbox	ABCtoolbox is a general-purpose program to perform Approximate Bayesian Computation. ABCtoolbox can be used for ABC inference on almost any type of model, including models arising in physics, biology or engineering.	Genobioinfo Cluster: Ask for Install
ADMIXTOOLS	ADMIXTOOLS (Patterson et al. 2012) is a software package that supports formal tests of whether admixture occurred, and makes it possible to infer admixture proportions and dates.	Genobioinfo Cluster: How to use
Admixture	ADMIXTURE is a software tool for maximum likelihood estimation of individual ancestries from multilocus SNP genotype datasets. It uses the same statistical model as STRUCTURE but calculates estimates much more rapidly using a fast numerical optimization algorithm.	Genobioinfo Cluster: How to use
ALDER	The ALDER software computes the weighted linkage disequilibrium (LD) statistic for making inference about population admixture	Genobioinfo Cluster: Ask for Install
AlphaImpute	AlphaImpute is a software package for imputing and phasing genotype data in diploid populations with pedigree information.	Genobioinfo Cluster: Ask for Install
Ancestry HMM	A hidden Markov model approach for simultaneously estimating local ancestry and admixture time using next generation sequencing data.	Genobioinfo Cluster: How to use
ancIBD	Identify IBD segments between pairs of individuals in ancient human DNA data. The software package `ancIBD` detects Identity-by-Descent (IBD) segments in typical human aDNA data, implementing an algorithm described in a preprint. The input data are imputed and phased genotype data. The default parameters of `ancIBD` are optimized for data imputed with GLIMPSE using the 1000 Genomes haplotype reference panel.	Genobioinfo Cluster: How to use
ANGSD	ANGSD is a software for analyzing next generation sequencing data. The software can handle a number of different input types from mapped reads to imputed genotype probabilities.	Genobioinfo Cluster: How to use
ARGweaver	The ARGweaver/ARGweaver-D software package contains programs and libraries for sampling and manipulating ancestral recombination graphs (ARGs).	Genobioinfo Cluster: How to use
ASMC	Ascertained Sequentially Markovian Coalescent (contains ASMC and an extension, FastSMC, together with python bindings for both)	Genobioinfo Cluster: How to use
BAMM	A program for multimodel inference on speciation and trait evolution.	Genobioinfo Cluster: Ask for Install
BAMscorer	BAMscorer can be used to conduct genomic assignment tests from BAM files. Assignments can be done on genomic regions, inversions, and whole-genome datasets.	Genobioinfo Cluster: Ask for Install
BayeScan	Detecting natural selection from population-based genetic data using differences in allele frequencies between populations.	Genobioinfo Cluster: How to use
BayeScEnv	BayeScEnv is an Fst-based, genome-scan method that uses environmental variables to detect local adaptation.	Genobioinfo Cluster: Ask for Install
BayPass	The package BayPass is a population genomics software which is primarily aimed at identifying genetic markers subjected to selection and/or associated to population-specific covariates (e.g., environmental variables, quantitative or categorical phenotypic characteristics).	Genobioinfo Cluster: How to use
Beagle	BEAGLE is a state of the art software package for analysis of large-scale genetic data sets with hundreds of thousands of markers genotyped on thousands of samples.	Genobioinfo Cluster: How to use
bonsaitree	Algorithm for automatically building pedigrees using IBD, Age, and Sex information.	Genobioinfo Cluster: How to use
chewBBACA	chewBBACA is a software suite for the creation and evaluation of core genome and whole genome MultiLocus Sequence Typing (cg/wgMLST) schemas and results. The "BBACA" stands for "BSR-Based Allele Calling Algorithm". BSR stands for BLAST Score Ratio as proposed by Rasko DA et al. The "chew" part adds extra coolness to the name and could be thought of as "Comprehensive and Highly Efficient Workflow".	Genobioinfo Cluster: How to use
Circuitscape	Circuitscape borrows algorithms from electronic circuit theory to predict patterns of movement, gene flow, and genetic differentiation among plant and animal populations in heterogeneous landscapes.	Genobioinfo Cluster: How to use
CLUMPAK	CLUMPAK (Clustering Markov Packager Across K) was developed to help users analyse the results of STRUCTURE-like programs. The software offers a few alternative modes of action.	Genobioinfo Cluster: How to use
Clumppling	CLUster Matching and Permutation Program that uses integer Linear programmING: a framework for aligning mixed-membership clustering results of population structure analysis.	Genobioinfo Cluster: How to use
Comp-D	A program for comprehensive computation of D-statistics and population summaries (serial version).	Genobioinfo Cluster: Ask for Install
CRISPR-HAWK	CRISPR-HAWK is a comprehensive and scalable tool for designing guide RNAs (gRNAs) and assessing the impact of genetic variants on on-target sites in CRISPR-Cas systems. This makes CRISPR-HAWK particularly suitable for both personalized and population-wide gRNA design. CRISPR-HAWK automates the entire workflow, from variant-aware preprocessing to gRNA discovery, delivering comprehensive outputs including ranked tables, annotated sequences, and high-quality figures. Its modular design ensures easy integration with existing pipelines and tools, such as CRISPRme or CRISPRitz, for subsequent off-target prediction and analysis of prioritized gRNAs.	Genobioinfo Cluster: How to use
CRISPRme	CRISPRme is a comprehensive tool designed for thorough off-target assessment in CRISPR-Cas systems. CRISPRme accounts for single-nucleotide variants (SNVs) and indels, considers bona fide haplotypes, and allows for spacer:protospacer mismatches and bulges, making it well-suited for both population-wide and personal genome analyses. CRISPRme automates the entire workflow, from data download to executing the search, and delivers detailed reports complete with tables and figures through an interactive web-based interface.	Genobioinfo Cluster: How to use
currentNE	Estimation of current effective population size using artificial neural networks.	Genobioinfo Cluster: How to use
currentNe2	Estimation of current effective population size using artificial neural networks.	Genobioinfo Cluster: How to use
dadi	dadi implements a method for demographic inference from genetic data, based on a diffusion approximation to the allele frequency spectrum.	Genobioinfo Cluster: How to use
DFE-alpha	DFE-alpha was initially written to estimate the distribution of fitness effects (DFE) of new deleterious mutations using within-species nucleotide polymorphism data.	Genobioinfo Cluster: How to use
DILS	DILS is a statistical analysis platform for conducting demographic inferences with linked selection from population genomic data using an Approximate Bayesian Computation framework.	Genobioinfo Cluster: How to use
DIYABC	A user-friendly approach to Approximate Bayesian Computation for inference on population history using molecular markers.	Genobioinfo Cluster: Ask for Install
Dsuite	Fast calculation of Patterson's D (ABBA-BABA) and the f4-ratio statistics across many populations/species.	Genobioinfo Cluster: How to use
easySFS	easySFS is a tool for the effective selection of population size projection for construction of the site frequency spectrum. It may be used to convert VCF to dadi/fastsimcoal/momi2 style SFS for demographic analysis.	Genobioinfo Cluster: How to use
EEMS	EEMS method for analyzing and visualizing spatial population structure from geo-referenced genetic samples.	Genobioinfo Cluster: How to use
Eigensoft	The EIGENSOFT package combines functionality from our population genetics methods (Patterson et al. 2006) and our EIGENSTRAT stratification correction method (Price et al. 2006).	Genobioinfo Cluster: How to use
ELAI	The software performs local ancestry inference for admixed individuals.	Genobioinfo Cluster: How to use
EMMAX	EMMAX is a statistical test for large scale human or model organism association mapping accounting for the sample structure. In addition to the computational efficiency obtained by the EMMA algorithm, EMMAX takes advantage of the fact that each locus explains only a small fraction of complex traits, which allows us to avoid the repetitive variance component estimation procedure, resulting in a significant reduction in the computational time of association mapping using mixed models.	Genobioinfo Cluster: How to use
est-sfs	est-sfs implements a maximum likelihood method to infer the unfolded site frequency spectrum (the uSFS) and ancestral state probabilities for DNA sequence data.	Genobioinfo Cluster: Ask for Install
FaMoz	FaMoz, a software written in the C language and in Tcl/Tk, uses likelihood calculation and simulation to perform parentage studies with codominant, dominant, cytoplasmic markers or combinations of the different types.	Genobioinfo Cluster: Ask for Install
fastGLOBETROTTER	fastGLOBETROTTER is an efficient method to identify, date and describe admixture events using haplotype information. It is an updated version of the GLOBETROTTER model that uses the same input but is ~4-20 times faster than GLOBETROTTER without sacrificing accuracy.	Genobioinfo Cluster: How to use
fastNGSadmix	Program for inferring admixture proportions and doing PCA with a single NGS sample. Inferences are based on a reference panel.	Genobioinfo Cluster: Ask for Install
FastSimBac	FastSimBac is a simulator of the coalescent process with bacterial recombination that simulates genealogies spatially across chromosomes as a Markov process.	Genobioinfo Cluster: Ask for Install
fastsimcoal2	Fast sequential Markov coalescent simulation of genomic data under complex evolutionary models	Genobioinfo Cluster: How to use
fastStructure	fastStructure is an algorithm for inferring population structure from large SNP genotype data. It is based on a variational Bayesian framework for posterior inference and is written in Python 2.x.	Genobioinfo Cluster: How to use
FBAT	FBAT is an acronym for Family-Based Association Tests in genetic analyses. Family-based association designs, as opposed to case-control study designs, are particularly attractive, since they test for linkage as well as association, avoid spurious associations caused by admixture of populations, and are convenient for investigators interested in refining linkage findings in family samples.	Genobioinfo Cluster: Ask for Install
fineSTRUCTURE	fineSTRUCTURE is a fast and powerful algorithm for identifying population structure using dense sequencing data.	Genobioinfo Cluster: How to use
flare	The flare program uses a set of reference haplotypes to infer the ancestry of each allele in a set of admixed study samples. The flare program is fast, accurate, and memory-efficient.	Genobioinfo Cluster: How to use
G-PhoCS	G-PhoCS is a software package for inferring ancestral population sizes, population divergence times, and migration rates from individual genome sequences.	Genobioinfo Cluster: How to use
Gamma-SMC	This is an alternative and an upgrade of the widely used PSMC method, which infers population size trajectories from VCF files.	Genobioinfo Cluster: How to use
Genepop	Population genetics software that computes estimates of F-statistics.	Genobioinfo Cluster: Ask for Install
genomegaMap	Within-species genome-wide dN/dS estimation from very many genomes.	Genobioinfo Cluster: How to use
GenomeSTRiP	Genome STRiP (Genome STRucture In Populations) is a suite of tools for discovering and genotyping structural variations using sequencing data. The methods are designed to detect shared variation using data from multiple individuals.	Genobioinfo Cluster: Ask for Install
GEVA	Genealogical Estimation of Variant Age. We have developed a method for estimating the age of genetic variants; that is, the time of origin of an allele through mutation at a single locus. Our approach, which we refer to as the Genealogical Estimation of Variant Age (GEVA), is similar to existing methods that involve coalescent modeling to infer the time to the most recent common ancestor (TMRCA) between individual genomes [13, 23, 24]. However, these methods typically operate on a discretized timescale [13], utilize only a fraction of the information available in larger sample data [25], or employ approximations to overcome computational complexity [14, 15, 26].	Genobioinfo Cluster: Ask for Install
gIMble	A genome-wide IM blockwise likelihood estimation toolkit	Genobioinfo Cluster: Ask for Install
GONE2	Demographic history from the observed spectrum of linkage disequilibrium.	Genobioinfo Cluster: How to use
grenedalf	grenedalf is a collection of commands for working with pool sequencing population genetic data.	Genobioinfo Cluster: How to use
Gtools	GTOOL is a program for transforming sets of genotype data for use with the programs SNPTEST and IMPUTE.	Genobioinfo Cluster: Ask for Install
HapCUT2	Software tools for haplotype assembly from sequence data	Genobioinfo Cluster: How to use
hapflk	hapflk is a software implementing the hapFLK [1] and FLK [2] tests for the detection of selection signatures based on multiple population genotyping data.	Genobioinfo Cluster: How to use
Haplogrep3	Haplogrep is a command-line tool for mtDNA haplogroup classification.	Genobioinfo Cluster: How to use
Haplostrips	Haplostrips produce plots that depict variants in a genomic window among different samples. Visualize similarities between haplotypes with respect to a reference haplotype through haplotype clustering and sorting, useful for revealing hidden population structure.	Genobioinfo Cluster: How to use
IBDNe	The IBDNe program estimates ancestry-specific historical effective population size.	Genobioinfo Cluster: How to use
iSMC	This software extends the sequentially Markovian coalescent model to jointly infer the spatial variation in recombination rate (rho) from a single pair of unphased genomes.	Genobioinfo Cluster: Ask for Install
LDhat	LDhat is a package written in the C and C++ languages for the analysis of recombination rates from population genetic data.	Genobioinfo Cluster: Ask for Install
LDhelmet	LDhelmet performs statistical inference for fine-scale variable recombination rate estimation.	Genobioinfo Cluster: How to use
MALDER	This is a version of ALDER (http://groups.csail.mit.edu/cb/alder/) that has been modified to allow multiple admixture events.	Genobioinfo Cluster: How to use
MapThin	Reduce the number of SNPs in a gene marker dense map computed by PLINK. First, by eliminating linked SNPs. Then, by applying different criteria.	Genobioinfo Cluster: Ask for Install
Mauve	Mauve is a system for efficiently constructing multiple genome alignments in the presence of large-scale evolutionary events such as rearrangement and inversion. Multiple genome alignment provides a basis for research into comparative genomics and the study of evolutionary dynamics. Aligning whole genomes is a fundamentally different problem than aligning short sequences.	Genobioinfo Cluster: How to use
MetaCHIP	Horizontal gene transfer (HGT) identification pipeline among prokaryotes.	Genobioinfo Cluster: How to use
Migrate	Migrate estimates effective population sizes, past migration rates between n populations assuming a migration matrix model with asymmetric migration rates and different subpopulation sizes, and population divergences or admixture.	Genobioinfo Cluster: How to use
MLST	Multi-Locus Sequence Typing. The method enables investigators to determine the sequence type (ST) based on WGS data.	Genobioinfo Cluster: Ask for Install
ms	A program for generating samples under neutral models.	Genobioinfo Cluster: How to use
msBayes	msBayes allows complex and flexible comparative phylogeographic inference.	Genobioinfo Cluster: Ask for Install
msmc	This software implements MSMC, a method to infer population size and gene flow from multiple genome sequences	Genobioinfo Cluster: How to use
msmc2	This program implements MSMC2, a method to infer population size history and population separation history from whole genome sequencing data	Genobioinfo Cluster: How to use
msprime	`msprime` is a population genetics simulator based on tskit. Msprime can simulate random ancestral histories for a sample of individuals (consistent with a given demographic model) under a range of different models and evolutionary processes. Msprime can also simulate mutations on a given ancestral history (which can be produced by msprime or other programs supporting tskit) under a variety of genome sequence evolution models.	Genobioinfo Cluster: How to use
msums	A program for the efficient computation of a number of population genetics summary statistics. msums can read ms-format data on (nearly) arbitrary numbers of populations.	Genobioinfo Cluster: Ask for Install
Neural-ADMIXTURE	Neural ADMIXTURE is an unsupervised global ancestry inference technique based on ADMIXTURE. By using neural networks, Neural ADMIXTURE offers high quality ancestry assignments with a running time which is much faster than ADMIXTURE's.	Genobioinfo Cluster: How to use
ngsLD	ngsLD is a program to estimate pairwise linkage disequilibrium (LD) taking the uncertainty of genotype assignment into account.	Genobioinfo Cluster: How to use
ngsRelate	NgsRelate can be used to infer relatedness coefficients for pairs of individuals from low coverage Next Generation Sequencing (NGS) data by using genotype likelihoods instead of called genotypes.	Genobioinfo Cluster: How to use
ngsTools	ngsTools is a collection of programs for population genetics analyses from NGS data, taking into account its statistical uncertainty. The methods implemented in these programs do not rely on SNP or genotype calling, and are particularly suitable for low sequencing depth data.	Genobioinfo Cluster: How to use
omegaplus	A scalable tool for rapid detection of selective sweeps in whole-genome datasets.	Genobioinfo Cluster: How to use
PCAdmix	PCAdmix is a method that estimates local ancestry via principal components analysis (PCA) using phased haplotypes. The method considers data chromosome by chromosome.	Genobioinfo Cluster: Ask for Install
PGDSpider	PGDSpider is a powerful automated data conversion tool for population genetic and genomics programs. It facilitates the data exchange possibilities between programs for a vast range of data types (e.g. DNA, RNA, NGS, microsatellite, SNP, RFLP, AFLP, multi-allelic data, allele frequency or genetic distances)	Genobioinfo Cluster: How to use
Phy-Mer	A novel alignment-free and reference-independent mitochondrial haplogroup classifier.	Genobioinfo Cluster: Ask for Install
pixy	pixy is a command-line tool for painlessly and correctly estimating average nucleotide diversity within (π) and between (dxy) populations from a VCF.	Genobioinfo Cluster: How to use
PMERGE	PMERGE is software that implements a new method for identifying candidate PSVs by building networks of loci that share high levels of nucleotide similarity. PMERGE is embedded in the analysis pipeline of the widely used Stacks software, and it is straightforward to apply it as an additional filter in population-genomic studies using RAD-seq data.	Genobioinfo Cluster: Ask for Install
pong	pong is a freely available software package, released by Behr et al. (2016, Bioinformatics), for post-processing output from clustering inference using population genetic data.	Genobioinfo Cluster: Ask for Install
PopART	PopART (Population Analysis with Reticulate Trees) is free, open source population genetics software that was developed as part of the Allan Wilson Centre Imaging Evolution Initiative.	Genobioinfo Cluster: How to use
popins	Population-scale detection of novel-sequence insertions.	Genobioinfo Cluster: Ask for Install
PoPoolation	PoPoolation is a pipeline for analysing pooled next generation sequencing data. Currently PoPoolation can calculate Tajima’s Pi, Watterson’s Theta and Tajima’s D with a sliding window approach for chromosomes or for sets of genes.	Genobioinfo Cluster: How to use
PoPoolation2	PoPoolation2 compares allele frequencies for SNPs between two or more populations and identifies significant differences. PoPoolation2 requires next generation sequencing data of pooled genomic DNA (Pool-Seq). It may be used for measuring differentiation between populations, for genome wide association studies and for experimental evolution.	Genobioinfo Cluster: How to use
popPhylABC	Scripts used for ABC analysis with homo- and heterogeneity in Migration rates or/and Effective population sizes	Genobioinfo Cluster: Ask for Install
psmc	This software package infers population size history from a diploid sequence using the Pairwise Sequentially Markovian Coalescent (PSMC) model.	Genobioinfo Cluster: How to use
pyrho	Fast demography-aware inference of fine-scale recombination rates based on fused-LASSO.	Genobioinfo Cluster: How to use
qpWrapper	Tools for launching qpAdm analyses (ADMIXTOOLS) in series on a list of individuals.	Genobioinfo Cluster: Ask for Install
quickLD	High-performance Computation of Linkage Disequilibrium on CPUs and GPUs.	Genobioinfo Cluster: How to use
relate	Software to estimate genome-wide genealogies for thousands of samples	Genobioinfo Cluster: How to use
RFMIX	A discriminative method for local ancestry inference	Genobioinfo Cluster: How to use
RFMix-reader	rfmix-reader is a Python package designed to efficiently read and process output files generated by RFMix, a popular tool for estimating local ancestry in admixed populations. It employs a lazy loading approach to minimize memory usage, and leverages GPU acceleration for major speedups when available.	Genobioinfo Cluster: How to use
SCCmecFinder	SCCmecFinder identifies SCCmec elements in sequenced S. aureus isolates.	Genobioinfo Cluster: How to use
scrm	A coalescent simulator for genome-scale sequences.	Genobioinfo Cluster: Ask for Install
selscan	A program to calculate EHH-based scans for positive selection in genomes.	Genobioinfo Cluster: How to use
SHERPAS	A new, alignment-free genome recombination detection tool exploiting the idea of phylo-kmers (originally developed in RAPPAS, Linard et al. 2019) to accelerate the process by several orders of magnitude while keeping comparable accuracy.	Genobioinfo Cluster: How to use
simuPOP	simuPOP is a general-purpose individual-based forward-time population genetics simulation environment.	Genobioinfo Cluster: Ask for Install
SLiM	SLiM is an evolutionary simulation framework that combines a powerful engine for population genetic simulations with the capability of modeling arbitrarily complex evolutionary scenarios.	Genobioinfo Cluster: How to use
SMC++	SMC++ is a program for estimating the size history of populations from whole genome sequence data.	Genobioinfo Cluster: How to use
sNMF	A fast and efficient program for estimating individual admixture coefficients based on sparse non-negative matrix factorization and population genetics.	Genobioinfo Cluster: Ask for Install
spaTyper	Computational method for finding spa types. Staphylococcus aureus is a major human pathogen causing skin and tissue infections, pneumonia, septicemia, and device-associated infections. The emergence of strains resistant to methicillin (MRSA) and other antibacterial agents has become a major concern, especially in the hospital environment, because of the high mortality of the infections caused by these strains. Single locus DNA-sequencing of the repeat region of the Staphylococcus protein A gene (spa) can be used for reliable, accurate and discriminatory typing of MRSA. Repeats are assigned a numerical code and the spa-type is deduced from the order of specific repeats. However, spa-typing was hampered in the past by the lack of a consensus on assignments of new spa-repeats and -types.	Genobioinfo Cluster: Ask for Install
stairway_plot	The stairway plot is a method for inferring detailed population demographic history using the site frequency spectrum (SFS) from DNA sequence data. It does not need a pre-defined population model and can be applied to hundreds of unphased sequences.	Genobioinfo Cluster: How to use
Structure	The program structure is a free software package for using multi-locus genotype data to investigate population structure. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed. It can be applied to most of the commonly-used genetic markers, including SNPs, microsatellites, RFLPs and AFLPs.	Genobioinfo Cluster: How to use
T1K	T1K (The ONE genotyper for Kir and HLA) is a computational tool to infer the alleles for the polymorphic genes such as KIR and HLA. T1K calculates the allele abundances based on the RNA-seq/WES/WGS read alignments on the provided allele reference sequences. The abundances are used to pick the true alleles for each gene. T1K provides the post analysis steps, including novel SNP detection and single-cell representation. T1K supports both single-end and paired-end sequencing data with any read length.	Genobioinfo Cluster: How to use
TreeMix	TreeMix is a method for inferring the patterns of population splits and mixtures in the history of a set of populations. In the underlying model, the modern-day populations in a species are related to a common ancestor via a graph of ancestral populations. We use the allele frequencies in the modern populations to infer the structure of this graph.	Genobioinfo Cluster: How to use
unitig-caller	Methods to determine sequence element (unitig) presence/absence.	Genobioinfo Cluster: How to use
WASP	WASP is a suite of tools for unbiased allele-specific read mapping and discovery of molecular QTLs.	Genobioinfo Cluster: How to use
Yleaf	Software for human Y-chromosomal haplogroup inference from next generation sequencing data.	Genobioinfo Cluster: How to use

Primers

Application	Description	Availability/Use
CATCH	A package for designing compact and comprehensive capture probe sets.	Genobioinfo Cluster: How to use
CREPE	CREPE is a batch primer design and specificity analysis tool. It uses Primer3 (https://primer3.org/) to create primers from an input CSV of target sites. It then uses UCSC's In-Silico PCR (https://genome.ucsc.edu/cgi-bin/hgPcr) to identify off-target enrichment sites for each primer pair. Lastly, a custom Python evaluation script (E-script) performs specificity analysis to determine the quality of predicted off-target sites from ISPCR.	Genobioinfo Cluster: How to use
CRISPOR	CRISPOR predicts off-targets in the genome, ranks guides, highlights problematic guides, designs primers and helps with cloning.	Genobioinfo Cluster: How to use
lima	Demultiplex Barcoded PacBio Samples.	Genobioinfo Cluster: Ask for Install
minibar	Dual barcode and primer demultiplexing for MinION sequenced reads.	Genobioinfo Cluster: How to use
MIPgen	Use MIPgen to design custom MIP panels for target enrichment of moderate to high complexity DNA targets ranging from 120 to 250 bp in size.	Genobioinfo Cluster: How to use

Proteins

Application	Description	Availability/Use
AlphaFold	AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence.	Genobioinfo Cluster: How to use
AlphaFold-disorder	Predict disorder and disorder binding from AlphaFold structures.	Genobioinfo Cluster: How to use
Alphafold2-Pytorch	An unofficial working Pytorch implementation of Alphafold2, a 3D protein predictor.	Genobioinfo Cluster: Ask for Install
ATTRACT	ATTRACT program suite for macromolecular docking (protein-protein, protein-nucleic acid, protein-peptide).	Genobioinfo Cluster: How to use
CAID	The CAID software produces all outputs necessary for a Critical Assessment of Intrinsic Disorder (CAID) edition, including baselines, references, metrics and plots, starting from predictions and a reference.	Genobioinfo Cluster: How to use
CITE-seq-Count	A tool for obtaining UMI counts from a single-cell protein assay.	Genobioinfo Cluster: Ask for Install
ColabFold	ColabFold is an easy-to-use Notebook based environment for fast and convenient protein structure predictions. Its structure prediction is powered by AlphaFold2 and RoseTTAFold combined with a fast multiple sequence alignment generation stage using MMseqs2.	Genobioinfo Cluster: How to use
DeepTMHMM	A Deep Learning Model for Transmembrane Topology Prediction and Classification	Genobioinfo Cluster: How to use
DIAMOND	Accelerated BLAST compatible local sequence aligner.	Genobioinfo Cluster: How to use
DiffDock	DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking. Implementation of DiffDock, a state-of-the-art method for molecular docking, by Gabriele Corso, Hannes Stark, Bowen Jing, Regina Barzilay and Tommi Jaakkola.	Genobioinfo Cluster: How to use
DTVF	Virulence Factor Prediction using Deep Learning.	Genobioinfo Cluster: How to use
E2P2	Ensemble Enzyme Prediction Pipeline.	Genobioinfo Cluster: How to use
EGAPx	EGAPx is the publicly accessible version of the updated NCBI Eukaryotic Genome Annotation Pipeline.	Genobioinfo Cluster: How to use
EMBER3D	Ultra-fast in-silico structure mutation.	Genobioinfo Cluster: How to use
EVE	EVE is a set of protein-specific models providing for any single amino acid mutation of interest a score reflecting the propensity of the resulting protein to be pathogenic.	Genobioinfo Cluster: How to use
EvoBind	EvoBind (v2) designs novel peptide binders based only on a protein target sequence. It is not necessary to specify any target residues within the protein sequence or the length of the binder (although this is possible). Cyclic binder design is also possible.	Genobioinfo Cluster: How to use
genomegaMap	Within-species genome-wide dN/dS estimation from very many genomes.	Genobioinfo Cluster: How to use
hssp	Create DSSP and HSSP files. A series of PDB-related databanks for everyday needs.	Genobioinfo Cluster: How to use
InterProScan	InterProScan is a tool that combines different protein signature recognition methods into one resource. No less than 14 pattern/profiles databanks can be interrogated.	Genobioinfo Cluster: How to use
LOCALIZER	LOCALIZER is a machine learning method for predicting the subcellular localization of both plant proteins and pathogen effectors in the plant cell.	Genobioinfo Cluster: How to use
Loctree3	Protein Subcellular Localization Sequence-Based Predictor	Genobioinfo Cluster: How to use
MAESTRO	A Multi AgEnt STability pRedictiOn tool for changes in unfolding free energy upon point mutation. MAESTRO is structure based and differs from similar approaches in the following points: (i) MAESTRO implements a multi-agent machine learning system. (ii) It provides predicted ΔΔG values along with a corresponding prediction quality measure. (iii) MAESTRO is applicable to biological assemblies. (iv) It provides high throughput scanning for multi-point mutations where sites and types of mutation can be comprehensively controlled. (v) Finally, the software provides a specific mode for the prediction of stabilizing disulfide bonds.	Genobioinfo Cluster: How to use
MeroX	MeroX is based on StavroX. It is specialized for cleavable cross-linkers. In addition to peptide backbone fragments, MeroX identifies cross-linker specific fragments in MS-MS data.	Genobioinfo Cluster: How to use
Miniprot	Aligning proteins to genomes with splicing and frameshift.	Genobioinfo Cluster: How to use
MZmine	MZmine is an open-source software for mass-spectrometry data processing.	Genobioinfo Cluster: How to use
OMA	The OMA (Orthologous MAtrix) database is a well-established resource for identifying orthologs among publicly available complete genomes.	Genobioinfo Cluster: How to use
OMArk	OMArk is a software for proteome (protein-coding gene repertoire) quality assessment.	Genobioinfo Cluster: How to use
Openfold3	A fully open source biomolecular structure prediction model based on AlphaFold3.	Genobioinfo Cluster: How to use
OrthoLoger	Standalone pipeline for delineation of orthologs.	Genobioinfo Cluster: Ask for Install
OrthoMCL	OrthoMCL is a genome-scale algorithm for grouping orthologous protein sequences.	Genobioinfo Cluster: How to use
PanTools	PanTools is a toolkit for comparative analysis of large numbers of genomes.	Genobioinfo Cluster: How to use
PfamScan	A program that searches a FASTA file against a library of Pfam HMMs.	Genobioinfo Cluster: How to use
Phobius	A combined transmembrane topology and signal peptide predictor.	Genobioinfo Cluster: Ask for Install
Plass	Plass (Protein-Level ASSembler) is a software to assemble short read sequencing data on a protein level.	Genobioinfo Cluster: Ask for Install
Proteinortho	Proteinortho is a tool to detect orthologous genes within different species.	Genobioinfo Cluster: How to use
ProtHint	ProtHint is a pipeline for predicting and scoring hints (in the form of introns, start and stop codons) in the genome of interest by mapping and spliced aligning predicted genes to a database of reference protein sequences.	Genobioinfo Cluster: How to use
PROVEAN	PROVEAN (Protein Variation Effect Analyzer) is a software tool which predicts whether an amino acid substitution or indel has an impact on the biological function of a protein.	Genobioinfo Cluster: How to use
RFdiffusion	RFdiffusion is an open source method for structure generation, with or without conditional information (a motif, target etc).	Genobioinfo Cluster: How to use
seqconverter	The command-line program seqconverter can read and write text files containing aligned or unaligned DNA or protein sequences.	Genobioinfo Cluster: How to use
SignalP	SignalP 4.0 server predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes.	Genobioinfo Cluster: How to use
TMHMM	Prediction of transmembrane helices in proteins.	Genobioinfo Cluster: How to use
Vina-GPU-2.0	Vina-GPU 2.0 accelerates AutoDock Vina and its related commonly derived docking methods, such as QuickVina 2 and QuickVina-W with GPUs.	Genobioinfo Cluster: How to use
VinaLC	A parallel molecular docking program based on AutoDock Vina.	Genobioinfo Cluster: How to use
WoLFPSort	WoLF PSORT is an extension of the PSORT II program for protein subcellular localization prediction, which is based on the PSORT principle. WoLF PSORT converts a protein's amino acid sequence into numerical localization features based on sorting signals, amino acid composition and functional motifs.	Genobioinfo Cluster: How to use

Quality and cleaning

Application	Description	Availability/Use
3rdChimeraMiner	Exploration of whole genome amplification generated chimeric sequences in long-read sequencing data.	Genobioinfo Cluster: How to use
AdapterRemoval	This program was developed to remove residual adapter sequences from next generation sequencing reads. The program handles both single end and paired end data.	Genobioinfo Cluster: How to use
ALFATClust	ALignment-Free Adaptive Threshold Clustering: a biological sequence clustering tool with a dynamic threshold for individual clusters. Suitable for clustering multiple groups of homologous sequences.	Genobioinfo Cluster: How to use
Ampliconnoise	AmpliconNoise is a collection of programs for the removal of noise from 454 sequenced PCR amplicons. It involves two steps: the removal of noise from the sequencing itself and the removal of PCR point errors. This project also includes the Perseus algorithm for chimera removal.	Genobioinfo Cluster: Ask for Install
atac dnase pipelines	ATAC-seq and DNase-seq processing pipeline. This pipeline is designed for automated end-to-end quality control and processing of ATAC-seq or DNase-seq data.	Genobioinfo Cluster: Ask for Install
ATLAS	ATLAS stands for Analysis Tools for Low-coverage and Ancient Samples. These tools cover all programs necessary to obtain variant calls, estimates of heterozygosity and more from a BAM file. There are sequence data processing tools, diagnostic tools, and variant discovery tools, similar to GATK by the Broad Institute.	Genobioinfo Cluster: How to use
Atropos	Atropos is a tool for specific, sensitive, and speedy trimming of NGS reads. It is a fork of the Cutadapt read trimmer.	Genobioinfo Cluster: Ask for Install
BCOOL	BCOOL is a read corrector for NGS sequencing data that aligns reads on a de Bruijn graph.	Genobioinfo Cluster: Ask for Install
biastools	A toolkit to analyze reference bias in short DNA read alignment.	Genobioinfo Cluster: How to use
Btrim	A fast and accurate adapter, barcode, and low-quality region trimming and binning program written in C for next-generation sequencing reads. The search algorithm is based on Eugene Myers' fast bit-vector algorithm.	Genobioinfo Cluster: Ask for Install
charcoal	Remove contaminated contigs from genomes using k-mers and taxonomies.	Genobioinfo Cluster: Ask for Install
CheckM2	Assessing the quality of metagenome-derived genome bins using machine learning.	Genobioinfo Cluster: How to use
CheckV	CheckV is a fully automated command-line pipeline for assessing the quality of single-contig viral genomes, including identification of host contamination for integrated proviruses, estimating completeness for genome fragments, and identification of closed genomes.	Genobioinfo Cluster: Ask for Install
chopper	This tool, intended for long read sequencing such as PacBio or ONT, filters and trims a fastq file. chopper combines the functionality of the now-outdated tools NanoFilt and NanoLyse and runs faster than either of them.	Genobioinfo Cluster: How to use
ClipAndMerge	Clip&Merge is a tool to clip off adapters from sequencing reads and merge overlapping paired end reads together.	Genobioinfo Cluster: How to use
compleasm	A genome completeness evaluation tool based on miniprot.	Genobioinfo Cluster: How to use
Consensify	Consensify is a method for generating a consensus pseudohaploid genome sequence with greatly reduced error rates compared to standard pseudohaploidisation.	Genobioinfo Cluster: Ask for Install
ContScout	ContScout is a pipeline developed for the identification and removal of contaminating sequences in draft genomes.	Genobioinfo Cluster: How to use
CroCo	A program to detect potential cross contaminations in HTS assembled transcriptomes using expression level quantification.	Genobioinfo Cluster: Ask for Install
cutadapt	Cutadapt removes adapter sequences from DNA high-throughput sequencing data. This is usually necessary when the read length of the machine is longer than the molecule that is sequenced, such as in microRNA data.	Genobioinfo Cluster: How to use
DeChat	Repeat and haplotype aware error correction in nanopore sequencing reads with DeChat.	Genobioinfo Cluster: How to use
decOM	decOM is a high-accuracy microbial source tracking method that is suitable for contamination quantification in paleogenomics, namely the analysis of collections of possibly contaminated ancient oral metagenomic data sets.	Genobioinfo Cluster: How to use
DeconSeq	Detect and remove contaminations from your sequence data.	Genobioinfo Cluster: Ask for Install
DeDup	A merged read deduplication tool that is also capable of performing merged read deduplication on single-end data.	Genobioinfo Cluster: Ask for Install
ecoPCR	ecoPCR is an electronic PCR software developed by LECA and Helix-Project. It helps you to estimate barcode primer quality. In conjunction with OBItools, you can postprocess ecoPCR output to compute barcode coverage and barcode specificity.	Genobioinfo Cluster: How to use
ecoPrimers	ecoPrimers is a barcoding software written in C. It finds universal primers from a set of input DNA sequences by finding conserved regions without "a priori" on candidate sequences. It also evaluates the quality of the primers and barcode regions by measuring the "barcode specificity" and "barcode coverage" indices.	Genobioinfo Cluster: How to use
EMA	EMA uses a latent variable model to align barcoded short-reads (such as those produced by 10x Genomics' sequencing platform).	Genobioinfo Cluster: Ask for Install
EMBLmyGFF3	An efficient way to convert gff3 annotation files into EMBL format ready to submit.	Genobioinfo Cluster: How to use
fastix	A simple command line tool to add prefixes to FASTA headers.	Genobioinfo Cluster: How to use
fastk-medians	A set of utilities to calculate the median number of times the k-mers in a sequence of interest occur across the whole set.	Genobioinfo Cluster: How to use
fastp	A tool designed to provide fast all-in-one preprocessing for FastQ files.	Genobioinfo Cluster: How to use
fastplong	Ultra-fast preprocessing and quality control for long-read sequencing data.	Genobioinfo Cluster: How to use
FastQ Screen	FastQ Screen allows you to screen a library of sequences in FastQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.	Genobioinfo Cluster: How to use
fastq_illumina_filter	This program can filter FASTQ files produced by CASAVA 1.8, and keep/discard reads based on this filter flag.	Genobioinfo Cluster: How to use
fastq-tools	A collection of small and efficient programs for performing some common and uncommon tasks with FASTQ files.	Genobioinfo Cluster: Ask for Install
FastQC	A Quality Control application for FastQ files. FastQC is an application which takes a FastQ file and runs a series of tests on it to generate a comprehensive QC report.	Genobioinfo Cluster: How to use
fastQValidator	The fastQValidator validates the format of fastq files	Genobioinfo Cluster: Ask for Install
FCS	The NCBI Foreign Contamination Screen (FCS) is a tool suite (FCS-adaptor and FCS-GX) for identifying and removing contaminant sequences in genome assemblies.	Genobioinfo Cluster: How to use
FCS-GX	FCS-GX detects contamination from foreign organisms in genome sequences. This tool is one module within the NCBI Foreign Contamination Screening (FCS) program suite.	Genobioinfo Cluster: How to use
Filtlong	Filtlong is a tool for filtering long reads by quality.	Genobioinfo Cluster: How to use
FLAS	FLAS is software that makes self-correction for PacBio long reads with fast speed and high throughput.	Genobioinfo Cluster: Ask for Install
Flexbar	Flexbar preprocesses high-throughput sequencing data efficiently. It demultiplexes barcoded runs and removes adapter sequences. Moreover, trimming and filtering features are provided. Flexbar increases read mapping rates and improves genome and transcriptome assemblies. It supports next-generation sequencing data in fasta/q and csfasta/q format from Illumina, Roche 454, and the SOLiD platform.	Genobioinfo Cluster: How to use
FMLRC	FMLRC, or FM-index Long Read Corrector, is a tool for performing hybrid correction of long read sequencing using the BWT and FM-index of short-read sequencing data.	Genobioinfo Cluster: Ask for Install
FrameBot	RDP FrameBot is a tool for correcting frameshift errors caused by insertions and deletions in DNA sequences.	Genobioinfo Cluster: Ask for Install
GCI	Genome Continuity Inspector (GCI) is an assembly assessment tool for high-quality genomes (e.g. T2T genomes), in base resolution.	Genobioinfo Cluster: How to use
ggCaller	A de Bruijn graph-based gene-caller and pangenome analysis tool.	Genobioinfo Cluster: How to use
HELEN	HELEN (Homopolymer Encoded Long-read Error-corrector for Nanopore) uses a Recurrent-Neural-Network (RNN) based Multi-Task Learning (MTL) model that can predict a base and a run-length for each genomic position using the weights generated by MarginPolish. This installation includes MarginPolish.	Genobioinfo Cluster: Ask for Install
HG-CoLoR	HG-CoLoR (Hybrid method based on a variable-order de bruijn Graph for the error Correction of Long Reads) is a hybrid method for the error correction of long reads that both aligns the short reads to the long reads, and uses a variable-order de Bruijn graph, in a seed-and-extend approach.	Genobioinfo Cluster: Ask for Install
HiFiAdapterFilt	Convert .bam to .fastq and remove reads with remnant PacBio adapter sequences.	Genobioinfo Cluster: How to use
Inspector	A tool for evaluating long-read de novo assembly results.	Genobioinfo Cluster: How to use
ITSxpress	Software to trim the ITS region of FASTQ sequences for amplicon sequencing analysis.	Genobioinfo Cluster: How to use
Jabba	A hybrid error correction tool for sequencing reads.	Genobioinfo Cluster: Ask for Install
KAD	KAD is designed for evaluating the accuracy of nucleotide base quality of genome assemblies.	Genobioinfo Cluster: Ask for Install
KAT	KAT (The K-mer Analysis Toolkit) is a suite of tools that generate, analyse and compare k-mer spectra produced from sequence files.	Genobioinfo Cluster: How to use
KCOSS	A fast and space-saving multi-threaded k-mer frequency statistics algorithm	Genobioinfo Cluster: Ask for Install
klumpy	Klumpy is a bioinformatic tool for identifying possibly incorrectly assembled regions in a long-read based assembly, with the additional capabilities of annotating sequences given a set of query sequences.	Genobioinfo Cluster: How to use
kmer-counter	A fast k-mer counter written in Rust.	Genobioinfo Cluster: How to use
komplexity	A command-line tool built in Rust to quickly calculate and/or mask low-complexity sequences from a FASTA/FASTQ file. This uses the number of unique k-mers over a sequence divided by the length to assess complexity.	Genobioinfo Cluster: How to use
LongQC	LongQC is a tool for the data quality control of the PacBio and ONT long reads, and it has two functionalities: sample qc and platform qc.	Genobioinfo Cluster: How to use
LoRDEC	LoRDEC is a program to correct sequencing errors in long reads from 3rd generation sequencing with high error rate, and is especially intended for PacBio reads. It uses a hybrid strategy, meaning that it uses two sets of reads: the reference read set, whose error rate is assumed to be small, and the PacBio read set, which is then corrected using the reference set. Typically, the reference set contains Illumina reads.	Genobioinfo Cluster: How to use
mapDamage	Tracking and quantifying damage patterns in ancient DNA sequences.	Genobioinfo Cluster: How to use
MECAT	MECAT is an ultra-fast mapping, error correction and de novo assembly tool for single-molecule sequencing (SMRT) reads.	Genobioinfo Cluster: Ask for Install
Medaka	Medaka demonstrates a framework for error correcting sequencing data, particularly aimed at nanopore sequencing. Tools are provided for both training and inference. The code exploits the keras deep learning library.	Genobioinfo Cluster: How to use
MinIONQC	Fast and effective quality control for MinION and PromethION sequencing data	Genobioinfo Cluster: Ask for Install
MiniScrub	MiniScrub is a de novo long sequencing read preprocessing method that improves read quality by predicting and removing ("scrubbing") read segments that have a high concentration of errors.	Genobioinfo Cluster: Ask for Install
MitoZ	MitoZ is a Python3-based toolkit which aims to automatically filter paired-end raw data (fastq files), assemble the genome, search for mitogenome sequences in the genome assembly result, annotate the mitogenome (genbank file as result), and visualize the mitogenome.	Genobioinfo Cluster: How to use
Musket	Musket is a well-established leading next-generation sequencing read error correction algorithm targeting Illumina sequencing.	Genobioinfo Cluster: Ask for Install
Nano-Q	Python script for conservatively cleaning ONT reads from bam files and estimating variant frequencies.	Genobioinfo Cluster: Ask for Install
NanoComp	Compare multiple runs of long read sequencing data and alignments.	Genobioinfo Cluster: in Python-3.11.1
NanoFilt	Filtering and trimming of Oxford Nanopore sequencing data	Genobioinfo Cluster: Ask for Install
NanoLyse	Remove reads mapping to the lambda phage genome from a fastq file	Genobioinfo Cluster: How to use
NanoStat	Create statistic summary of an Oxford Nanopore read dataset	Genobioinfo Cluster: How to use
NextPolish	NextPolish is used to fix base errors (SNV/Indel) in the genome generated by noisy long reads, it can be used with short read data only or long read data only or a combination of both.	Genobioinfo Cluster: How to use
nQuire	A statistical framework for ploidy estimation using NGS short-read data.	Genobioinfo Cluster: How to use
OMArk	OMArk is a software for proteome (protein-coding gene repertoire) quality assessment.	Genobioinfo Cluster: How to use
Pacasus	Tool for detecting and cleaning PacBio / Nanopore long reads after whole genome amplification.	Genobioinfo Cluster: Ask for Install
PanACoTA	PANgenome with Annotations, COre identification, Tree and corresponding Alignments.	Genobioinfo Cluster: How to use
PECAT	PECAT is a phased error correction and assembly tool for long reads. It includes a haplotype-aware correction method and an efficient diploid assembly method.	Genobioinfo Cluster: How to use
Platanus_trim	Platanus_trim is a tool for trimming adaptor sequences and low quality regions. In contrast, Platanus_internal_trim is a tool for trimming internal adaptor sequence, adaptor sequences, and low quality regions. Platanus_trim is designed for paired-end library and Platanus_internal_trim is for mate-pair library.	Genobioinfo Cluster: Ask for Install
Porechop	Porechop is a tool for finding and removing adapters from Oxford Nanopore reads. Adapters on the ends of reads are trimmed off, and when a read has an adapter in its middle, it is treated as chimeric and chopped into separate reads. Porechop performs thorough alignments to effectively find adapters, even at low sequence identity.	Genobioinfo Cluster: How to use
Porechop_ABI	Porechop_abi (ab initio) is an extension of Porechop that is able to infer the adapter sequence from the Oxford Nanopore reads. It discovers the adapter sequence from the reads using approximate k-mers and assembly, and adds the sequence found to the adapter list (adapters.py file).	Genobioinfo Cluster: How to use
PRINSEQ	PRINSEQ is a tool that generates summary statistics of sequence and quality data and that is used to filter, reformat and trim next-generation sequence data. The standalone version is primarily designed for data preprocessing and does not generate summary statistics in graphical form.	Genobioinfo Cluster: How to use
Pychopper	Pychopper v2 is a tool to identify, orient and trim full-length Nanopore cDNA reads. The tool is also able to rescue fused reads.	Genobioinfo Cluster: How to use
pycoQC	pycoQC computes metrics and generates interactive QC plots from the sequencing summary report generated by Oxford Nanopore Technologies basecallers (Albacore/Guppy).	Genobioinfo Cluster: How to use
PyroCleaner	PyroCleaner is intended to clean reads coming from pyrosequencing in order to ease the assembly process.	Genobioinfo Cluster: Ask for Install
Qualimap	Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.	Genobioinfo Cluster: How to use
Quorum	QuorUM (Quality Optimized Reads from the University of Maryland) is an error corrector for Illumina reads.	Genobioinfo Cluster: Ask for Install
READv2	Relationship Estimation from Ancient DNA version 2.	Genobioinfo Cluster: How to use
Recentrifuge	Robust comparative analysis and contamination removal for metagenomics	Genobioinfo Cluster: How to use
RSeQC	RSeQC package provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data	Genobioinfo Cluster: How to use
Sabre	A barcode demultiplexing and trimming tool for FastQ files.	Genobioinfo Cluster: Ask for Install
samblaster	samblaster is a fast and flexible program for marking duplicates in read-id grouped paired-end SAM files. It can also optionally output discordant read pairs and/or split read mappings to separate SAM files, and/or unmapped/clipped reads to a separate FASTQ file.	Genobioinfo Cluster: How to use
schmutzi	Bayesian maximum a posteriori contamination estimate for ancient samples.	Genobioinfo Cluster: How to use
seqclean	SeqClean is a tool for validation and trimming of DNA sequences from a flat file database (FASTA format).	Genobioinfo Cluster: How to use
seqconverter	The command-line program seqconverter can read and write text files containing aligned or unaligned DNA or protein sequences.	Genobioinfo Cluster: How to use
seqOutATACBias	A CLI that corrects the sequence bias of Tn5 transposase in ATAC-seq data using a rule ensemble model.	Genobioinfo Cluster: How to use
seqOutBias	Universal correction of enzymatic sequence bias.	Genobioinfo Cluster: How to use
sequence-stats	A fast and beginner-friendly program to generate statistics from FASTQ and FASTA files (written in AWK and Bash), e.g. genome assembly sizes and GC content (%).	Genobioinfo Cluster: How to use
sickle	Sickle is a tool that uses sliding windows along with quality and length thresholds to determine when quality is sufficiently low to trim the 3'-end of reads and also determines when the quality is sufficiently high to trim the 5'-end of reads.	Genobioinfo Cluster: How to use
SMRTLink	SMRT Link is the web-based end-to-end workflow manager for the Sequel™ System. (installed in command-line mode on our cluster)	Genobioinfo Cluster: How to use
SnoReport	Computational identification of snoRNAs with unknown targets. Detecting novel or orphan snoRNAs in RNA sequence data using sequence and structure information only without relying on target information	Genobioinfo Cluster: Ask for Install
sonic	Some Organism's Nucleotide Information Container.	Genobioinfo Cluster: Ask for Install
Sprai	Sprai (single-pass read accuracy improver) is a tool to correct sequencing errors in single-pass reads for de novo assembly.	Genobioinfo Cluster: Ask for Install
spruceup	Tools to discover, visualize, and remove outlier sequences in large multiple sequence alignments.	Genobioinfo Cluster: Ask for Install
Transrate	Transrate is software for de-novo transcriptome assembly quality analysis.	Genobioinfo Cluster: How to use
Trim Galore	A wrapper tool around Cutadapt and FastQC to consistently apply quality and adapter trimming to FastQ files, with some extra functionality for MspI-digested RRBS-type (Reduced Representation Bisulfite-Seq) libraries.	Genobioinfo Cluster: How to use
Trimmomatic	Trimmomatic performs a variety of useful trimming tasks for Illumina paired-end and single-end data. The selection of trimming steps and their associated parameters are supplied on the command line.	Genobioinfo Cluster: How to use
UMI-tools	Tools for handling Unique Molecular Identifiers in NGS data sets	Genobioinfo Cluster: How to use
unique-kmer-counts	This program calculates the number of distinct k-mers for each sequence record in a fasta file and divides it by the total number of k-mers in that record.	Genobioinfo Cluster: How to use
Vcfstats	Vcfstats is a tool that can generate metrics from a vcf file.	Genobioinfo Cluster: How to use
VeChat	Correcting errors in noisy long reads using variation graphs.	Genobioinfo Cluster: How to use
verkko-fillet	verkko-fillet is an easy-to-use toolkit for cleaning Verkko assemblies.	Genobioinfo Cluster: How to use
ViPER	Bioinformatics pipeline used in the Laboratory of Viral Metagenomics (KU Leuven) to trim and assemble paired-end Illumina reads, and classify resulting contigs.	Genobioinfo Cluster: How to use
yacrd	Yet Another Chimeric Read Detector for long reads	Genobioinfo Cluster: Ask for Install
Yak	Yak is initially developed for two specific use cases: 1) to robustly estimate the base accuracy of CCS reads and assembly contigs, and 2) to investigate the systematic error rate of CCS reads.	Genobioinfo Cluster: Ask for Install

RAD-seq analysis

Application	Description	Availability/Use
BayesAss3-SNPs	Modification of BayesAss 3.0.4 to allow handling of large SNP datasets.	Genobioinfo Cluster: How to use
fineRADstructure	A complete, easy to use, and fast population inference package for RAD-seq data.	Genobioinfo Cluster: Ask for Install
fragmatic	Simple program for in silico restriction digest of genomic sequences, to simulate RAD-family NGS library prep methods.	Genobioinfo Cluster: Ask for Install
ipyrad	An interactive toolkit for assembly and analysis of restriction-site associated genomic data sets (e.g., RAD, ddRAD, GBS) for population genetic and phylogenetic studies.	Genobioinfo Cluster: How to use
RADIS	Analysis of RAD-seq data for InterSpecific phylogeny	Genobioinfo Cluster: Ask for Install
radsex	Find sex signal in RAD-Sequencing data.	Genobioinfo Cluster: How to use
Stacks	Stacks is a software suite for analysing RAD Sequencing data by Julian Catchen at the University of Oregon. It will process raw Illumina RAD data or RAD data aligned to a reference genome, and produce genotypes that can be viewed and filtered via a web interface.	Genobioinfo Cluster: How to use

Repeats

Application	Description	Availability/Use
CENSOR	CENSOR compares and masks protein or nucleotide sequences.	Genobioinfo Cluster: How to use
centroAnno	centroAnno is a prior-independent tool for automatic and efficient centromere/tandem repeat structural analysis across multiple species. centroAnno supports the analysis of repeat units and higher-order tandem repeat units (HORs) in genomes/assemblies, centromere sequences, and single long sequencing reads.	Genobioinfo Cluster: How to use
cnD	cnD is a program to detect copy number variants from short-read sequence data.	Genobioinfo Cluster: How to use
EarlGrey	A fully automated TE curation and annotation pipeline.	Genobioinfo Cluster: How to use
ExpansionHunter	A tool for estimating repeat sizes.	Genobioinfo Cluster: How to use
GraffiTE	GraffiTE is a pipeline that finds polymorphic transposable elements in genome assemblies and/or long reads, and genotypes the discovered polymorphisms in read sets using genome-graphs.	Genobioinfo Cluster: How to use
HipSTR	Genotype and phase short tandem repeats using Illumina whole-genome sequencing data.	Genobioinfo Cluster: How to use
ICEscreen	ICEscreen is a bioinformatic pipeline for the detection and annotation of ICEs (Integrative and Conjugative Elements) and IMEs (Integrative and Mobilizable Elements) in Bacillota genomes.	Genobioinfo Cluster: How to use
ISEScan	A python pipeline to identify IS (Insertion Sequence) elements in genomes and metagenomes. ISEScan can be used to identify/annotate full-length or non-full-length IS elements in any DNA sequence, but it was only tested on prokaryotic genomes, including draft genomes and metagenomes. Among the existing tools identifying IS elements, ISEScan might be the only one that gives TIR (Terminal Inverted Repeat) sequences.	Genobioinfo Cluster: How to use
Look4TRs	A de-novo tool for detecting simple tandem repeats using self-supervised hidden Markov models.	Genobioinfo Cluster: How to use
LTR_FINDER_parallel	A parallel wrapper for LTR_FINDER (LTR_Finder is an efficient program for finding full-length LTR retrotransposons in genome sequences).	Genobioinfo Cluster: Ask for Install
LtrDetector	A tool-suite for detecting long terminal repeat retrotransposons de-novo on the genomic scale.	Genobioinfo Cluster: How to use
MCHelper	An automatic tool to curate transposable element libraries.	Genobioinfo Cluster: How to use
MinCED	MinCED is a program to find Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) in full genomes or environmental datasets such as metagenomes, in which sequence size can be anywhere from 100 to 800 bp. MinCED runs from the command-line and was derived from CRT	Genobioinfo Cluster: Ask for Install
Mobster	Mobster is used to detect novel (non-reference) Mobile Element Insertion (MEI) events in BAM files and uses both a discordant read pair method and a split-read method.	Genobioinfo Cluster: Ask for Install
mreps	Software for tandem repeat identification in DNA.	Genobioinfo Cluster: How to use
MSIsensor-pro	MSIsensor-pro is an updated version of msisensor. MSIsensor-pro evaluates Microsatellite Instability (MSI) for cancer patients with next generation sequencing data.	Genobioinfo Cluster: How to use
NSEG	NSEG is used to mask nucleic acid sequences, needed by RepeatScout.	Genobioinfo Cluster: How to use
ntHits	ntHits is a method for identifying repeats in high-throughput DNA sequencing data.	Genobioinfo Cluster: Ask for Install
Onecodetofindthemall	One code to find them all is a set of perl scripts to extract useful information from RepeatMasker about transposable elements, retrieve their sequences and get some quantitative information.	Genobioinfo Cluster: Ask for Install
pantera	Identification of transposable element families from pangenome polymorphisms. A pangenome is a collection of genomes or haplotypes that can be aligned and stored as a variation graph in gfa format. pantera receives as input a list of gfa files of non overlapping variation graphs and produces a library of transposable elements found to be polymorphic on that pangenome.	Genobioinfo Cluster: How to use
parseRM	A few scripts facilitating the extraction of information from RepeatMasker .out files.	Genobioinfo Cluster: Ask for Install
PILER	Genomic repeat analysis software.	Genobioinfo Cluster: How to use
PILERCR	PILERCR is public domain software for finding CRISPR repeats.	Genobioinfo Cluster: Ask for Install
RDXplorer	The RDXplorer (Read Depth eXplorer) is a computational tool for copy number variants (CNV) detection in whole human genome sequence data using read depth (RD) coverage.	Genobioinfo Cluster: How to use
RECON	A package for automated de novo identification of repeat families from genomic sequence.	Genobioinfo Cluster: How to use
RepAHR	RepAHR is used to identify repeats (repetitive sequences) in genomes using Next-Generation Sequencing reads.	Genobioinfo Cluster: Ask for Install
REPdenovo	REPdenovo is designed for constructing repeats directly from sequence (paired-end) reads. It is based on the idea of frequent k-mer assembly. REPdenovo provides many functionalities, and can generate much longer repeats than existing tools. Internally, REPdenovo uses Jellyfish for k-mer counting, Velvet for assembly, and bwa to map reads on the Transposable Elements.	Genobioinfo Cluster: Ask for Install
RepeatAfterMe	A package for the extension of repetitive DNA sequences.	Genobioinfo Cluster: How to use
RepeatExplorer	RepeatExplorer is a computational pipeline designed to identify and characterize repetitive DNA elements in next-generation sequencing data from plant and animal genomes.	Genobioinfo Cluster: Ask for Install
RepeatMasker	RepeatMasker is a program that screens DNA sequences for interspersed repeats (thanks to RepBase repeats databanks specially formatted) and low complexity DNA sequences.	Genobioinfo Cluster: How to use
RepeatModeler	RepeatModeler is a de-novo repeat family identification and modeling package.	Genobioinfo Cluster: How to use
RepeatScout	RepeatScout is a tool to discover repetitive substrings in DNA.	Genobioinfo Cluster: How to use
RepEnrich2	RepEnrich2 is an updated method to estimate repetitive element enrichment using high-throughput sequencing data.	Genobioinfo Cluster: Ask for Install
REPET	The REPET package (Flutre et al., 2011) integrates bioinformatics programs in order to tackle biological issues at the genomic scale.	Genobioinfo Cluster: How to use
RetroSeq	RetroSeq is a bioinformatics tool that searches for mobile element insertions from aligned reads in a BAM file and a library of reference transposable elements.	Genobioinfo Cluster: Ask for Install
REViewer	A tool for visualizing alignments of reads in regions containing tandem repeats	Genobioinfo Cluster: How to use
RMBlast	RMBlast is a RepeatMasker-compatible version of the standard NCBI BLAST suite. The primary difference between this distribution and the NCBI distribution is the addition of a new program, "rmblastn", for use with RepeatMasker and RepeatModeler. RMBlast supports RepeatMasker searches by adding a few necessary features to the stock NCBI blastn program: support for custom matrices (without KA-Statistics); support for cross_match-like complexity-adjusted scoring (cross_match is Phil Green's seeded Smith-Waterman search algorithm); and support for cross_match-like masklevel filtering.	Genobioinfo Cluster: How to use
SEDEF	SEDEF is a quick tool to find all segmental duplications in the genome.	Genobioinfo Cluster: Ask for Install
spaTyper	Computational method for finding spa types. Staphylococcus aureus is a major human pathogen causing skin and tissue infections, pneumonia, septicemia, and device-associated infections. The emergence of strains resistant to methicillin (MRSA) and other antibacterial agents has become a major concern, especially in the hospital environment, because of the high mortality of the infections caused by these strains. Single locus DNA-sequencing of the repeat region of the Staphylococcus protein A gene (spa) can be used for reliable, accurate and discriminatory typing of MRSA. Repeats are assigned a numerical code and the spa-type is deduced from the order of specific repeats. However, spa-typing was hampered in the past by the lack of a consensus on assignments of new spa-repeats and -types.	Genobioinfo Cluster: Ask for Install
SQuIRE	SQuIRE reveals locus-specific regulation of interspersed repeat expression, Nucleic Acids Research	Genobioinfo Cluster: How to use
SRF	Satellite Repeat Finder, or SRF in brief, assembles motifs in satellite DNA that are tandemly repeated many times in the genome.	Genobioinfo Cluster: How to use
T-lex	T-lex is a computational pipeline that detects presence and/or absence of annotated individual transposable elements (TEs) using next-generation sequencing (NGS) data.	Genobioinfo Cluster: Ask for Install
Tandem Repeats Finder	Tandem Repeats Finder is a program to locate and display tandem repeats in DNA sequences. A tandem repeat in DNA is two or more adjacent, approximate copies of a pattern of nucleotides.	Genobioinfo Cluster: How to use
TE_finder	A suite of C++ programs developed for transposable element search and their annotation in large eukaryotic genome sequence. A part of the REPET package.	Genobioinfo Cluster: How to use
TEsorter	TEsorter was originally written for LTR_retriever to classify long terminal repeat retrotransposons (LTR-RTs). It can also be used to classify any other TE sequences, including Class I and Class II elements that are covered by the REXdb database.	Genobioinfo Cluster: How to use
TETools	Dfam TE Tools includes RepeatMasker, RepeatModeler, and coseg. This container is an easy way to get a minimal yet fully functional installation of RepeatMasker and RepeatModeler and is additionally useful for testing or reproducibility purposes.	Genobioinfo Cluster: How to use
tidk	tidk is a toolkit to identify and visualise telomeric repeats for the Darwin Tree of Life genomes.	Genobioinfo Cluster: How to use
Tn3+TA_finder	Tn3 Transposon/Toxin Finder (Tn3+TA_finder) is a program for the automatic prediction of transposable elements of the Tn3 family associated with type II toxin and antitoxin pairs in bacteria and archaea.	Genobioinfo Cluster: How to use
TnComp_finder	Composite Transposon Finder (TnComp_finder) is a program for the prediction of putative composite transposons in bacterial and archaeal genomes based on insertion sequence replicas in a relatively short span.	Genobioinfo Cluster: How to use
transposon_annotation_tools	A set of bioconda packages for transposon annotation and transposon feature annotation in nucleotide sequences. transposon_annotation_tools is part of TransposonUltimate. The package includes a series of transposable element discovery tools, such as: MUSTv2, HelitronScanner, SineFinder, MiteTracker, MiteFinderII, SineScan, TirVish, LtrHarvest, RepeatModeler, TransposonPSI, and TransposonProteinNCBICDD1000. You can then use these tools independently.	Genobioinfo Cluster: How to use
transposon_classifier_rfsb	Transposon classification tool for nucleotide sequence classification, providing classification, model training and prediction evaluation. RFSB is part of TransposonUltimate.	Genobioinfo Cluster: How to use
TRASH	Tandem Repeat Annotation and Structural Hierarchy: a package to identify and extract tandem repeats in genome sequences and investigate their higher order structures.	Genobioinfo Cluster: How to use

Ribo-seq

Application	Description	Availability/Use
RiboTaper	RiboTaper is a new analysis pipeline for Ribosome Profiling (Ribo-seq) experiments, which exploits the triplet periodicity of ribosomal footprints to call translated regions.	Genobioinfo Cluster: Ask for Install

scDNA

Application	Description	Availability/Use
CellRanger ARC	Cell Ranger ARC's pipelines analyze sequencing data produced from Chromium Single Cell Multiome ATAC + Gene Expression.	Genobioinfo Cluster: How to use
Cellsnp-lite	Cellsnp-lite is a C/C++ tool for efficiently genotyping bi-allelic SNPs on single cells. You can use cellsnp-lite after read alignment to obtain the SNP x cell pileup UMI or read count matrices for each allele of given or detected SNPs.	Genobioinfo Cluster: How to use
SEACells	Single-cEll Aggregation for High Resolution Cell States. SEACells algorithm for Inference of transcriptional and epigenomic cellular states from single-cell genomics data.	Genobioinfo Cluster: How to use
souporcell	souporcell is a method for clustering mixed-genotype scRNAseq experiments by individual.	Genobioinfo Cluster: How to use

Sequences alignment

Application	Description	Availability/Use
3rdChimeraMiner	Exploration of whole genome amplification generated chimeric sequences in long-read sequencing data.	Genobioinfo Cluster: How to use
AC-DIAMOND	AC-DIAMOND attempts to speed up DIAMOND via better SIMD parallelization and compressed indexing. Experimental results show that AC-DIAMOND was about 6~7 times faster than DIAMOND on aligning DNA reads or contigs while retaining essentially the same sensitivity. AC-DIAMOND was developed based on DIAMOND v0.7.9.	Genobioinfo Cluster: Ask for Install
Accel-align	Accel-align is a fast alignment tool implemented in the C++ programming language.	Genobioinfo Cluster: Ask for Install
ALFATClust	ALignment-Free Adaptive Threshold Clustering: biological sequence clustering tool with a dynamic threshold for individual clusters. Suitable for clustering multiple groups of homologous sequences.	Genobioinfo Cluster: How to use
AMBER	Analyzing Mapping Biases and Evaluating Read Reliability.	Genobioinfo Cluster: How to use
AnchorWave	AnchorWave (Anchored Wavefront Alignment) identifies collinear regions via conserved anchors (full-length CDS and full-length exon have been implemented currently) and breaks collinear regions into shorter fragments, i.e., anchor and inter-anchor intervals.	Genobioinfo Cluster: Ask for Install
AvP	AvP performs automatic detection of HGT candidates within a phylogenetic framework.	Genobioinfo Cluster: How to use
Back_to_sequences	Given a set of kmers (fasta / fastq <.gz> format) and a set of sequences (fasta / fastq <.gz> format), this tool will extract the sequences containing some of those kmers.	Genobioinfo Cluster: How to use
bam2plot	Make coverage plots from bam files.	Genobioinfo Cluster: How to use
Bamstats (not the same as BAMstats)	Bamstats is a command line tool written in Go for computing mapping statistics from a BAM file.	Genobioinfo Cluster: Ask for Install
BBMap	A short-read aligner, as well as various other bioinformatic tools.	Genobioinfo Cluster: How to use
BCOOL	BCOOL is a read corrector for NGS sequencing data that aligns reads on a de Bruijn graph.	Genobioinfo Cluster: Ask for Install
BELLA	A computationally-efficient and highly-accurate long-read to long-read aligner and overlapper.	Genobioinfo Cluster: Ask for Install
biohazard-tools	This is a collection of command line utilities that do useful stuff involving BAM files for Next Generation Sequencing data.	Genobioinfo Cluster: Ask for Install
Blasr	Reference-based alignment	Genobioinfo Cluster: (See SMRTLink)
blat	The BLAST-Like Alignment Tool: similarity search in databanks. BLAT on DNA is designed to quickly find sequences of 95% and greater similarity of length 25 bases or more. BLAT on proteins finds sequences of 80% and greater similarity of length 20 amino acids or more.	Genobioinfo Cluster: How to use
BOLDigger	Python program to query .fasta files against the different databases of www.boldsystems.org	Genobioinfo Cluster: How to use
Bowtie	Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).	Genobioinfo Cluster: How to use
Bowtie2	Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.	Genobioinfo Cluster: How to use
BPP	Bayesian analysis of genomic sequence data under the multispecies coalescent model.	Genobioinfo Cluster: Ask for Install
BS-Seeker2	BS-Seeker2 is a seamless and versatile pipeline for fast and accurate mapping of bisulfite-treated reads.	Genobioinfo Cluster: Ask for Install
BSMAP	BSMAP is a short-read mapping software for bisulfite sequencing reads. Bisulfite treatment converts unmethylated Cytosines into Uracils (sequenced as Thymine) and leaves methylated Cytosines unchanged, and hence provides a way to study DNA cytosine methylation at single-nucleotide resolution. BSMAP aligns the Ts in the reads to both Cs and Ts in the reference.	Genobioinfo Cluster: Ask for Install
bwa	Burrows-Wheeler Aligner (BWA) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such as the human genome. It implements two algorithms, bwa-short and BWA-SW. The former works for query sequences shorter than 200bp and the latter for longer sequences up to around 100kbp. Both algorithms do gapped alignment. They are usually more accurate and faster on queries with low error rates.	Genobioinfo Cluster: How to use
bwa-mem2	Bwa-mem2 is the next version of the bwa-mem algorithm in bwa. It produces alignments identical to bwa and is ~80% faster.	Genobioinfo Cluster: How to use
bwa-meth	Fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome.	Genobioinfo Cluster: How to use
Cactus	Cactus is a reference-free whole-genome alignment program, as well as a pangenome graph construction toolkit.	Genobioinfo Cluster: How to use
chromeister	A dotplot generator for large chromosomes.	Genobioinfo Cluster: How to use
Clustal Omega	Clustal Omega is the latest addition to the Clustal family. It offers a significant increase in scalability over previous versions, allowing hundreds of thousands of sequences to be aligned in only a few hours. It will also make use of multiple processors, where present. In addition, the quality of alignments is superior to previous versions, as measured by a range of popular benchmarks	Genobioinfo Cluster: How to use
ClustalW	Multiple sequence alignment program for DNA or proteins.	Genobioinfo Cluster: Ask for Install
Conterminator	Conterminator is an efficient method for detecting incorrectly labeled sequences across kingdoms by an exhaustive all-against-all sequence comparison.	Genobioinfo Cluster: Ask for Install
DALIGNER	DALIGNER finds all significant local alignments between reads encoded in a Dazzler database. The assumption is that the reads are from a PacBio RS II long-read sequencer.	Genobioinfo Cluster: Ask for Install
DIAMOND	Accelerated BLAST compatible local sequence aligner.	Genobioinfo Cluster: How to use
ecoPrimers	ecoPrimers is a barcoding software written in C. It finds universal primers from a set of input DNA sequences by finding conserved regions without a priori knowledge of candidate sequences. It also evaluates the quality of the primers and barcode regions by measuring the "barcode specificity" and "barcode coverage" indices.	Genobioinfo Cluster: How to use
EGAPx	EGAPx is the publicly accessible version of the updated NCBI Eukaryotic Genome Annotation Pipeline.	Genobioinfo Cluster: How to use
eggNog-mapper	eggnog-mapper is a tool for fast functional annotation of novel sequences. It uses precomputed orthologous groups and phylogenies from the eggNOG database to transfer functional information from fine-grained orthologs only.	Genobioinfo Cluster: How to use
EMA	EMA uses a latent variable model to align barcoded short-reads (such as those produced by 10x Genomics' sequencing platform).	Genobioinfo Cluster: Ask for Install
EMBOSS	EMBOSS is "The European Molecular Biology Open Software Suite". EMBOSS is a free Open Source software analysis package specially developed for the needs of the molecular biology (e.g. EMBnet) user community. The software automatically copes with data in a variety of formats and even allows transparent retrieval of sequence data from the web. Also, as extensive libraries are provided with the package, it is a platform to allow other scientists to develop and release software in true open source spirit. EMBOSS also integrates a range of currently available packages and tools for sequence analysis into a seamless whole.	Genobioinfo Cluster: How to use
Exonerate	A generic tool for sequence alignment.	Genobioinfo Cluster: How to use
FAMSA	Algorithm for large-scale multiple sequence alignments (400k proteins in 2 hours and 8 GB of RAM).	Genobioinfo Cluster: Ask for Install
FASTA	FASTA is a sequence similarity search tool which uses heuristics for fast local alignment searching.	Genobioinfo Cluster: How to use
FastANI	FastANI is developed for fast alignment-free computation of whole-genome Average Nucleotide Identity (ANI)	Genobioinfo Cluster: How to use
fixchr	This package selects homologous chromosomes between two genomes by comparing whole-genome alignments between them. Additionally, it generates dotplots for quick checking of the output.	Genobioinfo Cluster: How to use
fpa	Filter Pairwise Alignment	Genobioinfo Cluster: Ask for Install
gaftools	gaftools is a fast and comprehensive toolkit designed for processing pangenome alignments.	Genobioinfo Cluster: How to use
GEM-library	A set of very optimized tools for indexing/querying huge genomes/files.	Genobioinfo Cluster: How to use
GeneSeqer	Sensitive spliced-alignment of cDNAs or proteins.	Genobioinfo Cluster: Ask for Install
Genewise	Wise2 is a package focused on comparisons of biopolymers, commonly DNA sequence and protein sequence.	Genobioinfo Cluster: How to use
GenMap	Fast and Exact Computation of Genome Mappability.	Genobioinfo Cluster: How to use
GenomOrder	GenomOrder is a Nextflow pipeline that reorders and renames scaffolds from up to 5 assemblies using a reference. It can also produce D-Genies back-up files, allowing rapid visual comparison of the chromosomes of the assemblies against the reference. These files can be uploaded and visualized with the online tool D-Genies (http://dgenies.toulouse.inra.fr/). Each assembly is mapped against the reference with minimap2. The assemblies may be scaffolded or not; if they are not, an option allows them to be scaffolded according to the reference. The pipeline produces a D-Genies back-up file for a user-defined list of reference chromosomes. The chromosome file contains one reference chromosome name per line.	Genobioinfo Cluster: Ask for Install
GMAP-GSNAP	GMAP: A Genomic Mapping and Alignment Program for mRNA and EST Sequences. GSNAP: Genomic Short-read Nucleotide Alignment Program.	Genobioinfo Cluster: How to use
Goalign	Goalign is a set of command line tools to manipulate multiple alignments.	Genobioinfo Cluster: How to use
GraphAligner	Seed-and-extend program for aligning long error-prone reads to genome graphs.	Genobioinfo Cluster: How to use
GraphMap	A highly sensitive and accurate mapper for long, error-prone reads.	Genobioinfo Cluster: Ask for Install
GSAlign	An ultra-fast sequence alignment algorithm for intra-species genome comparison.	Genobioinfo Cluster: How to use
HarvestTools	HarvestTools is a utility for creating and interfacing with Gingr files, which are efficient archives that the Harvest Suite uses to store reference-compressed multi-alignments, phylogenetic trees, filtered variants and annotations.	Genobioinfo Cluster: How to use
HISAT2	HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes (as well as to a single reference genome).	Genobioinfo Cluster: How to use
i-ADHoRe	i-ADHoRe is a highly sensitive software tool to detect degenerated homology relations within and between different genomes.	Genobioinfo Cluster: How to use
JustOrthologs	A Fast, Accurate, and User-Friendly Ortholog-Finding Algorithm	Genobioinfo Cluster: Ask for Install
KMA	KMA is a mapping method designed to map raw reads directly against redundant databases, in an ultra-fast manner using seed and extend.	Genobioinfo Cluster: How to use
LAST	LAST finds similar regions between sequences.	Genobioinfo Cluster: How to use
lastp_aai	A simple Python script for calculating pairwise amino acid identity (AAI) between protein files (extension .faa)	Genobioinfo Cluster: Ask for Install
LASTZ	A tool for aligning two DNA sequences, and inferring appropriate scoring parameters automatically.	Genobioinfo Cluster: How to use
leeHom	A program for the Bayesian reconstruction of ancient DNA.	Genobioinfo Cluster: Ask for Install
MACSE	Multiple Alignment of Coding SEquences Accounting for Frameshifts and Stop Codons: a wide range of molecular analyses relies on multiple sequence alignments (MSA).	Genobioinfo Cluster: How to use
MAFFT	MAFFT is a multiple sequence alignment program for unix-like operating systems. It offers a range of multiple alignment methods, L-INS-i (accurate; for alignment of <~200 sequences), FFT-NS-2 (fast; for alignment of <~10,000 sequences), etc.	Genobioinfo Cluster: How to use
Magic-BLAST	Magic-BLAST is a tool for mapping large next-generation RNA or DNA sequencing runs against a whole genome or transcriptome.	Genobioinfo Cluster: How to use
MALT	MALT (MEGAN alignment tool) is an extension of MEGAN (metagenome analyzer). MALT performs alignment of metagenomic reads against a database of reference sequences (such as NR, GenBank or Silva) and produces a MEGAN RMA file as output. The software is currently under development.	Genobioinfo Cluster: How to use
MapSplice	Accurate mapping of RNA-seq reads for splice junction discovery.	Genobioinfo Cluster: Ask for Install
MashMap	MashMap implements a fast and approximate algorithm for computing local alignment boundaries between long DNA sequences. It can be useful for mapping genome assembly or long reads (PacBio/ONT) to reference genome(s). Given a minimum alignment length and an identity threshold for the desired local alignments, Mashmap computes alignment boundaries and identity estimates using k-mers. It does not compute the alignments explicitly, but rather estimates a k-mer based Jaccard similarity using a combination of Minimizers and MinHash. This is then converted to an estimate of sequence identity using the Mash distance.	Genobioinfo Cluster: Ask for Install
MCScanX	MCScan is an algorithm to scan multiple genomes or subgenomes to identify putative homologous chromosomal regions, then align these regions using genes as anchors.	Genobioinfo Cluster: How to use
MECAT	MECAT is an ultra-fast mapping, error correction and de novo assembly tool for single-molecule sequencing (SMRT) reads.	Genobioinfo Cluster: Ask for Install
MetaMaps	MetaMaps is a tool specifically developed for the analysis of long-read (PacBio/ONT) metagenomic datasets. It simultaneously carries out read assignment and sample composition estimation. It is faster than classical exact alignment-based approaches, and its output is more information-rich than that of kmer-spectra-based methods. For example, each MetaMaps alignment comes with an approximate alignment location, an estimated alignment identity and a mapping quality.	Genobioinfo Cluster: Ask for Install
meteor	Meteor (Metagenomic Explorator), a software for profiling metagenomic data at gene level.	Genobioinfo Cluster: How to use
Minigraph	Minigraph is a sequence-to-graph mapper and graph constructor.	Genobioinfo Cluster: How to use
Minimap	Experimental tool to find approximate mapping positions between long sequences	Genobioinfo Cluster: How to use
Miniprot	Aligning proteins to genomes with splicing and frameshift.	Genobioinfo Cluster: How to use
MixMHCpred	A pan-allele predictor of peptide binding to MHC alleles; it can also perform sequence alignment and binding motif plotting.	Genobioinfo Cluster: How to use
MMseqs2	MMseqs2 (Many-against-Many sequence searching) is a software suite to search and cluster huge protein/nucleotide sequence sets.	Genobioinfo Cluster: How to use
MOSAIK	MOSAIK is a reference-guided assembler comprising two main modular programs.	Genobioinfo Cluster: Ask for Install
Mugsy	Mugsy is a multiple whole genome aligner. Mugsy uses Nucmer for pairwise alignment, a custom graph based segmentation procedure for identifying collinear regions, and the segment-based progressive multiple alignment strategy from Seqan::TCoffee. Mugsy accepts draft genomes in the form of multi-FASTA files and does not require a reference genome. Angiuoli SV and Salzberg SL. Mugsy: Fast multiple alignment of closely related whole genomes. Bioinformatics 2011 27(3):334-4	Genobioinfo Cluster: How to use
MultAlin	Multiple sequence alignment with hierarchical clustering.	Genobioinfo Cluster: Ask for Install
MUMmer	MUMmer is a package for rapidly aligning entire genomes, whether in complete or draft form.	Genobioinfo Cluster: How to use
MUSCLE	Multiple sequence alignment (nucleic or proteic).	Genobioinfo Cluster: How to use
NCBI_Blast	Similarity search against databanks.	Genobioinfo Cluster: How to use
NCBI_Blast+	Similarity search against databanks.	Genobioinfo Cluster: How to use
NGMLR	NGMLR is a long-read mapper designed to align PacBio or Oxford Nanopore (standard and ultra-long) reads to a reference genome, with a focus on reads that span structural variations. SV detection from paired-end read mapping.	Genobioinfo Cluster: How to use
PALEOMIX	The PALEOMIX pipeline is a set of free and open-source pipelines and tools designed to enable the rapid processing of Next Generation Sequencing (NGS) data, starting from de-multiplexed reads from one or more samples, through sequence processing and alignment, and ending with genotyping, phylogenetic inference on the samples, as well as metagenomic analysis of the samples.	Genobioinfo Cluster: How to use
Paragraph	Graph realignment tools for structural variants.	Genobioinfo Cluster: How to use
Parsnp	Parsnp is a command-line-tool for efficient microbial core genome alignment and SNP detection.	Genobioinfo Cluster: How to use
PASTA	PASTA estimates alignments and ML trees from unaligned sequences using an iterative approach. In each iteration, it first estimates a multiple sequence alignment using the current tree as a guide and then estimates an ML tree on (a masked version of) the alignment.	Genobioinfo Cluster: How to use
pblat	Parallelized blat with multi-threads support.	Genobioinfo Cluster: How to use
pbmm2	A minimap2 frontend for PacBio native data format.	Genobioinfo Cluster: How to use
picard-tools	Picard comprises Java-based command-line utilities that manipulate SAM files, and a Java API (SAM-JDK) for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported.	Genobioinfo Cluster: How to use
PLAST	PLAST is a fast, accurate and NGS scalable bank-to-bank sequence similarity search tool providing significant accelerations of seeds-based heuristic comparison methods, such as the Blast suite of algorithms.	Genobioinfo Cluster: Ask for Install
PRANK	PRANK is a probabilistic multiple alignment program for DNA, codon and amino-acid sequences.	Genobioinfo Cluster: How to use
PROBCONSRNA	PROBCONS is a tool for generating multiple alignments of protein sequences.	Genobioinfo Cluster: Ask for Install
ProgressiveCactus	Progressive Cactus is a whole-genome alignment package.	Genobioinfo Cluster: Ask for Install
ProtHint	ProtHint is a pipeline for predicting and scoring hints (in the form of introns, start and stop codons) in the genome of interest by mapping and spliced aligning predicted genes to a database of reference protein sequences.	Genobioinfo Cluster: How to use
Puffaligner	Puffaligner is a fast, sensitive and accurate aligner built on top of the Pufferfish index.	Genobioinfo Cluster: Ask for Install
pysam	Pysam is a Python module for reading and manipulating SAM files. It's a lightweight wrapper of the samtools C-API.	Genobioinfo Cluster: Ask for Install
pysamstats	A Python utility for calculating statistics against genome positions based on sequence alignments from a SAM or BAM file.	Genobioinfo Cluster: How to use
qgrs-cpp	C++ implementation of QGRS mapping algorithm (QGRS Mapper is a software program that generates information on composition and distribution of putative Quadruplex forming G-Rich Sequences (QGRS) in nucleotide sequences.)	Genobioinfo Cluster: Ask for Install
Qualimap	Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.	Genobioinfo Cluster: How to use
quant3p	A set of scripts for 3' RNA-seq quantification.	Genobioinfo Cluster: How to use
read2tree	read2tree is a software tool that allows users to obtain alignment matrices for tree inference.	Genobioinfo Cluster: How to use
ROSE	To create stitched enhancers, and to separate super-enhancers from typical enhancers using sequencing data (.bam) given a file of previously identified constituent enhancers (.gff)	Genobioinfo Cluster: How to use
Salmon	Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using lightweight alignments	Genobioinfo Cluster: How to use
samblaster	samblaster is a fast and flexible program for marking duplicates in read-id grouped paired-end SAM files. It can also optionally output discordant read pairs and/or split read mappings to separate SAM files, and/or unmapped/clipped reads to a separate FASTQ file.	Genobioinfo Cluster: How to use
samclip	Filter SAM file for soft and hard clipped alignments	Genobioinfo Cluster: Ask for Install
samtools	SAM (Sequence Alignment/Map). SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format.	Genobioinfo Cluster: How to use
Satsuma	Highly sensitive whole-genome synteny alignments.	Genobioinfo Cluster: Ask for Install
SECAPR	Used to process targeted sequencing (or Gene capture) data by applying assembly and subsequent mapping algorithms (reducing paralogs unlike Hybpiper).	Genobioinfo Cluster: Ask for Install
segemehl	segemehl is a software to map short sequencer reads to reference genomes. segemehl implements a matching strategy based on enhanced suffix arrays (ESA).	Genobioinfo Cluster: Ask for Install
seqOutBias	Universal correction of enzymatic sequence bias.	Genobioinfo Cluster: How to use
seqwish	seqwish implements a lossless conversion from pairwise alignments between sequences to a variation graph encoding the sequences and their alignments.	Genobioinfo Cluster: How to use
shiver	shiver is a tool for mapping paired-end short reads to a custom reference sequence constructed using de novo assembled contigs, in order to minimise the biased loss of information that occurs from mapping to a reference that differs from the sample.	Genobioinfo Cluster: Ask for Install
SHRiMP	SHRiMP is a software package for aligning genomic reads against a target genome. It was primarily developed with the multitudinous short reads of next generation sequencing machines in mind, as well as Applied Biosystems' colourspace genomic representation.	Genobioinfo Cluster: Ask for Install
SibeliaZ	SibeliaZ is a whole-genome alignment and locally-collinear blocks construction pipeline. The blocks coordinates are output in GFF format and the alignment is in MAF.	Genobioinfo Cluster: How to use
Silix	The software package SiLiX implements an ultra-efficient algorithm for the clustering of homologous sequences, based on single transitive links (single linkage) with alignment coverage constraints.	Genobioinfo Cluster: How to use
SLR	SLR is a program to detect sites in coding DNA that are unusually conserved and/or unusually variable (that is, evolving under purifying or positive selection) by analysing the pattern of changes for an alignment of sequences on an evolutionary tree.	Genobioinfo Cluster: How to use
SMALT	SMALT aligns DNA sequencing reads with a reference genome. Reads from a wide range of sequencing platforms can be processed, for example Illumina, Roche-454, Ion Torrent, PacBio or ABI-Sanger. Paired reads are supported. There is no support for SOLiD reads.	Genobioinfo Cluster: Ask for Install
Spaln	Spaln (space-efficient spliced alignment) is a stand-alone program that maps and aligns a set of cDNA or protein sequences onto a whole genomic sequence in a single job.	Genobioinfo Cluster: How to use
SpeedSeq	A flexible framework for rapid genome analysis and interpretation.	Genobioinfo Cluster: Ask for Install
Spoa	A multiple sequence alignment tool/library that implements the POA (partial order alignment) algorithm using SIMD.	Genobioinfo Cluster: Ask for Install
spruceup	Tools to discover, visualize, and remove outlier sequences in large multiple sequence alignments.	Genobioinfo Cluster: Ask for Install
SqueezeMeta	A fully automated metagenomics pipeline, from reads to bins.	Genobioinfo Cluster: How to use
srnaMapper	This tool maps reads produced by sRNA-Seq to a genome.	Genobioinfo Cluster: Ask for Install
STAR	RNA-seq aligner	Genobioinfo Cluster: How to use
StrobeAlign	Aligns short reads using dynamic seed size with strobemers.	Genobioinfo Cluster: Ask for Install
Subread	A tool kit for processing next-gen sequencing data	Genobioinfo Cluster: How to use
Sumaclust	Fast and exact clustering of sequences.	Genobioinfo Cluster: How to use
SYNY	The SYNY pipeline investigates gene collinearity (synteny) between genomes by reconstructing clusters from conserved pairs of protein-coding genes identified from DIAMOND homology searches. It also infers collinearity from pairwise genome alignments with minimap2 or MashMap3.	Genobioinfo Cluster: How to use
T-Coffee	T-Coffee is a multiple sequence alignment package. You can use T-Coffee to align sequences or to combine the output of your favorite alignment methods (Clustal, Mafft, Probcons, Muscle...) into one unique alignment.	Genobioinfo Cluster: How to use
Tophat	TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.	Genobioinfo Cluster: How to use
Tracy	Tracy is an efficient and versatile command-line application to basecall, align, assemble and deconvolute Sanger Chromatogram trace files.	Genobioinfo Cluster: Ask for Install
trimAl	trimAl: a tool for automated alignment trimming.	Genobioinfo Cluster: How to use
uLTRA	uLTRA is a tool for splice alignment of long transcriptomic reads to a genome, guided by a database of exon annotations.	Genobioinfo Cluster: How to use
USEARCH	USEARCH is a unique sequence analysis tool with thousands of users world-wide. USEARCH offers search and clustering algorithms that are often orders of magnitude faster than BLAST.	Genobioinfo Cluster: How to use
vcf2maf	Convert a VCF into a MAF, where each variant is annotated to only one of all possible gene isoforms.	Genobioinfo Cluster: How to use
vg	Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.	Genobioinfo Cluster: How to use
VSEARCH	Versatile open-source tool for metagenomics	Genobioinfo Cluster: How to use
WFA	The wavefront alignment (WFA) algorithm is an exact gap-affine algorithm that takes advantage of homologous regions between the sequences to accelerate the alignment process.	Genobioinfo Cluster: Ask for Install
wfmash	wfmash is an aligner for pangenomes based on sparse homology mapping and wavefront inception.	Genobioinfo Cluster: Ask for Install
wgatools	A Rust library and tools for whole genome alignment files.	Genobioinfo Cluster: How to use
WGDI	WGDI (Whole-Genome Duplication Integrated analysis), a Python-based command-line tool that facilitates comprehensive analysis of recursive polyploidizations and cross-species genome alignments.	Genobioinfo Cluster: How to use
Winnowmap	Winnowmap is a long-read mapping algorithm, and a result of our exploration into superior minimizer sampling techniques.	Genobioinfo Cluster: Ask for Install
wu-blast	Similarity search against databanks, Washington University Blast. (OBSOLETE)	Genobioinfo Cluster: How to use

Single-cell

Application	Description	Availability/Use
alevin-fry	alevin-fry is an efficient and flexible tool for processing single-cell sequencing data, currently focused on single-cell transcriptomics and feature barcoding.	Genobioinfo Cluster: How to use
BLAZE	Barcode identification from Long reads for AnalyZing single-cell gene Expression. Single-cell Nanopore sequencing data analysis.	Genobioinfo Cluster: How to use
CellBender	CellBender is a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.	Genobioinfo Cluster: How to use
SAMap	Mapping single-cell RNA sequencing datasets from evolutionarily distant organisms.	Genobioinfo Cluster: How to use
Scanpy	Scanpy is a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata. It includes preprocessing, visualization, clustering, trajectory inference and differential expression testing. The Python-based implementation efficiently deals with datasets of more than one million cells.	Genobioinfo Cluster: How to use
vireoSNP	Vireo: Variational Inference for Reconstructing Ensemble Origin by expressed SNPs in multiplexed scRNA-seq data.	Genobioinfo Cluster: How to use

SNP and structural variations

Application	Description	Availability/Use
ACACIA	Allele CAlling proCedure for Illumina Amplicon sequencing data: This workflow aims at extracting allele information out of paired-end Illumina FASTQ files.	Genobioinfo Cluster: How to use
AdamaJava	The AdamaJava project holds code for variant callers and pipeline tools related to next-generation sequencing (NGS).	Genobioinfo Cluster: How to use
adegenet	R package dedicated to the exploratory analysis of genetic data. It implements a set of tools ranging from multivariate methods to spatial genetics and genome-wide SNP data analysis.	Genobioinfo Cluster: Ask for Install
ALFATClust	ALignment-Free Adaptive Threshold Clustering: Biological sequence clustering tool with dynamic threshold for individual clusters. Suitable for clustering multiple groups of homologous sequences.	Genobioinfo Cluster: How to use
AlleleSeq	A pipeline that constructs a diploid personal genome from genomic sequence variants of a family trio, including SNPs, indels and structural variants, and maps functional genomic data onto this personal genome.	Genobioinfo Cluster: Ask for Install
ANGSD	ANGSD is a software for analyzing next generation sequencing data. The software can handle a number of different input types from mapped reads to imputed genotype probabilities.	Genobioinfo Cluster: How to use
AnnotSV	An integrated tool for Structural Variations annotation and ranking.	Genobioinfo Cluster: How to use
ANNOVAR	ANNOVAR is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes (including human genome hg18, hg19, hg38, as well as mouse, worm, fly, yeast and many others).	Genobioinfo Cluster: Ask for Install
Aquila	Diploid personal genome assembly and comprehensive variant detection based on linked-reads.	Genobioinfo Cluster: Ask for Install
Assemblytics	Assemblytics is a bioinformatics tool to detect and analyze structural variants from a genome assembly by comparing it to a reference genome.	Genobioinfo Cluster: Ask for Install
ATLAS	ATLAS stands for Analysis Tools for Low-coverage and Ancient Samples. These tools cover all programs necessary to obtain variant calls, estimates of heterozygosity and more from a BAM file. There are sequence data processing tools, diagnostic tools, and variant discovery tools, similar to GATK by the Broad Institute.	Genobioinfo Cluster: How to use
BayesAss3-SNPs	Modification of BayesAss 3.0.4 to allow handling of large SNP datasets.	Genobioinfo Cluster: How to use
BCFtools	utilities for variant calling and manipulating VCFs and BCFs.	Genobioinfo Cluster: How to use
Beagle	BEAGLE is a state of the art software package for analysis of large-scale genetic data sets with hundreds of thousands of markers genotyped on thousands of samples.	Genobioinfo Cluster: How to use
BISER		Genobioinfo Cluster: How to use
BisSNP	Accurate combined SNP/Methylation calling.	Genobioinfo Cluster: Ask for Install
BreakDancer	SV detection from paired end reads mapping.	Genobioinfo Cluster: How to use
breseq	breseq is a computational pipeline for finding mutations relative to a reference sequence in short-read DNA re-sequencing data for haploid microbial-sized genomes.	Genobioinfo Cluster: How to use
cgMLSTFinder	Core genome Multi-Locus Sequence Typing. cgMLSTFinder runs KMA against a chosen core genome MLST (cgMLST) database and outputs the detected alleles in a matrix file.	Genobioinfo Cluster: Ask for Install
Clair3	Clair3 is a germline small variant caller for long-reads. Clair3 makes the best of two major method categories: pileup calling handles most variant candidates with speed, and full-alignment tackles complicated candidates to maximize precision and recall. Clair3 runs fast and has superior performance, especially at lower coverage. Clair3 is simple and modular for easy deployment and integration.	Genobioinfo Cluster: How to use
ClinSV		Genobioinfo Cluster: How to use
cnD	cnD is a program to detect copy number variants from short-read sequence data.	Genobioinfo Cluster: How to use
CNVkit	CNVkit is a Python library and command-line software toolkit to infer and visualize copy number variants and alterations genome-wide from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent. Read the full documentation at: http://cnvkit.readthedocs.io	Genobioinfo Cluster: How to use
CNVnator	A tool for CNV discovery and genotyping from depth of read mapping.	Genobioinfo Cluster: Ask for Install
Cnvpipelines	A pipeline to detect copy number variations (CNV) on several samples.	Genobioinfo Cluster: Ask for Install
Consensify	Consensify is a method for generating a consensus pseudohaploid genome sequence with greatly reduced error rates compared to standard pseudohaploidisation.	Genobioinfo Cluster: Ask for Install
CRISP	CRISP is a software program to detect SNPs and short indels from pooled sequencing data.	Genobioinfo Cluster: Ask for Install
cuteSV	Long read based human genomic structural variation detection with cuteSV.	Genobioinfo Cluster: How to use
cyvcf2	cyvcf2 is a cython wrapper around htslib built for fast parsing of Variant Call Format (VCF) files.	Genobioinfo Cluster: How to use
DeepVariant	DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.	Genobioinfo Cluster: How to use
Delly	DELLY is an integrated structural variant prediction method that can detect deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data. It uses paired-ends and split-reads to sensitively and accurately delineate genomic rearrangements throughout the genome.	Genobioinfo Cluster: How to use
detettore	A program to detect transposable element polymorphisms	Genobioinfo Cluster: Ask for Install
Discovar	Assemble genomes and find variants with DISCOVAR & DISCOVAR de novo	Genobioinfo Cluster: Ask for Install
dysgu	dysgu-SV is a collection of tools for calling structural variants using short or long reads.	Genobioinfo Cluster: How to use
elPrep	elPrep is a high-performance tool for analyzing .sam/.bam files (up to and including variant calling) in sequencing pipelines.	Genobioinfo Cluster: How to use
Ensembl-VEP	VEP determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.	Genobioinfo Cluster: How to use
eva-sub-cli	The eva-sub-cli tool is a command line interface tool for data validation and upload.	Genobioinfo Cluster: How to use
EVE	EVE is a set of protein-specific models providing for any single amino acid mutation of interest a score reflecting the propensity of the resulting protein to be pathogenic.	Genobioinfo Cluster: How to use
FLAMES	Full-length transcriptome splicing and mutation analysis.	Genobioinfo Cluster: How to use
FreeBayes	FreeBayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment.	Genobioinfo Cluster: How to use
GATK	The GATK is, first, a structured software library that makes it easy to write efficient analysis tools using next-generation sequencing data, and second, a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include depth of coverage analyzers, a quality score recalibrator, a SNP/indel caller and a local realigner.	Genobioinfo Cluster: How to use
gemBS	gemBS is a high performance bioinformatic pipeline designed for high-throughput analysis of DNA methylation data from whole genome bisulfite sequencing data (WGBS). It combines GEM3, a high performance read aligner and bs_call, a high performance variant and methylation caller, into a streamlined and efficient pipeline for bisulfite sequence analysis.	Genobioinfo Cluster: How to use
GenomeSTRiP	Genome STRiP (Genome STRucture In Populations) is a suite of tools for discovering and genotyping structural variations using sequencing data. The methods are designed to detect shared variation using data from multiple individuals.	Genobioinfo Cluster: Ask for Install
gIMble	A genome-wide IM blockwise likelihood estimation toolkit	Genobioinfo Cluster: Ask for Install
graphtyper	graphtyper is a graph-based variant caller capable of genotyping population-scale short read data sets.	Genobioinfo Cluster: How to use
GRIDSS	GRIDSS is a modular software suite containing tools useful for the detection of genomic rearrangements.	Genobioinfo Cluster: How to use
Gtools	GTOOL is a program for transforming sets of genotype data for use with the programs SNPTEST and IMPUTE.	Genobioinfo Cluster: Ask for Install
HapCUT2	Software tools for haplotype assembly from sequence data	Genobioinfo Cluster: How to use
HarvestTools	HarvestTools is a utility for creating and interfacing with Gingr files, which are efficient archives that the Harvest Suite uses to store reference-compressed multi-alignments, phylogenetic trees, filtered variants and annotations.	Genobioinfo Cluster: How to use
HGTector	Genome-wide prediction of horizontal gene transfer based on distribution of sequence homology patterns.	Genobioinfo Cluster: Ask for Install
Iris	A module which corrects the sequences of structural variant calls (currently only insertions).	Genobioinfo Cluster: How to use
IRMA	IRMA was designed for the robust assembly, variant calling, and phasing of highly variable RNA viruses. Currently IRMA is deployed with modules for influenza and ebolavirus. IRMA is free to use and parallelizes computations for both cluster computing and single computer multi-core setups.	Genobioinfo Cluster: How to use
iVar	iVar is a computational package that contains functions broadly useful for viral amplicon-based sequencing. Additional tools for metagenomic sequencing are actively being incorporated into iVar. While each of these functions can be accomplished using existing tools, iVar contains an intersection of functionality from multiple tools that are required to call iSNVs and consensus sequences from viral sequencing data across multiple replicates. We implemented the following functions in iVar: (1) trimming of primers and low-quality bases, (2) consensus calling, (3) variant calling - both iSNVs and insertions/deletions, and (4) identifying mismatches to primer sequences and excluding the corresponding reads from alignment files.	Genobioinfo Cluster: How to use
Jasmine	JASMINE: Jointly Accurate Sv Merging with Intersample Network Edges. This tool is used to merge structural variants (SVs) across samples. Each sample has a number of SV calls, consisting of position information (chromosome, start, end, length), type and strand information, and a number of other values. Jasmine represents the set of all SVs across samples as a network, and uses a modified minimum spanning forest algorithm to determine the best way of merging the variants such that each merged variant represents a set of analogous variants occurring in different samples. Manual: Jasmine User Manual (mkirsche/Jasmine wiki on GitHub). Jasmine also includes a module for automating the creation of IGV screenshots of variants of interest.	Genobioinfo Cluster: How to use
Jvarkit	Java utilities for Bioinformatics (only requested tools are compiled).	Genobioinfo Cluster: Ask for Install
KING	KING is a toolset that makes use of high-throughput SNP data typically seen in a genome-wide association study (GWAS) or a sequencing project.	Genobioinfo Cluster: How to use
kSNP4	kSNP4 identifies the pan-genome SNPs in a set of genome sequences, and estimates phylogenetic trees based upon those SNPs.	Genobioinfo Cluster: How to use
loco-pipe	loco-pipe is an automated Snakemake pipeline that streamlines a set of essential population genomic analyses for low-coverage whole genome sequencing (lcWGS) data.	Genobioinfo Cluster: How to use
LoFreq	A sequence-quality aware, ultra-sensitive variant caller for NGS data.	Genobioinfo Cluster: How to use
LongRanger	Long Ranger is a set of analysis pipelines that processes Chromium sequencing output to align reads and call and phase SNPs, indels, and structural variants.	Genobioinfo Cluster: How to use
longshot	Longshot is a variant calling tool for diploid genomes using long error prone reads such as Pacific Biosciences (PacBio) SMRT and Oxford Nanopore Technologies (ONT).	Genobioinfo Cluster: Ask for Install
LUMPY	A general probabilistic framework for structural variant discovery	Genobioinfo Cluster: How to use
Manta	Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads.	Genobioinfo Cluster: How to use
MapThin	Reduce the number of SNPs in a gene marker dense map computed by PLINK. First, by eliminating linked SNPs. Then, by applying different criteria.	Genobioinfo Cluster: Ask for Install
Merfin	Evaluates variant calls and their combinations using k-mer multiplicity.	Genobioinfo Cluster: Ask for Install
mity	A highly sensitive mitochondrial variant analysis pipeline for whole genome sequencing data	Genobioinfo Cluster: How to use
Mobster	Mobster is used to detect novel (non-reference) Mobile Element Insertion (MEI) events in BAM files and uses both a discordant read pair method and a split-read method.	Genobioinfo Cluster: Ask for Install
NanoCaller	NanoCaller is a computational method that integrates long reads in deep convolutional neural network for the detection of SNPs/indels from long-read sequencing data.	Genobioinfo Cluster: How to use
PALEOMIX	The PALEOMIX pipeline is a set of free and open-source pipelines and tools designed to enable the rapid processing of Next Generation Sequencing (NGS) data, starting from de-multiplexed reads from one or more samples, through sequence processing and alignment, and ending with genotyping, phylogenetic inference on the samples, as well as metagenomic analysis of the samples.	Genobioinfo Cluster: How to use
PanGenie	A short-read genotyper for various types of genetic variants (such as SNPs, indels and structural variants) represented in a pangenome graph.	Genobioinfo Cluster: How to use
Paragraph	Graph realignment tools for structural variants.	Genobioinfo Cluster: How to use
Parsnp	Parsnp is a command-line-tool for efficient microbial core genome alignment and SNP detection.	Genobioinfo Cluster: How to use
PCAngsd	Framework for analyzing low depth next-generation sequencing (NGS) data in heterogeneous populations using principal component analysis (PCA).	Genobioinfo Cluster: How to use
pggb	Pangenome graph builder.	Genobioinfo Cluster: How to use
Pilon	Pilon is an automated genome assembly improvement and variant detection tool.	Genobioinfo Cluster: How to use
Pindel	Pindel can detect breakpoints of large deletions, medium sized insertions, inversions, tandem duplications and other structural variants at single-base resolution from next-gen sequence data. It uses a pattern growth approach to identify the breakpoints of these variants from paired-end short reads.	Genobioinfo Cluster: How to use
ploidyNGS	A model-free, open source tool to visualize and explore ploidy levels in a newly sequenced genome, exploiting short read data.	Genobioinfo Cluster: Ask for Install
popins	Population-scale detection of novel-sequence insertions.	Genobioinfo Cluster: Ask for Install
PopIns2	Population-scale detection of non-reference sequence variants using colored de Bruijn Graphs.	Genobioinfo Cluster: Ask for Install
RAiSD	RAiSD (Raised Accuracy in Sweep Detection) is a stand-alone software implementation of the μ statistic for selective sweep detection.	Genobioinfo Cluster: How to use
RDXplorer	The RDXplorer (Read Depth eXplorer) is a computational tool for copy number variants (CNV) detection in whole human genome sequence data using read depth (RD) coverage.	Genobioinfo Cluster: How to use
REFMAKER	REFMAKER is a command-line and user-friendly pipeline providing different tools to create nuclear references from genomic assemblies of shotgun libraries.	Genobioinfo Cluster: How to use
RegTools	RegTools is a set of tools that integrate DNA-seq and RNA-seq data to help interpret mutations in a regulatory and splicing context.	Genobioinfo Cluster: Ask for Install
ROHan	ROHan is a Bayesian framework to estimate local rates of heterozygosity, infer runs of homozygosity (ROH) and compute global rates of heterozygosity.	Genobioinfo Cluster: How to use
RTGTools	RTG Tools is a subset of RTG Core that includes several useful utilities for dealing with VCF files and sequence data. Probably the most interesting is the `vcfeval` command which performs sophisticated comparison of VCF files.	Genobioinfo Cluster: How to use
SEDEF	SEDEF is a quick tool to find all segmental duplications in the genome.	Genobioinfo Cluster: Ask for Install
SequenceTools	Tools for population genetics on sequencing data.	Genobioinfo Cluster: How to use
SHAPEIT	SHAPEIT is a fast and accurate method for estimation of haplotypes (aka phasing) from genotype or sequencing data.	Genobioinfo Cluster: How to use
SIFT4G	Sorting Intolerant From Tolerant For Genomes.	Genobioinfo Cluster: How to use
SIFT4G_Annotator	Annotating VCF files using the SIFT4G databases.	Genobioinfo Cluster: Ask for Install
snape-pooled	SNAPE-pooled computes the probability distribution for the frequency of the minor allele in a certain population, at a certain position in the genome.	Genobioinfo Cluster: Ask for Install
Sniffles	A fast structural variant caller for long-read sequencing. Sniffles2 accurately detects SVs at germline, somatic and population levels for PacBio and Oxford Nanopore read data.	Genobioinfo Cluster: How to use
Snippy	Rapid haploid variant calling and core genome alignment. Snippy finds SNPs between a haploid reference genome and your NGS sequence reads. It will find both substitutions (snps) and insertions/deletions (indels). It will use as many CPUs as you can give it on a single computer (tested to 64 cores). It is designed with speed in mind, and produces a consistent set of output files in a single folder. It can then take a set of Snippy results using the same reference and generate a core SNP alignment (and ultimately a phylogenomic tree).	Genobioinfo Cluster: How to use
SNP-sites	Rapidly extracts SNPs from a multi-FASTA alignment.	Genobioinfo Cluster: How to use
snpArcher	snpArcher is a reproducible workflow optimized for nonmodel organisms and comparisons across datasets, built on the Snakemake workflow management system, for dataset acquisition, variant calling, quality control, and downstream analysis.	Genobioinfo Cluster: How to use
SnpEff	SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of variants on genes (such as amino acid changes).	Genobioinfo Cluster: How to use
SNPGenie	SNPGenie is a collection of Perl scripts for estimating πN/πS, dN/dS, and gene diversity from next-generation sequencing (NGS) single-nucleotide polymorphism (SNP) variant data.	Genobioinfo Cluster: Ask for Install
SNPhylo	a pipeline to generate a phylogenetic tree from huge SNP data	Genobioinfo Cluster: How to use
SNPsplit	SNPsplit is an allele-specific alignment sorter which is designed to read alignment files in SAM/BAM format and determine the allelic origin of reads that cover known SNP positions.	Genobioinfo Cluster: How to use
SpeedSeq	A flexible framework for rapid genome analysis and interpretation.	Genobioinfo Cluster: Ask for Install
STAR-Fusion	STAR-Fusion further processes the output generated by the STAR aligner to map junction reads and spanning reads to a reference annotation set (using a GTF file, ideally the same annotation file used during the STAR genome index building process during the initial STAR setup).	Genobioinfo Cluster: How to use
Strelka	Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs.	Genobioinfo Cluster: How to use
Subread	A tool kit for processing next-gen sequencing data	Genobioinfo Cluster: How to use
SURVIVOR	SURVIVOR is a tool set for simulating/evaluating SVs, merging and comparing SVs within and among samples, and includes various methods to reformat or summarize SVs.	Genobioinfo Cluster: How to use
SvABA	Structural variation and indel detection by local assembly	Genobioinfo Cluster: How to use
SVanalyzer	Tools for the analysis of structural variation in genomes.	Genobioinfo Cluster: How to use
SVDetect	A tool to detect genomic structural variations from paired-end and mate-pair sequencing data.	Genobioinfo Cluster: Ask for Install
SVIM	SVIM is a structural variant caller for long reads. It is able to detect, classify and genotype five different classes of structural variants.	Genobioinfo Cluster: How to use
SVIM-asm	SVIM-asm (pronounced SWIM-assem) is a structural variant caller for haploid or diploid genome-genome alignments.	Genobioinfo Cluster: How to use
svimmer	Merges similar SVs from multiple single sample VCF files. The tool was written for merging SVs discovered using Manta calls, but should support (almost) any SV VCFs.	Genobioinfo Cluster: Ask for Install
SVJedi	SVJedi is a structural variation (SV) genotyper for long read data.	Genobioinfo Cluster: Ask for Install
SVJedi-graph	SVJedi-graph is a structural variation (SV) genotyper for long read data.	Genobioinfo Cluster: How to use
SVMerge	A pipeline to detect structural variants (SVs) by integrating calls from several existing SV callers, which are then validated and the breakpoints refined using local de novo assembly.	Genobioinfo Cluster: How to use
svtyper	Bayesian genotyper for structural variants.	Genobioinfo Cluster: How to use
SyRI	SyRI is a comprehensive tool for predicting genomic differences between related genomes using whole-genome assemblies (WGA).	Genobioinfo Cluster: How to use
T1K	T1K (The ONE genotyper for Kir and HLA) is a computational tool to infer the alleles for the polymorphic genes such as KIR and HLA. T1K calculates the allele abundances based on the RNA-seq/WES/WGS read alignments on the provided allele reference sequences. The abundances are used to pick the true alleles for each gene. T1K provides the post analysis steps, including novel SNP detection and single-cell representation. T1K supports both single-end and paired-end sequencing data with any read length.	Genobioinfo Cluster: How to use
TARDIS	Toolkit for automated and rapid discovery of structural variants.	Genobioinfo Cluster: Ask for Install
TEnest	TEnest is a tool for finding and annotating transposable element (TE) insertions.	Genobioinfo Cluster: How to use
Tracy	Tracy is an efficient and versatile command-line application to basecall, align, assemble and deconvolute Sanger Chromatogram trace files.	Genobioinfo Cluster: Ask for Install
Truvari	Structural variant comparison tool for VCFs	Genobioinfo Cluster: How to use
VarDict	VarDict is an ultra sensitive variant caller for both single and paired sample variant calling from BAM files.	Genobioinfo Cluster: Ask for Install
Variabel	A novel approach and method for intrahost variant detection, which outperforms existing ONT variant callers.	Genobioinfo Cluster: Ask for Install
VarScan	VarScan is a platform-independent software tool developed at the Genome Institute at Washington University to detect variants in NGS data.	Genobioinfo Cluster: How to use
VarTrix	VarTrix is a software tool for extracting single cell variant information from 10x Genomics single cell data.	Genobioinfo Cluster: Ask for Install
vawk	An awk-like VCF parser	Genobioinfo Cluster: Ask for Install
VCF-kit	Assorted utilities for the variant call format.	Genobioinfo Cluster: Ask for Install
vcf2maf	Convert a VCF into a MAF, where each variant is annotated to only one of all possible gene isoforms.	Genobioinfo Cluster: How to use
vcfbub	Popping bubbles in vg deconstruct VCFs.	Genobioinfo Cluster: How to use
vcflib	C++ library and cmdline tools for parsing and manipulating VCF files.	Genobioinfo Cluster: How to use
Vcfstats	Vcfstats is a tool that can generate metrics from a vcf file.	Genobioinfo Cluster: How to use
vg	Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.	Genobioinfo Cluster: How to use
Vt	A tool set for short variant discovery in genetic sequence data.	Genobioinfo Cluster: How to use
WhatsHap	WhatsHap is a software for phasing genomic variants using DNA sequencing reads, also called read-based phasing or haplotype assembly. It is especially suitable for long reads, but also works well with short reads.	Genobioinfo Cluster: How to use

Systems biology

Application	Description	Availability/Use
gapseq	Informed prediction and analysis of bacterial metabolic pathways and genome-scale networks.	Genobioinfo Cluster: How to use
goatools	Python library to handle Gene Ontology (GO) terms.	Genobioinfo Cluster: How to use
MiMiC2	MiMiC2 is a bioinformatic pipeline for the selection of a few microbial genomes that functionally represent an entire ecosystem, termed a synthetic community (SynCom).	Genobioinfo Cluster: How to use

Transcription factors

Application	Description	Availability/Use
PromoTech	Promotech is a machine-learning-based classifier trained to generate a model that generalizes and detects promoters in a wide range of bacterial species.	Genobioinfo Cluster: How to use

Visualization

Application	Description	Availability/Use
bam2plot	Make coverage plots from bam files.	Genobioinfo Cluster: How to use
CVIT	Chromosome Viewing Tool. A collection of Perl scripts that enable quick visualizations of features on linkage groups, pseudochromosomes or cytogenetic maps.	Genobioinfo Cluster: How to use
g3d	Genomics 3D visualizer tool sets. `g3d` is a binary file format for storing genomic 3D structure data, `g3d` is short for genomic 3D format.	Genobioinfo Cluster: How to use
GCI	Genome Continuity Inspector (GCI) is an assembly assessment tool for high-quality genomes (e.g. T2T genomes), in base resolution.	Genobioinfo Cluster: How to use
GfaViz	Graphical interactive tool for the visualization of sequence graphs in GFA format.	Genobioinfo Cluster: How to use
GrapeTree	GrapeTree is a fully interactive, tree visualization program within EnteroBase, which supports facile manipulations of both tree layout and metadata.	Genobioinfo Cluster: How to use
Haplostrips	Haplostrips produce plots that depict variants in a genomic window among different samples. Visualize similarities between haplotypes with respect to a reference haplotype through haplotype clustering and sorting, useful for revealing hidden population structure.	Genobioinfo Cluster: How to use
HarvestTools	HarvestTools is a utility for creating and interfacing with Gingr files, which are efficient archives that the Harvest Suite uses to store reference-compressed multi-alignments, phylogenetic trees, filtered variants and annotations.	Genobioinfo Cluster: How to use
HyPhy	HyPhy is an open-source software package for the analysis of genetic sequences using techniques in phylogenetics, molecular evolution, and machine learning. It features a complete graphical user interface (GUI) and a rich scripting language for limitless customization of analyses.	Genobioinfo Cluster: How to use
iFeature	iFeature is a comprehensive Python-based toolkit for generating various numerical feature representation schemes from protein or peptide sequences. Installed together with the SPAAN model: https://github.com/nicolagulmini/spaan	Genobioinfo Cluster: How to use
Infomap	Multi-level network clustering based on the Map Equation.	Genobioinfo Cluster: Ask for Install
Jasmine	JASMINE: Jointly Accurate Sv Merging with Intersample Network Edges. This tool is used to merge structural variants (SVs) across samples. Each sample has a number of SV calls, consisting of position information (chromosome, start, end, length), type and strand information, and a number of other values. Jasmine represents the set of all SVs across samples as a network, and uses a modified minimum spanning forest algorithm to determine the best way of merging the variants such that each merged variant represents a set of analogous variants occurring in different samples. Manual: Jasmine User Manual (mkirsche/Jasmine wiki on GitHub). Jasmine also includes a module for automating the creation of IGV screenshots of variants of interest.	Genobioinfo Cluster: How to use
MAGeCK	Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout (MAGeCK) is a computational tool to identify important genes from genome-scale CRISPR-Cas9 knockout screens. MAGeCK-VISPR is a comprehensive quality control, analysis and visualization workflow for CRISPR/Cas9 screens. MAGeCKFlute (R package): integrative analysis pipeline for pooled CRISPR functional genetic screens. Manual and video tutorials: liulab/mageck-vispr on Bitbucket.	Genobioinfo Cluster: How to use
Met4j	Met4J is an open-source Java library dedicated to the structural analysis of metabolic networks. It also comes with a toolbox of command-line tools for several analyses relevant to metabolism-related research.	Genobioinfo Cluster: Ask for Install
plotsr	Tool to plot synteny and structural rearrangements between genomes.	Genobioinfo Cluster: How to use
PretextView	OpenGL Powered Pretext Contact Map Viewer.	Genobioinfo Cluster: How to use
proteinortho_curves	Draw pan- and core-genome curves from proteinortho output	Genobioinfo Cluster: How to use
REViewer	A tool for visualizing alignments of reads in regions containing tandem repeats	Genobioinfo Cluster: How to use
TBtools-II	GUI/command-line toolbox for biologists to utilize NGS data.	Genobioinfo Cluster: How to use
viu	A small command-line application to view images from the terminal written in Rust.	Genobioinfo Cluster: How to use

Workflow

Application	Description	Availability/Use
chimerascan	chimerascan is a software package that detects gene fusions in paired-end RNA sequencing (RNA-Seq) datasets.	Genobioinfo Cluster: Ask for Install
CulebrONT	An open-source, scalable, modular and traceable Snakemake pipeline, able to launch multiple assembly tools in parallel, with options to circularise, polish, and correct assemblies and to check their quality. CulebrONT can help choose the best assembly among all the candidates.	Genobioinfo Cluster: Ask for Install
FROGS	FROGS is a CLI workflow designed to produce an OTU count matrix from high depth sequencing amplicon data.	Genobioinfo Cluster: How to use
MAGeCK	Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout (MAGeCK) is a computational tool to identify important genes from genome-scale CRISPR-Cas9 knockout screens. MAGeCK-VISPR is a comprehensive quality control, analysis and visualization workflow for CRISPR/Cas9 screens. MAGeCKFlute (R package): integrative analysis pipeline for pooled CRISPR functional genetic screens. Manual and video tutorials: liulab/mageck-vispr on Bitbucket.	Genobioinfo Cluster: How to use
MitoFinder	MitoFinder is a pipeline to assemble mitochondrial genomes and annotate mitochondrial genes from trimmed sequencing reads. MitoFinder is also designed to find and annotate mitochondrial sequences in existing genomic assemblies (generated from HiFi/PacBio/Nanopore/Illumina sequencing data...)	Genobioinfo Cluster: How to use
nf-core workflows	This module provides access to nf-core workflows, which are automatically downloaded into your home directory. More info on the nf-core/config page.	Genobioinfo Cluster: How to use
RepeatExplorer	RepeatExplorer is a computational pipeline designed to identify and characterize repetitive DNA elements in next-generation sequencing data from plant and animal genomes.	Genobioinfo Cluster: Ask for Install
snakePipes	Customizable workflows based on Snakemake and Python for the analysis of NGS data.	Genobioinfo Cluster: Ask for Install