RNAseq alignment, quantification and transcript discovery with statistics

Dates: 14/05/2024 - 17/05/2024

The Toulouse Genotoul bioinformatics platform, in collaboration with the Genotoul Biostatistics platform and the MIAT unit, organizes a 3.5-day training course for bioinformaticians and biologists who want to learn sequence analysis. It focuses on (protein-coding) gene expression analysis using reads produced by ‘RNA-Seq’. This training session is designed to introduce sequences from ‘NGS’ (Next Generation Sequencing), particularly Illumina platforms (HiSeq). You will discover the standard file formats, learn about the usual biases of this type of data and run different kinds of analyses, such as spliced alignment to a reference genome, novel gene and transcript discovery, and expression quantification of coding genes and transcripts. Finally, you will be able to extract the differentially expressed genes.



This training focuses on practice. It consists of modules with a large variety of exercises, listed below (PROVISIONAL SCHEDULE):

  • Introduction (Day 1): What will my experimental design be? What is gene expression? What kinds of technology can be used to monitor gene expression? What do the reads produced by NGS platforms (Illumina) using the RNA-Seq protocol look like? What are the known biases of these sequences? Presentation of the dataset for the practical exercises.
  • Sequence quality (Day 1).
  • Sequence cleaning (Day 1).
  • Spliced alignment of reads to a reference genome; visualizing alignments and splice sites using IGV (Integrative Genomics Viewer) (Day 1).
  • Raw count vs. abundance estimate (Day 2).
  • Discovering novel genes and transcripts Part 1 (Day 2).
  • Comparison of models, visualization and results of gene expression quantification and conclusions (Day 2).
  • Statistics: Exploratory analysis of count data and normalisation (Day 3).
  • Statistics: Differential expression analysis (morning of Day 4).


The session will take place in the room ‘salle de formation MIAT’ at the INRAE center of Toulouse-Auzeville, Building C8.


Prerequisites: ability to use a Linux and cluster environment, and basic knowledge of R.
You can check the available R training sessions on the Biostat platform and the Unix and cluster sessions on the Bioinfo platform.

For self-training, we will try to list available resources here: https://perso.math.univ-toulouse.fr/dejean/files/2020/12/intro_R.pdf


Bioinfo part: Material.

Biostat part: Material.


Bookings: RNAseq alignment, quantification and transcript discovery with statistics

This event is fully booked.