High-throughput sequencing is a powerful tool for rapidly obtaining information about nucleic acids. It can increase our understanding of the biology of a cell and help assess changes that may indicate disease. Researchers use high-throughput sequencing to survey, characterize, and quantify gene expression in a biological sample (RNA-seq). Subsequent differential expression (statistical) analysis of RNA-seq results can quantify the changes observed under different biological conditions (e.g., drought, heat) and allow us to investigate their impact. However, interpreting the data often relies on bioinformaticians with highly specialized skills, and this can slow the analysis down.
Eager to help, we developed the 3D RNA-seq App, a web-based graphical user interface that enables flexible, rapid and accurate differential expression analysis. The full analysis can be done within hours, even without extensive bioinformatics experience. Furthermore, in November 2020, an online workshop funded by the SEFARI Gateway Responsive Opportunity Fund provided training for SEFARI colleagues and researchers at the Universities of Dundee, Aberdeen, and Birmingham, as well as the Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) in Germany.
Alternative splicing is an essential mechanism in eukaryotes (such as plants, animals and fungi) that has greatly increased the diversity of proteins encoded by the genome (our genetic instruction book). It allows a single gene to be copied (transcribed) into multiple messenger ribonucleic acids (mRNAs), known as transcripts, by alternatively splicing out (removing) the non-coding regions of the gene (Figure 1). This process leads to different protein variants that may have different functions and properties important for the cell’s biology. Ribonucleic acid sequencing, or RNA-seq (a technique to reveal the presence and quantity of RNA in a biological sample), has been widely used to study changes in the abundance of these transcripts, which can be caused by stimuli such as drought, high salinity and heat stress.
Figure 1: Schema of alternative splicing. A gene includes coding sequences (exons) and non-coding sequences (introns). The gene expression mechanism (the process by which information from a gene is used in the synthesis of a functional gene product) alternatively splices out (removes) the introns to generate different transcripts (detailed records), which are sub-sequences of the gene (derived by deleting some sections of the original sequence without changing their order). Transcript expression can be quantified by counting the number of RNA-seq sequence fragments (reads) that align to the corresponding transcript sequences. Gene expression is estimated as the sum of the expression of all of the gene’s transcripts.
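As a rough illustration of the last point in the Figure 1 caption, the short Python sketch below sums transcript-level read counts into a gene-level estimate. The transcript names, the read counts and the `gene_expression` helper are all hypothetical and chosen only for illustration; this is not the 3D RNA-seq App's own code.

```python
from collections import defaultdict

# Hypothetical read counts per transcript (illustrative values only).
transcript_counts = {
    "geneA.1": 120,
    "geneA.2": 30,
    "geneB.1": 75,
}

# Hypothetical mapping from each transcript to the gene it belongs to.
transcript_to_gene = {
    "geneA.1": "geneA",
    "geneA.2": "geneA",
    "geneB.1": "geneB",
}


def gene_expression(counts, mapping):
    """Estimate gene expression as the sum of its transcripts' counts."""
    totals = defaultdict(int)
    for transcript, count in counts.items():
        totals[mapping[transcript]] += count
    return dict(totals)


gene_counts = gene_expression(transcript_counts, transcript_to_gene)
print(gene_counts)
```

Here geneA has two transcripts, so its estimate is the sum of both, while geneB has a single transcript and keeps its count unchanged.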
Many RNA-seq analysis programs are unable to handle the complex experimental designs of multivariate studies (such as time-series and developmental-series data), are error-prone, require dedicated bioinformaticians and currently take months to produce results. Despite the importance of alternative splicing in the regulation of gene expression and protein diversity at the transcript level, most RNA-seq analysis pipelines focus only on gene expression changes, thus ignoring an essential level of information relevant to the reprogramming of expression in response to stimuli. RNA-seq, despite its potential for accurate and rapid expression analysis at the transcript level, is often a source of frustration for biologists due to a lack of easy-to-use and accessible tools to analyse the data.
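To see why gene-level analysis alone can miss information, consider a toy case in which two transcripts of one gene swap abundance between conditions. The Python sketch below uses made-up counts and a simple log2 fold change with a pseudocount; it is only an assumption-laden illustration and not the statistical method used by the 3D RNA-seq App.

```python
import math


def log2_fold_change(before, after, pseudocount=1.0):
    """Log2 ratio of two counts; the pseudocount avoids division by zero."""
    return math.log2((after + pseudocount) / (before + pseudocount))


# Hypothetical read counts for two transcripts of the same gene.
control = {"geneA.1": 90, "geneA.2": 10}
stress = {"geneA.1": 10, "geneA.2": 90}

# Gene level: the transcript counts are summed before comparing conditions.
gene_lfc = log2_fold_change(sum(control.values()), sum(stress.values()))

# Transcript level: each transcript is compared on its own.
transcript_lfc = {
    name: log2_fold_change(control[name], stress[name]) for name in control
}

print(f"gene-level log2 fold change: {gene_lfc:.2f}")
for name, value in transcript_lfc.items():
    print(f"{name} log2 fold change: {value:.2f}")
```

The gene total is 100 reads in both conditions, so the gene-level fold change is zero, yet each transcript changes roughly eight-fold in opposite directions. A gene-only pipeline would report no change for this gene.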
In response, we have created a 3D RNA-seq App that integrates state-of-the-art, highly rated differential analysis tools and adopts the best practice for RNA-seq analysis. It employs various user-controllable steps of data pre-processing to optimize the analysis and visualizes the intermediate and final results through graphics and tables (Figure 2). The App also generates a publication-quality report at the end of the analysis with all the methods and results embedded.
Figure 2: 3D RNA-seq App web interface
Through support from the SEFARI Gateway Responsive Opportunity Fund, we then designed and delivered a 3D RNA-seq workshop to train participants in the effective use of the 3D RNA-seq App. By using participants’ own experiments (or publicly available datasets), the workshop demonstrated how the App applies to their own work and empowered researchers to perform complex differential expression analyses while producing results in a fast, robust and reproducible way.
To familiarize researchers with the entire analysis pipeline, the online 3D RNA-seq workshop (Figure 3) was held across two days. The main lecture session was held first and comprised an introduction to transcriptomics and a demo of 3D analysis. Follow-up one-to-one sessions were also arranged, so that participants could discuss their own data in more detail.
The workshop was well attended by a multi-disciplinary range of participants. Their research areas covered a wide range of topics and species, including pathogen resistance in potatoes, genetic studies (the relationship between genotype and phenotype) of leaf shape, the cattle gut system, viral diseases of ruminant livestock, the response of the human intestinal epithelium to chemical exposures, and gut bacteria in humans.
Figure 3: Screenshot of the online 3D RNA-seq workshop.
After the workshop a participant survey provided us with feedback. Comments were very positive in terms of the workshop’s organisation, training contents, and the skill and responsiveness of the instructors. For example, the participants commented:
“The 3D RNA-seq is very useful. Not only this is easy to handle, but it also shows the standard workflow needed to be done for RNA-seq experiments” and “The flow of app is very clear and it follows the order of necessity”.
Participants also provided suggestions to improve the training, such as:
“It would be good to have some more colour options in some plots so that we can choose the colour” and “After I attend the course, I realized [..] it would be good if there is also a theoretical session to explain the terminology and methods.”
Overall, the participants considered the 3D RNA-seq App a useful tool:
- it is easy to use and follow, particularly with the aid of the graphical manual and YouTube tutorial;
- all of the participants would recommend the 3D RNA-seq App to their colleagues.
Finally, some participants have already managed to finish the 3D RNA-seq analysis using either the example data or their own data and have posted their results on Twitter (Figure 4).
Figure 4: 3D RNA-seq analysis results tweeted by participants.
The 3D RNA-seq App has been shown to be applicable to data from any species, and this training will increase skill capacity across SEFARI in analysing RNA-sequencing data. Furthermore, the software tool can significantly improve speed and productivity for biologists using the RNA-seq technique, regardless of whether their research question relates to plants, animals, humans or food.
The 3D RNA-seq App won the Best Innovation Award (2019) of the School of Life Sciences at the University of Dundee. It currently has over 5,000 users and 1,200 returning users from 68 countries. The App has also been used to teach undergraduates in several universities and to improve the skills of biologists in performing complex and accurate RNA-seq data analysis by themselves. Very positive feedback has been received from participants at all training events, and we believe that biologists, especially early career scientists, will benefit greatly from these training workshops.
Finally, such training opportunities allow us (and participants) to build up a network of support and unlock collaborations with experts from different research fields. Consequently, we look forward to running this workshop regularly to continue building skill capacity and know-how across disciplines and our wider networks.
James Hutton Institute
University of Dundee
- 3D RNA-seq: a powerful and flexible tool for rapid and accurate differential expression and alternative splicing analysis of RNA-seq data for biologists
- Rapid and dynamic alternative splicing impacts the Arabidopsis cold response transcriptome
- Chromatin accessibility landscapes activated by cell surface and intracellular immune receptors
- Downy Mildew effector HaRxL21 interacts with the transcriptional repressor TOPLESS to promote pathogen susceptibility
- Differential nucleosome occupancy modulates alternative splicing in Arabidopsis thaliana
- BaRTv1.0: An improved barley reference transcript dataset to determine accurate changes in the barley transcriptome using RNA-seq
- Nonsense-mediated RNA decay factor UPF1 is critical for posttranscriptional and translational gene regulation in Arabidopsis