The cost of RNA-Seq has been decreasing over the last few years. Despite this, experiments with four or fewer biological replicates are still quite common. Estimating the variances of gene expression estimates is both a challenging and an interesting problem in these situations of low replication. However, the wealth of microarray and other gene expression data readily accessible in public repositories can be leveraged to improve variance estimation. We have proposed a novel approach called Tshrink+ for inferring differential gene expression through improved modelling of the gene-wise variances. Existing methods share information between genes of similar average expression by shrinking, or moderating, the gene-wise variances to a fitted common variance. We have been able to achieve improved estimation of the common variance by using gene-wise sample variances from external experiments, as well as gene length. Using biological data we show that utilising additional external information can improve the modelling of the common variance and hence the calling of differentially expressed genes. These sources of additional information include gene length and gene-wise sample variances from other RNA-Seq and microarray datasets, of both related and seemingly unrelated tissue types. The results are promising, with our differential expression test, Tshrink+, performing favourably when compared to existing methods such as DESeq and edgeR in terms of both gene ranking and sensitivity. These improved variance models could easily be implemented in both DESeq and edgeR, and they highlight the need for a database that offers a profile of gene variances over a range of tissue types and organisms.
In the post-genomic era, the development of technologies for sequencing the genome and transcriptome has become a key issue in the global analysis of biological systems. Even as the cost of sequencing falls, the majority of RNA-Seq experiments still suffer from low replication. The identification of differentially expressed (DE) genes and transcripts is still a key question of interest in many biological studies. To date, there are many methods that provide a test of whether a gene is DE or not, including Cufflinks, DESeq and edgeR. A feature of all of these methods is moderation of gene-wise variance estimates to improve DE inference. Moderation is important in comparisons with small sample sizes, where it increases both the power and the accuracy of a DE test.
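To make the idea of moderation concrete, the sketch below shrinks a gene's pooled sample variance toward a common variance using a weighted average, with degrees of freedom as the weights, and then forms a moderated t-statistic. This is a generic illustration of variance shrinkage, not the Tshrink+ model itself: the function names, the `prior_var` and `prior_df` parameters, and the assumption that expression values are already on a log-like scale are all ours. In a real analysis the common variance would come from a fitted model (for example, one informed by external variances or gene length) rather than being passed in as a fixed number.

```python
from math import sqrt
from statistics import mean, variance


def moderate_variance(sample_var, df, prior_var, prior_df):
    """Shrink a gene-wise sample variance toward a common (prior) variance.

    The result is a weighted average of the two variances, weighted by
    their degrees of freedom. prior_df controls how strongly we shrink;
    prior_df = 0 returns the sample variance unchanged.
    """
    return (prior_df * prior_var + df * sample_var) / (prior_df + df)


def moderated_t(group_a, group_b, prior_var, prior_df):
    """Two-group t-statistic that uses the moderated variance.

    group_a and group_b are lists of expression values for one gene
    (hypothetical inputs, assumed to be on a log-like scale).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled within-group sample variance for this gene.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    shrunk = moderate_variance(pooled, df, prior_var, prior_df)
    return (mean(group_a) - mean(group_b)) / sqrt(shrunk * (1 / n_a + 1 / n_b))


# A gene with two replicates per condition and an unusually small sample variance.
condition_a = [5.0, 5.2]
condition_b = [4.0, 4.2]

ordinary = moderated_t(condition_a, condition_b, prior_var=0.1, prior_df=0)
moderated = moderated_t(condition_a, condition_b, prior_var=0.1, prior_df=4)
print(ordinary, moderated)
```

With only two replicates per group, the sample variance here is very small by chance, so the ordinary statistic is large. Shrinking toward the larger common variance pulls the statistic down, which is the stabilising effect that moderation is meant to provide at low replication.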