Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • uqfgaiti
    started a topic Low number of replicates DESeq

    Low number of replicates DESeq

    Hi all,

    I am using DESeq for DGE analysis.

    I have STRANDED RNA-Seq data for 4 developmental stages with no replicates.
    To have a more reliable DGE I should have replicates and so I obtained (from another lab member) UNSTRANDED RNA-Seq data with 3 replicates per stage.

    Before doing a DGE, I thought to test the correlation between these samples, just to show that similar samples “cluster” together. If so, I can then use the unstranded data for my DGE analysis to have more replicates per each stage.

    I mapped the raw reads to the genome using TOPHAT, sorted the bam files by name and used htseq-count to get the raw reads counts for both the data. For the stranded data I used the option -s yes and for the unstranded data I used -s no.

    I used DESeq to include metadata and for normalization, and I removed the genes that always have a 0 value. I then calcualted the correlation which was really low.

    I then tried to use htseq-count with the option -s reverse for the stranded data and still got really low correlation.

    So I reran htseq-count on the stranded data selecting the option -s no and in this way I got a very similar number of total counts between the unstranded and stranded data (while both cases before the stranded ones were double in number). I then included metadata, estimated the new size factors, normalized and calculated the new correlation. Both Pearson and Spearman performed pretty well, confirmed by both a PCA and correlogram.

    Though, I'd still like to figure out a way to use the stranded counts. I am not sure if I lose some information running htseq-count using -s no on the stranded data.

    What I had in mind was using unstranded data to estimate the level of variation to get a threshold for DE detection but still use the stranded data as expression values. Not sure if I can do that though given one is stranded and the other is not.

    I would like to hear from you if you have any thoughts about this.

    Let me know if you need more information to better understand the issue.

    Thanks a lot
    Federico

Latest Articles

Collapse

  • seqadmin
    Genetic Variation in Immunogenetics and Antibody Diversity
    by seqadmin



    The field of immunogenetics explores how genetic variations influence immune responses and susceptibility to disease. In a recent SEQanswers webinar, Oscar Rodriguez, Ph.D., Postdoctoral Researcher at the University of Louisville, and Ruben Martínez Barricarte, Ph.D., Assistant Professor of Medicine at Vanderbilt University, shared recent advancements in immunogenetics. This article discusses their research on genetic variation in antibody loci, antibody production processes,...
    11-06-2024, 07:24 PM
  • seqadmin
    Choosing Between NGS and qPCR
    by seqadmin



    Next-generation sequencing (NGS) and quantitative polymerase chain reaction (qPCR) are essential techniques for investigating the genome, transcriptome, and epigenome. In many cases, choosing the appropriate technique is straightforward, but in others, it can be more challenging to determine the most effective option. A simple distinction is that smaller, more focused projects are typically better suited for qPCR, while larger, more complex datasets benefit from NGS. However,...
    10-18-2024, 07:11 AM

ad_right_rmr

Collapse

News

Collapse

Topics Statistics Last Post
Started by seqadmin, 11-08-2024, 11:09 AM
0 responses
34 views
0 likes
Last Post seqadmin  
Started by seqadmin, 11-08-2024, 06:13 AM
0 responses
28 views
0 likes
Last Post seqadmin  
Started by seqadmin, 11-01-2024, 06:09 AM
0 responses
32 views
0 likes
Last Post seqadmin  
Started by seqadmin, 10-30-2024, 05:31 AM
0 responses
23 views
0 likes
Last Post seqadmin  
Working...
X