  • How to merge/compare RNA-seq data generated with different protocols

    Hello,

    I have RNA-seq data generated on an Illumina HT2000 for 14 samples from the same tissue. The technician applied the same protocol to all of these samples, except that the final sequencing step accidentally differed between batches:

    Batch 1 (8 samples) using 50bp paired-end
    Batch 2 (6 samples) using 75bp paired-end

    Since I have to analyze these samples together, what is the best way to avoid a batch effect from the protocol difference, and to avoid reviewers' criticism?

    Method 1: align the batches separately, then merge them into one matrix using RPKM/FPKM for each gene and sample
    Method 2: for Batch 2 samples, only use the first 50bp of each read for alignment, ...
    Method 3: re-sequence Batch 1 with the 75bp paired-end protocol using the leftover library.
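For reference, the RPKM value mentioned in Method 1 can be sketched as below. This is a minimal illustration of the standard formula (reads per kilobase of gene per million mapped reads); the function name and the example numbers are hypothetical and not taken from the data in this thread. Note that the formula normalizes for gene length and library size only, so on its own it does not correct for a difference in read length between batches.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of gene per Million mapped reads.

    RPKM = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical numbers: 500 reads on a 2 kb gene in a library of 10 million mapped reads.
example_rpkm = rpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=10_000_000)
```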

    Many thanks,
    Shirley

  • #2
    This depends on your experiment. What are you trying to measure? If it's differential expression between samples or between batch 1 and batch 2, then you can't directly compare 75bp and 50bp reads. If you just want an overall profile of the tissue type, I see no reason not to merge everything together. Also, maybe you could post a quality histogram. If the 75bp reads have terrible quality at the tail, you might not lose anything by chopping the reads to 50bp anyway (that's unlikely but worth mentioning). Still, read 2 in 2x75bp pairs may have slightly lower quality than read 2 in 2x50bp pairs, even after cutting everything to 50bp, which is a small but real source of possible bias.

    No matter what the experiment, Method 3 would be fine, but it's also the most costly in time and money.
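A quality-by-position profile like the one suggested above could be computed roughly as follows. This is only a sketch: it assumes plain four-line FASTQ records and Phred+33 encoding, and the function name `mean_quality_per_position` is made up for illustration. A dedicated QC tool would normally be used on real data.

```python
import io
from typing import Iterable, List


def mean_quality_per_position(fastq_lines: Iterable[str], phred_offset: int = 33) -> List[float]:
    """Return the mean Phred quality at each read position (0-based).

    Assumes four-line FASTQ records, so the quality string is every 4th line.
    """
    totals: List[int] = []
    counts: List[int] = []
    for i, line in enumerate(fastq_lines):
        if i % 4 != 3:  # skip header, sequence and '+' lines
            continue
        qual = line.rstrip("\n")
        for pos, ch in enumerate(qual):
            if pos == len(totals):  # first time a read reaches this position
                totals.append(0)
                counts.append(0)
            totals[pos] += ord(ch) - phred_offset
            counts[pos] += 1
    return [t / c for t, c in zip(totals, counts)]


# Tiny made-up example: 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
example = io.StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nACG\n+\nII5\n"
)
means = mean_quality_per_position(example)
```

Running this separately on read 1 and read 2 of each batch would show whether the tail of the 75bp reads, or read 2 in general, drops off in quality.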

    • #3
      Thanks Brian for your quick response. We would like to do differential expression analysis, alternative splicing analysis, novel transcript discovery, etc.

      But most importantly, in the near future we will have 2 more batches of samples that need to be merged with these first two batches. So, to be consistent and to avoid potential criticism from reviewers, we would like to decide on the best approach at this early stage.

      Thanks again for your information,
      Shirley

      • #4
        If you are going to be generating more data anyway, then I'd go with method 3. It's really the only one that cannot be criticized no matter what experiment you wish to do, and for alternative splicing with RNA-seq, 75bp is much better than 50bp (though of course 100bp is better still).

        • #5
          Great. Thank you for your suggestions. Have a nice day!

          • #6
            shirley0818,

            It seems that the 2x50 run was your intended sequencing output. Therefore, there is no reason to repeat the sequencing of your batch 2. Simply trim the 75-base reads to 50 bases and the data will be directly comparable, with no potential mapping bias due to the longer read length.
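The trimming step described above amounts to keeping the first 50 bases (and the matching quality values) of every read. Here is a minimal sketch, assuming four-line FASTQ records; the function name `trim_fastq` is hypothetical, and an established read trimmer would do the same job on real files.

```python
from typing import Iterable, Iterator


def trim_fastq(lines: Iterable[str], keep: int = 50) -> Iterator[str]:
    """Yield FASTQ lines with sequence and quality truncated to `keep` bases.

    Assumes four-line records: header, sequence, '+', quality.
    """
    for i, line in enumerate(lines):
        text = line.rstrip("\n")
        if i % 4 in (1, 3):  # sequence line and quality line
            text = text[:keep]
        yield text + "\n"


# Made-up 75-base record, trimmed to 50 bases.
record = ["@read1/1\n", "A" * 75 + "\n", "+\n", "I" * 75 + "\n"]
trimmed = list(trim_fastq(record, keep=50))
```

Because the function never drops a record, applying it to the read 1 and read 2 files with the same `keep` value leaves the pairs in the same order.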

            As an aside, longer paired-end reads are not an advantage for the typical RNA-Seq experiment because the average insert size is ~150 bases. Therefore, a 2x100 run will simply generate overlapping reads that provide no additional benefit at significantly higher cost.
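The overlap arithmetic behind this point can be sketched as follows. It assumes "insert size" means the length of the sequenced fragment without adapters, and that both mates have the same length; the function name `mate_overlap` is made up for illustration.

```python
def mate_overlap(fragment_len: int, read_len: int) -> int:
    """Number of fragment bases covered by both mates of a read pair.

    Each mate reads inward from one end of the fragment and cannot be
    longer than the fragment itself. Returns 0 when the mates do not meet.
    """
    effective = min(read_len, fragment_len)
    return max(0, 2 * effective - fragment_len)


# With a ~150-base fragment, as assumed above:
overlap_2x100 = mate_overlap(150, 100)  # mates share the middle 50 bases
overlap_2x75 = mate_overlap(150, 75)    # mates just meet, no shared bases
overlap_2x50 = mate_overlap(150, 50)    # 50 bases in the middle stay unsequenced
```

The result depends entirely on the real fragment length distribution of the library, so it is worth checking that before choosing a read length.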

            • #7
              Thanks MU Core.

              In our experiments, the average RNA fragment size is about 200-300bp, so 2x75bp might be better for alternative splicing, as Brian suggested. We will definitely not go to 2x100 at this point, due to the cost and time of resequencing both the Batch 1 and Batch 2 samples.
