  • shirley0818
    Member
    • Apr 2013
    • 13

    How to merge/compare RNA-seq data generated with different protocols

    Hello,

    I have RNA-sequencing data generated on an Illumina HT2000 for 14 samples from the same tissue. The technician applied the same protocol to all of these samples, except that the final sequencing step was accidentally run with two different read lengths:

    Batch 1 (8 samples) using 50bp paired-end
    Batch 2 (6 samples) using 75bp paired-end

    Since I have to analyze these samples together, what is the best way to avoid batch effects or protocol differences, and to pre-empt reviewers' criticism?

    Method 1: align the batches separately, then merge them into one matrix of RPKM/FPKM values per gene and sample
    Method 2: for Batch 2 samples, use only the first 50bp of each read for alignment, ...
    Method 3: re-sequence Batch 1 with the 75bp paired-end protocol using the leftover library.

    Many thanks,
    Shirley
  • Brian Bushnell
    Super Moderator
    • Jan 2014
    • 2709

    #2
    This depends on your experiment. What are you trying to measure? If it's differential expression between samples or between batch 1 and batch 2, then you can't directly compare 75bp and 50bp reads. If you just want an overall profile of the tissue type, I see no reason not to merge everything together. Also, maybe you could post a quality histogram. If the 75bp reads have terrible quality at the tail, you might not see any drawbacks from chopping the reads to 50bp anyway (that's unlikely but worth mentioning). Still, read 2 in 2x75bp pairs may have slightly lower quality than read 2 in 2x50bp pairs, even after cutting everything to 50bp, which is a small but real source of possible bias.

    No matter what the experiment, Method 3 would be fine, but it's also the most costly in time and money.
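
The quality histogram mentioned in this post could be approximated with a short script. What follows is only a minimal sketch, assuming plain 4-line FASTQ records and Phred+33 quality encoding; the function name `mean_quality_by_position` is hypothetical and not part of any tool named in this thread.

```python
from collections import defaultdict


def mean_quality_by_position(fastq_handle, phred_offset=33):
    """Return the mean Phred score at each read position.

    Assumes 4-line FASTQ records (no wrapped sequence lines) and
    Phred+33 encoding; both are assumptions, not facts from the thread.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for line_number, line in enumerate(fastq_handle):
        if line_number % 4 != 3:  # the quality string is the 4th line of a record
            continue
        for pos, char in enumerate(line.rstrip("\r\n")):
            totals[pos] += ord(char) - phred_offset
            counts[pos] += 1
    return [totals[pos] / counts[pos] for pos in sorted(counts)]
```

Running it separately on read 1 and read 2 of each batch, and comparing the first 50 positions, would show whether the 75bp tail or the second mate differs enough in quality to matter.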

    • shirley0818
      Member
      • Apr 2013
      • 13

      #3
      Thanks, Brian, for your quick response. We would like to do differential expression analysis, alternative splicing analysis, novel transcript discovery, etc.

      Most importantly, in the near future we will have 2 more batches of samples that need to be merged with these first two batches. So, to be consistent and avoid potential criticism from reviewers, we would like to decide on the best approach at this early stage.

      Thanks again for your information,
      Shirley

      • Brian Bushnell
        Super Moderator
        • Jan 2014
        • 2709

        #4
        If you are going to be generating more data anyway, then I'd go with method 3. It's really the only one that cannot be criticized no matter what experiment you wish to do, and for alternative splicing with RNA-seq, 75bp is much better than 50bp (though of course 100bp is better still).

        • shirley0818
          Member
          • Apr 2013
          • 13

          #5
          Great. Thank you for your suggestions. Have a nice day!

          • MU Core
            Member
            • Apr 2008
            • 60

            #6
            shirley0818,

            It seems that the 2x50 run was your intended sequencing output. Therefore, there is no reason to repeat sequencing of your batch 2. Simply trim the 75-base reads to 50 bases and the data will be directly comparable, with no potential mapping bias due to the longer read length.
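
The trimming step suggested here can be sketched in a few lines. This is an illustration only, assuming uncompressed 4-line FASTQ input and keeping the first 50 bases of each read; the helper name `trim_fastq` is hypothetical, and a dedicated trimming tool would normally be used instead.

```python
def trim_fastq(in_handle, out_handle, max_len=50):
    """Copy FASTQ records, cutting sequence and quality lines to max_len bases.

    Assumes well-formed 4-line records; reads shorter than max_len
    are written unchanged.
    """
    while True:
        header = in_handle.readline()
        if not header.strip():  # end of file (or a trailing blank line)
            break
        seq = in_handle.readline().rstrip("\r\n")
        plus = in_handle.readline().rstrip("\r\n")
        qual = in_handle.readline().rstrip("\r\n")
        out_handle.write(header.rstrip("\r\n") + "\n")
        out_handle.write(seq[:max_len] + "\n")
        out_handle.write(plus + "\n")
        out_handle.write(qual[:max_len] + "\n")
```

For paired-end data the same call would be applied to the read 1 and read 2 files independently, which keeps the mates in the same order.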

            As an aside, longer paired-end reads are not an advantage for the typical RNA-Seq experiment because the average insert size is ~150 bases. A 2x100 run would therefore simply generate overlapping reads, which provide no additional advantage at significantly higher cost.
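
The overlap argument is simple arithmetic and can be checked directly. The sketch below assumes "insert size" means the length of the sequenced fragment without adapters; the function name `paired_end_overlap` is hypothetical.

```python
def paired_end_overlap(fragment_len, read_len):
    """Bases covered by both mates of a pair.

    A positive value is the overlap between the two mates; a negative
    value is the unsequenced gap left between them. The overlap can
    never exceed the fragment itself.
    """
    return min(2 * read_len - fragment_len, fragment_len)


# ~150-base fragments, as stated in this post
print(paired_end_overlap(150, 100))  # 50: the mates overlap by 50 bases
print(paired_end_overlap(150, 75))   # 0: the mates just meet
print(paired_end_overlap(150, 50))   # -50: 50 bases remain unsequenced
```

Under this assumption the conclusion depends strongly on the actual fragment length of the library: with longer fragments, 2x75 or even 2x100 reads would no longer overlap.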

            • shirley0818
              Member
              • Apr 2013
              • 13

              #7
              Thanks MU Core.

              In our experiments, the average RNA fragment length is about 200-300bp, so 2x75bp might be better for alternative splicing, as Brian suggested. We will definitely not go to 2x100 at this point because of the cost and time of resequencing both the Batch 1 and Batch 2 samples.
