
  • How to merge/compare RNA-seq data generated with different protocols

    Hello,

    I have RNA-sequencing data generated on an Illumina HT2000 for 14 samples from the same tissue. The technician applied the same protocol to all of these samples except, by accident, for the final sequencing step.

    Batch 1 (8 samples) using 50bp paired-end
    Batch 2 (6 samples) using 75bp paired-end

    Since I have to analyze these samples together, what is the best way to avoid batch effects or protocol differences, and reviewers' criticism?

    Method 1: align them separately, then merge them into one matrix using RPKM/FPKM for each gene and sample
    Method 2: for Batch 2 samples, only use the first 50bp of each read for alignment, ...
    Method 3: re-sequence Batch 1 with the 75bp paired-end protocol using the leftover library.
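For reference, the RPKM normalization mentioned in Method 1 can be sketched as below. This is only a minimal illustration, not a recommended pipeline: the function names and the nested-dictionary layout are hypothetical, and the total mapped reads per sample is approximated here by the sum of the gene counts, which is an assumption. RPKM corrects for gene length and sequencing depth only; it does not by itself remove a read-length difference between batches.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # count / (length in kb) / (total reads in millions)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def rpkm_matrix(counts, gene_lengths):
    """Convert {sample: {gene: count}} into {sample: {gene: RPKM}}.

    Assumption: the per-sample total is the sum of the gene counts given,
    standing in for the true number of mapped reads.
    """
    result = {}
    for sample, gene_counts in counts.items():
        total = sum(gene_counts.values())
        result[sample] = {
            gene: rpkm(count, gene_lengths[gene], total)
            for gene, count in gene_counts.items()
        }
    return result
```

Samples aligned separately (for example one 50bp sample and one 75bp sample) would each be passed as a separate key of `counts`, giving one merged gene-by-sample table.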

    Many thanks,
    Shirley

  • #2
    This depends on your experiment. What are you trying to measure? If it's differential expression between samples or between batch 1 and batch 2, then you can't directly compare 75bp and 50bp reads. If you just want an overall profile of the tissue type, I see no reason not to merge everything together. Also, maybe you could post a quality histogram. If the 75bp reads have terrible quality at the tail, you might not see any drawbacks from chopping the reads to 50bp anyway (that's unlikely but worth mentioning). Still, read 2 in 2x75bp pairs may have slightly lower quality than read 2 in 2x50bp pairs, even after cutting everything to 50bp, which is a small but real source of possible bias.
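One rough way to get the numbers behind such a quality histogram is to average the base quality at each read position. The sketch below assumes plain four-line FASTQ records and Phred+33 quality encoding; the function name is hypothetical, and dedicated QC tools produce a more complete report.

```python
from collections import defaultdict


def mean_quality_by_position(fastq_lines, phred_offset=33):
    """Return the mean Phred quality at each read position (0-based).

    Assumptions: four lines per record, quality string on the 4th line,
    Phred+33 encoding unless another offset is passed.
    """
    sums = defaultdict(int)
    counts = defaultdict(int)
    for i, line in enumerate(fastq_lines):
        if i % 4 != 3:  # only the quality line of each record
            continue
        for pos, ch in enumerate(line.rstrip("\n")):
            sums[pos] += ord(ch) - phred_offset
            counts[pos] += 1
    return [sums[p] / counts[p] for p in sorted(counts)]
```

Passing an open file handle for a read 2 FASTQ from each batch and comparing the two profiles over the first 50 positions would show whether the tail quality differs between the 2x50 and 2x75 runs.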

    No matter what the experiment, Method 3 would be fine, but it's also the most costly in time and money.



    • #3
      Thanks Brian for your quick response. We would like to do differential expression analysis, alternative splicing, novel transcripts, etc.

      But most importantly, in the near future we will have 2 more batches of samples that need to be merged with these first two batches. So, to be consistent and avoid potential reviewers' criticism, we would like to choose the best approach at this early stage.

      Thanks again for your information,
      Shirley



      • #4
        If you are going to be generating more data anyway, then I'd go with method 3. It's really the only one that cannot be criticized no matter what experiment you wish to do, and for alternative splicing with RNA-seq, 75bp is much better than 50bp (though of course 100bp is better still).



        • #5
          Great. Thank you for your suggestions. Have a nice day!



          • #6
            shirley0818,

            It seems that the 2x50 run was your intended sequence output. Therefore, there is no reason to repeat sequencing of your batch 2. Simply trim the 75-base reads to 50 bases and the data will be directly comparable, with no potential mapping bias due to the longer read length.
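A minimal sketch of that trimming step, assuming plain four-line FASTQ records: it keeps the first 50 bases of each read and the matching quality values, and leaves shorter reads untouched. The function name is hypothetical, and existing read-trimming tools can do the same job.

```python
from itertools import islice


def trim_fastq_lines(lines, max_len=50):
    """Yield FASTQ lines with sequence and quality cut to max_len bases.

    Assumption: every record is exactly four lines
    (header, sequence, '+', quality).
    """
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if not record:
            return  # end of input
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        header, seq, plus, qual = (s.rstrip("\n") for s in record)
        yield header
        yield seq[:max_len]   # keep the first max_len bases
        yield plus
        yield qual[:max_len]  # keep the matching quality values
```

For paired-end data the same trimming would be applied to both the read 1 and the read 2 file, so that the pairs stay in sync.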

            As an aside, longer paired-end reads are not an advantage for the typical RNA-Seq experiment because the average insert size is ~150 bases. Therefore, a 2x100 run will simply generate overlapping reads that provide no additional advantage at significantly more cost.
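The arithmetic behind the overlap argument can be sketched as follows. It assumes "insert size" means the full length of the sequenced fragment (without adapters) and that both reads are the same length; the function name is hypothetical.

```python
def paired_end_overlap(fragment_len, read_len):
    """Number of bases covered by both reads of a pair.

    Assumptions: fragment_len is the full fragment length without adapters,
    and read 1 and read 2 have the same length.
    """
    # The two reads start at opposite ends; anything beyond the fragment
    # length in their combined span is sequenced twice.
    overlap = 2 * read_len - fragment_len
    return max(0, min(fragment_len, overlap))


for read_len in (50, 75, 100):
    print(f"2x{read_len} on a 150-base fragment: "
          f"{paired_end_overlap(150, read_len)} bases overlap")
```

With a 150-base fragment, 2x50 and 2x75 reads do not overlap, while 2x100 reads share 50 bases. A longer average fragment shifts the point at which overlap begins.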



            • #7
              Thanks MU Core.

              In our experiments, the average RNA fragment size is about 200-300bp, so 2x75bp might be better for alternative splicing, as Brian suggested. We will definitely not go to 2x100 at this point due to the cost and time of resequencing both Batch 1 and Batch 2 samples.
