  • Running Tophat on a small subset of Reads

    Hello.

    I am trying to find out which library type to use for a stranded RNA-seq run. The manual says:

    "I am not sure which library type to use (fr-firststrand or fr-secondstrand), what should I do?"

    "One possible way to figure out the correct library-type is to run TopHat with a small subset of the reads (e.g., 1M) as follows.

    run TopHat with fr-firststrand and count the number of junctions in junctions.bed (one of the output files from TopHat)
    run TopHat with fr-secondstrand and count the number of junctions in junctions.bed

    Since the splice junction finding algorithm of TopHat makes use of library-type information (if provided), one of the two TopHat runs would result in many more splice junctions than the other one. You can then use the library type that gives more junctions. If this is not the case TopHat might not work well with your sequencing protocol. Please let us know more details about your protocol so we can add support for new library types."
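For the "small subset of the reads" step in the quoted procedure, here is a minimal Python sketch. It assumes standard 4-line FASTQ records and simply takes the first N reads rather than a random sample. The function names and file names are hypothetical and not part of TopHat.

```python
import gzip
import itertools


def open_maybe_gzip(path, mode):
    """Open a plain or gzip-compressed text file, chosen by file extension."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode + "t")
    return open(path, mode)


def take_first_reads(src, dst, n_reads=1_000_000):
    """Copy the first n_reads FASTQ records (4 lines each) from src to dst.

    Assumes unwrapped 4-line FASTQ records.
    Returns the number of complete records written.
    """
    lines_written = 0
    with open_maybe_gzip(src, "r") as fin, open_maybe_gzip(dst, "w") as fout:
        for line in itertools.islice(fin, 4 * n_reads):
            fout.write(line)
            lines_written += 1
    return lines_written // 4
```

For paired-end data, calling it with the same `n_reads` on both files, e.g. `take_first_reads("sampleA_R1.fastq.gz", "subset_R1.fastq.gz")` and likewise for R2, should keep the mates in the same order, provided the two input files are already in sync.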



    For 10 samples, I have run the first strand library type and completed the alignment, which produced the alignment report.

    Now I am running the second library type for a single sample and will count the number of junctions in the junctions.bed file when it completes.
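The junction count comparison can be sketched in Python as below. This assumes each TopHat run wrote its `junctions.bed` into its own output directory (the directory names are hypothetical), with one junction per line and possibly a `track` header line, which is skipped.

```python
from pathlib import Path


def count_junctions(bed_path):
    """Count junction records in a junctions.bed file.

    Blank lines and header lines (starting with 'track' or '#') are skipped.
    """
    count = 0
    with open(bed_path) as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped or stripped.startswith(("track", "#")):
                continue
            count += 1
    return count


def pick_library_type(first_dir, second_dir):
    """Return (counts, winner), where winner is the library type whose run
    produced more junctions, or None if the two counts are equal."""
    counts = {
        "fr-firststrand": count_junctions(Path(first_dir) / "junctions.bed"),
        "fr-secondstrand": count_junctions(Path(second_dir) / "junctions.bed"),
    }
    if counts["fr-firststrand"] == counts["fr-secondstrand"]:
        return counts, None
    return counts, max(counts, key=counts.get)
```

A call such as `pick_library_type("tophat_first", "tophat_second")` returns both counts, so you can judge whether one run really has "many more" junctions than the other, as the manual puts it, rather than just a marginal difference.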

    My questions are:

    1) Say that for this single sample A the second library type gives more junctions, so the second library type is the correct one. Does the manual mean that it is the correct library type FOR ALL samples, or FOR JUST THIS ONE?

    2) If, for this single sample, the alignment with the second library type gives fewer junctions in the junctions.bed output file, does this mean that the second library type is the incorrect library type FOR ALL samples, or for just this one?

  • #2
    I think if your samples were library prepped and sequenced at the same time, then whatever the result is, it is likely to be broadly applicable, as it's the library prep that defines which type of stranded flag you need to use.

    Although there are threads on here which hint at more straightforward ways of checking, knowing which protocol was used to do the library prep is essential. I'd be more keen to ask the lab what they did than waste time processing all my samples through tophat2 twice.


    • #3
      For my alignment output, I got a concordant pair alignment rate of 41.6%:

      Left reads:
        Input : 35732348
        Mapped : 16332729 (45.7% of input)
          of these: 476549 ( 2.9%) have multiple alignments (4626 have >20)
      Right reads:
        Input : 35732348
        Mapped : 16383018 (45.8% of input)
          of these: 426962 ( 2.6%) have multiple alignments (4550 have >20)
      45.8% overall read mapping rate.

      Aligned pairs: 14952780
        of these: 374540 ( 2.5%) have multiple alignments
        80381 ( 0.5%) are discordant alignments
      41.6% concordant pair alignment rate.



      However, another manual I am reading says:

      "Accurate differential analysis depends on accurate spliced read alignments. Typically, at least 70% of RNA-seq reads should align to the genome, and lower mapping rates may indicate poor quality reads or the presence of contaminant."


      I have done QC and my reads were good. So does this mean that perhaps I should try the other library type?
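To make the comparison against the 70% guideline explicit, the rates can be recomputed from the raw counts in the summary. This is a small sketch with hypothetical function names. The concordant rate formula, (aligned pairs minus discordant pairs) divided by input pairs, is an assumption that happens to reproduce the reported 41.6%, not something taken from the TopHat documentation.

```python
def pct(part, whole):
    """Percentage of part in whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def summarize_alignment(left_input, left_mapped, right_input, right_mapped,
                        aligned_pairs, discordant_pairs, min_rate=70.0):
    """Recompute mapping rates from the counts in an alignment summary.

    min_rate is the guideline from the quoted manual (70% of reads aligned).
    """
    overall = pct(left_mapped + right_mapped, left_input + right_input)
    return {
        "left_pct": pct(left_mapped, left_input),
        "right_pct": pct(right_mapped, right_input),
        "overall_pct": overall,
        # Assumption: concordant rate = (aligned - discordant) / input pairs.
        "concordant_pct": pct(aligned_pairs - discordant_pairs, left_input),
        "below_guideline": overall < min_rate,
    }
```

Called with the numbers above, `summarize_alignment(35732348, 16332729, 35732348, 16383018, 14952780, 80381)`, it gives 45.7% and 45.8% for the left and right reads, 45.8% overall and 41.6% concordant, and flags the run as below the 70% guideline.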


      • #4
        Thank you very much for your response. I contacted the lab and found that they used the TruSeq Stranded prep kit.

        From the manual, the library type should be set to "fr-firststrand".

        However, I still do not understand why my mapping rates are so low.


        • #5
          The lab also used the Ribo-Zero prep kit, which would also use --library-type fr-firststrand.
