
  • TopHat2 - reads removed from the analysis?

    Hello
    We are analysing SOLiD 5500 RNA-seq reads and I used TopHat2 for the alignment. As a quick first test we ran the analysis on only 2 chromosomes, which is why the number of aligned reads (see below) is very low.
    We noticed that some reads are "missing" at the end of the pipeline and we are wondering why.

    These are the .info files:
    ::::::::::::::
    left_kept_reads.info
    ::::::::::::::
    min_read_len=50
    max_read_len=50
    reads_in =25357934
    reads_out=25319876
    ::::::::::::::
    right_kept_reads.info
    ::::::::::::::
    min_read_len=35
    max_read_len=35
    reads_in =25357934
    reads_out=25283245

    And this is the output of samtools flagstat on accepted_hits.bam:

    1552361 + 0 in total (QC-passed reads + QC-failed reads)
    0 + 0 duplicates
    1552361 + 0 mapped (100.00%:-nan%)
    1552361 + 0 paired in sequencing
    769043 + 0 read1
    783318 + 0 read2
    549840 + 0 properly paired (35.42%:-nan%)
    661184 + 0 with itself and mate mapped
    891177 + 0 singletons (57.41%:-nan%)
    5522 + 0 with mate mapped to a different chr
    5522 + 0 with mate mapped to a different chr (mapQ>=5)

    And 15435015 reads are in the unmapped.bam files.

    --> So 15,435,015 unmapped + 1,552,361 mapped = 16,987,376, i.e. about 17,000,000 reads have been analysed.
    --> We had 25,357,934 + 25,357,934 = 50,715,868 reads_in, i.e. about 50,000,000 reads were available for the analysis.

    We are wondering where the other reads went. We expected the number of reads in the accepted + unmapped BAM files to add up to reads_in, but that is not the case. If you have any explanation, could you please help us?
    Thanks a lot for your time
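
The bookkeeping above can be reproduced with a short script. This is only a sketch of the arithmetic, using the numbers quoted in this post; the function name `account_for_reads` and the dictionary keys are made up for illustration and are not part of TopHat2 or samtools. One caveat: samtools flagstat counts alignment records rather than unique reads, so the "mapped" figure is only an approximation of the number of mapped reads.

```python
# Numbers copied from the .info files and the flagstat output quoted above.
READS_IN = (25_357_934, 25_357_934)    # left, right reads_in
READS_OUT = (25_319_876, 25_283_245)   # left, right reads_out
MAPPED_RECORDS = 1_552_361             # flagstat total on accepted_hits.bam
UNMAPPED_RECORDS = 15_435_015          # reads reported in unmapped.bam


def account_for_reads(reads_in, reads_out, mapped_records, unmapped_records):
    """Compare input read counts with what ends up in the BAM files.

    Hypothetical helper for illustration only. It treats every BAM record
    as one read, which is an assumption (flagstat counts records, not reads).
    """
    total_in = sum(reads_in)
    total_out = sum(reads_out)
    accounted = mapped_records + unmapped_records
    return {
        "total_in": total_in,
        "total_out": total_out,
        # reads discarded between reads_in and reads_out
        "dropped_before_alignment": total_in - total_out,
        "accounted": accounted,
        # reads_out that appear in neither BAM file
        "unaccounted": total_out - accounted,
    }


summary = account_for_reads(READS_IN, READS_OUT, MAPPED_RECORDS, UNMAPPED_RECORDS)
for key, value in summary.items():
    print(f"{key:>26}: {value:,}")
```

Only 112,747 reads are lost between reads_in and reads_out, so that filtering step alone does not explain the gap of more than 33 million.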

  • #2
    Any ideas, please?
    Thanks
