
  • Three reads with the same name in the BAM file

    Hi all,

    I am working with a paired-end BAM file and get many warnings like this:

    Code:
    WARNING: Could not find pair for HWI-ST430:177:2:1:4979:15503#0
    WARNING: Could not find pair for HWI-ST430:177:2:1:5127:13427#0
    WARNING: Could not find pair for HWI-ST430:177:2:1:6521:21452#0
    I checked the reads named in the warnings and found that each of these names appears three times in the BAM file. For example:

    Code:
    HWI-ST430:177:2:1:4979:15503#0	65	chr32	26100696	60	79M21S	chr5	36697147	0	ACTTTGCAATTTAAGTTTTACTTACTTTTTAACTAATATACATGCCTAAAATTTACAAAAACAATAATAAAAACAACAGAACACTGGAAACATTTTTAAA	>;=<>=<<=======<====;===;=======<=>>>>>><=>>==>>>>=>>>>==>?>=<<==>?>>>?>?==><=?>><=<>>>?>?=>??>?===>	BD:Z:FFHFCIKKIHG@EEEHF??DGGEDGGE???DEEGGEFFFFGDHHHHGGE??FF?DGDG???EDGFGFGGF@@@FEHFEIEGFEEIJJIHBHGLJDD@EF@	MD:Z:79	PG:Z:MarkDuplicates	RG:Z:Basenji	BI:Z:FFIECHGIHFEAFEEHEAAFFHDFFHDAAAFEEIHFGGHGGGHHGHHHFBBGFBGGGHBBBFGHGGFGGFBBBGHIGHJGHGHFKJJJJEIKLJGHBGFB	NM:i:0	AS:i:79	XS:i:19
    HWI-ST430:177:2:1:4979:15503#0	129	chr5	36697147	60	72M28S	chr32	26100696	0	ATTTGCCCCTGGGCTATTTTTTTCCTNCCATGTAAGATTCCGTTTTAAAAATGTTTCCAGTGTTCTGTTGTTTTTATTATTGTTTTTGTAAATTTTAGGC	===<=<<<<====<=>========<<!<<<=><<=>>>>>=5=>>>>>>>>>>=>>>==>=>=>>>>=?>=>>>>>>>>=?>=>>>?>>>??>??>;<=>	SA:Z:chr32,26100739,-,36M64S,60,0;	BD:Z:FFG@JKKFFHIIEHIGFF?????EGGEEEGHHEGEEDGFEGEGF??DE???FHEF?EGGHIFFGFEIFGGFG@@@EGGEGGGFHAAAHGJHBJJDDEHHI	MD:Z:26T37T7	PG:Z:MarkDuplicates	RG:Z:Basenji	BI:Z:FFFBHHHFFHGGDGHGGEAAAAADFGEEEIHHGHFFFGFEGHHFBBGFBBBGHGFBEGIIIFGFEFHGFHHGCCCHIGHIGHHGDDDIIKIFKJGHGHGH	NM:i:2	AS:i:65	XS:i:21
    HWI-ST430:177:2:1:4979:15503#0	401	chr32	26100739	60	36M64H	=	26100696	-79	GCCTAAAATTTACAAAAACAATAATAAAAACAACAG	===<=>>=>>===>===<=>===========>;===	SA:Z:chr5,36697147,+,72M28S,60,2;	BD:Z:IHHE??FF?EGEF???FEFFFDFGE@@AHHIJFIFF	MD:Z:36	PG:Z:MarkDuplicates	RG:Z:Basenji	BI:Z:HGHGBBFFAEGFFAAAEFFEGFEGFABBFGHGGHFF	NM:i:0	AS:i:36	XS:i:22
    The BAM file contains HiSeq reads aligned to the reference genome with bwa. Picard was used to remove duplicates, and realignment was done with GATK.


    My confusion is:
    1. Why are there three records with the same name that seem to have no relation to each other?
    2. Perhaps the first two are treated as a mate pair and the third as a single read. If so, can I just ignore it?

    Could anyone help me? Many thanks for your help!

  • #2
    The 3rd record's flag value, 401, has the "not primary alignment" bit (0x100) set.
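To make the flag values easier to read, here is a minimal sketch that decodes a SAM FLAG into its bits. The bit meanings follow the SAM specification; the short names and the `decode_flag` helper are only illustrative and do not come from any tool mentioned in this thread.

```python
# Bit meanings from the FLAG field of the SAM specification.
# The short names are illustrative labels, not official identifiers.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "not_primary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


# The three flag values from the records in the first post.
for flag in (65, 129, 401):
    print(flag, decode_flag(flag))
```

Read this way, 65 is the first mate, 129 is the second mate, and 401 is an additional, non-primary record for the second mate on the reverse strand.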

    The 2nd record has an "SA" tag:
    SA is defined as: Other canonical alignments in a chimeric alignment, formatted as a semicolon-delimited list: ( rname , pos , strand , CIGAR , mapQ , NM [[...]+. Each element in the list represents a part of the chimeric alignment. Conventionally, at a supplementary line, the [...] element points to the primary line.

    It points to the 3rd record via its location (chr32, 26100739).

    So it looks like your software supports reads whose parts map to different locations.
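As a rough illustration of how the SA tag links the two records, the sketch below splits the tag into its fields and compares the location with the 3rd record. The `parse_sa` helper and the `SAEntry` tuple are hypothetical names made up for this example; the tag value and the coordinates are copied from the records in the first post.

```python
from collections import namedtuple

# Hypothetical container for one element of an SA tag.
SAEntry = namedtuple("SAEntry", "rname pos strand cigar mapq nm")


def parse_sa(tag):
    """Parse an 'SA:Z:...' tag into a list of SAEntry tuples."""
    value = tag[len("SA:Z:"):] if tag.startswith("SA:Z:") else tag
    entries = []
    for part in value.split(";"):
        if not part:
            continue  # the list ends with a trailing ';'
        rname, pos, strand, cigar, mapq, nm = part.split(",")
        entries.append(SAEntry(rname, int(pos), strand, cigar, int(mapq), int(nm)))
    return entries


# SA tag copied from the 2nd record (flag 129).
sa = parse_sa("SA:Z:chr32,26100739,-,36M64S,60,0;")[0]

# RNAME and POS copied from the 3rd record (flag 401).
third_record = ("chr32", 26100739)

print(sa)
print((sa.rname, sa.pos) == third_record)
```

The comparison on the last line is what "pointing to the 3rd record via the location" means here: same reference name, same start position.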



    • #3
      Originally posted by Richard Finney
      The 3rd record's flag value, 401, has the "not primary alignment" bit (0x100) set.

      The 2nd record has an "SA" tag:
      SA is defined as: Other canonical alignments in a chimeric alignment, formatted as a semicolon-delimited list: ( rname , pos , strand , CIGAR , mapQ , NM [[...]+. Each element in the list represents a part of the chimeric alignment. Conventionally, at a supplementary line, the [...] element points to the primary line.

      It points to the 3rd record via its location (chr32, 26100739).

      So it looks like your software supports reads whose parts map to different locations.

      Thank you for your reply!

      I read your reply carefully, but I still have difficulty understanding it.

      Could you explain the three reads in simpler terms? Or how can I resolve the warnings "Could not find pair for HWI-ST430:177:2:1:4979:15503#0"?

      Thank you very much!
      Last edited by Alphabets; 03-28-2016, 07:51 PM.



      • #4
        What is your goal?

        What program is reporting the warning?

        Check the manual for your alignment software and check the notes on when it produces an "SA" tag.

        The first record is one mate of the pair.
        The next two records represent the other mate, which has two entries; that is, it is a "chimeric" read (I think).

        Ignoring it could be the thing to do, depending on your goals.

        If you are looking for chimeric reads or possible errors in the reference, then you have struck gold.
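If it helps, here is a small sketch of the grouping described above, followed by one possible way of "ignoring" the extra record: keeping only primary alignments. It assumes every record is flagged as paired, and the tuples hold only the first few fields of the three records from the first post. Whether the tool that reports the warning accepts a file filtered this way is not confirmed anywhere in this thread, so treat it as something to try.

```python
from collections import defaultdict

NOT_PRIMARY = 0x100    # "not primary alignment" bit
SUPPLEMENTARY = 0x800  # "supplementary alignment" bit
FIRST_IN_PAIR = 0x40

# (QNAME, FLAG, RNAME, POS, CIGAR) copied from the three records in the first post.
records = [
    ("HWI-ST430:177:2:1:4979:15503#0", 65, "chr32", 26100696, "79M21S"),
    ("HWI-ST430:177:2:1:4979:15503#0", 129, "chr5", 36697147, "72M28S"),
    ("HWI-ST430:177:2:1:4979:15503#0", 401, "chr32", 26100739, "36M64H"),
]


def is_primary(flag):
    """True if neither the not-primary nor the supplementary bit is set."""
    return not (flag & (NOT_PRIMARY | SUPPLEMENTARY))


def group_by_mate(recs):
    """Group records by (read name, mate number); assumes paired reads."""
    groups = defaultdict(list)
    for rec in recs:
        qname, flag = rec[0], rec[1]
        mate = 1 if flag & FIRST_IN_PAIR else 2
        groups[(qname, mate)].append(rec)
    return dict(groups)


groups = group_by_mate(records)
for (qname, mate), recs in groups.items():
    print(qname, "mate", mate, "flags:", [r[1] for r in recs])

# Keep one record per mate by dropping non-primary records.
primary_only = [r for r in records if is_primary(r[1])]
print([r[1] for r in primary_only])
```

With these three records, mate 1 has one entry (flag 65) and mate 2 has two entries (flags 129 and 401); after filtering, only 65 and 129 remain.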



        • #5
          Originally posted by Richard Finney
          What is your goal?

          What program is reporting the warning?

          I want to call STRs from this BAM file with lobSTR.

          When I run lobSTR on it as a paired-end BAM file, it produces many warnings like these.

          I downloaded the BAM file from the web and don't know anything more about it.

          When I run lobSTR treating it as a single-end BAM file, there are no warnings.
          lobSTR takes different parameters for single-end and paired-end BAM files.

          So, any other suggestions? Thanks!
          Last edited by Alphabets; 03-28-2016, 10:27 PM.
