
  • #31
    as I understand the flag meaning for 69, the query is unmapped, but the mate is mapped
    This seems to be an unfixed bug.

    Moreover, I get some lines with MAPQ to 0 and flag to 99/147 with coherent insert size
    I am not sure why you think this is problematic.

    how can one pair have a MAPQ at 37 and the other at 0
    I have answered this question several times in the mailing list. I now put it in FAQ on the bwa homepage.

    Comment


    • #32
      Thanks a lot for the replies.

      Comment


      • #33
        Would you please explain what XT:A:N and XT:A:M mean?
        (I haven't found any XT:A:N yet, but some XT:A:M ...)
        I understand XT:A:R and XT:A:U.

        Comment


        • #34
          Originally posted by bioinfosm View Post
          What would a flag of 0 imply in SAM output? I used Novoalign and the only flags I see are 0, 4, and 16. 4 is an unmapped read and 16 is for the strand, but how do I interpret 0?

          Also, how can I identify the reads that are *not* uniquely mapped? I read that the 5th column, MAPQ, should help to determine multiply-mapped reads. Is MAPQ=0 an indication that the read is multiply-mapped?

          Thanks
          I have a similar question. All I see are 0, 4, 16, and 20. I do not understand how to interpret 20. In hexadecimal it is 0x14, which should mean it is a combination of both strand (0x10) and unmapped (0x4). Please correct me if I'm wrong. I also see a MAPQ score and a reference hit for these reads, so does that mean they did map? Thanks

          Comment


          • #35
            Unmapped reads can have strand. This only means the unmapped sequence is given on its reverse strand.
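
To make the arithmetic concrete, here is a minimal Python sketch that splits a FLAG value into its individual bits. The bit values are the ones defined in the SAM specification; the function name `decode_flag` and the wording of the labels are only for illustration.

```python
# Bit values as defined in the SAM specification; the labels are informal.
FLAG_BITS = [
    (0x1, "read paired"),
    (0x2, "read mapped in proper pair"),
    (0x4, "read unmapped"),
    (0x8, "mate unmapped"),
    (0x10, "read reverse strand"),
    (0x20, "mate reverse strand"),
    (0x40, "first in pair"),
    (0x80, "second in pair"),
    (0x100, "not primary alignment"),
    (0x200, "read fails quality checks"),
    (0x400, "PCR or optical duplicate"),
    (0x800, "supplementary alignment"),
]


def decode_flag(flag):
    """Return the labels of all bits that are set in a SAM FLAG value."""
    return [label for bit, label in FLAG_BITS if flag & bit]


for value in (0, 4, 16, 20):
    print(value, decode_flag(value))
```

With this, 20 comes out as "read unmapped" plus "read reverse strand" (4 + 16), and 0 has no bits set at all, i.e. an unpaired read mapped to the forward strand.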

            Comment


            • #36
              Thank you Heng

              Comment


              • #37
                Hello, I'm new to the group.

                Comment


                • #38
                  Originally posted by Nix View Post
                  I should say that now I understand bitwise flags, they are a pretty clever trick for compressing a bunch of boolean flags in a binary file. For SAM spec 2 though, they should be removed from the text format.
                  I'd go further and say the flag (and other things) will need redoing to cope with more than just paired reads - it will need to cope with N-tuples of reads each separated by an insert of some estimated size (e.g. Strobe Reads from Pacific Biosciences, or what Helicos calls dark fill).

                  Comment


                  • #39
                    sam FLAG

                    Hi guys,
                    I find the SAM FLAG encoding a very clever way of storing the alignment information. But I am confused by the negative sign in the insert size field in the paired-end example below.
                    The manual says a negative insert size means the mate's mapping position is smaller than the current one, but here it appears to be the reverse.
                    Also, in this pair the mapping position fields are equal for both reads (2005683), yet the reads are not at the same position; they only overlap.

                    Can anybody help me? Thanks in advance.

                    GRC076_1_35_8988_3804/GRC076_1_35_8988_3804 pPr1 NT_004350 2005683 255 76M = 2005683 101 TCCGGGTGGGGGCAGGGGCCCTGGAGGGGTCACTCGGCTGCCGTCTGTCACTTGGGTCCAGAGGAGCTTCTGGTGG CCCCCCCCCBBBBCCCCCCCCCCCCCCCB>CCCCCCCCCCCCDDBDCBDACDC>@B>BBBBBB@=BB>ABBB@?BC
                    GRC076_1_35_8988_3804/GRC076_1_35_8988_3804 pP2 NT_004350 2005683 255 76M = 2005683 -101 GTGGCCTCGGGAGCAAGGGTCAGACCCACCAGAAGCTCCTCTGGACCCAAGTGACAGACGGCAGCCGAGTGACCCC BCD@BB7@?<A<=AABBBCBBCCCCCCCCCCCC@CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC

                    Originally posted by lh3 View Post
                    @yxi

                    Please use "samtools view -X" to see a human readable FLAG. I agree that not specifying a better FLAG field initially is a shortcoming, but it is too late to change the spec at the moment. samtools view -X comes as a temporary hack which I find useful.

                    Could you suggest a better format for the aux fields or to make SAM simpler? Note that SAM should be both human readable and machine readable. The current form is the best we can come to so far. Genbank/EMBL files are human readable, but they cause a lot of troubles in parsing, and we do not want to go in that way again. I think the best solution to human readability is not to change the spec, but to write a script to print a SAM alignment in multiple lines in a beautiful way. If you want to contribute to such a script, that would be great. Thanks.
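
As a rough idea of what such a script could look like, here is a minimal Python sketch that prints one SAM record over multiple labelled lines. It assumes a tab-separated record with the 11 mandatory fields followed by optional TAG:TYPE:VALUE fields; the function name `format_record` and the example record are made up for illustration and are not part of samtools.

```python
# Names of the 11 mandatory SAM columns, in order.
MANDATORY = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
             "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]


def format_record(line):
    """Format one tab-separated SAM record as one 'LABEL  value' line per field."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(MANDATORY):
        raise ValueError("expected at least 11 tab-separated fields")
    rows = list(zip(MANDATORY, fields[:11]))
    # Optional fields have the form TAG:TYPE:VALUE; the value may contain colons.
    for aux in fields[11:]:
        tag, typ, value = aux.split(":", 2)
        rows.append((f"{tag} ({typ})", value))
    width = max(len(label) for label, _ in rows)
    return "\n".join(f"{label:<{width}}  {value}" for label, value in rows)


# Hypothetical record, for illustration only.
example = "r1\t99\tchr1\t100\t37\t4M\t=\t150\t54\tACGT\tIIII\tNM:i:0"
print(format_record(example))
```

This leaves the SAM file itself untouched and only changes how a record is displayed, which is the point of the suggestion above.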

                    Comment


                    • #40
                      Hello everyone, the information above helped me further understand the flag in the SAM format. But I still have problems fully understanding the flag, for example:
                      0x0002 the read is mapped in a proper pair
                      0x0004 the query sequence itself is unmapped
                      0x0008 the mate is unmapped
                      I don't know what "a proper pair" means, or the difference between "pair" and "mate". Could anyone help explain them?
                      My other question: I used TopHat on my paired-end Illumina reads, and the SAM file it produced looks like this:
                      FC30W3GAAXX:7:53:723:1789#0 73 1 487961 3 62M1849N13M * 0 0 ATCAGCTTCATTCCCTCAACAGTGTTCTTCTTCAACGGGCAGCACATGAAGGTCGACTATGGATCTCCAGATCAC 84AB:B@:=A=-9BB?BB>@7>A@ABBBBB=;@B:BABBB?B>@AC@@AAB=CCA6@>?ABB>9@@ACCCA@C@B NM:i:7 XS:A:+ NH:i:2
                      FC30W3GAAXX:7:89:981:2025#0 137 1 487982 3 41M1849N34M * 0 0 GTGTTCTTCTTCAACGGGCAGCACATGAAGGTCGACTATGGATCTCCAGATCACACCAAGTTTGTGGGAAGCTTC 8:;?6886<=:><6>8<=?>A:7=;8A?@@:BAA=A@@A@A@AAB@?@@@@B;B@AABA@BBABCCCBB@CBA;< NM:i:4 XS:A:+ NH:i:2
                      I want to know whether columns 7-9 (* 0 0) indicate that my data were not treated as paired-end?
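
As a quick check on the two records above, the flags 73 and 137 can be tested bit by bit. This is only a sketch using the bit values from the SAM specification; the helper name `pair_status` is made up for illustration.

```python
def pair_status(flag):
    """Report the pairing-related bits of a SAM FLAG value."""
    return {
        "paired": bool(flag & 0x1),
        "mate_unmapped": bool(flag & 0x8),
        "first_in_pair": bool(flag & 0x40),
        "second_in_pair": bool(flag & 0x80),
    }


for flag in (73, 137):
    print(flag, pair_status(flag))
```

Both flags have 0x1 and 0x8 set (73 = 64 + 8 + 1, 137 = 128 + 8 + 1), which would suggest the reads were treated as paired but their mates were not mapped. That is consistent with `* 0 0` in columns 7-9.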

                      Comment


                      • #41
                        In the context of SAM/BAM, a "pair" is two reads from either end of the same fragment of DNA; the "mate" is the partner read in a pair of reads.

                        Thus if you are looking at the forward or /1 read, the mate is the reverse or /2 read (and vice versa). The pair is the combination of the forward and reverse reads (or the /1 and /2 reads, depending on your naming convention).

                        With that in mind, does the FLAG bit field make more sense?

                        Comment


                        • #42
                          Thank you, maubp. I follow your explanation, so in my opinion,
                          0x0008 the mate is unmapped
                          0x0020 strand of the mate
                          these two bits are only used if the data are mate-pair reads, and are not useful for paired-end data?

                          Comment


                          • #43
                            Originally posted by northbio View Post
                            Thank you, maubp. I follow your explanation, so in my opinion,
                            0x0008 the mate is unmapped
                            0x0020 strand of the mate
                            these two bits are only used if the data are mate-pair reads, and are not useful for paired-end data?
                            If you have an unpaired read (i.e. singleton read where FLAG bit 0x0001 is not set), then it has no mate (no partner) so yes, 0x0008 and 0x0020 are meaningless and should not be set.
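
A small Python sketch of that rule, assuming the bit values from the SAM specification: it reports any pair-only bits that are set on a read whose 0x0001 bit is clear. The names `PAIR_ONLY_MASK` and `stray_pair_bits` are hypothetical, chosen just for this example.

```python
PAIRED = 0x1
# Bits that only carry meaning when the read is part of a pair.
PAIR_ONLY_MASK = 0x2 | 0x8 | 0x20 | 0x40 | 0x80


def stray_pair_bits(flag):
    """Return the pair-only bits set on an unpaired read (0 if there are none)."""
    if flag & PAIRED:
        return 0  # paired read: these bits are meaningful
    return flag & PAIR_ONLY_MASK


for flag in (16, 0x8 | 0x20, 99):
    print(flag, hex(stray_pair_bits(flag)))
```

A non-zero result means the record sets mate-related bits without claiming to be paired, so those bits should be ignored.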

                            Comment


                            • #44
                              Hello -
                              This may be a naive question, but if the bitwise flag is 4 in a .sam file, shouldn't there always be an asterisk in the RNAME column? I'm getting reads that have a 4 in the FLAG column but also a legitimate reference in the RNAME column. If I understand correctly (which I may not), the RNAME column refers to the place that read maps to, yet a 4 in the FLAG column means it's unmapped. ??? Any help on this would be greatly appreciated - I think I'm throwing out aligned reads because of the 4 in the FLAG column and that is suboptimal. Thank you!!!
                              SH1

                              Comment


                              • #45
                                Hi, I have a problem understanding the flag too. I used bwa sampe for alignment and used Picard's http://picard.sourceforge.net/explain-flags.html to decipher the meaning of the flags.

                                Below are the flags and their descriptions:
                                99 -
                                read paired
                                read mapped in proper pair
                                mate reverse strand
                                first in pair

                                151 -
                                read paired
                                read mapped in proper pair
                                read unmapped
                                read reverse strand
                                second in pair

                                The following example illustrates my doubt.

                                HWI-ST220_63:5:1101:6002:72582 99 chrX 166650106 0 50M = 166650248 192 TAGGGTTAGGGTTAGGGTTAGGGGTTAGGGTTAGGGTTAGGGTTAGGGTT CB@FFDEFHHHFHGIJJGHGJJJJGHCHJIFGGHIGGGGHIJFHGHIIGI XT:A:R NM:i:0 SM:i:0 AM:i:0 X0:i:9 X1:i:3 XM:i:0 XO:i:0 XG:i:0 MD:Z:50
                                HWI-ST220_63:5:1101:6002:72582 151 chrX 166650248 0 50M = 166650106 -192 AGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAG BJIJIJJJIJJJJJJJJJJJJJJJJJJJJJJJJJJJJHHHGHFFFFFCCC XT:A:R NM:i:0 SM:i:0 AM:i:0 X0:i:670 XM:i:0 XO:i:0 XG:i:0 MD:Z:50

                                Both reads are repetitive and each shows a reference position to which it maps. However, one read has flag 99 while its mate has 151. How can one be assigned a value that says mapped (99) and the other a value that says unmapped (151), unless the meaning given by Picard is wrong or I have misunderstood it?
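
One way to look at this pair is to check whether the two flags agree with each other: each read's "mate" bits (0x8, 0x20) should mirror the other read's own bits (0x4, 0x10). The Python sketch below does that comparison, assuming the bit values from the SAM specification. The function name `mate_flag_conflicts` is made up for illustration, and the check says nothing about why bwa set the flags this way.

```python
UNMAPPED, MATE_UNMAPPED = 0x4, 0x8
REVERSE, MATE_REVERSE = 0x10, 0x20


def mate_flag_conflicts(flag_a, flag_b):
    """List the places where one record's mate bits contradict the other record's own bits."""
    conflicts = []
    for name, own, mate in (("first", flag_a, flag_b), ("second", flag_b, flag_a)):
        if bool(own & MATE_UNMAPPED) != bool(mate & UNMAPPED):
            conflicts.append(f"{name} record's 0x8 disagrees with its mate's 0x4")
        if bool(own & MATE_REVERSE) != bool(mate & REVERSE):
            conflicts.append(f"{name} record's 0x20 disagrees with its mate's 0x10")
    return conflicts


print(mate_flag_conflicts(99, 151))
print(mate_flag_conflicts(99, 147))
```

For 99 and 151 the only disagreement is on the unmapped bit: 99 does not mark its mate as unmapped, while 151 marks itself as unmapped. The strand bits are consistent, and the more usual pair 99/147 shows no disagreement at all.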

                                Comment
