  • simonandrews
    replied
    Originally posted by aurelielaugraud
    We want to use SAM format but we don't understand how to use the flag field. Does anyone have an example for us, please?
    Although the SAM flag field appears as a single number, it actually contains several pieces of information that have been combined together. It is a bitwise field, which means it uses the way computers represent numbers to pack several small values into one larger value.

    If you think of a standard integer as being composed of 32 bits (0 or 1) then it would look like:

    00000000000000000000000000000000

    However, SAM uses this single number as a series of boolean (true/false) flags, where each position in the array of bits represents a different sequence attribute:

    Bit 0 = The read was part of a pair during sequencing
    Bit 1 = The read is mapped in a pair
    Bit 2 = The query sequence is unmapped
    Bit 3 = The mate is unmapped
    Bit 4 = Strand of query (0=forward 1=reverse)

    etc.

    Constructing the value from the individual flags is fairly easy. If a flag is false, don't add anything to the total. If it's true, add 2 raised to the power of the bit position.

    For example:

    Bit 0 - false - add nothing
    Bit 1 - true - add 2**1 = 2
    Bit 2 - false - add nothing
    Bit 3 - true - add 2**3 = 8
    Bit 4 - true - add 2**4 = 16

    Bit pattern = 11010 = 16+8+2 = 26

    So the flag value would be 26.
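
The same arithmetic can be sketched in Python. This is only an illustration of the example above: the names `FLAG_BITS` and `build_flag` are made up for this sketch, and only the five bits listed in this post are covered.

```python
# Bit positions for the five attributes listed above.
# The attribute names are illustrative, not taken from any library.
FLAG_BITS = {
    "paired": 0,
    "proper_pair": 1,
    "query_unmapped": 2,
    "mate_unmapped": 3,
    "reverse_strand": 4,
}

def build_flag(**attributes):
    """Combine boolean attributes into a single integer flag."""
    flag = 0
    for name, is_set in attributes.items():
        if is_set:
            # Setting bit n is the same as adding 2**n to the total.
            flag |= 1 << FLAG_BITS[name]
    return flag

flag = build_flag(proper_pair=True, mate_unmapped=True, reverse_strand=True)
print(flag, format(flag, "05b"))  # 26 11010
```

Using `|=` rather than `+=` means a bit that is set twice by mistake is not counted twice.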

    To extract the individual flags from the compound value you can use a bitwise AND operation. This tells you whether a specific bit in the compound value is set. The exact syntax will depend on the language you're using, but in Perl, for instance, you could do:

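    my $compound = 26;  # example flag value from above

    # Bitwise AND with 16 (2**4) tests the strand bit
    if ($compound & 16) {
        print "Reverse\n";
    }
    else {
        print "Forward\n";
    }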

    This extracts the information from bit 4 (2**4 = 16), the strand flag.
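
The same test can be applied to every bit in turn to decode a whole flag. A minimal Python sketch, assuming only the five bits listed earlier in this post; `FLAG_NAMES` and `decode_flag` are hypothetical names chosen for the example.

```python
# Names for bits 0 to 4, in bit order (illustrative labels only).
FLAG_NAMES = [
    "paired",
    "proper_pair",
    "query_unmapped",
    "mate_unmapped",
    "reverse_strand",
]

def decode_flag(flag):
    """Return the names of the bits (0 to 4) that are set in flag."""
    # flag & (1 << bit) is non-zero only when that bit is set.
    return [name for bit, name in enumerate(FLAG_NAMES) if flag & (1 << bit)]

print(decode_flag(26))  # ['proper_pair', 'mate_unmapped', 'reverse_strand']
```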

    I hope that makes it a bit clearer.

  • aurelielaugraud
    replied
    Hello, and thanks for taking the time to reply.
    I have already had a look at the samtools manual regarding the flag section, but I am sorry, it doesn't seem to be enough for me. I would really appreciate an example, or some documentation on how to use bitwise data.
    Maybe I can leave the MAPQ field at 255, as indicated in the manual, if the program I use to map the reads doesn't calculate this value for me. (The further I get, the less I want to calculate it myself.)

  • totalnew
    replied
    The FLAG field is normally used together with other attributes to interpret the read, for example whether the read is mapped or whether it is paired. See the description table in the samtools manual.

    MAPQ calculation is complicated. The mapping quality is the phred-scaled probability that a read alignment is wrong. In practice, MAPQ is approximated in various ways. bwa approximates the MAPQ as 0, 23, 25, 37, 255, ... depending on factors such as the number of best hits and the number of second-best hits.
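
"Phred-scaled" means MAPQ = -10 * log10(p), rounded to an integer, where p is the probability that the mapping is wrong. The Python sketch below shows only that conversion. How p is estimated is specific to each aligner and is not shown; the function names and the cap of 254 (chosen here so the result never collides with 255, the "not available" value) are assumptions made for this example.

```python
import math

def mapq_from_error_prob(p_wrong, cap=254):
    """Phred-scale an error probability: -10 * log10(p), rounded."""
    if p_wrong <= 0:
        # log10(0) is undefined; returning the cap is an assumption.
        return cap
    return min(cap, round(-10 * math.log10(p_wrong)))

def error_prob_from_mapq(mapq):
    """Inverse conversion: the error probability implied by a MAPQ."""
    return 10 ** (-mapq / 10)

print(mapq_from_error_prob(0.01))   # 20
print(mapq_from_error_prob(0.001))  # 30
```

So a MAPQ of 20 corresponds to a 1 in 100 chance that the alignment is wrong, and a MAPQ of 30 to 1 in 1000.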

  • aurelielaugraud
    started a topic bitwise flag in sam format and others

    Hello,
    I am new to the seqanswers community and started working with NGS data (Illumina) about 2 months ago.
    We align with GEM (not published yet, but quite good as far as we have tried it). We tried Maq, but it doesn't seem to work on our machines for the moment (it says it is running but writes nothing), and I am currently running bwa as another test.
    We want to use SAM format but we don't understand how to use the flag field. Does anyone have an example for us, please? I know bwa output can be converted directly, but we would need to do it ourselves for the GEM output.

    Moreover, there is a question about the MAPQ field. I don't think we have any direct information about it. Can we infer it? How is it calculated? We have phred-based quality scores for each base.

    Thanks to the community.
