Odd characters in samtools mpileup output


  • Odd characters in samtools mpileup output

    I'm struggling to figure out what some of the characters in my samtools mpileup output are. Here's one of the offending bases (scaffold1:25513). There are many.

    Code:
    scaffold1       25513   G       20      <<,,,,,a,A,aa,a,AA..    #A.CCFG6G6F67E:F<6GF
    So it tells me that the read depth is 20, and this is confirmed by counting the number of characters in each of the last two columns. But I have absolutely no idea what the "<" character represents in the read_bases column (column #5).

    The only special characters I'm expecting to see are '.' and ',' (matches on the forward and reverse strands), '+' and '-' (insertions and deletions), '^' (the start of a read, followed by a character encoding the read's mapping quality), and '$' (the end of a read).
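
    To make the "counting the characters" check robust against those special characters, here is a minimal sketch of a counter for column 5. The function name `count_read_symbols` is hypothetical (it is not part of samtools), and the sketch assumes a well-formed read_bases string: '^' is always followed by one mapping-quality character, and '+'/'-' are always followed by a length and that many bases.

```python
import re


def count_read_symbols(read_bases):
    """Count one symbol per read in an mpileup read_bases string.

    Hypothetical helper, assuming a well-formed column 5.
    """
    count = 0
    i = 0
    while i < len(read_bases):
        c = read_bases[i]
        if c == "^":
            # Read-start marker: the next character encodes mapping
            # quality and is not a base call, so skip both.
            i += 2
        elif c == "$":
            # Read-end marker: attached to the preceding base call.
            i += 1
        elif c in "+-":
            # Indel: '+' or '-', then a decimal length, then that many bases.
            digits = re.match(r"\d+", read_bases[i + 1:]).group()
            i += 1 + len(digits) + int(digits)
        else:
            # Anything else ('.', ',', ACGTN/acgtn, '*', '<', '>')
            # stands for exactly one read at this position.
            count += 1
            i += 1
    return count


# The line from the post: 20 symbols, matching the depth in column 4.
print(count_read_symbols("<<,,,,,a,A,aa,a,AA.."))
```

    Note that '<' is counted as a read here, which is consistent with the reported depth of 20.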

    So can anyone tell me what '<' means in column 5?


    EDIT: To answer my own question somewhat, '<' and '>' indicate a "reference skip" according to the mpileup documentation. (Although they are not mentioned in the pileup format documentation, which is why I couldn't find them.) However, I have absolutely no idea what "reference skip" means, so I'm still out of luck. If it's referring to a base that is not covered (e.g. due to splicing) then shouldn't the coverage ideally be reported as 18, not 20?
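
    The 18-versus-20 arithmetic can be sketched as follows. This is a hedged illustration, not samtools behaviour: it assumes a reference skip is what mpileup prints when a read's alignment spans the position with a CIGAR `N` operation (e.g. a spliced intron), and that the column contains no '^', '+' or '-' markers, which holds for the example line but not in general. The function name `depth_without_skips` is hypothetical.

```python
from collections import Counter


def depth_without_skips(read_bases):
    """Return (total symbols, reference skips, symbols that are not skips).

    Hypothetical helper; assumes read_bases has no '^', '+' or '-'
    markers, so every character is one read.
    """
    counts = Counter(read_bases)
    skips = counts["<"] + counts[">"]
    total = len(read_bases)
    return total, skips, total - skips


total, skips, covered = depth_without_skips("<<,,,,,a,A,aa,a,AA..")
print(total, skips, covered)
```

    For the example line this gives 20 symbols in total, 2 of them reference skips, leaving 18 reads with an actual base call at scaffold1:25513.
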
    Last edited by Bueller_007; 08-26-2011, 04:52 PM.
