
  • RockChalkJayhawk
    replied
    Originally posted by nilshomer View Post
    You can use the MRNM and MPOS fields in the SAM file.
    So in that case, I would test whether MRNM does not equal "=", or MRNM equals "=" and the difference between POS and MPOS is greater than 1 million.

    Is this correct?
    Last edited by RockChalkJayhawk; 04-13-2010, 01:17 PM. Reason: Incorrect assumption
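The test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_pair`, the category labels, and the default 1 Mb threshold are made up for this example, and the input is assumed to be one tab-separated SAM alignment line with MRNM in column 7 and MPOS in column 8. It also checks FLAG bits 0x4 (read unmapped) and 0x8 (mate unmapped) first, since the coordinates are not meaningful in those cases.

```python
def classify_pair(sam_line: str, min_distance: int = 1_000_000) -> str:
    """Classify one paired-end SAM record by where its mate maps.

    Labels and the default threshold are illustrative, not part of SAM.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    rname, pos = fields[2], int(fields[3])
    mrnm, mpos = fields[6], int(fields[7])

    # 0x4: this read is unmapped; 0x8: the mate is unmapped.
    if flag & 0x4 or flag & 0x8 or rname == "*" or mrnm == "*":
        return "unmapped"
    # MRNM is "=" when the mate is on the same reference as the read.
    if mrnm != "=" and mrnm != rname:
        return "different_chromosomes"
    if abs(pos - mpos) > min_distance:
        return "distant_same_chromosome"
    return "nearby_same_chromosome"
```

Note that the decision is made from MRNM and MPOS rather than from the FLAG alone; the FLAG is only used to skip pairs where one end is unmapped.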



  • nilshomer
    replied
    Originally posted by RockChalkJayhawk View Post
    Let's say I have paired-end RNA-Seq data and I want to find out whether the mates map more than 1 Mb apart on the same chromosome or map to two different chromosomes. How do I determine that from the FLAGS?
    You can use the MRNM and MPOS fields in the SAM file.



  • RockChalkJayhawk
    replied
    FLAGS for fusion detection

    Let's say I have paired-end RNA-Seq data and I want to find out whether the mates map more than 1 Mb apart on the same chromosome or map to two different chromosomes. How do I determine that from the FLAGS?



  • jdiezperezj
    replied
    So, is it already possible to convert SOAP aligner output to SAM or BAM format?
    Best.
    Javi

    Originally posted by lh3 View Post
    To corthay:

    You are quick. I am planning a new bwa release as I realized that I could improve it a little without much work (PS: the new version is released now). Wgsim, wgsim_eval.pl and converters for soap and bowtie are available from SVN only:

    svn co https://samtools.svn.sourceforge.net...s/dev/samtools samtools



  • jeffhsu3
    replied
    If an insertion or deletion occurs at the end of the pileup read bases string, it doesn't seem to have the extra character after the '\+[0-9]+[ACGTNacgtn]+' pattern.

    For example:
    chr1 2263 C 4 ,$.$.,+1t CC9C FFFF.

    Am I missing something? The pattern is described here: pileup format, and it mentions the in/del pattern '\+[0-9]+[ACGTNacgtn]+' but there appears to be an extra character in the examples given on the page:

    seq2 156 A 11 .$......+2AG.+2AG.+2AGGG <975;:<<<<<

    That extra character appears to be missing if the in/del occurs at the end of the read bases string. Including that extra character as part of the insertion/deletion makes the number of read bases match the read count.
    Last edited by jeffhsu3; 04-05-2010, 12:03 PM. Reason: Made more clear and added examples.
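One way to check the examples above is to tokenize the read bases string and count reads, assuming each indel annotation belongs to the read base immediately before it and that the number after '+' or '-' gives the exact indel length. The sketch below makes those assumptions explicit; `count_reads` is a hypothetical helper, not part of samtools. It also shows why a greedy pattern such as '\+[0-9]+[ACGTNacgtn]+' is risky: in `.+2AGGG` it would swallow the two trailing G bases, which belong to other reads.

```python
import re

def count_reads(read_bases: str) -> int:
    """Count the reads represented in a pileup read-bases string."""
    count = 0
    i = 0
    while i < len(read_bases):
        c = read_bases[i]
        if c == "^":
            # Start of a read: '^' is followed by one mapping-quality character.
            i += 2
        elif c == "$":
            # End of a read: a marker only, not a base.
            i += 1
        elif c in "+-":
            # Indel annotation: sign, decimal length, then exactly that many bases.
            digits = re.match(r"\d+", read_bases[i + 1:]).group()
            i += 1 + len(digits) + int(digits)
        else:
            # A base call ('.', ',', a letter, or '*'): one read.
            count += 1
            i += 1
    return count
```

Under these assumptions the count equals the depth column for both lines quoted in this post (4 and 11), with no extra character needed after the indel.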



  • GoneSouth
    replied
    why do deletions in the pileup-file have a quality attached

    Hi guys,

    Does anyone know why deletions in the pileup file have a quality attached? How can a deletion have a quality?
    And how is this calculated?

    For example:

    YHet 23690 N 1 a-1n Q
    YHet 23691 N 1 * [
    YHet 23692 N 1 c [


    or

    YHet 25409 N 5 AAA-2NNa-2nnA-2NN VTW`a
    YHet 25410 N 5 A$A$*** USR`a
    YHet 25411 N 3 *** SG`


    best ro



  • drio
    replied
    Originally posted by Solyris View Post
    Hi,

    I am quite new to NGS data, and I work with commercial software from CLCbio called Genomic Workbench, which offers a mapping algorithm of its own.

    I want to convert the SAM output from the software to BAM so that I can use samtools functions such as pileup.

    I get the following error when I run the command on Ubuntu:

    >./samtools view -huS -o DATA/test.bam DATA/s_2_1_sequence_SS200_LAwMM.sam
    [samopen] SAM header is present: 24 sequences.
    Parse error at line 113: CIGAR and sequence length are inconsistent
    Aborted

    I read somewhere in this thread that samtools currently does not allow SAM file processing without the reference sequence, so is that what's causing the problem? If so, can anyone point me to a way to generate the correct reference sequence file? I tried reading through the manual, but nowhere does it say how the reference file should be formatted. I am working with the whole human reference genome as 24 gbk files from NCBI.

    Any help is appreciated.

    Thanks
    Sol
    samtools performs some sanity checks on the CIGAR string, and it is telling you something is not right. Have you looked at that particular alignment to confirm whether the CIGAR is correct?
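To inspect the offending alignment by hand, a check along these lines may help. It is a sketch of the rule in the SAM format (the lengths of the CIGAR operations that consume query bases should add up to the length of SEQ), not the actual samtools code; the helper names are hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume bases of the read sequence (SEQ).
QUERY_CONSUMING = set("MIS=X")

def cigar_query_length(cigar: str) -> int:
    """Return the number of SEQ bases implied by a CIGAR string."""
    ops = CIGAR_OP.findall(cigar)
    # Reject strings with characters the pattern did not account for.
    if "".join(n + op for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return sum(int(n) for n, op in ops if op in QUERY_CONSUMING)

def cigar_matches_seq(cigar: str, seq: str) -> bool:
    """True if CIGAR and SEQ lengths agree ('*' means not available)."""
    if cigar == "*" or seq == "*":
        return True
    return cigar_query_length(cigar) == len(seq)
```

Running `cigar_matches_seq` on columns 6 and 10 of line 113 of the SAM file should show whether the mapper wrote a CIGAR that disagrees with the read length.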



  • Solyris
    replied
    Hi,

    I am quite new to NGS data, and I work with commercial software from CLCbio called Genomic Workbench, which offers a mapping algorithm of its own.

    I want to convert the SAM output from the software to BAM so that I can use samtools functions such as pileup.

    I get the following error when I run the command on Ubuntu:

    >./samtools view -huS -o DATA/test.bam DATA/s_2_1_sequence_SS200_LAwMM.sam
    [samopen] SAM header is present: 24 sequences.
    Parse error at line 113: CIGAR and sequence length are inconsistent
    Aborted

    I read somewhere in this thread that samtools currently does not allow SAM file processing without the reference sequence, so is that what's causing the problem? If so, can anyone point me to a way to generate the correct reference sequence file? I tried reading through the manual, but nowhere does it say how the reference file should be formatted. I am working with the whole human reference genome as 24 gbk files from NCBI.

    Any help is appreciated.

    Thanks
    Sol



  • seq_GA
    replied
    Hi,

    I have a few questions about samtools.

    1. I am using ELAND mapping output and have started using export2sam.pl. All the PF reads from the export file are used for downstream analysis.
    2. How are uniquely mapped and multiply mapped reads handled by the pileup command?
    3. The extended CIGAR column always shows 35M (i.e. the length of the read). How is the mismatch information incorporated? Column 15 of the export file contains the match descriptor.

    Thanks.
    Last edited by seq_GA; 02-28-2010, 08:57 PM.



  • ylc
    replied
    pick chromosome before bam sort?

    A newbie question:
    I can view by chromosome after a .bam file is sorted and indexed. Is it possible to extract a single chromosome from the BAM file first and then do the sorting and indexing? That would save time if I'm only interested in certain chromosomes and have many samples.

    Thanks.
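As a rough illustration of the idea in this question, unsorted alignments can be filtered by reference name in a single pass, because the reference name is simply column 3 (RNAME) of every SAM record. The sketch below assumes the input is already SAM text (for example, the output of `samtools view -h` on the unsorted BAM); `filter_sam_by_reference` is a hypothetical helper, and the thread does not say whether this is the recommended route.

```python
from typing import Iterable, Iterator

def filter_sam_by_reference(lines: Iterable[str], wanted: str) -> Iterator[str]:
    """Yield header lines plus the alignments whose RNAME equals `wanted`."""
    for line in lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
        elif line.split("\t", 3)[2] == wanted:
            # Column 3 (index 2) is RNAME, the reference the read maps to.
            yield line
```

The filtered lines could then be converted back to BAM, sorted, and indexed, so that only the chromosome of interest goes through the expensive steps.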



  • nilshomer
    replied
    Originally posted by NSTbioinformatics View Post
    I think "4" is the "unmapped" flag, is that right?

    What is the difference between flag 4 and 20?

    Thank you very much.
    Use the "-X" option in "samtools view", it will probably help your interpretation of the FLAG field.
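For reference, the FLAG is a bitwise field, so a value can also be decoded by hand. The sketch below uses the bit values from the SAM format; the label strings and the function name `decode_flag` are chosen for this example only.

```python
# Bit values follow the SAM format; the label strings are illustrative.
FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse_strand"),
    (0x20, "mate_reverse_strand"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
]

def decode_flag(flag: int) -> list[str]:
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS if flag & bit]

print(decode_flag(4))   # ['unmapped']
print(decode_flag(20))  # ['unmapped', 'reverse_strand']
```

So 20 = 4 + 16: the unmapped bit together with the reverse-strand bit.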



  • NSTbioinformatics
    replied
    I think "4" is the "unmapped" flag, is that right?

    What is the difference between flag 4 and 20?

    Thank you very much.



  • lh3
    replied
    This read is mapped to the junction between two adjacent references, so it gets an "unmapped" flag.



  • NSTbioinformatics
    replied
    Question about the output of bwa?

    I got the output, see below:
    HWI-EAS307:1:54:758:902#0 20 19641_CLSZ1904.b1_P20.ab1_CLSZ_L._sativa_library_forward_335 301 20 36M * 0 0 CAAATCGGTGTGTTTTCACTGGTCGTGCTCGTTCCG aabaaaaaaaaababaa`aaaabaabaabbabaaaa XT:A:U NM:i:1 X0:i:1 X1:i:2 XM:i:1 XO:i:0 XG:i:0 MD:Z:35T0 XA:Z:13134_QGB27J17.yg.ab1_QGB_L._sativa_library_forward_448,-58,36M,2;7061_CLS_S3_Contig6993_CLS_S3_L._sativa_library_forward_968,-404,36M,2;

    I cannot understand the flag value 20. I used "samse" to process single reads.
    "XT:A:U" indicates the read mapped uniquely to the reference, so why do I still get XA with alternative alignment information?
    This confuses me. Could someone help me with this? Thank you very much.
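To look at the alternative hits more closely, the XA tag can be split into its entries. The sketch below assumes the bwa layout in which each entry is "reference,signed position,CIGAR,edit distance" followed by ";", with the sign of the position giving the strand; `parse_xa` is a hypothetical helper.

```python
def parse_xa(tag: str) -> list[dict]:
    """Parse a bwa XA:Z: tag into a list of alternative hits."""
    prefix = "XA:Z:"
    if not tag.startswith(prefix):
        raise ValueError("not an XA tag")
    hits = []
    for entry in tag[len(prefix):].split(";"):
        if not entry:
            continue  # the tag ends with ';', leaving one empty piece
        # Split from the right so the last three fields are always found.
        ref, pos, cigar, nm = entry.rsplit(",", 3)
        hits.append({
            "ref": ref,
            "strand": "-" if pos.startswith("-") else "+",
            "pos": abs(int(pos)),
            "cigar": cigar,
            "nm": int(nm),
        })
    return hits
```

Applied to the record above, this yields two hits, both with an edit distance of 2, while the reported alignment has NM:i:1. That appears consistent with X0:i:1 (one best hit) and X1:i:2 (two suboptimal hits): the read can be unique in its best hit and still list suboptimal alternatives in XA.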



  • lh3
    replied
    Try breakdancer.
