Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • @luisczul

    Samtools indicates that the error happens to line 164507. What does that line look like?As for the second problem, it seems like a bug. You are using very short reference. Could you send me an example? Thanks.

    @henry
    Are you generating results with "samse -n"? With -n, the output is NOT sam. You can tell this from the bwa manual page.

    These three questions are less relevant to samtools. They are mostly related to bwa.

    Comment


    • Originally posted by lh3 View Post
      @luisczul

      Samtools indicates that the error happens to line 164507. What does that line look like?As for the second problem, it seems like a bug. You are using very short reference. Could you send me an example? Thanks.

      @henry
      Are you generating results with "samse -n"? With -n, the output is NOT sam. You can tell this from the bwa manual page.

      These three questions are less relevant to samtools. They are mostly related to bwa.
      hi, lh3,
      I ran ./bwa samse -n 255 ref.fa tissue.sai tissue.fastq> tissue.sam
      I will try bwa without setting -n.
      Thank you for your help.

      Best

      Jing

      Comment


      • Originally posted by lh3 View Post
        For NGS data analysis, an aligner tends to be successful when it comes with utilities for comprehensive downstream analyses such as reference based assembly, SNP/indel calling and alignment viewer. Eland/GAPipeline, Soap and Maq are such examples. Unfortunately, it is non-trivial to implement all these downstream analyses and implementing these for each aligner would be a waste of time and human resources as well. Mostly we want to separate alignment from the downstream analyses after the alignment. To achieve this, we need a generic alignment format that makes all aligners happy. NovoAlign and Bowtie can output Maq alignment format to take the advantage of Maq downstream data processing. However, Maq format does not really suit the goal. It does not support longer reads nor alignment with more than one indel and it is too specific to Maq. To solve this problem, the 1000Genome Project Committee decided to develop a generic alignment format. And now the first version of specification and implementation have come out.

        The new alignment format, SAM (Sequence Alignment/Map), is the collaborative result of several major genome centres. It eliminates the major defects of Maq format while retaining its advantages. We also migrated and improved various downstream data processing implemented in Maq/Maqview, such as indexing, pileup, viewer and consensus caller. For more information, please check website:



        I hope samtools may help aligner developers to promote their own software: once a program can generate alignment in SAM format, Maq-like downstream analysis will be available right now.
        With SAMtools, can I extract the number of unique aligned reads, the number of aligned reads with one mismatches, the number of aligned reads with two mismatches?
        With SAMtools, can I extract the number of aligned reads in each chromosome?

        Thank you a lot.

        Jing

        Comment


        • Does anyone have a sense for why the output of 'samtools flagstat' doesn't match the read numbers in the original fastq file? Total reads reported is actually the number of reads mapped, according to maq.

          The other numbers (mapped, paired) also seem out. This happens both when converting from maq with maq2sam, and with SSAHA2's sam output. I've compared the output produced by 'maq map' vs samtools flagstat, and the numbers are different.

          Comment


          • Originally posted by tgolubch View Post
            Does anyone have a sense for why the output of 'samtools flagstat' doesn't match the read numbers in the original fastq file? Total reads reported is actually the number of reads mapped, according to maq.

            The other numbers (mapped, paired) also seem out. This happens both when converting from maq with maq2sam, and with SSAHA2's sam output. I've compared the output produced by 'maq map' vs samtools flagstat, and the numbers are different.
            I agree with your assessment. It is based on the aligner, since most aligners do not include unmapped reads in their SAM file. This does not happen with my own software BFAST. If you want a more accurate assessment, see the "dbamstats" utility in the DNAA package.

            Nils

            Comment


            • Thanks, Nils, that clears up the first question. I'm still not sure why the other numbers are different.

              Re BFAST - looks interesting. Can BFAST produce SAM output files?

              Comment


              • DNAA and SAM

                Does DNAA have any documentation?

                Comment


                • Originally posted by tgolubch View Post
                  Thanks, Nils, that clears up the first question. I'm still not sure why the other numbers are different.

                  Re BFAST - looks interesting. Can BFAST produce SAM output files?
                  BFAST outputs to the SAM format, as well as UCSC MAF, GFF, as well as other formats. Also note it includes unmapped reads in the SAM file, and for SOLiD reads it retains the color space information, which is critically important for variant discovery.

                  Originally posted by ech View Post
                  Does DNAA have any documentation?
                  "Release early, Release often" ~ Eric S. Raymond

                  So this is the "early" step. Unfortunately there is very little actual documentation (beyond the code) and I hope to get it up soon. PM me if you want examples or if you want to know how use the package to solve your problems.

                  Comment


                  • Is there a way to install BFAST without root access? I don't want it going into /usr/bin (and don't have access to it on the work server in any case).

                    Comment


                    • Originally posted by tgolubch View Post
                      Is there a way to install BFAST without root access? I don't want it going into /usr/bin (and don't have access to it on the work server in any case).
                      Technically you can't install it to /usr/bin without root access. I assume you have a typically Unix system and you use BASH.

                      A useful trick to "fake" an install is to make your own "bin" directory and add the "bin" directory to your path. If your "bin" is found at "$HOME/bin", add this line to your "$HOME/.bash_profile" (assuming bash):
                      Code:
                      PATH=$PATH:/home/<username>/bin
                      Now you can put everyting in "$HOME/bin".

                      You can install BFAST in "$HOME/bin" by using the "prefix" option in the "configure" script:
                      Code:
                      ./configure --prefix=$HOME
                      (This is not a typo, --prefix=$HOME/bin would install it in $HOME/bin/bin).
                      Now run:
                      Code:
                      make
                      make install
                      and everything should be installed into $HOME/bin.

                      Nils

                      Comment


                      • hi heng
                        I used bwa to generate a sam file.It works well with samtools, but when I use picard ExampleSamUsage, it reprots Exception:
                        Exception in thread "main" net.sf.samtools.SAMFormatException: Error parsing text SAM file. MRNM specified but flags indicate unpaired; File e:\temp\aln.sam; Line 1
                        Line: >chrY_1 0 chrY 2154703 37 45M = 2154703 0 ATGGTGGCGAGCGCCTGTAGTCCCAGCTACTCGGGAGGCAGGAGA hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:45
                        at net.sf.samtools.SAMTextReader.reportErrorParsingLine(SAMTextReader.java:168)
                        at net.sf.samtools.SAMTextReader.access$500(SAMTextReader.java:40)
                        at net.sf.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:316)
                        at net.sf.samtools.SAMTextReader$RecordIterator.<init>(SAMTextReader.java:202)
                        at net.sf.samtools.SAMTextReader$RecordIterator.<init>(SAMTextReader.java:188)
                        at net.sf.samtools.SAMTextReader.getIterator(SAMTextReader.java:124)
                        at net.sf.samtools.SAMFileReader.iterator(SAMFileReader.java:197)
                        at net.sf.samtools.SAMFileReader.iterator(SAMFileReader.java:39)
                        at net.sf.samtools.example.ExampleSamUsage.convertReadNamesToUpperCase(ExampleSamUsage.java:54)
                        at net.sf.samtools.example.ExampleSamUsage.main(ExampleSamUsage.java:93)
                        Java Result: 1
                        Dose picard is well tested to use now?

                        Comment


                        • Picard is well tested. It reports error because it is more rigorous than samtools. This error is actually a bwa problem, not picard. You may try the latest bwa-0.5.2. I believe I have fixed the issue.

                          Comment


                          • Hi,

                            I obtained complete genome sequence data in the form of “.bam” alignment files from ftp://ftp-trace.ncbi.nih.gov/1000genomes/ and I tried to extract a sub alignment of a region of chromosome 1 with the "view" command.


                            ./samtools view -q 30 -t ../human_b36_male.fa.fai ../NA12878.SLX.maq.SRP000031.2009_08.bam chr1:150,792,101–150,884,101

                            I get the following:

                            [sam_header_read2] 114 sequences loaded.
                            [main_samview] random alignment retrieval only works for indexed BAM files.

                            Could anyone help me fix this problem? Thanks a lot.
                            M.

                            Comment


                            • Originally posted by Jeckow View Post
                              Hi,

                              I obtained complete genome sequence data in the form of “.bam” alignment files from ftp://ftp-trace.ncbi.nih.gov/1000genomes/ and I tried to extract a sub alignment of a region of chromosome 1 with the "view" command.


                              ./samtools view -q 30 -t ../human_b36_male.fa.fai ../NA12878.SLX.maq.SRP000031.2009_08.bam chr1:150,792,101–150,884,101

                              I get the following:

                              [sam_header_read2] 114 sequences loaded.
                              [main_samview] random alignment retrieval only works for indexed BAM files.

                              Could anyone help me fix this problem? Thanks a lot.
                              M.
                              It looks like you either have not sorted the BAM file (samtools sort) or you have you have sorted the BAM file but have not created the index (samtools index).

                              Comment


                              • Originally posted by lh3 View Post
                                Picard is well tested. It reports error because it is more rigorous than samtools. This error is actually a bwa problem, not picard. You may try the latest bwa-0.5.2. I believe I have fixed the issue.
                                Yes, I used bwa-0.5.0, bwa-0.5.2 have no such bugs at all.

                                Excellent wrok!!!

                                Comment

                                Latest Articles

                                Collapse

                                • seqadmin
                                  Recent Advances in Sequencing Analysis Tools
                                  by seqadmin


                                  The sequencing world is rapidly changing due to declining costs, enhanced accuracies, and the advent of newer, cutting-edge instruments. Equally important to these developments are improvements in sequencing analysis, a process that converts vast amounts of raw data into a comprehensible and meaningful form. This complex task requires expertise and the right analysis tools. In this article, we highlight the progress and innovation in sequencing analysis by reviewing several of the...
                                  05-06-2024, 07:48 AM
                                • seqadmin
                                  Essential Discoveries and Tools in Epitranscriptomics
                                  by seqadmin




                                  The field of epigenetics has traditionally concentrated more on DNA and how changes like methylation and phosphorylation of histones impact gene expression and regulation. However, our increased understanding of RNA modifications and their importance in cellular processes has led to a rise in epitranscriptomics research. “Epitranscriptomics brings together the concepts of epigenetics and gene expression,” explained Adrien Leger, PhD, Principal Research Scientist...
                                  04-22-2024, 07:01 AM

                                ad_right_rmr

                                Collapse

                                News

                                Collapse

                                Topics Statistics Last Post
                                Started by seqadmin, Yesterday, 06:35 AM
                                0 responses
                                14 views
                                0 likes
                                Last Post seqadmin  
                                Started by seqadmin, 05-09-2024, 02:46 PM
                                0 responses
                                19 views
                                0 likes
                                Last Post seqadmin  
                                Started by seqadmin, 05-07-2024, 06:57 AM
                                0 responses
                                17 views
                                0 likes
                                Last Post seqadmin  
                                Started by seqadmin, 05-06-2024, 07:17 AM
                                0 responses
                                19 views
                                0 likes
                                Last Post seqadmin  
                                Working...
                                X