  • Hi lh3,

    When samtools merges two sorted BAM files, is the merged file already sorted, or does it have to be sorted again?

    Based on the example, to merge three BAM files, does a file with three header lines (even if they are identical) have to be generated beforehand?

    What's the difference between maq SNPfilter and samtools varFilter?

    Thanks

    • Number of best hits in BWA alignment

      We are using BWA to perform short-read alignments, and we have two questions.
      1) How can we extract statistics on the number of best hits for a read from the alignment file? In particular, we are interested in counting "uniquely mapped reads".
      In the BWA manual page (http://bio-bwa.sourceforge.net/bwa.shtml), the "X0" field appears to contain the number of best hits, but some mapped reads do not have an "X0" field (see the attached file).

      2) The manual page also mentions an alignment score (the "AS" field), but we found no "AS" field in the output file.

      The commands used to generate the alignment were:
      bwa aln -q 34 Populus_trichocarpa.v2.fa s_1_1_sequence.txt > s_1_1_sequence_q34.sai

      bwa aln -q 34 Populus_trichocarpa.v2.fa s_1_2_sequence.txt > s_1_2_sequence_q34.sai
      bwa sampe Populus_trichocarpa.v2.fa s_1_1_sequence_q34.sai s_1_2_sequence_q34.sai s_1_1_sequence.txt s_1_2_sequence.txt > s_1_sequence_q34.sam

      Thank you for your help

      • 1) Count how many reads mapped with non-zero mapping quality. The X0 tag is not always reliable.

        2) The AS tag is optional, and bwa sampe/samse does not produce it. See "NM" for the number of mismatches/gaps.
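
The suggestion in 1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is an uncompressed SAM file read line by line, MAPQ is taken from column 5 and the unmapped bit (0x4) from the FLAG in column 2, and the function name `count_nonzero_mapq` and the three example records are made up for the demonstration, not part of BWA or SAMtools. It counts alignment records, so with paired-end data each mate is counted separately.

```python
def count_nonzero_mapq(sam_lines):
    """Count SAM records that are mapped and have MAPQ > 0.

    `sam_lines` is any iterable of SAM text lines (for example an open file).
    """
    count = 0
    for line in sam_lines:
        # Skip blank lines and header lines (@HD, @SQ, @PG, ...).
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])   # column 2: bitwise FLAG
        mapq = int(fields[4])   # column 5: mapping quality
        if flag & 0x4:          # 0x4 set means the read is unmapped
            continue
        if mapq > 0:
            count += 1
    return count


# Hypothetical records: one confident hit, one MAPQ-0 hit, one unmapped read.
example = [
    "@SQ\tSN:scaffold_1\tLN:1000",
    "read1\t0\tscaffold_1\t100\t37\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t0\tscaffold_1\t200\t0\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGT\tIIIIIIII",
]
print(count_nonzero_mapq(example))  # prints 1
```

For a real file, something like `with open("s_1_sequence_q34.sam") as fh: print(count_nonzero_mapq(fh))` should work, since the function only needs an iterable of lines.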

        • Consensus missing NNNN... at the end

          This may be a fairly minor issue in samtools.pl pileup2mq. The consensus sequence extracted from the pileup file loses the run of NNNN... at the end, which makes the consensus slightly shorter than the reference. All the other gaps, including the leading NNN..., are included in the consensus, but not the last one.

          As far as I can tell, the gap information is not written to the sorted file during sorting. In the samtools.pl script, however, Heng tries to put the gapped sequence back into the consensus. All the gaps are placed in the right position except for the one at the end.

          BTW, I really like the idea of the SAM format and SAMtools. It is a really good format manipulator for the bioinformatics community.

          • Originally posted by luisczul
            I also got the following error while producing the SAM file

            GAGACTNAGCACNCAACGGA -",883&,,:#/1-"6&2)-":#6=67*#9,9&/0&3)-"3+=4<-"&5
            /mnt/scratch/awadalla/czuldieg/392Tmod/sources/I392T:35_18_1596 4 * 0 0 * * 0 0 NCCGGCATAGCTNTACCNTTAGACATAGCTCCGTCNCAGACNAACCCCA -"6;:<9>7=5<$-"4?>5-"67:&5=;8&6/0=2.7=-"37*3=-"1:
            [bns_coor_pac2real] bug! Coordinate is longer than sequence (4294967295>=4929). Abort!
            Aborted
            hi luisczul,
            I ran into the same problem. I was wondering how you fixed this?

            • I am wondering how to use SAMtools to convert the ELAND alignment files s_N_sorted.txt or s_N_export.txt into SAM format.
              I did not find a way to do it in the SAMtools manual.

              • Could someone help me compile SAMtools?

                See the errors:
                make[1]: Entering directory `/home/jta/biosoft/samtools-0.1.7a'
                gcc -c -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_USE_KNETFILE -D_CURSES_LIB=1 bam_tview.c -o bam_tview.o
                bam_tview.c:5:20: error: curses.h: No such file or directory
                bam_tview.c:7:2: warning: #warning "_CURSES_LIB=1 but NCURSES_VERSION not defined; tview is NOT compiled"
                bam_tview.c:409:2: warning: #warning "No curses library is available; tview is disabled."
                make[1]: *** [bam_tview.o] Error 1
                make[1]: Leaving directory `/home/jta/biosoft/samtools-0.1.7a'
                make: *** [all-recur] Error 1
                ***************
                It reports a missing curses.h file. Where is it?

                Thanks a lot.

                • Originally posted by NSTbioinformatics
                  Could someone help me compile SAMtools?

                  See the errors:
                  make[1]: Entering directory `/home/jta/biosoft/samtools-0.1.7a'
                  gcc -c -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_USE_KNETFILE -D_CURSES_LIB=1 bam_tview.c -o bam_tview.o
                  bam_tview.c:5:20: error: curses.h: No such file or directory
                  bam_tview.c:7:2: warning: #warning "_CURSES_LIB=1 but NCURSES_VERSION not defined; tview is NOT compiled"
                  bam_tview.c:409:2: warning: #warning "No curses library is available; tview is disabled."
                  make[1]: *** [bam_tview.o] Error 1
                  make[1]: Leaving directory `/home/jta/biosoft/samtools-0.1.7a'
                  make: *** [all-recur] Error 1
                  ***************
                  It reports a missing curses.h file. Where is it?

                  Thanks a lot.
                  The ncurses library is not installed on your system.
                  Fire up your package manager and install the ncurses-lib package.
                  -drd

                  • Thank you very much. Indeed, after installing the ncurses-lib package, it works.

                    However...
                    I still do not know how to use SAMtools to convert ELAND alignments into SAM format.

                    Does anyone have experience with this?

                    • Originally posted by lh3
                      To lparsons:

                      After you compile samtools with "make", you will find "maq2sam-short" and "maq2sam-long" in the "misc/" directory. There is also a script "export2sam.pl" that converts Illumina's export to SAM. I have not thoroughly tested this script on all export files, though.
                      So this doesn't work for Illumina 1.3+?

                      • Yes, you are right.
                        .../samtools-0.1.7a/misc/export2sam.pl
                        does not work for s_N_sorted.txt and s_N_export.txt from GA pipeline 1.5

                        Any other solution?

                        • Li Heng, could you help us solve it?

                          Thank you very much.

                          • I found that few programs actually handle the 1.3+ formats. I resort to realigning everything, even though the output of my in-house ChIP-seq facility is in ELAND format. Most people around here do it like that anyway, using either Bowtie, Maq, or bwa.

                            • Do you have any more experience converting ELAND to SAM or Maq to share with us?

                              Although writing a script to convert ELAND format to SAM format is not difficult, it would take some time.

                              • Originally posted by NSTbioinformatics
                                Yes, you are right.
                                .../samtools-0.1.7a/misc/export2sam.pl
                                does not work for s_N_sorted.txt and s_N_export.txt from GA pipeline 1.5

                                Any other solution?
                                NST,

                                Did you try the script on the s_N_export.txt file or the s_N_sorted.txt file? The script will work on an export file (as the name implies) but not on a sorted file. The last column of the export file shows whether the read in question is a PF read or not [Y/N]. Since the sorted file only includes PF reads this column would be redundant so it is omitted.

                                export2sam.pl checks the value of this column. If you provide s_N_sorted.txt as an input file, the column will be absent and export2sam.pl will throw a stream of errors along the lines of:

                                Code:
                                Use of uninitialized value in string ne at /usr/local/samtools/export2sam.pl line 67, <$fh1> line xxxxxx.
                                Here is a workaround if you must use the s_N_sorted.txt file: add "<tab>Y" to the end of every line of your sorted file.
                                Last edited by kmcarr; 02-09-2010, 06:36 AM. Reason: Add workaround
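
The workaround of appending "<tab>Y" to every line could be scripted roughly as follows. This is a minimal sketch, not a tested conversion tool: the helper names `append_pf_column` and `add_pf_flag` and the output file name are hypothetical, and it assumes the sorted file is plain tab-separated text with one read per line. Empty lines are dropped rather than padded.

```python
def append_pf_column(lines):
    """Yield each non-empty line with a trailing tab and 'Y' (PF flag) added."""
    for line in lines:
        stripped = line.rstrip("\r\n")
        if stripped:  # empty lines are skipped instead of becoming "\tY"
            yield stripped + "\tY\n"


def add_pf_flag(in_path, out_path):
    """Write a copy of in_path to out_path with the extra PF column appended."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(append_pf_column(src))


# Hypothetical usage (file names are placeholders):
# add_pf_flag("s_N_sorted.txt", "s_N_sorted_with_pf.txt")
```

The resulting file can then be given to export2sam.pl in place of the original sorted file. Writing to a new file rather than editing in place keeps the pipeline output untouched in case the conversion needs to be repeated.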
