  • I am mad because of samtools sort command

    I got two .sam files from Bowtie2.
    Now I want to merge these two files. First, I ran:
    samtools view -bSh ERR1.sam >ERR1.bam
    samtools view -bSh ERR2.sam >ERR2.bam

    and I got the BAM files (they should include the header).
    However, I then ran:
    samtools sort ERR1.bam ERR1.sorted.bam (here I got the sorted file, luckily)
    samtools sort ERR2.bam ERR2.sorted.bam
    For ERR2.bam I didn't get a sorted file; this was the output:

    [bam_header_read] invalid BAM binary header (this is not a BAM file).
    [bam_sort_core] truncated file. Continue anyway.
    Segmentation fault (core dumped)

    Why? Is it just because ERR2.sam is too big (about 66 GB)?

  • #2
    Supplement:
    I ran the command: samtools view ERR2.bam | less -S
    and I got this:
    [bam_header_read] invalid BAM binary header (this is not a BAM file).
    [main_samview] fail to read the header from "ERR173170_paired.bam".

    • #3
      If the file is too big for sorting you could split the .sam file on chromosome, sort each, recombine, then convert to .bam.
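The split step described above could be sketched roughly as follows, assuming a tab-separated SAM file whose header lines start with `@` and whose third column holds the reference name. The function name `split_sam_by_chrom` and the output file names are hypothetical, chosen only for illustration.

```shell
# split_sam_by_chrom IN.sam OUTDIR   (hypothetical helper, not part of samtools)
# Header lines go to OUTDIR/header.sam; each alignment line goes to
# OUTDIR/<RNAME>.body.sam, with unmapped reads (RNAME "*") in unmapped.body.sam.
split_sam_by_chrom () {
	mkdir -p "$2"
	awk -F '\t' -v dir="$2" '
		/^@/ { print > (dir "/header.sam"); next }   # header lines start with "@"
		{
			name = ($3 == "*") ? "unmapped" : $3     # column 3 is the reference name
			print > (dir "/" name ".body.sam")
		}
	' "$1"
}
```

Each body file can then be sorted on its own and concatenated after header.sam before the final conversion to BAM. With a reference that has very many contigs, some awk implementations may run into a limit on simultaneously open files.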

      • #4
        Originally posted by biocomputer
        If the file is too big for sorting you could split the .sam file on chromosome, sort each, recombine, then convert to .bam.
        You mean that the error came from producing the SAM file? But my SAM file is okay!

        • #5
          As you suggest, maybe I need to split my big SAM file.

          • #6
            66 GB isn't too big to sort. The original BAM file was corrupt, likely because the disk ran out of space or because of a hardware problem. Make sure you have enough space and then remake the BAM file.
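One cheap way to spot a BAM file that was cut short is to look at its first and last bytes. The sketch below assumes the file is BGZF-compressed and was written with the 28-byte empty block that samtools appends as an end-of-file marker; the helper names `to_hex` and `bam_looks_complete` are made up for this example. It only catches truncation or a file that is not BGZF at all, so a file that passes can still be damaged elsewhere. Newer samtools releases also provide a `samtools quickcheck` subcommand for a similar purpose.

```shell
# to_hex: bytes on stdin -> one lowercase hex string (hypothetical helper)
to_hex () { od -An -v -tx1 | tr -d '[:space:]'; }

# bam_looks_complete FILE
# Succeeds only if FILE starts with the BGZF magic bytes and ends with the
# 28-byte empty BGZF block used as an end-of-file marker.
bam_looks_complete () {
	magic="1f8b0804"
	# header(4) mtime(4) xfl+os(2) xlen(2) "BC"(2) slen(2) bsize(2) cdata(2) crc(4) isize(4)
	eof_block="1f8b0804""00000000""00ff""0600""4243""0200""1b00""0300""00000000""00000000"
	[ "$(head -c 4 "$1" | to_hex)" = "$magic" ] &&
		[ "$(tail -c 28 "$1" | to_hex)" = "$eof_block" ]
}
```

Usage would look like `bam_looks_complete ERR2.bam || echo "ERR2.bam looks truncated or is not BGZF"`, run before attempting the sort.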

            • #7
              There is enough disk space. How about the memory?

              • #8
                The whole SAM file isn't loaded into memory; it's processed line by line (and compressed in blocks).

                • #9
                  A transient hardware error is the most likely cause of this sort of thing.

                  • #10
                    I recently got an error like that because I switched my reads with my reference sequence while mapping. That is, I aligned my reference to my reads. Maybe that's your problem?

                    Here's a (correct) bash function that I used to map reads to a reference and keep only the mapped reads from the SAM file. Hopefully this can help guide you:
                    Code:
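                    map () {
                    	# $1: sample directory name; $refseq must be set by the caller.
                    	bwa index -a bwtsw "$refseq"
                    	bwa bwasw "$refseq" "../temp/$1/sampled_reads.fasta" > "../temp/$1/alignment.sam"
                    	# -F 4 drops unmapped reads while converting SAM to BAM.
                    	samtools view -bS -F 4 "../temp/$1/alignment.sam" > "../temp/$1/mapped.alignment.bam"
                    	# Legacy samtools sort syntax: the second argument is an output prefix (".bam" is appended).
                    	samtools sort "../temp/$1/mapped.alignment.bam" "../results/$1/mapped.sorted.alignment"
                    	samtools index "../results/$1/mapped.sorted.alignment.bam"
                    }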

                    • #11
                      Thanks a lot for sharing your beautiful bash function.
                      However, it seems that your function is suited to unpaired alignments. I think so because the command samtools view -bS -F 4 ../temp/$1/alignment.sam > ../temp/$1/mapped.alignment.bam discards the unmapped reads. But how should the -F parameter be set for paired-end alignments (.sam)?
                      Thanks, all.

                      • #12
                        -F 4 will remove unmapped reads in either case. If you want to remove those reads with an unmapped mate then just filter according to that bit in the flag.

                        • #13
                          About the parameter -F of samtools view

                          Originally posted by dpryan
                          -F 4 will remove unmapped reads in either case. If you want to remove those reads with an unmapped mate then just filter according to that bit in the flag.
                          Hi, dpryan. Thank you for your answer. I have some similar questions about -F, and I would appreciate your help.
                          1) Should I remove the unmapped reads (whose mates are mapped) or the reads with unmapped mates (which are themselves mapped)?
                          2) For paired reads, if I want to remove all of the above, should I use -F 12? However, it seems that no read has a flag value of 12. How about 77 or 141?

                          • #14
                            1) It depends on what you want to do with the results.
                            2) -F 12 is correct. There don't have to be any flags with that value since this is a bit comparison.
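The bit comparison can be illustrated with plain shell arithmetic. This is a small sketch of the test that -F applies, not samtools code: a read is dropped when any bit set in the mask is also set in its flag. The function name `excluded_by_F` is invented for the example; 4 is the "read unmapped" bit and 8 is the "mate unmapped" bit, so 12 covers both.

```shell
# excluded_by_F FLAG MASK   (hypothetical helper)
# Mimics the -F test: prints "drop" when FLAG shares at least one set bit
# with MASK, and "keep" otherwise.
excluded_by_F () {
	if [ $(( $1 & $2 )) -ne 0 ]; then
		echo drop
	else
		echo keep
	fi
}

# 77 and 141: read and mate both unmapped; 73: only the mate unmapped;
# 133: only the read unmapped; 99: properly paired, both mapped.
for flag in 77 141 73 133 99; do
	printf '%s\t%s\n' "$flag" "$(excluded_by_F "$flag" 12)"
done
```

Only flag 99 is kept here, because it is the only one with neither bit 4 nor bit 8 set. No read needs to carry the exact value 12 for the filter to work.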

                            • #15
                              Thank you, dpryan.
                              Today I tested it, and the result is consistent with your answer!
                              Thanks again.
