  • Error found while mapping Illumina data using BWA

    While mapping Illumina data with BWA, I noticed several errors:

    ERROR: Record 58, Read name HWI-ST142_0217:1:63:8795:27966#0, Mate negative strand flag does not match read negative strand flag of mate

    The corresponding reads are

    HWI-ST142_0217:1:63:8795:27966#0 89 chr1 27 0 52M1D48M = 27 0 ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCGAACCCAACCCTAACCCTAACCCTAACCCTAACCCTACCCTAACCCTAACCCTA BBBBBBB___T^^ddc^bccddYdaa``]^O_U]\bGbbbZ_PZ\ZPT\Zbdaadeeee`ddddd`caccb\bbbbb\bbdcTdddcadddd`dadcdcd XT:A:R NM:i:3 SM:i:0 AM:i:0 X0:i:3 X1:i:1 XM:i:2 XO:i:1 XG:i:1 MD:Z:46T5^T29A18 XA:Z:chr15,+100338770,17M1D83M,3;chr4,-62,82M1D18M,3;chr1,-21,52M1D48M,4;

    HWI-ST142_0217:1:63:8795:27966#0 181 chr1 27 0 * = 27 0 GGGTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTGGTTAGGG bT\^`^bddad\bdadbbdd``Yc`b\d`dbdd^d`ccc\`bYb`TUXYPeT\eecfdfdeeccfcefcfe\eeccfefeffeffffffffffcfcfeff

    In binary, the flags of these reads (89 and 181) are

    1 0 1 1 0 0 1 - first read (mate on forward strand, query on reverse strand)
    1 0 1 1 0 1 0 1 - second read (mate on reverse strand, query on reverse strand)

    The sixth bit from the right (0x20, mate strand) of the first read is 0, but the fifth bit from the right (0x10, query strand) of the second read is 1, so the two records disagree about the strand of the second read.

    Is this the reason for the error I am seeing? Is there a way to prevent such errors from occurring?
    Last edited by nirav99; 09-02-2010, 01:47 PM. Reason: Adding new line for clarity of reading
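
The two flag values in the post above can be decoded programmatically. The following is a minimal sketch, not Picard's actual validation code: the bit meanings come from the SAM specification, while the names `SAM_FLAGS`, `decode_flag`, and `mate_strand_consistent` are made up for illustration. It checks whether each record's mate-strand bit (0x20) agrees with the other record's own strand bit (0x10).

```python
# Bit meanings as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM flag, lowest bit first."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


def mate_strand_consistent(flag_a, flag_b):
    """True when each record's mate-strand bit matches the other's strand bit."""
    a_rev, a_mate_rev = bool(flag_a & 0x10), bool(flag_a & 0x20)
    b_rev, b_mate_rev = bool(flag_b & 0x10), bool(flag_b & 0x20)
    return a_mate_rev == b_rev and b_mate_rev == a_rev


# Flags of the two records quoted in the post (hypothetical driver code).
for flag in (89, 181):
    print(flag, format(flag, "b"), decode_flag(flag))
print("consistent:", mate_strand_consistent(89, 181))
```

For the pair in the post the check reports an inconsistency: the first record (89) says its mate is on the forward strand, while the mate's own record (181) has the reverse-strand bit set.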

  • #2
    Picard's command line utility FixMateInformation fixes this error.



    • #3
      hi, nirav99

      I got the same error "Mate negative strand flag does not match read negative strand flag of mate" and I tried FixMateInformation as:

      java -Xmx2g -jar FixMateInformation.jar INPUT=test.bam OUTPUT=test_fixed.bam VALIDATION_STRINGENCY=SILENT

      but I didn't get the output file and got this error:
      Exception in thread "main" net.sf.samtools.util.RuntimeIOException: Write error; BinaryCodec in writemode; streamed file (filename not available)
      at net.sf.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:199)
      at net.sf.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:189)
      at net.sf.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:120)
      at net.sf.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:37)
      at net.sf.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:185)
      at net.sf.samtools.util.SortingCollection.add(SortingCollection.java:140)
      at net.sf.picard.sam.FixMateInformation.doWork(FixMateInformation.java:145)
      at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:156)
      at net.sf.picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:117)
      at net.sf.picard.sam.FixMateInformation.main(FixMateInformation.java:74)
      Caused by: java.io.IOException: No space left on device
      at java.io.FileOutputStream.writeBytes(Native Method)
      at java.io.FileOutputStream.write(FileOutputStream.java:260)
      at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
      at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
      at net.sf.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:197)


      ....



      • #4
        Hi Cliff,

        Please check the disk space where you are running this. The exception indicates

        "Caused by: java.io.IOException: No space left on device"



        • #5
          thanks for getting back to me. I actually have 15TB left..



          • #6
            Could it be that the temporary path is not pointing towards your 15TB?



            • #7
              Picard writes its scratch files to /tmp/<user_name>, so the problem was most likely that /tmp filled up. Use TMP_DIR to specify a different temporary directory.
              -drd
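
A quick way to see why "No space left on device" can appear despite terabytes of free storage is to compare the free space on the temporary filesystem with the free space where the data lives. This is a minimal sketch under assumptions: `free_gib` and `report` are hypothetical helper names, and `tempfile.gettempdir()` returns Python's default temporary directory (usually /tmp on Linux), which is only a stand-in for the directory Picard actually writes to.

```python
import shutil
import tempfile


def free_gib(path):
    """Free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


def report(paths):
    """Map each path to its free space in GiB, rounded to one decimal."""
    return {path: round(free_gib(path), 1) for path in paths}


# Assumption: the system temp dir stands in for Picard's scratch location,
# and "." stands in for the large data volume.
candidates = [tempfile.gettempdir(), "."]
for path, gib in report(candidates).items():
    print(f"{path}: {gib} GiB free")
```

If the first number is much smaller than the second, the sort's scratch files are probably filling the temporary filesystem rather than the data volume.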



              • #8
                I specified a temporary directory which has enough space by

                java -Xmx4g -Djava.io.tmpdir=/home/temp -jar

                it still failed...



                • #9
                  Hi Cliff,

                  TMP_DIR is a command line parameter for Picard tools.

                  So, the way to do it would be

                  java -Xmx4G -jar FixMateInformation.jar I=input.bam O=output.bam TMP_DIR=/home/temp VALIDATION_STRINGENCY=LENIENT
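
When scripting this, it may help to verify the temporary directory before launching the job. The sketch below only assembles the argument list from the command above and checks that the directory exists and is writable; `fix_mate_command` is a hypothetical helper, nothing is executed, and the thread does not confirm that this command resolves the error.

```python
import os
import tempfile


def fix_mate_command(input_bam, output_bam, tmp_dir, heap="4G"):
    """Build the FixMateInformation argument list after checking tmp_dir."""
    if not os.path.isdir(tmp_dir):
        raise NotADirectoryError(tmp_dir)
    if not os.access(tmp_dir, os.W_OK):
        raise PermissionError(f"{tmp_dir} is not writable")
    return [
        "java", f"-Xmx{heap}", "-jar", "FixMateInformation.jar",
        f"I={input_bam}", f"O={output_bam}",
        f"TMP_DIR={tmp_dir}", "VALIDATION_STRINGENCY=LENIENT",
    ]


# Assumption: the system temp dir is used here only so the example runs;
# in practice this would be a directory on the large volume.
cmd = fix_mate_command("input.bam", "output.bam", tempfile.gettempdir())
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` once the jar and input paths are real.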



                  • #10
                    Hi, Nirav99

                    Thanks! I tried your command and still failed..



                    • #11
                      Similar error found when using SortSam

                      Hi Cliff. Did you find a solution to your problem? I'm getting the same error, except for me it occurs when calling SortSam. Similar to your case we have lots of space available (62 TB).

                      java -Xmx20g -Xms8g -jar SortSam.jar MAX_RECORDS_IN_RAM=2000000 VALIDATION_STRINGENCY=LENIENT CREATE_INDEX=true SORT_ORDER=coordinate INPUT=sequence.sam OUTPUT=sequence.sorted.bam


                      Runtime.totalMemory()=17176592384
                      Exception in thread "main" net.sf.samtools.util.RuntimeIOException: Write error; BinaryCodec in writemode; streamed file (filename not available)
                      at net.sf.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:199)
                      at net.sf.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:189)
                      at net.sf.samtools.util.BinaryCodec.writeString(BinaryCodec.java:285)
                      at net.sf.samtools.BinaryTagCodec.writeTag(BinaryTagCodec.java:168)
                      at net.sf.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:144)
                      at net.sf.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:37)
                      at net.sf.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:201)
                      at net.sf.samtools.util.SortingCollection.add(SortingCollection.java:140)
                      at net.sf.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:157)
                      at net.sf.picard.sam.SortSam.doWork(SortSam.java:67)
                      at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:157)
                      at net.sf.picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:117)
                      at net.sf.picard.sam.SortSam.main(SortSam.java:79)
                      Caused by: java.io.IOException: No space left on device
                      at java.io.FileOutputStream.writeBytes(Native Method)
                      at java.io.FileOutputStream.write(FileOutputStream.java:297)
                      at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
                      at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
                      at net.sf.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:197)
                      ... 12 more



                      Originally posted by cliff
                      Hi, Nirav99

                      Thanks! I tried your command and still failed..



                      • #12
                        Picard java exception

                        Hi, I got a similar error using Picard MergeSamFiles.jar.
                        Has anyone found an answer for it?

                        Exception in thread "main" net.sf.samtools.util.RuntimeIOException: java.io.FileNotFoundException: /scratch/temp/sortingcollection.4193001834577553277.tmp (Too many open files)
                        at net.sf.samtools.util.SortingCollection$FileRecordIterator.<init>(SortingCollection.java:445)
                        at net.sf.samtools.util.SortingCollection$MergingIterator.<init>(SortingCollection.java:384)
                        at net.sf.samtools.util.SortingCollection.iterator(SortingCollection.java:254)
                        at net.sf.samtools.util.SortingCollection.iterator(SortingCollection.java:43)
                        at net.sf.samtools.SAMFileWriterImpl.close(SAMFileWriterImpl.java:190)
                        at net.sf.samtools.AsyncSAMFileWriter.synchronouslyClose(AsyncSAMFileWriter.java:42)
                        at net.sf.samtools.util.AbstractAsyncWriter.close(AbstractAsyncWriter.java:78)
                        at net.sf.picard.sam.MergeSamFiles.doWork(MergeSamFiles.java:154)
                        at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
                        at net.sf.picard.sam.MergeSamFiles.main(MergeSamFiles.java:79)
                        Caused by: java.io.FileNotFoundException: /scratch/temp/sortingcollection.4193001834577553277.tmp (Too many open files)
                        at java.io.FileInputStream.open(Native Method)
                        at java.io.FileInputStream.<init>(Unknown Source)
                        at net.sf.samtools.util.SortingCollection$FileRecordIterator.<init>(SortingCollection.java:439)
                        ... 9 more
                        In my case too, the space available in /scratch is around 40 TB.



                        • #13
                          If anyone is still following this thread: which version of bwa is everyone who gets this error using?



                          • #14
                            I did not use bwa. I used SHRiMP 2.2.2 for mapping, samtools 1.8 for converting SAM to BAM, and Picard 1.74 for combining the mapped BAM files.



                            • #15
                              Regarding the java.io.FileNotFoundException (Too many open files), there are several things you can try. Picard sorting can open so many files at once that it exceeds the default per-process limit on Unix systems. To deal with this, you can (1) ask your sysadmin to raise the allowed number of open files (ulimit -n); (2) for some Picard programs, increase the MAX_RECORDS_IN_RAM command-line parameter, which tells Picard to store more records in fewer files and so reduces the number of open files (watch out for increased memory usage, though); or (3) if you're using MarkDuplicates and see this error, set the command-line parameter MAX_FILE_HANDLES_FOR_READ_ENDS_MAP. Reducing this number reduces the number of concurrently open files.

                              On our HPC cluster, ulimit -n reports 4096, so I use MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=4000.
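
The trade-off between MAX_RECORDS_IN_RAM and the open-file limit can be estimated roughly. This sketch assumes, as the stack trace in #12 suggests but the thread does not state outright, that the sort writes about one temporary file per MAX_RECORDS_IN_RAM records and opens them all when merging. The names `spill_files` and `fits_under_limit`, the `reserve` margin, and the one-billion-record example are all hypothetical. The `resource` module is available on Unix only.

```python
import resource  # Unix-only standard library module


def spill_files(total_records, max_records_in_ram):
    """Estimated number of temp files: one per full or partial batch."""
    return -(-total_records // max_records_in_ram)  # ceiling division


def fits_under_limit(total_records, max_records_in_ram, limit, reserve=64):
    """True if the estimated temp files plus a safety margin fit the limit."""
    return spill_files(total_records, max_records_in_ram) + reserve <= limit


# Equivalent of `ulimit -n` (soft limit) and `ulimit -Hn` (hard limit).
soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file limit: soft={soft_limit} hard={hard_limit}")

# Hypothetical input of one billion records.
for max_in_ram in (500_000, 2_000_000):
    n = spill_files(1_000_000_000, max_in_ram)
    ok = fits_under_limit(1_000_000_000, max_in_ram, 4096)
    print(f"MAX_RECORDS_IN_RAM={max_in_ram}: ~{n} temp files, fits 4096: {ok}")
```

Under these assumptions, quadrupling MAX_RECORDS_IN_RAM cuts the number of temporary files by a factor of four, at the cost of holding four times as many records in memory.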
