Dear All,
I got an error when trying to remove duplicates from my sequence file with MarkDuplicates.jar from Picard tools. Could anyone help me solve the problem?
Here is what I did:
I generated the alignment with BWA, then sorted the .bam file with samtools sort to generate a .sort.bam file.
Then I tried to remove duplicates with this command:
java -jar ../picard-tools-1.69/picard-tools-1.69/MarkDuplicates.jar INPUT=file.sort.bam OUTPUT=file.sort.rmdup.bam REMOVE_DUPLICATES=true METRICS_FILE=rmdup.txt AS=true
However, I got the following error:
[Thu May 31 22:59:30 EDT 2012] net.sf.picard.sam.MarkDuplicates INPUT=[MASC_OG_K27me3_73010.sort.bam] OUTPUT=MASC_OG_K27me3_73010.sort.rmdup.bam METRICS_FILE=rmdup.txt REMOVE_DUPLICATES=true ASSUME_SORTED=true MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Thu May 31 22:59:30 EDT 2012] Executing as [email protected] on Mac OS X 10.5.8 x86_64; Java HotSpot(TM) 64-Bit Server VM 1.6.0_26-b03-384-9M3425; Picard version: 1.69(1209)
INFO 2012-05-31 22:59:30 MarkDuplicates Start of doWork freeMemory: 84166984; totalMemory: 85000192; maxMemory: 129957888
INFO 2012-05-31 22:59:30 MarkDuplicates Reading input file and constructing read end information.
INFO 2012-05-31 22:59:30 MarkDuplicates Will retain up to 515705 data points before spilling to disk.
INFO 2012-05-31 22:59:39 MarkDuplicates Read 1000000 records. Tracking 0 as yet unmatched pairs. 0 records in RAM. Last sequence index: 0
INFO 2012-05-31 22:59:44 MarkDuplicates Read 2000000 records. Tracking 0 as yet unmatched pairs. 0 records in RAM. Last sequence index: 1
INFO 2012-05-31 22:59:49 MarkDuplicates Read 3000000 records. Tracking 0 as yet unmatched pairs. 0 records in RAM. Last sequence index: 2
INFO 2012-05-31 22:59:54 MarkDuplicates Read 4000000 records. Tracking 0 as yet unmatched pairs. 0 records in RAM. Last sequence index: 3
INFO 2012-05-31 22:59:59 MarkDuplicates Read 5000000 records. Tracking 0 as yet unmatched pairs. 0 records in RAM. Last sequence index: 4
[Thu May 31 23:00:01 EDT 2012] net.sf.picard.sam.MarkDuplicates done. Elapsed time: 0.51 minutes.
Runtime.totalMemory()=85000192
FAQ: http://sourceforge.net/apps/mediawik...itle=Main_Page
Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Record 5277548, Read name HWUSI-EAS525_0013:6:2:12573:3958#0, MAPQ should be 0 for unmapped read.
at net.sf.samtools.SAMUtils.processValidationErrors(SAMUtils.java:448)
at net.sf.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:506)
at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:487)
at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:446)
at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:641)
at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:619)
at net.sf.picard.sam.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:329)
at net.sf.picard.sam.MarkDuplicates.doWork(MarkDuplicates.java:122)
at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
at net.sf.picard.sam.MarkDuplicates.main(MarkDuplicates.java:106)
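As far as I can tell, the condition behind this message is that a record with the unmapped flag (0x4) set must have MAPQ 0. Below is a minimal Python sketch of that condition on SAM text lines, assuming tab-separated fields as in the SAM format (FLAG in column 2, MAPQ in column 5). The function name and the sample records are hypothetical and made up for illustration; they are not taken from my file, and this is not Picard's own validation code.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "this read is unmapped"


def find_unmapped_with_nonzero_mapq(sam_lines):
    """Return (record_number, read_name, mapq) for each unmapped record whose MAPQ is not 0.

    Hypothetical helper: record_number counts alignment lines only,
    skipping header lines (those starting with '@') and blank lines.
    """
    offenders = []
    record_number = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        record_number += 1
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2: FLAG
        mapq = int(fields[4])  # column 5: MAPQ
        if flag & UNMAPPED_FLAG and mapq != 0:
            offenders.append((record_number, fields[0], mapq))
    return offenders


# Made-up records for illustration only (not from the real BAM file).
example = [
    "@HD\tVN:1.0\tSO:coordinate",
    "read1\t0\tchr1\t100\t37\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\tchr1\t100\t25\t*\t*\t0\t0\tACGT\tIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(find_unmapped_with_nonzero_mapq(example))
```

On a real file, the lines could presumably come from the text output of samtools view on the sorted BAM, which would show how many records are affected besides the one named in the exception.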