Hi everyone,
I would like to merge the Tophat output files accepted_hits.bam and unmapped.bam into one file for subsequent analysis. The files contain paired-end Illumina reads.
I used the naive approach of just calling samtools merge:
Code:
samtools merge merged.bam accepted_hits.bam unmapped.bam
However, using the resulting merged.bam with any Picard tool results in an error:
Code:
INFO 2013-03-07 10:29:53 AddOrReplaceReadGroups Processed 39,000,000 records. Elapsed time: 00:11:07s. Time for last 1,000,000: 16s. Last read position: chrX:153,627,857
[Thu Mar 07 10:29:55 CET 2013] net.sf.picard.sam.AddOrReplaceReadGroups done. Elapsed time: 11.16 minutes.
Runtime.totalMemory()=1110376448
FAQ: http://sourceforge.net/apps/mediawiki/picard/index.php?title=Main_Page
Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Record 39136746, Read name HWI-ST587_0094:4:1101:7158:1939.ATCACGA/1, Mapped mate should have mate reference name
    at net.sf.samtools.SAMUtils.processValidationErrors(SAMUtils.java:448)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:541)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:522)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:481)
    at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:672)
    at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:650)
    at net.sf.picard.sam.AddOrReplaceReadGroups.doWork(AddOrReplaceReadGroups.java:98)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:119)
    at net.sf.picard.sam.AddOrReplaceReadGroups.main(AddOrReplaceReadGroups.java:66)
The files were created with a Tophat version prior to 2.0.7, so the unmapped reads still have /1 and /2 suffixes. However, removing the suffixes before calling samtools merge results in the same error down the line.
The reads Picard complains about look like this:
Code:
samtools view merged.bam | grep "HWI-ST587_0094:4:1101:7158:1939.ATCACGA"
HWI-ST587_0094:4:1101:7158:1939.ATCACGA 1177 chr3 129324900 50 38M * 0 0 CTAGCCCCACGGTGGACGCGTTCGGGTGGTTGGCCGCC FFFFHJJJHJJJJJJJJJJJJJJJJHHHHHFFFFFCCC AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:38 YT:Z:UU XS:A:- NH:i:1
HWI-ST587_0094:4:1101:7158:1939.ATCACGA/1 69 * 0 255 * * 0 0 TGNGTGTTCTCGAAGCGGTGGTCCTCCAGGCTGCGGTTGCGCGGGAAGAAGGNGCTGCCGTAACCGGTGTACGTGNCGCCCACGAGCAGGCGGCTGCCCC CC!4ADDFHHHHHJJIJJGHJHIJJJJJJJJJJJJJFHIJJIJGDBDDDDDD!,8?BDDDDDDDDDD>B>CDDCC!+8?BBBDDDDDDDDDDDB)&(28<
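Decoding the FLAG column of the two records above may help narrow this down. The following is only a minimal Python sketch: the helper name `decode_flag` and the short labels are made up for illustration, while the bit values are the ones defined in the SAM specification.

```python
# Bit values of the SAM FLAG field, as defined in the SAM specification.
# The short labels are illustrative names, not official identifiers.
SAM_FLAGS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the labels of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS if flag & bit]


if __name__ == "__main__":
    # The two FLAG values from the records shown above.
    for flag in (1177, 69):
        print(flag, decode_flag(flag))
```

Flag 69 marks the /1 read as paired, unmapped and first in pair, but the "mate unmapped" bit (0x8) is not set, even though its mate reference and position are `*` and `0`. That combination appears to be what the Picard validation message "Mapped mate should have mate reference name" refers to, although I have not confirmed this against the Picard source.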
Software versions used:
Tophat: 2.0.6
Picard: 1.85
samtools: 0.1.18
Has anyone done this successfully?
Thanks a lot,
Chris