Oh, wait, it says "truncated", so presumably the problem is at the end of the file. Can you run "tail" on the file and post the last two lines?
-
HISEQHI:525:HCYWJADXX:2:2213:8924:55099 256 * 942639 0 43M * 0 0 CAAAGGGCTGAGAAGCACTTGAAAAAATGTTCAACATCCTTAA CCCFFFFFHHHHHJJJJJJJJJJJJJJJJIIJJJJJJJJJJJJ AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:43 YT:Z:UU NH:i:20 CC:Z:chrX CP:i:128687718 XS:A:+ HI:i:17
HISEQLN:122:HCW3JADXX:2:2207:7052:25724 272 * 944767 0 43M * 0 0 TACTTACATATAATAAATAAATAAATAAATATTTTTTAAAAAA IFIIGJIJIIIGGIJIJIGFFCIHGIGIIHDHFFHFFDDF@@@ AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:43 YT:Z:UU NH:i:11 CC:Z:chr6 CP:i:52981629 XS:A:- HI:i:9
HISEQLN:121:HCYV3ADXX:1:1203:18633:64996 0 * 949324 043M * 0 0 CAGAACCCCTGAAATTGGCAAGATAGACGTCAGTGTTAGCAGA CCCFFFFFHHHHHJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ AS:i:-5 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:5G37 YT:Z:UU NH:i:20 CC:Z:chr6 CP:i:6419658 XS:A:+ HI:i:12
HISEQLN:122:HCW3JADXX:1:1112:13385:80114 272 * 949722 043M * 0 0 GGTGTCCGCTAGTGTCCTGAGGCCTGAGCGAGGGGCTCCTCTC ##A7'?DFD;BD:3GGDDDIHG@EFFEFADB?<7DD::@=1 AS:i:-2 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:11T31 YT:Z:UU NH:i:20 CC:Z:chr6 CP:i:71166409 XS:A:- HI:i:15
-
Assuming all of the things that look like spaces are actually tabs (sorry, tabs often get replaced by spaces on the console), I don't see anything wrong with the SAM file and I don't know what the problem is. It may have something to do with a negative number being detected where a positive number is expected, but I'm just speculating.
You could try Picard rather than Samtools, and see if you have better luck. Or, try the most recent version of Samtools, or else v0.1.19. Sometimes there's a problem with a specific version.
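For anyone who wants to rule out the tab/space question and the "negative number" guess before switching tools, here is a minimal Python sketch of a per-line sanity check. It is only an illustration based on the mandatory SAM columns, not what Samtools or Picard actually do internally, and the function name `check_sam_line` is made up for this example.

```python
import re

# CIGAR is a run of <length><operation> pairs, e.g. 43M or 10M2D5M
CIGAR_FORMAT = re.compile(r"(?:[0-9]+[MIDNSHP=X])+")
CIGAR_OP = re.compile(r"([0-9]+)([MIDNSHP=X])")


def check_sam_line(line):
    """Return a list of problems found in one SAM alignment line (hypothetical helper)."""
    problems = []
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        # Typical symptom of tabs having been replaced by spaces
        problems.append(f"expected at least 11 tab-separated fields, found {len(fields)}")
        return problems

    flag, pos, mapq, cigar = fields[1], fields[3], fields[4], fields[5]
    pnext, tlen, seq, qual = fields[7], fields[8], fields[9], fields[10]

    # These columns must be non-negative integers
    for name, value in (("FLAG", flag), ("POS", pos), ("MAPQ", mapq), ("PNEXT", pnext)):
        if not re.fullmatch(r"[0-9]+", value):
            problems.append(f"{name} is not a non-negative integer: {value!r}")

    # TLEN is the one mandatory numeric column that may legitimately be negative
    if not re.fullmatch(r"-?[0-9]+", tlen):
        problems.append(f"TLEN is not an integer: {tlen!r}")

    if cigar != "*":
        if not CIGAR_FORMAT.fullmatch(cigar):
            problems.append(f"CIGAR is malformed: {cigar!r}")
        elif seq != "*":
            # M, I, S, = and X are the operations that consume query bases
            query_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MIS=X")
            if query_len != len(seq):
                problems.append(f"CIGAR implies {query_len} bases but SEQ has {len(seq)}")

    if seq != "*" and qual != "*" and len(seq) != len(qual):
        problems.append(f"SEQ has {len(seq)} bases but QUAL has {len(qual)} characters")
    return problems
```

Running it over the last few lines of the file (for example the output of "tail") should show quickly whether a line has lost its tabs or has a field that does not parse.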
-
Hi,
Sorry to revive this thread, but I have a similar desire to filter based on length and was excited to learn about reformat!
I've run into an issue, but I'm pretty dumb, so I'm sure I've just confused something simple.
I've downloaded bbmap and have tried to get reformat to work, but I'm not having any luck.
When I try the following:
sh ~/tools/bbmap/reformat.sh in=input.bam out=output.bam minlength=1 maxlength=100
I get the following error message:
Found samtools.
Input is being processed as unpaired
[samopen] SAM header is present: 84 sequences.
java.lang.AssertionError
at stream.SamLine.toShortMatch(SamLine.java:1257)
at stream.SamLine.toRead(SamLine.java:1879)
at stream.SamLine.toRead(SamLine.java:1749)
at stream.SamReadInputStream.toReadList(SamReadInputStream.java:119)
at stream.SamReadInputStream.fillBuffer(SamReadInputStream.java:90)
at stream.SamReadInputStream.nextList(SamReadInputStream.java:74)
at stream.ConcurrentGenericReadInputStream$ReadThread.readLists(ConcurrentGenericReadInputStream.java:656)
at stream.ConcurrentGenericReadInputStream$ReadThread.run(ConcurrentGenericReadInputStream.java:635)
Input: 110600 reads 16384426 bases
Short Read Discards: 110034 reads (99.49%) 16340390 bases (99.73%)
Output: 566 reads (0.51%) 44036 bases (0.27%)
Time: 1.287 seconds.
Reads Processed: 110k 85.94k reads/sec
Bases Processed: 16384k 12.73m bases/sec
Exception in thread "main" java.lang.RuntimeException: ReformatReads terminated in an error state; the output may be corrupt.
at jgi.ReformatReads.process(ReformatReads.java:1098)
at jgi.ReformatReads.main(ReformatReads.java:43)
I'm still really excited by the potential of reformat; any advice would be greatly appreciated.
-
Wow! Thanks for the quick reply GenoMax!
Sadly that doesn't alleviate my issue:
Exception in thread "main" java.lang.RuntimeException: ReformatReads terminated in an error state; the output may be corrupt.
at jgi.ReformatReads.process(ReformatReads.java:1098)
at jgi.ReformatReads.main(ReformatReads.java:43)
-
It appears that there was some problem processing the line's MD tag. In this case, since you are just filtering based on length, that should not matter and you can just add the flag "-da" to ignore the error, which does not affect the output in this case. I added code to print out the problematic line when that happens in the future. If it's a very small bam file you could email it to me so I can see what the problem is.
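To hunt for the offending line yourself, one option is to check each record's MD tag against its CIGAR string. The Python sketch below is an independent sanity check written from the SAM optional-fields description of MD; it is not the logic in SamLine.toShortMatch, and the function names are invented for this example. It compares the number of reference bases the MD tag describes with the number covered by the M, =, X and D operations in the CIGAR.

```python
import re

# MD grammar from the SAM optional-fields spec: [0-9]+(([A-Z]|\^[A-Z]+)[0-9]+)*
MD_FORMAT = re.compile(r"[0-9]+(?:(?:[A-Z]|\^[A-Z]+)[0-9]+)*")
MD_TOKEN = re.compile(r"[0-9]+|\^[A-Z]+|[A-Z]")
CIGAR_OP = re.compile(r"([0-9]+)([MIDNSHP=X])")


def md_reference_length(md):
    """Number of reference bases described by an MD value such as '5G37'."""
    if not MD_FORMAT.fullmatch(md):
        raise ValueError(f"malformed MD value: {md!r}")
    total = 0
    for token in MD_TOKEN.findall(md):
        if token[0].isdigit():
            total += int(token)        # run of matching bases
        elif token[0] == "^":
            total += len(token) - 1    # deleted reference bases after the caret
        else:
            total += 1                 # a single mismatched reference base
    return total


def cigar_aligned_reference_length(cigar):
    """Reference bases covered by M, =, X and D operations (N is not described by MD)."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MD=X")


def md_matches_cigar(md, cigar):
    """True when the MD value and the CIGAR string agree on the aligned reference length."""
    return md_reference_length(md) == cigar_aligned_reference_length(cigar)
```

For the records posted earlier in the thread, MD:Z:5G37 with a 43M CIGAR gives 5 + 1 + 37 = 43 on both sides, so those particular lines would pass this check.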
-
Brian,
Would it be possible to use reformat.sh to filter on the fragment length rather than the read length? I'm looking for a way to split paired-end ATAC-Seq .sam files into "nucleosome-free" and "nucleosome-bound" regions based on size of the fragment, and the proposed solutions I've found elsewhere have been a dead end. Thanks!
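For reference, the kind of split being asked about can be sketched in plain Python using the TLEN column (column 9) of each SAM record. This is only an illustration of the idea, not a reformat.sh feature; the function name and the 100 bp cutoff are assumptions made for the example, and the right cutoff depends on the experiment.

```python
def split_by_fragment_length(sam_lines, cutoff=100):
    """Partition SAM lines by absolute template length (hypothetical helper).

    Returns (header, short, long): header lines, records with |TLEN| < cutoff,
    and records with |TLEN| >= cutoff. Records with TLEN == 0 (no usable
    template length, e.g. unpaired or mate unmapped) are left out of both sets.
    """
    header, short, long_ = [], [], []
    for line in sam_lines:
        if line.startswith("@"):
            header.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        tlen = abs(int(fields[8]))  # mates carry +TLEN and -TLEN, so use the magnitude
        if tlen == 0:
            continue
        if tlen < cutoff:
            short.append(line)
        else:
            long_.append(line)
    return header, short, long_
```

Because both mates of a properly paired fragment carry the same template length with opposite signs, using the absolute value keeps each pair together in the same output set.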