  • Brian Bushnell
    Super Moderator
    • Jan 2014
    • 2709

    #16
    Oh, wait, it says "truncated", so presumably the problem is at the end of the file. Can you run "tail" on the file and post the last two lines?


    • krapulaxdoctor
      Member
      • May 2015
      • 22

      #17
      Originally posted by Brian Bushnell
      Oh, wait, it says "truncated", so presumably the problem is at the end of the file. Can you run "tail" on the file and post the last two lines?
      How do I run "tail"?
      Sorry, I'm a beginner...


      • Brian Bushnell
        Super Moderator
        • Jan 2014
        • 2709

        #18
        "tail file.sam"

        That will print the last 10 lines to the console.
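
To get only the last two lines rather than the default ten, "tail" accepts a line count with -n. A minimal sketch, assuming a small stand-in file named demo.sam (hypothetical; substitute the real file.sam):

```shell
# Build a tiny stand-in file so the example runs on its own.
# demo.sam is a hypothetical name; use your real SAM file instead.
printf 'line1\nline2\nline3\n' > demo.sam

# -n 2 limits the output to the final two lines of the file.
last_two=$(tail -n 2 demo.sam)
printf '%s\n' "$last_two"
```

On a real SAM file the lines can be very long, so redirecting the output to a text file (for example "tail -n 2 file.sam > last_lines.txt") may make it easier to paste.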


        • krapulaxdoctor
          Member
          • May 2015
          • 22

          #19
          HISEQHI:525:HCYWJADXX:2:2213:8924:55099 256 * 942639 0 43M * 0 0 CAAAGGGCTGAGAAGCACTTGAAAAAATGTTCAACATCCTTAA CCCFFFFFHHHHHJJJJJJJJJJJJJJJJIIJJJJJJJJJJJJ AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:43 YT:Z:UU NH:i:20 CC:Z:chrX CP:i:128687718 XS:A:+ HI:i:17
          HISEQLN:122:HCW3JADXX:2:2207:7052:25724 272 * 944767 0 43M * 0 0 TACTTACATATAATAAATAAATAAATAAATATTTTTTAAAAAA IFIIGJIJIIIGGIJIJIGFFCIHGIGIIHDHFFHFFDDF@@@ AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:43 YT:Z:UU NH:i:11 CC:Z:chr6 CP:i:52981629 XS:A:- HI:i:9
          HISEQLN:121:HCYV3ADXX:1:1203:18633:64996 0 * 949324 043M * 0 0 CAGAACCCCTGAAATTGGCAAGATAGACGTCAGTGTTAGCAGA CCCFFFFFHHHHHJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ AS:i:-5 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:5G37 YT:Z:UU NH:i:20 CC:Z:chr6 CP:i:6419658 XS:A:+ HI:i:12
          HISEQLN:122:HCW3JADXX:1:1112:13385:80114 272 * 949722 043M * 0 0 GGTGTCCGCTAGTGTCCTGAGGCCTGAGCGAGGGGCTCCTCTC ##A7'?DFD;BD:3GGDDDIHG@EFFEFADB?<7DD::@=1 AS:i:-2 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:11T31 YT:Z:UU NH:i:20 CC:Z:chr6 CP:i:71166409 XS:A:- HI:i:15
          These are the last few lines...


          • Brian Bushnell
            Super Moderator
            • Jan 2014
            • 2709

            #20
            Assuming all of the things that look like spaces are actually tabs (sorry, tabs often get replaced by spaces on the console), I don't see anything wrong with the sam file and I don't know what the problem is. It may have something to do with a negative number being detected where a positive number is expected, but I'm just speculating.
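
One way to check the tab assumption directly on the file, rather than on console output, is to count tab-separated fields per record. This is only a sketch: it relies on the SAM convention of 11 mandatory tab-separated fields per alignment line, and demo.sam with its two records is made up for illustration (the second record deliberately fuses two fields).

```shell
# Hypothetical two-record file: the first record has all 11 mandatory fields,
# the second has MAPQ and CIGAR run together, leaving only 10.
printf 'r1\t0\t*\t100\t0\t43M\t*\t0\t0\tACGT\tIIII\n' > demo.sam
printf 'r2\t0\t*\t200\t043M\t*\t0\t0\tACGT\tIIII\n' >> demo.sam

# Skip header lines (starting with @) and report any record that has
# fewer than 11 tab-separated fields, with its line number.
bad=$(awk -F'\t' '!/^@/ && NF < 11 {print NR": "NF" fields"}' demo.sam)
echo "$bad"
```

If this prints nothing on the real file, every record has at least the mandatory fields and the separators are genuine tabs.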

            You could try Picard rather than Samtools, and see if you have better luck. Or, try the most recent version of Samtools, or else v0.1.19. Sometimes there's a problem with a specific version.


            • krapulaxdoctor
              Member
              • May 2015
              • 22

              #21
              OK, I'll give it a try. Thank you for all your help.


              • GenoMax
                Senior Member
                • Feb 2008
                • 7142

                #22
                What version of samtools are you using?


                • krapulaxdoctor
                  Member
                  • May 2015
                  • 22

                  #23
                  Hi,

                  I am using:
                  Version: 1.2 (using htslib 1.2.1)


                  • DumbOrchid
                    Junior Member
                    • Oct 2016
                    • 2

                    #24
                    Hi,

                    Sorry to revive this thread, but I have a similar desire to filter based on length and was excited to learn about reformat!

                    I've run into an issue, but I'm pretty dumb so I'm sure I've just confused something simple.

                    I've downloaded bbmap and have tried to get reformat to work but I'm not having any luck.

                    When I try the following:

                    sh ~/tools/bbmap/reformat.sh in=input.bam out=output.bam minlength=1 maxlength=100

                    I get the following error message:

                    Found samtools.
                    Input is being processed as unpaired
                    [samopen] SAM header is present: 84 sequences.
                    java.lang.AssertionError
                    at stream.SamLine.toShortMatch(SamLine.java:1257)
                    at stream.SamLine.toRead(SamLine.java:1879)
                    at stream.SamLine.toRead(SamLine.java:1749)
                    at stream.SamReadInputStream.toReadList(SamReadInputStream.java:119)
                    at stream.SamReadInputStream.fillBuffer(SamReadInputStream.java:90)
                    at stream.SamReadInputStream.nextList(SamReadInputStream.java:74)
                    at stream.ConcurrentGenericReadInputStream$ReadThread.readLists(ConcurrentGenericReadInputStream.java:656)
                    at stream.ConcurrentGenericReadInputStream$ReadThread.run(ConcurrentGenericReadInputStream.java:635)
                    Input: 110600 reads 16384426 bases
                    Short Read Discards: 110034 reads (99.49%) 16340390 bases (99.73%)
                    Output: 566 reads (0.51%) 44036 bases (0.27%)

                    Time: 1.287 seconds.
                    Reads Processed: 110k 85.94k reads/sec
                    Bases Processed: 16384k 12.73m bases/sec
                    Exception in thread "main" java.lang.RuntimeException: ReformatReads terminated in an error state; the output may be corrupt.
                    at jgi.ReformatReads.process(ReformatReads.java:1098)
                    at jgi.ReformatReads.main(ReformatReads.java:43)


                    I'm still really excited by the potential of reformat, any advice would be greatly appreciated.


                    • GenoMax
                      Senior Member
                      • Feb 2008
                      • 7142

                      #25
                      Do you still get an error if you remove the minlength=1 directive?


                      • DumbOrchid
                        Junior Member
                        • Oct 2016
                        • 2

                        #26
                        Wow! Thanks for the quick reply GenoMax!

                        Sadly that doesn't alleviate my issue:

                        Exception in thread "main" java.lang.RuntimeException: ReformatReads terminated in an error state; the output may be corrupt.
                        at jgi.ReformatReads.process(ReformatReads.java:1098)
                        at jgi.ReformatReads.main(ReformatReads.java:43)


                        • Brian Bushnell
                          Super Moderator
                          • Jan 2014
                          • 2709

                          #27
                          It appears that there was some problem processing the line's MD tag. In this case, since you are just filtering based on length, that should not matter and you can just add the flag "-da" to ignore the error, which does not affect the output in this case. I added code to print out the problematic line when that happens in the future. If it's a very small bam file you could email it to me so I can see what the problem is.
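
For reference, the command from post #24 with the flag appended might look like the sketch below. It assumes the install path used in that post and that "-da" can simply be added as a trailing argument; if reformat.sh or input.bam is not found, the script only prints the command instead of running it.

```shell
# Path taken from the command in post #24; adjust to your install (assumption).
REFORMAT="$HOME/tools/bbmap/reformat.sh"

# Same length filter as in post #24, with -da appended as suggested above.
cmd="sh $REFORMAT in=input.bam out=output.bam minlength=1 maxlength=100 -da"

if [ -f "$REFORMAT" ] && [ -f input.bam ]; then
    # Unquoted on purpose so the string splits into separate arguments.
    $cmd
else
    echo "would run: $cmd"
fi
```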


                          • andrewbcaldwell
                            Junior Member
                            • Apr 2018
                            • 1

                            #28
                            Brian,

                            Would it be possible to use reformat.sh to filter on the fragment length rather than the read length? I'm looking for a way to split paired-end ATAC-Seq .sam files into "nucleosome-free" and "nucleosome-bound" regions based on size of the fragment, and the proposed solutions I've found elsewhere have been a dead end. Thanks!
