  • Brian Bushnell
    Super Moderator
    • Jan 2014
    • 2709

    #16
    Not currently... I can add that, though. I'll make a note to do that. BBMap has an "idfilter" flag, though.

    • darthsequencer
      Member
      • Feb 2012
      • 35

      #17
      Great - I've been using idfilter, but it's pretty time-consuming to have to rerun the mapping (I'm dealing with ~10 lanes of data now).

      • Brian Bushnell
        Super Moderator
        • Jan 2014
        • 2709

        #18
        I just uploaded a new version of BBTools - 36.11 - that supports idfilter (and subfilter, editfilter, etc) in Reformat. Bear in mind that reads mapped using old-style cigar strings ('M' symbol instead of 'X' and '=') must also have MD tags. For newer cigar strings MD tags are not necessary. Unmapped reads will not be affected by this filter (they will pass the filter), so if you want to get rid of them you also need to set "mappedonly=t".
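
To illustrate why the newer cigar symbols make MD tags unnecessary for an identity filter, here is a minimal Python sketch. It assumes a simplified definition of identity (matching bases divided by all aligned columns, indels included), which may differ from the formula BBTools actually uses; the function name `cigar_identity` is hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation symbol.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def cigar_identity(cigar):
    """Return a simplified identity for an '='/'X' CIGAR, or None if it is ambiguous.

    Assumption: identity = matches / (matches + mismatches + inserted + deleted bases).
    BBTools may weight indels differently.
    """
    counts = {}
    for length, op in CIGAR_OP.findall(cigar):
        counts[op] = counts.get(op, 0) + int(length)
    if "M" in counts:
        # 'M' covers both matches and mismatches, so the CIGAR alone cannot
        # give an identity; this is the case where an MD tag is needed.
        return None
    matches = counts.get("=", 0)
    columns = matches + counts.get("X", 0) + counts.get("I", 0) + counts.get("D", 0)
    return matches / columns if columns else None

print(cigar_identity("90=5X5="))  # new-style cigar: identity can be computed
print(cigar_identity("100M"))     # old-style cigar: None without an MD tag
```

With '=' and 'X' the mismatch count is explicit in the cigar string itself, whereas an 'M' run hides it.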

        • darthsequencer
          Member
          • Feb 2012
          • 35

          #19
          Talk about a quick turnaround! Thanks a bunch BB.

          • Gopo
            Member
            • Nov 2013
            • 41

            #20
            I am not sure whether this is a bug, but reformat.sh (version 37.76) does not give the output I expect when I use it to add fake qualities of Q30 to a PacBio Sequel fastq file. The input file, which has a default quality of "!", was produced with
            Code:
            bamtools convert -format fastq -in sequel.subreads.bam -out sequel.subreads.fastq
            Instead of a quality of ">" in the output, I get "#". The command I ran was:

            Code:
            /opt/bbmap/reformat.sh qin=33 qout=33 qfake=30 in=sequel.subreads.fastq out=sequel.subreads.fqual.fastq
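
For reference when reading such output, the mapping between Phred scores and quality characters at an ASCII offset of 33 can be checked with a few lines of Python. This is only a sketch of the encoding itself, not of what reformat.sh does internally; the helper names are hypothetical.

```python
def phred_to_char(q, offset=33):
    """Encode a Phred quality score as a single FASTQ quality character."""
    return chr(q + offset)

def char_to_phred(c, offset=33):
    """Decode a FASTQ quality character back to its Phred score."""
    return ord(c) - offset

# Decode the characters mentioned in the post, plus the one for Q30.
for symbol in ("!", "#", ">", "?"):
    print(symbol, char_to_phred(symbol))
```

Under this encoding Q30 is "?" rather than ">" (which is Q29), and "#" is Q2, so the observed "#" is far from the requested value either way.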

            • DrYak
              Member
              • Sep 2013
              • 13

              #21
              Hi,

              Can I use reformat or any other bbtools script to split my fasta file into sub-files?

              e.g., X.fa (100 sequences) -> X01.fa, X02.fa, ... X10.fa (each with 10 sequences)?

              I don't mind whether I need to select the number of sequences per file or total number of files and it doesn't really matter what order the sequences are in as long as there is no duplication of sequences.

              Cheers,
              Dave

              • GenoMax
                Senior Member
                • Feb 2008
                • 7142

                #22
                faSplit from Jim Kent's utilities is a much better option for splitting fasta files.

                Run faSplit to look at inline help for multiple options available.

                • Brian Bushnell
                  Super Moderator
                  • Jan 2014
                  • 2709

                  #23
                  Reformat won't do that, but you can use partition.sh:

                  Code:
                  partition.sh in=X.fa out=X%.fa ways=10
                  That will produce 10 output files with an equal number of sequences and no duplication.
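
For readers without BBTools at hand, the same kind of split can be sketched in plain Python. This is an illustration only and not how partition.sh is implemented: it assumes records are assigned round-robin, and the function name `split_fasta` and the two-digit file numbering are hypothetical choices.

```python
import os

def read_fasta(path):
    """Yield (header_line, sequence_lines) records from a FASTA file."""
    header, lines = None, []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                if header is not None:
                    yield header, lines
                header, lines = line, []
            elif header is not None:
                lines.append(line)
    if header is not None:
        yield header, lines

def split_fasta(path, ways):
    """Write records round-robin into `ways` files; return the output paths."""
    stem, ext = os.path.splitext(path)
    out_paths = [f"{stem}{i + 1:02d}{ext}" for i in range(ways)]
    outputs = [open(p, "w") for p in out_paths]
    try:
        for index, (header, lines) in enumerate(read_fasta(path)):
            out = outputs[index % ways]  # each record goes to exactly one file
            out.write(header)
            out.writelines(lines)
    finally:
        for out in outputs:
            out.close()
    return out_paths
```

Calling `split_fasta("X.fa", 10)` on a 100-sequence file would write X01.fa through X10.fa with 10 sequences each and no sequence repeated.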

                  • sunnycqcn
                    Member
                    • Apr 2013
                    • 17

                    #24
                    Hi Brian Bushnell,
                    When I used mapPacBio.sh to map PacBio reads, I got the following error:
                    Exception in thread "Thread-23" java.lang.AssertionError: Read 20, length 10550, exceeds the limit of 6019
                    You can map the reads in chunks by reformatting to fasta, then mapping with the setting 'fastareadlen=6019'
                    at align2.AbstractMapThread.run(AbstractMapThread.java:480)

                    But I could not find out how to reformat the reads.
                    Could you help me figure out this issue?
                    Thanks,
                    Fuyou

                    • GenoMax
                      Senior Member
                      • Feb 2008
                      • 7142

                      #25
                      You can use
                      Code:
                      reformat.sh in=your_file.fastq out=newfile.fa
                      to convert the reads to fasta format.

                      That said I think mapPacBio.sh should automatically split reads longer than 6k when it does mapping. Is that not working?
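
The chunking that the error message describes can be pictured with a short Python sketch. It assumes that chunking simply means cutting each read into consecutive pieces of at most 6019 bases; whether mapPacBio.sh names or handles the pieces this way is not confirmed here, and `chunk_read` and the `_partN` suffix are hypothetical.

```python
def chunk_read(name, sequence, max_len=6019):
    """Split one read into consecutive pieces no longer than max_len."""
    pieces = []
    for part, start in enumerate(range(0, len(sequence), max_len), start=1):
        pieces.append((f"{name}_part{part}", sequence[start:start + max_len]))
    return pieces

# The read from the error message: length 10550 against a limit of 6019.
pieces = chunk_read("read20", "A" * 10550)
print([(piece_name, len(piece)) for piece_name, piece in pieces])
```

For that read the sketch yields two pieces, of 6019 and 4531 bases.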

                      • sunnycqcn
                        Member
                        • Apr 2013
                        • 17

                        #26
                        It is not working. I used fasta format.
                        Thanks,
                        Fuyou

                        • pepe84
                          Junior Member
                          • Jul 2014
                          • 4

                          #27
                          Hello folks, I am trying to process a FASTQ file using reformat.sh. Although I have correctly installed Java and tested it on the command line, I still can't get it to work. It seems the problem is that the FASTQ file is not in the same directory as the BBMap folder; could that be an issue?

                          • SNPsaurus
                            Registered Vendor
                            • May 2013
                            • 525

                            #28
                            pepe84, do you provide a path to the file? Please copy your command as tried, and then copy the error message.
                            Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com

                            • pepe84
                              Junior Member
                              • Jul 2014
                              • 4

                              #29
                              Here is the command:
                              java -cp C:\BBMap\current\jgi.ReformatReads in=“C:\BBMap\resources\SRRXXXXX.fastq” out1=EFB_R1.fq out2=EFB_R2.fq

                              And here is the error:
                              Error: Could not find or load main class in=C:\BBMap\resources\SRRXXXXX.fastq

                              Just an FYI: I am using the command line on Windows.

                              Thanks, I appreciate any help
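
A note on the error: Java treats the argument after -cp as the classpath and the next argument as the main class, so the classpath directory and jgi.ReformatReads need to be separate arguments. In the command above they are joined into one path, which would explain why in=... is reported as the main class. The Python sketch below only assembles the argument list to make the separation explicit; `build_reformat_command` is a hypothetical helper, the paths are the ones from the post, and the command is not executed.

```python
def build_reformat_command(classpath, in_file, out1, out2):
    """Return the argument list for running jgi.ReformatReads with java."""
    return [
        "java",
        "-cp", classpath,      # the classpath is its own argument
        "jgi.ReformatReads",   # the main class follows as a separate argument
        f"in={in_file}",
        f"out1={out1}",
        f"out2={out2}",
    ]

cmd = build_reformat_command(
    r"C:\BBMap\current",
    r"C:\BBMap\resources\SRRXXXXX.fastq",
    "EFB_R1.fq",
    "EFB_R2.fq",
)
print(" ".join(cmd))
```

Whether the rest of the command then behaves as intended has not been tested here.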


                              • rwhet052
                                Junior Member
                                • Jan 2011
                                • 8

                                #30
                                reformat.sh hangs in sleep status

                                I used demuxbyname.sh to split four lanes of Illumina data into separate files for 84 samples, and now I'm running a loop with reformat.sh to rename the samples from the index sequences to more biologically relevant names, catenate all four lanes of data from the same sample together, and produce a single file of gzipped interleaved output. The loop is running on a cluster with 37.41 installed, and worked fine for the first 51 samples of the 84, but hung on sample 52. A
                                Code:
                                ps aux | grep <user>
                                command returns
                                Code:
                                <user> 14126  0.6  0.0 6610600 250716 ?      Sl   00:21   4:42 java -ea -Xmx200m -cp /isg/shared/apps/bbmap/37.41/current/ jgi.ReformatReads in=L6_GAGATTCC+CTTCGCCT_#.fq out=L6_A2.fq
                                , which indicates the job is hung at one step in the loop to interleave the individual sample files before catenating all four samples together. The last output to the L6_A2.fq file was over 12 hours ago, so it seems unlikely that the job will recover from this status. Is there a way to avoid this problem?
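
The cause of the hang is not established in this thread, so the following is only a defensive workaround, not a fix: a minimal Python sketch that runs each step of such a loop with a time limit, so that one stalled process is stopped and reported rather than blocking the remaining samples. The function name `run_step` and any time limit chosen are hypothetical.

```python
import subprocess

def run_step(cmd, timeout_s):
    """Run one command; return True on success, False on failure or timeout."""
    try:
        result = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return False
    return result.returncode == 0
```

A loop could call `run_step` once per sample and collect the samples that returned False for a rerun. If the command is a wrapper script rather than java itself, processes started by the wrapper may need separate cleanup.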
