  • Hi. Does anybody have any idea why bbduk is only reading (and trimming) 364 reads from my file?

    The HPC is using BBDUK 36.32.

    Here's my code:

    bbduk.sh in1=Vireo1_R1_001.fastq.gz in2=Vireo1_R2_001.fastq.gz out1=Vireo1_R1_trimmed.fastq.gz out2=Vireo1_R2_trimmed.fastq.gz ref=/opt/bbmap/36.32/bbmap/resources/adapters.fa threads=12 k=19 mink=5 hdist=1 ktrim=r qtrim=r minlength=36 trimq=14

    I checked the header of both the input and the (short) output file. They both appear to be formatted correctly, so there isn't a file corruption issue that I can detect. Also, zcat shows a reasonable number of reads for the input file (about 26 million reads), and the input file size is correct.

    I'm stumped.



    • That is a pretty old version of BBMap. I suggest that you start by upgrading to the latest version first.

      It seems unlikely, but are the rest of the reads failing other limits you have set?



      • When you say you checked the header, do you mean you looked at the 364th and 365th reads? What happens if you take some other random set of reads from the input and use that? For example:
        zcat Vireo1_R1_001.fastq.gz | head -2000 | tail -1000 > test_R1.fastq
        (and the same for R2). What happens if you run just read 1? (Are the two files out of sync?)
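
A rough way to run both checks (read count and R1/R2 pairing) is sketched below. It assumes standard four-line FASTQ records and read names that match between the two files apart from an optional trailing /1 or /2; the function names are made up for illustration and are not part of BBTools.

```shell
# Count reads in a gzipped FASTQ file (assumes 4 lines per record).
count_reads() {
    gzip -dc "$1" | awk 'END { print int(NR / 4) }'
}

# Print the read name of every record: the first header field,
# with a trailing /1 or /2 removed if present.
read_names() {
    gzip -dc "$1" | awk 'NR % 4 == 1 { name = $1; sub(/\/[12]$/, "", name); print name }'
}

# Print the record number of the first R1/R2 name mismatch, or 0 if none.
# Note: the R2 names are written to a temporary file, which can be large.
first_mismatch() {
    tmp=$(mktemp)
    read_names "$2" > "$tmp"
    read_names "$1" | paste - "$tmp" |
        awk '$1 != $2 { print NR; found = 1; exit } END { if (!found) print 0 }'
    rm -f "$tmp"
}
```

For the files in this thread that would be `count_reads Vireo1_R1_001.fastq.gz` for each file, then `first_mismatch Vireo1_R1_001.fastq.gz Vireo1_R2_001.fastq.gz`. A non-zero result near record 364 would point at the input rather than at bbduk.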
        Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com



        • Can genome be used to filter RNA-seq reads?

          Hi!

          I have plant RNA-seq reads that are contaminated with fungal reads. I have access to a draft genome of the fungus. Is it possible to use BBduk or Seal to filter fungal RNA reads away from plant RNA reads using the DNA sequence of the contaminant?

          Thanks,
          Chris



          • You should use bbsplit for this purpose. Provide the genome for your fungus alongside the plant and bin the reads.



            • Hey GenoMax,

              Thank you for the reply! I had not heard of bbsplit. Unfortunately, I don't have a genomic sequence for the plant, only for the fungus.

              How does this change my options?

              Thanks!



              • @cb841011: Since this was also cross-posted and discussed on Biostars I will add a reference to the thread here: https://www.biostars.org/p/302864/

                You could always use a closely related grass genome (if one is available). There would be some loss of real data (or gain of false positives) but since you don't have the genome of your grass it is about the best you can do.

                Since you have
                Draft genome of the fungus
                RNA-seq reads from non-infected grass
                RNA-seq reads from infected grass (contains grass and fungal transcripts)
                RNA-seq reads from the fungus growing in culture
                You could assemble transcriptomes (using Trinity) from non-infected grass and then fungus. Use those to see if you are able to find any new transcripts showing up in the infected grass.
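
A minimal sketch of the bbsplit approach with a related grass genome standing in for the missing plant reference. The reference and read file names are hypothetical placeholders, and the `run` helper is only a convenience added here so the command can be printed (DRY_RUN=1) before it is executed. The `%` in `basename` is replaced by the reference name, so reads are binned per reference; reads matching neither reference go to the `outu` files.

```shell
# Hypothetical reference files; replace with your own.
FUNGUS_REF=fungus_draft.fa
GRASS_REF=related_grass.fa

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1 = R1 FASTQ, $2 = R2 FASTQ, $3 = output prefix
split_reads() {
    run bbsplit.sh in1="$1" in2="$2" \
        ref="$FUNGUS_REF,$GRASS_REF" \
        basename="${3}_%.fq.gz" \
        outu1="${3}_unmapped_1.fq.gz" outu2="${3}_unmapped_2.fq.gz"
}
```

Running `DRY_RUN=1 split_reads infected_R1.fq.gz infected_R2.fq.gz infected` prints the command for inspection. Bear in mind that these are RNA-seq reads mapped against genomic references, so reads spanning splice junctions may end up in the unmapped files.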



                • Error

                  Hello,

                  I am trying to run bbduk on my server with the following command:
                  ~/soft/bbmap/bbduk.sh in=myfile.fastq.gz out=myfile_filtered.fq outm=myfile_low_complexity.fq entropy=0.5

                  and I get this error:

                  Exception in thread "main" java.lang.NoClassDefFoundError: java.util.concurrent.ThreadLocalRandom
                  at java.lang.J9VMInternals.verifyImpl(Native Method)
                  at java.lang.J9VMInternals.verify(J9VMInternals.java:72)
                  at java.lang.J9VMInternals.initialize(J9VMInternals.java:134)
                  at jgi.BBDukF.<clinit>(BBDukF.java:4267)
                  at java.lang.J9VMInternals.initializeImpl(Native Method)
                  at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
                  Caused by: java.lang.ClassNotFoundException: java.util.concurrent.ThreadLocalRandom
                  at java.net.URLClassLoader.findClass(URLClassLoader.java:423)
                  at java.lang.ClassLoader.loadClass(ClassLoader.java:660)
                  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:346)
                  at java.lang.ClassLoader.loadClass(ClassLoader.java:626)
                  ... 6 more
                  Could not find the main class: jgi.BBDukF. Program will exit.

                  Any idea about why this could be happening?


                  A second question: if I want to change WAYS=7 to WAYS=1 in order to run bbduk on my laptop, how do I change it and recompile, as suggested in the README file?

                  Thanks,

                  Nicolas.



                  • What OS are you using? Did you move any of the files around after you downloaded and uncompressed the software?

                    I am not sure where the WAYS=7 option is in README. You should be able to run bbduk on your laptop. Set threads=N if you want to limit resource usage.
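
One more thing that may be worth checking: java.util.concurrent.ThreadLocalRandom was added in Java 7, and the J9VMInternals frames in the stack trace suggest an IBM J9 JVM, so the server may simply be running an older Java. The sketch below is a generic version check and not something shipped with BBTools; the function names are illustrative.

```shell
# Convert a Java version string such as "1.6.0", "1.8.0_292" or "11.0.2"
# to its major version number (6, 8, 11).
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 | cut -d- -f1 ;;
    esac
}

# Extract the quoted version from `java -version`, which prints to stderr.
installed_java_version() {
    java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }'
}

# Return non-zero if the java on PATH is older than Java 7.
check_java() {
    major=$(java_major "$(installed_java_version)")
    if [ "${major:-0}" -lt 7 ]; then
        echo "Java ${major} found; ThreadLocalRandom needs Java 7 or newer" >&2
        return 1
    fi
}
```

If `check_java` fails, pointing PATH at a newer JVM before calling bbduk.sh would be the first thing to try.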



                    • I have run into a bit of a problem with the adapter trimming. It keeps leaving the sequence "AGATCGG" at the end of reads when I run, for example:

                      ./bbduk.sh -Xmx1g in1=read1_R1.fastq in2=read1_R2.fastq out1=cleanread1_R1.fastq out2=cleanread1_R2.fastq ktrim=r ref=resources/adapters.fa k=28 mink=12 hdist=1

                      My library was made with the NEBNext ultra directional kit with NEBnext primers 1-48. Is there an updated adapters.fa list that will hit these sequences?



                      • @horvathdp: You can provide the NEBNext primers in a separate multi-FASTA file, then use that file with bbduk.sh. Also, with paired-end reads, use the options "tpe tbo" to remove residual adapter bases at the ends of reads.



                        • Thanks! For those options, do I just add -tpe -tbo to the command?



                          • Originally posted by horvathdp:
                            "Thanks! For those options, do I just add -tpe -tbo to the command?"

                            No hyphens. Just tpe and tbo.
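
Put together, the earlier command with the extra adapter file and the two flags might look like the sketch below. The file name nebnext_adapters.fa is a hypothetical multi-FASTA file you would create from the kit's adapter and primer sequences; `ref` takes a comma-separated list of files. The `run` helper is only there so the command can be printed (DRY_RUN=1) before running it.

```shell
# Hypothetical multi-FASTA file with the NEBNext adapter/primer sequences.
EXTRA_ADAPTERS=nebnext_adapters.fa

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1 = sample prefix, e.g. read1 for read1_R1.fastq / read1_R2.fastq
trim_pair() {
    run ./bbduk.sh -Xmx1g \
        in1="${1}_R1.fastq" in2="${1}_R2.fastq" \
        out1="clean${1}_R1.fastq" out2="clean${1}_R2.fastq" \
        ref="resources/adapters.fa,$EXTRA_ADAPTERS" \
        ktrim=r k=28 mink=12 hdist=1 tpe tbo
}
```

`trim_pair read1` reproduces the original call with the additions. All other parameters are kept exactly as posted.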



                            • trimming Long and Short

                              Hi, do I need to follow a different approach when trimming and filtering short vs. long mate-pair reads (Nextera)? If so, could someone describe the pipeline?



                              • GenoMax, do we have to trim mate-pair reads differently? I ask because they have the internal adapter. I am not asking about Nextera mate pairs but about reads made with MatePairSamplePrep v2. Do we have to reverse-complement them?
