
  • #16
    Oh - that's an intentional protection from overwriting files. Just delete the output file first or add the "overwrite" flag.



    • #17
      High contaminants

      Thanks.

      Input is being processed as unpaired

      Input: 385043 reads 10781204 bases.
      Contaminants: 341911 reads (88.80%) 9573508 bases (88.80%)
      Result: 43132 reads (11.20%) 1207696 bases (11.20%)

      What is the definition of contaminants? The rate looks very high.



      • #18
        I only need the first 30 nt of each sequence, but the MiSeq read 32 nt, so many sequences have NN at the last 2 positions. Could this be related to the high contaminant rate?



        • #19
          Are you using bbduk.sh? That's the only one that prints anything about contaminants. Can you show your specific command line?

          Anyway, if you tried filtering out adapters and you got a result like that, it means you have almost no product and mostly adapter sequence.



          • #20
            Yes, bbduk.sh.

            Input is being processed as unpaired

            Input: 385043 reads 10781204 bases.
            Contaminants: 341911 reads (88.80%) 9573508 bases (88.80%)
            Result: 43132 reads (11.20%) 1207696 bases (11.20%)



            • #21
              Please give me the exact command line (what you typed before you hit enter).



              • #22
                k=16 shows a higher contaminant rate than k=26:

                zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ bbduk.sh -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_26.txt k=26 fbm
                java -ea -Xmx1g -cp /home/zheng/Desktop/bbmap/current/ jgi.BBDukF -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_26.txt k=26 fbm
                Executing jgi.BBDukF [-Xmx1g, in=probe48mix25fg_S7_L001_R2_001.fastq, ref=ngs13template.fasta, stats=probe48mix25fg_S7_L001_R2_001_26.txt, k=26, fbm]

                No output stream specified. To write to stdout, please specify 'out=stdout.fq' or similar.
                Initial:
                Memory: free=237m, used=14m

                Added 13 kmers; time: 0.023 seconds.
                Memory: free=228m, used=23m

                Input is being processed as unpaired

                Input: 159642 reads 4469976 bases.
                Contaminants: 130724 reads (81.89%) 3660272 bases (81.89%)
                Result: 28918 reads (18.11%) 809704 bases (18.11%)

                Time: 0.197 seconds.
                Reads Processed: 159k 811.47k reads/sec
                Bases Processed: 4469k 22.72m bases/sec
                zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ ^C
                zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ bduk.sh -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_16.txt k=16 fbm
                bduk.sh: command not found
                zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ bbduk.sh -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_16.txt k=16 fbm
                java -ea -Xmx1g -cp /home/zheng/Desktop/bbmap/current/ jgi.BBDukF -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_16.txt k=16 fbm
                Executing jgi.BBDukF [-Xmx1g, in=probe48mix25fg_S7_L001_R2_001.fastq, ref=ngs13template.fasta, stats=probe48mix25fg_S7_L001_R2_001_16.txt, k=16, fbm]

                No output stream specified. To write to stdout, please specify 'out=stdout.fq' or similar.
                Initial:
                Memory: free=237m, used=14m

                Added 143 kmers; time: 0.028 seconds.
                Memory: free=228m, used=23m

                Input is being processed as unpaired

                Input: 159642 reads 4469976 bases.
                Contaminants: 151727 reads (95.04%) 4248356 bases (95.04%)
                Result: 7915 reads (4.96%) 221620 bases (4.96%)



                • #23
                  So... that's telling you that you are getting matches between the stuff in your input file (probe48mix25fg_S7_L001_R2_001.fastq) and your reference file (ngs13template.fasta). And a shorter kmer will always find more matches in the presence of error.

                  probe48mix25fg_S7_L001_R2_001_26.txt will contain a list of which reference sequences were seen, and how many times they were seen.
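To illustrate the point about shorter kmers and errors, here is a minimal Python sketch of exact kmer matching. It is not BBDuk's implementation: the reference and read are made-up sequences, the function names are hypothetical, and it ignores reverse complements and any mismatch tolerance.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def read_matches(read, ref_kmers, k):
    """True if the read shares at least one exact kmer with the reference."""
    return any(read[i:i + k] in ref_kmers for i in range(len(read) - k + 1))


# Hypothetical 30 bp reference, and a read copied from it with a single
# sequencing error introduced at position 20 (G replaced by T).
reference = "ACGTTGCAAGCTTAGGCATCGATCCGTAAT"
read = reference[:20] + "T" + reference[21:]

# The longest error-free stretch in the read is 20 bases, so a 16-mer can
# still fit inside it, but every 26-mer has to span the error.
matches_k16 = read_matches(read, kmers(reference, 16), 16)
matches_k26 = read_matches(read, kmers(reference, 26), 26)
print("k=16 match:", matches_k16)
print("k=26 match:", matches_k26)
```

With one error in a 30 bp read, the k=16 search still reports a match while the k=26 search does not, which is the same direction as the 95.04% versus 81.89% figures above.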



                  • #24
                    "And a shorter kmer will always find more matches in the presence of error."

                    But here k=16 shows fewer matching sequences than k=26:

                    for k=16
                    Input: 159642 reads 4469976 bases.
                    Contaminants: 151727 reads (95.04%) 4248356 bases (95.04%)
                    Result: 7915 reads (4.96%) 221620 bases (4.96%)

                    for k=26
                    Input: 159642 reads 4469976 bases.
                    Contaminants: 130724 reads (81.89%) 3660272 bases (81.89%)
                    Result: 28918 reads (18.11%) 809704 bases (18.11%)



                    • #25
                      In this case, the output is misleading... BBDuk assumes that the ref file is a file of contaminants because that's what I originally designed it for. So "Contaminants" actually means "Things that match the reference". I may change the wording eventually.

                      In other words, 95.04% of the reads matched the reference for K=16 and 81.89% did for K=26.
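As a quick check of that reading, the percentages can be recomputed from the read counts in the logs. This is only a sketch; `match_summary` is a hypothetical helper name, not part of BBDuk.

```python
def match_summary(total_reads, matched_reads):
    """Return (percent matching the reference, percent not matching)."""
    matched_pct = 100.0 * matched_reads / total_reads
    unmatched_pct = 100.0 * (total_reads - matched_reads) / total_reads
    return round(matched_pct, 2), round(unmatched_pct, 2)


# Read counts taken from the two runs posted above.
k16 = match_summary(159642, 151727)  # "Contaminants" line for k=16
k26 = match_summary(159642, 130724)  # "Contaminants" line for k=26
print("k=16 matched / unmatched:", k16)
print("k=26 matched / unmatched:", k26)
```

The first number in each pair corresponds to the "Contaminants" line (reads matching the reference) and the second to the "Result" line (reads that did not match).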



                      • #26
                        Great, thanks.

                        Zheng



                        • #27
                          Is there a size limitation for the reference sequences? It does not work when I add a 20 bp reference sequence.



                          • #28
                            The size limit is the same as kmer length. So, if k=30, it will not work with anything less than a 30bp reference.
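The reason for that limit is simple counting: a sequence of length L contains L - k + 1 kmers, and none at all once L is smaller than k. A small sketch, where `kmer_count` is a hypothetical helper and not a BBDuk function:

```python
def kmer_count(seq_len, k):
    """Number of length-k windows in a sequence of seq_len bases."""
    return max(0, seq_len - k + 1)


# A 20 bp reference yields no kmers at all once k exceeds 20.
short_ref_k30 = kmer_count(20, 30)
exact_ref_k30 = kmer_count(30, 30)
print("20 bp reference, k=30:", short_ref_k30)
print("30 bp reference, k=30:", exact_ref_k30)

# Assumption, not stated in the thread: 13 reference sequences of 26 bp each.
# That would give the same totals as the "Added 13 kmers" (k=26) and
# "Added 143 kmers" (k=16) lines in the logs above.
total_k26 = 13 * kmer_count(26, 26)
total_k16 = 13 * kmer_count(26, 16)
print("13 x 26 bp, k=26:", total_k26)
print("13 x 26 bp, k=16:", total_k16)
```

The 13 x 26 bp layout is only an inference from the kmer totals in the posted logs. The actual reference lengths were never given, and this sketch does not account for duplicate kmers.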



                            • #29
                              Thanks.

                              How do you separate unambiguousReads and ambiguousReads in bbmap.sh?



                              • #30
                                Ambiguously mapped reads get a "XT:A:R" tag in the sam output while unambiguously mapped get "XT:A:U".

                                You can also forbid ambiguously-mapping reads using the flag "ambig=toss", which will consider them unmapped.
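If you want the two classes in separate lists, one option is to split the sam records on that tag yourself. A minimal Python sketch follows; the function name and the example records are made up for illustration, and it assumes the tag appears among the optional fields (column 12 onward) exactly as written above.

```python
def split_by_ambiguity(sam_lines):
    """Split SAM records into (unambiguous, ambiguous, untagged) lists."""
    unambiguous, ambiguous, untagged = [], [], []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        # SAM has 11 mandatory tab-separated fields; optional tags follow.
        tags = line.rstrip("\n").split("\t")[11:]
        if "XT:A:U" in tags:
            unambiguous.append(line)
        elif "XT:A:R" in tags:
            ambiguous.append(line)
        else:
            untagged.append(line)  # e.g. unmapped reads without the tag
    return unambiguous, ambiguous, untagged


# Made-up records, for illustration only.
example = [
    "@HD\tVN:1.4",
    "read1\t0\tref1\t1\t40\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\tXT:A:U\tNM:i:0",
    "read2\t0\tref1\t5\t3\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\tXT:A:R\tNM:i:0",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGT\tIIIIIIII",
]
uniq, ambig, rest = split_by_ambiguity(example)
print(len(uniq), "unambiguous,", len(ambig), "ambiguous,", len(rest), "untagged")
```

For a real file you would pass an open file handle instead of the example list, since the function only iterates over lines.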
