  • #16
    Oh - that's an intentional protection from overwriting files. Just delete the output file first or add the "overwrite" flag.


    • #17
      High contaminants

      Thanks.

      Input is being processed as unpaired

      Input: 385043 reads 10781204 bases.
      Contaminants: 341911 reads (88.80%) 9573508 bases (88.80%)
      Result: 43132 reads (11.20%) 1207696 bases (11.20%)

      What is the definition of "contaminants"? The rate looks very high.


      • #18
        I only need 30 nt of each sequence, but the MiSeq read 32 nt, so many sequences have NN at the last 2 positions. Could this be related to the high contaminant rate?


        • #19
          Are you using bbduk.sh? That's the only one that prints anything about contaminants. Can you show your specific command line?

          Anyway, if you tried filtering out adapters and you got a result like that, it means you have almost no product and mostly adapter sequence.


          • #20
            Yes, bbduk.sh.

            Input is being processed as unpaired

            Input: 385043 reads 10781204 bases.
            Contaminants: 341911 reads (88.80%) 9573508 bases (88.80%)
            Result: 43132 reads (11.20%) 1207696 bases (11.20%)


            • #21
              Please give me the exact command line (what you typed before you hit enter).


              • #22
                k=16 shows more contaminants than k=26.

                zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ bbduk.sh -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_26.txt k=26 fbm
                java -ea -Xmx1g -cp /home/zheng/Desktop/bbmap/current/ jgi.BBDukF -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_26.txt k=26 fbm
                Executing jgi.BBDukF [-Xmx1g, in=probe48mix25fg_S7_L001_R2_001.fastq, ref=ngs13template.fasta, stats=probe48mix25fg_S7_L001_R2_001_26.txt, k=26, fbm]

                No output stream specified. To write to stdout, please specify 'out=stdout.fq' or similar.
                Initial:
                Memory: free=237m, used=14m

                Added 13 kmers; time: 0.023 seconds.
                Memory: free=228m, used=23m

                Input is being processed as unpaired

                Input: 159642 reads 4469976 bases.
                Contaminants: 130724 reads (81.89%) 3660272 bases (81.89%)
                Result: 28918 reads (18.11%) 809704 bases (18.11%)

                Time: 0.197 seconds.
                Reads Processed: 159k 811.47k reads/sec
                Bases Processed: 4469k 22.72m bases/sec
                zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ ^C
                zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ bduk.sh -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_16.txt k=16 fbm
                bduk.sh: command not found
                zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ bbduk.sh -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_16.txt k=16 fbm
                java -ea -Xmx1g -cp /home/zheng/Desktop/bbmap/current/ jgi.BBDukF -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_16.txt k=16 fbm
                Executing jgi.BBDukF [-Xmx1g, in=probe48mix25fg_S7_L001_R2_001.fastq, ref=ngs13template.fasta, stats=probe48mix25fg_S7_L001_R2_001_16.txt, k=16, fbm]

                No output stream specified. To write to stdout, please specify 'out=stdout.fq' or similar.
                Initial:
                Memory: free=237m, used=14m

                Added 143 kmers; time: 0.028 seconds.
                Memory: free=228m, used=23m

                Input is being processed as unpaired

                Input: 159642 reads 4469976 bases.
                Contaminants: 151727 reads (95.04%) 4248356 bases (95.04%)
                Result: 7915 reads (4.96%) 221620 bases (4.96%)


                • #23
                  So... that's telling you that you are getting matches between the stuff in your input file (probe48mix25fg_S7_L001_R2_001.fastq) and your reference file (ngs13template.fasta). And a shorter kmer will always find more matches in the presence of error.

                  probe48mix25fg_S7_L001_R2_001_26.txt will contain a list of which reference sequences were seen, and how many times they were seen.
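To illustrate why a shorter kmer tends to find more matches when reads contain errors, here is a minimal Python sketch of exact kmer matching. It is not BBDuk's implementation (it ignores reverse complements and every other detail of the real tool), and the reference sequence, the read, and the function names are made up for the example.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def read_matches(read, ref_kmers, k):
    """True if the read shares at least one exact kmer with the reference."""
    return any(km in ref_kmers for km in kmers(read, k))


# Hypothetical 30 bp reference (not from the thread).
reference = "ACGTTGCAAGCTTAGGCTAACCGGTTAGCA"

# Hypothetical read: the same sequence with one substitution at index 5 (G -> T).
read = reference[:5] + "T" + reference[6:]

for k in (16, 26):
    print(k, read_matches(read, kmers(reference, k), k))
# prints:
# 16 True
# 26 False
```

With k=26, every 26-mer of this 30 bp read overlaps the error, so nothing matches. With k=16, the 16-mers that start after the error are still intact, so the read is reported as a match.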


                  • #24
                    "And a shorter kmer will always find more matches in the presence of error."

                    But here k=16 shows fewer matching sequences than k=26:

                    for k=16
                    Input: 159642 reads 4469976 bases.
                    Contaminants: 151727 reads (95.04%) 4248356 bases (95.04%)
                    Result: 7915 reads (4.96%) 221620 bases (4.96%)

                    for k=26
                    Input: 159642 reads 4469976 bases.
                    Contaminants: 130724 reads (81.89%) 3660272 bases (81.89%)
                    Result: 28918 reads (18.11%) 809704 bases (18.11%)


                    • #25
                      In this case, the output is misleading... BBDuk assumes that the ref file is a file of contaminants because that's what I originally designed it for. So "Contaminants" actually means "Things that match the reference". I may change the wording eventually.

                      In other words, 95.04% of the reads matched the reference for K=16 and 81.89% did for K=26.
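As a quick check of that reading, the percentages can be recomputed from the read counts in the log. This is only arithmetic on the numbers posted above; the function name is made up for the example.

```python
def percent(part, total):
    """Return part as a percentage of total."""
    return 100 * part / total


total_reads = 159642
# Reads reported under "Contaminants", i.e. reads that matched the reference.
matched = {16: 151727, 26: 130724}

for k, n in matched.items():
    pct = percent(n, total_reads)
    print(f"k={k}: {pct:.2f}% matched, {100 - pct:.2f}% did not match")
# prints:
# k=16: 95.04% matched, 4.96% did not match
# k=26: 81.89% matched, 18.11% did not match
```

So the "Result" line is the set of reads that did not match the reference, and it shrinks as k gets shorter.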


                      • #26
                        Great, thanks.

                        Zheng


                        • #27
                          Is there a size limit for the reference sequences? It does not work when I add a 20 bp reference sequence.


                          • #28
                            The size limit is the same as kmer length. So, if k=30, it will not work with anything less than a 30bp reference.
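The reason is simple counting: a sequence of length L contains L - k + 1 windows of length k, and none at all if L < k. A small Python sketch (the function name is made up, and it counts windows, not necessarily the distinct kmers a real tool would store):

```python
def kmer_count(ref_len, k):
    """Number of length-k windows in a sequence of ref_len bases."""
    return max(0, ref_len - k + 1)


print(kmer_count(20, 30))  # 0 -> a 20 bp reference contributes nothing at k=30
print(kmer_count(30, 30))  # 1
print(kmer_count(20, 16))  # 5
```

For what it is worth, the "Added 13 kmers" (k=26) and "Added 143 kmers" (k=16) lines in the earlier log would be consistent with 13 reference sequences of 26 bp each (13 x 1 and 13 x 11), but that is only an inference from the numbers, not something stated in the thread.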


                            • #29
                              Thanks.

                              How do you separate unambiguousReads and ambiguousReads in bbmap.sh?


                              • #30
                                Ambiguously mapped reads get a "XT:A:R" tag in the sam output while unambiguously mapped get "XT:A:U".

                                You can also forbid ambiguously-mapping reads using the flag "ambig=toss", which will consider them unmapped.
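If you want the two groups in separate files, one option is to split the sam output on those tags yourself. Below is a minimal Python sketch that classifies a SAM line by looking for "XT:A:R" or "XT:A:U" among the optional fields. The function name and the example records are made up; only the two tag values come from the post above.

```python
def classify_sam_line(line):
    """Return 'header', 'ambiguous', 'unambiguous', or 'untagged' for one SAM line."""
    if line.startswith("@"):
        return "header"
    # A SAM record has 11 mandatory tab-separated fields; optional tags follow.
    tags = line.rstrip("\n").split("\t")[11:]
    if "XT:A:R" in tags:
        return "ambiguous"
    if "XT:A:U" in tags:
        return "unambiguous"
    return "untagged"


def make_record(name, tag=None):
    """Build a made-up SAM record for the demo (not real alignment data)."""
    fields = [name, "0", "ref1", "1", "40", "30M", "*", "0", "0", "A" * 30, "I" * 30]
    if tag is not None:
        fields.append(tag)
    return "\t".join(fields)


lines = [
    "@HD\tVN:1.4",
    make_record("read1", "XT:A:U"),
    make_record("read2", "XT:A:R"),
    make_record("read3"),
]

for line in lines:
    print(classify_sam_line(line))
# prints:
# header
# unambiguous
# ambiguous
# untagged
```

To write separate files, open two output handles and send each record (plus the header lines) to the one that matches its classification.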
