
  • #61
    Can you provide the exact command you are using? Your reads do have TruSeq adapters in them, so the inserts may be shorter than you expected.

    Code:
    @M02344:9008:000000000-AJ1PF:1:1110:28244:16073 1:N:0:TGACCAAT+ATAGAGGC
    TCTGCCGTCATCGACTTCGAAGGTTCGAATCCTTCCCCTCTAACCACGGCCGAAATTCAATACCCGGATCAAGCTCAATTCGGGTCGAGGTCGGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGA[COLOR="Red"]GATCGGAAGAGCACACGTCTGAACTCCAGTCACTGACCAATGTCGTATGCCGTCTTCTGCTTG[/COLOR]AAAAAAAATAAGTGGTGCGAAGAGAGCCTGTGGCCAACCTCATATGCGTGGAGATGTCTCG
    Last edited by GenoMax; 05-05-2016, 11:52 AM.



    • #62
      In this case it looks like BBMerge's output is correct... as GenoMax said, you have adapter sequences indicating short inserts. Specifically, read1's first 126 bases exactly match BBMerge's output, and subsequently there is:
      AGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGACCAATGTCGTATGCCGTCTTCTGCTTG
      ...a known Illumina adapter sequence, followed by AAAAAA, which is common once the Illumina machine runs off the end of the adapter sequence and no longer has any signal.

      No matter what you expect/design your insert size to be, shorter fragments will almost always be present.
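The reasoning above (an adapter showing up inside a read means the insert is shorter than the read) can be checked by hand with a small sketch. The function name adapter_offset is hypothetical, the adapter prefix is taken from the sequence quoted in this post, and the match is exact-string only, so treat it as an illustration rather than a replacement for a real adapter trimmer.

```bash
#!/usr/bin/env bash
# Report how many bases precede the first exact adapter match in a read.
# Hypothetical helper for illustration; real trimmers tolerate mismatches.
adapter_offset() {
  local bases=$1 adapter=$2
  # Strip the longest suffix that starts with the adapter, which leaves
  # everything to the left of the FIRST adapter occurrence.
  local before=${bases%%"$adapter"*}
  if [[ "$before" == "$bases" ]]; then
    echo "none"            # adapter not present in this read
  else
    echo "${#before}"      # number of insert bases before the adapter
  fi
}
```

For example, calling adapter_offset with a read sequence and AGATCGGAAGAGC as the second argument prints the implied insert length for that read, or "none" if the adapter is absent.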



      • #63
        I am using:

        Code:
        bbmerge.sh in1=<read1> in2=<read2> out=<mergedreads> outu1=<unmerged1> outu2=<unmerged2> mismatches=0



        • #64
          Thank you both for your help! Do you have any suggestions on how to set optional parameters to ensure that my merged file only contains sequences of my intended/designed insert size?



          • #65
            You can postfilter it afterward:

            reformat.sh in=reads.fq out=filtered.fq minlength=202 maxlength=202

            But bear in mind that you may lose important data by doing so. For example, there could be many sequences that are 199 bp long for a real biological reason (rather than because of a library-prep problem), so be cautious.
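Before discarding everything that is not exactly the designed length, it may help to look at the length distribution of the merged reads. The following is a minimal sketch that uses awk rather than any BBTools command; the function name fastq_length_histogram is hypothetical, and it assumes an uncompressed FASTQ file with four lines per record.

```bash
#!/usr/bin/env bash
# Print "length<TAB>count" for every read length found in a FASTQ file.
# Assumes plain (uncompressed) FASTQ with unwrapped, 4-line records.
fastq_length_histogram() {
  awk 'NR % 4 == 2 { n[length($0)]++ }               # line 2 of each record is the sequence
       END { for (len in n) print len "\t" n[len] }' "$1" |
    sort -n                                           # order rows by read length
}
```

Running fastq_length_histogram on the merged output shows whether a secondary peak (such as the 199 bp case mentioned above) would be thrown away by a strict minlength/maxlength filter.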



            • #66
              Thank you!



              • #67
                Hello there! Is there a way we could generate a stats or log file with bbmerge?



                • #68
                  Originally posted by shimingt:
                  Hello there! Is there a way we could generate a stats or log file with bbmerge?
                  Stats are automatically generated with each run of BBMerge. They look something like this:

                  Code:
                  Pairs:                  2879431
                  Joined:                 2052925         71.296%
                  Ambiguous:              810015          28.131%
                  No Solution:            16491           0.573%
                  Too Short:              0               0.000%
                  
                  Avg Insert:             396.7
                  Standard Deviation:     98.1
                  Mode:                   415
                  
                  Insert range:           35 - 591
                  90th percentile:        524
                  75th percentile:        469
                  50th percentile:        402
                  25th percentile:        332
                  10th percentile:        262
                  To get this log information into a file, redirect both STDOUT and STDERR using standard bash redirection (for example, by adding "> log.txt 2>&1" to the command).
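Once the log has been captured to a file, individual values can be pulled out of it. This is a minimal sketch, assuming the log keeps the "Label:   value" layout shown in the sample above (the layout may differ between BBTools versions); the function name bbmerge_stat is hypothetical.

```bash
#!/usr/bin/env bash
# Print the value(s) that follow "<label>:" in a captured BBMerge log.
# Assumes the "Label:   value   [percent]" layout from the sample output.
bbmerge_stat() {
  local label=$1 logfile=$2
  awk -F':' -v key="$label" '
    {
      name = $1
      sub(/^[ \t]+/, "", name)        # tolerate indented lines
      if (name == key) {
        value = $2
        gsub(/[ \t]+/, " ", value)    # collapse the column padding
        sub(/^ /, "", value)
        sub(/ $/, "", value)
        print value
      }
    }' "$logfile"
}
```

For example, bbmerge_stat Joined log.txt prints the joined-pair count and percentage, and bbmerge_stat "Avg Insert" log.txt prints the average insert size.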



                  • #69
                    Log File

                    Hello,

                    I am a novice with the command line, and I can't seem to get a log file.

                    This is my command:

                    for x in *_R1_001.fastq;do echo bbmerge.sh -Xmx20g in1=$x in2=${x%_R1_001.*}_R2_001.fastq out=bbmerge\/${x%_*-*_R1_001.*}_R1R2_bbmerge.fastq>bbmerge\/${x%_*-*_L001_R1_001.*}_bbmerge_log.txt 2>&1 outu1=outu1\/$x outu2=outu2\/${x%_R1_001.*}_R2_001.fastq >> bbmerge.sh;done

                    Am I doing something wrong here?

                    There are log files but they are all empty?
                    Last edited by shimingt; 06-14-2016, 03:48 AM.



                    • #70
                      Try this (I am assuming that your variable expressions are correct). The log file and the redirect should be at the end of the command.

                      Code:
                      for x in *_R1_001.fastq;do your_bbmerge_command_along_with_options >bbmerge\/${x%_*-*_L001_R1_001.*}_bbmerge_log.txt 2>&1;done



                      • #71
                        Log file from bbmerge

                        Hello GenoMax,

                        Thanks for your reply.

                        I tried a command like this:

                        for x in *_R1_001.fastq;do echo bbmerge.sh -Xmx20g in1=$x in2=${x%_R1_001.*}_R2_001.fastq out=bbmerge\/${x%_*-*_R1_001.*}_R1R2_bbmerge.fastq outu1=outu1\/$x outu2=outu2\/${x%_R1_001.*}_R2_001.fastq > bbmerge\/${x%_*-*_L001_R1_001.*}_bbmerge_log.txt 2>&1 >> bbmerge.sh;done


                        The log files are generated according to the names, but the log files are empty.

                        Am I doing something wrong here?



                        • #72
                          Can you try

                          Code:
                          for x in *_R1_001.fastq;do bbmerge.sh -Xmx20g in1=$x in2=${x%_R1_001.*}_R2_001.fastq out=bbmerge\/${x%_*-*_R1_001.*}_R1R2_bbmerge.fastq outu1=outu1\/$x outu2=outu2\/${x%_R1_001.*}_R2_001.fastq > bbmerge\/${x%_*-*_L001_R1_001.*}_bbmerge_log.txt 2>&1;done
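A likely reason the earlier attempts produced empty logs: echo only prints the command text instead of running it, and the trailing ">> bbmerge.sh" re-points STDOUT after the log redirection has already been set up, so nothing is ever written to the log. The sketch below reproduces both patterns with a stand-in function, fake_tool, which is hypothetical and simply writes one report line to STDERR; it assumes nothing about BBMerge itself.

```bash
#!/usr/bin/env bash
# Stand-in for the real tool (hypothetical): writes its report to stderr.
fake_tool() { echo "Pairs: 10" >&2; }

workdir=$(mktemp -d)

# Pattern from the earlier attempts. Redirections apply left to right:
#   > empty.log       stdout -> empty.log (file is created, still empty)
#   2>&1              stderr -> empty.log
#   >> commands.txt   stdout -> commands.txt (overrides the first one)
# echo writes only to stdout, so the text lands in commands.txt.
echo fake_tool in1=a.fq > "$workdir/empty.log" 2>&1 >> "$workdir/commands.txt"

# Working pattern: run the tool itself and send both streams to the log.
fake_tool in1=a.fq > "$workdir/full.log" 2>&1
```

After running this, empty.log has zero bytes, commands.txt holds the echoed command text, and full.log holds the report line, which mirrors what the command above changes relative to the earlier attempts.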



                          • #73
                            Dear Genomax,

                            Thanks for your help!



                            • #74
                              Hello,

                              How does BBMerge behave when the reads contain repetitive regions at the right end?

                              My amplicons are variable in length and derive from STRs, meaning that they look like this:
                              (non-repetitive flanking region) - (tandem repeats) - (non-repetitive flanking region).
                              If the amplicon is long enough, I could imagine a case where the paired reads overlap only in the repetitive region, so that multiple ways of merging are theoretically correct.
                              So far, merging has worked fine. Can BBMerge ensure that merged reads are always consistent with the "real" amplicon sequence?

                              Sebastian



                              • #75
                                If reads overlap only in a repetitive region, they will not be merged. BBMerge looks at all the possible overlaps, and keeps track of the two top-scoring ones (based on length and match/mismatch ratio). If those two are close, the pair will be classified as "ambiguous" and not merged. However, that does not mean it will always be correct; say you have a tandem repeat of two copies, like this:

                                ARRB

                                ...where A and B are unique, and R is repeat. If the reads look like this:

                                1: ARR
                                2: RRB

                                ...then there are 2 good overlap frames, forming ARRB or ARRRB, so the merge will be rejected as ambiguous. But if the reads only span a single repeat each, like this:

                                1: AR
                                2: RB

                                ...then there will only be one apparently good overlap frame, and the reads will probably be merged incorrectly to form ARB. BBMerge's false-positive merge rate is extremely low, but it's not perfect. With shotgun data you can add the flag "rsem" to greatly reduce the chances of false-positive merges due to short tandem repeats, but that does not really work with amplicon data. You may want to simulate some data using your expected sequence to see what the actual behavior is.
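The two cases above can be reproduced with a toy model. This sketch is not BBMerge's algorithm: it only enumerates exact-match overlap frames, ignores quality scores and mismatch scoring, and assumes read 2 has already been reverse-complemented into read 1's orientation. The function name overlap_merges is hypothetical, and the single letters A, R and B stand in for whole sequence segments.

```bash
#!/usr/bin/env bash
# Toy model: list every merged sequence produced by an exact suffix/prefix
# overlap between read 1 and read 2 (longest overlap first).
overlap_merges() {
  local r1=$1 r2=$2
  local max=${#r1}
  if (( ${#r2} < max )); then max=${#r2}; fi
  local k
  for (( k = max; k >= 1; k-- )); do
    # Compare the last k characters of r1 with the first k characters of r2.
    if [[ "${r1:$(( ${#r1} - k ))}" == "${r2:0:k}" ]]; then
      printf '%s%s\n' "$r1" "${r2:k}"   # r1 plus the non-overlapping tail of r2
    fi
  done
}
```

With this model, overlap_merges ARR RRB prints two candidates (ARRB and ARRRB), which corresponds to the ambiguous case, while overlap_merges AR RB prints only ARB, the single apparent frame that would be merged even though the true sequence is ARRB.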
