
  • #16
    Thanks Felix, I will give it another try. Is it possible to provide multiple adapter sequences in a single command line, as with cutadapt?
    cutadapt -a PRIMER1 -b ADAPTOR1 -b ADAPTOR2
    It seems there is no description about this in the Trim_galore guide. Thanks a lot!



    • #17
      No, if you wanted to use many adapter sequences you would have to use Cutadapt itself. For standard (Illumina) sequencing libraries this is probably not needed though. Do you have a reason to try many adapters as the first attempt instead of just going with the default (which is Illumina adapters)?



      • #18
        Yes, for Illumina mate-pair reads there are multiple scenarios that need to be handled at the same time, at least the junction sequence and its reverse complement. Maybe Trim Galore can do it in multiple steps, is that right?
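For reference, the reverse complement of an adapter or junction sequence can be derived with a few lines of Python. This is only a sketch: `reverse_complement` is a hypothetical helper name, and because no mate-pair junction sequence is given in this thread, the Illumina adapter prefix that appears in the reports further down is used as a stand-in input.

```python
# Sketch: reverse-complement a DNA adapter/junction sequence.
# Translation table for the bases A, C, G, T and N (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    # Complement every base, then reverse the whole string.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Stand-in input (assumption): the adapter prefix quoted in this thread.
    adapter = "AGATCGGAAGAGC"
    print(adapter, reverse_complement(adapter))
```

Both orientations could then be passed to Cutadapt directly, which accepts several adapter options in one call.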



        • #19
          Do you mean mate-pair or paired-end libraries? At least paired-end libraries typically share the same starting sequence of the adapters on both ends of each fragment, so you wouldn't have to supply different sequences or reverse complements but simply run Trim Galore in default mode. If you really wanted to run several consecutive steps it can't be guaranteed that you only trim off one adapter per sequence.



          • #20
            Hi all,
            we noticed something strange in the Trim Galore! (0.2.8) report file for a WGBS 2x101 paired-end run:
            RUN STATISTICS FOR INPUT FILE: /work/ng6/jflow/methylSeq/wf000619/ConcatenateFilesGroups_default/A3_CACGAT_L002_R1.fastq.gz
            =============================================
            26711665 sequences processed in total
            Sequences removed because they became shorter than the length cutoff of 20 bp: 0 (0.0%)
            RUN STATISTICS FOR INPUT FILE: /work/ng6/jflow/methylSeq/wf000619/ConcatenateFilesGroups_default/A3_CACGAT_L002_R2.fastq.gz
            =============================================
            26711665 sequences processed in total
            Sequences removed because they became shorter than the length cutoff of 20 bp: 0 (0.0%)

            Total number of sequences analysed for the sequence pair length validation: 26711665

            Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 2697629 (10.10%)
            It seems that 0 reads were removed from R1 due to length and 0 reads were removed from R2 due to length, yet 2697629 pairs were removed because at least one read was shorter than the length cutoff.

            Is there something I am misunderstanding?

            We have just installed the latest version but haven't tried it yet (you don't mention this in your release notes, so I don't think the behaviour will be different).

            Thank you for your answer

            One more thing, which may be a Cutadapt output problem. In the Cutadapt section of the report file:
            cutadapt version 1.2.1
            Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /work/ng6/jflow/methylSeq/wf000619/ConcatenateFilesGroups_default/A3_CACGAT_L002_R1.fastq.gz
            Maximum error rate: 10.00%
            No. of adapters: 1
            Processed reads: 26711665
            Processed bases: 2671166500 bp (2671.2 Mbp)
            Trimmed reads: 10940767 (41.0%)
            Quality-trimmed: 168748516 bp (177.0 Mbp) (6.32% of total)
            Trimmed bases: 177017106 bp (177.0 Mbp) (6.63% of total)
            The Quality-trimmed and Trimmed bases lines report different numbers of bp, but the same value in Mbp. It looks as if the Mbp figure printed for Quality-trimmed is taken from the Trimmed bases result.
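The mismatch can be checked by recomputing the derived figures from the bp counts in the report. The sketch below uses only the numbers quoted above; `to_mbp` and `pct_of_total` are hypothetical helper names, not Cutadapt functions.

```python
# Sketch: recompute the Mbp and percentage figures from the raw bp counts.

def to_mbp(bases: int) -> str:
    """Format a base count as megabases with one decimal place."""
    return f"{bases / 1_000_000:.1f}"


def pct_of_total(bases: int, total: int) -> str:
    """Format a base count as a percentage of the total, two decimals."""
    return f"{100 * bases / total:.2f}"


if __name__ == "__main__":
    processed = 2671166500       # "Processed bases" from the report
    quality_trimmed = 168748516  # "Quality-trimmed" from the report
    adapter_trimmed = 177017106  # "Trimmed bases" from the report

    for label, n in [("Quality-trimmed", quality_trimmed),
                     ("Trimmed bases", adapter_trimmed)]:
        print(f"{label}: {n} bp ({to_mbp(n)} Mbp) "
              f"({pct_of_total(n, processed)}% of total)")
```

The percentages agree with the report (6.32% and 6.63%), while the Quality-trimmed line works out to 168.7 Mbp rather than the 177.0 Mbp that was printed.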


            Gerald



            • #21
              Originally posted by gerald2545
              It seems that 0 reads were removed from R1 due to length and 0 reads were removed from R2 due to length, yet 2697629 pairs were removed because at least one read was shorter than the length cutoff.
              Is there something I am misunderstanding?
              We have just installed the latest version but haven't tried it yet (you don't mention this in your release notes, so I don't think the behaviour will be different).

              We have just tested with version 0.3.1 and see the same behaviour.

              Best regards

              Gérald



              • #22
                Hi Gerald,

                Trim Galore does not remove read 1 or read 2 individually if either became too short during trimming; it removes the whole pair only after both reads have been trimmed (a validation step). This is done to ensure that the two files do not get out of sync because of trimming.
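A minimal Python sketch of such a validation step, assuming the trimmed reads are held in memory as plain strings. `validate_pairs` is a hypothetical helper for illustration only and is not Trim Galore's actual (Perl) implementation.

```python
# Sketch: paired-end length validation after both reads have been trimmed.
from typing import List, Tuple

Pair = Tuple[str, str]


def validate_pairs(pairs: List[Pair], cutoff: int = 20) -> Tuple[List[Pair], int]:
    """Keep a pair only if BOTH reads are at least `cutoff` bases long.

    Returns the surviving pairs and the number of pairs removed.
    """
    kept: List[Pair] = []
    removed = 0
    for read1, read2 in pairs:
        if len(read1) < cutoff or len(read2) < cutoff:
            # One short read discards the whole pair, so R1 and R2 stay in sync.
            removed += 1
        else:
            kept.append((read1, read2))
    return kept, removed


if __name__ == "__main__":
    # Toy data (assumption): read lengths after adapter/quality trimming.
    trimmed = [("A" * 30, "C" * 30), ("A" * 5, "C" * 30), ("A" * 30, "")]
    kept, removed = validate_pairs(trimmed)
    print(f"pairs kept: {len(kept)}, pairs removed: {removed}")
```

Because nothing is discarded during the per-file trimming itself, the per-file "Sequences removed" counters stay at 0 and only the pair-level counter reflects the removals.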

                Regarding the Cutadapt output, I would guess that the two values are quite similar in your case, but as you said, this is the output straight from Cutadapt. Apologies for my slow response, but I am currently on holiday.

                Best, Felix



                • #23
                  Hi Felix,
                  no worries about the late reply, I knew that you were on holiday.

                  Thank you for the information. Maybe for paired-end runs the following sentence could be omitted from the report:
                  "Sequences removed because they became shorter than the length cutoff of 20 bp: 0 (0.0%)"
                  so that only the information for the pairs is written:
                  "Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 2697629 (10.10%)"

                  Enjoy your holiday

                  Gérald

                  PS: sorry for my late reply too, I didn't receive the notification email because my address was wrong
                  Last edited by gerald2545; 08-23-2013, 04:03 AM.



                  • #24
                    problems with --clip_R1

                    Hi all,

                    I am having issues when using the --clip_R1 option.

                    Code:
                    trim_galore --clip_R1 3 test2.fastq.gz
                    Gives me a lot of
                    Code:
                    substr outside of string at ../../programs/trim_galore/trim_galore line 503, <TRIM> line 43696.
                    substr outside of string at ../../programs/trim_galore/trim_galore line 504, <TRIM> line 43696.
                    Use of uninitialized value in numeric lt (<) at ../../programs/trim_galore/trim_galore line 507, <TRIM> line 43696.
                    Without the clipping it works fine.

                    I am on Linux (amd64) and I use version 0.3.1 (Last update: 18 07 2013)

                    Attached is a sample file that produces one of these errors for me.

                    Regards


                    • #25
                      Thanks for reporting this. These warnings occurred if the sequence had been adapter- or quality-trimmed below the clipping threshold. I have now added an additional check to prevent this from happening.

                      A new version of Trim Galore is now available from its project page (https://www.bioinformatics.babraham....s/trim_galore/), which also fixes one additional issue:

                      - Specifying --clip_R1 or --clip_R2 will no longer attempt to clip sequences that have been adapter- or quality-trimmed below the clipping threshold
                      - Specifying an output directory with --rrbs mode should now correctly create temporary files
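The kind of check described for --clip_R1/--clip_R2 can be sketched in Python as follows. This is an illustration only, not the actual Perl code in Trim Galore: `clip_5prime` is a hypothetical helper, and treating a read whose length equals the clip value as "too short to clip" is an assumption.

```python
# Sketch: clip N bases from the 5' end only if the read is long enough.
from typing import Tuple


def clip_5prime(seq: str, qual: str, clip: int) -> Tuple[str, str]:
    """Remove `clip` leading bases (and qualities) from a read.

    Reads that trimming has already shortened to `clip` bases or fewer are
    returned unchanged (assumed boundary), so no clipping is attempted on them.
    """
    if len(seq) <= clip:
        return seq, qual
    return seq[clip:], qual[clip:]


if __name__ == "__main__":
    print(clip_5prime("ACGTACGT", "IIIIIIII", 3))  # long enough, gets clipped
    print(clip_5prime("AC", "II", 3))              # too short, left as is
```

Reads left untouched by this check would still be subject to the usual length cutoff afterwards.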



                      • #26
                        A quick notice that I have just put out a new version of Trim Galore (v0.3.3) that fixes a bug I had introduced accidentally last week where single-end trimming would add an empty line into the trimmed sequence output.



                        • #27
                          If you like Trim Galore!, you might also like:



                          It uses FastQC to detect adaptors and primers, and then cuts them with cutadapt (well, in parallel using several cutadapts)



                          • #28
                            Hi all,

                            I am running into an error with trim_galore. Would anyone have any idea on what I may be doing wrong?
                            Command line: ./trim_galore --paired /Users/mvijayen/fastq/sample_1.fastq /Users/mvijayen/fastq/sample_2.fastq

                            No quality encoding type selected. Assuming that the data provided uses Sanger encoded Phred scores (default)

                            Writing report to 'sample_1.fastq_trimming_report.txt'

                            SUMMARISING RUN PARAMETERS
                            ==========================
                            Input filename: /Users/mvijayen/fastq/sample_1.fastq
                            Quality Phred score cutoff: 20
                            Quality encoding type selected: ASCII+33
                            Adapter sequence: 'AGATCGGAAGAGC'
                            Maximum trimming error rate: 0.1 (default)
                            Minimum required adapter overlap (stringency): 1 bp
                            Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp

                            Writing final adapter and quality trimmed output to sample_1_trimmed.fq


                            >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /Users/mvijayen/fastq/sample_1.fastq <<<
                            open3: exec of cutadapt -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /Users/mvijayen/fastq/sample_1.fastq failed at ./trim_galore line 471

                            RUN STATISTICS FOR INPUT FILE: /Users/mvijayen/fastq/sample_1.fastq
                            =============================================
                            0 sequences processed in total
                            Illegal division by zero at ./trim_galore line 565.


                            I am using trim_galore version 0.3.3 and cutadapt seems to be working just fine when I check with ./cutadapt -h. Thanks!



                            • #29
                              If Cutadapt is not in the PATH, you need to give Trim Galore the absolute path to where it can be found. Since you say that ./cutadapt works fine, you seem to have installed it in the current directory. If you type 'pwd' and copy that path into the line near the top of the Trim Galore script that says $path_to_cutadapt = '', it should all work fine (just edit it with any editor).
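Before editing the script, it can help to check whether a given setting actually resolves to an executable. The Python sketch below is a rough diagnostic under stated assumptions: `find_executable` is a hypothetical helper, and Trim Galore's own lookup (in Perl) may behave differently.

```python
# Sketch: check whether a configured value resolves to a runnable program.
import os
import shutil
from typing import Optional


def find_executable(configured: str, name: str = "cutadapt") -> Optional[str]:
    """Resolve `configured` to an executable path, or return None.

    `configured` may be a bare command name (searched on PATH), a path to the
    executable itself, or a directory that contains a program called `name`.
    """
    # shutil.which handles both bare names (PATH lookup) and explicit paths;
    # it returns None for directories and for non-executable files.
    found = shutil.which(configured)
    if found:
        return found
    # A directory is not runnable itself, so look for the program inside it.
    if os.path.isdir(configured):
        candidate = os.path.join(configured, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    for setting in ("cutadapt", "./cutadapt", os.getcwd()):
        print(f"{setting!r} -> {find_executable(setting)}")
```

A result of None for the value used in the script would be consistent with the "exec of cutadapt ... failed" message.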

                              Cheers,
                              Felix



                              • #30
                                Hi Felix,

                                I failed to mention that I actually did try that as well:

                                # change these paths if needed
                                my $path_to_cutadapt = '/Users/mvijayen/cutadapt-1.3/bin';

                                I am still getting the same error.
