
  • #46
    trim_galore without adaptor trimming?

    Hi All,

    Here is my first question ever to this forum! :-)

    I came across trim_galore while looking for a quality trimmer that trims both paired-end reads together. My FASTQ files are Illumina 1.9. I ran the following command:

    trim_galore -q 20 --fastqc --gzip --paired filename1 filename3
    I get the following error message:

    No quality encoding type selected. Assuming that the data provided uses Sanger encoded Phred scores (default)

    Writing report to 'filename1_trimming_report.txt'

    SUMMARISING RUN PARAMETERS
    ==========================
    Input filename: filename1
    Trimming mode: paired-end
    Trim Galore version: 0.3.7
    Quality Phred score cutoff: 20
    Quality encoding type selected: ASCII+33
    Adapter sequence: 'AGATCGGAAGAGC'
    Maximum trimming error rate: 0.1 (default)
    Minimum required adapter overlap (stringency): 1 bp
    Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
    Running FastQC on the data once trimming has completed
    Output file(s) will be GZIP compressed

    Writing final adapter and quality trimmed output to filename1_trimmed.fq.gz


    >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file filename1 <<<
    Traceback (most recent call last):
    File "/Users/yasmin/cutadapt-1.4.2/bin//cutadapt", line 9, in <module>
    from cutadapt.scripts import cutadapt
    File "/Users/yasmin/cutadapt-1.4.2/cutadapt/scripts/cutadapt.py", line 69, in <module>
    from cutadapt.adapters import Adapter, ColorspaceAdapter, BACK, FRONT, PREFIX, ANYWHERE
    File "/Users/yasmin/cutadapt-1.4.2/cutadapt/adapters.py", line 4, in <module>
    from cutadapt import align, colorspace
    File "/Users/yasmin/cutadapt-1.4.2/cutadapt/align.py", line 225, in <module>
    from cutadapt._align import globalalign_locate, compare_prefixes
    ImportError: dlopen(/Users/yasmin/cutadapt-1.4.2/cutadapt/_align.so, 2): no suitable image found. Did find:
    /Users/yasmin/cutadapt-1.4.2/cutadapt/_align.so: unknown file type, first eight bytes: 0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x00


    Cutadapt terminated with exit signal: '256'.
    Terminating Trim Galore run, please check error message(s) to get an idea what went wrong...
    If anybody has come across this and solved it, please let me know!

    Many thanks!
    Yasmin
    Last edited by yasmin_friedmann; 09-12-2014, 01:16 AM. Reason: added error message
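
The first four bytes reported in the ImportError above (0x7F 0x45 0x4C 0x46) spell out the ELF magic number, whereas dlopen on macOS expects a Mach-O library. That would be consistent with an _align.so that was compiled on Linux and then copied to the Mac, in which case rebuilding Cutadapt on the Mac itself may help. This is an inference from the log, not something confirmed in the thread. A minimal Python sketch of the magic-number check (the name identify_binary is hypothetical):

```python
# Well-known magic numbers at the start of compiled binaries.
MAGIC_NUMBERS = {
    b"\x7fELF": "ELF (Linux and most other Unix systems)",
    b"\xfe\xed\xfa\xce": "Mach-O 32-bit (macOS)",
    b"\xfe\xed\xfa\xcf": "Mach-O 64-bit (macOS)",
    b"\xce\xfa\xed\xfe": "Mach-O 32-bit, byte-swapped (macOS)",
    b"\xcf\xfa\xed\xfe": "Mach-O 64-bit, byte-swapped (macOS)",
}


def identify_binary(header: bytes) -> str:
    """Classify a file by the first bytes of its contents."""
    for magic, name in MAGIC_NUMBERS.items():
        if header.startswith(magic):
            return name
    return "unknown"


# The first eight bytes quoted in the error message.
reported = bytes([0x7F, 0x45, 0x4C, 0x46, 0x02, 0x01, 0x01, 0x00])
binary_format = identify_binary(reported)
print(binary_format)
```

To check a real file, the same function could be given the result of open(path, "rb").read(8).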



    • #47
      Hi Felix - I'm writing the methods sections for a few WGBS papers where I've used trim_galore, is there a paper I can cite for it?



      • #48
        If you wanted to you could cite its URL, there is no publication as such (apart from the Cutadapt reference). Cheers, Felix



        • #49
          Hi all,
          I have a question about the non-directional option of Trim Galore.
          After a lot of reflection, we have determined that our RRBS library was prepared in a directional, paired-end manner (R1 begins with C/TGG and R2 with CAA). However, the nd option makes it possible to trim the CA from R2.
          Is it a better strategy to leave this CA in for Bismark and trim it afterwards?
          We have run both: With nd: 36.6% uniquely aligned pairs + 55.6% multiple pairs
          Without nd: 37.8% uniquely aligned pairs + 55.2% multiple pairs
          Thank you in advance for your response
          Maxime



          • #50
            Originally posted by gwilkie
            I have also found that when using Nextera sample prep, you should trim at CTGTCTCTTATACACATCT instead of the usual AGATCGGAAGAGC.

            Best wishes, Gavin
            Is this still the case in 2015? I mean, is "CTGTCTCTTATACACATCT" universal to Nextera prepped samples?
            savetherhino.org



            • #51
              Trim Galore v0.4.0 released: Adapter auto-detection

              We have just released Trim Galore version 0.4.0. This adds a few sanity checks and makes the specification of standard adapters more straightforward. In fact, we changed the default mode so that Trim Galore attempts to auto-detect which type of adapter was used in library construction, which results in a 'one command to trim them all' for standard ClusterFlow processing of a highly diverse full Illumina flowcell.

              Here are the changes in more detail:

              • Unless instructed otherwise, Trim Galore will now attempt to auto-detect the adapter that was used for library construction (choosing from the Illumina universal, Nextera transposase and Illumina small RNA adapters). For this, the first 1 million sequences of the first file specified are analysed. If no adapter can be detected within the first 1 million sequences, Trim Galore defaults to --illumina. The auto-detection behaviour can be overridden by specifying an adapter sequence or using --illumina, --nextera or --small_rna

              • Added the new options '--illumina', '--nextera' and '--small_rna' to use different default sequences for trimming (instead of -a):
              Universal Illumina: AGATCGGAAGAGC (TruSeq or Sanger iTag)
              Small RNA: ATGGAATTCTCG
              Nextera: CTGTCTCTTATA

              • Added a sanity check to the start of a Trim Galore run to see if the (first) FastQ file in question does contain information at all or appears to be in SOLiD colorspace format, and bails if either is true. Trim Galore does not support colorspace trimming, but users wishing to do this are kindly referred to using Cutadapt as a standalone program

              • Added a new option '--path_to_cutadapt /path/to/cutadapt'. Unless this option is specified it is assumed that Cutadapt is in the PATH (equivalent to '--path_to_cutadapt cutadapt'). Also added a test to see if Cutadapt seems to be working before the actual trimming is launched

              • Fixed an open command for a certain type of RRBS processing (was open() instead of open3())

              Trim Galore is available from the Babraham Bioinformatics projects site.
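
The auto-detection described in the release notes above can be pictured roughly as follows. This is a minimal Python sketch, not Trim Galore's actual code (Trim Galore itself is a Perl script): it counts exact occurrences of each adapter start in the first reads and falls back to the Illumina adapter when nothing is found. The function names, the exact-match counting and the tie-breaking rule are assumptions made for illustration.

```python
import gzip
from itertools import islice

# Adapter start sequences as listed in the release notes.
ADAPTERS = {
    "illumina": "AGATCGGAAGAGC",
    "small_rna": "ATGGAATTCTCG",
    "nextera": "CTGTCTCTTATA",
}


def read_sequences(path):
    """Yield the sequence line of every record in a (optionally gzipped) FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # the second line of each 4-line record
                yield line.strip()


def detect_adapter(sequences, max_reads=1_000_000):
    """Return (adapter_name, counts) based on the first max_reads sequences."""
    counts = dict.fromkeys(ADAPTERS, 0)
    for seq in islice(sequences, max_reads):
        for name, adapter in ADAPTERS.items():
            if adapter in seq:
                counts[name] += 1
    best = max(counts, key=counts.get)  # ties resolve to the first entry (assumption)
    if counts[best] == 0:
        best = "illumina"  # documented fallback when no adapter is detected
    return best, counts


example_reads = [
    "ACGTACGTCTGTCTCTTATACACATCT",
    "GGGGCTGTCTCTTATACC",
    "TTTTAGATCGGAAGAGCAAAA",
    "ACGTACGT",
]
detected, adapter_counts = detect_adapter(example_reads)
print(detected, adapter_counts)
```

For a real file, detect_adapter(read_sequences("reads_1.fastq.gz")) would apply the same logic, where the filename is a placeholder.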



              • #52
                Originally posted by fkrueger
                Trim Galore is available from the Babraham Bioinformatics projects site.
                There's a small problem with the zip file.

                Code:
                unzip trim_galore_v0.4.0.zip 
                Archive:  trim_galore_v0.4.0.zip
                  inflating: Trim_Galore_User_Guide.pdf  
                  inflating: trim_galore             
                  inflating: RRBS_Guide.pdf          
                warning:  skipped "../" path component(s) in ../Bismark/license.txt
                  inflating: Bismark/license.txt
                savetherhino.org



                • #53
                  Originally posted by rhinoceros
                  There's a small problem with the zip file.

                  Code:
                  unzip trim_galore_v0.4.0.zip 
                  Archive:  trim_galore_v0.4.0.zip
                    inflating: Trim_Galore_User_Guide.pdf  
                    inflating: trim_galore             
                    inflating: RRBS_Guide.pdf          
                  warning:  skipped "../" path component(s) in ../Bismark/license.txt
                    inflating: Bismark/license.txt
                  Oops... but it is only the license file. I have replaced the zip file now. Cheers, Felix



                  • #54
                    That's great.

                    Do you have any hints on how to trim ScriptSeq prepped samples? My PE reads clearly had TruSeq adapters, but after trim_galore, FastQC tells me that my R1 reads still contain a considerable amount of "TruSeq Adapter, Index 12 (100% over 58bp)" and some other "no hit" stuff whereas my R2 reads apparently contain lots of "Illumina Single End PCR Primer 1 (100% over 52bp)" and "no hit" stuff. Both files have massive k-mer bias in 5'-ends even after trim_galore. The first 13 bp of TruSeq adapters and ScriptSeq adapters are identical so I'm somewhat baffled how these adapters are present in some R1 even after trimming. I presume the R2 stuff is related to 3'-terminal tagging and very short RNA molecules so as a solution I could include the complete Illumina Paired End PCR Primer 1 seq utilizing the -a2 flag.
                    Last edited by rhinoceros; 05-07-2015, 04:58 AM.
                    savetherhino.org



                    • #55
                      It might help if you could send me the FastQC html report to take a look (email).

                      In more general terms, it is very well possible that you've got fragments of TruSeq adapters, or especially PCR primers, left in the library after trimming that FastQC warns you about. Quite often these are adapter or primer dimers that don't have the A (from A-tailing) at the start of the sequence. These sequences are not removed from the file, and they generally don't have to be if you are going to align the samples as the next step because they simply won't align.

                      The adapter contamination you do care about is read-through contamination at the 3' end, where a read starts in a genomic sequence of interest and then continues into the adapter. It would appear that trimming got rid of this efficiently.
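
To make the read-through case concrete, here is a minimal Python sketch of 3' adapter removal. It uses exact matching only, whereas Cutadapt tolerates mismatches (10% by default, per the run parameters shown earlier in the thread), so it illustrates the idea rather than the real algorithm. The function name trim_3prime_adapter is hypothetical.

```python
def trim_3prime_adapter(read, adapter="AGATCGGAAGAGC", min_overlap=1):
    """Remove a 3' adapter (exact matches only) and everything after it."""
    # Case 1: the complete adapter occurs somewhere in the read.
    position = read.find(adapter)
    if position != -1:
        return read[:position]
    # Case 2: the read ends part-way into the adapter, so only an
    # adapter prefix is visible at the very 3' end of the read.
    max_overlap = min(len(adapter) - 1, len(read))
    for overlap in range(max_overlap, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:len(read) - overlap]
    return read


full_adapter = trim_3prime_adapter("ACGTACGTAGATCGGAAGAGCTT")
partial_adapter = trim_3prime_adapter("ACGTACGTTAGATC")
single_base = trim_3prime_adapter("CCCCA")
print(full_adapter, partial_adapter, single_base)
```

With a minimum overlap of 1 bp (the stringency shown in the run summaries), even a single trailing A is treated as the start of the adapter, which is why the last example loses one base.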



                      • #56
                        Hi all,

                        I am running Trim Galore on Illumina paired-end data and am trying to figure out what is going wrong. I have set the quality cutoff to a Phred score of 30, but when trimming is complete and I view the FastQC report, the box-plot whiskers under the Per base sequence quality tab go down to a Phred score of 13. Is there something I am doing wrong?

                        Thanks.

                        code:
                        trim_galore -fastqc -q 30 -paired -retain_unpaired Blue_trimmed_1.fq Blue_trimmed_2.fq
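
One general property of 3' quality trimming may be relevant here, whatever the specific cause in this case turns out to be (the thread does not settle it). Trim Galore hands the -q cutoff to Cutadapt, and Cutadapt's documented quality trimming only removes a low-quality tail from the 3' end: it subtracts the cutoff from every quality value, sums the results from the 3' end, and cuts where the running sum is lowest. Bases below the cutoff that sit upstream of good-quality sequence can therefore remain. A minimal Python sketch of that idea (the function name and the second example are illustrative, not taken from the poster's data):

```python
def quality_trim_3prime(qualities, cutoff):
    """Return how many bases to keep after 3' quality trimming."""
    running_sum = 0
    lowest_sum = 0
    keep = len(qualities)
    # Walk from the 3' end towards the 5' end.
    for index in range(len(qualities) - 1, -1, -1):
        running_sum += qualities[index] - cutoff
        if running_sum > 0:
            break  # the tail seen so far is good on balance; stop looking
        if running_sum < lowest_sum:
            lowest_sum = running_sum
            keep = index  # best cut position found so far
    return keep


# Worked example from the Cutadapt documentation (cutoff 10).
doc_example = quality_trim_3prime([42, 40, 26, 27, 8, 7, 11, 4, 2, 3], 10)

# A read with one poor base in the middle and a poor tail (cutoff 30).
read_qualities = [35, 36, 13, 38, 37, 36, 12, 8]
kept = read_qualities[:quality_trim_3prime(read_qualities, 30)]
print(doc_example, kept)
```

In the second example the poor tail (12, 8) is removed, but the internal base with quality 13 stays in the read, so a per-base quality plot can still show values below the chosen cutoff.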



                        • #57
                          Is the data Illumina 1.9 encoded (phred33) or the old 1.5 encoding by any chance? Would you mind attaching or sending me the FastQC report via email? Cheers, Felix
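
The encoding matters because the same quality character decodes to very different scores depending on the ASCII offset: 33 for Sanger / Illumina 1.9 data, 64 for the older Illumina 1.5 data. A minimal Python sketch (decode_phred is a hypothetical helper, not part of Trim Galore):

```python
def decode_phred(quality_string, offset=33):
    """Convert a FASTQ quality string into a list of Phred scores."""
    return [ord(character) - offset for character in quality_string]


as_phred33 = decode_phred("I5", offset=33)
as_phred64 = decode_phred("I5", offset=64)
print(as_phred33, as_phred64)
```

If Phred+64 data is decoded with the default offset of 33, every score appears 31 points too high and a cutoff of 20 or 30 removes far too little. Trim Galore's --phred33 and --phred64 options select the offset explicitly.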



                          • #58
                            The data is Illumina 1.9. Yes I can email you the report.

                            Thank you,
                            Kevin



                            • #59
                              TrimGalore paired end issue

                              I’m trying to run Trim Galore! v0.4.0 and I have Cutadapt 1.8.1 installed using Python 2.7.6. I think that Trim Galore is not passing the paired option through to Cutadapt. I end up with an unequal number of reads in the read 1 vs. read 2 file and Bismark will not align. This is the summary of trimming (I have marked the part I think is wrong in the Cutadapt output). Any ideas? Thanks so much! -Lindsay

                              SUMMARISING RUN PARAMETERS
                              ==========================
                              Input filename: path/read1_R1_010.fastq.gz
                              Trimming mode: paired-end
                              Trim Galore version: 0.4.0
                              Cutadapt version: 1.8.1
                              Quality Phred score cutoff: 20
                              Quality encoding type selected: ASCII+33
                              Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
                              Maximum trimming error rate: 0.1 (default)
                              Minimum required adapter overlap (stringency): 1 bp
                              Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
                              Output file will be GZIP compressed


                              This is cutadapt 1.8.1 with Python 2.7.6
                              Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC [no -p argument is specified here...] /path_R1_010.fastq.gz
                              Trimming 1 adapter with at most 10.0% errors in single-end mode ...
                              Finished in 161.90 s (40 us/read; 1.48 M reads/minute).



                              • #60
                                Hi Lindsay,

                                This is a little odd... The way Trim Galore handles paired-end files (when you specify --paired) is to run single-end trimming on read 1 and read 2 separately, and then run a 'validation' step that checks the length of each read in a sequence pair to decide whether to keep or discard the entire read pair. Since reads are not discarded in the (single-end) trimming step even if they are trimmed to a length of 0 bp, they should then be either kept or discarded as an entire pair. Is there a chance that the FastQ files you fed in did not match up or were truncated?

                                So in a nutshell, the --paired option is not supposed to be fed through to Cutadapt (which only started supporting paired-end trimming recently), but is handled internally. If you keep having these problems could you please send me a few reads of your FastQ files and I can try to reproduce these errors on my side. Thanks, Felix
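
The validation step described above can be sketched as follows. This is a minimal Python illustration, not Trim Galore's own code: both files are trimmed independently without dropping any reads, and the pair-wise check then keeps or discards both mates together. The function name, the in-memory lists and the ">= min_length" comparison are assumptions; the 20 bp default comes from the run summary shown earlier.

```python
def validate_pairs(trimmed_reads_1, trimmed_reads_2, min_length=20):
    """Keep a read pair only if both mates are at least min_length long."""
    if len(trimmed_reads_1) != len(trimmed_reads_2):
        # Mismatched or truncated input files would show up here.
        raise ValueError("read 1 and read 2 contain different numbers of reads")
    kept_1, kept_2 = [], []
    for read_1, read_2 in zip(trimmed_reads_1, trimmed_reads_2):
        if len(read_1) >= min_length and len(read_2) >= min_length:
            kept_1.append(read_1)
            kept_2.append(read_2)
    return kept_1, kept_2


# Three pairs after single-end trimming; reads trimmed to 0 bp are still present.
reads_1 = ["A" * 30, "", "C" * 25]
reads_2 = ["G" * 30, "T" * 40, "A" * 5]
kept_1, kept_2 = validate_pairs(reads_1, reads_2)
print(len(kept_1), len(kept_2))
```

Because pairs are always kept or dropped together, the two output files should contain the same number of reads whenever the two inputs do, which is why unequal output counts point towards mismatched or truncated input files.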
