  • padmoo
    replied
    Hi everyone,

    not sure if this is the right thread but I'll give it a go:

    I'm trying to get rid of adapters in my sequences. There aren't many with adapters but I'd like to save what I can.

    When I run my command I get the following error:

    ==================================================================
    Your job looked like:

    ------------------------------------------------------------
    # LSBATCH: User input
    /gpfs/scratch/cbh12wsu/trim/trim_galore -phred33 --illumina --paired -phred33 --path_to_cutadapt /gpfs/scratch/cbh12wsu/trim/cutadapt.py -o adaptrim.fastq -length 50 1_L.fastq 2_R.fastq
    ------------------------------------------------------------

    Exited with exit code 2.

    The output (if any) follows:

    Path to Cutadapt set as: '/gpfs/scratch/cbh12wsu/trim/cutadapt.py' (user defined)
    File "/gpfs/scratch/cbh12wsu/trim/cutadapt.py", line 99
    print(rest, match.read.name, file=self.file)
    ^
    SyntaxError: invalid syntax
    Cutadapt seems to be working fine (tested command '/gpfs/scratch/cbh12wsu/trim/cutadapt.py --version')
    File "/gpfs/scratch/cbh12wsu/trim/cutadapt.py", line 99
    print(rest, match.read.name, file=self.file)
    ^
    SyntaxError: invalid syntax
    Failed to write to file '1_L.fastq_trimming_report.txt': No such file or directory
    ==================================================================

    Does this indicate a problem with cutadapt.py?

    Thanks!



  • create.share
    replied
    For the faint of heart (those who cannot use complex scripts in Linux and need a graphical interface), here is another (free) program for quality trimming:

    An efficient SFF/FastQ viewer and editor (GUI)






    The gray/green curves in the second graphic show the average quality before and after trimming the low-quality ends.


    Only works on Fasta, FastQ, SFF for the moment.
    Sorry.
    Last edited by create.share; 05-30-2015, 12:19 AM.



  • LindsayR
    replied
    Thanks for helping Felix. A simple typo on my end unfortunately. Problem solved!



  • fkrueger
    replied
    Hi Lindsay,

    This is a little odd... The way Trim Galore handles paired-end files (when you specify --paired) is to run single-end trimming on read 1 and read 2 separately, and then run a 'validation' step that checks the length of each read in a sequence pair to decide whether or not to keep or boot the entire read pair. Since reads are not discarded in the (single-end) trimming step even if they are trimmed to a length of 0bp they should then either be kept or discarded as the entire pair. Is there a chance that the FastQ files you fed in did not match up or were truncated?

    So in a nutshell, the --paired option is not supposed to be fed through to Cutadapt (which only started supporting paired-end trimming recently), but is handled internally. If you keep having these problems could you please send me a few reads of your FastQ files and I can try to reproduce these errors on my side. Thanks, Felix
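The validation step described above can be sketched roughly as follows. This is a minimal illustration, not Trim Galore's actual code: the function name `validate_pairs` and the in-memory `(name, sequence)` tuples are assumptions made for the example, and the 20 bp default simply mirrors the minimum length shown in the trimming report.

```python
def validate_pairs(reads1, reads2, min_length=20):
    """Keep a read pair only if BOTH trimmed reads reach min_length.

    reads1 / reads2 are lists of (name, sequence) tuples in the same order,
    as produced by trimming read 1 and read 2 separately (hypothetical format).
    """
    if len(reads1) != len(reads2):
        # Mismatched or truncated input files cannot be paired up.
        raise ValueError("read 1 and read 2 contain different numbers of reads")

    kept1, kept2 = [], []
    for (name1, seq1), (name2, seq2) in zip(reads1, reads2):
        # A read trimmed down to 0 bp is still present here, so the pair
        # is kept or discarded as a whole.
        if len(seq1) >= min_length and len(seq2) >= min_length:
            kept1.append((name1, seq1))
            kept2.append((name2, seq2))
    return kept1, kept2
```

Because reads are only ever dropped as pairs, both output lists always have the same length, which is why unequal read counts point towards mismatched or truncated input.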



  • LindsayR
    replied
    TrimGalore paired end issue

    I’m trying to run Trim Galore! v0.4.0, and I have Cutadapt 1.8.1 installed using Python 2.7.6. I think that Trim Galore is not passing the paired option on to Cutadapt. I end up with an unequal number of reads in the read 1 vs. read 2 files, and Bismark will not align them. This is the summary of trimming (I bolded the part I think is wrong in the Cutadapt output). Any ideas? Thanks so much! -Lindsay

    SUMMARISING RUN PARAMETERS
    ==========================
    Input filename: path/read1_R1_010.fastq.gz
    Trimming mode: paired-end
    Trim Galore version: 0.4.0
    Cutadapt version: 1.8.1
    Quality Phred score cutoff: 20
    Quality encoding type selected: ASCII+33
    Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
    Maximum trimming error rate: 0.1 (default)
    Minimum required adapter overlap (stringency): 1 bp
    Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
    Output file will be GZIP compressed


    This is cutadapt 1.8.1 with Python 2.7.6
    Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC [no -p argument is specified here] /path_R1_010.fastq.gz
    Trimming 1 adapter with at most 10.0% errors in single-end mode ...
    Finished in 161.90 s (40 us/read; 1.48 M reads/minute).



  • KJohnson
    replied
    The data is Illumina 1.9. Yes I can email you the report.

    Thank you,
    Kevin



  • fkrueger
    replied
    Is the data Illumina 1.9 encoded (phred33) or the old 1.5 encoding by any chance? Would you mind attaching or sending me the FastQC report via email? Cheers, Felix



  • KJohnson
    replied
    Hi all,

    I am running Trim Galore on Illumina paired-end data and am trying to figure out what is going wrong. I have set the quality cutoff to a Phred score of 30, but when trimming is complete and I view the FastQC report, the box-plot whiskers under the Per base sequence quality tab go down to a Phred score of 13. Is there something I am doing wrong?

    Thanks.

    code:
    trim_galore -fastqc -q 30 -paired -retain_unpaired Blue_trimmed_1.fq Blue_trimmed_2.fq
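As background rather than a diagnosis: the quality cutoff is handed on to Cutadapt's -q option, which trims low-quality bases from the 3' end only. The sketch below follows the algorithm as described in the Cutadapt documentation (subtract the cutoff from every quality, sum from the 3' end, cut where the sum is lowest); the function name and the example qualities are made up for illustration.

```python
def quality_trim_3prime(quals, cutoff=30):
    """Return how many bases are kept after 3' quality trimming.

    quals is a list of Phred scores for one read (hypothetical input format).
    """
    running = 0          # running sum of (quality - cutoff) from the 3' end
    lowest = 0           # lowest running sum seen so far
    keep = len(quals)    # index at which the read gets cut
    for i in range(len(quals) - 1, -1, -1):
        running += quals[i] - cutoff
        if running > 0:
            break        # good bases now outweigh the bad tail, stop here
        if running < lowest:
            lowest = running
            keep = i
    return keep


# A low-quality base inside the read survives, because only the 3' end is cut.
quals = [36, 36, 13, 36, 36, 36, 20, 12]
kept = quals[:quality_trim_3prime(quals, cutoff=30)]
print(kept)
```

In this made-up read the last two bases are removed, but the base with quality 13 in the middle stays, so a per-base quality plot can still show values below the cutoff after trimming.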



  • fkrueger
    replied
    It might help if you could send me the FastQC html report to take a look (email).

    In more general terms, it is very well possible that you've got fragments of TruSeq adapters, or especially PCR primers, left in the library after trimming that FastQC warns you about. Quite often these are adapter or primer dimers that don't have the A (from A-tailing) at the start of the sequence. These sequences are not removed from the file, and they generally don't have to be if you are going to align the samples as the next step because they simply won't align.

    The adapter contamination you do care about is the read-through contamination at the 3' end, where a read starts in a genomic sequence of interest and then continues into the adapter. It would appear that trimming got rid of these efficiently.
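To make the distinction concrete, here is a simplified sketch of 3' read-through trimming. It uses exact matching only, whereas Cutadapt tolerates mismatches (10% by default in the reports in this thread), so treat it as an illustration of the idea and not of the real implementation. The function name and the example reads are invented.

```python
def trim_3prime_adapter(read, adapter="AGATCGGAAGAGC", min_overlap=1):
    """Remove a 3' adapter (exact matches only) from a read."""
    # Case 1: the full adapter occurs in the read; cut from there onwards.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends in the first k bases of the adapter.
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]
    return read


# Read-through: genomic sequence followed by adapter is cut back to the insert.
print(trim_3prime_adapter("ACGTTTAGATCGGAAGAGCACAC"))
# A dimer-like read lacking the leading A is left untouched.
print(trim_3prime_adapter("GATCGGAAGAGCACACGTC"))
```

The second call shows why fragments without the leading A can remain in the file after trimming: they never match the adapter sequence used for trimming.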



  • rhinoceros
    replied
    That's great.

    Do you have any hints how to trim ScriptSeq prepped samples? My PE reads clearly had Truseq adaptors, but after trim_galore fastqc tells me that my R1 reads still contain a considerable amount of "TruSeq Adapter, Index 12 (100% over 58bp)" and some other "no hit" stuff whereas my R2 reads apparently contain lots of "Illumina Single End PCR Primer 1 (100% over 52bp)" and "no hit" stuff. Both files have massive k-mer bias in 5'-ends even after trim_galore. The first 13 bp of TruSeq adapters and ScriptSeq adapters are identical so I'm somewhat baffled how these adapters are present in some R1 even after trimming. I presume the R2 stuff is related to 3'-terminal tagging and very short RNA molecules so as a solution I could include the complete Illumina Paired End PCR Primer 1 seq utilizing the -a2 flag.
    Last edited by rhinoceros; 05-07-2015, 04:58 AM.



  • fkrueger
    replied
    Originally posted by rhinoceros View Post
    There's a small problem with the zip file.

    Code:
    unzip trim_galore_v0.4.0.zip 
    Archive:  trim_galore_v0.4.0.zip
      inflating: Trim_Galore_User_Guide.pdf  
      inflating: trim_galore             
      inflating: RRBS_Guide.pdf          
    warning:  skipped "../" path component(s) in ../Bismark/license.txt
      inflating: Bismark/license.txt
    Oops... but it is only the license file. I have replaced the zip file now. Cheers, Felix



  • rhinoceros
    replied
    Originally posted by fkrueger View Post
    Trim Galore is available from the Babraham Bioinformatics projects site.
    There's a small problem with the zip file.

    Code:
    unzip trim_galore_v0.4.0.zip 
    Archive:  trim_galore_v0.4.0.zip
      inflating: Trim_Galore_User_Guide.pdf  
      inflating: trim_galore             
      inflating: RRBS_Guide.pdf          
    warning:  skipped "../" path component(s) in ../Bismark/license.txt
      inflating: Bismark/license.txt



  • fkrueger
    replied
    Trim Galore v0.4.0 released: Adapter auto-detection

    We have just released Trim Galore version 0.4.0. This adds a few sanity checks and makes the specification of standard adapters more straightforward. In fact, we changed the default mode so that Trim Galore attempts to auto-detect which type of adapter has been used in library construction, which results in a 'one command to trim them all' for standard ClusterFlow processing of a highly diverse full Illumina flowcell.

    Here are the changes in more detail:

    • Unless instructed otherwise Trim Galore will now attempt to auto-detect the adapter which had been used for library construction (choosing from the Illumina universal, Nextera transposase and Illumina small RNA adapters). For this the first 1 million sequences of the first file specified are analysed. If no adapter can be detected within the first 1 million sequences Trim Galore defaults to --illumina. The auto-detection behaviour can be overruled by specifying an adapter sequence or using --illumina, --nextera or --small_rna

    • Added the new options '--illumina', '--nextera' and '--small_rna' to use different default sequences for trimming (instead of -a):
    Universal Illumina: AGATCGGAAGAGC (TruSeq or Sanger iTag)
    Small RNA: ATGGAATTCTCG
    Nextera: CTGTCTCTTATA

    • Added a sanity check to the start of a Trim Galore run to see if the (first) FastQ file in question does contain information at all or appears to be in SOLiD colorspace format, and bails if either is true. Trim Galore does not support colorspace trimming, but users wishing to do this are kindly referred to using Cutadapt as a standalone program

    • Added a new option '--path_to_cutadapt /path/to/cutadapt'. Unless this option is specified it is assumed that Cutadapt is in the PATH (equivalent to '--path_to_cutadapt cutadapt'). Also added a test to see if Cutadapt seems to be working before the actual trimming is launched

    • Fixed an open command for a certain type of RRBS processing (was open() instead of open3())
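The auto-detection described in the first bullet could look roughly like the sketch below. Only the three adapter sequences, the 1 million read limit and the fallback to --illumina come from the list above; the function name, the exact-substring counting and the tie-breaking (first adapter in the table wins) are assumptions made for the example and may differ from what Trim Galore does.

```python
from itertools import islice

# Adapter sequences as listed in the release notes above.
ADAPTERS = {
    "illumina": "AGATCGGAAGAGC",
    "small_rna": "ATGGAATTCTCG",
    "nextera": "CTGTCTCTTATA",
}


def detect_adapter(sequences, max_reads=1_000_000):
    """Guess the adapter type from the first max_reads sequences."""
    counts = dict.fromkeys(ADAPTERS, 0)
    for seq in islice(sequences, max_reads):
        for name, adapter in ADAPTERS.items():
            if adapter in seq:
                counts[name] += 1
    best = max(counts, key=counts.get)
    # Fall back to the Illumina adapter if nothing was found at all.
    return best if counts[best] > 0 else "illumina"


reads = ["ACGTCTGTCTCTTATACACATCT", "TTTTCTGTCTCTTATAGG", "ACGTAGATCGGAAGAGCAA"]
print(detect_adapter(reads))
```

Passing an iterator over the sequences of the first FastQ file would keep memory use low, since `islice` stops after `max_reads` items.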

    Trim Galore is available from the Babraham Bioinformatics projects site.



  • rhinoceros
    replied
    Originally posted by gwilkie View Post
    I have also found that when using Nextera sample prep, you should trim at CTGTCTCTTATACACATCT instead of the usual AGATCGGAAGAGC.

    Best wishes, Gavin
    Is this still the case in 2015? I mean, is "CTGTCTCTTATACACATCT" universal to Nextera prepped samples?

    Leave a comment:


  • MaximeG
    replied
    Hi all,
    I have a question about the non-directional option of Trim Galore.
    After a lot of reflection, we have determined that we made our RRBS library in a directional paired-end manner (R1 begins with C/TGG and R2 with CAA). But the nd option allows the CA to be cut from R2.
    Is it a better strategy to leave this CA in for Bismark and cut it afterwards?
    We have run both. With nd: 36.6% uniquely aligned pairs + 55.6% multiple pairs
    Without nd: 37.8% uniquely aligned pairs + 55.2% multiple pairs
    Thank you for your future response
    Maxime

