Using trimmomatic on multiple paired-end read files

  • Using trimmomatic on multiple paired-end read files

    I need help writing a for loop that runs Trimmomatic for quality trimming on multiple pairs of paired-end fastq files.

    The input PE files look like this:

    C1_S1_L001_R1_001.fastq.gz
    C1_S1_L001_R2_001.fastq.gz

    C2_S39_L001_R1_001.fastq.gz
    C2_S39_L001_R2_001.fastq.gz

    T2_S41_L001_R1_001.fastq.gz
    T2_S41_L001_R2_001.fastq.gz

    T6_S45_L001_R1_001.fastq.gz
    T6_S45_L001_R2_001.fastq.gz

    To run trimmomatic for the paired reads corresponding to C1_S1_L001_R1_001.fastq.gz and C1_S1_L001_R2_001.fastq.gz, the following command works:

    java -jar ~/Trimmomatic-0.36/trimmomatic-0.36.jar PE -phred33 C1_S1_L001_R1_001.fastq.gz C1_S1_L001_R2_001.fastq.gz C1_R1_paired.fq.gz C1_R1_unpaired.fq.gz C1_R2_paired.fq.gz C1_R2_unpaired.fq.gz LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35

    The usage template provided by Trimmomatic is:

    java -jar <path to trimmomatic.jar> PE [-threads <threads>] [-phred33 | -phred64] [-trimlog <logFile>] <input 1> <input 2> <paired output 1> <unpaired output 1> <paired output 2> <unpaired output 2> <step 1> ...

    Any help please!
    Thanks!
    Last edited by shashankgupta; 02-14-2017, 04:49 AM.

  • #2
    I would assume the following should work (but obviously untested):

    Code:
    # Strip the trailing "1_001.fastq.gz" / "2_001.fastq.gz" so each pair yields one prefix
    for f in $(ls *.fastq.gz | sed 's/[12]_001\.fastq\.gz$//' | sort -u)
    do
    java -jar ~/Trimmomatic-0.36/trimmomatic-0.36.jar PE -phred33 ${f}1_001.fastq.gz ${f}2_001.fastq.gz ${f}1_paired.fq.gz ${f}1_unpaired.fq.gz ${f}2_paired.fq.gz ${f}2_unpaired.fq.gz LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35
    done
    You can easily check whether the commands look right by adding an echo statement:
    Code:
    for f in $(ls *.fastq.gz | sed 's/[12]_001\.fastq\.gz$//' | sort -u)
    do
    echo java -jar ~/Trimmomatic-0.36/trimmomatic-0.36.jar PE -phred33 ${f}1_001.fastq.gz ${f}2_001.fastq.gz ${f}1_paired.fq.gz ${f}1_unpaired.fq.gz ${f}2_paired.fq.gz ${f}2_unpaired.fq.gz LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35
    done
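
To see what the loop iterates over, the prefix step can be tried on its own. The following is a minimal sketch, assuming file names that end in `1_001.fastq.gz` or `2_001.fastq.gz` as in the question; `pair_prefixes` is a hypothetical helper name, not part of Trimmomatic. Note that sed works with regular expressions rather than shell wildcards: a bracket expression such as `[12]` matches the read number, whereas `?` would match only a literal question mark.

```shell
# pair_prefixes: read file names on stdin, print each unique pair prefix once.
# Assumes names end in 1_001.fastq.gz or 2_001.fastq.gz, as in the question.
pair_prefixes() {
    sed 's/[12]_001\.fastq\.gz$//' | sort -u
}

printf '%s\n' \
    C1_S1_L001_R1_001.fastq.gz C1_S1_L001_R2_001.fastq.gz \
    C2_S39_L001_R1_001.fastq.gz C2_S39_L001_R2_001.fastq.gz |
    pair_prefixes
# prints:
# C1_S1_L001_R
# C2_S39_L001_R
```

Each prefix ends in `_R`, so appending `1_001.fastq.gz` and `2_001.fastq.gz` rebuilds the two input names of a pair.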



    • #3
      Originally posted by wdecoster
      I would assume the following should work (but obviously untested):

      Code:
      for f in $(ls *.fastq.gz | sed 's/[12]_001\.fastq\.gz$//' | sort -u)
      do
      java -jar ~/Trimmomatic-0.36/trimmomatic-0.36.jar PE -phred33 ${f}1_001.fastq.gz ${f}2_001.fastq.gz ${f}1_paired.fq.gz ${f}1_unpaired.fq.gz ${f}2_paired.fq.gz ${f}2_unpaired.fq.gz LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35
      done
      You can easily check whether the commands look right by adding an echo statement:
      Code:
      for f in $(ls *.fastq.gz | sed 's/[12]_001\.fastq\.gz$//' | sort -u)
      do
      echo java -jar ~/Trimmomatic-0.36/trimmomatic-0.36.jar PE -phred33 ${f}1_001.fastq.gz ${f}2_001.fastq.gz ${f}1_paired.fq.gz ${f}1_unpaired.fq.gz ${f}2_paired.fq.gz ${f}2_unpaired.fq.gz LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35
      done
      Thank you for helping me.
      However, it fails with the error shown below:

      TrimmomaticPE: Started with arguments:
      0*_R1_001.fastq.gz 0*_R2_001.fastq.gz 0_R1.trimmed_PE.fastq 0_R1.trimmed_SE.fastq 0_R2.trimmed_PE.fastq 0_R2.trimmed_SE.fastq LEADING:3 TRAILING:3 SLIDINGWINDOW:3:20 MINLEN:30
      Exception in thread "main" java.io.FileNotFoundException: 0*_R1_001.fastq.gz (No such file or directory)
      at java.io.FileInputStream.open0(Native Method)
      at java.io.FileInputStream.open(FileInputStream.java:195)
      at java.io.FileInputStream.<init>(FileInputStream.java:138)
      at org.usadellab.trimmomatic.fastq.FastqParser.parse(FastqParser.java:135)
      at org.usadellab.trimmomatic.TrimmomaticPE.process(TrimmomaticPE.java:264)
      at org.usadellab.trimmomatic.TrimmomaticPE.run(TrimmomaticPE.java:539)
      at org.usadellab.trimmomatic.Trimmomatic.main(Trimmomatic.java:80)
      0
      TrimmomaticPE: Started with arguments:
      0*_R1_001.fastq.gz 0*_R2_001.fastq.gz 0_R1.trimmed_PE.fastq 0_R1.trimmed_SE.fastq 0_R2.trimmed_PE.fastq 0_R2.trimmed_SE.fastq LEADING:3 TRAILING:3 SLIDINGWINDOW:3:20 MINLEN:30
      Exception in thread "main" java.io.FileNotFoundException: 0*_R1_001.fastq.gz (No such file or directory)
      at java.io.FileInputStream.open0(Native Method)
      at java.io.FileInputStream.open(FileInputStream.java:195)
      at java.io.FileInputStream.<init>(FileInputStream.java:138)
      at org.usadellab.trimmomatic.fastq.FastqParser.parse(FastqParser.java:135)
      at org.usadellab.trimmomatic.TrimmomaticPE.process(TrimmomaticPE.java:264)
      at org.usadellab.trimmomatic.TrimmomaticPE.run(TrimmomaticPE.java:539)
      at org.usadellab.trimmomatic.Trimmomatic.main(Trimmomatic.java:80)
      0
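
The argument list in this log contains the literal string `0*_R1_001.fastq.gz`, wildcard included, which suggests the pattern reached Trimmomatic unexpanded. The shell leaves a wildcard pattern as-is when no file matches it or when the pattern is quoted, and Java then looks for a file with that exact name. The output names and `SLIDINGWINDOW:3:20` also differ from the loop quoted above, so this log appears to come from a different script. A minimal sketch of a guard, assuming file names without spaces; `resolve_one` is a hypothetical helper name:

```shell
# resolve_one PATTERN: print the single file matching PATTERN, or fail.
resolve_one() {
    # Intentionally unquoted so the shell expands the wildcard here.
    # (File names containing spaces are not handled by this sketch.)
    set -- $1
    if [ "$#" -ne 1 ] || [ ! -e "$1" ]; then
        echo "pattern did not match exactly one file" >&2
        return 1
    fi
    printf '%s\n' "$1"
}

# Usage: only build a Trimmomatic command if the input really exists.
r1=$(resolve_one "0*_R1_001.fastq.gz") || echo "no unique R1 file for prefix 0"
```

Checking the resolved name before calling Trimmomatic turns the Java stack trace into a short message that names the pattern problem directly.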




      As I understand it, the script above was unable to find the input files. To simplify things, I renamed all the files, which now look like this:


      C1_R1.fastq
      C1_R2.fastq

      C2_R1.fastq
      C2_R2.fastq

      C3_R1.fastq
      C3_R2.fastq

      T1_R1.fastq
      T1_R2.fastq

      T2_R1.fastq
      T2_R2.fastq

      T3_R1.fastq
      T3_R2.fastq

      With the renamed files, the working Trimmomatic command for one pair looks like this:

      java -jar ~/Trimmomatic-0.36/trimmomatic-0.36.jar PE -phred33 C1_R1.fastq C1_R2.fastq C1_R1_paired.fastq C1_R1_unpaired.fastq C1_R2_paired.fastq C1_R2_unpaired.fastq LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35
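
The single command above can be wrapped in a loop over the renamed files. This is a sketch rather than a tested pipeline: it assumes every pair is named `<sample>_R1.fastq` / `<sample>_R2.fastq` in the current directory, and `trim_renamed_pairs` and `TRIMMOMATIC_JAR` are hypothetical names introduced here. It only prints the commands; remove the `echo` to execute them.

```shell
#!/bin/bash
# TRIMMOMATIC_JAR is an assumed variable name; adjust the path to your install.
TRIMMOMATIC_JAR="${TRIMMOMATIC_JAR:-$HOME/Trimmomatic-0.36/trimmomatic-0.36.jar}"

# Dry run: print one Trimmomatic PE command per complete R1/R2 pair.
trim_renamed_pairs() {
    for r1 in *_R1.fastq; do
        [ -e "$r1" ] || continue          # pattern matched no files
        sample="${r1%_R1.fastq}"          # C1_R1.fastq -> C1
        r2="${sample}_R2.fastq"
        if [ ! -e "$r2" ]; then
            echo "skipping $sample: $r2 not found" >&2
            continue
        fi
        echo java -jar "$TRIMMOMATIC_JAR" PE -phred33 \
            "$r1" "$r2" \
            "${sample}_R1_paired.fastq" "${sample}_R1_unpaired.fastq" \
            "${sample}_R2_paired.fastq" "${sample}_R2_unpaired.fastq" \
            LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35
    done
}

trim_renamed_pairs
```

Looping over the R1 files and deriving the R2 name from each avoids parsing `ls` output, and the existence checks skip incomplete pairs instead of passing a missing file to Trimmomatic.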
      Last edited by shashankgupta; 02-15-2017, 01:04 AM.



      • #4
        Thanks, this helped me!



        • #5
          Hello all,

          Does this code work on multiple single-end read files?

          Any help, please?

          Many thanks



          • #6
            #!/bin/bash

            # Trim every single-end file; the output name replaces ".fastq.gz" with "trimmed_minleng50.fq.gz"
            for f1 in /path_to_your_raw_data/*.fastq.gz
            do
            java -jar /path_to_trimmomatic_folder/trimmomatic-0.36.jar SE -phred33 "$f1" "${f1%%.fastq.gz}trimmed_minleng50.fq.gz" ILLUMINACLIP:/path_to_trimmomatic_folder/adapters/TruSeq2-SE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:50
            done


            This is what I used for my SE files. In the output name, it replaces .fastq.gz with "trimmed_minleng50.fq.gz" once a file is trimmed. You can edit the order and the values of the parameters (minimum length, minimum quality at the start or end of the reads, adapter file, and so on).
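
The output name comes from shell parameter expansion, which can be checked without running Trimmomatic. A small illustration follows; `sample1` is a hypothetical file name, and the underscore variant is an optional tweak that is not part of the original script.

```shell
f1="/path_to_your_raw_data/sample1.fastq.gz"   # hypothetical file name

# As in the script: the ".fastq.gz" suffix is replaced with no separator.
echo "${f1%%.fastq.gz}trimmed_minleng50.fq.gz"
# prints /path_to_your_raw_data/sample1trimmed_minleng50.fq.gz

# Variant with an underscore separator (optional tweak, not from the post).
echo "${f1%.fastq.gz}_trimmed_minleng50.fq.gz"
# prints /path_to_your_raw_data/sample1_trimmed_minleng50.fq.gz
```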
            Last edited by carmarbla; 08-08-2017, 06:37 AM.



            • #7
              Many thanks Carmarbla for your reply!
              Best



              • #8
                Hi All,
                I just found this thread. Does the code
                for f in $(ls *.fastq.gz | sed 's/[12]_001\.fastq\.gz$//' | sort -u)
                do
                java -jar ~/Trimmomatic-0.36/trimmomatic-0.36.jar PE -phred33 ${f}1_001.fastq.gz ${f}2_001.fastq.gz ${f}1_paired.fq.gz ${f}1_unpaired.fq.gz ${f}2_paired.fq.gz ${f}2_unpaired.fq.gz LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35
                done
                need modification prior to use, or will it work for any PE files?
                cheers
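
Whether the loop works unmodified depends mostly on file naming: it assumes each pair is named `<prefix>1_001.fastq.gz` and `<prefix>2_001.fastq.gz`, so other naming schemes need the suffixes adjusted. One hedged way to check a directory before trimming is a pre-flight test that every R1 file has its R2 mate. The sketch below takes the two suffixes as arguments; `check_pairs` is a hypothetical helper name.

```shell
# check_pairs R1_SUFFIX R2_SUFFIX
# Lists every R1 file in the current directory whose R2 mate is missing.
# Returns 0 only if at least one R1 file exists and all of them have a mate.
check_pairs() {
    r1_suffix=$1
    r2_suffix=$2
    found=0
    missing=0
    for r1 in *"$r1_suffix"; do
        [ -e "$r1" ] || continue               # pattern matched no files
        found=1
        r2="${r1%"$r1_suffix"}$r2_suffix"      # swap the R1 suffix for the R2 suffix
        if [ ! -e "$r2" ]; then
            echo "missing mate for $r1 (expected $r2)"
            missing=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no files ending in $r1_suffix"
        return 1
    fi
    return "$missing"
}

# Usage with the naming scheme from this thread:
check_pairs _R1_001.fastq.gz _R2_001.fastq.gz && echo "all pairs complete"
```

For files named like `C1_R1.fastq`, the same check would be called as `check_pairs _R1.fastq _R2.fastq`.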
