
  • fewer reads R1 than R2 bowtie2

    Dear all,
    Perhaps I am about to ask something that has already been discussed too many times. However, I am absolutely unable to find the right answer.
    Does anyone know how to perform a mapping step using Bowtie2 with a different number of R1 and R2 reads from Illumina? I have heard something about using Trim_galore prior to the alignment step, but I must trim several primers, and after a cutadapt trimming process R1 and R2 have a different number of reads. I guess that using Trim_galore after that could trim useful sequences...
    Any advice would be appreciated.
    Thanks a lot!

  • #2
    If you can reformat your R1/R2 files so that they contain only paired reads, and put the unpaired reads in an extra FASTQ file, you can do...

    -q -1 R1.fastq -2 R2.fastq -U unpaired.fastq

    On my end, I've had good luck with Trimmomatic, which will give you separate files for the paired and unpaired reads (1P, 2P, 1U, 2U).
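As a rough sketch of how the options above fit together, the snippet below assembles a full bowtie2 argument list from paired files plus any number of unpaired files. The index name and all file names are hypothetical placeholders; the only assumptions about bowtie2 itself are the -x, -q, -1, -2, -U and -S options, with -U taking a comma-separated list of files.

```python
def bowtie2_command(index, r1_paired, r2_paired, unpaired, out_sam):
    """Build a bowtie2 argument list that mixes paired and unpaired FASTQ input."""
    cmd = ["bowtie2", "-x", index, "-q", "-1", r1_paired, "-2", r2_paired]
    if unpaired:
        # -U accepts a comma-separated list, so 1U and 2U can both be passed.
        cmd += ["-U", ",".join(unpaired)]
    cmd += ["-S", out_sam]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names following Trimmomatic's 1P/2P/1U/2U convention.
    args = bowtie2_command(
        "my_index",
        "sample_1P.fastq",
        "sample_2P.fastq",
        ["sample_1U.fastq", "sample_2U.fastq"],
        "sample.sam",
    )
    print(" ".join(args))
```

The returned list can be handed to subprocess.run(args, check=True) once bowtie2 is on the PATH.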



    • #3
      Hi ctseto! Thanks for your reply!
      I had a look at Trimmomatic. However, it didn't work as well as I expected: some primer sequences still remained in my FASTQ files.
      That's why I decided to give cutadapt a chance, but I don't know how to "reformat" my R1 and R2 files...
      Thanks!



      • #4
        Not familiar with cutadapt, but I'll look into it. From https://github.com/marcelm/cutadapt/...ster/README.md:

        If you use one of the read-discarding options, then the --paired-output option is needed to keep the two files synchronized. First trim the forward read, writing output to temporary files:

        cutadapt -a ADAPTER_FWD --minimum-length 20 --paired-output tmp.2.fastq -o tmp.1.fastq reads.1.fastq reads.2.fastq

        Then trim the reverse read, using the temporary files as input:

        cutadapt -a ADAPTER_REV --minimum-length 20 --paired-output trimmed.1.fastq -o trimmed.2.fastq tmp.2.fastq tmp.1.fastq

        Finally, remove the temporary files:

        rm tmp.1.fastq tmp.2.fastq

        I assume this is what you tried?

        An inelegant way of pulling out the reads that still have a mate would be to build an index of the read names that exist in both R1 and R2, then extract those reads from R1 and R2 and write a new pair of files containing only them.
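A minimal sketch of that read-name index idea, in plain Python with no external libraries. It assumes 4-line FASTQ records, read names that are unique within each file, mates that share a name apart from an optional /1 or /2 suffix or a trailing comment, and files small enough for the name sets to fit in memory. The function names are made up for illustration.

```python
def pair_key(header):
    """Name shared by both mates: drop '@', any comment, and a /1 or /2 suffix."""
    name = header[1:].split()[0]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def read_fastq(path):
    """Yield (pair_key, record_lines) for each 4-line FASTQ record in a file."""
    with open(path) as handle:
        while True:
            record = [handle.readline() for _ in range(4)]
            if not record[0].strip():
                break  # end of file (or trailing blank line)
            yield pair_key(record[0]), record


def format_record(record):
    """Join the four lines, making sure each one ends with a newline."""
    return "".join(line.rstrip("\n") + "\n" for line in record)


def resync_pairs(r1_in, r2_in, r1_out, r2_out, unpaired_out):
    """Write reads present in both inputs to r1_out/r2_out, the rest to unpaired_out.

    Input order is preserved, so the outputs stay synchronized as long as the
    shared reads appear in the same relative order in both inputs.
    Returns (number_of_pairs, number_of_unpaired_reads).
    """
    names_r1 = {name for name, _ in read_fastq(r1_in)}
    names_r2 = {name for name, _ in read_fastq(r2_in)}
    shared = names_r1 & names_r2

    n_unpaired = 0
    with open(unpaired_out, "w") as orphans:
        for src, dst in ((r1_in, r1_out), (r2_in, r2_out)):
            with open(dst, "w") as paired:
                for name, record in read_fastq(src):
                    if name in shared:
                        paired.write(format_record(record))
                    else:
                        orphans.write(format_record(record))
                        n_unpaired += 1
    return len(shared), n_unpaired
```

The three output files can then go to Bowtie2 as -1, -2 and -U respectively. Each input is read twice (once to collect names, once to write), which trades speed for not holding whole records in memory.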
        Last edited by ctseto; 11-14-2013, 09:09 AM.



        • #5
          This is why I prefer Trimmomatic. It handles paired-end reads more elegantly. I suggest you give it a try.



          • #6
            Exactly!
            Older version....
            Thanks a lot ctseto!
            However, I will give Trimmomatic another chance... thanks!



            • #7
              Originally posted by jordi
              Exactly!
              Older version....
              Thanks a lot ctseto!
              However, I will give Trimmomatic another chance... thanks!

              Might be worth checking your Trimmomatic's list of adapters to be on the safe side.

              That would be the TruSeq3-PE.fa file in your adapters directory in the Trimmomatic directory. (Current version is 0.30)

              If you know which adapters/primers/kits were used upstream of your NGS run, something like Illumina's Customer Sequence Letter (http://support.illumina.com/download...es_letter.ilmn) should give you enough information (too much, even) to put together a list of adapters to trim off.

              A figure illustrating the schema of Trimmomatic is in its manual (http://www.usadellab.org/cms/uploads...nual_V0.30.pdf)

              From Trimmomatic's notes:
              These sequences have not been extensively tested, and depending on specific issues which may occur in library preparation, other sequences may work better for a given dataset.

              To make a custom version of fasta, you must first understand how it will be used. Trimmomatic uses two strategies for adapter trimming: Palindrome and Simple

              With 'simple' trimming, each adapter sequence is tested against the reads, and if a sufficiently accurate match is detected, the read is clipped appropriately.

              'Palindrome' trimming is specifically designed for the case of 'reading through' a short fragment into the adapter sequence on the other end. In this approach, the appropriate adapter sequences are 'in silico ligated' onto the start of the reads, and the combined adapter+read sequences, forward and reverse are aligned. If they align in a manner which indicates 'read-through', the forward read is clipped and the reverse read dropped (since it contains no new data).

              Naming of the sequences indicates how they should be used. For 'Palindrome' clipping, the sequence names should both start with 'Prefix', and end in '/1' for the forward adapter and '/2' for the reverse adapter. All other sequences are checked using 'simple' mode. Sequences with names ending in '/1' or '/2' will be checked only against the forward or reverse read. Sequences not ending in '/1' or '/2' will be checked against both the forward and reverse read. If you want to check for the reverse-complement of a specific sequence, you need to specifically include the reverse-complemented form of the sequence as well, with another name.
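To make the naming rules above concrete, here is a small sketch that classifies an adapter name the way the notes describe and writes a custom adapter FASTA that also includes reverse-complemented copies. The helper names, the "_rc" naming and the example sequences are hypothetical placeholders, not real adapters, and treating a "Prefix" name without a /1 or /2 suffix as 'simple' is my reading of the notes rather than confirmed Trimmomatic behaviour.

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA sequence (A/C/G/T/N only)."""
    return seq.translate(COMPLEMENT)[::-1]


def classify_adapter(name):
    """Return (mode, reads_checked) for a FASTA sequence name.

    '/1' means forward read only, '/2' reverse read only, anything else both.
    Names starting with 'Prefix' and ending in '/1' or '/2' are palindrome
    mode; everything else is simple mode (an assumption for 'Prefix' names
    that lack a suffix).
    """
    if name.endswith("/1"):
        target = "forward"
    elif name.endswith("/2"):
        target = "reverse"
    else:
        target = "both"
    if name.startswith("Prefix") and target != "both":
        return "palindrome", target
    return "simple", target


def rc_name(name):
    """Name for the reverse-complemented copy, keeping any /1 or /2 suffix last."""
    for suffix in ("/1", "/2"):
        if name.endswith(suffix):
            return name[:-2] + "_rc" + suffix
    return name + "_rc"


def adapter_fasta(primers):
    """Build FASTA text with each sequence plus its reverse complement.

    primers: dict mapping sequence name -> sequence. Intended for simple-mode
    entries, since the reverse complement is only checked if listed explicitly.
    """
    lines = []
    for name, seq in primers.items():
        lines += [">" + name, seq, ">" + rc_name(name), reverse_complement(seq)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder sequences only; substitute the primers from your own kit.
    print(adapter_fasta({"primerA": "AACGTTGCA", "primerB/2": "GGATCCTTA"}), end="")
```

The resulting text can be saved next to TruSeq3-PE.fa and passed to Trimmomatic's ILLUMINACLIP step in place of the stock file.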
              Last edited by ctseto; 11-14-2013, 11:46 AM.
