  • Bowtie Paired-end options

    Hi everyone,

    I just began to use bowtie to map my solexa paired-end reads to a genome. I found that bowtie has three options to specify the orientation of the paired-end reads, --fr, --rf, and --ff. I tried all of them, and found:

    --fr, as the default one, can map most of my data onto the genome. For the left ones, --rf and --ff options can map some, but still leave some reads, which can be separately mapped in non-paired-end mode(should I call it single-end mode), but cannot be recognized as Paired-end reads with any of the three options.

    Carefully examining those remaining reads, I found most of them are in an orientation of rev-rev. In another word, both ends of each pair are mapped to the reverse strand. Why does bowtie have no --rr option? How can I process those reads in paired-end mode?

    THanks a lot!
    Dezhi

  • #2
    Dezhi,

    There is no --rr because that is the same as --ff, molecularly speaking. The 'f' and 'r' do not refer to the strand which the read aligns to, but to the relative orientation of read1 and read2. The relative orientation of reads 1 and 2 is determined by the method used to generate the paired reads; it should be identical for all read pairs in your data set, and you should know what the expected orientation is based on the method used to construct the paired-end library. (Let's ignore for now that unexpected pair orientations can signal structural variations.)

    The --fr orientation means that the two reads are aligned to opposite strands and are pointed toward each other (i.e. the 3' ends of the reads are closer together than the 5' ends).

    Code:
    -------------->
    ===============================================
                                    <--------------
    The --rf orientation means that the two reads are aligned to opposite strands and are pointed away from each other (i.e. the 5' ends of the reads are closer together than the 3' ends).

    Code:
                                    -------------->
    ===============================================
    <--------------
    The --ff orientation means that both reads are aligned to the same strand and are pointed in the same direction. Further, read1 should always be 5' to read2, relative to the strand they are aligned to. In the two cases above you should have an equal distribution of read1 being the read "on the left".

    Code:
    read1                           read2
    -------------->                 -------------->
    ===============================================
    
    OR
    
    ===============================================
    <--------------                 <--------------
              read2                            read1
    Read pairs in the --fr orientation are produced using the Illumina paired-end protocol. Read pairs in the --rf orientation are produced using the Illumina mate-pair protocol. Read pairs in the --ff orientation are produced using the SOLiD mate-pair protocol.
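
The orientation rules described above can be sketched in a few lines of Python. This is only an illustration of the definitions in this post, not bowtie's own logic: the `Mate` type, the `classify_pair` function, and the "ff-swapped" label (same strand, but read2 is 5' to read1) are all made up for this example, and overlapping or contained mates are not treated specially.

```python
from typing import NamedTuple


class Mate(NamedTuple):
    """One aligned read of a pair (hypothetical helper type)."""
    start: int   # leftmost reference coordinate of the alignment
    strand: str  # '+' (forward strand) or '-' (reverse strand)


def classify_pair(read1: Mate, read2: Mate) -> str:
    """Return 'fr', 'rf', 'ff' or 'ff-swapped' for an aligned read pair."""
    if read1.strand != read2.strand:
        # Opposite strands: only the position of the '+' read relative to
        # the '-' read matters, not which of the two is read1.
        plus, minus = (read1, read2) if read1.strand == "+" else (read2, read1)
        # '+' read on the left means the reads point toward each other.
        return "fr" if plus.start <= minus.start else "rf"

    # Same strand: --ff expects read1 to be 5' of read2 on that strand.
    read1_is_left = read1.start <= read2.start
    # On the reverse strand, the 5' side is the right-hand side.
    read1_is_5prime = read1_is_left if read1.strand == "+" else not read1_is_left
    return "ff" if read1_is_5prime else "ff-swapped"


# A rev-rev pair with read1 on the right is just --ff seen from the other strand.
print(classify_pair(Mate(300, "-"), Mate(100, "-")))
```

Running this over the strand and position columns of an alignment file would give a tally of how many pairs fall into each class, which is one way to check whether a library really has the orientation you expect.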

    • #3
      kmcarr, thanks so much. Your reply is really helpful.

      I guess that my reads should be produced using the Illumina paired-end protocol, so most of them are in the --fr orientation.

      But a small portion of the paired-end reads are in the --rf or --ff orientation, because they can be mapped with those two options. Some are even in the orientation illustrated below. I guess they are different from the --ff orientation, because read2 is 5' to read1.

      Do you have any idea why the same Illumina paired-end protocol can generate some reads that are not in the --fr orientation? Or have you met this problem before?

      read2                           read1
      -------------->                 -------------->
      ===============================================

      OR

      ===============================================
      <--------------                 <--------------
      read1                           read2

      • #4
        SOLiD Paired-end with Bowtie

        This post is really helpful, addressing Illumina mate-pair and paired-end as well as SOLiD mate-pair. The newest SOLiD machines also have a paired-end protocol, which as near as I can tell has the configuration:

        Code:
        read1                           read2
        -------------->                 <--------------
        ===============================================
        Now, since the reverse complement of a color-space read is just the reversed color sequence, it seems to me this is equivalent to --fr in bowtie. Has anybody run Bowtie with SOLiD paired-end data?
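
The color-space claim above can be checked with a small Python sketch. It assumes the standard SOLiD two-base encoding (with A=0, C=1, G=2, T=3, the color of a dinucleotide is the XOR of the two base codes) and ignores the leading primer base that real color-space reads carry. The function names are made up for this example.

```python
# 2-bit base codes; under the two-base encoding assumed here, the color of
# each adjacent base pair is the XOR of the two codes.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def to_colors(seq: str) -> list[int]:
    """Encode a nucleotide sequence as a list of colors (0-3)."""
    return [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


seq = "ATGGCAT"
colors = to_colors(seq)
rc_colors = to_colors(reverse_complement(seq))

# Complementing both bases of a dinucleotide leaves its color unchanged,
# so the reverse complement shows up as the same colors in reverse order.
print(colors)     # [3, 1, 0, 3, 1, 3]
print(rc_colors)  # [3, 1, 3, 0, 1, 3]
print(rc_colors == colors[::-1])
```

This only shows the encoding property itself; whether bowtie then treats SOLiD paired-end data as --fr is still the open question.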

        • #5
          I'd like to bump this up as I'm trying to map some SOLiD 4 reads. Thanks!

          • #6
            Hi!

            I'm analyzing a "second-hand" dataset generated using SOLiD 4. It is a transcriptome mate-pair library that is 52 x 37 nt, and I cannot for the life of me find the protocol that was used to generate those specific read lengths. I have F3 and R3 reads, so I am assuming it is a circularization protocol, but I do not know what the size selection parameters were, or how the circles were cut to produce the final fragments. This information would be very valuable for more accurate mapping.

            Any knowledge would be greatly appreciated!

            Thanks a lot,

            Carmen
