Orientation of 454 paired end reads split by linker

  • Orientation of 454 paired end reads split by linker

    Hi,

    I extracted reads from sff files.

    Then I searched these reads for the Titanium linker.

    1) Why does only a small proportion of my reads contain the linker? My library is 20 kb.

    2) After splitting by the linker, I get a pair of reads. Is the orientation f-> <-r or f-> f->?

    Thanks

  • #2
    Originally posted by skblazer
    Hi,

    I extracted reads from sff files.

    Then I searched these reads for the Titanium linker.

    1) Why does only a small proportion of my reads contain the linker? My library is 20 kb.
    Your circularized DNAs should be ~20 kbp, and they are then sheared into 500-800 bp fragments. This means that far more fragments lack the linker than contain it. The biotin binding is meant to enrich your fragment pool for the linker-containing pieces, but unfortunately the enrichment process is sometimes not very selective. This results in a lot of reads that do not contain the linker and thus are not paired-end reads. I have seen very low percentages of true paired ends in some of our preps as well.
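
To see how small the linker-containing fraction is in a given run, one can simply count the reads with a linker hit. This is only a minimal sketch: it assumes the reads are already available as plain strings and that the linker is matched exactly. The function name `linker_fraction` and the linker sequence are made-up placeholders, not the real Titanium linker. Exact matching is a simplification, since a read with a sequencing error inside the linker is counted as unpaired, so the result is best read as a lower bound.

```python
def linker_fraction(reads, linker):
    """Return the fraction of reads that contain the linker (exact match)."""
    reads = list(reads)
    if not reads:
        return 0.0
    hits = sum(1 for read in reads if linker in read)
    return hits / len(reads)


# Hypothetical placeholder, NOT the real Titanium linker sequence.
PLACEHOLDER_LINKER = "GATTACAGGCTCCATAGCGT"

# Toy reads: only the first one contains the linker.
example_reads = [
    "GGGAAACCC" + PLACEHOLDER_LINKER + "TTTGGGAAA",
    "GGGAAACCCTTTGGGAAA",
    "CCCTTTAAAGGG",
    "AAACCCGGG",
]

fraction = linker_fraction(example_reads, PLACEHOLDER_LINKER)
print(f"{fraction:.0%} of reads contain the linker")
```

For real data, the same function could be fed the sequences extracted from the sff files, with the actual linker sequence substituted for the placeholder.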

    2) After splitting by the linker, I get a pair of reads. Is the orientation f-> <-r or f-> f->?

    Thanks
    They will be in the f-> f-> orientation, but their order relative to their genomic positions will be reversed. To illustrate:

    Within the read, call the two halves of the paired read L and R (left and right):
    Code:
    ================================^^^^^^^^^^^^^^^=======================
    Read-L                             linker      Read-R
    After removing the linker, splitting the read, and aligning (or assembling) the two halves, they should be oriented as shown below, and the distance between them should be ~20 kbp:

    Code:
    Read-R                                                   Read-L
    -------->                                                -------->
    ==================================================================
    Of course, if the reads match the bottom strand of the reference, they will be flipped around.
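
The reversed order can be reproduced with a small simulation. This is a sketch under stated assumptions, not a description of any particular tool: the reference is random sequence, the insert is scaled down to 1,000 bp instead of ~20 kbp, the linker is a made-up placeholder rather than the real Titanium linker, and matching is exact. The helper names `split_by_linker` and `simulate_linker_read` are hypothetical.

```python
import random


def split_by_linker(read, linker):
    """Split a read at the first exact linker hit; return (left, right) or None."""
    pos = read.find(linker)
    if pos == -1:
        return None
    return read[:pos], read[pos + len(linker):]


def simulate_linker_read(reference, start, insert_len, flank, linker):
    """Circularize reference[start:start + insert_len] through the linker and
    return a top-strand read that spans the linker junction."""
    fragment = reference[start:start + insert_len]
    # Circularization joins the fragment's end to its start via the linker,
    # so the read runs: fragment end -> linker -> fragment start.
    return fragment[-flank:] + linker + fragment[:flank]


# Hypothetical placeholder, NOT the real Titanium linker sequence.
PLACEHOLDER_LINKER = "GATTACAGGCTCCATAGCGT"

random.seed(0)
reference = "".join(random.choice("ACGT") for _ in range(2000))

read = simulate_linker_read(reference, start=500, insert_len=1000,
                            flank=50, linker=PLACEHOLDER_LINKER)
read_l, read_r = split_by_linker(read, PLACEHOLDER_LINKER)

# Both halves match the top strand as-is (f-> f->), but Read-R maps upstream.
pos_l = reference.find(read_l)
pos_r = reference.find(read_r)
span = pos_l + len(read_l) - pos_r

print(f"Read-R starts at {pos_r}, Read-L starts at {pos_l}, span {span} bp")
```

Neither half needs reverse-complementing to match the reference, and Read-R lands at the lower coordinate, which is the layout drawn in the second diagram above.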



    • #3
      Many thanks for your kind help, kmcarr.



      • #4
        I have some paired-end data, but I don't know the linker sequence or the insert size. Could you tell me where I can find them? Many thanks.



        • #5
          Have a look at this thread (http://seqanswers.com/forums/showthread.php?t=12940) for linker sequences. You will have to ask the person who constructed the library for insert size information.

          P.S. There is no reason to shout (using large, bold font) in this forum, we can read the normal typeface just fine.



          • #6
            Thanks, kmcarr. I read the thread but did not find the linker sequence. I guessed that the internal adaptor for 454 sequencing might be the same as the Illumina sequencing adaptor, which is why I asked the question again.
            Maybe I can find it after aligning all the paired-end reads.
            P.S. This is the first time I have asked a question on this site, and I had no idea about the font size. I did not mean to shout; that is your interpretation.
            Thanks again.



            • #7
              aurora_Jing,

              Are you asking about 454 paired end reads, Illumina paired end or Illumina mate-pair? You asked your question in a thread specifically about 454 paired end reads so naturally I assumed that was the data you were asking about. The thread I pointed you to clearly has the linker sequences for 454 paired end libraries in the first and second posts.

              Please provide more detail about what type of read data you have (sequencing platform and library construction type) so we can better help you.



              • #8
                Yes, I am now dealing with 454 mate-pair data.
                I found the linker sequences in the posts you kindly pointed me to. I had mistaken the thread you linked for one I read yesterday.
                Thanks again for your quick and kind reply.



                • #9
                  What is the usual percentage of true PE reads in a 20 kb prep?
                  I've done several 3 kb preps but never 8 or 20 kb. I believe we get 50-60% true PE reads in our 3 kb preps.
