454 PE-linker question


  • 454 PE-linker question

    Hello,

    I have been working on how to incorporate a linker when simulating paired reads from an assembly of 454 reads, and I have a few questions about how the linker sequence is used. In my experience, Newbler treats simulated pairs as single-end reads even if you tell it they are from a paired-end run. If anyone could suggest a reference, other than the 454 website, on how the linker is used, or on whether anything else is used to distinguish pairs, that would be appreciated.

    If there is a tool similar to wgsim for 454 data, that would really help. I am not restricted to Newbler by any means, but I would rather not generate reads with wgsim, for example, and then have to assemble (or map) them as Illumina reads.

    Thanks.

  • #2
    I guess you are referring to the internal adaptor. Here is my understanding of the protocol: once you have genomic DNA fragments of the desired size, you circularize them, and the internal adaptor is added to close the circle. Cutting the circle is not perfect, so you may end up having the adaptor very close to the ends of the fragment. So the fragments you sequence will look like this:

    ADAPTOR 5-----------------INTERNAL ADAPTOR ------------------3 ADAPTOR
    ADAPTOR 3-----------------INTERNAL ADAPTOR ------------------5 ADAPTOR

    For your modeling, randomize the location of the adaptor.
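
    A minimal sketch of that idea, under several assumptions: the linker below is a made-up 44 bp placeholder (not the real 454 sequence), the left flank is taken from the tail of the circularized fragment and the right flank from its head, and `read_len` and `min_flank` are hypothetical parameters. It does not model sequencing errors or strand.

```python
import random

# Placeholder only: NOT the real 454 linker. Substitute the sequence for your kit.
LINKER = "ACGTTGCA" * 5 + "ACGT"  # 44 bp

def simulate_pe_read(fragment, read_len=400, min_flank=20, linker=LINKER, rng=random):
    """Return one read laid out as: fragment tail + linker + fragment head.

    The linker position is randomized by drawing the left flank length
    uniformly, keeping at least min_flank bases on each side.
    """
    flank_total = read_len - len(linker)
    if flank_total < 2 * min_flank:
        raise ValueError("read_len too short for the linker plus two flanks")
    if len(fragment) < flank_total:
        raise ValueError("fragment shorter than the combined flank length")
    left_len = rng.randint(min_flank, flank_total - min_flank)
    right_len = flank_total - left_len
    left = fragment[-left_len:]    # end of the original fragment (assumed layout)
    right = fragment[:right_len]   # start of the original fragment (assumed layout)
    return left + linker + right
```

    Passing a seeded `random.Random` as `rng` makes the simulated linker positions reproducible.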

    Also, look through the big projects (1000 Genomes, TCGA for cancer) for real 454 data.
    -drd



    • #3
      Thanks for the reply and the advice to look at public data. Ironically, the 1000 Genomes website is down, but I guess that is beside the point.

      Yes, I was referring to the internal sequence and how (or whether) it is used or recognized during assembly with Roche's software. I was trying to do assemblies with simulated paired ends, but they were not being treated as such. So I was wondering whether the linker is just part of the protocol and removed prior to assembly, or whether it is used in some way to orient the pairs. Working with some public data might help me get at this, so that will be my next step.



      • #4
        When I was looking at this data (a long time ago), we had to extract the adapter ourselves. Perhaps that has changed and it is now done by the Roche/454 software.

        Do you mean Newbler? Is it possible that you are not using the correct linker/adapter sequence in your simulated data? Does Newbler report anything about linker processing?
        -drd



        • #5
          So I confirmed that Newbler finds and removes the 44 bp linker prior to the assembly step. Also, both the efficiency and the number of successful pairs seem very high. So perhaps you are using the wrong linker in your simulated reads?
          -drd
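
          To check simulated reads before handing them to an assembler, the find-and-remove step can be approximated as below. This is only an illustration, not Newbler's actual method: it requires an exact linker match (real 454 reads would need error-tolerant matching), and `split_at_linker` and its `min_len` cutoff are hypothetical names chosen for this sketch.

```python
def split_at_linker(read, linker, min_len=15):
    """Split a read into (left, right) mates around an exact linker match.

    Returns None when the linker is absent or when either flank is
    shorter than min_len, in which case the read is effectively single-end.
    """
    pos = read.find(linker)
    if pos == -1:
        return None
    left = read[:pos]
    right = read[pos + len(linker):]
    if len(left) < min_len or len(right) < min_len:
        return None
    return left, right
```

          Counting how many simulated reads return a pair gives a rough analogue of the pairing efficiency mentioned above.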



          • #6
            Originally posted by drio
            So I confirmed that newbler finds and removes the 44bp of the linker prior to the assembly step. Also, the efficiency seems very high and the amount of successful pairs
            seems very high. So perhaps you are using the wrong linker in your simulated reads?

            This is the information I was looking for. I have not been using the linker at all when doing the assemblies with the simulated reads because I thought maybe people removed it prior to doing the assembly. Now I will incorporate this into the reads and see if Newbler recognizes it.



            • #7
              You can force Newbler to interpret your reads as paired-end with the -p flag. However, I think you need to give Newbler the pairs as separate FASTA entries, with descriptions in the FASTA headers like this:

              >readname_F template=readname dir=F library=libname
              >readname_R template=readname dir=R library=libname

              To be clear, I know this works, but I have no clue whether Newbler will check for the 454 linker in FASTA files added with the -p flag...
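
              For simulated pairs, headers in that layout could be generated with something like the following sketch. The function name `write_pair` and the default library name are made up for illustration, and whether the second mate should be reverse-complemented first is not covered here and should be checked against real data.

```python
def write_pair(handle, template, left, right, library="simlib"):
    """Write two mates as separate FASTA entries sharing one template name.

    Header layout follows the example above:
    >NAME_F template=NAME dir=F library=LIB
    >NAME_R template=NAME dir=R library=LIB
    """
    handle.write(f">{template}_F template={template} dir=F library={library}\n")
    handle.write(f"{left}\n")
    handle.write(f">{template}_R template={template} dir=R library={library}\n")
    handle.write(f"{right}\n")
```

              Any open text file handle (or `sys.stdout`) can be passed as `handle`.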

