  • westerman
    replied
    @kmcarr: I will concede that the constraint is mostly, if not entirely, theoretical, since the adapter sequence should be known -- certainly it will be by us service providers, and this information should be passed on to our customers. An 'adapter-knowledge-free' program would only be useful in extremely rare cases or as part of a thought experiment.

    I had not considered Trimmomatic's Palindrome mode since I never use that part of Trimmomatic. Thanks for the tip.


  • kmcarr
    replied
    Originally posted by westerman
    As a followup, it turns out that the samples in question did not (for the most part) look like the 2nd example I gave -- i.e., with the desired fragment fully contained in R1 and R2, with R1 starting inside R2 and vice versa. Instead, most of the reads looked like the 1st example, so we could use the normal Panda/Flash methodology on them.

    It might still be interesting to develop an 'adapter-knowledge-free' stitching/merging program. But that is a task for another day.
    I'm curious about the 'adapter-knowledge-free' constraint on your problem. If the premise of instance #2 in your original post is that these are sequencing reads in which (read length) > (fragment length) (i.e., they contain adapter sequence at the 3' end), how would you not know what the adapter sequence is? The adapters/sequencing primers for all major kits are pretty much known, are they not?

    If you have a priori knowledge of the adapter sequences, then Trimmomatic, using its Palindrome trimming mode, handles cases like #2, but not in exactly the way you asked about. It makes no attempt to "merge" the two reads. It simply clips the adapter from read 1 and discards read 2 entirely, as it contains no additional data beyond that which is contained in read 1.


  • westerman
    replied
    As a followup, it turns out that the samples in question did not (for the most part) look like the 2nd example I gave -- i.e., with the desired fragment fully contained in R1 and R2, with R1 starting inside R2 and vice versa. Instead, most of the reads looked like the 1st example, so we could use the normal Panda/Flash methodology on them.

    It might still be interesting to develop an 'adapter-knowledge-free' stitching/merging program. But that is a task for another day.


  • westerman
    replied
    Originally posted by mcnelson.phd
    ... but the new version of Reporter incorporates a read "Stitching" feature that might do exactly what you want.
    Ah yes, that is an interesting option. Hard to say from scanning the docs if it would be better than Panda/Flash/SeqPrep but since the Reporter can be run off-machine I might give it a try. Thanks for the tip.


  • westerman
    replied
    Originally posted by GenoMax
    Rick,

    It sounds like you do not want to trim (adapters) before the merge, is that a requirement?
    Not a requirement per se. It is what I will probably end up doing, especially since we know the adapters. However, Phillip and I were wondering if there is an adapter-knowledge-free method.

    Indeed, the longer lengths are making for interesting possibilities.


  • mcnelson.phd
    replied
    It's probably too late right now if your run is already going, but the new version of Reporter incorporates a read "Stitching" feature that might do exactly what you want. You'll have to manually add the flag to your sample sheet and reprocess your data if you want to try it. Check out the full guide on Reporter for what the actual flag is and what options are associated with it.


  • GenoMax
    replied
    Longer read lengths have finally made the idea practical.


  • SNPsaurus
    replied
    This group published along these lines:
    Background: High throughput sequencing is beginning to make a transformative impact in the area of viral evolution. Deep sequencing has the potential to reveal the mutant spectrum within a viral sample at high resolution, thus enabling the close examination of viral mutational dynamics both within and between hosts. The challenge, however, is to accurately model the errors in the sequencing data and differentiate real viral mutations, particularly those that exist at low frequencies, from sequencing errors. Results: We demonstrate that overlapping read pairs (ORP) -- generated by combining short fragment sequencing libraries and longer sequencing reads -- significantly reduce sequencing error rates and improve rare variant detection accuracy. Using this sequencing protocol and an error model optimized for variant detection, we are able to capture a large number of genetic mutations present within a viral population at ultra-low frequency levels (<0.05%). Conclusions: Our rare variant detection strategies have important implications beyond viral evolution and can be applied to any basic and clinical research area that requires the identification of rare mutations.


    They align the raw reads and analyze that rather than merging. There is a second paper that came out more recently as well, but I can't dredge it up. My lab should have our version out soon, too. Gary Schroth at Illumina said he was pushing long ago to have this be the standard output of the Illumina machines as a way to get separation on error rate from other platforms, so it is funny that years later there is a sudden wave of labs all independently coming up with the idea.
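A back-of-the-envelope sketch of why overlapping pairs can cut the error rate. This is not taken from the paper: it assumes the two reads make independent errors and that a wrong call is equally likely to be any of the three other bases, which real data only approximates. The function names are made up for illustration.

```python
def phred_to_error(q: float) -> float:
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def concordant_error(q1: float, q2: float) -> float:
    """Probability that both reads report the SAME wrong base at a position.

    Assumes independent errors, each landing uniformly on one of the three
    incorrect bases, so the second read matches the first read's wrong call
    one time in three.
    """
    e1 = phred_to_error(q1)
    e2 = phred_to_error(q2)
    return e1 * e2 / 3


if __name__ == "__main__":
    # A single Q30 call is wrong about 1 time in 1,000.
    print(phred_to_error(30))
    # Two agreeing Q30 calls are both wrong (same base) far less often.
    print(concordant_error(30, 30))
```

Under those assumptions, requiring R1 and R2 to agree pushes the per-base error well below the 0.05% variant frequency mentioned in the abstract, which is the intuition behind calling rare variants only from concordant overlapping bases.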


  • GenoMax
    replied
    Rick,

    It sounds like you do not want to trim (adapters) before the merge, is that a requirement?
    Last edited by GenoMax; 08-27-2013, 09:13 AM.


  • SNPsaurus
    replied
    I use SeqPrep for exactly that purpose, although I do an extra careful adapter stripping before and after merging to clean up the errors. It did an ok job without the extra step, but I wanted the reads as error-free as possible. I can reliably find alleles at the 0.03% range by doing that.

    I look at the length of the merged reads and trim back if they are in a size range where partial adapters would have been present. But your approach would work too, I think.
    Last edited by SNPsaurus; 08-27-2013, 09:10 AM.


  • westerman
    replied
    @GenoMax: Your idea should work, but doing it for an entire MiSeq run sounds like a long processing time. I was hoping for a quicker, one-stop solution.

    @McNelson.phd: No, I haven't tried SeqPrep but from my reading of it -- and your description -- it sounds like it would act the same as Panda and Flash: not good for when there is no prior knowledge of the adapter. I will install it though and give it a spin.

    Real data coming off the sequencer later today!


  • mcnelson.phd
    replied
    Have you tried SeqPrep?

    I know I've tried it on Nextera data and by giving it the Nextera adapter sequence it was able to spit out reads with 100% overlap but whose length was < 250bp, which would fit what you're talking about. What I can't say is how it would handle the "adapter" sequences that might hang off the ends if you don't provide it with any sequence to look for.


  • GenoMax
    replied
    A pair-wise aligner that can export a consensus, followed by an appropriate trim, should work, right?


  • westerman
    replied
    As Phillip SanMiguel said to me in private email and which may clarify my post:


    So the reads may all have a few (1-5) bases of adapter at their 3' ends. A better way to trim them would be to compare R1 and R2 -- the first base of each should point out the last base of the other. If PANDA had a setting to remove single-stranded sequence from pair merges, that would be good.


  • westerman
    started a topic Merger/overlapper for fully contained fragment


    I am trying to find a tool that would do merging/overlapping of PE reads when the fragment is fully contained within the reads, without having to know the adapters ahead of time. The programs PANDA and FLASH (and others) will merge PE reads into a single read; however, they are geared towards cases where each read covers only part of the fragment. E.g.,

    Code:
    R1:     ---------->
    R2:          <----------
    Frag:   ----------------
    However, I am thinking of this situation:
    Code:
    R1:          ------------->
    R2:       <--------------
    Frag:        ------------
    Both Panda and Flash can remove adapters before making the merge; however, if the adapter is short (say 4 bases), I am not confident that the programs will be able to do so. Perhaps a better program would be one that matches the first bases of R1 to the region close to the end of R2, and vice versa, and then outputs the merged read only where both R1 and R2 match. In other words, a merge where the adapter does not need to be known a priori.
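A minimal sketch of the matching idea described above, to make it concrete. This is not an existing tool and not how Panda or Flash work internally. The function `merge_contained`, the parameters `min_len` and `max_mismatch_frac`, and the example sequences are all hypothetical. It ignores quality scores and simply looks for the longest prefix of R1 that matches the suffix of reverse-complemented R2.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_contained(r1, r2, min_len=10, max_mismatch_frac=0.1):
    """Recover a fragment that is fully contained in both reads.

    When the fragment is shorter than the reads, R1 is fragment + adapter,
    and the reverse complement of R2 is (adapter remnant) + fragment.
    So the fragment is the prefix of R1 that matches the suffix of
    reverse-complemented R2. No adapter sequence is needed.

    Returns R1's copy of the fragment, or None if no candidate length
    passes the (hypothetical) mismatch threshold.
    """
    r2rc = reverse_complement(r2)
    # Try the longest candidate fragment length first.
    for frag_len in range(min(len(r1), len(r2rc)), min_len - 1, -1):
        head = r1[:frag_len]
        tail = r2rc[len(r2rc) - frag_len:]
        mismatches = sum(a != b for a, b in zip(head, tail))
        if mismatches <= max_mismatch_frac * frag_len:
            return head
    return None


if __name__ == "__main__":
    # Made-up 22-base fragment with 4 bases of made-up adapter on each read.
    fragment = "GATTACAGGCTCAATGCCGTAC"
    r1 = fragment + "AGAT"
    r2 = reverse_complement(fragment) + "CTGT"
    print(merge_contained(r1, r2))  # prints the 22-base fragment
```

Reads that overlap in the ordinary way (the 1st example) return None here, so a real program would fall back to normal suffix/prefix merging in that case. Low-complexity or repetitive fragments could also give false matches, which is one reason an actual tool would weigh base qualities.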

    Hope that this makes sense. Any suggestions? Thanks.
