
  • Should I try hybrid assembly with my PacBio data?

    Hi all,

    I recently had the genome of a bacterial strain I am working with sequenced using both PacBio and paired-end Illumina.

    I have managed to assemble the Illumina data into ~200 contigs using Soap2. The PacBio data came back already assembled into 22 contigs, which was a little disappointing, especially because other people in my lab have sequenced different strains of the same species and got their data back as one contig! The original idea was to map the Illumina reads to the PacBio assembly to look for errors.

    But anyway, now I am not sure what to do with the data I have. The four longest contigs of the PacBio assembly cover ~97% of my estimated 4.5 Mb genome size, but all the other contigs also map to the same species in the BLASR output, although some have low coverage. Now I'm not sure what is "real", and I don't want to underestimate the genome size.
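    For reference, the ~97% figure is just the summed length of the longest contigs divided by the estimated genome size. A minimal sketch of that arithmetic follows; the contig lengths are made-up placeholders, not the real assembly, and `fraction_covered` is a hypothetical helper name.

```python
def fraction_covered(contig_lengths, genome_size, top_n):
    """Return the fraction of genome_size spanned by the top_n longest contigs."""
    longest = sorted(contig_lengths, reverse=True)[:top_n]
    return sum(longest) / genome_size


# Hypothetical contig lengths in bp, for illustration only (not the real assembly).
example_lengths = [2_000_000, 1_200_000, 700_000, 465_000, 60_000, 40_000, 35_000]
estimated_genome_size = 4_500_000  # the 4.5 Mb estimate from the post

print(f"{fraction_covered(example_lengths, estimated_genome_size, 4):.1%}")
```

    Running the same function with `top_n` set to the total number of contigs shows how much the small contigs would add to the genome size estimate if they were all real.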

    I have read that you can use PacBio sequences to scaffold Illumina contigs, so I am wondering if I should try that. But I can't really find any helpful tutorials or resources on how to do this. I'm not sure which PacBio data I should use (I have the CCS.fastq, filtered subread fastq and longest subread fastq files), whether I need to do anything to the data before using it, which program to use, etc.

    Any help would be appreciated, even if it's just a link to a good resource.

    Thanks in advance!

  • #2
    Rather than try a complex hybrid approach, which is unlikely to be any more successful than the 22-contig PacBio assembly, I would try to diagnose and optimize the PacBio assembly. How do the preassembly statistics (yield, N50, number of bases) compare to the other assemblies in your lab? Was the subread N50, or the number of bases in the filtered data, lower than in the runs that generated single contigs?
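    If the preassembly report is not at hand, the same numbers can be approximated from the filtered subread FASTQ. This is a rough sketch, assuming plain 4-line FASTQ records; the function names are hypothetical, the example lengths are invented, and the values may not match the SMRT Analysis report exactly.

```python
def n50(lengths):
    """Length L such that reads of length >= L contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def fastq_read_lengths(path):
    """Yield read lengths from a FASTQ file, assuming simple 4-line records."""
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # the sequence line of each record
                yield len(line.strip())


def summarize(lengths):
    """Collect read count, total bases (yield) and N50 for a set of read lengths."""
    lengths = list(lengths)
    return {"reads": len(lengths), "bases": sum(lengths), "n50": n50(lengths)}


# Hypothetical subread lengths in bp, for illustration only.
print(summarize([9000, 7000, 5000, 3000, 1000]))
```

    For a real file, `summarize(fastq_read_lengths("filtered_subreads.fastq"))` would give the equivalent summary, where the file name is a placeholder. Running it on each strain's data gives a like-for-like comparison.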
    With 22 contigs it is possible to run BridgeMapper to order the contigs with the remaining PacBio reads, overlap the contigs using minimus2 and validate using resequencing. Think of it as manual finishing. I would then use the Illumina reads to check the final base accuracy.
    You mentioned contigs with lower coverage. Is it possible that the sample is not perfectly clonal, and you are seeing a minor population that is breaking the assembly?
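    One quick way to see which contigs stand out is to compare each contig's mean coverage with the median across contigs. This is only a sketch: the coverage values are invented, the 0.5 cutoff is an arbitrary assumption, and `low_coverage_contigs` is a hypothetical helper, not part of any PacBio tool.

```python
from statistics import median


def low_coverage_contigs(coverage_by_contig, fraction_of_median=0.5):
    """Return names of contigs whose mean coverage is below a fraction of the median."""
    typical = median(coverage_by_contig.values())
    cutoff = fraction_of_median * typical  # arbitrary threshold, adjust as needed
    return sorted(name for name, cov in coverage_by_contig.items() if cov < cutoff)


# Hypothetical mean coverage per contig, for illustration only.
example_coverage = {
    "contig_1": 82.0,
    "contig_2": 79.5,
    "contig_3": 85.1,
    "contig_4": 12.3,
    "contig_5": 9.8,
}

print(low_coverage_contigs(example_coverage))
```

    With many small contigs the plain median can be pulled down, so taking the reference coverage from the few longest contigs may be a better baseline.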
