  • iltisanni
    Member
    • Mar 2017
    • 21

    Nanopore - circular Assembly

    Hi,

    we successfully sequenced a DNA sample and assembled the genome with Canu.
    We got one circular contig, which is perfect, but the ends of the contig overlap.

    Because Nanopore reads are very long, some reads at the end of the contig carry the same sequence as reads at the beginning, and vice versa.
    Canu also reports that the contig is circular.

    Now we want to trim the redundant sequence at the beginning and the end of the contig to get one linear contig without overlaps.

    I cannot find any software to help us there except "circlator". I guess it's the minimus2 function we have to use here, but this function depends on the AMOS software, which seems to be impossible to install on Ubuntu 18.04.

    Can anyone help us here? Maybe any alternative software to circlator?


    Of course, we could always trim the contig manually by finding where the end of the contig matches the beginning and trimming at that position, or use a script that does the same... but the best code is still the code that someone else has already written :-)
    Last edited by iltisanni; 05-11-2018, 04:04 AM.
  • Markiyan
    Senior Member
    • Sep 2010
    • 126

    #2
    You can also use BLAST or MUMmer to detect the overlapping ends...

    First you need to determine by how much the ends overlap.
    Then you can save the non-overlapping portion plus a single copy of the overlapping sequence to a file.

    You can detect the overlapping ends of the contig(s) with standalone BLAST by aligning the sequence against itself and using the master-slave alignment output format. Lower the expect value to 1e-50 or less (-e 1e-50) and raise the word size to somewhere between 16 and 64 bp (e.g. -W 32).
    A dotplot or MUMmer alignment of the sequence against itself may also be useful.

    Using the information above, you can decide which base range to keep so that the ends no longer overlap.
    Then open your sequence in Artemis or a similar editor, choose Select->Base Range,
    and save the selected base range to a FASTA file: File->Write->Bases of selection->Fasta format.

    • iltisanni
      Member
      • Mar 2017
      • 21

      #3
      Thank you. You helped me a lot, and your suggestion to align the sequence against itself was right. I found the trimming point and trimmed the FASTA with a simple cat X.fasta | cut -c 1-XXX > trimmed.fasta, after first deleting the header line and then adding it back to trimmed.fasta afterwards.
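
      One caveat with the cut approach: cut -c works on each line separately, so it only gives the intended result when the whole sequence sits on a single line. A minimal Python sketch of the same trim that also copes with wrapped FASTA follows. The function name trim_fasta_record and its parameters are made up for illustration, and it assumes the file holds exactly one record.

```python
def trim_fasta_record(fasta_text, keep_bases, width=60):
    """Keep the first `keep_bases` bases of a single-record FASTA string.

    Hypothetical helper: assumes exactly one record (one '>' header).
    """
    lines = fasta_text.strip().splitlines()
    header = lines[0]
    if not header.startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    # Join the sequence lines first so that wrapped FASTA is handled correctly.
    sequence = "".join(line.strip() for line in lines[1:])
    if not 0 < keep_bases <= len(sequence):
        raise ValueError("keep_bases is outside the sequence length")
    trimmed = sequence[:keep_bases]
    # Re-wrap the trimmed sequence at `width` characters per line.
    wrapped = "\n".join(
        trimmed[i:i + width] for i in range(0, len(trimmed), width)
    )
    return f"{header}\n{wrapped}\n"


if __name__ == "__main__":
    example = ">tig_example\nACGTACGT\nACGTAC\n"
    print(trim_fasta_record(example, keep_bases=10, width=8), end="")
```

      The header is kept in place, so there is no need to delete it and add it back by hand.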

      I found this information directly in the canu documentation:


      --->

      An alternative is to run MUMmer to get self-alignments on the contig and use those trim points. For example, assuming the circular element is in tig00000099.fa. Run:

      nucmer -maxmatch -nosimplify tig00000099.fa tig00000099.fa
      show-coords -lrcTH out.delta


      to find the end overlaps in the tig. The output would be something like:

      1 1895 48502 50400 1895 1899 99.37 50400 50400 3.76 3.77 tig00000001 tig00000001
      48502 50400 1 1895 1899 1895 99.37 50400 50400 3.77 3.76 tig00000001 tig00000001

      means trim to 1 to 48502. There is also an alternate writeup.

      <---
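
      Reading the trim point off the show-coords output can be scripted. The sketch below assumes the column order shown in the example above (S1, E1, S2, E2, LEN1, LEN2, %IDY, LENR, LENQ, COVR, COVQ, then the two sequence names) and looks for an alignment that starts at base 1 of the contig and ends at its last base. The function name find_trim_point is hypothetical.

```python
def find_trim_point(show_coords_text):
    """Return the S2 coordinate of the end overlap, or None if none is found.

    Assumes `show-coords -lrcTH` style rows with 13 whitespace-separated
    fields, as in the example output quoted above.
    """
    for line in show_coords_text.splitlines():
        fields = line.split()
        if len(fields) < 13:
            continue  # skip blank or unexpected lines
        s1, e1, s2, e2 = (int(value) for value in fields[:4])
        query_length = int(fields[8])
        if fields[11] != fields[12]:
            continue  # only self-alignments are of interest here
        if (s1, e1) == (s2, e2):
            continue  # the trivial alignment of the contig with itself
        # Start of the contig aligned against the very end of the contig.
        if s1 == 1 and e2 == query_length:
            # The quoted documentation trims to 1..S2; use S2 - 1 instead
            # if base S2 should count as a duplicate of base 1.
            return s2
    return None


if __name__ == "__main__":
    rows = (
        "1 1895 48502 50400 1895 1899 99.37 50400 50400 3.76 3.77 "
        "tig00000001 tig00000001\n"
        "48502 50400 1 1895 1899 1895 99.37 50400 50400 3.77 3.76 "
        "tig00000001 tig00000001\n"
    )
    print(find_trim_point(rows))
```

      Real overlaps from noisy reads may not start exactly at base 1 or end exactly at the last base, so the strict test here would need some tolerance in practice.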
      Last edited by iltisanni; 05-14-2018, 12:28 AM.

      • Ali May
        Member
        • Aug 2016
        • 13

        #4
        Originally posted by iltisanni View Post
        Hi,

        I cannot find any software to help us there except "circlator". I guess it's the minimus2 function we have to use here, but this function depends on the AMOS software, which seems to be impossible to install on Ubuntu 18.04.

        Hi, I use Circlator in similar scenarios. I think you can just use the 'normal' Circlator function and not specifically 'minimus2', which indeed is a hassle as far as I remember.

        Code:
        circlator all <assembly.fasta> <corrected_longreads_from_canu.fasta> <output_folder> --threads <nr_of_threads>
        Then I check the output of

        Code:
        04.merge.circularise_details.log
        in the output folder to hopefully see a line like

        Code:
        [merge circularise_details]	scaffold1|size4159270|arrow	Circularized: yes
        Then the file
        Code:
        06.fixstart.fasta
        is the final output file which should have fixed coordinates without overlaps etc. Let me know if this helps.
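
        Checking the log can also be scripted. This is only a sketch: it assumes every relevant line has the three tab-separated fields shown in the example line above (tag, contig name, "Circularized: yes" or another value), and the function name circularised_contigs is made up for illustration.

```python
def circularised_contigs(log_text):
    """Map contig name -> True/False from 'Circularized: ...' log lines.

    Assumes tab-separated lines shaped like the example quoted above;
    any line that does not match that shape is ignored.
    """
    results = {}
    for line in log_text.splitlines():
        fields = line.split("\t")
        if len(fields) == 3 and fields[2].startswith("Circularized:"):
            status = fields[2].split(":", 1)[1].strip()
            results[fields[1]] = (status == "yes")
    return results


if __name__ == "__main__":
    example = (
        "[merge circularise_details]\t"
        "scaffold1|size4159270|arrow\tCircularized: yes\n"
    )
    for name, ok in circularised_contigs(example).items():
        print(name, "circularised" if ok else "not circularised")
```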

        • iltisanni
          Member
          • Mar 2017
          • 21

          #5
          Originally posted by Ali May View Post
          Hi, I use Circlator in similar scenarios. I think you can just use the 'normal' Circlator function and not specifically 'minimus2', which indeed is a hassle as far as I remember.
          I'm not sure about the "fixstart" option. We want exactly what is described for the "minimus2" option, but "fixstart" just sets a new starting point at the first dnaA gene it finds. It does not circularize contigs by merging overlapping contigs, if I'm not mistaken...

          • Ali May
            Member
            • Aug 2016
            • 13

            #6
            Originally posted by iltisanni View Post
            I'm not sure about the "fixstart" option. We want exactly what is described for the "minimus2" option, but "fixstart" just sets a new starting point at the first dnaA gene it finds. It does not circularize contigs by merging overlapping contigs, if I'm not mistaken...
            I see, although the option I suggested was 'all', which does include circularisation (https://github.com/sanger-pathogens/...wiki/Task:-all). However it's true that it includes also the 'fixstart' option, so in your case not ideal as I understand.

            • iltisanni
              Member
              • Mar 2017
              • 21

              #7
              Originally posted by Ali May View Post
              I see, although the option I suggested was 'all', which does include circularisation (https://github.com/sanger-pathogens/...wiki/Task:-all). However it's true that it includes also the 'fixstart' option, so in your case not ideal as I understand.
              Oh hey... I just noticed the "merge" function, which is included in the "all" option.

              I guess this does what I want... I will try it. All the other functions that come with the "all" option are not needed in my case.

              The only thing I don't get is whether the "merge" function uses SPAdes for anything. And if SPAdes is used, what for?
              My assembler is Canu because it seems to be the best right now for Nanopore reads, so nothing with SPAdes...
              Last edited by iltisanni; 05-14-2018, 05:25 AM.
