  • Derek_S
    Junior Member
    • Jun 2011
    • 1

    Duplicating contigs in caf file

    Hi all,

    I'm wondering if anyone has experience duplicating contigs for the final stitching of a de novo assembly of a bacterial genome?

    I've done a 454/Illumina hybrid assembly in MIRA and am now trying to reduce the number of contigs by stitching them together manually in GAP5. I have a decent reference genome (190 contigs for a 7 Mbp genome), so I know where some of the repetitive 16S rRNA regions should be and can use them to join some of the larger contigs. The problem arises when I join a non-repetitive contig to a repetitive one: there is only one copy of the repetitive contig, even though it should be represented in the genome 4-5 times. Is there a way to duplicate these contigs so that I can use them more than once and stitch the genome together better?

    Thanks

    Derek
  • BaCh
    Member
    • May 2008
    • 81

    #2
    Originally posted by Derek_S
    I'm wondering if anyone has experience in duplication of contigs for the final stitching of a de novo assembly of a bacterial genome?
    You cannot simply duplicate a whole contig, as this would also duplicate all the reads it contains, and duplicate read names will break MIRA, gap4, gap5, etc.

    Two solutions:
    a) duplicate the contigs completely, but change all the read names they contain
    b) duplicate the contigs by taking just the consensus and inserting it as a single read into the project.

    I'd go the second way, as the first involves a lot of work by hand. Doing this is quite easy:
    1. Use "convert_project -f caf -t fasta -s" to convert the CAF to single consensus FASTA files.
    2. Take the consensus you want, rename the sequence to something you can distinguish (e.g. "RNAcopy_01"), then use "convert_project -f fasta -t caf -m" to make a CAF.
    3. Append the new CAF containing RNAcopy_01 to the complete project CAF via "cat".
    4. Rinse and repeat from step 2 for each further copy you need.

    B
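
The renaming in step 2 can be scripted when several copies are needed. The following is a minimal Python sketch, not part of MIRA: the helper names `parse_fasta` and `renamed_copies`, the contig name `contig_rep_c12`, and the example sequence are all hypothetical. It only produces uniquely named FASTA records ("RNAcopy_01", "RNAcopy_02", ...); converting each one to CAF and appending it to the project still uses convert_project and cat as described in the post.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (name, sequence) tuples."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def renamed_copies(sequence, n_copies, prefix="RNAcopy_", width=60):
    """Return {new_name: fasta_text} with one uniquely named record per copy."""
    copies = {}
    for i in range(1, n_copies + 1):
        new_name = f"{prefix}{i:02d}"
        # Wrap the sequence to fixed-width lines for readability.
        lines = [sequence[j:j + width] for j in range(0, len(sequence), width)]
        copies[new_name] = f">{new_name}\n" + "\n".join(lines) + "\n"
    return copies


# Hypothetical consensus of a repetitive contig, as exported in step 1.
consensus_fasta = ">contig_rep_c12\nACGTACGTAC\nGGTTAACC\n"
records = dict(parse_fasta(consensus_fasta))
copies = renamed_copies(records["contig_rep_c12"], n_copies=4)
for fasta_text in copies.values():
    print(fasta_text, end="")
```

Each printed record would be saved to its own file before conversion, so that every copy enters the project under a distinct read name.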
