Removing contaminants (bacteria, phage) from genomic DNA sequences

  • Removing contaminants (bacteria, phage) from genomic DNA sequences

    Hi Everyone,
    I have paired-end genomic DNA reads that I want to assemble de novo, but I want to 'clean' them up first, i.e. remove contaminants, before assembly.
    I have already trimmed low-quality reads and removed adapters. I would prefer to map the reads, or BLAST them, against a bacterial and/or phage database, then keep the unmapped reads for downstream analysis.

    Could anyone please guide me on how to approach this?

    I was thinking of downloading the bacterial and/or phage genomes to my local computer, but there are currently thousands of genomes.

    Ideas and suggestions will be, as always, very appreciated!
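
    The "keep the unmapped reads" step can be sketched as follows. This is only a minimal illustration, assuming the reads were mapped as pairs to the contaminant references by an aligner that writes SAM. The function name is hypothetical; the two flag bits (0x4, read unmapped; 0x8, mate unmapped) come from the SAM format. A pair is kept only when neither mate aligned.

```python
def unmapped_pair_names(sam_lines):
    """Return the names of read pairs in which neither mate aligned."""
    READ_UNMAPPED = 0x4   # SAM flag bit: this read is unmapped
    MATE_UNMAPPED = 0x8   # SAM flag bit: the mate of this read is unmapped
    keep = set()
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & READ_UNMAPPED and flag & MATE_UNMAPPED:
            keep.add(name)
    return keep
```

    Requiring both mates to be unmapped is the conservative choice: if either mate hits a contaminant, the whole pair is dropped.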

  • #2
    I used DeconSeq for this, but I had to install and run it locally, since my job sat in the queue for two weeks and then vanished (this was back in November).

    I was lucky: only a very tiny trace of fungal and bacterial ribosomal genes. I was glad to see those, though, since it indicated that the filter worked.

    But of course you can only check for contamination from sequenced organisms that are in the DeconSeq database. I would be curious whether there could be a more general filter, but I can't really see how for a newly assembled genome.

    Edit: Oh, and if you do this with assembled contigs instead of reads, be aware that each sequence is flagged as contaminated or not as a whole. I chopped up my assembled sequence into scaftigs prior to running (does anyone else ever use the term scaftigs?)
    Last edited by dsenalik; 03-18-2014, 01:56 PM.
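
    For illustration, chopping scaffolds into scaftigs might look like the sketch below. It assumes "scaftigs" means the gap-free pieces obtained by splitting each scaffold at runs of N, which is the usual sense of the term. The function name and the min_gap parameter are hypothetical and are not taken from DeconSeq or from the post above.

```python
import re

def split_into_scaftigs(name, sequence, min_gap=1):
    """Split one scaffold into gap-free pieces at runs of at least min_gap Ns."""
    pieces = re.split(r"[Nn]{%d,}" % min_gap, sequence)
    # Drop empty pieces (e.g. from leading or trailing Ns) and number the rest.
    non_empty = (piece for piece in pieces if piece)
    return [(f"{name}_{i}", piece) for i, piece in enumerate(non_empty, start=1)]
```

    Each piece then gets its own contaminated-or-not call, so a small contaminant stretch no longer flags the whole scaffold.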

    • #3
      You could BLAST a few thousand reads, see what they hit, and then download only the genomes of organisms that appear to be contaminants for your filtering. If you have a reference for your target genome, map to the reference first, then BLAST only the unmapped reads.
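
      Pulling out "a few thousand reads" for BLAST could be done with a random subsample, roughly as sketched here. This assumes plain four-line FASTQ records (no wrapped sequence lines); the function names are hypothetical. It uses reservoir sampling, so the file is read once and only n records are held in memory, and it writes FASTA because that is what BLAST takes as a query.

```python
import random

def sample_fastq_records(lines, n, seed=0):
    """Reservoir-sample up to n four-line FASTQ records from an iterable of lines."""
    rng = random.Random(seed)
    reservoir, record, seen = [], [], 0
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:                  # one complete FASTQ record
            seen += 1
            if len(reservoir) < n:
                reservoir.append(tuple(record))
            else:
                j = rng.randrange(seen)       # uniform in [0, seen)
                if j < n:
                    reservoir[j] = tuple(record)
            record = []
    return reservoir

def to_fasta(records):
    """Format sampled records as FASTA text for use as a BLAST query."""
    return "".join(f">{header[1:]}\n{seq}\n" for header, seq, _, _ in records)
```

      Taking the first few thousand reads of the file would also work, but those often come from the edge of a flow cell and can be lower quality than average, so a random sample is a safer default.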

      Once you have references, BBDuk or BBMap can decontaminate using k-mers or mapping, respectively, with the "outu" (output unmapped/unmatched) stream.
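
      To make the k-mer idea concrete, here is a toy pure-Python sketch of k-mer based filtering. It is not BBDuk's implementation and would be far too slow for real data; it only shows the principle: index every k-mer of the contaminant references on both strands, then send any read sharing at least one k-mer to the "matched" stream and everything else to the "unmatched" stream. All names here are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]

def build_kmer_set(reference_seqs, k):
    """Collect every k-mer from the references, on both strands."""
    kmers = set()
    for seq in reference_seqs:
        seq = seq.upper()
        for strand in (seq, reverse_complement(seq)):
            for i in range(len(strand) - k + 1):
                kmers.add(strand[i:i + k])
    return kmers

def split_reads(reads, kmers, k):
    """Split (name, sequence) reads into those that share a k-mer and those that do not."""
    matched, unmatched = [], []
    for name, seq in reads:
        s = seq.upper()
        hit = any(s[i:i + k] in kmers for i in range(len(s) - k + 1))
        (matched if hit else unmatched).append((name, seq))
    return matched, unmatched
```

      The unmatched list plays the role of the "outu" stream described above. With real paired-end data, both mates of a pair would normally be kept or discarded together.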


      • #4
        My approach has been to do the assembly first, then try to remove contaminant contigs. This greatly reduces the amount of computation that has to be done. Chimeric contigs of contaminant and target should be extremely rare.

        If you have a microbial genome, IMG has some tools for finding contamination, e.g.:
        https://img.jgi.doe.gov/er/doc/Singl...tamination.pdf
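
        The assemble-first approach could be sketched like this, assuming the contigs were BLASTed against contaminant genomes with the default tabular output (-outfmt 6: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function name and both thresholds are illustrative assumptions, not recommended values.

```python
def contaminant_contigs(blast_lines, contig_lengths, min_identity=95.0, min_coverage=0.5):
    """Flag contigs for which a single hit covers enough of the contig at high identity."""
    flagged = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):   # skip blanks and comments
            continue
        fields = line.rstrip("\n").split("\t")
        contig, pident = fields[0], float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        covered = abs(qend - qstart) + 1               # query bases in this hit
        if pident >= min_identity and covered / contig_lengths[contig] >= min_coverage:
            flagged.add(contig)
    return flagged
```

        Requiring a hit to cover a sizeable fraction of the contig avoids discarding target contigs that merely contain a short conserved region, such as a ribosomal gene. Judging each hit on its own is a simplification, since several shorter hits to the same contaminant are not added together.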
