  • chris_bioinfo
    Junior Member
    • Aug 2012
    • 8

    How to use SAM files in MEGAN (metagenomics)

    Hello everybody,

    I have two environmental bacterial datasets sequenced on Illumina for metagenomics (approx. 14 million paired-end reads for one dataset and 16 million for the other; ~70 bp read length). Since I knew that the sequences consist only of bacteria, I downloaded all bacterial sequences from NCBI (14 GB file size) instead of the nr/nt database and started standalone BLAST as suggested in the MEGAN manual. It ran continuously for 9 days, and then I had to stop the process because the BLAST result file had grown to more than 45 GB. I know this is not a memory issue. I then did the alignment with Bowtie (bowtie-0.12.7), which gave me a SAM alignment file (7 GB; 12 percent of the reads aligned to the reference). I also downloaded the GI-to-NCBI-taxon-ID mapping file from the MEGAN website (the bin file). I loaded both files (SAM and bin) exactly as described in the manual, but MEGAN gives me no result.

    Can you please help me figure out what I did wrong?

    I appreciate your help.

    Christopher
  • DZhang
    Senior Member
    • Jun 2010
    • 177

    #2
    Hi Chris,

    Running BLAST on Illumina reads is not recommended because of the extreme computational cost. Before getting into your experimental design, can you share what you intend to achieve with your sequencing project?

    Best regards,
    Douglas


    • jimmybee
      Senior Member
      • Sep 2010
      • 119

      #3
      Perhaps run something like QIIME first. It will do 16S identification and will reduce the size of your dataset (as a FASTA file) so you can run it in MEGAN. I assume you're using MEGAN for functional analysis?


      • DZhang
        Senior Member
        • Jun 2010
        • 177

        #4
        MetaPhlAn may be the right tool for this.

        Best regards,
        Douglas


        • chris_bioinfo
          Junior Member
          • Aug 2012
          • 8

          #5
          Thanks for the replies.

          Well, I want a complete metagenomic analysis: how many and which species are in the sample, and the phylogeny too. Is this something this program lets me do?

          chris


          • DZhang
            Senior Member
            • Jun 2010
            • 177

            #6
            MetaPhlAn can do that.

            Best regards,
            Douglas


            • cliffbeall
              Senior Member
              • Jan 2010
              • 144

              #7
              I suspect MEGAN might not be able to parse the taxon IDs from your alignment results because the reference names in the database you're using are formatted slightly differently from what it expects. You might be able to tweak it to get it working.
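One way to test this suspicion is to look at how the reference names actually appear in the SAM file. The sketch below is a minimal, hypothetical helper (the function name and the returned keys are made up for illustration). It assumes the NCBI sequences carry `gi|<number>|` style identifiers in their headers, which is what a GI-to-taxon mapping would need; whether MEGAN's importer requires exactly this form is not confirmed here.

```python
import re
from collections import Counter

# Assumption: NCBI-style reference names such as "gi|49175990|ref|NC_000913.2|".
GI_PATTERN = re.compile(r"gi\|(\d+)")


def summarize_sam_references(lines):
    """Count alignments and check whether reference names carry a GI number."""
    total = mapped = with_gi = 0
    ref_counts = Counter()
    for line in lines:
        # Skip SAM header lines (they start with '@') and blank lines.
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        total += 1
        flag = int(fields[1])
        rname = fields[2]
        # FLAG bit 0x4 marks an unmapped read; RNAME is "*" in that case.
        if flag & 0x4 or rname == "*":
            continue
        mapped += 1
        ref_counts[rname] += 1
        if GI_PATTERN.search(rname):
            with_gi += 1
    return {
        "total": total,
        "mapped": mapped,
        "mapped_with_gi": with_gi,
        "top_references": ref_counts.most_common(5),
    }


if __name__ == "__main__":
    demo = [
        "@SQ\tSN:gi|49175990|ref|NC_000913.2|\tLN:4639675",
        "r1\t0\tgi|49175990|ref|NC_000913.2|\t100\t255\t70M\t*\t0\t0\tACGT\tIIII",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    ]
    print(summarize_sam_references(demo))
```

To run it on a real file, pass an open file handle instead of the demo list, for example `summarize_sam_references(open("alignment.sam"))`, where `alignment.sam` is a placeholder name. If `mapped_with_gi` is far below `mapped`, the reference names probably lack GI numbers, and a GI-based mapping file would have nothing to match against.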

              BLASTX against nr might be doable if you have access to a cluster. I blasted an Illumina dataset about the size of yours by chopping it into little pieces and farming them out to separate nodes. I had to buy more memory to run MEGAN on it, though.
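The "chop it into little pieces" step can be sketched roughly as follows. This is not the script used for that run, just a minimal illustration that assumes the reads are already in FASTA format (FASTQ reads would need converting first). The function names, the `chunk_00001.fasta` naming scheme, and the chunk size are all hypothetical.

```python
from pathlib import Path


def iter_fasta_chunks(lines, records_per_chunk):
    """Yield lists of FASTA lines, each holding at most records_per_chunk records."""
    if records_per_chunk < 1:
        raise ValueError("records_per_chunk must be at least 1")
    chunk = []
    n_records = 0
    for line in lines:
        if line.startswith(">"):
            # A new record starts; emit the current chunk first if it is full.
            if n_records == records_per_chunk:
                yield chunk
                chunk = []
                n_records = 0
            n_records += 1
        chunk.append(line)
    if chunk:
        yield chunk


def write_fasta_chunks(fasta_path, out_dir, records_per_chunk, prefix="chunk"):
    """Write each chunk to its own file and return the list of paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(fasta_path) as handle:
        chunks = iter_fasta_chunks(handle, records_per_chunk)
        for index, chunk in enumerate(chunks, start=1):
            out_path = out_dir / f"{prefix}_{index:05d}.fasta"
            out_path.write_text("".join(chunk))
            written.append(out_path)
    return written
```

Each output file can then be submitted as a separate BLAST job on the cluster, and the per-chunk result files combined afterwards. The right chunk size depends on the scheduler and node count, so treat any particular value as something to tune.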


              • chris_bioinfo
                Junior Member
                • Aug 2012
                • 8

                #8
                Thank you all for your replies.

                I used MetaPhlAn with the marker database its authors provide and am very happy with the results. But if I want to map against the database that I downloaded from NCBI, is that possible? As far as I understand, the MetaPhlAn database comprises ~2800 genome markers, so we might be losing information on genomes that are not currently in that list. I'm sorry if I am completely wrong; I'm a novice and trying to understand it.

                christopher


                • DZhang
                  Senior Member
                  • Jun 2010
                  • 177

                  #9
                  Hi Chris,

                  Please read the paper on MetaPhlAn. The authors screened for representative genes in each family/class. If you use a general database instead, I am not sure whether the results would be useful. I recommend contacting the author(s) to discuss this.

                  Best regards,
                  Douglas

