  • How to use SAM files in MEGAN metagenomics

    Hello everybody,

    I have two environmental bacterial datasets sequenced on Illumina for metagenomics (approx. 14 million paired-end reads for one dataset and 16 million for the other, ~70 bp read length). Since I knew that the sequences consist only of bacteria, I downloaded all the bacterial sequences from NCBI (14 GB file) instead of the nr/nt database and started standalone BLAST as suggested in the MEGAN manual. It ran continuously for 9 days and then I had to stop the process, since the BLAST result file was more than 45 GB. I know this is not a memory issue. Then I did the alignment with bowtie (bowtie-0.12.7), which gave me a SAM alignment file (7 GB; 12 percent of the reads aligned to the reference). I also downloaded the GI-to-NCBI-taxon-ID mapping file from the MEGAN website (the bin file). I then loaded both files (SAM and bin) exactly as described in the manual, but MEGAN gives me no result.

    Can you please help me figure out what I did wrong?

    I appreciate your help

    Christopher

  • #2
    Hi Chris,

    Running BLAST on Illumina reads is not recommended because of the extreme computational cost. Before getting into your experimental design, can you share what you intended to achieve with your sequencing project?

    Best regards,
    Douglas


    • #3
      Perhaps run something like QIIME first. It will do 16S identification and will reduce the size of your dataset (as a FASTA file) so you can run it in MEGAN. I assume you're using MEGAN for functional analysis?


      • #4
        MetaPhlAn may be the right tool for this.

        Best regards,
        Douglas


        • #5
          Thanks for the replies.

          Well, I want a complete metagenomic analysis: how many and which species are in the sample, and their phylogeny too. Is that what this program lets me do?

          chris


          • #6
            MetaPhlAn can do that.

            Best regards,
            Douglas


            • #7
              I suspect MEGAN might not be able to parse the taxon IDs from your alignment results because the header format is slightly different in the database you're using. You might be able to tweak it to get it working.
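
One way to check this suspicion before tweaking anything is to look at the reference names that actually ended up in the SAM file. The sketch below is only an illustration: it assumes the reference FASTA used NCBI-style headers of the form `gi|<number>|...`, and the function name `tally_reference_ids` is made up for this example. It says nothing about how MEGAN itself parses the file; it only reports how many mapped reads point at a reference name that carries a GI number.

```python
import re
from collections import Counter

# Assumption: GI numbers appear in reference names as "gi|<digits>".
GI_PATTERN = re.compile(r"gi\|(\d+)")


def tally_reference_ids(sam_lines):
    """Count mapped reads per GI number, and per reference name lacking one."""
    with_gi = Counter()
    without_gi = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # not a valid alignment record
        flag = int(fields[1])   # column 2: bitwise FLAG
        rname = fields[2]       # column 3: reference sequence name
        if flag & 0x4 or rname == "*":
            continue  # read is unmapped
        match = GI_PATTERN.search(rname)
        if match:
            with_gi[match.group(1)] += 1
        else:
            without_gi[rname] += 1
    return with_gi, without_gi
```

Called as `tally_reference_ids(open("alignments.sam"))` (the file name is a placeholder), a large `without_gi` count would suggest that the reference names do not contain anything a GI-to-taxon mapping file could match.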

              BLASTX against nr might be doable if you have access to a cluster. I blasted an Illumina dataset about the size of yours by chopping it into small pieces and farming them out to separate nodes. I had to buy more memory to run MEGAN on the results, though.
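
The "chopping into pieces" step can be sketched roughly as below. This is a minimal illustration, not the script used for the run described above: the function names, the output naming scheme, and the chunk size are all assumptions, and submitting each chunk to a cluster node is left to whatever scheduler is available.

```python
from itertools import islice


def read_fasta(handle):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, seq = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line, []
        elif line:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def chunk_records(records, chunk_size):
    """Group records into lists of at most chunk_size items."""
    it = iter(records)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def split_fasta(in_path, out_prefix, chunk_size):
    """Write chunks to <out_prefix>.0000.fasta, .0001.fasta, ... and return the paths."""
    paths = []
    with open(in_path) as handle:
        chunks = chunk_records(read_fasta(handle), chunk_size)
        for i, chunk in enumerate(chunks):
            path = f"{out_prefix}.{i:04d}.fasta"
            with open(path, "w") as out:
                for header, seq in chunk:
                    out.write(f"{header}\n{seq}\n")
            paths.append(path)
    return paths
```

Each resulting file can then be searched as an independent job, and the per-chunk result files concatenated afterwards. Smaller chunks spread the work more evenly but produce more files to manage.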


              • #8
                Thank you all for your replies.

                I used MetaPhlAn with the marker database its authors provide and am very happy with the results. But if I want to map against the database that I've downloaded from NCBI, is that possible? As far as I understand, the marker database comprises ~2800 genome markers, so we might be losing information on genomes that are not currently in that list. I'm sorry if I am completely wrong; I'm a novice and trying to understand it.

                christopher


                • #9
                  Hi Chris,

                  Please read the paper on MetaPhlAn. The authors screened for representative genes in each family/class. If you use a general database, I am not sure whether the results would be useful. I recommend you contact the author(s) to discuss.

                  Best regards,
                  Douglas
