  • How to use SAM files in MEGAN metagenomics

    Hello everybody,

    I have two environmental bacterial datasets sequenced on Illumina for metagenomics (approx. 14 million paired-end reads for one dataset and 16 million for the other, ~70 bp read length). Since I knew that the samples consist only of bacteria, I downloaded all bacterial sequences from NCBI (14 GB file size) instead of the nr/nt database and started standalone BLAST as suggested in the MEGAN manual. It ran continuously for 9 days and then I had to stop the process, since the BLAST result file was more than 45 GB. I know this is not a memory issue. Then I did the alignment with Bowtie (bowtie-0.12.7), which gave me a SAM alignment file (7 GB; 12 percent of the reads aligned to the reference). I also downloaded the GI-to-NCBI-taxon-ID mapping file (the bin file) from the MEGAN website. I then loaded both files (SAM and bin) exactly as described in the manual, and somehow it gives me no result.

    Can you please help me figure out what I did wrong?

    I appreciate your help.

    Christopher

  • #2
    Hi Chris,

    BLAST using Illumina reads is not recommended due to extreme computational challenges. Before getting into your experiment design, can you share what you had intended to achieve for your sequencing project?

    Best regards,
    Douglas

    • #3
      Perhaps run something like Qiime first. It will do 16S identification and will reduce the size of your dataset (as a fasta file) so you can run it in MEGAN. I assume you're using MEGAN for functional analysis?

      • #4
        MetaPhlAn may be the right tool for this.

        Best regards,
        Douglas

        • #5
          Thanks for the replies.

          Well, I want a complete metagenomic analysis: how many and which species are in the sample, and their phylogeny too. Is that something this program lets me do?

          chris

          • #6
            MetaPhlAn can do that.

            Best regards,
            Douglas

            • #7
              I suspect MEGAN might not be able to parse the taxon IDs from your alignment results because the reference names are formatted slightly differently in the database you're using. You might be able to tweak it to get it working.
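
One way to test that suspicion is to look at the reference names in the SAM file directly. The sketch below is only an illustration in Python. It assumes the reference FASTA used the older NCBI header style (`gi|<number>|...`) and that a GI-based mapping file can only help when the GI survives in the SAM RNAME column. The function name `summarize_sam_references` and the example records are made up for this example, and how MEGAN itself parses reference names is not shown here.

```python
import re
from collections import Counter

# Legacy NCBI FASTA header style, e.g. "gi|12345|ref|NC_000000.1|" (assumed format).
GI_PATTERN = re.compile(r"gi\|(\d+)\|")

def summarize_sam_references(lines):
    """Count SAM records by mapping status and by whether RNAME carries a GI."""
    counts = Counter()
    for line in lines:
        if line.startswith("@"):        # header line, not an alignment record
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:            # SAM records have 11 mandatory fields
            continue
        flag, rname = int(fields[1]), fields[2]
        if flag & 0x4 or rname == "*":  # FLAG bit 0x4 marks an unmapped read
            counts["unmapped"] += 1
        elif GI_PATTERN.search(rname):
            counts["mapped_with_gi"] += 1
        else:
            counts["mapped_without_gi"] += 1
    return counts

# Tiny made-up records; for a real file, pass an open file handle instead.
SEQ, QUAL = "A" * 70, "I" * 70
example = [
    "@SQ\tSN:gi|12345|ref|NC_000000.1|\tLN:1000000\n",
    f"r1\t0\tgi|12345|ref|NC_000000.1|\t100\t255\t70M\t*\t0\t0\t{SEQ}\t{QUAL}\n",
    f"r2\t4\t*\t0\t0\t*\t*\t0\t0\t{SEQ}\t{QUAL}\n",
    f"r3\t16\tNC_000000.1\t200\t255\t70M\t*\t0\t0\t{SEQ}\t{QUAL}\n",
]
summary = summarize_sam_references(example)
print(dict(summary))
```

If most mapped records land in `mapped_without_gi`, the reference names would be the first thing to look at before blaming the import step.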

              Blastx against nr might be doable if you have access to a cluster - I blasted an Illumina dataset about the size of yours, just chopping it into little pieces and farming it out to separate nodes. I had to buy more memory to run MEGAN on it, though.
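
The "chop it into little pieces" step could look roughly like the Python sketch below. It assumes the reads are already in FASTA format. The helper names (`read_fasta`, `chunk_records`, `write_chunks`), the output naming scheme, and the chunk size are all hypothetical, and submitting each piece to a cluster is left out because it depends on the local scheduler.

```python
from itertools import islice

def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif header is not None:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)

def chunk_records(records, chunk_size):
    """Yield lists holding at most chunk_size records each."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk

def write_chunks(fasta_path, prefix, chunk_size):
    """Write one FASTA file per chunk and return the list of file paths."""
    paths = []
    with open(fasta_path) as handle:
        for index, chunk in enumerate(chunk_records(read_fasta(handle), chunk_size)):
            path = f"{prefix}.{index:04d}.fasta"   # hypothetical naming scheme
            with open(path, "w") as out:
                for header, seq in chunk:
                    out.write(f">{header}\n{seq}\n")
            paths.append(path)
    return paths

# In-memory demonstration with five made-up reads, split two per chunk.
demo = [f">read{i}\nACGT\n" for i in range(5)]
sizes = [len(chunk) for chunk in chunk_records(read_fasta(demo), 2)]
print(sizes)
```

Each output file can then be searched as an independent job, and the per-chunk results combined before loading them into MEGAN.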

              • #8
                Thank you all for your replies.

                I used MetaPhlAn with the marker database provided by its authors and am very happy with the results. But is it possible to map against the database that I downloaded from NCBI instead? As far as I understand, the MetaPhlAn database comprises ~2800 genome markers, so we might be losing information on genomes that are not currently in that list. I'm sorry if I am completely wrong; I'm a novice and trying to understand it.

                christopher

                • #9
                  Hi Chris,

                  Please read the paper on MetaPhlAn. The authors screened for representative genes in each family/class. If you use a general database, I am not sure if the results are useful or not. I recommend you contact the author(s) to discuss.

                  Best regards,
                  Douglas
