  • ssharma
    Member
    • Oct 2010
    • 19

    Illumina Metagenomics data

    Hi all,
    I am a new member of this forum.
    I used to work with 454 data and am now switching to Illumina.
    I am getting around 300 million reads (100 bp) from a metagenomic sample, so I am unsure how to start my analysis.
    Previously I used approaches like blastx, but I now think that is not a good option.
    Has anyone done something like this, or does anyone have ideas on how to approach it?

    I would really appreciate your help.

    Thanks
    SS
  • rwenang
    Member
    • Jan 2009
    • 31

    #2
    Is it a 16S or WGS sample?


    • ssharma
      Member
      • Oct 2010
      • 19

      #3
      It's a metagenomic environmental sample (not 16S).


      • themerlin
        Member
        • Feb 2010
        • 51

        #4
        The real question is... what's the question? Are you looking for specific genes, or do you want to take an inventory of all genes?

        Have you tried assembling the reads yet? That's always a little sketchy with mixed communities, but it might be a good place to start.


        • ssharma
          Member
          • Oct 2010
          • 19

          #5
          Thanks for your input, themerlin.
          It is mainly going to be a community study (in a nutshell, I need to annotate all of the sequences).
          Yes, I tried assembly, but the result doesn't look good. I will try again with different programs.


          • kmewis
            Junior Member
            • Sep 2010
            • 7

            #6
            Is this just sequence data from DNA straight from the environment, or did you clone it into vectors first?

            I handle metagenomics data too, but we work in fosmids, so it's easy to assemble contigs from one fosmid (phred/phrap). Trying to do the whole environment at once will likely be tougher. Once I have contigs, we use blastx to look for homology and tools like fgenesb to find ORFs.


            • Dilipmohana
              Junior Member
              • Nov 2010
              • 2

              #7
              Hi, I am new to this site. Can anyone tell me how to work effectively in Schrodinger? Please pass along useful video tutorials if possible.


              • greigite
                Senior Member
                • Mar 2009
                • 145

                #8
                Take a look at MG-RAST for annotation of your data: http://metagenomics.nmpdr.org

                • Eric
                  Junior Member
                  • Oct 2009
                  • 1

                  #9
                  Hi,

                  What do you mean by "annotate"? Are you looking at "who is there" or "what are the functions"?
                  Do you have reference genomes at hand, or genomes of organisms close to the ones in your sample? Do you have an idea of the complexity of the population? Is it eukaryotes, microbes, or both?
                  You could first try to get an idea of the composition of your population by looking at some marker genes (e.g., finding 16S or 18S reads in your dataset by mapping against reference databases).
                  If you have known reference genomes, you can also map reads against them to evaluate the complexity/diversity.
                  For a first glimpse at functions, you can try UniRef50 or KEGG genes (or any other functionally classified reference protein set) as a proxy.
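
A minimal sketch of the marker-gene idea above, assuming the reads have already been searched against a 16S/18S reference with BLAST+ and saved in tabular format (`-outfmt 6`). The function name `tally_best_hits` and the file names in the comments are hypothetical, chosen only for illustration. It keeps the highest-scoring hit per read and counts reads per reference sequence, which gives a rough first view of composition rather than a proper taxonomic profile.

```shell
# Hypothetical upstream step (file and database names are placeholders):
#   blastn -query reads.fasta -db silva_16s -outfmt 6 -evalue 1e-10 > hits.tsv
#
# Count, per reference sequence, the reads whose best hit (highest bit score)
# falls on that reference. In -outfmt 6, column 1 is the read ID, column 2 the
# reference ID, and column 12 the bit score.
tally_best_hits() {
    awk -F'\t' '
        # Keep only the highest-scoring hit seen so far for each read.
        !($1 in best) || ($12 + 0 > best[$1]) { best[$1] = $12 + 0; ref[$1] = $2 }
        END {
            for (read in ref) count[ref[read]]++
            for (r in count) printf "%s\t%d\n", r, count[r]
        }
    ' "$1" | sort -t "$(printf '\t')" -k2,2nr -k1,1
}
```

Calling `tally_best_hits hits.tsv` prints one line per reference, most-hit first. Reads with several equally good hits are assigned to whichever appears first in the file, which is a simplification.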


                  • gridbird
                    Member
                    • Oct 2010
                    • 16

                    #10
                    You can try WebMGA: http://weizhong-lab.ucsd.edu/metagenomic-analysis/


                    • colindaven
                      Senior Member
                      • Oct 2008
                      • 417

                      #11
                      You can try approaches like:
                      - de novo assembly (MetaVelvet, ABySS, etc.)
                      - fast clustering (CD-HIT, RAMMCAP)
                      - reference-based alignment (Genometa)


                      • faozhi
                        Junior Member
                        • Dec 2011
                        • 5

                        #12
                        I would trim the reads (by quality, and remove adapters), then start assembling.
                        If you would like to know who is there, you could use MG-RAST or just blastn your trimmed reads against the Greengenes or SILVA 16S databases.
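
To make the quality-trimming step concrete, here is a rough sketch that clips low-quality bases from the 3' end of each read. It assumes Phred+33 quality encoding and strict 4-line FASTQ records. The function name `trim_3prime` and the default cutoffs (quality 20, minimum length 50) are illustrative assumptions, not values from this thread. It does not remove adapters, so a dedicated trimming tool is still the better choice for real data.

```shell
# Trim low-quality bases from the 3-prime end of each read in a FASTQ file.
# Usage: trim_3prime reads.fastq [min_quality] [min_length]
# Assumes Phred+33 qualities and unwrapped 4-line records.
trim_3prime() {
    awk -v minq="${2:-20}" -v minlen="${3:-50}" '
        # Build a character-to-ASCII-code lookup table for the quality string.
        BEGIN { for (i = 33; i < 127; i++) ord[sprintf("%c", i)] = i }
        NR % 4 == 1 { header = $0 }
        NR % 4 == 2 { seq = $0 }
        NR % 4 == 0 {
            qual = $0
            keep = length(qual)
            # Walk back from the 3-prime end while base quality is below the cutoff.
            while (keep > 0 && ord[substr(qual, keep, 1)] - 33 < minq) keep--
            # Drop reads that end up shorter than the minimum length.
            if (keep >= minlen)
                printf "%s\n%s\n+\n%s\n", header, substr(seq, 1, keep), substr(qual, 1, keep)
        }
    ' "$1"
}
```

For paired-end data, dropping a read from only one file breaks the pairing, so this sketch is only safe as-is for single-end reads or for inspection.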


                        • seb567
                          Senior Member
                          • Jul 2008
                          • 260

                          #13
                          You can reduce the volume of data by doing a de novo assembly, for example with Ray:

                          mpiexec -n 64 Ray -k 31 \
                           -p Sample/ERR011142_1.fastq.gz Sample/ERR011142_2.fastq.gz \
                           -p Sample/ERR011143_1.fastq.gz Sample/ERR011143_2.fastq.gz \
                           -o Assembly

                          Sébastien Boisvert
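
Once an assembly finishes, a quick way to judge whether it "looks good" is to summarize the contig lengths. The sketch below computes the contig count, total length, and N50 from any FASTA file. The function name `assembly_stats` is hypothetical, and the path in the usage note assumes the assembler writes its contigs to a file named `Contigs.fasta` inside the output directory, which should be checked against the actual output.

```shell
# Print contig count, total length, and N50 for a FASTA file.
# Sequences may be wrapped over several lines.
assembly_stats() {
    awk '
        # A header line closes the previous sequence: emit its length.
        /^>/ { if (len > 0) print len; len = 0; next }
        { len += length($0) }
        END { if (len > 0) print len }
    ' "$1" | sort -nr | awk '
        { lens[NR] = $1; total += $1 }
        END {
            # N50: length of the contig at which the running sum of
            # lengths (longest first) reaches half of the total.
            half = total / 2
            for (i = 1; i <= NR; i++) {
                running += lens[i]
                if (running >= half) { n50 = lens[i]; break }
            }
            printf "contigs=%d total_bp=%d N50=%d\n", NR, total, n50
        }
    '
}
```

Usage, under the assumption above: `assembly_stats Assembly/Contigs.fasta`. For mixed communities N50 alone can mislead, since a few abundant genomes may dominate the long contigs.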
