Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Cuffdiff or DESeq

    Without starting some kind of pointless back and forth, I am wondering what are the advantages of using Cuffdiff or DESeq to identify differentially expressed genes from RNA-seq data. The statistical side is a little over my head at the moment but is also what spurs my interest in this question. I find Simon Anders posts very informative so I am inclined to use DESeq but everything R and Bioconductor is a little biologist unfriendly. The Cuffdiff workflow is easier to follow but am I sacrificing anything by using Cuffdiff?
    --------------
    Ethan

  • #2
    The vignette for DESeq is exceptionally well written and makes it easy for a biologist with no R experience to follow it. I say this having at one time being that biologist with NO understanding of R.

    Comment


    • #3
      There's a discussion of this going on over at the bioconductor mailing list. I'm quoting some of the recent posts from the last two days:

      Dear Bioconductor list,

      Sometime ago Simon Anders explained the difference
      between DeSeq and Cuffdiff as follows:

      "If you have two samples, cuffdiff tests, for each transcript, whether
      there is evidence that the concentration of this transcript is not the
      same in the two samples.

      If you have two different experimental conditions, with replicates for
      each condition, DESeq tests, whether, for a given gene, the change in
      expression strength between the two conditions is large as compared to
      the variation within each replicate group."

      Current language on the Cuffdiff site suggests that the current version
      of that program tests for whether the change is significant compared to
      changes in each condition.





      Can someone please comment on the relative merits of Cuffdiff and
      DeSeq. I ask here because our sequencing core delivers results
      based on Cuffdiff and I want to know if I should redo it using
      DeSeq,I would greatly appreciate any guidance in this matter.

      Thanks and best wishes,
      Rich
      Reply 1

      Not directly relevant to gene-level RNA-seq DE calls, but rather for
      exon-level DE,
      I found it useful to read this:

      In particular, section 4.3 on page 11, and supplementary figures S7 and S8
      on page 19.

      I was informed by a coworker that since everyone uses
      BowTie-TopHat-Cufflinks-Cuffdiff, it is the sensible thing to do.
      Conversations with people who know what they are doing (Terry Speed &
      BCGSC) suggest the matter is not yet settled.
      So I retrieved ~1TB of BAMs, extracted the reads, and started looking into
      how that compares to DEXSeq and/or subread.

      It would be incredibly informative if the Cufflinks and DEXSeq authors had
      time to weigh in on their strengths/weaknesses. DEXSeq & cummeRbund both
      offer nice tools for exploring the results; I am curious which pipeline
      fits best for my needs.

      Thanks for bringing this up.
      Reply 2 (from me)

      I'm also in the same boat as Rich. I run a new bioinformatics core
      here and I'm building a pipeline for RNA-seq. Cufflinks for some time
      has supported biological replicates, and I'm also curious about the
      relative merits of using
      bowtie/tophat-cufflinks-cuffmerge-cuffdiff-cummeRbund versus using
      tophat-HTSeq?-customScriptForCreatingMatrix?-DESeq. Cufflinks also
      gives me a host of other tests (differential splicing load,
      differential TSS usage, differential coding output, etc), which also
      seem useful for certain applications.

      On a related note, does anyone have a workflow for taking multiple bam
      files, running HTSeq-count (or another program), plus some other
      program or custom script, to produce a matrix of counts as input to
      DESeq?

      Stephen
      Reply 4:

      Dear Stephen,

      To your related note, you could have a look at the easyRNASeq package (bioC 2.10) for R (2.15). It reads in your annotation, your bam files and generate a count table for DESeq, all in R. It can actually do the first step of DESeq (estimating size library and dispersion) and give you back a normalized countDataSet object plus some validation plots as described in the DESeq vignettes (the same is true for edgeR). I'm about to push some changes in SVN to correct an issue that prevented the package vignette to be build. The package should be available in a couple of days as binary or you could install it directly from SVN. See http://wiki.fhcrc.org/bioc/SvnHowTo. The package URL is: https://hedgehog.fhcrc.org/bioconduc...cks/easyRNASeq.

      Getting the proper set of annotation is definitely the most important step in the whole process and the one that requires most attention, the rest is then pretty straightforward. What I mean by annotation is the description of your feature of interest, be it gene, transcript, exon, enhancers, etc... as genomic loci (chr, start, width, etc...). The main issue in defining the annotation is to avoid counting reads multiple times and the kind of annotation needed does of course depends on your project. If you are interested in looking at isoforms differential expression, you probably want to define synthetic exons (to avoid double counting) and process the obtained count table with DEXseq. If you're looking at gene expression, you would want to create gene models to avoid multiple counting and use these to create your count table. If you are interested in eRNAs, you can define enhancer loci as the count "feature". All this can be done relatively easily in R. Once you have the proper annotation that suits your need, running easyRNASeq is very straightforward. easyRNASeq accepts both RangedData and GRangesList as annotation input, amo ng other formats. easyRNASeq is in addition able to fetch annotations for you from different sources, but most of the time these would need to be post-processed. You can look at my post: "u! sing easyRNASeq examples" from 2 days ago for some examples and comments in addition to the vignette content.

      Cheers,

      Nico
      And finally, the post that Nico mentioned about easyRNAseq

      I've fixed a bug in easyRNASeq and made a couple of modifications to deal with novelties introduced in the IRanges and in the DESeq packages. I've just committed that whole bunch, so it will take ~2 days before a binary package is available. You can always install from SVN in the meanwhile.

      I've listed below the different examples you sent me in our email exchange last week and I've added some comments which I hope will prove useful to you. I've created a new "thread" to make it easier to read, and I'm confident it will prove useful to others. This leads me to thanking you for your patience, for providing the data and for giving me the opportunity to post such an example

      I hope this helps, do not hesitate to contact me if you have further questions and to suggest improvements.

      Cheers,

      Nico
      Code:
      ## load the library
      library(easyRNASeq)
      
      ## load the chromosome sizes
      ## Note that you can just provide a named list; using the BSgenome is not necessary (especially as the
      Hsapiens ones are huge >800Mb)
      library(BSgenome.Hsapiens.UCSC.hg19)
      chr.sizes=as.list(seqlengths(Hsapiens))
      
      ## set the wd directory to wherever you have your bam files
      ## that's just for this example, you could give that directory
      ## directly to the filesDirectory argument of the function
      setwd("Desktop/Francesco")
      
      ## get the bam filenames
      bamfiles=dir(getwd(),pattern="*\\.bam$")
      
      ## run easyRNASeq to get an RNAseq object.
      ## we use biomaRt to fetch the annotation directly from ensembl
      ## 
      rnaSeq <- easyRNASeq(filesDirectory=getwd(),
                           organism="Hsapiens",
                           chr.sizes=chr.sizes,
                           readLength=100L,
                           annotationMethod="biomaRt",
                           format="bam",
                           count="exons",
                           filenames=bamfiles[1],
                           outputFormat="RNAseq"
                           )
      
      ## Fetching the annotation that way is time consuming
      ## Using the RNAseq object you can extract the genic information
      ## and save them as an rda for a faster processing time
      ## later on
      gAnnot <- genomicAnnotation(rnaSeq)
      
      ## There are 244 "chromosomes" in that annotation, let's keep only what we need
      gAnnot <- gAnnot[space(gAnnot) %in% paste("chr",c(1:22,"X","Y","M"),sep=""),]
      save(gAnnot,file="gAnnot.rda")
      
      ## you could use it this way
      countTable <- easyRNASeq(filesDirectory=getwd(),
                               organism="Hsapiens",
                               chr.sizes=chr.sizes,
                               readLength=100L,
                               annotationMethod="rda",
                               annotationFile="gAnnot.rda",
                               format="bam",
                               count="exons",
                               filenames=bamfiles[1]
                               )
      
      ## using different annotation (makeTranscriptDB)
      library(GenomicFeatures)
      hg19.tx <- makeTranscriptDbFromUCSC(
                                          genome="hg19",
                                          tablename="refGene")
      
      ## easyRNASeq can deal with GRangesList object, so no need to modify it much, i.e. no need to convert it to a RangedData
      gAnnot <- exons(hg19.tx)
      ## change the metadata column name to suit easyRNASeq
      colnames(elementMetadata(gAnnot)) <- "exon"
      ## finally turn it into a GrangesList
      gAnnot <- split(gAnnot,seqnames(gAnnot))
      
      ## you could save it as an rda object, you can as well use it directly
      ## by using the annotationMethod "env" and the annotationObject arguments.
      ## In addition we will be selecting for a single chromosome: chr1
      ## as the corresponding name in your bam file is 1, that's what we'll use
      countTable <- easyRNASeq(filesDirectory=getwd(),
                               organism="Hsapiens",
                               chr.sizes=chr.sizes,
                               readLength=100L,
                               annotationMethod="env",
                               annotationObject=gAnnot,
                               format="bam",
                               count="exons",
                               filenames=bamfiles[1],
                               chr.sel="1"
                               )
      
      ## applying DESeq
      ## this will not yield very sensitive results as we have no replicates (biological)
      ## These are important for DESeq to accurately model the technical and biological variance
      ## With no replicates for every condition, the dispersion will be based on a "pooled" estimate
      ## making the differential expression call lose sensitivity. In addition, DESeq is in such case
      ## using a conservative approach (which is good) so you'd get even less significant results.
      
      ## Defining the conditions
      conditions <- c("A","B")
      names(conditions) <- bamfiles
      
      ## running DESeq
      ## NOTE that the two last arguments are for the DESeq estimateDispersions method. They are just
      transferred through.
      ## There are necessary as we have only 1 replicate per condition; see the DESeq vignette for more details.
      countDataSet <- easyRNASeq(filesDirectory=getwd(),
                                 organism="Hsapiens",
                                 chr.sizes=chr.sizes,
                                 readLength=100L,
                                 annotationMethod="env",
                                 annotationObject=gAnnot,
                                 format="bam",
                                 count="exons",
                                 filenames=bamfiles,
                                 chr.sel="1",
                                 outputFormat="DESeq",
                                 normalize=TRUE,
                                 conditions=conditions,
                                 fitType="local",
                                 method="blind"
                               )
      
      ## finally my session info
      > sessionInfo()
      R Under development (unstable) (2012-02-07 r58290)
      Platform: x86_64-apple-darwin10.8.0 (64-bit)
      
      locale:
      [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
      
      attached base packages:
      [1] parallel  stats     graphics  grDevices utils     datasets  methods  
      [8] base     
      
      other attached packages:
       [1] GenomicFeatures_1.7.16 AnnotationDbi_1.17.15  easyRNASeq_1.1.6      
       [4] ShortRead_1.13.12      latticeExtra_0.6-19    RColorBrewer_1.0-5    
       [7] Rsamtools_1.7.29       DESeq_1.7.6            locfit_1.5-6          
      [10] lattice_0.20-0         akima_0.5-7            Biobase_2.15.3        
      [13] BSgenome_1.23.2        GenomicRanges_1.7.24   Biostrings_2.23.6     
      [16] IRanges_1.13.24        genomeIntervals_1.11.0 intervals_0.13.3      
      [19] edgeR_2.5.9            limma_3.11.11          biomaRt_2.11.1        
      [22] BiocGenerics_0.1.4    
      
      loaded via a namespace (and not attached):
       [1] annotate_1.33.2    bitops_1.0-4.1     DBI_0.2-5          genefilter_1.37.0 
       [5] geneplotter_1.33.1 grid_2.15.0        hwriter_1.3        RCurl_1.91-1      
       [9] RSQLite_0.11.1     rtracklayer_1.15.7 splines_2.15.0     survival_2.36-12  
      [13] tools_2.15.0       XML_3.9-4          xtable_1.6-0       zlibbioc_1.1.1

      Comment


      • #4
        One of the issues I faced with DESeq was going from alignments (bam files) to the counts table that DESeq requires, e.g.:

        Code:
              Sample1 Sample2 Sample3 Sample4 Sample5
        Gene1    2614    7000    7119    6689    4893
        Gene2    1162    2939      72    4988     390
        Gene3    5799    8533    9263    8413    5635
        Gene4    8129    4023    1524    2270    1816
        Gene5    3536    4320    4927    2659    6795
        From numerous responses here and elsewhere, I started using HTSeq (specifically, htseq-count) to do this.

        htseq-count is dead simple to use. It takes an alignment and a GTF annotation file (e.g. htseq-count [options] <alignment> <gtf>) and produces a list of features and counts of reads that overlap those features (how overlapping reads are counted is specified by the options).

        WASH5P 3
        WDR18 201
        ZNF555 44
        ZNF57 82
        ZNF77 55
        psiTPTE22 0
        tAKR 0
        no_feature 6321
        ambiguous 196
        too_low_aQual 0
        not_aligned 0
        alignment_not_unique 15890

        I hacked out a perl script that takes a list of bam files, runs samtools view on them, pipes that output into htseq-count, and drops an output file. It does this for each bam file, dropping a temporary file containing all the features and counts for that particular alignment. I then pipe together a series of join commands to glue together all these temporary count text files that HTSeq produces. This gets me my table that I can use with DESeq.

        I've heard far too many examples of people using tophat-cufflinks-cuffmerge-cuffdiff for their RNA-seq workflow, and getting extremely different results from one version of cufflinks to the next. But when it takes some perl hacking around a python script to munge alignments into a format that R can read, it's currently easier IMHO to use the cufflinks solution. In addition, cufflinks gives you other useful information about splicing load, differential TSS usage, differential coding output, etc.

        I would like to see this discussion continue, and if anyone has any better solution for going from bam alignments to a DESeq-ready counts table, I'm all ears.

        Comment


        • #5
          Originally posted by turnersd View Post
          I hacked out a perl script that takes a list of bam files, runs samtools view on them, pipes that output into htseq-count, and drops an output file. It does this for each bam file, dropping a temporary file containing all the features and counts for that particular alignment.
          ...
          I would like to see this discussion continue, and if anyone has any better solution for going from bam alignments to a DESeq-ready counts table, I'm all ears.
          There's really no need to use Perl for that if you're also using HTSeq. HTSeq has been designed as a somewhat easy-to-use Python library -- I say 'somewhat' because in some very small points, it didn't quite do what I wanted.

          I've got a python script lying around that takes in multiple BAM files (or SAM files, if that's your liking) and spits out counts for each feature in a three-column format (feature, experiment, count) using HTSeq. I'm mapping to a transcriptome for the reads so haven't bothered with GTF files yet, but that's something that can be fairly easily plugged in using the HTSeq library. It would probably also be fairly easy to modify HTSeq-count to support no GFF files and multiple experiments.

          Once you have the three column format, it's a somewhat easy process to convert into a count matrix using the xtabs function in R.

          ... but I'm sitting at the wrong computer at the moment. I'll attach / embed the useful scripts tomorrow after I get in to work.

          Taking note of what else has been mentioned, easyRNASeq (which I hadn't heard about before) sounds like an even better tool for this given that it stays in R. As long as the BAM file reading/processing is quick enough (e.g. considering 5-50GB BAM files for each experiment), there's no reason not to do everything in R.

          Comment


          • #6
            Originally posted by chadn737 View Post
            The vignette for DESeq is exceptionally well written and makes it easy for a biologist with no R experience to follow it. I say this having at one time being that biologist with NO understanding of R.
            Agree. Really, it is one of the few Bioconductor vignettes that make any sense.


            Originally posted by gringer View Post
            Taking note of what else has been mentioned, easyRNASeq (which I hadn't heard about before) sounds like an even better tool for this given that it stays in R. As long as the BAM file reading/processing is quick enough (e.g. considering 5-50GB BAM files for each experiment), there's no reason not to do everything in R.
            Actually, I always look for any reason not to use R. R workflows are time consuming in both trying to put together a workflow and using a workflow once you have put it together.

            Anyway, enough of my ranting against Bioconductor. I'm still interested in opinions about what is happening under the hood of these methods.

            Originally posted by turnersd View Post
            I've heard far too many examples of people using tophat-cufflinks-cuffmerge-cuffdiff for their RNA-seq workflow, and getting extremely different results from one version of cufflinks to the next.
            Sometimes people look at data expression data sets that are nearly identical and end up concluding they are very different. For example, you are looking for differentially expressed genes in two data sets. You get a list of differentially expressed genes from software A. Software A gives you a value you use for cutoff say FDR. You make the cutoff and then run the same data set through software B and make the exact same arbitrary cutoff. You look at the list of differentially expressed genes in the two lists and they are quite different because of small variations put some genes above the arbitrary cutoff line and some below. In the conversion from a measurable variable to a nominal variable you lost a lot of information and it makes the data sets look very different when if you look before you made the arbitrary cutoff the data sets looked nearly identical. I'm not saying that's what people have done but it's at the very least possible.
            --------------
            Ethan

            Comment


            • #7
              I have HTseq-count in my pipeline. At the end I just import all the individual count files into R and create the table there.

              Or for those who are very unfamiliar with perl, R, or python. Just open the count files that are output by HTseq-count in excel. Then copy and paste to make your table and save it as a csv file. I realize this is labor intensive, but I would expect basically every scientist to at least know excel.

              Comment


              • #8
                Originally posted by ETHANol
                Anyway, enough of my ranting against Bioconductor. I'm still interested in opinions about what is happening under the hood of these methods.
                Cuffdiff actually uses the same underlying model as DESeq:

                For single isoform genes, Cuffdiff models the variance in fragment counts across replicates using the negative binomial distribution, similar to the method described by Anders and Huber (2010)



                However since Cufflinks attempts to test differential expression for isoforms, their model gets a lot more complex and they attempt adjust expression values with reads that map to multiple isoforms/genes.

                My understanding of statistics is far to rudimentary to actually speak on which performs better. For my own purposes, looking for expression differences between isoforms does not make my results anymore meaningful and also makes the types of analysis I am interested in far more difficult.

                Also, in every paper I have seen that compares various methods of testing differential expression, DESeq and EdgeR give very similar results, and DESeq gives fairly similar results to BaySeq, whereas Cufflinks can differ considerably from all of these.

                Cufflinks can also take a considerable time to run and even with extra steps, I can finish up using DESeq well before Cufflinks ever completes.

                If its not obvious by now, I prefer DESeq and for that matter EdgeR, over Cufflinks. It fits my needs better.

                Comment


                • #9
                  Hi all,
                  Originally posted by chadn737 View Post
                  For my own purposes, looking for expression differences between isoforms does not make my results anymore meaningful and also makes the types of analysis I am interested in far more difficult.
                  There are certainly cases when evaluating differences between isoforms is not only the preferred way to conduct the analysis, but is in fact the only way that you would identify a significant change in the expression of a gene. Additionally, biologically significant differential expression of a isoform can be missed when the smallest unit analyzed is a summary model of a 'gene'.

                  The increased difficulty in interpretation of cuffdiff output is something that I have tried to address with the 'cummeRbund' R/Bioconductor package. The hope is that making the cuffdiff output manageable in a familiar environment, while at the same time maintaining the relationships between genes, isoforms, promoters, etc, will begin to make the interpretation of cuffdiff output more engaging.

                  Comment


                  • #10
                    Originally posted by gringer View Post
                    ... but I'm sitting at the wrong computer at the moment. I'll attach / embed the useful scripts tomorrow after I get in to work.
                    I've attached my python script for doing basic contig counting. I use it like this (assuming all experiment IDs start with 'Sample_'):
                    Code:
                    $ ~/scripts/sam2rawCounts.py $(for x in $(ls -d Sample_*); do \
                    echo ${x}=$(ls ${x}/*contigs*.bam); done) > contig_reads.txt
                    In the absence of a consistent directory prefix for sample IDs, I create an 'experiment_dirs.txt' file that identifies the directories that contain the required BAM files:
                    Code:
                    $ ~/scripts/sam2rawCounts.py $(for x in $(cat experiment_dirs.txt); do \
                    echo ${x}=$(ls ${x}/*contigs*.bam); done) > contig_reads.txt
                    I then convert the output to a DESeq-compatible table in R using xtabs:
                    Code:
                    # get counts from transcriptome mapping
                    reads.3col.df <- read.table("contig_reads.txt", header = TRUE, stringsAsFactors = FALSE);
                    
                    ## convert to raw count table formatted for DESeq
                    experiment.counts <- xtabs(Reads ~ Contig + Experiment, data = reads.3col.df);
                    
                    ## assume no biological replicates
                    experiment.conditions <- sub("^Sample_","",colnames(experiment.counts));
                    
                    ## convert to DESeq CountDataSet object
                    experiment.cds <- newCountDataSet(unclass(experiment.counts), experiment.conditions);
                    Attached Files

                    Comment


                    • #11
                      In regards to the question of how to get whole-number counts appropriate for using DESeq, I would recommend using RSEM for quantification: http://deweylab.biostat.wisc.edu/rsem/

                      RSEM only does quantification of known-transcripts but does so in a very stable manner (comapred to cufflinks) that gives gene-level counts that are the sum of the transcript-level counts for that gene (which cufflinks does not do). It also is more stable from version to version and has less issues in "failing" to produce a count for certain isoforms/genes.

                      RSEM can take BAM files as input, but expects them to be aligned to the transcriptome (i.e. won't take the output of TopHat that includes both genomic and txome alignments) so you may end up re-doing alignment to use it. But RSEM will happily call bowtie for you if you give it fastq as input.

                      It will do probablistic assignment of multi-reads giving you counts that can be fractions of whole numbers, but simply rounding these numbers before inputting them to DEseq should give you better results than simply doing a raw counting of reads in genes/exons.

                      Comment


                      • #12
                        DESeq/edgeR also allow you to use generalized linear modelling to use more complicated statistical models. For example to do paired same analysis (e.g. healthy tissue vs cancer tissue from 10 patients). In this sort of situation I find that cuffdiff finds no genes/isoforms are differentially expressed because the patient-to-patient variance is too high.

                        On the other hand, I'm still to find a solution to the read counting that truly satisfies me. DESeq and edgeR get away without normalising for length because the two entities being compared are the same length. However, if you sum over all exons in a gene then this isn't necessarily the case: different exons might be being used in different samples. You can test exon-by-exon, but then have to work out how you go from differentially expressed exons to differentially expressed genes. I admit I haven't looked at RSEM, i'll go and do that.

                        Comment


                        • #13
                          Using cuffdiff means you are coming from tophat-cufflinks-cuffmerge workflow. I have found that this workflow is bugged and mixes up gene annotations. It can not be said with 100% certainty if the reported FPMKs indeed belong to the corresponding gene/transcript.

                          Please view my thread if interested:
                          Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


                          I am now using DESeq for my analysis.
                          Last edited by epi; 03-09-2012, 09:23 AM.

                          Comment


                          • #14
                            Originally posted by epi View Post
                            Using cuffdiff means you are coming from tophat-cufflinks-cuffmerge workflow. I have found that this workflow is bugged and mixes up gene annotations. It can not be said with 100% certainty if the reported FPMKs indeed belong to the corresponding gene/transcript.

                            Please view my thread if interested:
                            Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


                            I am now using DESeq for my analysis.
                            The thread you linked to doesn't show a bug, just an incorrect use of Cufflinks, as I just noted in that thread.

                            Regarding use of Cufflinks or RSEM with count-based tools like DEseq or edgeR - the authors of DEseq have noted on a number of occasions that it is NOT a good idea to use RSEM or Cufflinks with their tool. The reason is that those tools produce (either explicitly or internally during estimation of gene and transcript FPKM) an *estimate* of the number of fragments that came from each transcript. However, that estimate comes with *uncertainty*, and those tools don't account for it when testing for DE. Think of it this way - when you have multiple isoforms, you have to decide to assign your fragments among them. In addition to the biological variability that's present between replicates, there's now also variability in this assignment process. Those tools don't account for it, so using them with the estimated counts from transcript-level tools will lead to false positives.

                            Comment


                            • #15
                              Originally posted by sudders View Post

                              On the other hand, I'm still to find a solution to the read counting that truly satisfies me. DESeq and edgeR get away without normalising for length because the two entities being compared are the same length. However, if you sum over all exons in a gene then this isn't necessarily the case: different exons might be being used in different samples. .
                              DEseq and edgeR (and other tools that test raw gene-level counts) assume that the features are the same length. However, if you have any differential splicing or isoform switching going on between conditions, this assumption gets you into trouble. Suppose for example, you have an gene with two isoforms, one twice as long as the other. In both conditions being compared, you have 100 reads on the gene. However, in condition A, all the reads are on the shorter one. In B, they're all on the longer one. Comparing the raw gene-level count will show no change, even though the expression of the gene is two-fold higher in condition A! This is a contrived example (the reads wouldn't *all* move like that in a real situation), but it illustrates the point.

                              Comment

                              Latest Articles

                              Collapse

                              • seqadmin
                                Recent Developments in Metagenomics
                                by seqadmin





                                Metagenomics has improved the way researchers study microorganisms across diverse environments. Historically, studying microorganisms relied on culturing them in the lab, a method that limits the investigation of many species since most are unculturable1. Metagenomics overcomes these issues by allowing the study of microorganisms regardless of their ability to be cultured or the environments they inhabit. Over time, the field has evolved, especially with the advent...
                                09-23-2024, 06:35 AM
                              • seqadmin
                                Understanding Genetic Influence on Infectious Disease
                                by seqadmin




                                During the COVID-19 pandemic, scientists observed that while some individuals experienced severe illness when infected with SARS-CoV-2, others were barely affected. These disparities left researchers and clinicians wondering what causes the wide variations in response to viral infections and what role genetics plays.

                                Jean-Laurent Casanova, M.D., Ph.D., Professor at Rockefeller University, is a leading expert in this crossover between genetics and infectious...
                                09-09-2024, 10:59 AM

                              ad_right_rmr

                              Collapse

                              News

                              Collapse

                              Topics Statistics Last Post
                              Started by seqadmin, 10-02-2024, 04:51 AM
                              0 responses
                              8 views
                              0 likes
                              Last Post seqadmin  
                              Started by seqadmin, 10-01-2024, 07:10 AM
                              0 responses
                              14 views
                              0 likes
                              Last Post seqadmin  
                              Started by seqadmin, 09-30-2024, 08:33 AM
                              0 responses
                              18 views
                              0 likes
                              Last Post seqadmin  
                              Started by seqadmin, 09-26-2024, 12:57 PM
                              0 responses
                              16 views
                              0 likes
                              Last Post seqadmin  
                              Working...
                              X