  • Tophat2 Bowtie2 Htseq-count for bacteria

    Hey, this is my first attempt at analysing an RNA-seq project, since the company we worked with is not able to give me a useful annotated differential expression table...

    I just want to make sure that what I am doing is at least roughly right.

    My samples: 2 conditions, 2 replicates per condition, 50 bp single-end, not strand-specific. I got 23-50 million reads per library.

    I want to find the differentially expressed genes between the conditions.

    I first wanted to use bowtie2 for alignment, and that worked pretty well until I noticed that no NH tag is written for htseq-count.
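A note on the missing NH tag: htseq-count uses NH to recognise multi-mapped reads, but it also skips alignments below a minimum MAPQ (the -a option, 10 by default as far as I know), and bowtie2 in its default mode gives ambiguously placed reads a low MAPQ. Before switching aligners it may be worth checking what the SAM output actually contains. A minimal Python sketch of such a check follows; the function name `summarize_sam` and the example records are made up for illustration.

```python
def summarize_sam(lines, min_mapq=10):
    """Count mapped records, records with an NH tag, and records with MAPQ >= min_mapq."""
    stats = {"mapped": 0, "with_nh": 0, "pass_mapq": 0}
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & 0x4:
            continue  # flag bit 0x4 means the read is unmapped
        stats["mapped"] += 1
        # Optional tags start after the 11 mandatory SAM columns.
        if any(tag.startswith("NH:i:") for tag in fields[11:]):
            stats["with_nh"] += 1
        if mapq >= min_mapq:
            stats["pass_mapq"] += 1
    return stats


# Hypothetical records, not taken from the data in this thread.
example_sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    "r1\t0\tchr\t100\t42\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0",
    "r2\t16\tchr\t200\t1\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:-5\tXS:i:-5",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r4\t0\tchr\t300\t50\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
]
example_stats = summarize_sam(example_sam)
```

In practice the lines would come from `samtools view -h file.bam` piped into the script. If most mapped records pass the MAPQ threshold, the absence of NH may matter less than it first appears.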

    So I switched to tophat, and there it got complicated:

    By default, tophat2 finds fewer alignments than bowtie2. Why? As I understand it, tophat2 uses bowtie2 for alignment.

    Since there is no splicing in my bacterium, I don't want to find novel junctions. The final command I used after several attempts is:

    tophat2 -G file.gtf --no-novel-juncs --no-coverage-search --library-type fr-unstranded index file.fastq

    With every attempt (first: --no-coverage-search; second: added -G; third: added --no-novel-juncs) the number of aligned reads dropped a little. Why did it drop between these three modes?

    With the last mode I got an alignment rate of 65-75%.

    I finally got my count tables with
    samtools view file.bam | htseq-count -t gene -s no - file.gtf > counts.txt

    Now I will use DESeq for differential expression.

    Everything ok so far?
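Once each library has its own counts file from the htseq-count step above, the tables need to be combined into one gene-by-sample matrix before DESeq. A rough Python sketch of that step follows. It assumes tab-separated `gene<TAB>count` lines and that the summary rows (`no_feature`, `ambiguous` and so on, prefixed with `__` in newer HTSeq versions) should be dropped. `read_htseq_counts`, `merge_counts`, and the sample names are hypothetical.

```python
# Summary rows written by htseq-count (older versions, without the "__" prefix).
SPECIAL_ROWS = {
    "no_feature", "ambiguous", "too_low_aQual",
    "not_aligned", "alignment_not_unique",
}


def read_htseq_counts(text):
    """Parse the text of one htseq-count output into {gene_id: count}."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        gene, value = line.split("\t")
        if gene.startswith("__") or gene in SPECIAL_ROWS:
            continue  # skip summary rows, keep only real features
        counts[gene] = int(value)
    return counts


def merge_counts(samples):
    """Combine {sample_name: counts} into a header and one row per gene."""
    names = list(samples)
    genes = sorted(set().union(*samples.values()))
    rows = [[g] + [samples[n].get(g, 0) for n in names] for g in genes]
    return ["gene"] + names, rows


# Hypothetical counts for two libraries.
sample_a = "geneA\t10\ngeneB\t0\n__no_feature\t5\n"
sample_b = "geneA\t7\ngeneC\t3\nno_feature\t2\nambiguous\t1\n"
header, rows = merge_counts({
    "cond1_rep1": read_htseq_counts(sample_a),
    "cond2_rep1": read_htseq_counts(sample_b),
})
```

Genes missing from one file are filled with 0 here, which is a design choice for the sketch; with the same GTF used for every library the gene lists should already match.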

  • #2
    I just went through the same process as you did; see my post here: http://crazyhottommy.blogspot.com/20...-sort-and.html

    I am following the protocol from Simon Anders, "Count-based differential expression analysis of RNA sequencing data using R and Bioconductor": http://www.nature.com/nprot/journal/....2013.099.html






    (In reply to the original post by chickenmcfu.)


    • #3
      Yes, the DESeq description is easy to follow, even with no R experience.

      Unfortunately I have run into other problems. The DESeq analysis gives me 125 differentially regulated genes. The commercial sequencing service gave me a cuffdiff result with 200 genes (with a mixed-up annotation, though). So I went through the cuffdiff process myself and got 129 regulated genes.

      I have now tried nearly every possible mapping (bowtie2, tophat2; whole genome, or tophat with the -G and -T options) and then ran cuffdiff with -M and the rtRNA.gtf, which brought it up to 136.

      The only way my analysis differs from theirs is that they align directly to only the CDS and ncRNAs, completely disregarding the rtRNAs, but then give cuffdiff a full GTF. Afterwards they did a second mapping of the FASTQs to the rtRNAs, so I know that up to 4.5% of the reads map to them.
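For the count-based route, a rough analogue of masking the rRNA/tRNA loci would be to drop those genes from the count table before testing. The Python sketch below assumes the rtRNA.gtf carries standard `gene_id "..."` attributes in column 9; `gene_ids_from_gtf`, `drop_genes`, and the example records are hypothetical.

```python
import re

# GTF attribute of the form: gene_id "some_id";
GENE_ID = re.compile(r'gene_id "([^"]+)"')


def gene_ids_from_gtf(lines):
    """Collect the gene_id values found in column 9 of GTF lines."""
    ids = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = GENE_ID.search(fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def drop_genes(counts, excluded):
    """Return a copy of {gene_id: count} without the excluded gene IDs."""
    return {gene: n for gene, n in counts.items() if gene not in excluded}


# Hypothetical rRNA/tRNA annotation and counts.
rt_gtf = [
    '# example annotation',
    'chr\tsrc\tgene\t1\t1500\t.\t+\t.\tgene_id "rrsA"; gene_biotype "rRNA";',
    'chr\tsrc\tgene\t2000\t2075\t.\t-\t.\tgene_id "alaT"; gene_biotype "tRNA";',
]
raw_counts = {"rrsA": 90000, "alaT": 1200, "geneA": 350}
masked_ids = gene_ids_from_gtf(rt_gtf)
filtered_counts = drop_genes(raw_counts, masked_ids)
```

Whether removing these genes changes the number of significant calls depends on how the normalisation responds to them, so this only makes the two analyses easier to compare; it does not settle which one is right.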

      Is this right? Can this difference in the alignment step cause such a large difference in the results?
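One way to see how far the DESeq and cuffdiff results really differ is to compare the gene lists directly rather than only their sizes. A small Python sketch follows; `compare_calls` and the gene IDs are hypothetical placeholders, not results from this thread.

```python
def compare_calls(calls_a, calls_b):
    """Summarise the overlap between two lists of significant gene IDs."""
    a, b = set(calls_a), set(calls_b)
    shared = a & b
    union = a | b
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        # Jaccard index: shared genes divided by all genes called by either tool.
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


# Hypothetical gene lists standing in for the DESeq and cuffdiff outputs.
deseq_genes = ["geneA", "geneB", "geneC"]
cuffdiff_genes = ["geneB", "geneC", "geneD", "geneE"]
overlap = compare_calls(deseq_genes, cuffdiff_genes)
```

If most of the shorter list is contained in the longer one, the tools largely agree and differ mainly in where they draw the significance cutoff; a small overlap would point to something upstream, such as the annotation or the mapping.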
