  • cedance
    replied
    After going through your question again, the simple picture, as far as I understand it, is this: TopHat first uses Bowtie to align reads against the reference genome. Bowtie aligns reads without identifying splice sites: it tries to map each entire read directly to a matching region of the genome (and it is fast). From these alignments, TopHat "knows" the potential exons, and it then tries to map the remaining unmapped reads by splitting them into smaller segments and finding a region where each segment matches.


  • agout
    replied
    exon join frequencies from RNAseq data

    Hi,

    Don't know whether this will help - or whether I went about it the correct way - but this is my related experience.

    I was faced with a similar problem - determining the read evidence for particular exon joins within a certain genomic region.

    Basically what I did was take the RNAseq read data and use tophat to align to a particular genomic region for which I knew the exon positions.

    After getting a SAM/BAM file, I wrote a Python script to parse the TopHat alignment data, i.e. the CIGAR string info (for example 30M500N50M), to determine the exon join read frequencies.
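A minimal sketch of what such a script could look like is given here. It is not the original script: the function name `count_junctions` and the demo records are made up for illustration, and it assumes plain-text SAM lines as input (for example, the output of `samtools view` on the BAM file). Each `N` operation in a CIGAR string is treated as one exon join, and reads are tallied per intron.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def count_junctions(sam_lines):
    """Tally reads per (chrom, intron_start, intron_end), 1-based inclusive."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:               # not a full alignment record
            continue
        flag, chrom = int(fields[1]), fields[2]
        pos, cigar = int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*":    # skip unmapped reads
            continue
        ref = pos                         # current 1-based reference position
        for length, op in CIGAR_OP.findall(cigar):
            length = int(length)
            if op == "N":                 # skipped region = intron
                counts[(chrom, ref, ref + length - 1)] += 1
            if op in "MDN=X":             # operations that consume reference
                ref += length
    return counts

# Hypothetical records, for illustration only.
demo = [
    "@HD\tVN:1.0\tSO:coordinate",
    "r1\t0\tchr1\t1000\t50\t30M500N50M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t1010\t50\t20M500N60M\t*\t0\t0\t*\t*",
    "r3\t0\tchr1\t1600\t50\t80M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]

for (chrom, start, end), n in sorted(count_junctions(demo).items()):
    print(chrom, start, end, n)
```

The function takes any iterable of lines, so an open file or `sys.stdin` would work in place of the demo list. Reads with several `N` operations contribute to several junctions.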

    Shoot me any questions you like.

    Good luck,

    Alex


  • cedance
    replied
    If you just want to use the junctions directly, then in addition to the BAM file you should also have .bed files for junctions, insertions and deletions.

    If you want to understand how it works, you should look at the SAM Format Specification. Your BAM file can be viewed with samtools. If necessary, you can convert it to SAM with Picard tools.
    In RNA-Seq data, when TopHat finds a read that splices across previously identified exon regions, it records the splice in the alignment. Say read R (80 bp) of gene G (with 5 exons) spans the junction between E2 and E3, and the intron between E2 and E3 is 500 bp. If the first part, R1 = 30 bp, maps to E2 and the second part, R2 = 50 bp, maps to E3, then TopHat writes this as 30M500N50M. This is a CIGAR string (from the SAM format specification). The SAM record also has a "start" position, and using it together with the CIGAR string you can work out the junction coordinates (check the specification for the other possible CIGAR operations). Is this what you asked for?
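To make the start-position-plus-CIGAR idea concrete, here is a small sketch. The function name `introns_from_cigar` and the start position 1000 are assumptions chosen for illustration. It walks the CIGAR operations, advances along the reference for the operations that consume reference bases (M, D, N, =, X), and reports each N block as an intron in 1-based inclusive coordinates.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")

def introns_from_cigar(pos, cigar):
    """Return [(intron_start, intron_end), ...] for a read starting at 1-based pos."""
    introns = []
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # The skipped region covers ref .. ref + length - 1.
            introns.append((ref, ref + length - 1))
        if op in REF_CONSUMING:
            ref += length
    return introns

# The 80 bp read from the example, with a hypothetical start position of 1000.
print(introns_from_cigar(1000, "30M500N50M"))  # → [(1030, 1529)]
```

So with a start of 1000, the last base of E2 covered by the read is 1029 and the first base of E3 is 1530. Soft clips (S) and insertions (I) do not move the reference position, which is why they are left out of `REF_CONSUMING`.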


  • schaffer
    started a topic counting junction reads in TopHat

    counting junction reads in TopHat

    Hi,
    I am running TopHat and using the accepted_hits.bam file to generate counts for genes.
    How does TopHat identify junctions in this bam file, as exons?
    How are users generating read counts for genes which include the junction and exon reads?
    Thanks,
    Lana
