
**ian.d.reid** · 10-23-2010, 11:50 AM

LAx,
1. Tophat uses several smaller programs to do its work. One of these programs is long_spanning_reads; another is bowtie. If you look in logs/run.log you will find the command lines that tophat issues to its subsidiary programs, and see where the intermediate data files in /tmp and the logs with the cryptic names come from.
The first and third log files that you attached are from bowtie. The first is probably from mapping the whole reads, and the third, judging from the number of reads processed, is probably from mapping segments of the initially unmapped reads in order to find splice junctions. The good news is that >80% of the segments mapped; the bad news is that 33 million aligned segments generated 77.5 million alignments, so many of the read segments aligned in more than one place.
The second log file is from long_spanning_reads (obviously) and just shows that the program ran without any problems.
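To put the multi-mapping in perspective, the two figures quoted above can be turned into an average number of alignments per aligned segment. This is only a back-of-the-envelope sketch in Python; the function name is made up for illustration, and the inputs are the rounded counts from the bowtie log as described in this post.

```python
def mean_alignments_per_segment(alignments, aligned_segments):
    """Average number of reported alignments per segment that aligned at least once."""
    return alignments / aligned_segments

# Rounded counts taken from the post above (third log file, bowtie).
aligned_segments = 33_000_000
alignments = 77_500_000

mean_hits = mean_alignments_per_segment(alignments, aligned_segments)
print(f"{mean_hits:.2f} alignments per aligned segment")  # prints 2.35
```

An average above 2 means a substantial share of the segments must have more than one hit, although the average alone does not say how those hits are distributed.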

2. The question is unclear. What kind of problem in GC content?

**laxman** · 10-23-2010, 05:25 PM

Hi Ian
Thanks. I am trying to figure out how to deal with the sequences that align at multiple locations, and also what caused this. Is it something to do with the sequencing, i.e. specific artifacts in the reads produced? I did find some issues in the "per base sequence content" plot, which plots %G, %T, %C and %A across all base positions. There were fluctuations in the first 8 bases and divergence again after about 40 bases. I am still thinking it through. One thing I thought about was to trim and filter the reads using the FASTX-Toolkit, keeping only bases with a quality greater than 30 and reads with a minimum length of 25. The hope is that the multiple hits in the alignment are caused by read portions with bad quality and would be remedied by this.
Any suggestions?
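For concreteness, the filtering idea described in this post could look roughly like the Python sketch below. It is not the FASTX-Toolkit implementation; the function names are hypothetical, it assumes Phred+33 quality encoding, it trims only from the 3' end, and it treats "quality greater than 30" as "at least 30".

```python
def trim_3prime(seq, qual, min_q=30, offset=33):
    """Remove bases from the 3' end until the last base has quality >= min_q.

    Assumes Phred+offset ASCII encoding (offset=33 is an assumption here).
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def filter_reads(records, min_q=30, min_len=25, offset=33):
    """Yield trimmed (name, seq, qual) records that are still >= min_len bases long."""
    for name, seq, qual in records:
        seq, qual = trim_3prime(seq, qual, min_q, offset)
        if len(seq) >= min_len:
            yield name, seq, qual


# Toy records: 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
reads = [
    ("read1", "A" * 30, "I" * 26 + "#" * 4),   # trimmed to 26 bases, kept
    ("read2", "C" * 30, "I" * 20 + "#" * 10),  # trimmed to 20 bases, dropped
]
kept = list(filter_reads(reads))
print([(name, len(seq)) for name, seq, _ in kept])
```

Note that a filter like this only removes low-quality tails. It would not by itself address the fluctuations seen in the first 8 bases, and shorter reads can map to more places rather than fewer, so it is worth comparing the multi-hit rate before and after filtering.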

Understanding tophat intermediate logs
