
**john_mu** · 07-29-2010, 01:34 PM

That is not slow. It sounds about the right time.

Also, yes, 400 million reads will need a powerful multi-core computer with a lot of memory.

For example, it took me about 17 hours to map 130 million (100bp) reads with SpliceMap on 10 cores.

The current version of SpliceMap is also about the same speed as TopHat. However, the next version will be twice as fast.

Edit: use the -p option in TopHat if you have a multi-core machine.

**IrisZhu** · 07-29-2010, 05:44 PM

John, thanks for your reply.

Is 15 hours reasonable for mapping 17 million pairs against a single (short) chromosome (chr20)? I am asking just to make sure you were not thinking of the whole genome :-)

**john_mu** · 07-29-2010, 07:40 PM

oh... a single chromosome??? That is uhhh... a bit slow.

How did you build the index?

EDIT: Also, how much free memory do you have? Either Bowtie or TopHat may be thrashing.
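To put the timings quoted in this thread on a common scale, they can be converted to reads per core-hour. This is only a rough sketch: the function name is made up for illustration, the TopHat run is assumed to have used a single thread (no `-p` was given at that point), and 17 million pairs are counted as 34 million reads. The two runs also used different tools and different references, so the figures are not a strict comparison.

```python
def reads_per_core_hour(reads, hours, cores):
    """Average number of reads processed per core per hour."""
    return reads / (hours * cores)


# SpliceMap run quoted above: 130 million reads, 17 hours, 10 cores.
splicemap_rate = reads_per_core_hour(130e6, 17, 10)

# TopHat chr20 run: 17 million pairs (2 reads each), 15 hours.
# ASSUMPTION: one thread, because the run was started without -p.
tophat_rate = reads_per_core_hour(2 * 17e6, 15, 1)

print(f"SpliceMap: {splicemap_rate:,.0f} reads per core-hour")
print(f"TopHat (chr20): {tophat_rate:,.0f} reads per core-hour")
```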

**IrisZhu** · 07-30-2010, 05:32 AM

Now I am mapping the same reads against the whole-genome index, and it does not seem much slower than mapping against just the chr20 index ......

Another thing: if I add the "-p" option, it fails immediately:

    [zhuz2@cbbdev1 mapping]$ tophat -p -r 100 hg19 mate1.fa mate2.fa
    Traceback (most recent call last):
      File "/usr/local/tophat/1.0.14/bin/tophat", line 1854, in <module>
        sys.exit(main())
      File "/usr/local/tophat/1.0.14/bin/tophat", line 1746, in main
        args = params.parse_options(argv)
      File "/usr/local/tophat/1.0.14/bin/tophat", line 474, in parse_options
        self.system_params.parse_options(opts)
      File "/usr/local/tophat/1.0.14/bin/tophat", line 171, in parse_options
        self.bowtie_threads = int(value)
    ValueError: invalid literal for int() with base 10: '-r'

The command "tophat -r 100 hg19 mate1.fa mate2.fa" works well.
Do you know why? Is the problem with my machine or with my command?

Thank you so much for your help,

Iris

**john_mu** · 07-30-2010, 10:09 AM

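You need to use -p followed by the number of processes. In your command, -p has no number after it, so TopHat reads the next argument ("-r") as the value and fails when converting it to an integer.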
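The failure can be reproduced outside TopHat. The sketch below is not TopHat's code: `parse_threads` is a hypothetical stand-in, and the use of Python's `getopt` with the option string `"p:r:"` is an assumption made for illustration. It shows how an option that requires a value swallows the next token, so `int()` receives `'-r'` as in the traceback.

```python
import getopt


def parse_threads(argv):
    """Hypothetical stand-in for a parser where -p and -r both take a value.

    Returns (threads, positional_args).
    """
    opts, args = getopt.getopt(argv, "p:r:")
    threads = 1
    for option, value in opts:
        if option == "-p":
            # With "-p -r 100 ...", value is the string "-r" here.
            threads = int(value)
    return threads, args


# Correct usage: -p is followed by a number.
print(parse_threads(["-p", "4", "-r", "100", "hg19", "mate1.fa", "mate2.fa"]))

# Failing usage from the post: -p is followed directly by -r.
try:
    parse_threads(["-p", "-r", "100", "hg19", "mate1.fa", "mate2.fa"])
except ValueError as err:
    print("ValueError:", err)
```

So the fix is on the command line, for example `-p 4` on a four-core machine (the value 4 is only an example).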

**IrisZhu** · 07-30-2010, 10:12 AM

Yes, I just realized this a few minutes ago :-) So silly of me ....
Thanks again.

**poisson200** · 07-31-2010, 11:30 AM

Hi,
Just an observation: is there a reason to map only to chromosome 20? Some reads may map to chromosome 20 but map better (with fewer mismatches) to a different chromosome, so they may be falsely assigned to chromosome 20 if the other chromosomes are not present in the index. You also would not know whether a read maps to more than one chromosome with the same stringency. Possibly something to bear in mind.
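A toy illustration of this point, assuming made-up sequences and a naive sliding-window mismatch count (this is not how Bowtie or TopHat align reads, and `best_hit` is a hypothetical helper): with only one reference in the index, a read is reported at its best location there, even though another reference would have matched it exactly.

```python
def best_hit(read, references):
    """Return (mismatches, name, position) of the best ungapped placement."""
    best = None
    for name, seq in references.items():
        for pos in range(len(seq) - len(read) + 1):
            window = seq[pos:pos + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            candidate = (mismatches, name, pos)
            if best is None or candidate < best:
                best = candidate
    return best


# Made-up toy sequences, for illustration only.
read = "GATTACA"
chr20_only = {"chr20": "TTGATTGCATT"}
full_index = {"chr20": "TTGATTGCATT", "chr5": "CCGATTACAGG"}

print(best_hit(read, chr20_only))  # best placement when only chr20 is indexed
print(best_hit(read, full_index))  # best placement when chr5 is also indexed
```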

**IrisZhu** · 08-02-2010, 07:50 AM

Thanks for your comment. Of course it doesn't make sense to map to one chromosome.
That's just for a test to see if the software is working.

**poisson200** · 08-02-2010, 09:26 AM

In that case, it makes perfect sense.


need some help with Tophat
