BFAST wish list - SEQanswers


aleferna

Senior Member

Join Date: Sep 2009

Posts: 121
#1

BFAST wish list

07-16-2010, 12:14 PM

Hello, I'm running a sensitivity/specificity study between aligners for my master's thesis and I've noticed that BFAST is really slow when reading each one of the 14GB indexes. I have a 24-core Xeon, and the problem is that at some points it only uses 1 CPU instead of the 16 that I specified. Why is this? After using blat for so long I'm so happy somebody patched it up. Also, what is the deal with reporting negative alignments and not updating the flag field in SAM? Can I get a 4 there? It would also be nice if, when you set threads to something other than 2^n, it told you it's invalid before you have to wait for the indexes to load. That would have saved me a lot of time.

PS:
The only thing missing from BFAST is CUDA support. Is anybody working on this? How nice would it be if you could run 512 threads at the same time; I'm sure you'll need it with Hi-Seq. Cloud computing doesn't work if you have to upload 200GB of sequence data every day; a $1000 GPU makes much more sense. You are wasting time trying to run a simple text matching algorithm on a CPU; GPUs sound better.
nilshomer

Nils Homer

Join Date: Nov 2008

Posts: 1285
#2

07-16-2010, 01:36 PM

First of all, I am sorry for your frustration. You are welcome to contribute to the code since I think you have some good suggestions (it is Open Source). Make sure you are using the latest version.

Originally posted by aleferna

Hello, I'm running a sensitivity/specificity study between aligners for my master's thesis and I've noticed that BFAST is really slow when reading each one of the 14GB indexes.

I assume you are not talking about index creation. The indexes are compressed. Feel free to remove that compression (gzip) and they will read into memory faster. We have a fast file system (Lustre) and it still takes a while to load the indexes (a few minutes per index), but given that you batch 25M or more reads, it is not that bad. Again, feel free to play with the code.

Originally posted by aleferna

I have a 24-core Xeon, and the problem is that at some points it only uses 1 CPU instead of the 16 that I specified. Why is this?

Not everything is parallelizable or scales linearly (see Amdahl's Law). Examples include reading in an index, merging the results (although a parallel merge sort is possible), and outputting in sorted order.
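
As a rough illustration of Amdahl's Law, here is a minimal Python sketch. The parallel fraction of 0.9 is an assumed number chosen for illustration only; it is not a measurement of BFAST.

```python
def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Upper bound on speedup when only part of the work runs in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


if __name__ == "__main__":
    # 0.9 is a made-up parallel fraction, not a BFAST measurement.
    for n in (1, 16, 24):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

Under that assumption, going from 16 to 24 threads raises the bound only from 6.4x to roughly 7.3x, because the serial 10% dominates.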

Originally posted by aleferna

After using blat for so long I'm so happy somebody patched it up. Also, what is the deal with reporting negative alignments and not updating the flag field in SAM? Can I get a 4 there?

It reports all alignments and lets the user filter the data. If you report too few alignments, then a new alignment must be performed to recover them. The SAM flag does include the "4": the field is bitwise encoded, so the 4 may be combined with other bits (see the SAM spec). Try "samtools view -X <in.bam>" to see the flag in a string representation. If you see a "4" when you don't expect it, submit a bug report; we are quite responsive.
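
To make the bitwise encoding concrete, here is a small Python sketch that decodes a SAM flag. The bit values come from the SAM specification; the short names and the helper functions `decode_flag` and `is_unmapped` are illustrative choices, not part of BFAST or samtools.

```python
# Bit values as defined in the SAM specification; the names are informal labels.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM flag."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


def is_unmapped(flag):
    """True if the 0x4 (unmapped) bit is set, whatever else is set."""
    return bool(flag & 0x4)


if __name__ == "__main__":
    print(decode_flag(20))   # 20 = 16 + 4
    print(is_unmapped(20))
```

A flag of 20, for example, never shows a literal "4" in the column, yet the unmapped bit is set.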

Originally posted by aleferna

It would also be nice if, when you set threads to something other than 2^n, it told you it's invalid before you have to wait for the indexes to load. That would have saved me a lot of time.

What version are you using? The threads do not have to be a multiple of two when aligning. Are you talking about the one-time indexing step?
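
An early check of the kind requested could look like the following Python sketch. It assumes the requirement is that the thread count be a power of two; `validate_thread_count` is a hypothetical helper and does not reflect how BFAST actually parses its options.

```python
def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ...; a power of two has exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def validate_thread_count(threads: int) -> None:
    """Hypothetical up-front check, run before any index is loaded."""
    if not is_power_of_two(threads):
        raise ValueError(
            "number of threads must be a power of 2, got %d" % threads
        )


if __name__ == "__main__":
    validate_thread_count(16)      # passes silently
    try:
        validate_thread_count(24)  # rejected immediately
    except ValueError as err:
        print(err)
```

The point of the design is only ordering: validate cheap arguments first, then start the expensive I/O.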

Originally posted by aleferna

PS:
The only thing missing from BFAST is CUDA support, anybody working on this?

Sounds like a good master's thesis project

Originally posted by aleferna

How nice would it be if you could run 512 threads at the same time; I'm sure you'll need it with Hi-Seq. Cloud computing doesn't work if you have to upload 200GB of sequence data every day; a $1000 GPU makes much more sense. You are wasting time trying to run a simple text matching algorithm on a CPU; GPUs sound better.

That last sentence sounds too simplistic. The goal of BFAST was to first meet the sensitivity requirement, THEN be as fast as possible. (Warning: generalization) biologists want the right answer (eventually), not the wrong answer (quickly).
aleferna

Senior Member

Join Date: Sep 2009

Posts: 121
#3

07-16-2010, 03:07 PM

Wow, thanks for the instant reply. I love SEQanswers; where else can you talk to the man himself? Cool!

1. The 2^N issue: maybe I'm mistaken. At some point I ran one of the processes with the number of threads = 24 and it started working. I came back to check on the process some hours later and it said that the number of threads must be a power of 2. It might have been the index creation.

2. I didn't quite understand your response regarding sensitivity and running BFAST on GPUs. I see a trend of new aligners being made to run in a computer cloud, but I think that it will take longer to upload the data to the cloud than to process it locally using a GPU architecture such as NVIDIA CUDA.

3. I always get 255 for the MapQ value; am I doing something wrong? What is a typical value for the --avgMismatchQuality in the post process?

Will check the source code, Thanks!!
nilshomer

Nils Homer

Join Date: Nov 2008

Posts: 1285
#4

07-16-2010, 07:40 PM

Originally posted by aleferna

Wow, thanks for the instant reply. I love SEQanswers; where else can you talk to the man himself? Cool!

Without users, a developer is nothing.

2. I didn't quite understand your response regarding sensitivity and running BFAST on GPUs. I see a trend of new aligners being made to run in a computer cloud, but I think that it will take longer to upload the data to the cloud than to process it locally using a GPU architecture such as NVIDIA CUDA.

Implementation is important, and GPU vs. cloud vs. FPGA, or a solution customized to the problem, are all important things to consider. I don't weigh in on this topic for good reason: I need more data to form an opinion.

3. I always get 255 for the MapQ value; am I doing something wrong? What is a typical value for the --avgMismatchQuality in the post process?

Will check the source code, Thanks!!

A 255 is returned if there is no second best hit, which happens when the read is uniquely mapped. See

https://sourceforge.net/apps/mediawiki/bfast/index.php?title=Mapping_Quality

epigen

Senior Member

Join Date: May 2010

Posts: 101
#5

07-19-2010, 02:46 AM

Hi aleferna and Nils,

your thread already answered most of the questions I would also have asked Nils. But I still have two:
1. To reduce non-parallelizable I/O, would it be possible to replace the large temp files that bfast match produces by keeping the info in memory?
2. Could I pipe the indexes from gunzip and would that make loading them faster?

And something for the wish list: Why do the bfast programs not output any information when their input comes from standard input? It would be nice to have the info in case the pipeline crashes at some point to know why.

BFAST for CUDA sounds like a really good idea. Parallel merge sort would be great too, because the merging step is the most time-consuming. Unfortunately I'm not a good programmer, so I can't offer my help with optimizing the code. But I always stumble across bugs, so I'd at least make a good beta tester.

I'd also like to take the opportunity to thank you all for your support!

Barbara
bioinfosm

Senior Member

Join Date: Jan 2008

Posts: 482
#6

07-19-2010, 07:28 AM

aleferna,

I am interested in the "sensitivity/specificity study between aligners...". Do you have any updates, resources, blog, or paper to point to?

thanks!

--
bioinfosm
lh3

Senior Member

Join Date: Feb 2008

Posts: 691
#7

07-20-2010, 05:25 PM

I would go for SSE2 first before considering CUDA. As Nils said, it would be good for someone to take this on as a research project, but in the near future CUDA would not deliver a performance boost significant enough to make it practically attractive and cost-effective. When you look into the details, CUDA is not as good as it looks. hmmerGPU, mummerGPU and swGPU are all far from the theoretical speed due to unconquerable technical difficulties.

Last edited by lh3; 07-20-2010, 05:31 PM.
nilshomer

Nils Homer

Join Date: Nov 2008

Posts: 1285
#8

07-21-2010, 10:20 PM

Originally posted by epigen

Hi aleferna and Nils,

your thread already answered most of the questions I would also have asked Nils. But I still have two:
1. To reduce non-parallelizable I/O, would it be possible to replace the large temp files that bfast match produces by keeping the info in memory?

Yes, if enough memory is available. Storing on disk is a function of not having enough RAM (1TB should solve a lot of this).

2. Could I pipe the indexes from gunzip and would that make loading them faster?

Probably not, since the underlying I/O calls already use zlib (gzip). My suggestion would be to get a faster disk.

And something for the wish list: Why do the bfast programs not output any information when their input comes from standard input? It would be nice to have the info in case the pipeline crashes at some point to know why.

They do! Each command initially prints its program parameters! See the "readsFileName:" line in "bfast match" for example. It will name the file or STDIN.
aleferna

Senior Member

Join Date: Sep 2009

Posts: 121
#9

07-23-2010, 02:11 AM

Sensitivity / Specificity study

Hi Bioinfosm,

Sure, I hope to have the results ready soon. I've been struggling with MAQ, but I finally realized that it needs reads to be exactly the same size. Since I'm simulating the reads, they usually vary by 2 or 3 bases in length; that was giving me really bad maq sensitivity, but now I have it working.

I will post my results, but I'm working on a very weird dataset; I don't think many people have these types of problems. I'm focusing on errors due to high mutation rates, not on sequencing errors. We work with cancer stem cell lines that have abnormal mutation rates, and therefore the MapQ value breaks down very often. To make things worse, all the reads are chimeric (it's a 4C experiment) and therefore really tricky to map. Basically my thesis is about how to combine maq, blat, bfast, bwa aln and bwa bwasw to get > 99% sensitivity with > 99.5% specificity. So far it has been impossible to achieve this level using a single algorithm, so I decided to apply each algorithm where it has the best results.

Hope I can share some of this soon
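
For readers unfamiliar with the two metrics, here is a minimal Python sketch using the standard confusion-matrix definitions. How the study above actually counts true and false positives for simulated reads is not stated in this thread, so both the counting scheme and the numbers below are assumptions for illustration.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of reads that should map and were mapped correctly."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of reads that should not map and were left unmapped."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Made-up counts for a simulated read set; not results from the study.
    print(sensitivity(true_pos=990, false_neg=10))
    print(specificity(true_neg=995, false_pos=5))
```

With these invented counts the sketch sits exactly at the 99% sensitivity and 99.5% specificity targets mentioned above.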
aleferna

Senior Member

Join Date: Sep 2009

Posts: 121
#10

07-25-2010, 02:56 AM

bfast localalign takes longer with 24 threads than with 16???

This is very odd. I reran localalign using -t 24 and it's been running for 2 days now, whereas with -t 16 it only takes a few hours. Has anybody else seen this problem?

Also why does it say endReadNum: 2147483647 when there are only 3 million reads?
nilshomer

Nils Homer

Join Date: Nov 2008

Posts: 1285
#11

07-25-2010, 10:10 AM

Originally posted by aleferna

This is very odd. I reran localalign using -t 24 and it's been running for 2 days now, whereas with -t 16 it only takes a few hours. Has anybody else seen this problem?

Also why does it say endReadNum: 2147483647 when there are only 3 million reads?

The threading option is "-n", not "-t". Threading is not perfectly scalable, and a slowdown can result from many factors (take an OS & architecture course for an introduction).

If not specified, the start/end read #s default to 1 and infinity (in this case (2^31)-1) respectively. Use the "-p" option to see the program parameters.
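
The value 2147483647 from the log is 2^31 - 1, the largest signed 32-bit integer, as this quick Python check shows. That BFAST stores the read number in a signed 32-bit int is an inference from the printed value, not something confirmed from its source.

```python
INT32_MAX = 2**31 - 1    # largest signed 32-bit integer
UINT32_MAX = 2**32 - 1   # largest unsigned 32-bit integer, for comparison

if __name__ == "__main__":
    print(INT32_MAX)     # 2147483647, the endReadNum shown in the log
    print(UINT32_MAX)    # 4294967295
```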
aleferna

Senior Member

Join Date: Sep 2009

Posts: 121
#12

07-31-2010, 11:24 PM

I just finished the analysis of BFAST and the results are very strange. I get really good performance at 50 and 75 bp, but this degrades significantly with 150, 200 and 500 bp reads. Is there anything that you need to adjust in bfast when you have bigger reads? In the case of blat you get better specificity and sensitivity as reads get longer. I thought BFAST would outperform blat, but it doesn't?
nilshomer

Nils Homer

Join Date: Nov 2008

Posts: 1285
#13

08-01-2010, 09:49 AM

Originally posted by aleferna

I just finished the analysis of BFAST and the results are very strange. I get really good performance at 50 and 75 bp, but this degrades significantly with 150, 200 and 500 bp reads. Is there anything that you need to adjust in bfast when you have bigger reads? In the case of blat you get better specificity and sensitivity as reads get longer. I thought BFAST would outperform blat, but it doesn't?

What performance metrics are you using (running time, accuracy, sensitivity)? I haven't tried BFAST with longer reads (>200bp), so there would need to be some thought on how to make it work for long reads. Remember there are short-read and long-read aligners. Have you tried the BWA-SW module? It performs very well for longer reads.
aleferna

Senior Member

Join Date: Sep 2009

Posts: 121
#14

08-01-2010, 10:15 AM

I'm just optimizing for specificity right now, not worrying too much about speed. I'm using the 10 indexes that you mention in the manual and no options at all. I'm comparing how different algorithms work at different read lengths / mismatches. It is similar to the study you did in the BFAST paper, but with 25 to 500 bp read lengths.
aleferna

Senior Member

Join Date: Sep 2009

Posts: 121
#15

08-01-2010, 10:22 AM

Here's a table with some comparisons I'm doing. The bmr column correlates to the number of mismatches, a read with bmr 1% in 50bp will typically have 2 mismatches.

http://www.nada.kth.se/~afer/benchmark.jpeg

Last edited by aleferna; 08-01-2010, 10:27 AM.
