**tahamasoodi** · 03-05-2014, 10:30 PM

Does anybody have any idea about this?

**dpryan** · 03-06-2014, 01:29 AM

Depends on which aligner you used and which definition of "uniquely aligned" you want to use.

**tahamasoodi** · 03-06-2014, 03:22 AM

Thanks dpryan. I used BWA for the alignment. By "uniquely aligned positions" I mean what is described in this excerpt from the THetA manual:

To get the mapping positions of the uniquely mapped reads, a user may use the modified version of samtools (included in this package).
Suppose that the bam file is called example.bam and one wants to write the reads under the directory dir/. Then, one may use the following command to extract the mapping position of uniquely mapped reads.

samtools view -U BWA,dir/,N,N example.bam

or
samtools view -U Bowtie,dir/,N,N example.bam

**tahamasoodi** · 03-09-2014, 04:32 AM

Does anybody else have any ideas?

**dpryan** · 03-09-2014, 07:04 AM

For BWA, you might either just grep for "XT:A:U" or simply use a MAPQ threshold. The latter would generally make more sense since you're actually interested in the alignments being correct (in reality, the concept of a "unique" alignment is somewhat non-sensical and is derived from an arbitrary edit-distance threshold).
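As a rough sketch of the two options above, the filters below work on SAM text piped from a standard samtools. The function names `filter_by_xt_unique` and `filter_by_mapq` are made up for illustration, and the `XT` tag is assumed to be present, which as far as I know is only the case for `bwa aln`/`samse`/`sampe` output, not `bwa mem`.

```shell
# Keep header lines plus alignments that BWA tagged as unique (XT:A:U).
# Optional tags start at column 12 of a SAM record.
filter_by_xt_unique() {
  awk -F'\t' '
    /^@/ { print; next }
    { for (i = 12; i <= NF; i++) if ($i == "XT:A:U") { print; break } }
  '
}

# Keep header lines plus alignments whose MAPQ (column 5) is >= the cutoff in $1.
filter_by_mapq() {
  awk -F'\t' -v q="$1" '/^@/ || ($5 + 0) >= (q + 0)'
}

# Usage (assumes samtools is on PATH and example.bam exists):
#   samtools view -h example.bam | filter_by_xt_unique > xt_unique.sam
#   samtools view -h example.bam | filter_by_mapq 10   > mapq10.sam
```

Note that the two filters can disagree: a read tagged XT:A:U may still carry a low MAPQ, which is why the MAPQ cutoff is the more conservative of the two.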

**jwfoley** · 03-09-2014, 02:02 PM

I also strongly encourage a MAPQ threshold. "Uniquely mapping" is not nearly stringent enough, because there may be hundreds of next-best hits if it's just one base off from a repetitive element. I find MAPQ = 10 to be a reasonable cutoff that doesn't exclude anything real-looking.

So then it's simply "samtools view -q 10".

**tahamasoodi** · 03-09-2014, 11:44 PM

Thanks dpryan and jwfoley,

I actually want to use BICseq for CNV calling, and later use the results for a tumor purity estimate. Which do you think is better, a MAPQ threshold or "XT:A:U"?

Thanks

**dpryan** · 03-10-2014, 01:22 AM

MAPQ is the better choice. A threshold of 10 is pretty common, but you might be able to get away with a bit lower, depending on how the data looks. For CNVs, I would suspect that anywhere between 5 and 10 would work well since the occasional wrong mapping won't have much of an effect.
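One way to judge "how the data looks" is to count how many alignments survive each candidate cutoff before committing to one. The helper below is only a sketch: `count_by_mapq` is a hypothetical name, and it reads SAM text from a standard samtools on stdin.

```shell
# For each cutoff passed as an argument, print "cutoff<TAB>count", where count
# is the number of alignments whose MAPQ (SAM column 5) is >= that cutoff.
count_by_mapq() {
  awk -F'\t' -v cutoffs="$*" '
    BEGIN { n = split(cutoffs, c, " ") }
    /^@/  { next }                                  # skip header lines
    { for (i = 1; i <= n; i++) if (($5 + 0) >= (c[i] + 0)) kept[i]++ }
    END   { for (i = 1; i <= n; i++) printf "%s\t%d\n", c[i], kept[i] }
  '
}

# Usage (assumes samtools is on PATH and example.bam exists):
#   samtools view example.bam | count_by_mapq 5 10
```

If the counts at 5 and 10 are nearly the same, the choice between them is unlikely to matter much for binned read counts.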

**lethalfang** · 03-14-2014, 03:49 PM

Originally posted by tahamasoodi:

Thanks dpryan and jwfoley,

I actually want to use BICseq for CNV calling, and later use the results for a tumor purity estimate. Which do you think is better, a MAPQ threshold or "XT:A:U"?

Thanks

Have you managed to get BIC-seq to run? I'm having trouble.
First, I installed the R package Rsamtools.

Code:

# In an R shell:
source("http://bioconductor.org/biocLite.R")
biocLite("Rsamtools")

Then, I installed BICseq with the following command in the bash shell:

Code:

~/apps/R CMD INSTALL BICseq_1.2.1.tar.gz

Following the 2-page PDF instructions, I tried the sample BAM files that came with the zip file, but they did not seem to work.

Code:

> bicseq0 = BICseq(sample = '/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam', reference = '/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/normal_sorted.bam', seqNames = c(1:22, "X", "Y") )
Error in BICseq(sample = "/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam",  : 
  No such chromosome "1" in the header of the BAM file "/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam"

Fine.

Code:

> bicseq0 = BICseq(sample = '/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam', reference = '/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/normal_sorted.bam', seqNames = c("chr1") )
Error in BICseq(sample = "/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam",  : 
  No such chromosome "chr1" in the header of the BAM file "/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam"

So anyway, I tried it with my own BAM files:

Code:

> bicseq <- BICseq(sample = '/PATH/TO/Our_Tumor.bam', reference = '/PATH/TO/Our_Normal.bam', seqNames=c("chr21", "chr22") )

# The previous command worked, but the next one failed.
> seqs <- getBICseg(object=bicseq, bin=100, lambda=2, winSize=200, quant=0.95, mult=1)
Error in .C("sort_rms_binning", as.integer(sample), length(sample), as.integer(reference), : 
"sort_rms_binning" not resolved from current namespace (BICseq)

Okay, so what's wrong? Does anyone have any idea?

**aroraa** · 04-07-2014, 12:13 PM

Thanks for posting this. I tried this on their BAM files and on mine as well, and I am getting the same error: "sort_rms_binning" not resolved from current namespace (BICseq). Have you figured it out yet?

For loading their BAM files:
The BAM files they provide contain only chromosome 22, so try it like this:

bicseq <- BICseq(sample = try.tumor, reference = try.normal, seqNames=paste("chr",22,sep=""))
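To check which names a BAM header actually declares before choosing `seqNames`, one can list the `SN` fields of its `@SQ` lines. This is a minimal sketch assuming a standard samtools; `list_seq_names` is a hypothetical helper name.

```shell
# Print the reference sequence names (SN: fields of @SQ lines) found in
# SAM header text read from stdin, one per line.
list_seq_names() {
  awk -F'\t' '
    $1 == "@SQ" {
      for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
    }
  '
}

# Usage (assumes samtools is on PATH; -H prints only the header):
#   samtools view -H tumor_sorted.bam | list_seq_names
```

Whatever this prints is what `seqNames` has to match exactly, including any "chr" prefix.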

**lethalfang** · 04-07-2014, 12:15 PM

Originally posted by aroraa:

Thanks for posting this. I tried this on their BAM files and on mine as well, and I am getting the same error: "sort_rms_binning" not resolved from current namespace (BICseq). Have you figured it out yet?

For loading their BAM files:
The BAM files they provide contain only chromosome 22, so try it like this:

bicseq <- BICseq(sample = try.tumor, reference = try.normal, seqNames=paste("chr",22,sep=""))

Someone told me to try it on R version 2, because it worked for him there.
I haven't had time to try that yet. I'll do that in the next couple of weeks.

**lethalfang** · 05-05-2014, 09:39 AM

I tried it with R 2.15. At least that works with their sample data.

extracting uniquely mappable read positions bam file