Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • tahamasoodi
    Success
    • May 2012
    • 130

    extracting uniquely mappable read positions bam file

    Hi,

    How can I extract uniquely mappable read positions from a bam file? I need this file as input to BIC-Seq for CNV analysis.

    Thanks,
    Thanks,
  • tahamasoodi
    Success
    • May 2012
    • 130

    #2
    Antbody having idea about this!
    Thanks,

    Comment

    • dpryan
      Devon Ryan
      • Jul 2011
      • 3478

      #3
      Depends on which aligner you used and which definition of "uniquely aligned" you want to use.

      Comment

      • tahamasoodi
        Success
        • May 2012
        • 130

        #4
        Thanks dpryan, I've used bwa for the alignmment and uniquely aligned positions you can understand from the clip of THeta manual

        To get the mapping positions of the uniquely mapped reads, a user may use the modified version of samtools (included in this package).
        Suppose that the bam file is called example.bam and one wants to write the reads under the directory dir/. Then, one may use the following command to extract the mapping position of uniquely mapped reads.

        samtools view -U BWA,dir/,N,N example.bam

        or
        samtools view -U Bowtie,dir/,N,N example.bam
        Thanks,

        Comment

        • tahamasoodi
          Success
          • May 2012
          • 130

          #5
          Anybody else any idea!
          Thanks,

          Comment

          • dpryan
            Devon Ryan
            • Jul 2011
            • 3478

            #6
            For BWA, you might either just grep for "XT:A:U" or simply use a MAPQ threshold. The latter would generally make more sense since you're actually interested in the alignments being correct (in reality, the concept of a "unique" alignment is somewhat non-sensical and is derived from an arbitrary edit-distance threshold).

            Comment

            • jwfoley
              Senior Member
              • Jun 2009
              • 183

              #7
              I also strongly encourage a MAPQ threshold. "Uniquely mapping" is not nearly stringent enough, because there may be hundreds of next-best hits if it's just one base off from a repetitive element. I find MAPQ = 10 to be a reasonable cutoff that doesn't exclude anything real-looking.

              So then it's simply "samtools view -q 10".

              Comment

              • tahamasoodi
                Success
                • May 2012
                • 130

                #8
                Thanks dpryan and jwfoley,

                I actually want to use BICseq for CNV which later I've to use for tumor purity estimate, so what do you think is better, MAPQ or "XT:A:U".

                Thanks
                Thanks,

                Comment

                • dpryan
                  Devon Ryan
                  • Jul 2011
                  • 3478

                  #9
                  MAPQ is the better choice. A threshold of 10 is pretty common, but you might be able to get away with a bit lower, depending on how the data looks. For CNVs, I would suspect that anywhere between 5 and 10 would work well since the occasional wrong mapping won't have much of an effect.

                  Comment

                  • lethalfang
                    Member
                    • Aug 2011
                    • 95

                    #10
                    Originally posted by tahamasoodi View Post
                    Thanks dpryan and jwfoley,

                    I actually want to use BICseq for CNV which later I've to use for tumor purity estimate, so what do you think is better, MAPQ or "XT:A:U".

                    Thanks
                    Have you managed to get BIC-seq to run? I have trouble.
                    Basically, I first installed the R package called Rsamtools.
                    Code:
                    # In an R shell:
                    source("http://bioconductor.org/biocLite.R")
                    biocLite("Rsamtools")
                    Then, I installed BICseq with the following command in the bash shell:
                    Code:
                    ~/apps/R CMD INSTALL BICseq_1.2.1.tar.gz

                    I tried the sample bam files that came with the zip file, but did not seem to work, and followed the 2-page PDF instruction.
                    Code:
                    > bicseq0 = BICseq(sample = '/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam', reference = '/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/normal_sorted.bam', seqNames = c(1:22, "X", "Y") )
                    Error in BICseq(sample = "/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam",  : 
                      No such chromosome "1" in the header of the BAM file "/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam"
                    Fine.

                    Code:
                    > bicseq0 = BICseq(sample = '/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam', reference = '/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/normal_sorted.bam', seqNames = c("chr1") )
                    Error in BICseq(sample = "/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam",  : 
                      No such chromosome "chr1" in the header of the BAM file "/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam"



                    So anyway, tried it with my own bam files:

                    Code:
                    > bicseq <- BICseq(sample = '/PATH/TO/Our_Tumor.bam', reference = '/PATH/TO/Our_Normal.bam', seqNames=c("chr21", "chr22") )
                    
                    # The previous command worked, but this happened. 
                    > seqs <- getBICseg(object=bicseq, bin=100, lambda=2, winSize=200, quant=0.95, mult=1)
                    Error in .C("sort_rms_binning", as.integer(sample), length(sample), as.integer(reference), : 
                    "sort_rms_binning" not resolved from current namespace (BICseq)

                    Okay, so what's wrong? Anyone has any idea?

                    Comment

                    • aroraa
                      Junior Member
                      • Apr 2014
                      • 1

                      #11
                      Thanks for posting this. I tried this on their bam files and mine as well.I am getting the same error. Have you figured it out yet ? error - "sort_rms_binning" not resolved from current namespace (BICseq)"

                      For loading their bam files:
                      If you see the bam files that they have provided it only has chromosome 22. so try it like this:

                      bicseq <- BICseq(sample = try.tumor, reference = try.normal, seqNames=paste("chr",22,sep=""))

                      Comment

                      • lethalfang
                        Member
                        • Aug 2011
                        • 95

                        #12
                        Originally posted by aroraa View Post
                        Thanks for posting this. I tried this on their bam files and mine as well.I am getting the same error. Have you figured it out yet ? error - "sort_rms_binning" not resolved from current namespace (BICseq)"

                        For loading their bam files:
                        If you see the bam files that they have provided it only has chromosome 22. so try it like this:

                        bicseq <- BICseq(sample = try.tumor, reference = try.normal, seqNames=paste("chr",22,sep=""))
                        Someone told me to try it on R version 2 because it worked there for him.
                        I haven't had time to try that yet. I'll do that in the next couple of weeks.

                        Comment

                        • lethalfang
                          Member
                          • Aug 2011
                          • 95

                          #13
                          Originally posted by aroraa View Post
                          Thanks for posting this. I tried this on their bam files and mine as well.I am getting the same error. Have you figured it out yet ? error - "sort_rms_binning" not resolved from current namespace (BICseq)"

                          For loading their bam files:
                          If you see the bam files that they have provided it only has chromosome 22. so try it like this:

                          bicseq <- BICseq(sample = try.tumor, reference = try.normal, seqNames=paste("chr",22,sep=""))
                          I tried it with R 2.15. At least that works with their sample data.

                          Comment

                          Latest Articles

                          Collapse

                          ad_right_rmr

                          Collapse

                          News

                          Collapse

                          Topics Statistics Last Post
                          Started by SEQadmin2, 06-09-2026, 11:58 AM
                          0 responses
                          24 views
                          0 reactions
                          Last Post SEQadmin2  
                          Started by SEQadmin2, 06-05-2026, 10:09 AM
                          0 responses
                          29 views
                          0 reactions
                          Last Post SEQadmin2  
                          Started by SEQadmin2, 06-04-2026, 08:59 AM
                          0 responses
                          39 views
                          0 reactions
                          Last Post SEQadmin2  
                          Started by SEQadmin2, 06-02-2026, 12:03 PM
                          0 responses
                          61 views
                          0 reactions
                          Last Post SEQadmin2  
                          Working...