Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • extracting uniquely mappable read positions bam file

    Hi,

    How can I extract uniquely mappable read positions from a bam file? I need this file as input to BIC-Seq for CNV analysis.

    Thanks,
    Thanks,

  • #2
    Antbody having idea about this!
    Thanks,

    Comment


    • #3
      Depends on which aligner you used and which definition of "uniquely aligned" you want to use.

      Comment


      • #4
        Thanks dpryan, I've used bwa for the alignmment and uniquely aligned positions you can understand from the clip of THeta manual

        To get the mapping positions of the uniquely mapped reads, a user may use the modified version of samtools (included in this package).
        Suppose that the bam file is called example.bam and one wants to write the reads under the directory dir/. Then, one may use the following command to extract the mapping position of uniquely mapped reads.

        samtools view -U BWA,dir/,N,N example.bam

        or
        samtools view -U Bowtie,dir/,N,N example.bam
        Thanks,

        Comment


        • #5
          Anybody else any idea!
          Thanks,

          Comment


          • #6
            For BWA, you might either just grep for "XT:A:U" or simply use a MAPQ threshold. The latter would generally make more sense since you're actually interested in the alignments being correct (in reality, the concept of a "unique" alignment is somewhat non-sensical and is derived from an arbitrary edit-distance threshold).

            Comment


            • #7
              I also strongly encourage a MAPQ threshold. "Uniquely mapping" is not nearly stringent enough, because there may be hundreds of next-best hits if it's just one base off from a repetitive element. I find MAPQ = 10 to be a reasonable cutoff that doesn't exclude anything real-looking.

              So then it's simply "samtools view -q 10".

              Comment


              • #8
                Thanks dpryan and jwfoley,

                I actually want to use BICseq for CNV which later I've to use for tumor purity estimate, so what do you think is better, MAPQ or "XT:A:U".

                Thanks
                Thanks,

                Comment


                • #9
                  MAPQ is the better choice. A threshold of 10 is pretty common, but you might be able to get away with a bit lower, depending on how the data looks. For CNVs, I would suspect that anywhere between 5 and 10 would work well since the occasional wrong mapping won't have much of an effect.

                  Comment


                  • #10
                    Originally posted by tahamasoodi View Post
                    Thanks dpryan and jwfoley,

                    I actually want to use BICseq for CNV which later I've to use for tumor purity estimate, so what do you think is better, MAPQ or "XT:A:U".

                    Thanks
                    Have you managed to get BIC-seq to run? I have trouble.
                    Basically, I first installed the R package called Rsamtools.
                    Code:
                    # In an R shell:
                    source("http://bioconductor.org/biocLite.R")
                    biocLite("Rsamtools")
                    Then, I installed BICseq with the following command in the bash shell:
                    Code:
                    ~/apps/R CMD INSTALL BICseq_1.2.1.tar.gz

                    I tried the sample bam files that came with the zip file, but did not seem to work, and followed the 2-page PDF instruction.
                    Code:
                    > bicseq0 = BICseq(sample = '/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam', reference = '/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/normal_sorted.bam', seqNames = c(1:22, "X", "Y") )
                    Error in BICseq(sample = "/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam",  : 
                      No such chromosome "1" in the header of the BAM file "/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam"
                    Fine.

                    Code:
                    > bicseq0 = BICseq(sample = '/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam', reference = '/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/normal_sorted.bam', seqNames = c("chr1") )
                    Error in BICseq(sample = "/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam",  : 
                      No such chromosome "chr1" in the header of the BAM file "/home/ltfang/Documents/DOWNLOAD/BICseq/test_data/tumor_sorted.bam"



                    So anyway, tried it with my own bam files:

                    Code:
                    > bicseq <- BICseq(sample = '/PATH/TO/Our_Tumor.bam', reference = '/PATH/TO/Our_Normal.bam', seqNames=c("chr21", "chr22") )
                    
                    # The previous command worked, but this happened. 
                    > seqs <- getBICseg(object=bicseq, bin=100, lambda=2, winSize=200, quant=0.95, mult=1)
                    Error in .C("sort_rms_binning", as.integer(sample), length(sample), as.integer(reference), : 
                    "sort_rms_binning" not resolved from current namespace (BICseq)

                    Okay, so what's wrong? Anyone has any idea?

                    Comment


                    • #11
                      Thanks for posting this. I tried this on their bam files and mine as well.I am getting the same error. Have you figured it out yet ? error - "sort_rms_binning" not resolved from current namespace (BICseq)"

                      For loading their bam files:
                      If you see the bam files that they have provided it only has chromosome 22. so try it like this:

                      bicseq <- BICseq(sample = try.tumor, reference = try.normal, seqNames=paste("chr",22,sep=""))

                      Comment


                      • #12
                        Originally posted by aroraa View Post
                        Thanks for posting this. I tried this on their bam files and mine as well.I am getting the same error. Have you figured it out yet ? error - "sort_rms_binning" not resolved from current namespace (BICseq)"

                        For loading their bam files:
                        If you see the bam files that they have provided it only has chromosome 22. so try it like this:

                        bicseq <- BICseq(sample = try.tumor, reference = try.normal, seqNames=paste("chr",22,sep=""))
                        Someone told me to try it on R version 2 because it worked there for him.
                        I haven't had time to try that yet. I'll do that in the next couple of weeks.

                        Comment


                        • #13
                          Originally posted by aroraa View Post
                          Thanks for posting this. I tried this on their bam files and mine as well.I am getting the same error. Have you figured it out yet ? error - "sort_rms_binning" not resolved from current namespace (BICseq)"

                          For loading their bam files:
                          If you see the bam files that they have provided it only has chromosome 22. so try it like this:

                          bicseq <- BICseq(sample = try.tumor, reference = try.normal, seqNames=paste("chr",22,sep=""))
                          I tried it with R 2.15. At least that works with their sample data.

                          Comment

                          Latest Articles

                          Collapse

                          • seqadmin
                            Genetic Variation in Immunogenetics and Antibody Diversity
                            by seqadmin



                            The field of immunogenetics explores how genetic variations influence immune responses and susceptibility to disease. In a recent SEQanswers webinar, Oscar Rodriguez, Ph.D., Postdoctoral Researcher at the University of Louisville, and Ruben Martínez Barricarte, Ph.D., Assistant Professor of Medicine at Vanderbilt University, shared recent advancements in immunogenetics. This article discusses their research on genetic variation in antibody loci, antibody production processes,...
                            11-06-2024, 07:24 PM
                          • seqadmin
                            Choosing Between NGS and qPCR
                            by seqadmin



                            Next-generation sequencing (NGS) and quantitative polymerase chain reaction (qPCR) are essential techniques for investigating the genome, transcriptome, and epigenome. In many cases, choosing the appropriate technique is straightforward, but in others, it can be more challenging to determine the most effective option. A simple distinction is that smaller, more focused projects are typically better suited for qPCR, while larger, more complex datasets benefit from NGS. However,...
                            10-18-2024, 07:11 AM

                          ad_right_rmr

                          Collapse

                          News

                          Collapse

                          Topics Statistics Last Post
                          Started by seqadmin, Today, 11:09 AM
                          0 responses
                          22 views
                          0 likes
                          Last Post seqadmin  
                          Started by seqadmin, Today, 06:13 AM
                          0 responses
                          20 views
                          0 likes
                          Last Post seqadmin  
                          Started by seqadmin, 11-01-2024, 06:09 AM
                          0 responses
                          30 views
                          0 likes
                          Last Post seqadmin  
                          Started by seqadmin, 10-30-2024, 05:31 AM
                          0 responses
                          21 views
                          0 likes
                          Last Post seqadmin  
                          Working...
                          X