Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculate number of multi-mapped reads?

    Hi, does anyone know how to calculate the number of multi-mapped reads from your bam file?

    This seems like an obvious/silly question but the latest versions of Cufflinks no longer provides this statistic in the basic output and I have been looking everywhere (log files, samtools stats, this website, etc) but can't figure out how to find this out!

    Thanks!

  • #2
    Although primarily to tally up read counts for genes, HT-seq count also adds up the multimap reads "alignment not unique."

    Comment


    • #3
      Thanks! I'll look into it.

      Comment


      • #4
        Something like this might work.

        Code:
        samtools view -F 4 file.bam | awk '{printf $1"\n"}' | sort | uniq -d | wc -l
        
        Explanation
        -----------
        samtools view -F 4 file.bam
            print all lines in SAM format except those flagged as unmapped
        
        awk '{printf $1"\n"}'
            print the first field in the line (the read name)
        
        sort | uniq -d | wc -l
            sort the read names
            print only the duplicates (i.e. those appearing more than once)
            count the lines

        Comment


        • #5
          Thanks, great idea and thanks for the detailed, explanation! I'll give it a try

          Comment


          • #6
            Retrieve mapped reads to certain regions (eg. between 120-140 in a gene)?

            I have my own unannotated reference data base. With it, the reads were mapped with Bowtie, now I need to retrieve all reads which mapped to certain region between 120-140 nt/each gene, how can I do it with samtools ? Thanks in advance.

            Comment


            • #7
              Make a BED file with the gene locations, flank that with bedtools (or awk), and then use it with the "-L" option to "samtools view".

              Comment


              • #8
                more specifics

                Would you please give me more details ? Thanks.

                Comment


                • #9
                  Which part in particular requires more detail?

                  Comment


                  • #10
                    example

                    >Sg01_138835_A_G
                    TGCGAAGTATGCCACCAACATGTTTCACCGCATAACCTTGATCACCTTGGACATTCTGCCACAAGCAGACGACGAGTTCCTCAGAAAATTCATGGACAAATTTGGGCGCACGACCATCTA [GATCGAAAATCTTAATCCCG]AGCTAGTGTTGCATTCCATGCTCCTCGTCTTACACTCCAACAAGTTTGCAGAAAGTTTGTCAAAAATCCACTCGGTAGCATGGACGAGCTGCACGAGTGAGCCAAGGGTTACATATAGATGGAAGAAATGTCCAAATTCTGGAAAGAAGTCCGTCAAGCCGAACACAAGCACGAAAAGTGCAAAGCAAACTCCAAGGCCGACTTGCACAAGTCAAACAAGAGGCCCAATTCAGACAAGCGCTAACCTCTCCCCAAAGAGC

                    I want to retrieve reads aligned to [GATCGAAAATCTTAATCCCG] within gene above.

                    Comment


                    • #11
                      If you aligned to the gene itself, then "samtools view alignments.bam Sg01_138835_A_G:120-140". If you aligned to the genome then find out where in the genome that is.

                      Comment


                      • #12
                        how about multiple genes in reference

                        >Sg01_138835_A_G
                        TGCGAAGTATGCCACCAACATGTTTCACCGCATAACCTTGATCACCTTGGACATTCTGCCACAAGCAGACGACGAGTTCCTCAGAAAATTCATGGACAAATTTGGGCGCACGACCATCTAGATCGAAAATCTTAATCCCGAGCTAGTGTTGCATTCCATGCTCCTCGTCTTACACTCCAACAAGTTTGCAGAAAGTTTGTCAAAAATCCACTCGGTAGCATGGACGAGCTGCACGAGTGAGCCAAGGGTTACATATAGATGGAAGAAATGTCCAAATTCTGGAAAGAAGTCCGTCAAGCCGAACACAAGCACGAAAAGTGCAAAGCAAACTCCAAGGCCGACTTGCACAAGTCAAACAAGAGGCCCAATTCAGACAAGCGCTAACCTCTCCCCAAAGAGC
                        >Sg01_281790_T_C
                        CATCTATCTATTGCTTCAACCATGAAAATAATTATTTTCAAAGGCTTGAGTAATAATGTTTACCAATATAATGCCTATAATAATGTTACAAAATTCTAGTTTATATACCAAGTTTGATCGATGTTTGTAAAATTTAGCATATTCTTGCCTTGAATATCCTTTTGACACACACAAATACCATATAACTAACCCACAATATTCTGACTTATATGGGTGCAGCGGATCATTTGAGCGAAGCTGTGGGCGACAACGAATCCAAGGAACTTTTTGAATCATTTCGAGTCTTCTGCAATAGAAAAAGTGTTAGTACTAATGCATTTTCATCATTCAACAATTAATATAAATATATATATATATAGAGAGAGAGAGAGAGAGAGAAGTGCACTCATGAACAAAATCC
                        >Sg01_334575_A_G
                        ATTTTCTTAGTCCGCAAAAGTTAAACAGTATGTTAAGAATATAGGTTAAGAAACTTAAAAGTAAAAATATTTTTATTAAGAGGTGAAAAATTTTGTACAATAACTTGACTTGGAAAATTCTGCAGTTCCTGTTCTTCCAAATGTAATCTATGGACTATGGCAACAACAAAACTCAGAGACGCTACGTCCATTACACTTCACTTGGACAGTAAGTTAGACATCAACCAGGGGCCAAATTCTTGATGAAAAAAGCCTGCACGAACAGCATCTGGGATTAAAACATTCCAGACCTCCTTAGCCGCATAACATTTGGATGGATCATATAATTATATGGATAGATAACTATATGGATCATAATTAATTCAGCATAATTTTTTTTAATACTCTCTAAGCGCCTCTC
                        >Sg01_402061_A_C
                        CATAACCTGATAATATATTTTATATATCTACTATGAAAATAAATAGAATATCATATATACTAAGTAAGAAACGATCTGGTGTGTATCAAACAATGAATAATAAAATGTTGTTATCCAATTCTTATATAATATCTTATTTTAAGCAACAATCAAACAATGAATATTTTAAGAACTTATTTATCATGTTCTTATGGAGGCCTCTAGGGTAAGTCGTGACTGTATAGCATAGAAAATAAAGCTGCAGGAGCGTGGTATATCTTGTCAGAGCATGTATGTGTCTTGAGTGTGAAAGAGGGTTGGAGAAAGCAAAGGAATGTTGGAGGTCATTATTGGCTTCAAGCAGAGAGTGCTTACGAACCTATTAAAGTTGGAGCAGTGAGATATATAGTATAGTTGAGCT

                        If I want retrieve reads mapped to same region on multiple (several thousands) genes, do I need to list all, like this, Sg01_138835_A_G:120-140, Sg01_281790_T_C:120-140 ... ?

                        Comment


                        • #13
                          Make a BED file of the regions and use that with the "-L" option.

                          Comment


                          • #14
                            thanks.

                            I will do that.

                            Comment

                            Latest Articles

                            Collapse

                            • seqadmin
                              Recent Advances in Sequencing Analysis Tools
                              by seqadmin


                              The sequencing world is rapidly changing due to declining costs, enhanced accuracies, and the advent of newer, cutting-edge instruments. Equally important to these developments are improvements in sequencing analysis, a process that converts vast amounts of raw data into a comprehensible and meaningful form. This complex task requires expertise and the right analysis tools. In this article, we highlight the progress and innovation in sequencing analysis by reviewing several of the...
                              Today, 07:48 AM
                            • seqadmin
                              Essential Discoveries and Tools in Epitranscriptomics
                              by seqadmin




                              The field of epigenetics has traditionally concentrated more on DNA and how changes like methylation and phosphorylation of histones impact gene expression and regulation. However, our increased understanding of RNA modifications and their importance in cellular processes has led to a rise in epitranscriptomics research. “Epitranscriptomics brings together the concepts of epigenetics and gene expression,” explained Adrien Leger, PhD, Principal Research Scientist...
                              04-22-2024, 07:01 AM

                            ad_right_rmr

                            Collapse

                            News

                            Collapse

                            Topics Statistics Last Post
                            Started by seqadmin, Today, 07:17 AM
                            0 responses
                            11 views
                            0 likes
                            Last Post seqadmin  
                            Started by seqadmin, 05-02-2024, 08:06 AM
                            0 responses
                            19 views
                            0 likes
                            Last Post seqadmin  
                            Started by seqadmin, 04-30-2024, 12:17 PM
                            0 responses
                            20 views
                            0 likes
                            Last Post seqadmin  
                            Started by seqadmin, 04-29-2024, 10:49 AM
                            0 responses
                            28 views
                            0 likes
                            Last Post seqadmin  
                            Working...
                            X