
  • Bedtools with spliced junctions reads

    Hi all,

    I am trying STAR to analyze my RNA-seq data, and it seems that when a read spans an exon-exon junction, STAR fills the gap in the read with N (which increases the apparent size of the read). My problem is that when I use genomeCoverageBed to get the per-base coverage, the output shows a lot of reads in the introns.

    Do you know if there is a way to change the output of STAR (I didn't see anything, but maybe I read the manual too quickly), such as splitting the read into two reads, each with its own start and end positions?

    Thanks in advance.

    S.

  • #2
    Note the -split option.



    • #3
      Thanks. I don't know why I was focusing on the STAR output... Indeed, genomeCoverageBed has the option itself. I will test it.

      Another (probably stupid) question, since I am now starting to doubt my pipeline: when I use intersectBed, it takes the starting point of the mapped read, right? So in that case there is no problem with gapped alignments? Meaning that if I use the -split option, it may count the reads twice? Or am I totally wrong?
      Last edited by SylvainL; 03-13-2015, 05:13 AM.



      • #4
        Alignments in a BAM file describe intervals, not just single points, so it'll take the whole thing. Note again the -split option, which should be present for any bedtools command that accepts BAM files.
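To make the "intervals, not points" idea concrete, here is a minimal Python sketch of how a spliced alignment can be broken into blocks at the CIGAR `N` operation. The function names and the example read are hypothetical, and this is only an illustration of the concept, not bedtools' actual implementation (bedtools may also treat other operations, such as `D`, as block boundaries in some subcommands).

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def aligned_blocks(start, cigar):
    """Return 0-based, half-open reference blocks, breaking at each N."""
    blocks = []
    block_start = pos = start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "MD=X":      # consumes reference inside the current block
            pos += length
        elif op == "N":       # skipped region (intron): close the block
            if pos > block_start:
                blocks.append((block_start, pos))
            pos += length
            block_start = pos
        # I, S, H and P do not consume reference positions
    if pos > block_start:
        blocks.append((block_start, pos))
    return blocks

def full_span(start, cigar):
    """Return the single interval from the first to the last aligned base."""
    blocks = aligned_blocks(start, cigar)
    return (blocks[0][0], blocks[-1][1])

# Hypothetical read: 50 bases, a 1000-base intron, then 50 more bases.
print(full_span(100, "50M1000N50M"))       # (100, 1200)
print(aligned_blocks(100, "50M1000N50M"))  # [(100, 150), (1150, 1200)]
```

Without splitting, the read is treated as the whole 1100-base span, intron included; with splitting, only the two 50-base blocks are used.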



        • #5
          OK, so I misunderstood. So you advise me to use the -split option for intersectBed as well...

          I quickly ran a test with genomeCoverageBed using the -split option (I want to generate genomic bedgraphs), and it still does not seem to split the gapped reads... I'm getting lost.



          • #6
            Well, you can use the -split option anytime you're dealing with spliced alignments, since you don't usually care if the spliced portion happens to overlap something (thereby increasing coverage there). If bedtools isn't handling spliced alignments correctly then that's a bug. A better question is what you're trying to achieve and if there's a simpler way.
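The effect on coverage can be sketched with a toy example. The coordinates below are made up, and the `coverage` helper is a hypothetical stand-in for what a per-base coverage tool computes, not bedtools code.

```python
from collections import Counter

def coverage(intervals):
    """Per-base depth for a list of 0-based, half-open intervals."""
    depth = Counter()
    for start, end in intervals:
        for pos in range(start, end):
            depth[pos] += 1
    return depth

# Hypothetical spliced read: two 5-base exon blocks around a 10-base intron.
blocks = [(0, 5), (15, 20)]
intron = range(5, 15)

unsplit = coverage([(blocks[0][0], blocks[-1][1])])  # read as one interval
split = coverage(blocks)                             # read as separate blocks

print(sum(unsplit[p] for p in intron))  # 10
print(sum(split[p] for p in intron))    # 0
```

Treating the read as one interval adds depth at every intronic base, which is the symptom described in the first post; treating it as blocks leaves the intron uncovered.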



            • #7
              Ok,

              I am using my own gene model. I made a BED file with the regions of interest.

              I want to do 2 things:
              - get the item counts (also considering the gapped reads, but counting each only once)
              - get the per-base genome coverage, so the researcher can have a quick look in IGV

              Until now, I was using
              bamToBed -i *.bam | intersectBed -a region_of_interest.bed -b stdin -c > counts
              to get the first one, and
              genomeCoverageBed -ibam *.bam -bg -strand + -g ${Refname}_chromInfo.txt > bedgraph_plus
              for the second.

              I changed the second command to
              genomeCoverageBed -ibam *.bam -bg -strand + -split -g ${Refname}_chromInfo.txt > bedgraph_plus

              Hope I'm clear enough
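The first goal (count a gapped read, but only once) can be illustrated with a small Python sketch. The region, the reads, and the helper names are hypothetical; how bedtools itself counts depends on the version and the options used, so this only shows the intended behaviour.

```python
def overlaps(a, b):
    """True if two 0-based, half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]

def count_by_blocks(region, reads):
    """Count each read at most once, if any of its blocks hits the region."""
    return sum(1 for blocks in reads
               if any(overlaps(region, blk) for blk in blocks))

def count_by_span(region, reads):
    """Count reads using their full span, introns included."""
    return sum(1 for blocks in reads
               if overlaps(region, (blocks[0][0], blocks[-1][1])))

region = (100, 200)
reads = [
    [(90, 120), (180, 210)],  # both blocks overlap the region: counted once
    [(50, 80), (300, 330)],   # only the intron crosses the region
    [(150, 160)],             # unspliced read inside the region
]

print(count_by_blocks(region, reads))  # 2
print(count_by_span(region, reads))    # 3
```

The second read is the problematic case: it touches the region only through its intron, so a span-based count includes it while a block-based count does not.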



              • #8
                I'm thinking about one more thing. I'm generating the bam file directly from STAR. Can it be a problem?



                • #9
                  Generating the BAM file from STAR is fine (STAR isn't the cause of any of your problems).

                  For the counts, is there a reason you're not just using featureCounts? Granted, it takes a GTF file rather than a BED file for the annotation, but the conversion is simple enough and you can use that to double-check intersectBed (since featureCounts is designed around RNA-seq).

                  Anyway, I would suspect that adding -split to what you have would cause things to function as they should, but perhaps there's a bug in bedtools somewhere.
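The BED-to-GTF conversion mentioned above could look roughly like the following Python sketch. The function name, the `source` value, and the fallback region names are assumptions made for illustration; it writes each BED line as an `exon` feature with a `gene_id` attribute, which to my knowledge matches featureCounts' default feature type and grouping attribute, but check the options of your installed version.

```python
def bed_to_gtf(bed_lines, source="custom"):
    """Convert BED lines (0-based, half-open) to GTF lines (1-based, inclusive)."""
    gtf = []
    for n, line in enumerate(bed_lines, start=1):
        # Skip blank lines and BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name = fields[3] if len(fields) > 3 else f"region_{n}"  # hypothetical fallback
        strand = fields[5] if len(fields) > 5 else "."
        attributes = f'gene_id "{name}";'
        gtf.append("\t".join([
            chrom, source, "exon",
            str(start + 1),  # BED start is 0-based, GTF start is 1-based
            str(end),        # BED end is exclusive, GTF end is inclusive
            ".", strand, ".", attributes,
        ]))
    return gtf

# Hypothetical region of interest.
for row in bed_to_gtf(["chr1\t99\t200\tgeneA\t0\t+"]):
    print(row)
```

Regions that share a name end up with the same `gene_id`, so they would be summarized together as one gene.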



                  • #10
                    No particular reason to not use featureCounts. I will have a look...

                    I am using bedtools version v2.14.2. Maybe I should update it...



                    • #11
                      Highly recommended



                      • #12


                        Updating bedtools fixed the problem!!!

                        Thanks a lot, dpryan.
