  • anurupa
    Member
    • Jan 2012
    • 14

    bowtie --very-sensitive option

    Hi all,
    I have paired-end data, with each read 100 nt long.

    I am using ./bowtie2 -q -D 20 -R 3 -N 0 -L 20 -i S,1,0.50 -x mygenome -1 a.fastq -2 b.fastq

    With this command, do I also have to specify the -M 1 option, since I want only the 1 best valid alignment?

    What is the benefit of getting the output in SAM format?
  • amitm
    Member
    • Feb 2011
    • 52

    #2
    -M reporting mode

    Hi,
    -M is a reporting mode: with -M n, bowtie2 searches for up to n+1 distinct valid alignments per read and reports the best one. So to be more confident that the reported alignment really is the best, pass a larger number to -M. For example, -M 5 means: search for up to 6 valid alignments and report the best one. A higher value means bowtie2 has searched the alignment space more extensively before picking the best one. But remember that higher values slow bowtie2 down significantly.
    I used -M 5. For complex organisms with high repeat content, it makes sense to pass such a value to do justice to reads originating from repeat regions, which can have more than 1 valid alignment. For microbes, a smaller -M value would be fine.
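
As a rough sketch of the -M 5 suggestion, assuming a Bowtie 2 release that accepts -M as a reporting option (as the version in this thread does), the original command could be extended as below. The output file name out_M5.sam is hypothetical, and the script only prints the command so it can be reviewed before running.

```shell
# Index and read file names come from the thread; out_M5.sam is a hypothetical output name.
index=mygenome
mate1=a.fastq
mate2=b.fastq
m_value=5   # -M 5: the search stops after at most 5+1 distinct valid alignments per read

cmd="./bowtie2 -q -D 20 -R 3 -N 0 -L 20 -i S,1,0.50 -M $m_value -x $index -1 $mate1 -2 $mate2 -S out_M5.sam"

# Print the command for review; nothing is aligned until you run it yourself.
echo "$cmd"
```

Changing m_value and re-running with the same reads is a simple way to see how the reporting limit affects run time and output size on your own data.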

    As for the SAM format, it is the de facto standard format for alignment results. Output in this format lets you pipe the results to downstream programs such as transcript assemblers or variant callers.
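
To illustrate why SAM is convenient for downstream tools: it is plain tab-delimited text with header lines starting with "@", so even standard Unix tools can read it. The sketch below uses a tiny made-up SAM file (the read name r1, reference chr1 and all values are invented for illustration), not real bowtie2 output.

```shell
# Write a tiny, made-up SAM file: one header line plus one properly paired read pair.
sam_file=example.sam
printf '@SQ\tSN:chr1\tLN:1000\n' > "$sam_file"
printf 'r1\t99\tchr1\t100\t42\t100M\t=\t300\t300\t*\t*\n' >> "$sam_file"
printf 'r1\t147\tchr1\t300\t42\t100M\t=\t100\t-300\t*\t*\n' >> "$sam_file"

# Header lines start with "@"; every other line is one alignment record.
n_records=$(grep -vc '^@' "$sam_file")
echo "alignment records: $n_records"

# Columns 1, 3, 4 and 5 are read name, reference name, position and mapping quality.
summary=$(awk -F'\t' '!/^@/ { print $1, $3, $4, $5 }' "$sam_file")
echo "$summary"
```

The same column layout is what assemblers and variant callers rely on, which is why they can consume the output of any aligner that writes SAM.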

    • anurupa
      Member
      • Jan 2012
      • 14

      #3
      Hi, actually I find it a little peculiar, because when I submit the command like this:
      ./bowtie2 -q -D 20 -R 3 -N 0 -L 20 -i S,1,0.50 -x mygenome -1 a.fastq -2 b.fastq

      it gave me 17 GB of output, and when I just used -M 1 it gave me 2.2 GB of output.
      My reference genome is a mammalian genome, so it would have a large number of repeats.
      I am wondering whether this much difference is possible. As I said, I am aligning paired ends (end-to-end mode) and I don't want any discordant alignments (--no-discordant).

      • amitm
        Member
        • Feb 2011
        • 52

        #4
        Is your data RNA-Seq? If yes, then you should be using the TopHat aligner (or any other splice-junction-aware aligner).

        What I can guess is that in your first command line, "-D 20 -R 3 ..... ",
        you are using the bowtie2 --very-sensitive mode. Here you are asking bowtie2 to search extensively for the best alignment for each read. This basically means that bowtie2 would find more than 1 valid alignment for each read and then report the best one, as the default reporting mode is -M.

        But when you explicitly specify -M 1, you are limiting bowtie2: you are telling it to search for only n+1, i.e. 2, valid alignments and then report the best one. So even if you use the --very-sensitive mode, passing -M 1 essentially keeps bowtie2 from being sensitive, let alone very sensitive.

        Since this is paired-end data, possibly -M 1 is returning non-discordant alignments for the mates, and your --no-discordant option leaves only 2.2 GB as output.
        Moreover, you haven't passed the mate orientation option (--fr etc.) or the minimum and maximum fragment length (-I and -X options). I am not sure whether the default values bowtie2 sets for these parameters are optimal for your dataset.
        Pass these parameters and check. And if you are using the sensitive/very-sensitive mode, leave out the -M option; bowtie2 will return the best alignment anyway.
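
The advice above could be put together roughly as follows. This is only a sketch: --very-sensitive is the preset equivalent of "-D 20 -R 3 -N 0 -L 20 -i S,1,0.50" in end-to-end mode, and --fr, -I 0 and -X 500 are written out explicitly even though they match bowtie2's documented defaults, so that they are easy to change for a given library. The output name aligned.sam is hypothetical.

```shell
# Paired-end settings written out explicitly; adjust them to your library preparation.
orientation="--fr"   # mate orientation (bowtie2 default)
min_frag=0           # -I: minimum fragment length (default 0)
max_frag=500         # -X: maximum fragment length (default 500)

# -M is deliberately left out, so the reporting limit stays at the program default.
cmd="./bowtie2 -q --very-sensitive $orientation -I $min_frag -X $max_frag --no-discordant -x mygenome -1 a.fastq -2 b.fastq -S aligned.sam"

# Print the command for review; nothing is aligned until you run it yourself.
echo "$cmd"
```

If the library has longer inserts than 500 bp, raising max_frag is the first thing to try, since concordant pairs beyond -X would otherwise be treated as discordant.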

        • anurupa
          Member
          • Jan 2012
          • 14

          #5
          It's not RNA-Seq data. I am using --no-discordant to make sure that it considers the paired-end mode anyway. Your description of -M has made me realize the problem. As for the -I and -X options, I agree with the defaults (min 0 and max 500), so I am not specifying them. Thank you very much for your replies, really helpful.
