  • Bowtie settings and SAM to GFF

    Hello everyone,

    I'm pretty new to bioinformatics so I will probably start asking a lot of questions on here. Hopefully in time I'll be able to start answering some as well.

    I have two connected questions:

    Question 1:
    I'm trying to align siRNAs to a genome using Bowtie. I have created the index for the genome and have aligned the siRNAs of the various files numerous times with different settings. My question is what settings would be ideal to use for the sequences I'm trying to align? I've tried different ones and all have given me a variable number of output alignments. Is there a good standard set of alignment parameters for these siRNAs?

    Each file to be aligned contains thousands of sequences in FASTA format, and each sequence is around 16-24 base pairs long.

    I've been using this command lately:
    bowtie -t -p 2 (genome index) -f -a -v 0 (FASTA file to align) --sam-nosq (output file)

    Question 2:
    My end goal is to create three-column BED files (chromosome, start position, and stop position) for the alignments using the output of Bowtie. What would be the best way to go about doing this? I've been looking for a tool that does this, but have been unsuccessful.
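The conversion asked about in Question 2 can be sketched in a few lines of Python. This is a minimal sketch, not the method anyone in this thread used. It assumes the Bowtie output is in SAM format (for example, produced with Bowtie's `-S` option), and the function names `sam_line_to_bed3` and `sam_to_bed3` are made up for illustration. SAM positions are 1-based, while BED starts are 0-based with an exclusive end, so the start is shifted by one and the end is derived from the CIGAR string.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference bases (per the SAM specification).
REF_CONSUMING = set("MDN=X")


def sam_line_to_bed3(line):
    """Return (chrom, start, end) for one SAM record.

    Returns None for header lines, malformed lines, and unmapped reads.
    """
    if line.startswith("@"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return None
    flag = int(fields[1])
    chrom = fields[2]
    pos = int(fields[3])
    cigar = fields[5]
    seq = fields[9]
    if flag & 4 or chrom == "*":
        return None  # read did not align
    if cigar == "*":
        ref_len = len(seq)  # fallback assumption: ungapped alignment
    else:
        ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar)
                      if op in REF_CONSUMING)
    start = pos - 1  # SAM is 1-based, BED is 0-based
    return chrom, start, start + ref_len


def sam_to_bed3(sam_path, bed_path):
    """Write a three-column BED file; return the number of records written."""
    written = 0
    with open(sam_path) as sam, open(bed_path, "w") as bed:
        for line in sam:
            rec = sam_line_to_bed3(line)
            if rec is not None:
                bed.write("%s\t%d\t%d\n" % rec)
                written += 1
    return written
```

Calling `sam_to_bed3("alignments.sam", "alignments.bed")` (both file names hypothetical) would write one BED line per aligned record. With `-a`, a read that aligns to several places produces several lines.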


  • #2
    Hi Brandon,

    Have a look at the Vancouver Short Read package.

    In this directory

    you'll find

    It worked very nicely for me.



    • #3

      Thank you so much for the recommendation. I've been trying to get it to work, but I've been running into problems. Have you seen this error before? It looks like something is wrong with my Java, but it is installed and working.

      bldf2b@kc-bio-bsb319debian:~/Desktop/VancouverShortR-4.0.10/conversion_util$ java -jar BowtieToBedFormat.jar -input /home/bldf2b/Desktop/ -output /home/bldf2b/Desktop/ -name testfile
      Exception in thread "main" java.lang.NoSuchMethodError: method<init> with signature (Ljava.lang.String; )V was not found.
      at src.lib.ioInterfaces.Log_Buffer.addLogFile(
      at src.fileUtilities.BowtieToBedFormat.main(


      • #4

        So what Java version are you using?

        java -version

        Are you using Sun Java?

        I had no problems with
        java version "1.6.0_16"
        Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
        Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)



        • #5
          Originally posted by DrD2009:
          Is there a good standard set of alignment parameters for these siRNAs?
          I'll defer to others on this - I've never tried it. Still, I would recommend simply trying several reasonable sets of parameters and evaluating whether you're getting the desired/best combination of speed and sensitivity. If you have a huge set of input reads and are tired of waiting for long experiments to finish, use -u to select a subset of them.
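The suggestion above (try several parameter sets on a subset of reads) could be scripted roughly as follows. This is only a sketch: the helper `bowtie_command` and the file names are hypothetical, and the fixed flags mirror the command from the first post, with `-S` added on the assumption that SAM output is wanted. Only the mismatch setting (`-v`) and the subset size (`-u`) are varied.

```python
def bowtie_command(index, fasta, out_sam, mismatches=0, subset=None, threads=2):
    """Build a Bowtie 1 command line as a list of arguments.

    mismatches -> value for -v (end-to-end mismatches allowed)
    subset     -> value for -u (stop after this many reads), or None for all
    """
    cmd = ["bowtie", "-t", "-p", str(threads), "-f", "-a",
           "-v", str(mismatches), "-S"]
    if subset is not None:
        cmd += ["-u", str(subset)]
    # Bowtie 1 expects: options, then index, then reads, then output file.
    cmd += [index, fasta, out_sam]
    return cmd


def sweep_commands(index, fasta, mismatch_values=(0, 1, 2), subset=10000):
    """Return one command per mismatch setting, each with its own output file."""
    return [
        bowtie_command(index, fasta, "trial_v%d.sam" % v,
                       mismatches=v, subset=subset)
        for v in mismatch_values
    ]


# Print the commands; each list could be passed to subprocess.run() to execute.
for cmd in sweep_commands("genome_index", "sirnas.fa"):
    print(" ".join(cmd))
```

Comparing the number of aligned reads in each `trial_v*.sam` file then gives a quick picture of how sensitive the results are to the mismatch setting before committing to a full run.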



          • #6
            To Colin:

            That was the problem. I tried to install Java onto Debian following the manual for installation, but I guess it didn't take because it was still using the Free Enterprise version when I ran "java -version". After trying to figure it out on Linux I gave up on it in the interest of time and was able to get it running on Windows. It's currently processing a file for me.

            To Ben:

            Thank you for getting back with me. I'll continue playing with the parameters.


            • #7

              It was taking a while to process the file, so I created a SAM file with just a few lines (~10) and ran the BowtieToBedFormat tool. It created all of the files, but they were all blank. Any ideas?

              C:\Documents and Settings\bldf2b\Desktop\VancouverShortR-4.0.11\conversion_util>
              java -jar BowtieToBedFormat.jar -input "C:\Documents and Settings\bldf2b\Desktop
              \trial.txt" -output "C:\Documents and Settings\bldf2b\Desktop" -name trial
              Version: Initializing class Log_Buffer $Revision: 1643 $
              Info: Log File: C:\Documents and Settings\bldf2b\Desktop\trial.log
              Version: Vancouver Short Read Analysis Package 4.0.11
              Version: Initializing class MaqPetToBedFormat $Revision: 1571 $
              Info: * Output directory : C:\Documents and Settings\bldf2b\Desktop\
              Info: * Name : trial
              Info: Flags expected "/1" and "/2"
              Version: Initializing class Histogram $Revision: 1197 $
              Version: Initializing class BowtieIterator $Revision: 1880 $
              Info: Processing file...
              Info: Creating FileWriters
              Version: Initializing class FileOut $Revision: 468 $
              Info: processing paired reads
              Info: Number of keys to process: 1
              Info: Total unpaired name: 1
              Info: Total simple name: 0
              Info: Total forked name: 0
              Info: Total complex name 0

              Could it be the way the file is processed from bowtie?


              • #8

                I really haven't used this extensively; happily, I didn't need to.

                Perhaps the name
                needs to occur on every line of the reads file?
                Some programs need a chromosome name for organisms with multiple chromosomes.

                The only other thing I can think of is putting all data in a simple directory like


                and running from there. Spaces can cause a LOT of problems with scripts.

