  • Bowtie settings and SAM to GFF

    Hello everyone,

    I'm pretty new to bioinformatics so I will probably start asking a lot of questions on here. Hopefully in time I'll be able to start answering some as well.

    I have two connected questions:

    Question 1:
    I'm trying to align siRNAs to a genome using Bowtie. I have created the index for the genome and have aligned the siRNAs of the various files numerous times with different settings. My question is what settings would be ideal to use for the sequences I'm trying to align? I've tried different ones and all have given me a variable number of output alignments. Is there a good standard set of alignment parameters for these siRNAs?

    The files to be aligned are in FASTA format. Each contains thousands of sequences, and the sequences are around 16-24 bases long.

    I've been using this command lately:
    bowtie -t -p 2 (genome index) -f -a -v 0 (FASTA file to align) --sam-nosq (output file)


    Question 2:
    My end goal is to create three column BED files (chromosome, start position, and stop position) for the alignments using the output of Bowtie. What would be the best way to go about doing this? I've been trying to find one, but have been unsuccessful.


    Thanks,
    Brandon

  • #2
    Hi Brandon,

    Have a look at the Vancouver Short Read Analysis Package.

    In this directory
    VancouverShortR-4.0.7/conversion_util

    you'll find
    BowtieToBedFormat.jar

    It worked very nicely for me.

    Colin
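For readers who want the three-column BED described in Question 2 without an extra tool, the conversion from SAM can be sketched in a few lines of Python. This is a minimal sketch, not the method used by BowtieToBedFormat.jar: it assumes the Bowtie output really is SAM (which needs Bowtie's -S option), and the function names `sam_line_to_bed3` and `sam_to_bed3` are hypothetical. SAM positions are 1-based, while BED starts are 0-based with an exclusive end, so the start is shifted by one.

```python
import re

# CIGAR operations; only M, D, N, = and X consume reference bases.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def sam_line_to_bed3(line):
    """Return (chrom, start, end) for an aligned SAM record, else None."""
    if line.startswith("@"):          # header line
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:              # not a complete SAM record
        return None
    flag = int(fields[1])
    chrom, pos, cigar, seq = fields[2], int(fields[3]), fields[5], fields[9]
    if flag & 0x4 or chrom == "*":    # read is unmapped
        return None
    if cigar == "*":
        # No CIGAR available: fall back to the read length.
        ref_len = len(seq)
    else:
        ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar)
                      if op in REF_CONSUMING)
    start = pos - 1                   # 1-based SAM -> 0-based BED
    return chrom, start, start + ref_len


def sam_to_bed3(lines):
    """Yield tab-separated BED3 lines for every aligned record."""
    for line in lines:
        interval = sam_line_to_bed3(line)
        if interval is not None:
            yield "\t".join(str(x) for x in interval)


# Small in-memory example with made-up records.
example = [
    "@HD\tVN:1.0",
    "read1\t0\tchr1\t100\t255\t21M\t*\t0\t0\t" + "A" * 21 + "\t" + "I" * 21,
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
for bed_line in sam_to_bed3(example):
    print(bed_line)
```

To convert a real file, pass an open file handle to `sam_to_bed3` and write the yielded lines to a `.bed` file.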



    • #3
      Colin,

      Thank you so much for the recommendation. I've been trying to get it to work, but I've been experiencing problems. I was wondering if you have experienced this? Looks like something is wrong with my Java, but it is installed and working.

      bldf2b@kc-bio-bsb319debian:~/Desktop/VancouverShortR-4.0.10/conversion_util$ java -jar BowtieToBedFormat.jar -input /home/bldf2b/Desktop/rdr6.map -output /home/bldf2b/Desktop/ -name testfile
      Exception in thread "main" java.lang.NoSuchMethodError: method java.io.PrintStream.<init> with signature (Ljava.lang.String; )V was not found.
      at src.lib.ioInterfaces.Log_Buffer.addLogFile(Log_Buffer.java:193)
      at src.fileUtilities.BowtieToBedFormat.main(BowtieToBedFormat.java:76)



      • #4
        Hi,

        So which Java version are you using?

        Type:
        java -version

        Are you using Sun Java?

        I had no problems with
        java version "1.6.0_16"
        Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
        Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)

        cheers



        • #5
          Originally posted by DrD2009:
          Is there a good standard set of alignment parameters for these siRNAs?
          I'll defer to others on this - I've never tried it. Still, I would recommend simply trying several reasonable sets of parameters and evaluating whether you're getting the desired/best combination of speed and sensitivity. If you have a huge set of input reads and are tired of waiting for long experiments to finish, use -u to select a subset of them.

          Thanks,
          Ben
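The "try several settings on a subset" approach can be scripted. The sketch below only builds the command lines and does not run anything. It reuses the options from the original command (-t, -p, -f, -a, -v, --sam-nosq), adds -u to cap the number of reads as suggested above, and adds -S so that the output is SAM; the -S flag, the mismatch values, the subset size, and all file names are assumptions for illustration, and `bowtie_sweep_commands` is a hypothetical helper.

```python
import shlex


def bowtie_sweep_commands(index, fasta, out_prefix,
                          mismatches=(0, 1, 2), subset=10000, threads=2):
    """Build one bowtie command (as an argument list) per -v setting."""
    commands = []
    for v in mismatches:
        out = f"{out_prefix}.v{v}.sam"
        commands.append([
            "bowtie", "-t", "-p", str(threads),
            "-f", "-a", "-v", str(v),   # FASTA input, all alignments, v mismatches
            "-u", str(subset),          # stop after the first `subset` reads
            "-S", "--sam-nosq",         # SAM output without @SQ header lines
            index, fasta, out,
        ])
    return commands


# Print the commands so they can be inspected or pasted into a shell.
for cmd in bowtie_sweep_commands("genome_index", "sirnas.fa", "trial"):
    print(shlex.join(cmd))
```

Comparing the number of aligned reads and the run time across the resulting files gives a rough picture of the speed and sensitivity trade-off before committing to a full run.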



          • #6
            To Colin:

            That was the problem. I tried to install Java onto Debian following the manual for installation, but I guess it didn't take because it was still using the Free Enterprise version when I ran "java -version". After trying to figure it out on Linux I gave up on it in the interest of time and was able to get it running on Windows. It's currently processing a file for me.


            To Ben:

            Thank you for getting back to me. I'll continue playing with the parameters.



            • #7
              Colin,

              It was taking a while to process the file, so I created a SAM file with just a few lines (~10) and ran the BowtieToBedFormat tool. It created all of the files, but they were all blank. Any ideas?

              C:\Documents and Settings\bldf2b\Desktop\VancouverShortR-4.0.11\conversion_util> java -jar BowtieToBedFormat.jar -input "C:\Documents and Settings\bldf2b\Desktop\trial.txt" -output "C:\Documents and Settings\bldf2b\Desktop" -name trial
              Version: Initializing class Log_Buffer $Revision: 1643 $
              Info: Log File: C:\Documents and Settings\bldf2b\Desktop\trial.log
              Version: Vancouver Short Read Analysis Package 4.0.11
              Version: Initializing class MaqPetToBedFormat $Revision: 1571 $
              Info: * Output directory : C:\Documents and Settings\bldf2b\Desktop\
              Info: * Name : trial
              Info: Flags expected "/1" and "/2"
              Version: Initializing class Histogram $Revision: 1197 $
              Version: Initializing class BowtieIterator $Revision: 1880 $
              Info: Processing file...
              Info: Creating FileWriters
              Version: Initializing class FileOut $Revision: 468 $
              Info: processing paired reads
              Info: Number of keys to process: 1
              Info: Total unpaired name: 1
              Info: Total simple name: 0
              Info: Total forked name: 0
              Info: Total complex name 0

              Could it be the way the file is processed from bowtie?



              • #8
                Hi,

                I really haven't used this extensively; happily, I didn't need to.

                Perhaps the name
                trial
                needs to occur on every line of the reads file?
                Some programs need a chromosome name for organisms with multiple chroms.

                The only other thing I can think of is putting all data in a simple directory like

                c:\temp\

                and running from there. Spaces can cause a LOT of problems with scripts.
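If moving the data is inconvenient, another way to sidestep quoting trouble is to launch the tool from a script that passes each argument as a separate list element, so spaces in paths never go through shell parsing. The sketch below uses the -input, -output and -name options shown in the commands earlier in this thread; `run_convert` and its `dry_run` parameter are hypothetical. It only addresses quoting, and nothing in the thread establishes that quoting is the cause of the blank output.

```python
import subprocess


def run_convert(jar, input_path, output_dir, name, dry_run=True):
    """Build (and optionally run) the BowtieToBedFormat command."""
    cmd = ["java", "-jar", jar,
           "-input", input_path,
           "-output", output_dir,
           "-name", name]
    if not dry_run:
        # Each list element reaches Java as one argument, spaces included.
        subprocess.run(cmd, check=True)
    return cmd


# Dry run: show the argument list without starting Java.
cmd = run_convert(
    "BowtieToBedFormat.jar",
    r"C:\Documents and Settings\bldf2b\Desktop\trial.txt",
    r"C:\Documents and Settings\bldf2b\Desktop",
    "trial",
)
print(cmd)
```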
