  • Bowtie settings and SAM to GFF

    Hello everyone,

    I'm pretty new to bioinformatics so I will probably start asking a lot of questions on here. Hopefully in time I'll be able to start answering some as well.

    I have two connected questions:

    Question 1:
    I'm trying to align siRNAs to a genome using Bowtie. I have created the index for the genome and have aligned the siRNAs of the various files numerous times with different settings. My question is what settings would be ideal to use for the sequences I'm trying to align? I've tried different ones and all have given me a variable number of output alignments. Is there a good standard set of alignment parameters for these siRNAs?

    Each file to be aligned contains thousands of sequences in FASTA format, and the sequences are around 16-24 bases long.

    I've been using this command lately:
    bowtie -t -p 2 (genome index) -f -a -v 0 (FASTA file to align) --sam-nosq (output file)


    Question 2:
    My end goal is to create three-column BED files (chromosome, start position, and stop position) for the alignments using the output of Bowtie. What would be the best way to go about doing this? I've been trying to find one, but have been unsuccessful.


    Thanks,
    Brandon
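For Question 2, one option that needs no extra tool is a short script. The following is a minimal sketch, assuming the Bowtie output is in SAM format and that only the three BED columns are wanted. The function names `sam_line_to_bed3` and `sam_to_bed3` are made up for illustration. SAM positions are 1-based while BED starts are 0-based, so the start is shifted by one, and the end is taken from the reference-consuming CIGAR operations.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that advance along the reference sequence.
REF_CONSUMING = set("MDN=X")


def sam_line_to_bed3(line):
    """Return (chrom, start, end) for one SAM alignment line, or None.

    Header lines, malformed lines and unaligned reads yield None.
    """
    if line.startswith("@"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return None
    flag, chrom, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
    if flag & 4 or chrom == "*":
        return None  # flag bit 0x4 marks an unaligned read
    start = pos - 1  # SAM is 1-based, BED is 0-based
    if cigar == "*":
        ref_len = len(fields[9])  # fall back to the read length
    else:
        ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar)
                      if op in REF_CONSUMING)
    return chrom, start, start + ref_len


def sam_to_bed3(sam_lines):
    """Yield tab-separated BED3 lines for every aligned SAM record."""
    for line in sam_lines:
        rec = sam_line_to_bed3(line)
        if rec is not None:
            yield "%s\t%d\t%d" % rec
```

To use it, open the SAM file, pass the file object to `sam_to_bed3`, and write each yielded line plus a newline to the output file.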

  • #2
    Hi Brandon,

    Have a look at the Vancouver Short Read Analysis Package.

    In this directory
    VancouverShortR-4.0.7/conversion_util

    you'll find
    BowtieToBedFormat.jar

    It worked very nicely for me.

    Colin

    • #3
      Colin,

      Thank you so much for the recommendation. I've been trying to get it to work, but I've been experiencing problems. I was wondering if you have experienced this? Looks like something is wrong with my Java, but it is installed and working.

      bldf2b@kc-bio-bsb319debian:~/Desktop/VancouverShortR-4.0.10/conversion_util$ java -jar BowtieToBedFormat.jar -input /home/bldf2b/Desktop/rdr6.map -output /home/bldf2b/Desktop/ -name testfile
      Exception in thread "main" java.lang.NoSuchMethodError: method java.io.PrintStream.<init> with signature (Ljava.lang.String; )V was not found.
      at src.lib.ioInterfaces.Log_Buffer.addLogFile(Log_Buffer.java:193)
      at src.fileUtilities.BowtieToBedFormat.main(BowtieToBedFormat.java:76)

      • #4
        Hi,

        So, what Java version are you using?

        Type
        java -version

        Are you using Sun Java?

        I had no problems with
        java version "1.6.0_16"
        Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
        Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)

        cheers

        • #5
          Originally posted by DrD2009:
          Is there a good standard set of alignment parameters for these siRNAs?
          I'll defer to others on this - I've never tried it. Still, I would recommend simply trying several reasonable sets of parameters and evaluating whether you're getting the desired/best combination of speed and sensitivity. If you have a huge set of input reads and are tired of waiting for long experiments to finish, use -u to select a subset of them.

          Thanks,
          Ben
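One way to compare parameter sets, as suggested above, is to tally each run's output. This is a rough sketch, assuming the output is SAM and that read names are unique within the input FASTA file. The function name `summarize_sam` is hypothetical. Because `-a` reports every valid alignment, the sketch counts distinct aligned reads separately from total alignments.

```python
def summarize_sam(sam_lines):
    """Count aligned reads, unaligned reads and total alignments in SAM text.

    Assumes read names are unique, so a name identifies one read.
    """
    aligned_reads = set()
    unaligned_reads = set()
    alignments = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        name, flag = fields[0], int(fields[1])
        if flag & 4:  # flag bit 0x4 marks an unaligned read
            unaligned_reads.add(name)
        else:
            aligned_reads.add(name)
            alignments += 1
    return {
        "aligned_reads": len(aligned_reads),
        "unaligned_reads": len(unaligned_reads - aligned_reads),
        "alignments": alignments,
    }
```

Running this on the output of each parameter set gives comparable numbers: a large gap between `alignments` and `aligned_reads` indicates many reads mapping to multiple locations.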

          • #6
            To Colin:

            That was the problem. I tried to install Java onto Debian following the manual for installation, but I guess it didn't take because it was still using the Free Enterprise version when I ran "java -version". After trying to figure it out on Linux I gave up on it in the interest of time and was able to get it running on Windows. It's currently processing a file for me.


            To Ben:

            Thank you for getting back to me. I'll continue playing with the parameters.

            • #7
              Colin,

              It was taking a while to process the file, so I created a SAM file with just a few lines (~10) and ran the BowtieToBedFormat tool. It created all of the files, but they were all blank. Any ideas?

              C:\Documents and Settings\bldf2b\Desktop\VancouverShortR-4.0.11\conversion_util> java -jar BowtieToBedFormat.jar -input "C:\Documents and Settings\bldf2b\Desktop\trial.txt" -output "C:\Documents and Settings\bldf2b\Desktop" -name trial
              Version: Initializing class Log_Buffer $Revision: 1643 $
              Info: Log File: C:\Documents and Settings\bldf2b\Desktop\trial.log
              Version: Vancouver Short Read Analysis Package 4.0.11
              Version: Initializing class MaqPetToBedFormat $Revision: 1571 $
              Info: * Output directory : C:\Documents and Settings\bldf2b\Desktop\
              Info: * Name : trial
              Info: Flags expected "/1" and "/2"
              Version: Initializing class Histogram $Revision: 1197 $
              Version: Initializing class BowtieIterator $Revision: 1880 $
              Info: Processing file...
              Info: Creating FileWriters
              Version: Initializing class FileOut $Revision: 468 $
              Info: processing paired reads
              Info: Number of keys to process: 1
              Info: Total unpaired name: 1
              Info: Total simple name: 0
              Info: Total forked name: 0
              Info: Total complex name 0

              Could it be the way the file is processed from bowtie?
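One possibility worth ruling out here, offered as an assumption rather than a diagnosis: the converter may expect Bowtie's default tab-delimited output rather than SAM. The sketch below guesses which of the two a line belongs to. The function name `guess_alignment_format` and its return labels are made up for illustration. It relies on the second column being a numeric flag in SAM and a `+` or `-` strand in Bowtie's default output.

```python
def guess_alignment_format(line):
    """Guess whether a line is SAM or Bowtie's default output.

    Returns "sam-header", "sam", "bowtie-default" or "unknown".
    """
    if line.startswith("@"):
        return "sam-header"
    fields = line.rstrip("\n").split("\t")
    # SAM records have at least 11 columns; column 2 is an integer flag.
    if len(fields) >= 11 and fields[1].isdigit():
        return "sam"
    # Bowtie's default output has the strand (+ or -) in column 2.
    if len(fields) >= 7 and fields[1] in ("+", "-"):
        return "bowtie-default"
    return "unknown"
```

Checking the first few lines of the input file this way shows which format was actually written before the converter is run on it.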

              • #8
                Hi,

                I really haven't used this extensively; happily, I didn't need to.

                Perhaps the name
                trial
                needs to occur on every line of the reads file?
                Some programs need a chromosome name for organisms with multiple chromosomes.

                The only other thing I can think of is putting all data in a simple directory like

                c:\temp\

                and running from there. Spaces can cause a LOT of problems with scripts.
