  • vebaev
    replied
    Hi,
    I have followed the thread here but am still confused about which option to use.

    I have a FASTA file of Illumina small RNA reads that have been clipped, filtered, and cleaned, and also collapsed into unique sequences ready for mapping.

    I want to map them onto the human genome but do not know which settings are optimal in my situation: I want to know how many times each read maps to the genome.
    So far I have used -a -v 0 and -a -v 1. My concern with -v 1 is that, if I allow 1 mismatch, a read could also map to a place that is not its true origin. Conversely, my concern with -v 0 is that only 30% of the unique sequences align.
    Last edited by vebaev; 08-10-2011, 02:59 PM.
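
For the question of how many times each read maps, here is a minimal sketch that tallies alignment records per read name. It assumes the run used -a (so every valid alignment is written as its own line) and bowtie's default tab-separated output, where the first column is the read name. The function name and the sample records are made up for illustration.

```python
from collections import Counter


def count_alignments_per_read(lines):
    """Count how many alignment records each read name has.

    Assumes bowtie's default tab-separated output, one alignment per line,
    with the read name in the first column.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        read_name = line.split("\t")[0]
        counts[read_name] += 1
    return counts


# Made-up records for illustration only (not real alignments).
example_lines = [
    "seq1\t+\tchr1\t100\tACGT\tIIII\t2\t\n",
    "seq1\t-\tchr2\t250\tACGT\tIIII\t2\t\n",
    "seq1\t+\tchr5\t900\tACGT\tIIII\t2\t\n",
    "seq2\t+\tchr1\t400\tGGCA\tIIII\t0\t\n",
]
example_counts = count_alignments_per_read(example_lines)
for name, n in sorted(example_counts.items()):
    print(name, n)
```

Running the same tally on the -v 0 and -v 1 outputs and comparing the counts per read would show which reads gain extra loci only when a mismatch is allowed.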


  • medalofhonour
    replied
    Using "Eland" input format in Bowtie

    Like Bowtie !
    Last edited by medalofhonour; 07-19-2011, 09:17 AM.


  • genlyai
    replied
    Hi. I wonder if anyone can help me, as I think bowtie (0.12.7) is misbehaving.

    I'm trying to map reads to a sequence with a short duplicated stretch. The problem is that a read that should clearly map to one repeat (i.e. it has some unique sequence flanking the repeat) sometimes maps to the wrong repeat instead.

    For instance, given the read pair

    Code:
    @HWI-ST568_0055:8:1106:17676:67081#GCCAAT/1
    ATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTA
    +HWI-ST568_0055:8:1106:17676:67081#GCCAAT/1
    ggggggggggggggggggggggegeeggegedgeegeegggggdgegdge
    
    @HWI-ST568_0055:8:1106:17676:67081#GCCAAT/2
    GATATCCTGTTTGGCCCATATTCAGCTGTTCCATCTGTTCTTGGCCCTGA
    +HWI-ST568_0055:8:1106:17676:67081#GCCAAT/2
    ggggggggggggggggggggggggggggggggggggggggggggggbgge
    if I run

    Code:
    bowtie -q --solexa1.3-quals -v 3 --minins 100 --maxins 450 --best -k 1 -t -p 8 index_name -1 testB1.fq -2 testB2.fq
    I get

    Code:
    HWI-ST568_0055:8:1106:17676:67081#GCCAAT/1      +       seq_id      5188   ATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTA       HHHHHHHHHHHHHHHHHHHHHHFHFFHHFHFEHFFHFFHHHHHEHFHEHF      0
    HWI-ST568_0055:8:1106:17676:67081#GCCAAT/2      -       seq_id      5458   TCAGGGCCAAGAACAGATGGAACAGCTGAATATGGGCCAAACAGGATATC       FHHCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH      0       40:G>A,43:T>C,46:A>G
    Those three mismatches should (?) make this alignment not show up, given that there is another site where this could align with no mismatches. Even stranger, if I run without the --best option

    Code:
    bowtie -q --solexa1.3-quals -v 3 --minins 100 --maxins 450 -k 1 -t -p 8 index_name -1 testB1.fq -2 testB2.fq
    I do get the "right" answer

    Code:
    HWI-ST568_0055:8:1106:17676:67081#GCCAAT/1      +       seq_id      5188   ATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTA       HHHHHHHHHHHHHHHHHHHHHHFHFFHHFHFEHFFHFFHHHHHEHFHEHF      0
    HWI-ST568_0055:8:1106:17676:67081#GCCAAT/2      -       seq_id      5533   TCAGGGCCAAGAACAGATGGAACAGCTGAATATGGGCCAAACAGGATATC       FHHCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH      0
    Anyway, I recognize this is probably a solved problem, but I'm having a tough time understanding what's going on, so if anybody could help me understand what's up, I'd be really grateful.
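
To compare the two reported pairs numerically, here is a minimal sketch that parses the last column of bowtie's default output. It assumes the mismatch field is a comma-separated list of `offset:reference-base>read-base` descriptors, as in the output above, and that an empty field means no mismatches. The function names are hypothetical.

```python
def parse_mismatches(field):
    """Parse a mismatch field such as '40:G>A,43:T>C' into tuples.

    Each tuple is (offset, reference_base, read_base).
    """
    mismatches = []
    for item in field.strip().split(","):
        if not item:
            continue  # an empty field means no mismatches
        offset, change = item.split(":")
        ref_base, read_base = change.split(">")
        mismatches.append((int(offset), ref_base, read_base))
    return mismatches


def pair_mismatch_total(mate1_field, mate2_field):
    """Total number of mismatches across both mates of a paired alignment."""
    return len(parse_mismatches(mate1_field)) + len(parse_mismatches(mate2_field))


# Mismatch fields copied from the two pairs reported in the post above.
with_best = pair_mismatch_total("", "40:G>A,43:T>C,46:A>G")
without_best = pair_mismatch_total("", "")
print(with_best, without_best)  # prints: 3 0
```

This only quantifies the difference between the two pairs; it does not explain why the run with --best reported the pair with three mismatches.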


  • BioSlayer
    replied
    Originally posted by harrike
    I am trying to use Bowtie to map about 300,000 reads to my reference. I use the command: bowtie -a -v 0 -p 10 -t INDEX_FILE -f READS_FILE.fasta > RESULT_FILE --un unmapped.txt.

    Using output redirection via '>' seems a bit risky. What do you want to write to RESULT_FILE, and why are you using '>'? I would consider something more direct, like:

    Code:
    bowtie -a -v 0 -p 10 -t INDEX_FILE -f READS_FILE.fasta  --un unmapped.txt
    A few days ago I used output redirection to capture the output of bowtie's --verbose flag, and in less than 8 hours 14 GB of disk space went to waste, compared with only 188 MB of results in the alignment file...


  • JimC
    replied
    Try removing the command-line option -a. With it, you are going to report ALL possible alignments for all reads. If you have repetitive sequences, this could be causing the memory problem. I would set a maximum number of matches to some high but useful value such as 10, 30, or 40. Try that.

    Jim
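
A minimal sketch of that suggestion, assuming the cap is expressed with bowtie's -k option (report up to that many alignments per read) in place of -a. The helper name and its defaults are hypothetical; the file names are the placeholders from the original command. The sketch only assembles the argument list and does not run bowtie.

```python
import shlex


def build_bowtie_args(index, reads_fasta, hits_file, max_alignments=10,
                      mismatches=0, threads=10, unaligned_file=None):
    """Assemble a bowtie argument list that caps alignments per read with -k."""
    args = [
        "bowtie",
        "-k", str(max_alignments),  # cap on reported alignments per read (instead of -a)
        "-v", str(mismatches),      # maximum mismatches per alignment
        "-p", str(threads),         # number of threads
        "-t",                       # print timing information
        "-f",                       # reads are in FASTA format
    ]
    if unaligned_file is not None:
        args += ["--un", unaligned_file]  # write unaligned reads here
    # Positional arguments: index basename, reads file, hits file.
    args += [index, reads_fasta, hits_file]
    return args


cmd = build_bowtie_args("INDEX_FILE", "READS_FILE.fasta", "RESULT_FILE",
                        max_alignments=30, unaligned_file="unmapped.txt")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a machine where bowtie is installed.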


  • harrike
    replied
    Originally posted by sdvie
    I am surprised that bowtie seems to be so memory-intensive in your case, especially with relatively few reads. Did you use a particularly large genome?

    cheers,
    Sophia

    I am using the apple genome, which is about 750 Mb.

    Actually, I tried the mapping several times. At the beginning, the size of the output file kept increasing and the Bowtie command took only about 2 GB of RAM. After several minutes, the file stopped growing, but the RAM used by Bowtie rose steadily to about 30 GB and then the process froze.

    Another interesting thing is that the output file size differs depending on the number of cores I specify: the output file is 2.66 GB with the option '-p 10', but 2.38 GB without it. That is a difference of about 3 M alignments. How can this happen?
    Last edited by harrike; 05-09-2011, 07:27 AM.
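
One way to investigate the size difference is to compare the two output files as sets of alignments rather than line by line, since the order of records may differ between a single-threaded and a multi-threaded run. A minimal sketch, assuming bowtie's default tab-separated output (read name, strand, reference name, offset in the first four columns); the function names and sample records are made up for illustration.

```python
def alignment_keys(lines):
    """Return the set of (read, strand, reference, offset) keys in an output file."""
    keys = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        keys.add((fields[0], fields[1], fields[2], int(fields[3])))
    return keys


def compare_runs(lines_a, lines_b):
    """Split the alignments of two runs into shared and run-specific sets."""
    a = alignment_keys(lines_a)
    b = alignment_keys(lines_b)
    return {"only_in_a": a - b, "only_in_b": b - a, "shared": a & b}


# Made-up records for illustration only.
run_single = ["r1\t+\tchr1\t10\tACGT\tIIII\t0\t\n",
              "r2\t-\tchr2\t55\tGGCA\tIIII\t0\t\n"]
run_threads = ["r2\t-\tchr2\t55\tGGCA\tIIII\t0\t\n",
               "r1\t+\tchr1\t10\tACGT\tIIII\t0\t\n",
               "r3\t+\tchr3\t7\tTTGA\tIIII\t0\t\n"]
diff = compare_runs(run_single, run_threads)
print(len(diff["shared"]), sorted(diff["only_in_b"]))
```

With real files, `open(path)` objects can be passed in place of the lists. For outputs of several gigabytes the sets may themselves need a lot of memory, so comparing per-read counts first may be more practical.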


  • sdvie
    replied
    Originally posted by harrike
    Thanks, Sdvie.

    It is my Mac, which has 16 cores and 32 GB. I am not using a remote machine.

    I am surprised that bowtie seems to be so memory-intensive in your case, especially with relatively few reads. Did you use a particularly large genome?

    cheers,
    Sophia


  • harrike
    replied
    Thanks, Sdvie.

    It is my Mac, which has 16 cores and 32 GB. I am not using a remote machine.


  • sdvie
    replied
    Originally posted by harrike
    This may be the best place for my question.

    I am trying to use Bowtie to map about 300,000 reads to my reference. I use the command: bowtie -a -v 0 -p 10 -t INDEX_FILE -f READS_FILE.fasta > RESULT_FILE --un unmapped.txt.

    The command could not finish; an error showed up: "Your Mac OS X startup disk has no more space available for application memory". The Bowtie process took about 30 GB of memory and froze.

    I checked my startup disk (Macintosh HD). It still has about 1 TB of space available. I don't know what's going on.

    I am using a computer with 16 cores and 32 GB memory.

    Hope somebody here can help me. Thanks in advance.

    Are you executing this command from your Mac on a remote machine with 16 cores and 32 GB, or does your Mac itself have 16 cores and 32 GB?

    The 1 TB of available disk space will not help if the application freezes because of a lack of RAM.


  • harrike
    replied
    This may be the best place for my question.

    I am trying to use Bowtie to map about 300,000 reads to my reference. I use the command: bowtie -a -v 0 -p 10 -t INDEX_FILE -f READS_FILE.fasta > RESULT_FILE --un unmapped.txt.

    The command could not finish; an error showed up: "Your Mac OS X startup disk has no more space available for application memory". The Bowtie process took about 30 GB of memory and froze.

    I checked my startup disk (Macintosh HD). It still has about 1 TB of space available. I don't know what's going on.

    I am using a computer with 16 cores and 32 GB memory.

    Hope somebody here can help me. Thanks in advance.


  • BioSlayer
    replied
    bowtie never finishes and does not read all ebwt indices

    I have been following this thread from the beginning, and I have a few issues. For the past four days I had a bowtie instance running and it never finished, so I had to kill it. My command was as follows:
    Code:
    bowtie hg19 -q /CombinedReads/SRR065070_Combined.fastq -S  align.map --offrate 20 -p 2
    Looking around for possible causes, I saw that an issue had been registered at SourceForge but was never followed up. I don't know if my situation is reproducible, but here are the factors that may have influenced the behavior above:
    01- I downloaded the indices from the bowtie website and unzipped them to a directory, which is the same directory I navigated to and ran the bowtie command from, so I assumed bowtie would find them automatically. I took this measure because unzipping to the index folder within the bowtie directory did not work (it kept complaining that it could not find an index hg19), so I created a directory and invoked bowtie from within it.
    02- The reads file was downloaded from SRA (SRR065070). It is located in a different directory from the one where I call bowtie, is about 6 GB, and has around 19 million reads. I used samtools to create the forward and reverse reads in FASTQ format...
    03- My system is 32-bit Ubuntu with 2 GB RAM and 7 GB swap.
    04- The output of bowtie --version is


    bowtie version 0.12.7
    32-bit
    Built on bio-laptop
    Thu Apr 21 21:12:27 AST 2011
    Compiler: gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
    Options: -O3 -Wl,--hash-style=both
    Sizeof {int, long, long long, void*, size_t, off_t}: {4, 4, 8, 4, 4, 8}



    Then, investigating the behavior more deeply with --verbose, I noticed that of the 6 ebwt index files (hg19.1.ebwt through hg19.4.ebwt, hg19.rev.1.ebwt, and hg19.rev.2.ebwt), only four are being read (it never opens hg19.3.ebwt or hg19.4.ebwt). I tested that by passing a query directly on the command line with -c...
    Code:
    $ bowtie hg19 -c acgggtttaa  test.map -t --verbose
    and following is a brief excerpt from the output log
    Opening hit output file: 15:16:21
    About to initialize fw Ebwt: 15:16:21
    About to open input files: 15:16:21
    Opening "hg19.1.ebwt"
    Opening "hg19.2.ebwt"
    Finished opening input files: 15:16:21

    About to initialize rev Ebwt: 15:16:21
    About to open input files: 15:16:21
    Opening "hg19.rev.1.ebwt"
    Opening "hg19.rev.2.ebwt"
    Finished opening input files: 15:16:21
    Reading header: 15:16:21

    About to open input files: 15:16:21
    Opening "hg19.1.ebwt"
    Opening "hg19.2.ebwt"
    Finished opening input files: 15:16:21
    Reading header: 15:16:21

    About to open input files: 15:16:44
    Opening "hg19.rev.1.ebwt"
    Opening "hg19.rev.2.ebwt"
    Finished opening input files: 15:16:44


    Seeking your guidance and support with appreciation...
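
As a first sanity check, here is a minimal sketch that confirms all six index files named in the post are actually present next to each other. It only tests for file presence and assumes the six-file layout described above; it does not explain why the log shows only four files being opened. The function name is hypothetical, and the demo runs against a throwaway directory with empty placeholder files.

```python
import tempfile
from pathlib import Path

# The six files that make up a bowtie index with a given basename.
EBWT_SUFFIXES = (".1.ebwt", ".2.ebwt", ".3.ebwt", ".4.ebwt",
                 ".rev.1.ebwt", ".rev.2.ebwt")


def missing_index_files(directory, basename):
    """Return the names of expected index files that are absent from directory."""
    directory = Path(directory)
    return [basename + suffix
            for suffix in EBWT_SUFFIXES
            if not (directory / (basename + suffix)).is_file()]


# Demo: create only four of the six files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for suffix in (".1.ebwt", ".2.ebwt", ".rev.1.ebwt", ".rev.2.ebwt"):
        (Path(tmp) / ("hg19" + suffix)).touch()
    demo_missing = missing_index_files(tmp, "hg19")

print(demo_missing)  # prints: ['hg19.3.ebwt', 'hg19.4.ebwt']
```

For the real index, `missing_index_files(".", "hg19")` run from the directory holding the unzipped files should return an empty list.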


  • Gators
    replied
    So I am having some weird issues building a bowtie index with the hairpin.fa file from miRBase. The file was filtered to get rid of non-human miRNAs and adjusted to get rid of spaces, but I had the same problem before this filtering was done. I built the index with all default parameters. There is no obvious error message during the build, but I am not sure I would catch anything unless it contained the word "error." After building the index, if I align with bowtie it reports back some alignments, but not even close to all of them. The vast majority of what it reports have 1 or 2 mismatches (-v was set to 2), although there are some perfect matches. The sequences it reports back are also GC rich, which is weird to me... I also tried to build this on another computer and it gave similar results. So clearly there is something weird with the FASTA file...

    Any ideas?

    I should say that other indexes have been built on this machine without problems, as have other alignments...
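
One possibility worth ruling out, offered as an assumption rather than a diagnosis: miRBase distributes hairpin.fa as RNA, so the sequences contain U rather than T. If the index builder treats U as an ambiguous character, only stretches without U could be aligned, which would fit the GC-rich hits. A minimal sketch that converts such a file to DNA before indexing; the function name and the sample record are made up, and trimming the header to its first token is a choice made here, not a requirement.

```python
def rna_fasta_to_dna(lines):
    """Yield FASTA lines with U replaced by T in sequence lines.

    Header lines are trimmed to their first whitespace-delimited token.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            tokens = line.split()
            yield tokens[0] if tokens else line
        else:
            yield line.upper().replace("U", "T")


# Made-up record for illustration only (not a real miRBase entry).
example = [">example-hairpin made up record\n", "ACGUUGCAU\n", "ugca\n"]
converted = list(rna_fasta_to_dna(example))
print("\n".join(converted))
```

Writing the converted lines to a new file and rebuilding the index from it would show whether the missing alignments come back.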


  • azer
    replied
    -v 4 may be the problem. Values from 0 to 3 are OK.


  • dara
    replied
    bowtie -v option not working in version 0.12.3?

    Hi all,

    I'm finding an issue/possible bug or error on my part with one of bowtie's options, so I thought I would ask here.

    I'm trying to use bowtie's -v <int> option for a colorspace alignment, and every time I do, it takes me back to the usage message, indicating that something is wrong with the arguments.

    I know that this works:

    bowtie -C -f -n 3 -S -t hg18.cs.bowtie reads.csfasta aln.sam

    But once I add the -v option, it doesn't, though it seems from the manual that it should:

    bowtie -C -v 4 -f -S -t hg18.cs.bowtie reads.csfasta aln.sam

    Any help/suggestions would be really appreciated.

    thank you!


  • kerhard
    replied
    bowtie-index can use single fasta file with multiple entries as input

    woops, didn't realize that the bowtie indexer could take a single fasta file with multiple entries as input. seems to work just fine that way. because the bowtie website describes the input for bowtie-index as:

    "A comma-separated list of FASTA files containing the reference sequences to be aligned to"

    i assumed that they had to be separate files, but it looks like they don't.

    sorry for the silly question, should have tried this simple solution first.
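
A minimal sketch for checking how many reference sequences a single multi-entry FASTA file holds before indexing it. The function name is hypothetical and the sample lines are made up; each record name is taken as the first whitespace-delimited token after `>`.

```python
def fasta_record_names(lines):
    """Return the name of every record in a FASTA file, in order."""
    names = []
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            names.append(tokens[0] if tokens else "")
    return names


# Made-up multi-entry FASTA for illustration only.
example_fasta = [">chr1 first entry\n", "ACGT\n", ">chr2\n", "GGCC\n", "TTAA\n"]
names = fasta_record_names(example_fasta)
print(len(names), names)  # prints: 2 ['chr1', 'chr2']
```

With a real file, `fasta_record_names(open("reference.fa"))` would work the same way, where `reference.fa` is a placeholder path.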
