  • dpryan
    replied
    It's difficult to diagnose your problem when you use multiple pipes. Just run the enumerate command and save that to a file, then try the subsequent commands one at a time and then post which didn't work (probably sort-bed or bedops) and some of the input you're giving it.
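A minimal sketch of that stage-by-stage approach. The `run_step` helper and the file names are hypothetical, and the demo uses `printf` and `sort` as stand-ins so it runs without Hotspot or BEDOPS installed; the real stages from this thread are shown as comments.

```shell
# Hypothetical helper: run one stage, save its output to a file, and report
# the exit status and the number of lines written.
run_step() {
    out=$1
    shift
    status=0
    "$@" > "$out" || status=$?
    echo "$1: exit=$status lines=$(wc -l < "$out" | tr -d ' ')" >&2
    return "$status"
}

# The stages from this thread would look like this (command as posted):
#   run_step enumerated.bed ./enumerateUniquelyMappableSpace.pl 50 Genome
#   run_step sorted.bed     sort-bed enumerated.bed
#   run_step merged.bed     bedops -m sorted.bed

# Stand-in demo with ordinary tools.
workdir=$(mktemp -d)
run_step "$workdir/step1.txt" printf 'chr2\t5\t9\nchr1\t0\t4\n'
run_step "$workdir/step2.txt" sort "$workdir/step1.txt"
first_chrom=$(head -n 1 "$workdir/step2.txt" | cut -f1)
```

The first stage that reports a non-zero exit status or zero lines is the one to post about, together with a few lines of its input file.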


  • krafiq
    replied
    I just tried building a bowtie index using the genome.fa file (without splitting it) and it gave me 4 .ebwt files. And then I used the script:

    ./enumerateUniquelyMappableSpace.pl 50 Genome | sort-bed - | bedops -m - > Genome.50.mappable_only.bed

    but it gave the following error:

    Failed to read 50
    Warning: Could not find any reads in "-"
    # reads processed: 0
    # reads with at least one reported alignment: 0 (0.00%)
    # reads that failed to align: 0 (0.00%)
    No alignments


  • dpryan
    replied
    Originally posted by krafiq
    EricHaugen: What's the "bowtie_index_prefix" in the 2nd option you gave me above?
    It's the prefix for the output index files, so it can be anything you want. Normal examples would be "hg19", "mm9" and "mm10", for human and two mouse genome versions. When bowtie is later invoked to do alignments, this same prefix is given to it so it knows what to align things against.
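To make the prefix idea concrete, here is a small sketch that checks whether an index exists for a given prefix before running anything that depends on it. The `have_bowtie_index` helper is hypothetical. The `.1.ebwt` suffix is taken from the "Failed to find bowtie index file Genome.1.ebwt" message elsewhere in this thread, and the demo uses an empty stand-in file rather than a real index.

```shell
# With bowtie installed, the index would be built along these lines
# (reference FASTA first, then the prefix of your choice):
#   bowtie-build genome.fa Genome

# Hypothetical helper: succeed only if the first index file for the given
# prefix is present.
have_bowtie_index() {
    [ -f "$1.1.ebwt" ]
}

# Demo with an empty stand-in file, not a real index.
demo_dir=$(mktemp -d)
touch "$demo_dir/Genome.1.ebwt"

if have_bowtie_index "$demo_dir/Genome"; then found_genome=yes; else found_genome=no; fi
if have_bowtie_index "$demo_dir/mm10"; then found_mm10=yes; else found_mm10=no; fi
echo "Genome: $found_genome, mm10: $found_mm10"
```

The prefix passed to later commands has to match the one given to bowtie-build, including any directory part.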


  • dpryan
    replied
    Originally posted by krafiq
    dpryan: I'm sorry, could you please clarify what exactly I should do?
    Also, is there a way to get the source code for bowtie?
    The source code for bowtie is available on its website here or here.

    Regarding the remainder, enumerateUniquelyMappableSpace is just a perl script that executes a few other commands, some of which won't work for you because of how the script is structured. I already deleted the Hotspot code (I don't use it) so I can't immediately give you exact changes, but the gist is that you can just edit the code to have bowtie-build index genome.fa rather than a too-long list of scaffold.fa files. There may be a few other lines that will throw errors for similar reasons and you can likely use the same strategy. This all assumes that you know enough to edit a bit of code, of course.


  • krafiq
    replied
    EricHaugen: What's the "bowtie_index_prefix" in the 2nd option you gave me above?


  • krafiq
    replied
    dpryan: I'm sorry, could you please clarify what exactly I should do?
    Also, is there a way to get the source code for bowtie?


  • dpryan
    replied
    Well, if you read through the Perl scripts, it'll become pretty apparent that they only designed Hotspot around human/mouse/etc. genomes (rather than your situation with scaffolds), so you're probably going to have to just edit the script. It's just trying to run bowtie-build, which will effectively concatenate everything together anyway (it looks like they normally use individual chromosome files so things can more easily be split to later run on a cluster). The script is pretty simple, so go ahead and change it to suit your usage needs.


  • krafiq
    replied
    dpryan: I had to split the genome.fa file to get the individual files in the first place to use the Hotspot software. Should I still cat them? Won't that just bring it back to genome.fa?


  • dpryan
    replied
    Try concatenating the files together first (you'll have to do it in a couple batches, since the command will be too long for "cat" too) and just use the multifasta file.
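One way to do the batching without picking batch sizes by hand is to let `find` stream the file names and have `xargs` split them across as many `cat` calls as needed. This is a sketch under assumptions: the `scaffold*.fa` naming and the `combined.fa` output name are made up for the demo, which builds three tiny stand-in files in a temporary directory.

```shell
# Stand-in data: three single-record FASTA files.
demo_dir=$(mktemp -d)
for i in 1 2 3; do
    printf '>scaffold%s\nACGT\n' "$i" > "$demo_dir/scaffold$i.fa"
done

# find never puts all the names on one command line, and xargs starts a new
# cat whenever the argument list would get too long. All cat calls share the
# same redirected output, so the result is a single multi-FASTA file.
# The output name must not match the 'scaffold*.fa' pattern.
find "$demo_dir" -name 'scaffold*.fa' -print0 \
    | xargs -0 cat > "$demo_dir/combined.fa"

record_count=$(grep -c '^>' "$demo_dir/combined.fa")
echo "records: $record_count"
```

The records come out in whatever order `find` returns them, which is not necessarily sorted.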


  • krafiq
    replied
    My enumerateUniquelyMappableSpace script is calling this line:
    bowtie-build $chromosomeFiles $genome

    The chromosome files variable in this case is a list of 30,000 file names. So when I run the script, it gives me the following error:

    Argument list too long

    Is there a way around this?


  • sivasubramani
    replied
    The HHblits package includes a script that does what you want:

    HHblits_src/scripts/splitfasta.pl


  • EricHaugen
    replied
    It looks like "bowtie-build" isn't in your PATH, so the shell couldn't find it.

    Try adding a line near the top of "enumerateUniquelyMappableSpace" like:

    export PATH=$PATH:/location/of/this/script/folder

    Then it should be able to find bowtie-build, and the Perl script it calls later will be able to find your bowtie executable there also.
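A quick way to see the effect of that PATH line: the shell normally searches only the directories listed in PATH, so an executable sitting next to the script is not found unless its folder is listed. The sketch below uses a temporary directory and a stand-in executable named `demo-bowtie-build` (both hypothetical) instead of a real bowtie install.

```shell
# Stand-in tool directory; a real setup would hold bowtie and bowtie-build.
tool_dir=$(mktemp -d)
printf '#!/bin/sh\necho "stand-in"\n' > "$tool_dir/demo-bowtie-build"
chmod +x "$tool_dir/demo-bowtie-build"

# Before the directory is on PATH, the shell cannot resolve the name.
if command -v demo-bowtie-build >/dev/null 2>&1; then before=found; else before=missing; fi

# Same form as the export line above, with the demo directory filled in.
export PATH="$PATH:$tool_dir"

if command -v demo-bowtie-build >/dev/null 2>&1; then after=found; else after=missing; fi
echo "before: $before, after: $after"
```

Running `command -v bowtie-build` in the same shell that launches the script is a cheap check before retrying.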


  • krafiq
    replied
    Thanks all!!

    EricHaugen: I'm trying option 1 for now. I'm trying to run the script again with bowtie and bowtie-build in the same folder as the script. But it's giving me the following error:
    ./enumerateUniquelyMappableSpace: line 30: bowtie-build: command not found

    And then it goes on to give the following error multiple times:
    Failed to find bowtie index file Genome.1.ebwt

    Does anyone know why and what I should change?

    Thanks!
    Last edited by krafiq; 07-25-2013, 09:02 PM.


  • fengqi
    replied
    Did you try the following?
    'samtools faidx genome.fasta chrX > chrX.fasta'
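If neither samtools nor the HHblits script is at hand, a plain `awk` pass can also write one file per record. This is a sketch, not either tool's method: it assumes the first word of each header line is safe to use as a file name (no slashes), and the demo input is a tiny made-up FASTA in a temporary directory.

```shell
# Stand-in input: a two-record multi-FASTA file.
demo_dir=$(mktemp -d)
printf '>scaffold1 first\nACGT\nTTGA\n>scaffold2\nGGCC\n' > "$demo_dir/genome.fa"

# On each header line, close the previous output file and start a new one
# named after the first word of the header. Closing matters when there are
# thousands of records, since awk can only keep a limited number of files open.
awk -v outdir="$demo_dir" '
    /^>/ {
        if (out) close(out)
        name = substr($1, 2)
        out = outdir "/" name ".fa"
    }
    out { print > out }
' "$demo_dir/genome.fa"

first_lines=$(wc -l < "$demo_dir/scaffold1.fa" | tr -d ' ')
second_seq=$(sed -n 2p "$demo_dir/scaffold2.fa")
```

Each output file keeps its full header line and all sequence lines of that record.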


  • EricHaugen
    replied
    Two options:

    1. Change "chr" to "scaffold" in the enumerateUniquelyMappableSpace bash wrapper script, to list the individual fasta files.

    2. Just run the whole genome fasta file, after building a bowtie index, with:

    enumerateUniquelyMappableSpace.pl read_length bowtie_index_prefix genome.fa | sort-bed - | bedops -m - > genome.read_length.mappable_only.bed

    If "sort-bed" runs out of memory here, the BEDOPS suite includes a "bbms" script that can be used in place of sort-bed.
