
  • jsimba
    replied
    build my own reference

    Hi everybody,

    I have a problem creating my index with Bowtie on OS X. I want to use multiple FASTQ files, so I first merged all of them into one file. When I run bowtie-build, I get this:

    Writing header
    Reserving space for joined string
    Joining reference sequences
    Reference file does not seem to be a FASTA file

    Then, when I list the outputs, I only see 4 .ebwt files; the other *.ebwt files that are needed to run TopHat are missing.

    What is the solution to this?

    Thanks
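
A note on the message above: `bowtie-build` expects reference sequences in FASTA format, so a file made by merging FASTQ reads would be consistent with "Reference file does not seem to be a FASTA file". Below is a minimal sketch for checking what a file looks like before building. It assumes the usual convention that FASTA records start with `>` and FASTQ records start with `@`. The helper name `guess_format` is hypothetical and not part of Bowtie.

```python
import io


def guess_format(handle):
    """Guess 'fasta' or 'fastq' from the first non-blank line of a text handle."""
    for line in handle:
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith(">"):
            return "fasta"
        if stripped.startswith("@"):
            return "fastq"
        return "unknown"
    return "empty"


# In-memory stand-ins for real files (made-up records, for illustration only).
fasta_example = io.StringIO(">chr_demo\nACGTACGT\n")
fastq_example = io.StringIO("@read_1\nACGT\n+\nIIII\n")

print(guess_format(fasta_example))
print(guess_format(fastq_example))
```

If a file is reported as `fastq`, it holds reads rather than a reference, and the index would normally be built from the genome FASTA instead.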



  • wanfahmi
    replied
    Originally posted by JackieBadger View Post
    Why don't you update to the current version of Bowtie2 and see if the problem is resolved?
    Thanks for the suggestion. I have updated to the new Bowtie2 and it's working.



  • JackieBadger
    replied
    Why don't you update to the current version of Bowtie2 and see if the problem is resolved?



  • wanfahmi
    replied
    Could not find Bowtie index files ( genome.*.ebwt)

    Hi, I tried to follow the sample data (fruit fly) example as suggested in the paper by Trapnell et al. (2012), but I got this error even though the files in question are already in the same directory. Thanks.


    [2012-10-16 13:39:15] Beginning TopHat run (v2.0.4)
    -----------------------------------------------
    [2012-10-16 13:39:15] Checking for Bowtie
    Bowtie 2 not found, checking for older version..
    Bowtie version: 0.12.8.0
    [2012-10-16 13:39:15] Checking for Samtools
    Samtools version: 0.1.18.0
    [2012-10-16 13:39:15] Checking for Bowtie index files
    Error: Could not find Bowtie index files ( genome.*.ebwt)
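
Since the log shows TopHat falling back to Bowtie 0.12.8 and looking for `genome.*.ebwt`, a quick check of which index files actually exist for a given prefix can narrow the problem down. The sketch below uses the six file names listed in the Bowtie error message quoted further down this thread. The helper name `missing_index_files` and the scratch-directory demo are illustrative assumptions, not part of TopHat or Bowtie.

```python
import os
import tempfile

# The six Bowtie 1 index suffixes named in Bowtie's own error message.
INDEX_SUFFIXES = (
    ".1.ebwt", ".2.ebwt", ".3.ebwt", ".4.ebwt", ".rev.1.ebwt", ".rev.2.ebwt",
)


def missing_index_files(prefix):
    """Return the expected index files that do not exist for `prefix`."""
    return [prefix + suffix for suffix in INDEX_SUFFIXES
            if not os.path.isfile(prefix + suffix)]


# Demo in a scratch directory: only the four forward files are created.
with tempfile.TemporaryDirectory() as workdir:
    prefix = os.path.join(workdir, "genome")
    for suffix in INDEX_SUFFIXES[:4]:
        open(prefix + suffix, "wb").close()
    for path in missing_index_files(prefix):
        print("missing:", os.path.basename(path))
```

Pointing `prefix` at the same path that is passed to TopHat shows whether the `.ebwt` files it asks for are really there under that name.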



  • anecsulea
    replied
    Hi again,

    So, I've again run several series of tests in which I replaced the occurrences of the "read" function with "fread", and this solution seems to work fine. I haven't had any "Error reading..." messages in hundreds of tests, and the results are as expected.

    Actually, the simplest way to make this change without modifying the source code too much was to force BOWTIE_MM = 0 in the Makefile. I also had to manually replace some occurrences of "lseek" in ebwt.h with MM_SEEK for it to compile correctly (I'm surprised that Windows users - if there are any - haven't complained about this).

    Best wishes,

    Anamaria
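
One possible reading of why the switch helps, not confirmed anywhere in this thread: the POSIX `read` call is allowed to return fewer bytes than requested, and a message such as "returned 3953400, length was 54387328" is what a single short read would look like if it were treated as a failure. `fread` typically keeps reading until the requested count is reached or an error or end of file occurs. The sketch below shows the general looping pattern using Python's `os.read`, which wraps the same system call. It is not Bowtie's code, and `read_exact` is a hypothetical helper name.

```python
import os
import tempfile


def read_exact(fd, length):
    """Read exactly `length` bytes from descriptor `fd`, looping over short reads."""
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than requested
        if not chunk:  # end of file reached before `length` bytes arrived
            raise EOFError(f"expected {length} bytes, got {length - remaining}")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Demo on a temporary file standing in for an index file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 100_000)
    demo_path = tmp.name

fd = os.open(demo_path, os.O_RDONLY)
try:
    data = read_exact(fd, 100_000)
finally:
    os.close(fd)
    os.remove(demo_path)

print(len(data))
```

With this pattern a short read is only an error if the file really ends early, which separates a truncated file from a file system that delivers the data in pieces.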



  • anecsulea
    replied

    Is there anything else of note about the partition/filesystem that the index files are stored on? Is it NFS? The problem seems to be that bowtie-build successfully writes the entire index, but when it then tries to read it back in *immediately*, it gets something incomplete. That *might* be Bowtie's fault, but more likely it's some combination of OS & FS.
    The file system is Lustre - I'm doing my computations on a cluster. However, I should tell you that Bowtie does not only crash *immediately* after building the index - in my tests there were at least a few minutes between building the index and running Bowtie.


    If you have separate questions about Bowtie and TopHat, it's best to post them separately. Cole reads Seqanswers messages about TopHat and I read ones about Bowtie.
    Of course, I understand - however I have already posted two messages about TopHat (in the forums Bioinformatics and RNASeq), with no response yet (nor was there any response to the e-mails I've sent - sorry for insisting, I was getting a bit desperate). Plus, the questions aren't that separate, in my opinion - we're dealing with a Bowtie error that TopHat should catch but fails to do so.



    If Bowtie later successfully opens and queries that same set of index files, then they're not actually corrupt; it just appeared that way immediately after they were written, due to OS wackiness. So the TopHat results could very well be fine.
    No, they are definitely not fine. In fact I'm running TopHat on long reads (76bp), so TopHat splits them up into three segments and then tries to map the three segments on the Bowtie index of the junction sequences. It can happen that only one of the mapping attempts fails and the other ones work, so TopHat can still confirm some junctions. Anyway, I will explain all this in more detail in my TopHat-specific posts.

    I'll keep in touch about the Bowtie problem - but if you have any other suggestions for things that I should test, please let me know, I'm running out of ideas. Thanks !

    Best,

    Anamaria



  • Ben Langmead
    replied
    Originally posted by anecsulea View Post
    Here is the ls -l of the index files:

    ##################################

    -rw-r--r-- 1 anecsule henrik 58822647 Jun 1 13:52 chr3_ensembl57.1.ebwt
    -rw-r--r-- 1 anecsule henrik 23794420 Jun 1 13:52 chr3_ensembl57.2.ebwt
    -rw-r--r-- 1 anecsule henrik 180665 Jun 1 13:47 chr3_ensembl57.3.ebwt
    -rw-r--r-- 1 anecsule henrik 47588832 Jun 1 13:47 chr3_ensembl57.4.ebwt
    -rw-r--r-- 1 anecsule henrik 205509239 May 29 15:37 chr3_ensembl57.fa
    -rw-r--r-- 1 anecsule henrik 58822647 Jun 1 13:58 chr3_ensembl57.rev.1.ebwt
    -rw-r--r-- 1 anecsule henrik 23794420 Jun 1 13:58 chr3_ensembl57.rev.2.ebwt
    Yep, looks good. The write is probably not failing and the files are probably not corrupt or incomplete.

    Originally posted by anecsulea View Post
    I'm currently testing one potential solution: I've noticed that in ebwt.h you're using the "read" function in C if BOWTIE_MM is defined (i.e. on Linux) and the "fread" function if not (i.e. on Windows). I was wondering if I would get the same errors with "fread", so I've compiled bowtie as if for Windows, and I'm doing the same tests. I'll let you know if that works ok.
    I'd be interested to know if that works.

    Is there anything else of note about the partition/filesystem that the index files are stored on? Is it NFS? The problem seems to be that bowtie-build successfully writes the entire index, but when it then tries to read it back in *immediately*, it gets something incomplete. That *might* be Bowtie's fault, but more likely it's some combination of OS & FS.

    Originally posted by anecsulea View Post
    Also, I wanted to ask you if you think it's normal that TopHat does not catch this error thrown by Bowtie. I've had several TopHat runs that finished with apparent "success", but which in fact only gave partial results because reading the Bowtie index for the junction sequences had failed. This seems quite dangerous, as most users will not check the log files for Bowtie errors if TopHat has finished successfully.
    If you have separate questions about Bowtie and TopHat, it's best to post them separately. Cole reads Seqanswers messages about TopHat and I read ones about Bowtie.

    If Bowtie later successfully opens and queries that same set of index files, then they're not actually corrupt; it just appeared that way immediately after they were written, due to OS wackiness. So the TopHat results could very well be fine.

    Ben
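
The listing can also be checked against the rule stated in Bowtie's own error message: the `.1.ebwt` and `.rev.1.ebwt` files should have the same size, as should the `.2.ebwt` and `.rev.2.ebwt` files. The sketch below applies that rule to the sizes copied from the `ls -l` output in this thread. The function name `size_mismatches` is hypothetical.

```python
def size_mismatches(prefix, sizes):
    """Return (forward, reverse) name pairs whose sizes differ or are missing."""
    mismatches = []
    for part in ("1", "2"):
        forward = f"{prefix}.{part}.ebwt"
        reverse = f"{prefix}.rev.{part}.ebwt"
        if sizes.get(forward) is None or sizes.get(forward) != sizes.get(reverse):
            mismatches.append((forward, reverse))
    return mismatches


# Sizes in bytes, copied from the `ls -l` listing in this thread.
listing = {
    "chr3_ensembl57.1.ebwt": 58822647,
    "chr3_ensembl57.2.ebwt": 23794420,
    "chr3_ensembl57.3.ebwt": 180665,
    "chr3_ensembl57.4.ebwt": 47588832,
    "chr3_ensembl57.rev.1.ebwt": 58822647,
    "chr3_ensembl57.rev.2.ebwt": 23794420,
}

print(size_mismatches("chr3_ensembl57", listing))
```

For files on disk, the `sizes` dictionary could be filled with `os.path.getsize`. Matching sizes only rule out the most obvious truncation; they do not prove the contents are intact.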



  • anecsulea
    replied
    Hi Ben,

    This is what I originally thought, but I can't see how the exact same index file can be corrupted for one run, and ok on the next one. I've run several hundreds of tests, using the same index file and the same reads file, and only a few of these bowtie jobs crash.

    Here is the ls -l of the index files:

    ##################################

    -rw-r--r-- 1 anecsule henrik 58822647 Jun 1 13:52 chr3_ensembl57.1.ebwt
    -rw-r--r-- 1 anecsule henrik 23794420 Jun 1 13:52 chr3_ensembl57.2.ebwt
    -rw-r--r-- 1 anecsule henrik 180665 Jun 1 13:47 chr3_ensembl57.3.ebwt
    -rw-r--r-- 1 anecsule henrik 47588832 Jun 1 13:47 chr3_ensembl57.4.ebwt
    -rw-r--r-- 1 anecsule henrik 205509239 May 29 15:37 chr3_ensembl57.fa
    -rw-r--r-- 1 anecsule henrik 58822647 Jun 1 13:58 chr3_ensembl57.rev.1.ebwt
    -rw-r--r-- 1 anecsule henrik 23794420 Jun 1 13:58 chr3_ensembl57.rev.2.ebwt

    ##################################

    And the output :

    ##################################

    Error reading ebwt array: returned 3953400, length was 54387328
    Your index files may be corrupt; please try re-building or re-downloading.
    A complete index consists of 6 files: XYZ.1.ebwt, XYZ.2.ebwt, XYZ.3.ebwt,
    XYZ.4.ebwt, XYZ.rev.1.ebwt, and XYZ.rev.2.ebwt. The XYZ.1.ebwt and
    XYZ.rev.1.ebwt files should have the same size, as should the XYZ.2.ebwt and
    XYZ.rev.2.ebwt files.
    Command: /home/vital-it/anecsule/Tools/bowtie-0.12.3/bowtie -p 4 -q --phred33-quals -m 1 /scratch/frt/yearly/necsulea/Orthosplice/results/tests_bowtie/index_0.12.3/chr3_ensembl57 /scratch/frt/yearly/necsulea/Orthosplice/results/tests_bowtie/reads.txt /scratch/frt/yearly/necsulea/Orthosplice/results/tests_bowtie/test_1_0.12.3/results_1.txt

    ##################################

    I'm currently testing one potential solution: I've noticed that in ebwt.h you're using the "read" function in C if BOWTIE_MM is defined (i.e. on Linux) and the "fread" function if not (i.e. on Windows). I was wondering if I would get the same errors with "fread", so I've compiled bowtie as if for Windows, and I'm doing the same tests. I'll let you know if that works ok.

    Also, I wanted to ask you if you think it's normal that TopHat does not catch this error thrown by Bowtie. I've had several TopHat runs that finished with apparent "success", but which in fact only gave partial results because reading the Bowtie index for the junction sequences had failed. This seems quite dangerous, as most users will not check the log files for Bowtie errors if TopHat has finished successfully.

    Thanks again for your help !
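
On the concern about errors going unnoticed: a wrapper script that launches an aligner can refuse to continue when the child process reports trouble. The sketch below is a general pattern, not TopHat's actual behaviour. It assumes the tool signals failure through a non-zero exit status or a recognisable message on stderr (here the "Error reading" text from the output above). The helper `run_checked` is hypothetical, and a small Python child process stands in for the real command.

```python
import subprocess
import sys


def run_checked(cmd, error_markers=("Error reading",)):
    """Run `cmd`; raise if it exits non-zero or prints a known error marker on stderr."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    hits = [marker for marker in error_markers if marker in result.stderr]
    if result.returncode != 0 or hits:
        raise RuntimeError(
            f"command failed (exit {result.returncode}): {result.stderr.strip()}"
        )
    return result.stdout


# Stand-in for an aligner that prints an error but still exits with status 0.
fake_failure = [
    sys.executable, "-c",
    "import sys; sys.stderr.write('Error reading ebwt array\\n')",
]

try:
    run_checked(fake_failure)
except RuntimeError as exc:
    print("caught:", exc)
```

Checking stderr as well as the exit status covers the case where a failing step still reports success to its caller.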



  • Ben Langmead
    replied
    Originally posted by anecsulea View Post
    I'm having a recurrent problem with Bowtie: it fails reading the indexes it had just built.

    Here are some details about my configuration: I'm using Bowtie 0.12.5 (but 0.12.3 gave the exact same error), on a Linux x86_64 computer.

    I get error messages of this type:

    Error reading _plen[] array: 4194272, 55604484

    Error reading ebwt array: returned 41750080, length was 168445184

    The index had been previously built by the same version of Bowtie. In fact these errors had occurred while running TopHat (which incidentally does not catch the errors thrown by Bowtie and finishes the run with "success", but does not give correct or complete results).

    The worst thing is that this error does not occur every time: as a test, I've run Bowtie about 100 times on a toy dataset (with the exact same input reads and genome index), and Bowtie only crashed 6 times. But it does seem that it crashes more often when the input is larger.

    I don't understand what might be the problem. I'm starting to wonder if it might be because the filesystem structure is somehow corrupt on the computers I'm using. This is why I would like to know if anyone else has encountered this problem.
    Hi Anamaria,

    These types of errors occur when the files are genuinely either corrupt or incomplete (e.g. if the disk becomes exhausted during the index-building process). Can you send detailed output from one example where this happens, including a 'ls -l' on the index files after bowtie-build completes?

    Thanks,
    Ben



  • anecsulea
    started a topic Bowtie can't read index files

    Bowtie can't read index files

    Dear all,

    I'm having a recurrent problem with Bowtie: it fails reading the indexes it had just built.

    Here are some details about my configuration: I'm using Bowtie 0.12.5 (but 0.12.3 gave the exact same error), on a Linux x86_64 computer.

    I get error messages of this type:

    Error reading _plen[] array: 4194272, 55604484

    Error reading ebwt array: returned 41750080, length was 168445184

    The index had been previously built by the same version of Bowtie. In fact these errors had occurred while running TopHat (which incidentally does not catch the errors thrown by Bowtie and finishes the run with "success", but does not give correct or complete results).

    The worst thing is that this error does not occur every time: as a test, I've run Bowtie about 100 times on a toy dataset (with the exact same input reads and genome index), and Bowtie only crashed 6 times. But it does seem that it crashes more often when the input is larger.

    I don't understand what might be the problem. I'm starting to wonder if it might be because the filesystem structure is somehow corrupt on the computers I'm using. This is why I would like to know if anyone else has encountered this problem.

    Any comments or suggestions would be much appreciated. Thank you for your help !

    Best,

    Anamaria
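
The repeated-run test described in this post can be scripted so that intermittent failures are counted rather than spotted by eye. The sketch below is a minimal harness under the assumption that a failed run exits with a non-zero status. `count_failures` is a hypothetical helper, and the command shown is a stand-in rather than a real Bowtie invocation.

```python
import subprocess
import sys


def count_failures(cmd, runs):
    """Run `cmd` `runs` times; return (number of failures, distinct stderr texts)."""
    failures = 0
    messages = set()
    for _ in range(runs):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures += 1
            messages.add(result.stderr.strip())
    return failures, messages


# Stand-in command that always fails; sys.exit with a string writes the
# string to stderr and exits with status 1.
always_fails = [
    sys.executable, "-c",
    "import sys; sys.exit('simulated index read error')",
]

failures, messages = count_failures(always_fails, runs=3)
print(failures, messages)
```

Collecting the distinct stderr texts alongside the count makes it easier to see whether every failure is the same "Error reading" message or several different problems.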
