GSNAP gives Bus Error: 10

unicornich

08-11-2014, 04:31 AM

Hello everypony!

I am using GSNAP to map my paired-end RNA-seq reads to a reference genome. It ran normally a few months ago, but now that I need to remap some reads with the exact same command line as before, GSNAP no longer works.
It starts the alignment normally and then after a short while fails with Bus error: 10.
This is what it looks like:

Code:

gsnap -d oregonR_reference --quality-protocol illumina -N 1 -s /Volumes/Temp/Anna/reference/oregonR_reference/oregonR_reference.maps/dmel-all-transcript-r5.49-Parent1.iit -t 20 -A sam --split-output dmel_oregonR_t13_rep1 /Volumes/Temp/Anna/reads/trimmed_reads/oregon/dmel_oregonR_t13_rep1_1  /Volumes/Temp/Anna/reads/trimmed_reads/oregon/dmel_oregonR_t13_rep1_2 1 > log13r1.txt 2 > err13r1.txt
GSNAP version 2014-02-28 called with args: gsnap -D /Volumes/Temp/Anna/reference/ -d oregonR_reference --quality-protocol illumina -N 1 -s /Volumes/Temp/Anna/reference/oregonR_reference/oregonR_reference.maps/dmel-all-transcript-r5.49-Parent1.iit -t 20 -A sam --split-output dmel_oregonR_t13_rep1 /Volumes/Temp/Anna/reads/trimmed_reads/oregon/dmel_oregonR_t13_rep1_1 /Volumes/Temp/Anna/reads/trimmed_reads/oregon/dmel_oregonR_t13_rep1_2 1 2
Checking compiler assumptions for popcnt: 000041A7 clz=17 clz=0 popcount=7 
Checking compiler assumptions for SSE2: 000041A7 10D63AF1 xor=10D67B56
Checking compiler assumptions for SSE4.1: -89 -15 max=241
Novel splicing (-N) and known splicing (-s) both turned on => assume reads are RNA-Seq
Note: >1 sequence detected, so index files are being memory mapped.
  GSNAP can run slowly at first while the computer starts to accumulate
  pages from the hard disk into its cache.  To copy index files into RAM
  instead of memory mapping, use -B 3, -B 4, or -B 5, if you have enough RAM.
  For more speed, also try multiple threads (-t <int>), if you have multiple processors or cores.
Pre-loading compressed genome (oligos).....,...,...,...,...,...,...,...,..done (63,276,204 bytes, 15449 pages, 0.17 sec)
Pre-loading compressed genome (bits).....,...,...,...,...,...,...,...,..done (63,276,204 bytes, 15449 pages, 0.16 sec)
Pre-loading suffix array...............................................................................................................................,............................................................................................................................................done (674,946,152 bytes)
Looking for index files in directory /Volumes/Temp/Anna/reference//oregonR_reference
  Pointers file is oregonR_reference.ref12153bitpackptrs
  Offsets file is oregonR_reference.ref12153bitpackcomp
  Positions file is oregonR_reference.ref153positions
Offsets compression type: bitpack
Allocating memory for ref offset pointers, kmer 15, interval 3...done (134,217,736 bytes, 1.45 sec)
Allocating memory for ref offsets, kmer 15, interval 3...done (226,957,088 bytes, 2.48 sec)
Pre-loading ref positions, kmer 15, interval 3........................................................................................done (215,791,212 bytes, 52684 pages, 0.60 sec)
Reading splicing file /Volumes/Temp/Anna/reference/oregonR_reference/oregonR_reference.maps/dmel-all-transcript-r5.49-Parent1.iit locally...found donor and acceptor tags, so treating as splicesites file
splice distances present...37770 unique splicesites...
Non-standard nucleotide N near splice site YHet_Parent1:291284.  Discarding...
37769 splicesites are valid...splicetrie_obs has 37773 entries...splicetrie_max has 3858412 entries...done
GMAP modes: pairsearch, indel_knownsplice, terminal, improvement
Starting alignment
Bus error: 10

Does anybody know what could be wrong this time and how to fix it?

Thanks in advance!

Ana Marija

unicornich

#2

08-20-2014, 07:53 AM

For anyone who ever comes across this very uninformative error message, here is how I found the cause:

All of my fastq files except one mapped without error messages, and all the standard fastq checks gave no clue about what was wrong with that one. I assumed the problem was not in the mapper itself, so I ran a binary search through the problematic fastq file.
The way I did this "binary search": I split my file(s) in half and reran the mapping on both halves. Whichever half gave the error, I split again, and I repeated the procedure until only two reads were left in the final fastq file.
After 24 iterations, I had a tiny fastq file (which still gave me the Bus error: 10) containing 2 reads, one of which looked normal and another which looked like a microsatellite read.
So I took the microsat read, remapped it by itself, and this time it gave a different error:

Code:

Paired-end accessions FCD20FCACXX:2:1302:15509:87068#ATCACGAT/2 and FCD20FCACXX:2:1302:15509:87068#ATCACGAT/1 do not match

When I remapped the other "normal" read, it mapped normally, with no errors.

So obviously, the microsat read was the one causing the problem.
I tried remapping it again after removing the first nucleotide, and its quality value, from one of the reads in the pair, so that both read sequences were complementary again. After doing this, the mapping worked perfectly, with no errors.
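
For anyone who wants to rule out the more ordinary pairing problems before bisecting, here is a minimal Python sketch of a per-pair sanity check. It is not GSNAP's own logic: the names `accession` and `check_pair` are made up for illustration, and it assumes plain 4-line fastq records whose mate names end in /1 and /2. It only tests that the two accessions agree and that each sequence is as long as its quality string, so a pair that passes could still trip up the mapper.

```python
import re
from typing import List, Tuple

# One fastq record: header, sequence, separator ("+") and quality lines.
Record = Tuple[str, str, str, str]

# Trailing mate marker such as "/1" or "/2" (assumed naming convention).
MATE_SUFFIX = re.compile(r"/[12]$")


def accession(header: str) -> str:
    """Return the read name without the leading '@', any description
    after whitespace, or a trailing /1 or /2 mate suffix."""
    fields = header[1:].split()
    name = fields[0] if fields else ""
    return MATE_SUFFIX.sub("", name)


def check_pair(read1: Record, read2: Record) -> List[str]:
    """List the problems found in one read pair (empty list if none)."""
    problems = []
    if accession(read1[0]) != accession(read2[0]):
        problems.append("accessions do not match")
    for label, (header, seq, _plus, qual) in (("read 1", read1), ("read 2", read2)):
        if not header.startswith("@"):
            problems.append(f"{label}: header does not start with '@'")
        if len(seq) != len(qual):
            problems.append(f"{label}: sequence and quality lengths differ")
    return problems
```

Calling `check_pair` on the two records of a suspicious pair returns an empty list when nothing obvious is wrong, and a list of short messages otherwise.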

So there seems to be a weird issue in GSNAP-2014-02-28 with the complementarity of microsat paired reads.

I have no idea what the reason for it is, or why GSNAP gives two different error messages depending on whether the reads are mapped together with other reads or by themselves.
But at least this could be a hint for someone else out there who has the same problem I had.
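
The halving procedure described above was done by hand, but it could be automated along these lines. This is only a sketch in Python under some assumptions: the fastq files have exactly 4 lines per read and fit in memory, both mate files list the reads in the same order, and `fails` is a hypothetical callback that you would write yourself (for example, one that writes the given pairs to two temporary files, runs the mapper on them and returns True if it crashes). None of the function names come from GSNAP.

```python
from typing import Callable, List, Tuple

# One fastq record: header, sequence, separator ("+") and quality lines.
Record = Tuple[str, str, str, str]
Pair = Tuple[Record, Record]


def read_fastq(path: str) -> List[Record]:
    """Read a 4-line-per-record fastq file into memory."""
    with open(path) as handle:
        lines = [line.rstrip("\n") for line in handle]
    if len(lines) % 4 != 0:
        raise ValueError(f"{path}: line count is not a multiple of 4")
    return [tuple(lines[i:i + 4]) for i in range(0, len(lines), 4)]


def read_pairs(path1: str, path2: str) -> List[Pair]:
    """Load both mate files and keep the mates together as pairs."""
    reads1, reads2 = read_fastq(path1), read_fastq(path2)
    if len(reads1) != len(reads2):
        raise ValueError("the two fastq files contain different numbers of reads")
    return list(zip(reads1, reads2))


def bisect_failing(pairs: List[Pair],
                   fails: Callable[[List[Pair]], bool]) -> List[Pair]:
    """Halve the input repeatedly, keeping a half for which fails() is True."""
    if not fails(pairs):
        raise ValueError("the full input does not fail, nothing to bisect")
    while len(pairs) > 1:
        mid = len(pairs) // 2
        first, second = pairs[:mid], pairs[mid:]
        if fails(first):
            pairs = first
        elif fails(second):
            pairs = second
        else:
            # The failure only shows up when both halves are mapped together.
            break
    return pairs
```

With a suitable `fails` function, `bisect_failing(read_pairs(file_1, file_2), fails)` returns the smallest subset this simple strategy can isolate. It only follows one failing half at each step, so if several reads cause trouble it will find just one of them.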

To halve my fastq files I just used

Code:

split -l n dmel_oregonR_t13_rep1_1 splitrep1_1
split -l n dmel_oregonR_t13_rep1_2 splitrep1_2
# the output is 2 files with aa and ab extensions: splitrep1_1aa & splitrep1_1ab

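where n is the number of lines of the fastq file divided by 2. Since each read normally takes up 4 lines in a fastq file, n should be a multiple of 4 so that no read is cut in two.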
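
A value of n for the split commands above can be worked out with a few lines of Python. This is a rough sketch that assumes 4 lines per read; `count_lines` and `half_split_lines` are illustrative names, not part of any tool. It rounds half of the reads up and converts back to lines, so that `split -l n` produces exactly two files and never cuts a read in the middle.

```python
def count_lines(path: str) -> int:
    """Count the lines in a file without loading it all into memory."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def half_split_lines(total_lines: int) -> int:
    """Return n for `split -l n`: half of the reads (rounded up), in lines."""
    if total_lines % 4 != 0 or total_lines < 8:
        raise ValueError("expected at least 2 complete 4-line records")
    records = total_lines // 4
    # Rounding up keeps the output at two files when the read count is odd.
    return ((records + 1) // 2) * 4
```

For example, `half_split_lines(count_lines("dmel_oregonR_t13_rep1_1"))` gives the n for the first file. Both mate files need the same n, otherwise the halves no longer contain matching pairs.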

And that's it!

Cheers,

Ana Marija