I don't know whether the following issue is related to my bwa samse problem:
I tried to run GATK for local realignment of a BAM file. I used a 1000 Genomes BAM file and NA12878.HiSeq.WGS.bwa.cleaned.recal.b37.20.bam, which is recommended in the GATK documentation. The input reads and the reference file have incompatible contigs (see below). Basically, in the reference file downloaded from the cufflinks website, chromosomes are named chr plus the chromosome number or letter (chr1, chrX, etc.), whereas in the BAM file they are named 1, X, etc.
Does this mean that, given these two problems (GATK and bwa samse), the index file downloaded from cufflinks is not really usable with other data sets, even though the two problems are of different kinds (location of the index file and chromosome naming)?
I look forward to your advice,
Carol
------------------------------------------------------
ERROR MESSAGE: Input files reads and reference have incompatible contigs: No overlapping contigs found.
##### ERROR reads contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, MT, GL000207.1, GL000226.1, GL000229.1, GL000231.1, GL000210.1, GL000239.1, GL000235.1, GL000201.1, GL000247.1, GL000245.1, GL000197.1, GL000203.1, GL000246.1, GL000249.1, GL000196.1, GL000248.1, GL000244.1, GL000238.1, GL000202.1, GL000234.1, GL000232.1, GL000206.1, GL000240.1, GL000236.1, GL000241.1, GL000243.1, GL000242.1, GL000230.1, GL000237.1, GL000233.1, GL000204.1, GL000198.1, GL000208.1, GL000191.1, GL000227.1, GL000228.1, GL000214.1, GL000221.1, GL000209.1, GL000218.1, GL000220.1, GL000213.1, GL000211.1, GL000199.1, GL000217.1, GL000216.1, GL000215.1, GL000205.1, GL000219.1, GL000224.1, GL000223.1, GL000195.1, GL000212.1, GL000222.1, GL000200.1, GL000193.1, GL000194.1, GL000225.1, GL000192.1, NC_007605]
##### ERROR reference contigs = [chrM, chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY]
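The two lists in the error message differ mainly in naming convention: the BAM uses b37-style names (1, X, MT) and the reference uses UCSC-style names (chr1, chrX, chrM). A minimal sketch of that comparison follows, using abbreviated copies of the two lists. The helper names `b37_to_ucsc` and `shared_contigs` are made up for illustration and are not part of GATK or samtools.

```python
# Sketch only: compare contig names between a BAM header and a reference.
# b37_to_ucsc and shared_contigs are hypothetical helpers, not GATK/samtools API.

def b37_to_ucsc(name):
    """Return the UCSC-style name for a b37-style contig, or None if unknown."""
    if name == "MT":
        return "chrM"
    if name in ("X", "Y") or (name.isdigit() and 1 <= int(name) <= 22):
        return "chr" + name
    return None  # e.g. GL000207.1, NC_007605: absent from this reference


def shared_contigs(read_contigs, ref_contigs, rename=None):
    """Contigs present in both lists, optionally after renaming the read contigs."""
    if rename is not None:
        read_contigs = [rename(c) for c in read_contigs]
    return set(read_contigs) & set(ref_contigs)


# Abbreviated versions of the two lists from the error message.
reads = ["1", "2", "20", "X", "Y", "MT", "GL000207.1", "NC_007605"]
reference = ["chrM", "chr1", "chr2", "chr20", "chrX", "chrY"]

print(shared_contigs(reads, reference))               # no names in common
print(shared_contigs(reads, reference, b37_to_ucsc))  # names match after renaming
```

Matching names is not the same as matching sequences. As far as I know, hg19 chrM and b37 MT are different mitochondrial sequences, and GATK also compares contig lengths and order, so using a b37 reference with a b37 BAM is probably the safer route than renaming.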
-
I found the problem but the solution doesn't help and causes another problem:
When I uncompressed the index file from the cufflinks web site, it created several folders, and one of them contained a symbolic link to the index file. I was using that symbolic link, which caused the segfault.
Now I use the right index file but bwa samse is unable to locate the index file although the path is correct:
../pgm/bwa-0.7.3a/bwa samse ~/NGS/hg19/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa SRR062641.filt.sai SRR062641.filt.fastq > SRR062641.filt.sam
[bwa_sai2sam_se] fail to locate the index
[main] Version: 0.7.3a-r367
[main] CMD: ../pgm/bwa-0.7.3a/bwa samse ~/NGS/hg19/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa SRR062641.filt.sai SRR062641.filt.fastq
ls -lt /home/carolw/NGS/hg19/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa
-rwxrwxr-x 1 carolw carolw 3157279232 Apr 26 10:52 /home/carolw/NGS/hg19/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa
ls -lt /home/carolw/NGS/hg19/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/
total 3083296
-rwxrwxr-x 1 carolw carolw 3157279232 Apr 26 10:52 genome.fa
-rwxrwxr-x 1 carolw carolw 3099 Apr 13 2012 genome.dict
-rwxrwxr-x 1 carolw carolw 783 Mar 15 2012 genome.fa.fai
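The listing above shows only genome.fa, genome.dict and genome.fa.fai. To my knowledge, bwa 0.7 expects the files written by "bwa index" next to the prefix (.amb, .ann, .bwt, .pac, .sa), so their absence would be consistent with the "fail to locate the index" message. A small sketch that checks for them; the function name is made up for illustration.

```python
import os

# Suffixes written by "bwa index" in the 0.6/0.7 series, to my knowledge.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_index_files(prefix):
    """List the expected index files that do not exist for this prefix."""
    prefix = os.path.expanduser(prefix)  # expand a leading ~ like the shell does
    return [prefix + s for s in BWA_INDEX_SUFFIXES if not os.path.exists(prefix + s)]


if __name__ == "__main__":
    # Prefix taken from the samse command above.
    prefix = "~/NGS/hg19/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa"
    for path in missing_bwa_index_files(prefix):
        print("missing:", path)
```

If files are reported missing, either pass bwa a prefix in a folder that does contain them, or build them first with "bwa index" on the FASTA file.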
-
No to your first question.
Yes, I am the admin of the machine.
I tried to install the package that provides limit, but it seems the installation is not complete:
vlimit
vlimit: vc_get_task_xid(): Function not implemented
limit
No command 'limit' found, did you mean:
Command 'vlimit' from package 'util-vserver' (universe)
limit: command not found
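For context, "limit" is a built-in of csh/tcsh, which is why bash cannot find it; the bash counterpart is "ulimit -a". The value of interest here is the core file size limit. A minimal sketch that reads it from Python's standard resource module (Unix only); the helper names are made up for illustration.

```python
import resource


def core_limit():
    """Return the (soft, hard) core-file size limits of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_CORE)


def describe(value):
    """Render a limit value the way a person would read it."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


soft, hard = core_limit()
print("core file size: soft =", describe(soft), "hard =", describe(hard))
```

A soft limit of 0 means no core file is written. On some distributions core dumps are also handed to a crash-reporting service instead of being written to the working directory, which may explain a "(core dumped)" message with no core file in sight.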
-
Originally posted by carolW: Note that no core file is generated after the segmentation fault, if that answers your question.
Originally posted by carolW: Just a question: to compile and generate the bwa executable, I just invoked "make". Was that sufficient?
Have you been able to use the bwa program successfully on a different data set (look for some E. coli data from SRA if you need a test case), so we can be sure it works otherwise?
Are you the "administrator" of this machine? What does the output of the command "limit" show?
-
Note that no core file is generated after the segmentation fault, if that answers your question.
Just a question: to compile and generate the bwa executable, I just invoked "make". Was that sufficient?
-
I'm not sure I understood your question. No files with "core" in their name are generated. This is what I did:
../pgm/bwa-0.7.3a/bwa samse ../hg19/Homo_sapiens/UCSC/hg19/Sequence/BWAIndex/version0.6.0/genome.fa ./SRR062641.filt.sai ./SRR062641.filt.fastq > ./SRR062641.filt.sam
Segmentation fault (core dumped)
-
Are any files with the word "core" in their name generated after you get the seg fault?
-
Just wanted to verify that bwa is executing properly on your system, which it seems to be doing.
Are you issuing the samse command from the directory that contains the "SRR062641*" files? Have you tried writing ./SRR062641.filt.sai and ./SRR062641.filt.fastq to state explicitly that the files are in the current directory?
-
I get the same output
../pgm/bwa-0.7.3a/bwa samse
Usage: bwa samse [-n max_occ] [-f out.sam] [-r RG_line] <prefix> <in.sai> <in.fq>
../pgm/bwa-0.7.3a/bwa
Program: bwa (alignment via Burrows-Wheeler transformation)
Version: 0.7.3a-r367
Contact: Heng Li <[email protected]>
Usage: bwa <command> [options]
Command: index         index sequences in the FASTA format
         mem           BWA-MEM algorithm
         fastmap       identify super-maximal exact matches
         pemerge       merge overlapping paired ends (EXPERIMENTAL)
         aln           gapped/ungapped alignment
         samse         generate alignment (single ended)
         sampe         generate alignment (paired ended)
         bwasw         BWA-SW for long queries
         fa2pac        convert FASTA to PAC format
         pac2bwt       generate BWT from PAC
         pac2bwtgen    alternative algorithm for generating BWT
         bwtupdate     update .bwt to the new format
         bwt2sa        generate SA from BWT and Occ
I downloaded the human genome index from http://cufflinks.cbcb.umd.edu/igenomes.html. I used it with bwa aln without any problem. From the uncompressed, untarred archive I use only the following file:
hg19/Homo_sapiens/UCSC/hg19/Sequence/BWAIndex/version0.6.0/genome.fa
-
If you just issue the following commands (the lines starting with $), do you see output similar to what is posted here?
Code:
$ bwa samse
Usage: bwa samse [-n max_occ] [-f out.sam] [-r RG_line] <prefix> <in.sai> <in.fq>
$ bwa
Program: bwa (alignment via Burrows-Wheeler transformation)
Version: 0.7.4-r385
Contact: Heng Li <[email protected]>
Usage: bwa <command> [options]
Command: index         index sequences in the FASTA format
         mem           BWA-MEM algorithm
         fastmap       identify super-maximal exact matches
         pemerge       merge overlapping paired ends (EXPERIMENTAL)
Did you create the human genome index or download it from somewhere else?
-
I executed the following command many times and never saw bwa in the list of processes, so it doesn't even seem to start. Does it evaluate the available memory before starting? What prevents it from starting?
../pgm/bwa-0.7.3a/bwa samse ../hg19/Homo_sapiens/UCSC/hg19/Sequence/BWAIndex/version0.6.0/genome.fa SRR062641.filt.sai SRR062641.filt.fastq > SRR062641.filt.sam
Thanks,
Carol
-
Is there anything else that could be done?
Look forward to your reply,
Carol
-
Originally posted by GenoMax: Are you using a 32-bit or 64-bit OS? Did you compile bwa on this machine yourself?
BWA-MEM is a new alignment algorithm (for long reads). See pre-print here: http://arxiv.org/abs/1303.3997
Mastal thought that the segmentation fault is caused by the read length and suggested that I use bwa mem, because the sequences are longer than 100bp according to the output generated by bwa aln (see below). Do you agree with this argument?
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] 109811 sequences have been processed.
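On the [bwa_aln] lines above: as far as I can tell from the bwa source, they are a lookup table that bwa aln prints on every run, giving the maximum number of differences allowed at each read length, and they say nothing about how long the reads in the input actually are. The sketch below is my own reimplementation of that calculation, assuming a 2% per-base error rate and the default -n threshold of 0.04; treat both values and the function names as assumptions.

```python
import math


def bwa_cal_maxdiff(length, err=0.02, thres=0.04):
    """Smallest k such that P(more than k errors in `length` bp) < thres,
    with the number of errors modelled as Poisson(length * err)."""
    lam = length * err
    term = math.exp(-lam)    # P(exactly 0 errors)
    cumulative = term
    for k in range(1, 1000):
        term *= lam / k      # P(exactly k errors)
        cumulative += term
        if 1.0 - cumulative < thres:
            return k
    return 2                 # fallback if the loop never reaches the threshold


def maxdiff_table(min_len=17, max_len=250):
    """(length, max_diff) pairs at each read length where max_diff changes."""
    table, previous = [], None
    for length in range(min_len, max_len + 1):
        k = bwa_cal_maxdiff(length)
        if k != previous:
            table.append((length, k))
        previous = k
    return table


for length, k in maxdiff_table():
    print(f"{length}bp reads: max_diff = {k}")
```

Under these assumptions the sketch gives the same 17bp, 38bp and 64bp steps as the log, which suggests that the 124bp to 225bp lines are just further rows of the same table and not evidence of reads longer than 100bp.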