assembly output format for visualisation
Another question!
I want to view my assembly with other software, e.g. Tablet, Mauve or OSlay. Is there any way in Ray to convert the output files to ACE, MAQ, SAM or BAM format for those post-assembly programs?
In the FAQ section of your site there is a question about the AMOS format for the output, but I did not use it. Do I have to run the assembly again with the -amos option on? Unfortunately, the AMOS format is not universally readable by other programs.
I was trying to figure out what the output files contain, but I am not sure which one I should use for those visualization programs, or which one to feed to a Perl/shell script for format conversion.
I would appreciate any clue.
Thanks!
YT
-
Originally posted by yifangt:
Hello Sebastien:
Can I ask what is the maximum value that I can set for MAXKMERLENGTH in Ray 2.1? So far the largest value I have seen set is 128, and for Velvet the largest I have seen is 151. Somewhere I saw "There is an arbitrary number that can be set for MAXKMERLENGTH", but I cannot find the link anymore. Can you confirm the maximum value I can set for Ray? Thanks a lot!
YT
However, read lengths and sequencing errors will be limiting factors here.
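To make the point about read lengths and sequencing errors concrete, here is a rough sketch (not Ray code). It assumes independent errors at a uniform per-base rate, which is a simplification, and the function names, read length and error rate are made up for illustration.

```python
def kmers_per_read(read_length: int, k: int) -> int:
    """Number of k-mers one read contributes (0 if the read is shorter than k)."""
    return max(read_length - k + 1, 0)


def error_free_fraction(k: int, per_base_error: float) -> float:
    """Probability that a k-mer carries no sequencing error, assuming
    independent errors at a uniform per-base rate (a simplification)."""
    return (1.0 - per_base_error) ** k


if __name__ == "__main__":
    read_length = 150       # hypothetical read length
    per_base_error = 0.01   # hypothetical 1% per-base error rate
    for k in (31, 63, 127, 151):
        print(k,
              kmers_per_read(read_length, k),
              round(error_free_fraction(k, per_base_error), 3))
```

A k value larger than the read length yields no k-mers at all, and the error-free fraction shrinks as k grows, so a large compile-time MAXKMERLENGTH does not by itself make a large k useful.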
-
maximum kmer length?
Hello Sebastien:
Can I ask what is the maximum value that I can set for MAXKMERLENGTH in Ray 2.1? So far the largest value I have seen set is 128, and for Velvet the largest I have seen is 151. Somewhere I saw "There is an arbitrary number that can be set for MAXKMERLENGTH", but I cannot find the link anymore. Can you confirm the maximum value I can set for Ray? Thanks a lot!
YT
-
Originally posted by yaximik:
It's two quad-core Xeon E5620 with 96 GB memory and nVIDIA NV 300 in the double display mode.
Is your experience with Hawkeye or Tablet problematic only with AMOS files generated by Ray, or does the issue also occur with AMOS files generated by other tools?
-
When the AMOS file format was implemented, I tested Hawkeye, Tablet, and Bank-transact.
You can submit a ticket and I will eventually look at that, but this feature has not really changed since it was implemented.
What is the hardware (memory, processor, video card) on which you are running Hawkeye?
-
Originally posted by yaximik:
How is the average length calculated?
I guess it is done after reads are aligned to the assembly, correct?
But I thought that the assembly depends on paired-end information, so unless I am wrong there is a logical short circuit here: paired reads are distanced based on the assembly, which depends on the distance between paired reads.
I like your short circuit.
*
It is not that I am maliciously after how the algorithm was designed.
I am trying to guess where such a discrepancy between the Bioanalyzer and the assembler is coming from. Could it be that Bioanalyzer traces for libraries are so misleading that I really have no idea about the size of the libraries I am sequencing?
Or is autocalc somehow misled in its library size estimation?
-
Originally posted by yaximik:
Tried to view the AMOS.afg file (37.1 GB) using a couple of programs. Tablet is painfully slow, and it eventually quit, reporting an error in some line. Hawkeye (AMOS package) successfully imported the assembly into a bank and even opened a graphic window showing contig 1, but then hung and had to be killed.
Code:
[yaximik@G5NNJN1 ~]$ hawkeye
START DATE: Mon Mar 11 11:06:54 2013
Bank is: /home/yaximik/AssRefMap/SC/Ray/RayOutput/AMOS.afg.bnk
0% 100%
AFG ..................................................
Messages read: 175403161
Objects added: 175403161
Objects deleted: 0
Objects replaced: 0
END DATE: Mon Mar 11 12:13:09 2013
Opening /home/yaximik/AssRefMap/SC/Ray/RayOutput/AMOS.afg.bnk... [160.12s]
Indexing Contigs .......... [83.11s]
107326772 reads in 1409913 contigs
Scaffold information not available
Mates not available:WHAT: Could not open bank file, /home/yaximik/AssRefMap/SC/Ray/RayOutput/AMOS.afg.bnk/FRG.ifo, No such file or directory
LINE: 1264
FILE: Bank_AMOS.cc
Features not available
Initialize Display
.Loading AssemblyStats...[8.95s]
.Loading Features... [0.01s]
.Loading Libraries... [0.00s]
.Loading Scaffolds....Loading Contigs... [186.21s]
....Loading NCharts... [21.83s]
. [217.01s]
Loading Contig 1... [0.05s] 109076 reads
Loading reads... [343.52s]
Total Load Time: [803.92s]
Loading mates ..................................................
inserts: 108933 mated: 0 matelisted: 0 unmated: 108933 happy: 0 unhappy: 0
Paint: coverage contigs insetcovfeat readcovfeat features inserts width: 12457 swidth: 778 height: 26357..
Killed
[yaximik@G5NNJN1 ~]$
You can submit a ticket and I will eventually look at that, but this feature has not really changed since it was implemented.
What is the hardware (memory, processor, video card) on which you are running Hawkeye?
For visualization, I am working on Ray Cloud Browser.
-
Quote:
If not, is it an average fragment length in the library?
Yes.
Quote:
Such as surmised from BioAnalyzer trace, for example?
Yes, but the BioAnalyzer will also include sequencing adapters in the evaluation whereas these are not included in sequencing reads usually.
It is not that I am maliciously after how the algorithm was designed. I am trying to guess where such a discrepancy between the Bioanalyzer and the assembler is coming from. Could it be that Bioanalyzer traces for libraries are so misleading that I really have no idea about the size of the libraries I am sequencing? Or is autocalc somehow misled in its library size estimation?
-
Tried to view the AMOS.afg file (37.1 GB) using a couple of programs. Tablet is painfully slow, and it eventually quit, reporting an error in some line. Hawkeye (AMOS package) successfully imported the assembly into a bank and even opened a graphic window showing contig 1, but then hung and had to be killed.
Code:
[yaximik@G5NNJN1 ~]$ hawkeye
START DATE: Mon Mar 11 11:06:54 2013
Bank is: /home/yaximik/AssRefMap/SC/Ray/RayOutput/AMOS.afg.bnk
0% 100%
AFG ..................................................
Messages read: 175403161
Objects added: 175403161
Objects deleted: 0
Objects replaced: 0
END DATE: Mon Mar 11 12:13:09 2013
Opening /home/yaximik/AssRefMap/SC/Ray/RayOutput/AMOS.afg.bnk... [160.12s]
Indexing Contigs .......... [83.11s]
107326772 reads in 1409913 contigs
Scaffold information not available
Mates not available:WHAT: Could not open bank file, /home/yaximik/AssRefMap/SC/Ray/RayOutput/AMOS.afg.bnk/FRG.ifo, No such file or directory
LINE: 1264
FILE: Bank_AMOS.cc
Features not available
Initialize Display
.Loading AssemblyStats...[8.95s]
.Loading Features... [0.01s]
.Loading Libraries... [0.00s]
.Loading Scaffolds....Loading Contigs... [186.21s]
....Loading NCharts... [21.83s]
. [217.01s]
Loading Contig 1... [0.05s] 109076 reads
Loading reads... [343.52s]
Total Load Time: [803.92s]
Loading mates ..................................................
inserts: 108933 mated: 0 matelisted: 0 unmated: 108933 happy: 0 unhappy: 0
Paint: coverage contigs insetcovfeat readcovfeat features inserts width: 12457 swidth: 778 height: 26357..
Killed
[yaximik@G5NNJN1 ~]$
-
Originally posted by yaximik:
Got to be another reason. The assembly file from minia includes a max contig of 16091 nt. Without this dataset, Ray produced an assembly with a max contig/scaffold of 46428 nt.
Please do submit a ticket if you feel this should be fixed.
That is puzzling. The combined adaptor length (both sides) is standard at 120 bp, so autocalc is way off (600-120=480, but the estimate is ~150). Obviously a much smaller library size should affect scaffolding. Would it be better to provide the real numbers? Also, I guess a narrower distribution should be better, correct? This can be done by refractionating the library and collecting a narrow distribution, say +/-5%.
LibraryStatistics.txt contains averages, but you have all the signal in Library0.txt, Library1.txt. If you are using the git version of Ray, this information is now in LibraryData.xml.
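As a way to look at the full signal instead of just the averages, here is a minimal sketch that recomputes the mean and standard deviation from a distance histogram. The layout of Library0.txt is not described in this thread, so the sketch assumes each data line holds an observed distance and a count separated by whitespace; check your own file before relying on it. The function names are hypothetical.

```python
import math
from typing import Iterable, List, Tuple


def parse_histogram(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Parse 'distance count' lines (assumed layout); skip anything else."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            pairs.append((int(fields[0]), int(fields[1])))
        except ValueError:
            continue  # header or comment line
    return pairs


def weighted_mean_sd(pairs: List[Tuple[int, int]]) -> Tuple[float, float]:
    """Count-weighted mean and (population) standard deviation."""
    total = sum(count for _, count in pairs)
    if total == 0:
        raise ValueError("histogram is empty")
    mean = sum(d * c for d, c in pairs) / total
    variance = sum(c * (d - mean) ** 2 for d, c in pairs) / total
    return mean, math.sqrt(variance)
```

To use it, open the library file from the Ray output directory, pass the file object to parse_histogram, and compare the result with the values in LibraryStatistics.txt. Printing the parsed pairs also shows whether the distribution has more than one peak, which a single average would hide.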
-
The maximum read length is 65536 nucleotides.
The 600 bp +/- 15% presumably includes adapters that are not in sequencing reads.
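The adapter correction can be written out as a small calculation. This is only an illustration of the arithmetic, using the 600 bp +/- 15% library size and the 120 bp combined adapter length quoted in this thread; the function name is made up.

```python
from typing import Tuple


def size_without_adapters(total_mean: float, tolerance: float,
                          adapter_total: float) -> Tuple[float, float, float]:
    """Return (low, mean, high) fragment sizes once adapters are subtracted.

    total_mean    -- library size including adapters (e.g. from a Bioanalyzer trace)
    tolerance     -- fractional spread, e.g. 0.15 for +/- 15%
    adapter_total -- combined length of both adapters
    """
    low = total_mean * (1 - tolerance) - adapter_total
    high = total_mean * (1 + tolerance) - adapter_total
    return low, total_mean - adapter_total, high


low, mean, high = size_without_adapters(600, 0.15, 120)
print(f"expected outer distance: {mean:.0f} bp (about {low:.0f} to {high:.0f} bp)")
```

With these inputs the mean comes out at 480 bp, with a range of roughly 390 to 570 bp, so adapters alone do not account for an estimate near 150 bp.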
-
Originally posted by yaximik:
Hi,
What is the meaning of averageOuterDistance and standardDeviation for paired end files?
This is computed for paired reads and mate pairs.
Is it just average read length in the dataset?
If so, then why is it not required for a single-read file?
If not, is it an average fragment length in the library?
Such as surmised from BioAnalyzer trace, for example?
If so, then the default autocalc may give a very wrong estimate, couldn't it? For example, one of my paired read runs was done with a library of 600 bp +/- 15%, but during assembly the autocalc estimate was something like 150 bp. How can this be so far off?
You can run another application on your data (like ABySS) and you'll see that Ray's right.
-
Originally posted by KirillK:
Hi guys!
Is there a way to provide a reference genome for Ray?
cheers,
KK
Code:
-search searchDirectory
    Provides a directory containing fasta files to be searched in the de Bruijn graph.
    Biological abundances will be written to RayOutput/BiologicalAbundances
    See Documentation/BiologicalAbundances.txt
See this paper for more information.
-
Originally posted by yaximik:
Hi,
I tried to run Ray (maxkmer 32) on 2 x quad core RHEl58 with hyper-threading enabled:
mpiexec -n 16 Ray <Ray.conf> and got the error:
Code:
........
Loader::load] File: /media/FantomHD/Data/MiSeq/SC/AdQ30/SC-MILLib1-Herc2s10cFr1Fr2run2R1AdQ30.fastq (please wait...)
[Loader::load] File: /media/FantomHD/Data/MiSeq/SC/AdQ30/SC-MILLib1-Herc2s10cFr1Fr2run2R1AdQ30.fastq (please wait...)
[Loader::load] File: /media/FantomHD/Data/MiSeq/SC/AdQ30/SCPfx3s25cFr3-150-200run1R1AdQ30.fastq (please wait...)
[Loader::load] File: /media/FantomHD/Data/MiSeq/SC/AdQ30/SCPfx3s25cFr3-150-200run1R1AdQ30.fastq (please wait...)
[Loader::load] File: /media/FantomHD/Data/MiSeq/SC/AdQ30/SCPfx3s25cFr3-150-200run2R1AdQ30.fastq (please wait...)
[Loader::load] File: /media/FantomHD/Data/MiSeq/SC/AdQ30/SCPfx3s25cFr3-150-200run2R1AdQ30.fastq (please wait...)
[Loader::load] File: /media/FantomHD/AssRefMap/SC/SCold/SColdAll.fasta (please wait...)
[Loader::load] File: /media/FantomHD/AssRefMap/SC/SCold/SColdAll.fasta (please wait...)
[Loader::load] File: /media/FantomHD/AssRefMap/SC/SCold/SCallSanger.fasta (please wait...)
[Loader::load] File: /media/FantomHD/AssRefMap/SC/SCold/SCallSanger.fasta (please wait...)
[Loader::load] File: /home/yaximik/AssRefMap/SC/minia/SCMiSeqAllFGMGPGIGclean_k27.contigs.fasta (please wait...)
[G5NNJN1:07040] *** Process received signal ***
[G5NNJN1:07040] Signal: Segmentation fault (11)
[G5NNJN1:07040] Signal code: (128)
[G5NNJN1:07040] Failing at address: (nil)
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 7040 on node G5NNJN1 exited on signal 11 (Segmentation fault).
-
Hi guys!
Is there a way to provide a reference genome for Ray?
cheers,
KK