The quality scores are Phred-style. The insertion QV gives the likelihood that the given base is itself an insertion. The deletion QV refers to the immediately preceding base, and the most likely deleted base is stored as the DeletionTag in the HDF file. In PacBio data, insertions are the most common error, so the QV in the FASTQ is dominated by the insertion QV.
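As a rough illustration of what "Phred-style" means here, the sketch below converts between a QV and an error probability, and between a QV and its FASTQ character. The function names are made up for this example, and the offset of 33 assumes the Sanger encoding; this is not code from the PacBio tools.

```python
import math


def phred_to_prob(qv):
    """Phred QV -> error probability, P = 10^(-Q/10)."""
    return 10 ** (-qv / 10.0)


def prob_to_phred(prob):
    """Error probability -> Phred QV, Q = -10 * log10(P)."""
    return -10.0 * math.log10(prob)


def qv_to_fastq_char(qv, offset=33):
    """Encode an integer QV as a FASTQ quality character (Sanger offset assumed)."""
    return chr(qv + offset)


def fastq_char_to_qv(char, offset=33):
    """Decode a FASTQ quality character back to an integer QV."""
    return ord(char) - offset
```

So a QV of 10 corresponds to a 1 in 10 chance that the call is wrong, and a QV of 20 to a 1 in 100 chance.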
-
Replying to SillyPoint-
We were interested in employing the pacBioToCA script, but the pacBioToCA pipeline expects PacBio RS sequences in FASTQ format (with Sanger (Phred+33) quality values). We were only given an assembled.fasta and a filtered_subreads.fasta.
-
I am the developer of pacBioToCA, happy to see interest in the pipeline. If you have only fasta files, the pacBioToCA wiki page includes a section on inputting PacBio RS sequences: http://sourceforge.net/apps/mediawik...o_RS_Sequences
We provide a Java utility to convert the FASTA data to FASTQ with uniform quality values (http://www.cbcb.umd.edu/~sergek/PacB...ToFastq.tar.gz). The instructions for using it are at the above link.
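For readers who just want to see what "FASTQ with uniform quality values" looks like, here is a minimal Python sketch of the idea. It is not the Java utility linked above; the function name and the default quality character ("5", which is QV 20 under the Sanger offset) are arbitrary choices for this example.

```python
def fasta_to_fastq(fasta_text, quality_char="5"):
    """Convert FASTA text to FASTQ text, giving every base the same quality.

    quality_char is a placeholder value chosen for illustration only.
    """
    records = []
    name, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            # Sequence may be wrapped over several lines; join them later.
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))

    out = []
    for rec_name, seq in records:
        out.extend(["@" + rec_name, seq, "+", quality_char * len(seq)])
    return "\n".join(out) + ("\n" if out else "")
```

Each record gets a quality string of the same length as its sequence, which is what a FASTQ parser requires.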
-
Thanks for all your replies. I want to understand the SMRT Pipe workflow for running an assembly. I understand that BLASR has to be run to align the long reads and CCS reads, and that a consensus is then made with make-consensus from AMOS. However, the input to the AMOS make-consensus is a TIG file. How do we go about generating it from the BLASR output? Could somebody please help me with the pipeline?
-
Installing SMRTanalysis
Hi All,
We are trying to install the SMRTanalysis software from PacBio. We are getting the error below:
  File "./smrtpipe.py", line 4, in <module>
    import pkg_resources
  File "/usr/local/lib64/python2.6/site-packages/distribute-0.6.24-py2.6.egg/pkg_resources.py", line 2707, in <module>
    working_set.require(__requires__)
  File "/usr/local/lib64/python2.6/site-packages/distribute-0.6.24-py2.6.egg/pkg_resources.py", line 686, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/local/lib64/python2.6/site-packages/distribute-0.6.24-py2.6.egg/pkg_resources.py", line 584, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: pbpy==0.1
It seems like something is wrong with Python. We updated to Python 2.7 and included the correct path in our .bashrc files, but we still get the same error.
Any suggestions from previous experience?
Thanks
-
I think SMRTanalysis comes bundled with its own python. The error above seems to be referring to a system python2.6 directory though, so are you using the system python?
For clarity you should have started a new thread instead of posting under this one.
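One way to check the point about the bundled versus system Python is to run a small script with the same interpreter that launches smrtpipe.py and look at where its module search path points. The sketch below is only an illustration: the helper name is made up, and the install root /opt/smrtanalysis is an assumption that should be replaced with the actual install location.

```python
import os
import sys


def entries_outside(root, paths):
    """Return the search-path entries that do not live under `root`."""
    root = os.path.abspath(root)
    outside = []
    for entry in paths:
        if not entry:
            continue  # an empty entry means the current directory
        full = os.path.abspath(entry)
        if full != root and not full.startswith(root + os.sep):
            outside.append(entry)
    return outside


if __name__ == "__main__":
    # Assumed install root; adjust to the real SMRTanalysis location.
    install_root = "/opt/smrtanalysis"
    print("interpreter:", sys.executable)
    for entry in entries_outside(install_root, sys.path):
        print("outside install root:", entry)
```

If the interpreter or most of the path entries sit under a system directory such as /usr/local/lib64/python2.6, the system Python is being used instead of the bundled one.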
-
Originally posted by GenoMax:
I think SMRTanalysis comes bundled with its own python. The error above seems to be referring to a system python2.6 directory though, so are you using the system python?
For clarity you should have started a new thread instead of posting under this one.
Thanks for the quick reply. I have created a new thread here:
Single-molecule real-time observation of DNA polymerase using zero-mode waveguide (ZMW) optical confinement nanostructures
and updated the query with more details. Please reply.
Thanks
-
Hello,
I don't understand the error message generated during a de novo assembly using PacBio data.
With the sample data (E. coli and lambda) everything is OK.
Then I ran the command 'smrtpipe.py --params=settings.xml xml:input.xml &>smrtpipe.err' and got the error log below:
[INFO] 2013-02-20 15:57:08,552 [pbpy.smrtpipe.SmrtDataService writeTo 424] Writing 6 items to DataStore in {'smrt.data.xmlparam': <pbpy.io.MetaAnalysisXml.InputDataUrl object at 0x4295a50>, 'smrt.output.log': '/sto4data-2/zebu4/data/06022013_smrtpipe/teste1_gir/log', 'smrt.data.cmdline': <pbpy.smrtpipe.InputData.CompositeInputData object at 0x42959d0>, 'smrt.output.root': '/sto4data-2/zebu4/data/06022013_smrtpipe/teste1_gir', 'smrt.output.results': '/sto4data-2/zebu4/data/06022013_smrtpipe/teste1_gir/results', 'smrt.output.data': '/sto4data-2/zebu4/data/06022013_smrtpipe/teste1_gir/data'}
[INFO] 2013-02-20 15:57:08,555 [pbpy.smrtpipe.SmrtPipeMain _runTasks 267] Skipping PreWorkflow as it contains zero tasks
[INFO] 2013-02-20 15:57:08,558 [pbpy.smrtpipe.SmrtPipeMain _runTasks 270] Loading 10 tasks into Workflow
[INFO] 2013-02-20 15:57:09,275 [pbpy.smrtpipe.SmrtPipeMain _runTasks 279] Executing workflow Workflow
[INFO] 2013-02-20 15:57:09,649 [pbpy.smrtpipe.engine.SmrtPipeTasks run 622] Running task://Anonymous/P_Fetch/toFofn
[ERROR] 2013-02-20 15:57:14,702 [pbpy.smrtpipe.SmrtPipeMain run 648] time data 'Qua Fev 20 15:57:09 CST 2013' does not match format '%a %b %d %H:%M:%S %Z %Y'
Traceback (most recent call last):
  File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/SmrtPipeMain.py", line 608, in run
    self._runTasks(pModules)
  File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/SmrtPipeMain.py", line 281, in _runTasks
    workflow.execute()
  File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtPipeWorkflow.py", line 607, in execute
    self._update(0)
  File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtPipeWorkflow.py", line 574, in _update
    self._writeWorkflow()
  File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtPipeWorkflow.py", line 554, in _writeWorkflow
    self._graph.toFile(path, format)
  File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtDAG.py", line 258, in toFile
    out.write(format2func[format](self))
  File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtDAG.py", line 255, in <lambda>
    'RDF': lambda g: g.toRDF().serialize(),
  File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtDAG.py", line 208, in toRDF
    for s, p, o in node.toRDF():
  File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtDAG.py", line 81, in toRDF
    Literal(str(self.obj.computeTime))))
  File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtPipeTasks.py", line 834, in computeTime
    self._extractComputeTime(regexp)
  File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtPipeTasks.py", line 821, in _extractComputeTime
    self._cachedExecTimes[regexp] = datetime.datetime.strptime(match.group(1), LOG_TIME_FORMAT)
  File "/opt/smrtanalysis-1.4.0/redist/python2.7/lib/python2.7/_strptime.py", line 325, in _strptime
    (data_string, format))
ValueError: time data 'Qua Fev 20 15:57:09 CST 2013' does not match format '%a %b %d %H:%M:%S %Z %Y'
[ERROR] 2013-02-20 15:57:14,704 [pbpy.smrtpipe.SmrtPipeMain exit 760] time data 'Qua Fev 20 15:57:09 CST 2013' does not match format '%a %b %d %H:%M:%S %Z %Y'
I need help =)
-
juassis,
Unfortunately it is a bug caused by the system locale. For a fix, add the following two lines to $SEYMOUR_HOME/etc/setup.sh:
Code:
export LANG=en_US.UTF-8
export LANG=en_US.UTF-8
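The ValueError in the log above can be reproduced outside SMRT Pipe. The sketch below is my own illustration, not pbpy code: it parses one English and one Portuguese timestamp with the same kind of format string. The time zone field (%Z) is left out here because, as far as I know, whether a zone name such as CST parses depends on the machine's own time zone settings.

```python
import locale
from datetime import datetime

# Force English day and month names, which is what the parser expects.
locale.setlocale(locale.LC_TIME, "C")

# Same fields as in the log's format string, minus the time zone (%Z).
LOG_FORMAT = "%a %b %d %H:%M:%S %Y"


def parses(stamp):
    """Return True if the timestamp matches LOG_FORMAT, False otherwise."""
    try:
        datetime.strptime(stamp, LOG_FORMAT)
        return True
    except ValueError:
        return False


english_ok = parses("Wed Feb 20 15:57:09 2013")
portuguese_ok = parses("Qua Fev 20 15:57:09 2013")
print(english_ok, portuguese_ok)
```

The Portuguese names "Qua" and "Fev" do not match the English abbreviations, so the second call fails in the same way as the traceback. Making the shell produce English timestamps, as the exported variable does, avoids the mismatch.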
-
Hello!
Thanks for the information!
It worked properly! =)
Just one more question:
it worked properly for the first analysis; however, when I ran new data from another breed, this error came up again. Will I have to apply the fix every time I want to run an analysis?
-
Hello,
Thank you very much for your help and comments. It was possible to correct several samples.
I have run into problems again. I was previously able to run the smrtpipe.py command without any errors. However, when I tried to run SMRT Pipe again, this error message appeared:
. /opt/smrtanalysis/etc/setup.sh
$ smrtpipe.py --params=gir_params.xml xml:gir_input.xml
Bus error (core dumped)
--
I did the memory test, and everything is ok.
ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 4133745
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Filesystem Size Used Avail Use% Mounted on
/dev/sdg 12T 900G 11T 9% /
Many thanks for your help. =)