yep ... why simply say 'passed' if you can say 'not failed' ... keep things complicated ;-)
SCNR,
Sven
-
CASAVA v.1.8.2 FASTQ files contain only reads that passed filtering (unless you run the analysis with the "--with-failed-reads" option, which then includes reads that would normally be filtered out).
"N" here means the sequence is *not* filtered i.e. it is good quality.
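To make the filter flag discussed above concrete, here is a minimal Python sketch (not part of CASAVA) that splits a CASAVA 1.8-style read header and reports whether the read was filtered. The function name `parse_casava_header` and the dictionary keys are made up for illustration. It assumes the part after the space has four colon-separated fields: read number, filter flag (Y/N), control bits, and index sequence.

```python
def parse_casava_header(header):
    """Parse a CASAVA 1.8-style FASTQ header line into a small dict.

    Assumed layout (illustrative): "@<read id> <read>:<Y|N>:<control>:<index>"
    """
    header = header.strip()
    if not header.startswith("@"):
        raise ValueError("not a FASTQ header line")
    # Everything before the first space is the read id; the rest is the comment.
    read_id, _, comment = header[1:].partition(" ")
    fields = comment.split(":")
    if len(fields) != 4:
        raise ValueError("expected 4 colon-separated fields after the space")
    read_number, filter_flag, control_bits, index = fields
    return {
        "read_id": read_id,
        "read_number": int(read_number),
        # "Y" means the read WAS filtered (bad); "N" means it was not (good).
        "is_filtered": filter_flag == "Y",
        "control_bits": int(control_bits),
        "index": index,  # empty when no index sequence is recorded
    }


record = parse_casava_header("@HWI-ST1072:1440BVUACXX:2:1101:1242:2124 1:N:0:")
print(record["is_filtered"])
```

For the header posted in this thread the flag is "N", so `is_filtered` comes out `False`, i.e. the read passed the filter.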
-
CASAVA .bcl to fastq output
Hi,
We have a control run on a new HiSeq machine that was installed recently. The FASTQ files extracted with CASAVA 1.8.2 have a different format (pasted below):
@HWI-ST1072:1440BVUACXX:2:1101:1242:2124 1:N:0:
CGGTTTTTATTAAACATATAAACAATTCTTACAGATTGACATCGTACGAGC
+
;@@DDD++<CD:2:A<<a@F:333<3AFAC9+1**1:C**11CE0?DGF
The manual says that when sequences are filtered they will have "Y" in the header. However, all of my sequences (100%) have "N". I have run FastQC on these sequences and it shows the quality to be EXCELLENT. I am also attaching a picture of the read quality. The presence of "N" worries me, and I want to know whether this is good sequence or bad! What does "N" actually mean here?
-
Originally posted by kjaja: I have a question related to using Galaxy. I have tried to map one sample to the reference using BWA and it took a few hours to do that!! Is that normal?
Originally posted by kjaja: How do people go about processing many samples, would Galaxy be the tool to use? Can we use command lines or scripts to process data using Galaxy?
If you are comfortable with the command line and have access to local compute infrastructure, then you do not need the public Galaxy server. But if you still want to use Galaxy's easy web interface, consider setting up a local instance of Galaxy (http://wiki.g2.bx.psu.edu/) and using it that way.
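As a rough illustration of the command-line route for many samples, here is a minimal Python sketch. It assumes `bwa` is installed and on the PATH, the reference has already been indexed with `bwa index`, and the reads are single-end. The function names (`bwa_commands`, `align_all`) and the output file naming are hypothetical. `bwa aln` and `bwa samse` both write to standard output, so the script redirects each step to a file.

```python
import subprocess
from pathlib import Path


def bwa_commands(reference, fastq, out_dir, threads=4):
    """Return (argv, output_path) pairs for aligning one single-end FASTQ file."""
    fastq = Path(fastq)
    out_dir = Path(out_dir)
    sample = fastq.name.split(".")[0]  # "sampleA.fastq.gz" -> "sampleA"
    sai = out_dir / f"{sample}.sai"    # intermediate alignment coordinates
    sam = out_dir / f"{sample}.sam"    # final alignments in SAM format
    return [
        (["bwa", "aln", "-t", str(threads), str(reference), str(fastq)], sai),
        (["bwa", "samse", str(reference), str(sai), str(fastq)], sam),
    ]


def align_all(reference, fastq_files, out_dir, threads=4):
    """Align each FASTQ file in turn, stopping at the first failing command."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for fastq in fastq_files:
        for argv, output in bwa_commands(reference, fastq, out_dir, threads):
            # Redirect the command's standard output into the output file.
            with open(output, "wb") as handle:
                subprocess.run(argv, stdout=handle, check=True)
```

A call such as `align_all("ref.fa", ["sampleA.fastq.gz", "sampleB.fastq.gz"], "aligned")` would process the samples one after another. On a cluster you would more likely submit one job per sample instead of looping.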
-
Thank you all, that was helpful.
I have a question related to using Galaxy. I have tried to map one sample to the reference using BWA and it took a few hours to do that!! Is that normal? How do people go about processing many samples, would Galaxy be the tool to use? Can we use command lines or scripts to process data using Galaxy?
-
kjaja,
I doubt you are going to get data in the BCL format. You would need the Illumina pipeline software to process raw data in BCL format, and last I checked this software was not freely available. If you are doing this for only one experiment, you would not want to spend time installing CASAVA (assuming you got your hands on a copy).
The bcl --> fastq conversion step is generally performed by the facility that provides your sequence data. Depending on their policy, you can request that your sequences be aligned to your "reference" genome using ELAND, Illumina's short-sequence alignment tool. The most commonly used aligners are bwa, bowtie, and SOAP (this site has a long list of software for NGS data analysis: http://seqanswers.com/wiki/Software/list).
Galaxy has tutorials available at the links below for RNA-seq analysis:
They also have video tutorials ("live quickies") on the main page of Galaxy (http://main.g2.bx.psu.edu/) to get you started.
-
thanks GenoMax for the input
It looks like I will be getting the raw data, probably in “.bcl” format. Based on reading some papers, I can use CASAVA to convert it into “fastq” format and then use BWA to align against the reference. I have seen other papers use “Maq” or “ELAND”; does anyone know the difference between BWA, Maq and ELAND?
In terms of using an online tool such as Galaxy, I have never used it before. Is there an online tutorial on how to use it?
thanks
-
The output you receive will be fastq files. If the facility you are getting these from uses the new version (v.1.8) of the Illumina pipeline, each sample may have multiple gzip-compressed files that you will need to merge (or analyze in parallel and then merge). The quality values in the fastq files will be in the "sanger" format (http://en.wikipedia.org/wiki/FASTQ_format). The files will be ready for analysis (starting with some QC).
Are you planning to analyze the data using local computing infrastructure or with an online tool like Galaxy?
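For the merging step and the "sanger" quality encoding mentioned above, here is a minimal Python sketch using only the standard library. The function names `merge_gzip_parts` and `sanger_qualities` are made up for illustration. It relies on two facts: gzip files concatenated byte-for-byte form a valid multi-member gzip file, and Sanger-format qualities are Phred scores stored as ASCII characters offset by 33.

```python
import gzip
import shutil


def merge_gzip_parts(parts, merged_path):
    """Concatenate gzip files byte-for-byte into one multi-member gzip file.

    `parts` should already be in the order you want the reads to appear.
    """
    with open(merged_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


def sanger_qualities(quality_line):
    """Decode a Sanger (Phred+33) quality string into integer Phred scores."""
    return [ord(ch) - 33 for ch in quality_line.strip()]


def count_reads(fastq_gz_path):
    """Count reads in a gzipped FASTQ file, assuming 4 lines per record."""
    with gzip.open(fastq_gz_path, "rt") as handle:
        return sum(1 for _ in handle) // 4
```

Comparing `count_reads` on the merged file against the sum over the parts is a cheap sanity check before starting QC. For paired-end data, merge the read 1 and read 2 files separately and in the same part order.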
-
Processing the outputs of Illumina HiSeq
Hi,
We will be using an Illumina HiSeq 2000 to sequence exomes. I have not received the data yet, and I am looking to put together a plan for the analysis steps.
Does anyone know what type of files I will be starting with (the output from the Illumina sequencer)? Would it be in "fastq" format? Is there an outline of how to process the files up to the analysis stage?
thanks