Hi guys,
This forum is great for a beginner, with lots of how-tos for newbies and experts alike. I've seen snippets of what I want to do here and there, but I was hoping to ask questions, engage the community, and then at the end write a how-to/decision-tree post for others to benefit from. If there is no interest, let me know; also, sorry if this is in the wrong place. I'm new to this. Help me, and others, understand the process.
Goal:
Take raw RNA-seq data, understand quality control, trim if needed, discuss the different methods of alignment (or pipeline software), and discuss how the process differs at each step when looking at splice variants, CNVs, exomes, or expression profiling (maybe miRNA). I'm sure I'm missing things too.
I'll organize it all into a nice post later, but let's get to it.
So the idea is to start with raw data (as spit out of the machine) from a public database, and understand the data and what you can do with it. I decided on K562CellTotalFastqRep2_fastqc from ENCODE, as it was suggested to me as a good GAIIX read. Description and Link.
To get the latter tier you need to select:
RNA-extract: Total RNA
View: FastqRd1,2
Platform: Illumina HiSeq 2000
Cell: HAoAF
Rd1,2 stands for read 1 and read 2 for the biological replicates; I chose fastqRd1.
I chose this data set because it's known, it has had some experimental quality control, it has already been analyzed, it's large and unadulterated, and it is in theory total RNA (though it seems there is nothing >200).
Steps:
QC:
1. QC: I ran it through FastQC and saw the attached file.
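For anyone who wants to see roughly what a per-base quality plot summarizes, here is a minimal Python sketch. It is not FastQC's code, just an illustration. It assumes plain 4-line FASTQ records and Phred+33 quality encoding (older Illumina pipelines used Phred+64, so check which applies to your file). The function names `read_fastq` and `mean_quality_per_position` are made up for this example.

```python
import io
from collections import defaultdict


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from an open text handle.

    Assumes strict 4-line records (no wrapped sequence lines).
    """
    while True:
        header = handle.readline().rstrip()
        if not header:
            break  # end of file
        seq = handle.readline().rstrip()
        handle.readline()  # the '+' separator line, ignored
        qual = handle.readline().rstrip()
        yield header, seq, qual


def mean_quality_per_position(records, offset=33):
    """Return the mean Phred score at each read position.

    offset=33 is an assumption (Phred+33); use 64 for Phred+64 data.
    """
    sums = defaultdict(int)
    counts = defaultdict(int)
    for _, _, qual in records:
        for i, ch in enumerate(qual):
            sums[i] += ord(ch) - offset
            counts[i] += 1
    return [sums[i] / counts[i] for i in sorted(counts)]


# Tiny in-memory demo with two made-up reads instead of a real file.
demo = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nAC\n+\n!!\n")
print(mean_quality_per_position(read_fastq(demo)))
```

For a real file you would pass an open handle instead, e.g. `gzip.open(path, "rt")` for a gzipped FASTQ. A steady drop in the means toward the 3' end is the usual signal that trimming might help.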
Trimming/Alignment:
1. TopHat/Bowtie/Cufflinks, or using Python/R?
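On the Python side, the simplest form of trimming can be sketched in a few lines. This is a plain "cut low-quality bases off the 3' end" approach, not the algorithm of any particular trimming tool, and the threshold of 20, the minimum length of 25, and the Phred+33 offset are all assumptions chosen for illustration. `trim_3prime` and `keep_read` are hypothetical names.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Remove trailing bases whose Phred score is below min_q.

    Stops at the first base (scanning from the 3' end) that passes,
    so low-quality bases in the middle of the read are left alone.
    """
    end = len(qual)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def keep_read(seq, min_len=25):
    """Decide whether a trimmed read is still long enough to align."""
    return len(seq) >= min_len


# Made-up read: four good bases ('I' = Q40) then two poor ones ('#' = Q2).
trimmed_seq, trimmed_qual = trim_3prime("ACGTAC", "IIII##")
print(trimmed_seq, trimmed_qual, keep_read(trimmed_seq, min_len=4))
```

Dedicated trimmers also handle adapters and paired-end bookkeeping, which this sketch ignores, so treat it as a way to understand the step rather than a replacement for them.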
Expression Profile: (Looking for Counts)
What tools and pathway?
Building Contigs/Exome: (looking for mRNA)
What tools and pathway?
Spliceosomes:
What tools and pathway?
At the end, compare with other samples.
I've left many answers open (because of time, and because I'm also somewhat of a newbie). This is more of an exercise, so let's start with alignment and understanding what you have. I'll edit this as we go. Help me make this more useful and organized, too.