Hi
I know people have posted about this before, but amazingly I did not find an answer to the particular thing I am confused about, which is upstream and downstream as opposed to forward and reverse.
I have 125 bp PE strand-specific reads (Illumina TruSeq Stranded kit, so dUTP protocol), assembled with Trinity using the RF library type (--SS_lib_type RF), assigning the first read of each pair (i.e. R1) as the left and the second read (R2) as the right, because that is what the instructions say to do.
Anyways, if I understand it correctly, the R2 reads are the forward (sense) reads and the R1 reads are the reverse (hence RF, reverse-forward).
In TopHat one should choose --library-type fr-firststrand, because only the first synthesised cDNA strand is sequenced: dUTP is incorporated into the second strand, so that strand is not amplified during PCR, right? So far, so good.
In Bowtie (and hence DETONATE's RSEM-EVAL, http://deweylab.biostat.wisc.edu/detonate/, used to evaluate assemblies), however, the talk is instead of upstream and downstream. Specifically:
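To make my understanding of the read orientations concrete, here is a toy Python sketch of what I think a dUTP library does to one fragment. The sequence, the read length and the function names are made up for illustration; it assumes the fragment is given in transcript (sense) orientation and ignores adapters and sequencing errors.

```python
def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def dutp_pair(fragment, read_len):
    """Toy model of a dUTP (RF) read pair from a sense-oriented fragment.

    R1 is read from the first-strand cDNA, so it is antisense and starts
    at the 3' end of the fragment; R2 is sense and starts at the 5' end.
    """
    r1 = revcomp(fragment)[:read_len]
    r2 = fragment[:read_len]
    return r1, r2


# Made-up fragment in transcript (sense) orientation.
fragment = "ATGGCCATTGTAATGGGCCGC"
r1, r2 = dutp_pair(fragment, 6)

print("R1:", r1)  # antisense read from the downstream end
print("R2:", r2)  # sense read from the upstream end
```

If this picture is right, then on the transcript R2 is the forward-oriented mate at the upstream end and R1 is the reverse-complemented mate at the downstream end.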
"--fr/--rf/--ff
The upstream/downstream mate orientations for a valid paired-end alignment against the forward reference strand. E.g., if --fr is specified and there is a candidate paired-end alignment where mate1 appears upstream of the reverse complement of mate2 and the insert length constraints are met, that alignment is valid. Also, if mate2 appears upstream of the reverse complement of mate1 and all other constraints are met, that too is valid. --rf likewise requires that an upstream mate1 be reverse-complemented and a downstream mate2 be forward-oriented. --ff requires both an upstream mate1 and a downstream mate2 to be forward-oriented. Default: --fr when -C (colorspace alignment) is not specified, --ff when -C is specified."
In detonate one needs to supply the two groups of reads as --upstream_reads --downstream_reads in the command, and specify either
--strand-specific
The RNA-Seq protocol used to generate the reads is strand specific,
i.e., all (upstream) reads are derived from the forward strand.
This option is equivalent to --forward-prob=1.0. With this option
set, if RSEM-EVAL runs the Bowtie/Bowtie 2 aligner, the ’--norc’
Bowtie/Bowtie 2 option will be used, which disables alignment to
the reverse strand of transcripts. (Default: off)
or
--forward-prob
Probability of generating a read from the forward strand of a
transcript. Set to 1 for a strand-specific protocol where all
(upstream) reads are derived from the forward strand, 0 for a
strand-specific protocol where all (upstream) read are derived from
the reverse strand, or 0.5 for a non-strand-specific protocol.
(Default: 0.5)
So with my reads, if I set the upstream reads to be R1 and downstream to be R2, I would have to set --forward-prob to 0, because my R1 reads (upstream) are the reverse, not the forward. Right?
But could I also set the upstream reads to R2, and downstream to R1, and specify --strand-specific (equal to --forward-prob 1)? Would that matter?
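Written out as a small Python sketch, the reasoning behind my two options looks like this. The helper and its names are hypothetical, not part of DETONATE or RSEM; it only restates my assumption that --forward-prob describes the strand of whichever mate is passed as the upstream reads.

```python
# Hypothetical helper, not part of DETONATE/RSEM: it only encodes the
# assumption that --forward-prob refers to the strand of the upstream mate.

# Which mate is in sense orientation for each Trinity-style library type.
SENSE_MATE = {"RF": "R2", "FR": "R1"}


def forward_prob(lib_type, upstream_mate):
    """Return the --forward-prob value implied by passing `upstream_mate`
    (either "R1" or "R2") as the upstream reads."""
    if upstream_mate not in ("R1", "R2"):
        raise ValueError("upstream_mate must be 'R1' or 'R2'")
    sense_mate = SENSE_MATE[lib_type]
    return 1.0 if upstream_mate == sense_mate else 0.0


# Option 1: R1 upstream, R2 downstream.
print(forward_prob("RF", "R1"))
# Option 2: R2 upstream, R1 downstream (what --strand-specific would imply).
print(forward_prob("RF", "R2"))
```

By this logic both options describe the same library; what I can't tell is whether the alignment step treats them the same.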
Cheers