  • stranded RNAseq parameters

    Hi

    I know people have posted about this before, but amazingly I did not find an answer to the particular thing I am confused about, which is upstream and downstream as opposed to forward and reverse...

    I have 125 bp PE strand-specific reads (Illumina TruSeq Stranded kit, so dUTP protocol), assembled with Trinity using the RF library type (--SS_lib_type RF), assigning the first read of each pair (R1) as the left and the second read (R2) as the right, because that is what the instructions say to do.

    Anyways, if I understand it correctly, the R2 reads are the forward (and sense) reads, and R1 is the reverse (hence RF - reverse-forward).

    In TopHat one should choose --library-type fr-firststrand, because only the first synthesised cDNA strand is sequenced: dUTP is incorporated into the second strand, so that strand is not amplified during PCR, right? So far, so good.

    In Bowtie (and hence in DETONATE's RSEM-EVAL, http://deweylab.biostat.wisc.edu/detonate/, used to evaluate assemblies), however, the options talk about upstream and downstream instead. Specifically,

    "--fr/--rf/--ff
    The upstream/downstream mate orientations for a valid paired-end alignment against the forward reference strand. E.g., if --fr is specified and there is a candidate paired-end alignment where mate1 appears upstream of the reverse complement of mate2 and the insert length constraints are met, that alignment is valid. Also, if mate2 appears upstream of the reverse complement of mate1 and all other constraints are met, that too is valid. --rf likewise requires that an upstream mate1 be reverse-complemented and a downstream mate2 be forward-oriented. --ff requires both an upstream mate1 and a downstream mate2 to be forward-oriented. Default: --fr when -C (colorspace alignment) is not specified, --ff when -C is specified."

    In detonate one needs to supply the two groups of reads as --upstream_reads --downstream_reads in the command, and specify either
    --strand-specific
    The RNA-Seq protocol used to generate the reads is strand specific,
    i.e., all (upstream) reads are derived from the forward strand.
    This option is equivalent to --forward-prob=1.0. With this option
    set, if RSEM-EVAL runs the Bowtie/Bowtie 2 aligner, the ’--norc’
    Bowtie/Bowtie 2 option will be used, which disables alignment to
    the reverse strand of transcripts. (Default: off)
    or

    --forward-prob
    Probability of generating a read from the forward strand of a
    transcript. Set to 1 for a strand-specific protocol where all
    (upstream) reads are derived from the forward strand, 0 for a
    strand-specific protocol where all (upstream) read are derived from
    the reverse strand, or 0.5 for a non-strand-specific protocol.
    (Default: 0.5)

    So with my reads, if I set the upstream reads to be R1 and downstream to be R2, I would have to set --forward-prob to 0, because my R1 reads (upstream) are the reverse, not the forward. Right?

    But could I also set the upstream reads to R2, and downstream to R1, and specify --strand-specific (equal to --forward-prob 1)? Would that matter?

    Cheers
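
The two questions above can be written down as a small lookup. This is only a sketch that restates the quoted --forward-prob definition, assuming (as described in the post) that in a dUTP library R2 is the sense read; it has not been checked against RSEM-EVAL itself, and the function name forward_prob is hypothetical.

```python
def forward_prob(sense_mate: str, upstream_mate: str) -> float:
    """Return the --forward-prob value implied by the quoted definition.

    sense_mate:    the mate that reads the transcript's forward (sense) strand
                   (assumed to be "R2" for a dUTP / TruSeq Stranded library).
    upstream_mate: the mate file passed as the upstream reads.
    """
    valid = {"R1", "R2"}
    if sense_mate not in valid or upstream_mate not in valid:
        raise ValueError("mates must be 'R1' or 'R2'")
    # Quoted docs: 1 if all upstream reads come from the forward strand,
    # 0 if all upstream reads come from the reverse strand.
    return 1.0 if upstream_mate == sense_mate else 0.0


if __name__ == "__main__":
    # dUTP library: R2 is assumed to be the sense read.
    for upstream in ("R1", "R2"):
        print(f"upstream={upstream}: --forward-prob {forward_prob('R2', upstream)}")
```

Under these assumptions, R1 as upstream gives --forward-prob 0, and R2 as upstream gives --forward-prob 1 (the same as --strand-specific). Going by the quoted option text, the two setups describe the same library, but whether the aligner step treats them identically is worth confirming on a small subset of reads.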
