  • Oases Question

    I am analyzing strand-specific RNA-seq data from Schizosaccharomyces pombe using Oases v0.2.06.

    I used ABySS with k-mer values of 34, 36, 38, ..., 64. When I try to run Oases with a value greater than 31, I get this error message:

    [0.000000] Velvet can't handle k-mers as long as 34! We'll stick to 31 if you don't mind.

    Why is this? How do I get it to work with higher k-mer values?

  • #2
    I found the solution: set MAXKMERLENGTH when you compile Velvet and Oases, for example

    Code:
    make 'MAXKMERLENGTH=92'
    or whatever maximum k-mer length you need. Good luck!



    • #3
      One other thing that you should be aware of is that Oases probably can't handle k-mers of even length. The reason is that Velvet's representation can't handle palindromes (i.e. k-mers equal to their own reverse complement), and these can only occur when k is even. Simply mandating that k must be odd avoids the problem completely.
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
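
To illustrate the palindrome point in #3: for odd k the middle base would have to be its own complement, so a k-mer can equal its own reverse complement only when k is even. The following is a minimal Python sketch of that check, plus a helper that lists the odd k values in a range. The function names are illustrative assumptions and are not part of Velvet or Oases.

```python
# Illustrative helpers; these names are not part of Velvet or Oases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of a DNA k-mer (uppercase ACGT only)."""
    return kmer.translate(COMPLEMENT)[::-1]


def is_rc_palindrome(kmer):
    """True if the k-mer equals its own reverse complement."""
    return kmer == reverse_complement(kmer)


def odd_k_values(start, stop):
    """Return the odd k values in the inclusive range [start, stop]."""
    first = start if start % 2 == 1 else start + 1
    return list(range(first, stop + 1, 2))


if __name__ == "__main__":
    print(is_rc_palindrome("ACGT"))  # even length, equals its reverse complement
    print(is_rc_palindrome("ACG"))   # odd length, cannot be palindromic
    print(odd_k_values(34, 64))      # odd alternatives to 34, 36, ..., 64
```

With the range from the original question, `odd_k_values(34, 64)` yields 35, 37, ..., 63, which could be used in place of the even values.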



      • #4
        Oh, I see. I do have another question: how do you get the insert length and expected coverage for paired-end reads? I have a bunch of FASTQ files that I downloaded from NCBI.



        • #5
          Neither Trinity nor Oases requires either number.

          In DNA-seq, 10x coverage means that the amount of read data is 10 times the estimated amount of genomic data. (Sometimes mitochondria or chloroplasts end up amplified more or less than chromosomes, but we'll ignore that complication for the moment.)
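
As a rough sketch of the DNA-seq definition above, nominal coverage is just total sequenced bases divided by the estimated genome size. The function name and the numbers below are hypothetical and chosen only to make the arithmetic visible; for paired-end data, both mates would be counted as reads.

```python
def nominal_coverage(num_reads, read_length, genome_size):
    """Coverage = total sequenced bases / estimated genome size (in bases)."""
    return num_reads * read_length / genome_size


if __name__ == "__main__":
    # Hypothetical numbers: 1 million reads of 100 bp against a 10 Mb genome.
    print(nominal_coverage(num_reads=1_000_000, read_length=100,
                           genome_size=10_000_000))
```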

          For RNA-seq, there is no such thing as "expected coverage", because the amount of "expected" data depends on the degree of expression of each transcript. For DNA-seq, you can usually tell the coverage by looking at the k-mer count histogram. (Dedicated k-mer counters like Jellyfish or Meryl do this directly, but you can also get the information from the graph build pass of pretty much any de Bruijn graph-based assembler, such as velvetg or Gossamer graph-build.) There is a distinct visible "hump" on the histogram, and its position is proportional to the nominal coverage. If you do the same for RNA-seq data, there is invariably no "hump".
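
The histogram idea can be sketched in a few lines of Python. This is a toy illustration under simplifying assumptions, not how Jellyfish, Meryl, or velvetg work internally: reads are held in memory as plain strings, a k-mer is not merged with its reverse complement, and the "hump" is taken to be the most common multiplicity at or above a hypothetical `min_multiplicity` cutoff that skips the low-count error tail. All function names are made up for the example.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurring in the reads (no reverse-complement merging)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def multiplicity_histogram(counts):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return Counter(counts.values())


def hump_position(histogram, min_multiplicity=2):
    """Most common multiplicity at or above the cutoff, or None if there is none."""
    candidates = {m: n for m, n in histogram.items() if m >= min_multiplicity}
    if not candidates:
        return None
    # Prefer the multiplicity shared by the most k-mers; break ties upward.
    return max(candidates, key=lambda m: (candidates[m], m))


if __name__ == "__main__":
    # Toy data: one sequence "read" five times, plus a single noisy read.
    reads = ["ACGTAC"] * 5 + ["TTTTGG"]
    hist = multiplicity_histogram(kmer_counts(reads, k=4))
    print(sorted(hist.items()))
    print(hump_position(hist))
```

On real DNA-seq data the peak sits at the k-mer coverage rather than the base coverage; the Velvet manual relates the two as Ck = C × (L − k + 1) / L for read length L. On RNA-seq data, as the post says, no single peak should be expected.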

          (Incidentally, this is one of the reasons why RNA-seq is significantly harder than DNA-seq. Noise is indistinguishable from transcripts with low expression, and PCR bias is indistinguishable from transcripts with higher expression.)

          As for insert length, if it's not documented, you probably can't tell without a reference genome. I'm not a wet lab person, but do I recall correctly that some older RNA-seq protocols don't have a size selection step at all? But thankfully, you probably don't need to know it.
          sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});



          • #6
            Hi all,
            Oases is showing an error: make: *** [velvet] Error 2
            Please help me fix this. Thank you.
