  • de-novo differential transcription/splicing question

    Hi folks,
    I am relatively new to this kind of analysis. I am interested in discovering novel splice variants and differentially transcribed genes. I have several pairs of samples (pre- and post-relapse CLL for 3 individuals). I am using STAR to map paired-end reads to hg19 without a GTF file, and cufflinks/cuffcompare/cuffdiff to look for differential events. Is there an easy way to remove known transcript variants from the cuffdiff output so as to focus on the novel stuff? Or should I be using STAR with an annotated index and a GTF file in the mapping step? For example, to create the index for STAR:

    STAR --runMode genomeGenerate --genomeDir /data/Genomes/UCSC/hg19/STAR_ANNOTATED --genomeFastaFiles ../blabla.fa --sjdbFileChrStartEnd hg19_intron_loci.txt --sjdbOverhang 75

    then to run it on a pair of reads:

    STAR --genomeDir /data/Genomes/UCSC/hg19/STAR_ANNOTATED --sjdbGTFfile /data/Genomes/UCSC/hg19/knownGene_standardchromonly.gtf --readFilesIn sample1_1.clipped.fastq sample1_2.clipped.fastq --runThreadN 32

    Or is there a better way? BTW, I can already do the general differential expression analysis by mapping to the transcriptome (i.e., using BWA-eXpress and edgeR, or the inverse beta binomial for the statistical test).

    I'm also running DiffSplice on the STAR output, which is kind of interesting. I annotate the resulting regions with ANNOVAR.

    cheers,
    karl_s

  • #2
    OK, I think I figured out part of this. It seems that cuffcompare is run with the -r flag to compare against a reference annotation, and its output carries class codes that indicate how each assembled transcript relates to the known set. Providing annotation in the mapping step probably just helps the mapper's accuracy at known splice junctions.

    Anyhow, comments on this sort of analysis from more experienced folks would be welcome. Generally, the question I am trying to answer is: given two biological conditions, am I seeing anything new?
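
For anyone wanting to act on the class codes mentioned above, here is a minimal sketch (not a tested pipeline) of pulling the "novel" transfrags out of a cuffcompare `.tracking` file and using their IDs to subset a cuffdiff table. It assumes the tracking file is tab-separated with the transfrag ID in column 1 and the class code in column 4, and that the cuffdiff table has a `test_id` header column holding those same IDs. Which class codes count as novel is a judgment call: `j` (shares a splice junction with a reference transcript) and `u` (unknown, intergenic) are used here as an assumption. The function names and the toy rows are hypothetical.

```python
import csv

# Assumption: which cuffcompare class codes to treat as "novel".
# "=" is an exact intron-chain match to a reference transcript,
# "j" is a potentially novel isoform, "u" is unknown/intergenic.
NOVEL_CODES = {"j", "u"}


def novel_transfrag_ids(tracking_lines, novel_codes=NOVEL_CODES):
    """Return the IDs (column 1) of rows whose class code (column 4) is novel."""
    ids = set()
    for line in tracking_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 4 and fields[3] in novel_codes:
            ids.add(fields[0])
    return ids


def filter_diff_rows(diff_lines, keep_ids):
    """Keep rows of a tab-separated table whose test_id is in keep_ids."""
    reader = csv.DictReader(diff_lines, delimiter="\t")
    return [row for row in reader if row["test_id"] in keep_ids]


if __name__ == "__main__":
    # Toy rows for illustration only; real files have more columns.
    tracking = [
        "TCONS_00000001\tXLOC_000001\tGENE_A|ref_tx_1\t=\tq1:toy",
        "TCONS_00000002\tXLOC_000001\tGENE_A|ref_tx_1\tj\tq1:toy",
        "TCONS_00000003\tXLOC_000002\t-\tu\tq1:toy",
    ]
    diff = [
        "test_id\tgene_id\tsignificant",
        "TCONS_00000001\tXLOC_000001\tyes",
        "TCONS_00000002\tXLOC_000001\tyes",
        "TCONS_00000003\tXLOC_000002\tno",
    ]
    novel = novel_transfrag_ids(tracking)
    for row in filter_diff_rows(diff, novel):
        print(row["test_id"], row["significant"])
```

With real data you would pass open file handles instead of the toy lists, and the IDs only line up if cuffdiff was run on the GTF that cuffcompare produced.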
