
  • swaraj
    replied
    Refer to my earlier post on getting FASTA from a Cufflinks GTF.

  • oxydeepu
    replied
    Hi, I saw the thread.
    Can I get the logic for the program that creates transcripts from the genome file?
    How does it differ based on orientation? I mean reads with positive and negative orientation.
    Thank you.
    Deepak
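On the orientation question, a small sketch may help. It is not from the thread, and the function names are made up for illustration. Annotation coordinates are always given on the forward strand of the reference, so for a minus-strand transcript the exons are cut out in genomic order, joined, and the result is reverse-complemented. The second function shows the equivalent exon-by-exon view: reverse the exon order and reverse-complement each piece.

```python
# Hypothetical helper names; coordinates are 1-based and inclusive, as in GTF.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def spliced(genome_seq, exons, strand):
    """Join exons in genomic order, then flip the whole thing for '-'."""
    pieces = [genome_seq[start - 1:end] for start, end in sorted(exons)]
    seq = "".join(pieces)
    return reverse_complement(seq) if strand == "-" else seq


def spliced_exonwise(genome_seq, exons, strand):
    """Same result, built exon by exon in transcript (5' to 3') order."""
    pieces = [genome_seq[start - 1:end] for start, end in sorted(exons)]
    if strand == "-":
        # The last genomic exon is the first exon of a minus-strand transcript.
        pieces = [reverse_complement(p) for p in reversed(pieces)]
    return "".join(pieces)


toy_genome = "TTGACCATGGCA"
toy_exons = [(1, 4), (9, 12)]
print(spliced(toy_genome, toy_exons, "+"))  # TTGAGGCA
print(spliced(toy_genome, toy_exons, "-"))  # TGCCTCAA
```

Both functions give the same sequence for either strand; the first form is usually simpler to write.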


  • Annibal
    replied
    I've taken a look at the GTF specs.
    Yes, it is.
    Thanks


  • Simon Anders
    replied
    You use the GTF file to produce the cDNA FASTA file from the reference FASTA file. This is a simple exercise in script programming.
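The exercise described above could look roughly like the following Python sketch. It is not taken from the thread: the function names are made up for illustration, and it assumes a GTF with `exon` records carrying a `transcript_id` attribute (as in Cufflinks and Ensembl GTF files), chromosome names that match the FASTA headers, and a reference small enough to hold in memory. A strand of `.` (unknown) is treated like `+` here.

```python
import re
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Parse FASTA lines into {sequence_name: sequence}."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            name = line[1:].split()[0]  # ID up to the first whitespace
            chunks = []
        elif line:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def read_exons(lines):
    """Collect (chrom, start, end, strand) exon records per transcript_id."""
    exons = defaultdict(list)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            # GTF coordinates are 1-based and inclusive.
            exons[match.group(1)].append(
                (fields[0], int(fields[3]), int(fields[4]), fields[6]))
    return exons


def transcript_sequences(genome, exons):
    """Yield (transcript_id, spliced sequence in transcript orientation)."""
    for tid, records in exons.items():
        records = sorted(records, key=lambda r: r[1])  # genomic order
        chrom, strand = records[0][0], records[0][3]
        seq = "".join(genome[chrom][start - 1:end]
                      for _, start, end, _ in records)
        if strand == "-":
            seq = seq.translate(COMPLEMENT)[::-1]  # reverse complement
        yield tid, seq


def format_fasta(tid, seq, width=60):
    """Format one record as FASTA text with wrapped sequence lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + tid + "\n" + "\n".join(lines) + "\n"
```

On real data, one would pass open file handles to `read_fasta` and `read_exons` (for example a reference `genome.fa` and a Cufflinks `transcripts.gtf`, both placeholder names) and write `format_fasta(tid, seq)` for each transcript to the output file.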


  • Annibal
    replied
    Originally posted by Simon Anders
    What you want to do is called reference-based (as opposed to de novo) transcript assembly. A tool commonly used for this purpose is Cufflinks:

    Roberts, Pimentel, Trapnell, and Pachter:
    Identification of novel transcripts in annotated genomes using RNA-Seq
    Bioinformatics (2011) 27 (17): 2325-2329.
    doi:10.1093/bioinformatics/btr355

    However, before doing this yourself, you may want to check whether the ENCODE people have already done this analysis. It seems obvious that they would do this.

    I still wonder what you would need a database of all transcripts for. Instead of blasting against it, you can always blast against the genome.

    Thank you.
    I've taken a look at Cufflinks (just the manual), but I did not find a FASTA file of the transcripts among its outputs. Cufflinks instead describes a GTF file as its output, and that file does not contain the FASTA sequences of the transcripts. I'll take a closer look at the program.
    I also read this interesting article just yesterday:
    "Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks" (Nature Protocols)

    If I blast against the genome, I lose information that is in the RNA sequence but not in the genome (e.g. sequences from transposable elements that are not integrated in the genome...)

    Thanks again.


  • Simon Anders
    replied
    What you want to do is called reference-based (as opposed to de novo) transcript assembly. A tool commonly used for this purpose is Cufflinks:

    Roberts, Pimentel, Trapnell, and Pachter:
    Identification of novel transcripts in annotated genomes using RNA-Seq
    Bioinformatics (2011) 27 (17): 2325-2329.
    doi:10.1093/bioinformatics/btr355

    However, before doing this yourself, you may want to check whether the ENCODE people have already done this analysis. It seems obvious that they would do this.

    I still wonder what you would need a database of all transcripts for. Instead of blasting against it, you can always blast against the genome.


  • Annibal
    replied
    Originally posted by Simon Anders
    Which BAM files are you talking about? ENCODE has many.

    Why do you want to make your BLAST data base from RNA-Seq reads rather than simply from, say, the cDNA FASTA file from Ensembl?

    I'm talking about the BAM file of the human total RNA extract from CSHL Long RNA-seq.

    I don't use Ensembl data because the cDNA FASTA from Ensembl does not contain all the transcripts (I guess), only "known, novel and pseudogenes", as stated on their website.

    Moreover, I will probably repeat this task using RNA-seq data from cells under particular conditions.

    Thanks a lot.

    Davide


  • Simon Anders
    replied
    Which BAM files are you talking about? ENCODE has many.

    Why do you want to make your BLAST data base from RNA-Seq reads rather than simply from, say, the cDNA FASTA file from Ensembl?


  • Annibal
    replied
    I thought this task would be easy, or at least possible, since I have the reads aligned to the reference genome (Homo sapiens).
    Can anyone help?
    Thanks


  • Annibal
    started a topic How to obtain full length RNA transcript sequence

    How to obtain full length RNA transcript sequence

    Hi everyone,
    I'm new to this kind of task, so please be patient!
    I'm trying to create a BLAST DB using the RNA-seq data from ENCODE.
    I've downloaded both the FASTQ reads and the .bam/.bai files.
    I need the FASTA sequences of all the full-length transcripts: is it possible to extract/obtain them from the BAM file?
    Alternatively, should I try to do a de novo assembly using Trinity?
    Thanks a lot.
    Regards,

    Davide
