Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • deepika123
    replied
    thanks dpryan.. now tophat run successfully

    Leave a comment:


  • dpryan
    replied
    Well, you already have the gff file, so just truncate that.

    Leave a comment:


  • deepika123
    replied
    Thanks dpryan...
    Ok.. i am trying. Can you send me the link from where i downloaded gtf files of plant because i am trying to download from ensemble but i am unable to download due DAS and i am so good in programming .
    i can be possible that error is due to annotation file if you send me the link, m thankful to you.

    Leave a comment:


  • dpryan
    replied
    gtf_to_fasta sometimes fails on non-standard chromosome/contig names (see issue #38). Alternatively, this might also be caused by the annotation file describing a feature that goes beyond the bounds of one of the chromosomes/contigs. Short of going through the code, I can't really say which. You might just try to make a truncated annotation file and see if that works. If so, just expand it until you hit the error again. Then you'll know where the problem is.

    Leave a comment:


  • deepika123
    replied
    yes i saw run log and run the command:

    /usr/local/bin/gtf_to_fasta --min-anchor 8 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir KC_tophat_new/ --max-multihits 20 --max-seg-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --max-insertion-length 3 --max-deletion-length 3 -z gzip -p5 --inner-dist-mean 50 --inner-dist-std-dev 20 --gtf-annotations known/Oryza.gff --gtf-juncs KC_tophat_new/tmp/Oryza.juncs --no-closure-search --no-coverage-search --no-microexon-search known/Oryza.gff rice_data/Oryza.fa known/Oryza.fa > KC_tophat_new/logs/g2f.out

    and the error is:
    terminate called after throwing an instance of 'std:ut_of_range'
    what(): basic_string::substr
    Aborted

    how to solve this

    Leave a comment:


  • dpryan
    replied
    If you look in the run log, you'll find the actual gtf_to_fasta command that was run. Just run that yourself to get a more informative error message.

    Leave a comment:


  • deepika123
    replied
    hello...

    i ran tophat with GTF file of rice but i got error and i do not know how to solve it please anyone can suggest me that how can i solve this problem.

    /usr/local/bin/tophat -G rice_data/Oryza.gtf -p 5 -o KC_tophat_new --transcriptome-index=known rice_data/Oryza rice_data/KC-24Hr_R1_filter.fastq rice_data/KC-24Hr_R2_filter.fastq

    and error is:

    [2013-10-02 12:36:15] Beginning TopHat run (v2.0.9)
    -----------------------------------------------
    [2013-10-02 12:36:15] Checking for Bowtie
    Bowtie version: 2.1.0.0
    [2013-10-02 12:36:15] Checking for Samtools
    Samtools version: 0.1.19.0
    [2013-10-02 12:36:19] Checking for Bowtie index files (genome)..
    [2013-10-02 12:36:19] Checking for reference FASTA file
    [2013-10-02 12:36:19] Generating SAM header for rice_data/Oryza
    format: fastq
    quality scale: phred33 (default)
    [2013-10-02 12:36:19] Reading known junctions from GTF file
    [2013-10-02 12:36:27] Preparing reads
    left reads: min. length=56, max. length=83, 37951504 kept reads (113 discarded)
    right reads: min. length=57, max. length=83, 37949924 kept reads (1693 discarded)
    [2013-10-02 13:09:09] Building transcriptome data files..
    [FAILED]
    Error: gtf_to_fasta returned an error.

    thanks in advance

    Leave a comment:


  • mikep
    replied
    OK, so you've listed a few problems, let's start with the one posted originally (cuffdiff)

    I am not familiar with the rice genome or what format it is in or what it's chromosomes are called. However, the GTF file posted above lists the chromsome name as "Un", which I doubt is an actual chromosome. Not sure where you got your GTF from , but the first column in the GTF must match exactly the names of the chromosomes in the fa file.

    Secondly, tophat2, unlike tophat1, builds a bowtie index for the transcriptome, you should read the manual about this, and use --transcriptome-index. It looks as though the index couldn't be built, might be a read permission error, or a path error.

    Thirdly, your tophat commandlines don't look like paired end inputs, since there's only one fq file being mentioned.

    Leave a comment:


  • deepika123
    replied
    Thanks for your reply.....
    Actually i have paired end illumina sequencing data, one is control and other one is experimental and mapped on rice genome using tophat and the command is:

    tophat -G rice_data/Oryza.gtf -p 5 -o K1_tophat_new rice_data/Oryza rice_data/K1_R1_filter.fastq rice_data/K1_R2_filter.fastq

    tophat -G rice_data/Oryza.gtf -p 5 -o K2_tophat_new rice_data/Oryza rice_data/K2_R1_filter.fastq rice_data/K2_R2_filter.fastq

    after running this, i got the error:[2013-09-30 18:54:34] Beginning TopHat run (v2.0.9)
    -----------------------------------------------
    [2013-09-30 18:54:34] Checking for Bowtie
    Bowtie version: 2.1.0.0
    [2013-09-30 18:54:34] Checking for Samtools
    Samtools version: 0.1.19.0
    [2013-09-30 18:54:34] Checking for Bowtie index files (genome)..
    [2013-09-30 18:54:34] Checking for reference FASTA file
    [2013-09-30 18:54:34] Generating SAM header for rice_data/Oryza
    format: fastq
    quality scale: phred33 (default)
    [2013-09-30 18:54:35] Reading known junctions from GTF file
    [2013-09-30 18:54:44] Preparing reads
    left reads: min. length=57, max. length=83, 33510885 kept reads (8 discarded)
    right reads: min. length=57, max. length=83, 33509751 kept reads (1142 discarded)
    [2013-09-30 19:30:03] Building transcriptome data files..
    [2013-09-30 19:30:23] Building Bowtie index from Oryza.fa
    [FAILED]
    Error: Couldn't build bowtie index with err = 1


    But when i ran it without GTF file then it run successfully. Can anyone tell me that where i am wrong?

    and the format of gtf file of rice is:

    Un protein_coding exon 1489 1644 . - . gene_id "13113.t00002"; transcript_id "13113.m00129"; exon_number "1"; transcript_name "13113.m00129"; seqedit "false";
    Un protein_coding CDS 1489 1644 . - 0 gene_id "13113.t00002"; transcript_id "13113.m00129"; exon_number "1"; transcript_name "13113.m00129"; protein_id "13113.m00129";
    Un protein_coding start_codon 1642 1644 . - 0 gene_id "13113.t00002"; transcript_id "13113.m00129"; exon_number "1"; transcript_name "13113.m00129";
    Un protein_coding exon 1193 1357 . - . gene_id "13113.t00002"; transcript_id "13113.m00129"; exon_number "2"; transcript_name "13113.m00129"; seqedit "false";
    Un protein_coding CDS 1196 1357 . - 0 gene_id "13113.t00002"; transcript_id "13113.m00129"; exon_number "2"; transcript_name "13113.m00129"; protein_id "13113.m00129";
    Un protein_coding stop_codon 1193 1195 . - 0 gene_id "13113.t00002"; transcript_id "13113.m00129"; exon_number "2"; transcript_name "13113.m00129";
    Un protein_coding exon 4230 4368 . - . gene_id "13113.t00004"; transcript_id "13113.m00131"; exon_number "1"; transcript_name "13113.m00131"; seqedit "false";
    Un protein_coding CDS 4230 4368 . - 0 gene_id "13113.t00004"; transcript_id "13113.m00131"; exon_number "1"; transcript_name "13113.m00131"; protein_id "13113.m00131";
    Un protein_coding start_codon 4366 4368 . - 0 gene_id "13113.t00004"; transcript_id "13113.m00131"; exon_number "1"; transcript_name "13113.m00131";
    Un protein_coding exon 3151 3521 . - . gene_id "13113.t00004"; transcript_id "13113.m00131"; exon_number "2"; transcript_name "13113.m00131"; seqedit "false";
    Un protein_coding CDS 3151 3521 . - 2 gene_id "13113.t00004"; transcript_id "13113.m00131"; exon_number "2"; transcript_name "13113.m00131"; protein_id "13113.m00131";
    Un protein_coding exon 2986 3001 . - . gene_id "13113.t00004"; transcript_id "13113.m00131"; exon_number "3"; transcript_name "13113.m00131"; seqedit "false";
    Un protein_coding CDS 2986 3001 . - 0 gene_id "13113.t00004"; transcript_id "13113.m00131"; exon_number "3"; transcript_name "13113.m00131"; protein_id "13113.m00131";
    Un protein_coding exon 2314 2900 . - . gene_id "13113.t00004"; transcript_id "13113.m00131"; exon_number "4"; transcript_name "13113.m00131"; seqedit "false";
    Un protein_coding CDS 2317 2900 . - 2 gene_id "13113.t00004"; transcript_id "13113.m00131"; exon_number "4"; transcript_name "13113.m00131"; protein_id "13113.m00131";
    Un protein_coding stop_codon 2314 2316 . - 0 gene_id "13113.t00004"; transcript_id "13113.m00131"; exon_number "4"; transcript_name "13113.m00131";
    Un protein_coding exon 5772 5785 . + . gene_id "13113.t00006"; transcript_id "13113.m00133"; exon_number "1"; transcript_name "13113.m00133"; seqedit "false";
    Un protein_coding CDS 5772 5785 . + 0 gene_id "13113.t00006"; transcript_id "13113.m00133"; exon_number "1"; transcript_name "13113.m00133"; protein_id "13113.m00133";
    Un protein_coding start_codon 5772 5774 . + 0 gene_id "13113.t00006"; transcript_id "13113.m00133"; exon_number "1"; transcript_name "13113.m00133";
    Un protein_coding exon 6126 6308 . + . gene_id "13113.t00006"; transcript_id "13113.m00133"; exon_number "2"; transcript_name "13113.m00133"; seqedit "false";

    is there something wrong in GTF file? and if yes then pls tell me where i got rice GTF file.
    Last edited by deepika123; 09-30-2013, 06:02 AM.

    Leave a comment:


  • mikep
    replied
    You haven't really indicated how you added the cufflinks output to the cuffdiff input. From the looks of your cufflinks command line you are using -GTF which means no novel transcripts. In that case running cufflinks is entirely pointless, and you didn't need to do it.

    Moving onto cuffdiff. From the looks of it there is a problem with the Oryza.gtf that you are using. cuffdiff expects a number of fields, specifically transcript_id, gene_id, tss_id, and p_id to perform the various comparisons.

    So my first guess is your gtf is malformed/missing fields.
    Last edited by mikep; 09-29-2013, 07:37 PM.

    Leave a comment:


  • deepika123
    started a topic cufflinks output

    cufflinks output

    i am using cufflinks for differential gene expression analysis. After mapping of illumina reads on rice genome using tophat, i got accepted_hits.bam file. This files is used for cufflinks.

    I have a control dataset and a experimental dataset. In cufflink, i ran :

    [root@nbri deepika_cuff]# cufflinks -o KT_out --GTF oryza.gtf --label KT accepted_hits.bam

    [root@nbri deepika_cuff]# cufflinks -o KC_out --GTF oryza.gtf --label KC accepted_hits.bam

    after running this command i got 4 files for each and transcript.gtf files are used for differential analysis.

    then i ran:

    cuffdiff /storage/home/cdac/deepika_top/rice_data/Oryza.gtf /storage/home/cdac/deepika_top/KT-24h_output/KT_accepted_hits.bam /storage/home/cdac/deepika_top/KC-24h_output/KC_accepted_hits.bam


    after running this command, i got all the files which are mentioned in cufflink/cuffdiff's manual but most of files are blank. i did not get this.

    In gene_exp.diff...
    test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
    11562.pre-tRNA-Arg-1 11562.pre-tRNA-Arg-1 trnR(UCU)3 Pt:34209-35955 q1 q2 NOTEST 0 0 0 0 1 1 no
    11562.pre-tRNA-Arg-2 11562.pre-tRNA-Arg-2 trnR(ACG)1 Pt:98543-99399 q1 q2 LOWDATA 0 0 0 0 1 1 no
    11562.pre-tRNA-Arg-3 11562.pre-tRNA-Arg-3 trnR(ACG)2 Pt:116153-116227 q1 q2 NOTEST 0 0 0 0 1 1 no
    11562.pre-tRNA-Asn-1 11562.pre-tRNA-Asn-1 trnN(GUU)1 Pt:98543-99399 q1 q2 NOTEST 0 0 0 0 1 1 no
    11562.pre-tRNA-Asn-2 11562.pre-tRNA-Asn-2 trnN(GUU)2 Pt:115830-115902 q1 q2 NOTEST 0 0 0 0 1 1 no
    11562.pre-tRNA-Asp-1 11562.pre-tRNA-Asp-1 trnD(GUC) Pt:16230-16304 q1 q2 NOTEST 0 0 0 0 1 1 no
    11562.pre-tRNA-Cys-1 11562.pre-tRNA-Cys-1 trnC(GCA) Pt:18058-18129 q1 q2 NOTEST 0 0 0 0 1 1 no
    11562.pre-tRNA-Gln-1 11562.pre-tRNA-Gln-1 - Pt:6615-6687 q1 q2 NOTEST 0 0 0 0 1 1 no
    11562.pre-tRNA-Glu-1 11562.pre-tRNA-Glu-1 trnE(UCC) Pt:15649-15722 q1 q2 NOTEST 0 0 0 0 1 1 no
    11562.pre-tRNA-Gly-1 11562.pre-tRNA-Gly-1 trnG(GCC)1 Pt:12330-12401 q1 q2 NOTEST 0 0 0 0 1 1 no
    11562.pre-tRNA-His-1 11562.pre-tRNA-His-1 trnH(GUG)2 Pt:80914-81163 q1 q2 NOTEST 0 0 0 0 1 1 no


    but in cds_exp.diff :
    test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant

    i got only this line and there are so many files which are blank. i did not get this problem and when i ran cufflink then got

    Warning: Could not connect to update server to verify current version. Please check at the Cufflinks website (http://cufflinks.cbcb.umd.edu).

    is there something wrong???? then how can i solve it?

    [10:29:31] Loading reference annotation.
    Warning: No conditions are replicated, switching to 'blind' dispersion method
    [10:29:38] Inspecting maps and determining fragment length distributions.
    [10:41:12] Modeling fragment count overdispersion.
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    [10:43:26] Calculating preliminary abundance estimates
    [10:43:26] Testing for differential expression and regulation in locus.
    > Processed 57165 loci. [*************************] 100%
    Performed 0 isoform-level transcription difference tests
    Performed 0 tss-level transcription difference tests
    Performed 0 gene-level transcription difference tests
    Performed 0 CDS-level transcription difference tests
    Performed 0 splicing tests
    Performed 0 promoter preference tests
    Performing 0 relative CDS output tests
    Writing isoform-level FPKM tracking
    Writing TSS group-level FPKM tracking
    Writing gene-level FPKM tracking
    Writing CDS-level FPKM tracking
    Writing isoform-level count tracking
    Writing TSS group-level count tracking
    Writing gene-level count tracking
    Writing CDS-level count tracking
    Writing isoform-level read group tracking
    Writing TSS group-level read group tracking
    Writing gene-level read group tracking
    Writing CDS-level read group tracking
    Writing read group info
    Writing run info


    please solve my problem.

Latest Articles

Collapse

  • seqadmin
    Best Practices for Single-Cell Sequencing Analysis
    by seqadmin



    While isolating and preparing single cells for sequencing was historically the bottleneck, recent technological advancements have shifted the challenge to data analysis. This highlights the rapidly evolving nature of single-cell sequencing. The inherent complexity of single-cell analysis has intensified with the surge in data volume and the incorporation of diverse and more complex datasets. This article explores the challenges in analysis, examines common pitfalls, offers...
    06-06-2024, 07:15 AM
  • seqadmin
    Latest Developments in Precision Medicine
    by seqadmin



    Technological advances have led to drastic improvements in the field of precision medicine, enabling more personalized approaches to treatment. This article explores four leading groups that are overcoming many of the challenges of genomic profiling and precision medicine through their innovative platforms and technologies.

    Somatic Genomics
    “We have such a tremendous amount of genetic diversity that exists within each of us, and not just between us as individuals,”...
    05-24-2024, 01:16 PM

ad_right_rmr

Collapse

News

Collapse

Topics Statistics Last Post
Started by seqadmin, Yesterday, 06:54 AM
0 responses
10 views
0 likes
Last Post seqadmin  
Started by seqadmin, 06-14-2024, 07:24 AM
0 responses
16 views
0 likes
Last Post seqadmin  
Started by seqadmin, 06-13-2024, 08:58 AM
0 responses
16 views
0 likes
Last Post seqadmin  
Started by seqadmin, 06-12-2024, 02:20 PM
0 responses
17 views
0 likes
Last Post seqadmin  
Working...
X