
  • apredeus
    replied
    I have a quick question about this picture:

    Does it matter for "ambiguous" reads whether they land on the right strand? I.e., in the last two cases shown, if gene A and gene B are on opposite strands and the library is stranded, there is actually no ambiguity. Is that taken into consideration?

    Thank you in advance!
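
The question above can be illustrated with a toy model of union-style counting. This is only a sketch under stated assumptions, not HTSeq's code: the `Gene` tuple, the `assign` function, and the coordinates are all made up for illustration. It shows that a read overlapping two genes on opposite strands is ambiguous when strand is ignored, but resolves to one gene when only same-strand features are considered. Whether a particular htseq-count version behaves this way is best checked against its documentation.

```python
from typing import NamedTuple

class Gene(NamedTuple):
    name: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"

def assign(read_start, read_end, read_strand, genes, stranded):
    """Return a gene name, '__no_feature', or '__ambiguous' for one read."""
    hits = set()
    for g in genes:
        overlaps = read_start < g.end and g.start < read_end
        # When strand is ignored, every overlapping gene counts as a hit.
        strand_ok = (not stranded) or g.strand == read_strand
        if overlaps and strand_ok:
            hits.add(g.name)
    if not hits:
        return "__no_feature"
    if len(hits) > 1:
        return "__ambiguous"
    return next(iter(hits))

# Two hypothetical overlapping genes on opposite strands.
genes = [Gene("gene_A", 100, 200, "+"), Gene("gene_B", 150, 250, "-")]

unstranded_call = assign(160, 190, "+", genes, stranded=False)
stranded_call = assign(160, 190, "+", genes, stranded=True)
print(unstranded_call, stranded_call)
```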



  • dlepe
    replied
    Hi Simon,
    I'm analyzing SOLiD data using bowtie for mapping and htseq for quantification. When I tried the --stranded parameter (just to familiarize myself with htseq), I got very similar numbers whether I set it to yes or no.

    For example for my 001_02.count file when --stranded=yes
    __no_feature 278195
    __ambiguous 26690
    __too_low_aQual 0
    __not_aligned 0
    __alignment_not_unique 0

    For example for my 001_02.count file when --stranded=no
    __no_feature 255213
    __ambiguous 115445
    __too_low_aQual 0
    __not_aligned 0
    __alignment_not_unique 0

    Since my protocol wasn't stranded, I should be losing half the counts with --stranded=yes, but as you can see this was not the case. I tried the same with some Illumina data I have access to and got this, which I think is all right.

    stranded=yes __no_feature 9381365
    stranded=no __no_feature 492513

    So after struggling with this for a while, the only thing I found was that the SAM files for the SOLiD data contain only two different flags, 0 and 16, which I'm guessing is not enough information for htseq?

    707_1366_1065 16 Chr1 1078 255 28M * 0 0 CCCCCCCCCCCCCACCCCCCAAATTGAG [\L!2_______UBL__ZU!"_______ XA:i:2 MD:Z:28 NM:i:0 CM:i:2
    42_176_82 0 Chr1 4868 255 73M * 0 0 GGCGGTCAGTGGCTGAGTGACTATATCGACCTGCAACAGCAAGTTCCTTACTTGGCACCTTATGAAAATGAGT ___________________________________________UU______ZY^________^Z[_^^__\KM XA:i:0 MD:Z:73 NM:i:0 CM:i:0

    So my question is: are the results I'm getting for the SOLiD data with --stranded=no reliable?
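
On the flag question: in the SAM specification the FLAG column is a bit field, and bit 0x10 marks a read aligned to the reverse strand. The small sketch below decodes the two values seen in the post. The helper name `describe_flag` is made up for illustration. It suggests that flags 0 and 16 are simply what mapped single-end reads carry, forward and reverse respectively, so the strand information is present. It does not by itself explain the counts.

```python
# Bit values as defined in the SAM specification.
FLAG_PAIRED = 0x1
FLAG_UNMAPPED = 0x4
FLAG_REVERSE = 0x10

def describe_flag(flag):
    """Decode the bits of a SAM FLAG value that matter for strand handling."""
    return {
        "paired": bool(flag & FLAG_PAIRED),
        "mapped": not (flag & FLAG_UNMAPPED),
        "strand": "-" if flag & FLAG_REVERSE else "+",
    }

for flag in (0, 16):
    print(flag, describe_flag(flag))
```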



  • Lagzxadr
    replied
    Finally done. Thank you very much!
    Originally posted by dpryan
    Just use the GTF.



  • dpryan
    replied
    Just use the GTF.



  • Lagzxadr
    replied
    Yes, got it. Thanks a lot! Then how do I generate a GFF?
    Originally posted by dpryan
    When downloading that table from the UCSC table browser, just change the "output format" drop-down box to "GTF - gene transfer format".



  • dpryan
    replied
    When downloading that table from the UCSC table browser, just change the "output format" drop-down box to "GTF - gene transfer format".



  • Lagzxadr
    replied
    Oh... I failed to generate a GFF file from UCSC; I can only download a GFF3 file from NCBI. I ran HTSeq on the GFF3 and my BAM file, but no features were counted. I think this is because the IDs from the NCBI GFF3 cannot be matched to the IDs in the BAM file, which was mapped against the UCSC reference. Can you give me some suggestions? Should I use the GFF3 from NCBI, or where can I get a UCSC GFF?
    Originally posted by Simon Anders
    This does not at all look like a GFF file to me. No wonder that it does not work.



  • dpryan
    replied
    Originally posted by Lagzxadr
    #bin name chrom strand txStart txEnd cdsStart cdsEnd exonCoun
    1 NM_131426 chr1 + 50321633 50410568 50322024
    1 NM_001110522 chr1 - 58701200 58722813 58701200
    9 NM_001143751 chr1 + 6072450 6331842 6072675 6331842 11
    That's genePred format from UCSC! The conversion procedure is detailed here
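
The conversion can also be sketched by hand. The snippet below is a minimal illustration, not the procedure linked above. It assumes the usual UCSC column layout with a leading `bin` column, where `exonStarts` and `exonEnds` follow `exonCount` as comma-separated lists. The function name and the example row (`NM_EXAMPLE` and its coordinates) are made up. It also reuses the transcript name as `gene_id`, which is a simplification. The one detail that is easy to get wrong is the coordinate shift: genePred intervals are 0-based and half-open, GTF intervals are 1-based and inclusive.

```python
def genepred_to_gtf_exons(line, source="genePred"):
    """Convert one tab-separated genePred row (with leading 'bin') to GTF exon lines."""
    f = line.rstrip("\n").split("\t")
    name, chrom, strand = f[1], f[2], f[3]
    # exonStarts / exonEnds are comma-separated and usually end with a comma.
    starts = [int(x) for x in f[9].split(",") if x]
    ends = [int(x) for x in f[10].split(",") if x]
    # Simplification: the transcript name doubles as the gene identifier.
    attrs = f'gene_id "{name}"; transcript_id "{name}";'
    out = []
    for s, e in zip(starts, ends):
        # 0-based half-open -> 1-based inclusive: add 1 to start, keep end.
        out.append("\t".join(
            [chrom, source, "exon", str(s + 1), str(e), ".", strand, ".", attrs]
        ))
    return out

# Made-up example row, not taken from the file in this thread.
row = "1\tNM_EXAMPLE\tchr1\t+\t1000\t2000\t1100\t1900\t2\t1000,1500,\t1200,2000,"
gtf_lines = genepred_to_gtf_exons(row)
print("\n".join(gtf_lines))
```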



  • Simon Anders
    replied
    This does not at all look like a GFF file to me. No wonder that it does not work.
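
A quick way to see why such a file is rejected: a GFF/GTF feature line has nine tab-separated columns, with integer start and end in columns 4 and 5 and the strand in column 7. The checker below is a rough sketch, not part of HTSeq. The function name and both example lines are made up. Fed a genePred-style row, it reports that column 4 holds `'+'` rather than a number, which matches the `invalid literal for int()` error quoted in this thread.

```python
VALID_STRANDS = {"+", "-", ".", "?"}

def gff_problems(line):
    """Return a list of reasons why a line is not a plausible GFF/GTF feature line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return [f"expected 9 tab-separated columns, found {len(fields)}"]
    problems = []
    for idx, label in ((3, "start"), (4, "end")):
        if not fields[idx].isdecimal():
            problems.append(
                f"column {idx + 1} ({label}) is not an integer: {fields[idx]!r}"
            )
    if fields[6] not in VALID_STRANDS:
        problems.append(f"column 7 (strand) is not one of + - . ?: {fields[6]!r}")
    return problems

# Made-up example lines: one GTF-style, one genePred-style.
gff_line = 'chr1\tsource\texon\t1001\t1200\t.\t+\t.\tgene_id "g1";'
genepred_like = "1\tNM_EXAMPLE\tchr1\t+\t1000\t2000\t1100\t1900\t2"

print(gff_problems(gff_line))
print(gff_problems(genepred_like))
```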



  • Lagzxadr
    replied
    #bin name chrom strand txStart txEnd cdsStart cdsEnd exonCoun
    1 NM_131426 chr1 + 50321633 50410568 50322024
    1 NM_001110522 chr1 - 58701200 58722813 58701200
    9 NM_001143751 chr1 + 6072450 6331842 6072675 6331842 11
    Originally posted by Simon Anders
    Please post the beginning of your GFF file, to see whether there really is a '+' in line 2.



  • Simon Anders
    replied
    Please post the beginning of your GFF file, to see whether there really is a '+' in line 2.



  • Lagzxadr
    replied
    Dear Simon,
    I ran into a problem when using htseq-count. How can I fix this error? Thanks a lot!
    huoxj@ubuntu:/host/ubuntu$ htseq-count -s no -i ID Hxj3TAN_hits.bam Zv9.gff > Hxj4count.txt
    Error occured when processing GFF file (line 2 of file Zv9.gff):
    invalid literal for int() with base 10: '+'
    [Exception type: ValueError, raised in __init__.py:223]



    Originally posted by Simon Anders
    Hi



    I noticed this bug myself just yesterday and fixed it. Please try again with version 0.4.3-p4 and tell me whether this solves the issue.

    Cheers
    Simon



  • dvanic
    replied
    I'm thinking that "stranded=reverse" is the way to go if I want to measure sense expression, since for the fr-firststrand protocol, the rightmost strand is sequenced first, which is opposite to the coding strand. Is this correct?
    Yes. I've posted on this here: http://seqanswers.com/forums/showpos...8&postcount=50



  • alig
    replied
    Hello,

    I've used TopHat 2.0.9 and then HTSeq version 0.5.4p3, and for just 3 of my 28 SAM files I get this error.

    Error occured in line 63841485 of file RNA8_sorted.sam.
    Error: ("'seq' and 'qualstr' do not have the same length.", 'line 63841485 of file RNA8_sorted.sam')
    [Exception type: ValueError, raised in _HTSeq.pyx:772]

    Can anyone please help as it's holding up my analysis.

    Thank you
    alig
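
To find the offending records, one option is to scan the SAM file for lines where the SEQ (column 10) and QUAL (column 11) strings differ in length. The sketch below is not an HTSeq feature, and the function name and demo records are made up. It skips header lines and records where either field is `*`, and it flags lines with fewer than 11 columns, which would point to a truncated or damaged line. That cause is only a guess until the actual line has been inspected.

```python
import io

def find_length_mismatches(handle):
    """Yield (line_number, read_name, len_seq, len_qual) for suspicious SAM records."""
    for lineno, line in enumerate(handle, start=1):
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            # Fewer than the 11 mandatory columns: likely a truncated line.
            yield lineno, fields[0], None, None
            continue
        seq, qual = fields[9], fields[10]
        if seq == "*" or qual == "*":
            continue  # sequence or qualities not stored
        if len(seq) != len(qual):
            yield lineno, fields[0], len(seq), len(qual)

# Made-up three-line SAM file; in practice pass open("RNA8_sorted.sam").
demo = io.StringIO(
    "@HD\tVN:1.0\n"
    "r1\t0\tChr1\t1\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r2\t0\tChr1\t5\t255\t4M\t*\t0\t0\tACGT\tIII\n"
)
mismatches = list(find_length_mismatches(demo))
print(mismatches)
```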



  • ppatrickt
    replied
    library type and stranded parameter

    Hello,

    I'm trying to figure out the right "stranded" parameter to use for my RNA-seq data which was aligned using TopHat with the "--library-type fr-firststrand" parameter. I'm using paired-end reads.

    From what I can see, the results of running "stranded=no" are similar to those of "stranded=reverse": both give me ~50% of the total fragments, and the majority have no feature. But if I run with "stranded=yes", I only get ~2% of the total fragments as having a feature.

    I'm thinking that "stranded=reverse" is the way to go if I want to measure sense expression, since for the fr-firststrand protocol, the rightmost strand is sequenced first, which is opposite to the coding strand. Is this correct?

    Thanks,
    Patrick
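
The three settings can be summarised in a few lines. The htseq-count documentation states that with stranded=yes and paired-end data, the first read has to be on the same strand as the feature and the second read on the opposite strand, and that stranded=reverse reverses these rules. The sketch below encodes just that rule. The function name is made up, and this is not HTSeq's implementation. For a gene on the + strand in an fr-firststrand library, read 1 aligns to the - strand and read 2 to the + strand, so only "reverse" maps both mates to the gene's strand.

```python
FLIP = {"+": "-", "-": "+"}

def counted_strand(is_reverse, is_read2, stranded):
    """Strand on which a read is matched to features, or '.' if strand is ignored."""
    if stranded == "no":
        return "."
    strand = "-" if is_reverse else "+"
    if is_read2:
        # The second mate is expected on the opposite strand, so flip it.
        strand = FLIP[strand]
    if stranded == "reverse":
        strand = FLIP[strand]
    return strand

# fr-firststrand fragment from a + strand gene: read 1 reverse, read 2 forward.
for mode in ("yes", "reverse", "no"):
    read1 = counted_strand(is_reverse=True, is_read2=False, stranded=mode)
    read2 = counted_strand(is_reverse=False, is_read2=True, stranded=mode)
    print(mode, read1, read2)
```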

