I have a quick question about this picture:
does it matter for "ambiguous" reads whether they land on the right strand? I.e., in the last two cases shown, if gene A and gene B are on opposite strands and the library is stranded, there is actually no ambiguity. Is that taken into consideration?
Thank you in advance!
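To make the question concrete, here is a toy sketch of how a strand filter can remove the ambiguity when two overlapping genes lie on opposite strands. It is not HTSeq's implementation; the `Gene` type, the `assign` function, and the coordinates are made up for illustration, and only simple interval overlap is modelled.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # 1-based, inclusive (hypothetical coordinates)
    end: int     # inclusive
    strand: str  # "+" or "-"


def assign(read_start, read_end, read_strand, genes, stranded):
    """Return a gene name, '__no_feature', or '__ambiguous' for one read."""
    hits = {
        g.name
        for g in genes
        if g.start <= read_end and read_start <= g.end   # intervals overlap
        and (not stranded or g.strand == read_strand)    # optional strand filter
    }
    if not hits:
        return "__no_feature"
    if len(hits) > 1:
        return "__ambiguous"
    return next(iter(hits))


# Two overlapping genes on opposite strands, and a read inside the overlap.
genes = [Gene("gene_A", 100, 500, "+"), Gene("gene_B", 400, 800, "-")]
print(assign(420, 470, "+", genes, stranded=False))  # __ambiguous
print(assign(420, 470, "+", genes, stranded=True))   # gene_A
```

With the strand filter on, the read can only match the gene on its own strand, so it is no longer ambiguous in this toy model.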
-
Hi Simon,
I'm analyzing SOLiD data using bowtie for mapping and htseq for quantification. The thing is, when I used the --stranded parameter (I tried it just to familiarize myself with htseq), I got very similar numbers whether I set it to yes or no.
For example, for my 001_02.count file with --stranded=yes:
__no_feature 278195
__ambiguous 26690
__too_low_aQual 0
__not_aligned 0
__alignment_not_unique 0
And for the same 001_02.count file with --stranded=no:
__no_feature 255213
__ambiguous 115445
__too_low_aQual 0
__not_aligned 0
__alignment_not_unique 0
Since my protocol wasn't stranded, I should be losing about half the counts with --stranded=yes, but as you can see this was not the case. I tried the same with some Illumina data I have access to and got this, which I think is all right:
stranded=yes __no_feature 9381365
stranded=no __no_feature 492513
After struggling with this for a while, the only thing I found was that the SAM files for the SOLiD data contain only two different flags, 0 or 16, which I'm guessing is not enough information for htseq?
707_1366_1065 16 Chr1 1078 255 28M * 0 0 CCCCCCCCCCCCCACCCCCCAAATTGAG [\L!2_______UBL__ZU!"_______ XA:i:2 MD:Z:28 NM:i:0 CM:i:2
42_176_82 0 Chr1 4868 255 73M * 0 0 GGCGGTCAGTGGCTGAGTGACTATATCGACCTGCAACAGCAAGTTCCTTACTTGGCACCTTATGAAAATGAGT ___________________________________________UU______ZY^________^Z[_^^__\KM XA:i:0 MD:Z:73 NM:i:0 CM:i:0
So my question is: are the results I'm getting for the SOLiD data with --stranded=no reliable?
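On the flag question: in the SAM format, bit 0x10 of the FLAG field marks a read aligned to the reverse strand and bit 0x4 marks an unmapped read, so flags 0 and 16 are what single-end alignments normally carry and they do encode the strand. A minimal sketch for tallying strands from SAM text follows; the function names and the example records are made up for illustration.

```python
from collections import Counter


def strand_from_flag(flag: int) -> str:
    """Decode the alignment strand from a SAM FLAG value."""
    if flag & 0x4:          # bit 0x4: read is unmapped
        return "unmapped"
    return "-" if flag & 0x10 else "+"   # bit 0x10: reverse strand


def strand_counts(sam_lines):
    """Count reads per strand, skipping header lines that start with '@'."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.split("\t")
        counts[strand_from_flag(int(fields[1]))] += 1
    return counts


# Hypothetical, shortened records (SEQ and QUAL given as '*').
example = [
    "@HD\tVN:1.0\tSO:unsorted",
    "read_a\t16\tChr1\t1078\t255\t28M\t*\t0\t0\t*\t*",
    "read_b\t0\tChr1\t4868\t255\t73M\t*\t0\t0\t*\t*",
]
print(strand_counts(example))
```

If the tally is roughly balanced between "+" and "-" across a gene, that would be consistent with an unstranded protocol, though this check alone does not prove it.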
-
When downloading that table from the UCSC table browser, just change the "output format" drop-down box to "GTF - gene transfer format".
-
Oops... I failed to generate a GFF file from UCSC; I could only download a GFF3 file from NCBI. I ran HTSeq on the GFF3 and my BAM file, but no features were counted. I think this is because the IDs in the NCBI GFF3 cannot be matched to the IDs in the BAM, which was mapped against the UCSC reference. Can you give me a suggestion? Should I use the GFF3 from NCBI, or where can I get a UCSC GFF?
Originally posted by Simon Anders: This does not at all look like a GFF file to me. No wonder that it does not work.
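One way to test the suspicion that the annotation and the alignments use different sequence names is to compare column 1 of the GFF with the `SN:` entries in the SAM header (for a BAM file, the header text can be obtained with `samtools view -H`). The sketch below assumes both are available as lists of text lines; the function names and the example names are hypothetical.

```python
def gff_seqnames(gff_lines):
    """Collect the sequence names (column 1) from GFF/GTF lines."""
    names = set()
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def sam_header_seqnames(header_lines):
    """Collect reference names from the @SQ lines of a SAM header."""
    names = set()
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


# Made-up example: accession-style name in the annotation, "chr" style in the BAM.
gff = ["##gff-version 3", "NC_EXAMPLE.1\tRefSeq\tgene\t100\t200\t.\t+\t.\tID=gene0"]
header = ["@HD\tVN:1.0", "@SQ\tSN:chr1\tLN:1000"]
shared = gff_seqnames(gff) & sam_header_seqnames(header)
print(shared or "no sequence names in common")
```

An empty intersection would explain why every read ends up without a feature.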
-
Originally posted by Lagzxadr:
#bin name chrom strand txStart txEnd cdsStart cdsEnd exonCoun
1 NM_131426 chr1 + 50321633 50410568 50322024
1 NM_001110522 chr1 - 58701200 58722813 58701200
9 NM_001143751 chr1 + 6072450 6331842 6072675 6331842 11
-
This does not at all look like a GFF file to me. No wonder that it does not work.
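For context, a GFF line has nine tab-separated columns, with integer start and end coordinates in columns 4 and 5 and the strand in column 7. In the table posted in this thread, the fourth column holds the strand, which would be consistent with the `invalid literal for int() with base 10: '+'` error. A rough line checker is sketched below; `check_gff_line` is a made-up helper, not part of HTSeq, and the "valid" example record uses invented coordinates.

```python
def check_gff_line(line):
    """Return a list of problems found in one GFF data line (empty if none)."""
    problems = []
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        problems.append(f"expected 9 tab-separated columns, found {len(fields)}")
    # Columns 4 and 5 (indices 3 and 4) must be integer coordinates.
    for idx, label in ((3, "start"), (4, "end")):
        if len(fields) > idx and not fields[idx].isdigit():
            problems.append(f"column {idx + 1} ({label}) is not an integer: {fields[idx]!r}")
    # Column 7 (index 6) must be a strand symbol.
    if len(fields) > 6 and fields[6] not in {"+", "-", ".", "?"}:
        problems.append(f"column 7 (strand) is not a strand symbol: {fields[6]!r}")
    return problems


ucsc_row = "1\tNM_131426\tchr1\t+\t50321633\t50410568\t50322024"
gff_row = "chr1\trefGene\texon\t100\t200\t.\t+\t.\tgene_id \"NM_131426\""
for problem in check_gff_line(ucsc_row):
    print(problem)
print(check_gff_line(gff_row))  # []
```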
-
#bin name chrom strand txStart txEnd cdsStart cdsEnd exonCoun
1 NM_131426 chr1 + 50321633 50410568 50322024
1 NM_001110522 chr1 - 58701200 58722813 58701200
9 NM_001143751 chr1 + 6072450 6331842 6072675 6331842 11
Originally posted by Simon Anders: Please post the beginning of your GFF file, to see whether there really is a '+' in line 2.
-
Please post the beginning of your GFF file, to see whether there really is a '+' in line 2.
-
Dear Simon,
I ran into a problem when using htseq-count. How can I fix this error? Thanks a lot!
huoxj@ubuntu:/host/ubuntu$ htseq-count -s no -i ID Hxj3TAN_hits.bam Zv9.gff > Hxj4count.txt
Error occured when processing GFF file (line 2 of file Zv9.gff):
invalid literal for int() with base 10: '+'
[Exception type: ValueError, raised in __init__.py:223]
Originally posted by Simon Anders:
Hi
I noticed this bug myself just yesterday and fixed it. Please try again with version 0.4.3-p4 and tell me whether this solves the issue.
Cheers
Simon
-
I'm thinking that "stranded=reverse" is the way to go if I want to measure sense expression, since with the fr-firststrand protocol the rightmost strand is sequenced first, which is opposite to the coding strand. Is this correct?
-
Hello,
I've used TopHat 2.0.9 and then HTSeq version 0.5.4p3, and with just 3 of my 28 SAM files I get this error:
Error occured in line 63841485 of file RNA8_sorted.sam.
Error: ("'seq' and 'qualstr' do not have the same length.", 'line 63841485 of file RNA8_sorted.sam')
[Exception type: ValueError, raised in _HTSeq.pyx:772]
Can anyone please help? This is holding up my analysis.
Thank you
alig
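To locate records like the one named in the error message, one option is to scan the SAM text for lines whose SEQ (column 10) and QUAL (column 11) differ in length. The SAM format allows `*` in either field, so those are skipped. This is only a diagnostic sketch with a made-up function name; it does not say why the records are malformed (a truncated or corrupted file is one possibility, but that is a guess).

```python
def find_length_mismatches(sam_lines):
    """Yield (line_number, message) for records with inconsistent SEQ/QUAL."""
    for lineno, line in enumerate(sam_lines, start=1):
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            yield lineno, f"only {len(fields)} fields, expected at least 11"
            continue
        seq, qual = fields[9], fields[10]
        if seq != "*" and qual != "*" and len(seq) != len(qual):
            yield lineno, f"SEQ length {len(seq)} != QUAL length {len(qual)}"


# Hypothetical records: the second data line has a shortened quality string.
records = [
    "@HD\tVN:1.0",
    "r1\t0\tchr1\t10\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr1\t20\t255\t4M\t*\t0\t0\tACGT\tII",
]
for lineno, message in find_length_mismatches(records):
    print(lineno, message)  # 3 SEQ length 4 != QUAL length 2
```

For a real file, the same function can be applied to an open file handle, for example `find_length_mismatches(open("RNA8_sorted.sam"))`.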
-
library type and stranded parameter
Hello,
I'm trying to figure out the right "stranded" parameter to use for my RNA-seq data, which was aligned using TopHat with the "--library-type fr-firststrand" parameter. I'm using paired-end reads.
From what I can see, the results of running "stranded=no" are similar to those of "stranded=reverse": both assign about 50% of the total fragments to a feature, and most of the rest have no feature. But if I run with "stranded=yes", only about 2% of the total fragments are assigned to a feature.
I'm thinking that "stranded=reverse" is the way to go if I want to measure sense expression, since with the fr-firststrand protocol the rightmost strand is sequenced first, which is opposite to the coding strand. Is this correct?
Thanks,
Patrick
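The pairing usually given between TopHat library types and the htseq-count `--stranded` option is fr-unstranded with no, fr-firststrand with reverse, and fr-secondstrand with yes. The numbers in this post fit that pattern. The sketch below records the mapping and adds a rough empirical check based on the fraction of fragments assigned under each setting; the function name and the `ratio` threshold of 5 are arbitrary assumptions, not anything defined by HTSeq.

```python
# Commonly used correspondence between TopHat --library-type and
# htseq-count --stranded.
TOPHAT_TO_HTSEQ = {
    "fr-unstranded": "no",
    "fr-firststrand": "reverse",
    "fr-secondstrand": "yes",
}


def guess_stranded(assigned_yes, assigned_reverse, ratio=5.0):
    """Guess the --stranded value from the fractions of fragments assigned
    to features in trial runs with --stranded=yes and --stranded=reverse.
    `ratio` is an arbitrary cut-off chosen for this illustration."""
    if assigned_reverse >= ratio * assigned_yes:
        return "reverse"
    if assigned_yes >= ratio * assigned_reverse:
        return "yes"
    return "no"  # similar fractions in both runs suggest an unstranded library


# Fractions reported above: about 2% with yes, about 50% with reverse.
print(guess_stranded(0.02, 0.50))         # reverse
print(TOPHAT_TO_HTSEQ["fr-firststrand"])  # reverse
```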