  • Simon Anders
    replied
    Originally posted by sindrle
    Hi!
    I've seen a lot of threads on this, but I can't figure it out. I have 16-60 million single-end reads in each library. I've used TopHat 2 with the UCSC GTF file for hg19.

    [...]

    How come I get, on average, 25-50% of reads counted as "no_feature",
    "ambiguous" or "alignment_not_unique"?

    Is this a GTF file created with UCSC's Table Browser? If so, these do not work: there is a bug in the Table Browser server that causes every gene ID field to contain the transcript ID instead of the gene ID.

    Please use a GTF file from another source.

    Simon
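
One way to check a GTF file for the problem described above is to count how often the gene_id and transcript_id attributes are identical. The following is a minimal sketch, assuming a standard nine-column, tab-separated GTF; the function names and the two example lines are illustrative and not taken from any real export.

```python
import re

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')

def parse_attributes(field):
    """Return the attribute column of a GTF line as a dict."""
    return dict(ATTR_RE.findall(field))

def fraction_gene_id_equals_transcript_id(lines):
    """Fraction of feature lines whose gene_id is identical to transcript_id."""
    total = same = 0
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 9:
            continue  # not a complete GTF record
        attrs = parse_attributes(columns[8])
        if "gene_id" in attrs and "transcript_id" in attrs:
            total += 1
            same += attrs["gene_id"] == attrs["transcript_id"]
    return same / total if total else 0.0

# Two illustrative lines in which gene_id simply repeats the transcript ID.
example = [
    'chr1\thg19_knownGene\texon\t11874\t12227\t0.0\t+\t.\tgene_id "uc001aaa.3"; transcript_id "uc001aaa.3";\n',
    'chr1\thg19_knownGene\texon\t12613\t12721\t0.0\t+\t.\tgene_id "uc001aaa.3"; transcript_id "uc001aaa.3";\n',
]
print(fraction_gene_id_equals_transcript_id(example))
```

To run it on a real file, pass an open file handle instead of the list. A fraction close to 1 suggests that the gene IDs are really transcript IDs, so reads overlapping several transcripts of one gene would look ambiguous to a counting tool.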


  • PhD1990
    replied
    Hi Bruce,

    It would be really nice if you could send me such a script.

    Thank you so much,

    Sara


  • bruce01
    replied
    Hi Sara,

    You can use GFF3 format in HTSeq; you just need to specify the feature type (3rd column) using the -t flag, as it may differ from the default, which is 'exon'. For example, '-t gene'. You may also need to set the ID attribute with -i, whose default is 'gene_id'. Otherwise, you can use a conversion script to make a GTF from the GFF3; there are a few around in various scripting languages, or I can PM you the one I use if you want.

    Bruce.
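
The conversion mentioned above can be sketched roughly as follows. This is not Bruce's script, only a minimal illustration: it assumes that the GFF3 'ID' attribute should become the GTF 'gene_id', keeps only lines of one feature type, and ignores parent/child relationships and escaped characters, which a real converter would need to handle. The function names and the example line are made up for the illustration.

```python
def gff3_attributes_to_gtf(field, id_key="ID"):
    """Convert a GFF3 attribute column (key=value;...) to GTF style (key "value"; ...)."""
    pairs = [item.split("=", 1) for item in field.strip().split(";") if "=" in item]
    attrs = dict(pairs)
    # Assumption: the GFF3 ID attribute is used as the GTF gene_id.
    parts = ['gene_id "%s"' % attrs.get(id_key, "")]
    parts += ['%s "%s"' % (key, value) for key, value in pairs if key != id_key]
    return "; ".join(parts) + ";"

def gff3_line_to_gtf(line, feature="gene"):
    """Return a GTF-style line, or None for comments and other feature types."""
    columns = line.rstrip("\n").split("\t")
    if line.startswith("#") or len(columns) != 9 or columns[2] != feature:
        return None
    columns[8] = gff3_attributes_to_gtf(columns[8])
    return "\t".join(columns)

# Illustrative GFF3 line (not copied from a real annotation file).
example = "Chromosome\tsource\tgene\t190\t255\t.\t+\t.\tID=gene0;Name=thrL"
print(gff3_line_to_gtf(example))
```

If no conversion is wanted, the alternative is to leave the GFF3 untouched and tell htseq-count which feature type and ID attribute to use.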


  • PhD1990
    replied
    thanks + second question

    Hi everyone,

    Thank you so much for helping me.
    I have found the problem, by the way: the tutorial says you should download the vcredist x86 2010 version, but I have now downloaded the 2012 version and it works perfectly.

    I have a second question, though.

    Now that the tutorial is working for me, I still have one really weird problem. To count reads, do you have to download exon information from the internet (Ensembl or something)? The tutorial provides a GTF file and that works perfectly, but on the internet I can only find GFF3 files for, for example, E. coli strains. How do you use these? I see that their content is different from the GTF file.

    Is there a standard format, or a place where I can find exon information in GTF format?

    Thanks and greetings,

    Sara


  • sindrle
    replied
    HTSeq: Very few counts recognised

    Hi!
    I've seen a lot of threads on this, but I can't figure it out. I have 16-60 million single-end reads in each library. I've used TopHat 2 with the UCSC GTF file for hg19.

    This is my code:

    samtools view accepted_hits.bam | \
    htseq-count -m intersection-nonempty -s no -a 10 \
    - UCSC/hg19/genes.gtf \
    > Out.txt

    Here is a typical result; it is proportional to the library size:

    no_feature 7013689
    ambiguous 269370
    too_low_aQual 0
    not_aligned 0
    alignment_not_unique 6645341

    How come I get, on average, 25-50% of reads counted as "no_feature",
    "ambiguous" or "alignment_not_unique"?

    This is RNA-seq, and if I must inspect the alignments visually, how should I proceed?
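
To put the special counters above in relation to the reads that were assigned to genes, the htseq-count output can be summarised with a few lines of Python. This is a minimal sketch: it assumes the two-column output format shown above (newer HTSeq versions prefix the special counters with two underscores, which the sketch also accepts), and the two gene rows in the example are invented placeholders, since the post does not show any per-gene counts.

```python
SPECIAL = {"no_feature", "ambiguous", "too_low_aQual",
           "not_aligned", "alignment_not_unique"}

def summarize_counts(lines):
    """Return (assigned, special, total) from htseq-count output lines."""
    assigned = 0
    special = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        name, count = fields[0], int(fields[1])
        key = name.lstrip("_")  # accept both "no_feature" and "__no_feature"
        if key in SPECIAL:
            special[key] = count
        else:
            assigned += count
    total = assigned + sum(special.values())
    return assigned, special, total

example = [
    "GENE_A 1000",  # invented placeholder, not from the post
    "GENE_B 3000",  # invented placeholder, not from the post
    "no_feature 7013689",
    "ambiguous 269370",
    "too_low_aQual 0",
    "not_aligned 0",
    "alignment_not_unique 6645341",
]
assigned, special, total = summarize_counts(example)
for key, count in special.items():
    print("%s: %.1f%% of all counted reads" % (key, 100.0 * count / total))
```

With a real output file, pass an open file handle instead of the list; the percentages then show which category accounts for most of the unassigned reads.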


  • Wolfgang Huber
    replied
    Dear PhD1990,

    It's good that you report having a problem, but you probably need to be more specific for someone to be able to help you. Can you provide

    - a reproducible example (i.e. a self-contained piece of code and, if needed, data for others to reproduce your problem),
    - a statement of what problem you experience (any error messages, warnings etc.),
    - an overview of your system (OS, Python version)?

    Kind regards
    Wolfgang
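
The system overview requested above can be collected with the Python standard library. The following is only a sketch; the function name is hypothetical, and the HTSeq version is read with a fallback in case the installed package does not expose a version attribute.

```python
import platform
import sys

def system_overview():
    """Collect basic facts about the system for a bug report."""
    info = {
        "os": platform.platform(),
        "python": sys.version.split()[0],
        "architecture": platform.architecture()[0],
    }
    try:
        import HTSeq  # optional: only reported if it can be imported
        info["HTSeq"] = getattr(HTSeq, "__version__", "unknown")
    except ImportError:
        info["HTSeq"] = "not installed"
    return info

for key, value in system_overview().items():
    print(key + ": " + str(value))
```

Pasting this output into a post, together with the exact commands that trigger the problem, covers the third item of the list.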


  • PhD1990
    started a topic problem with HTSeq

    problem with HTSeq

    Hi everyone,

    I'm trying to start using Python/HTSeq to analyse RNA-seq data.
    I'm following the tour through HTSeq, but I'm having a weird problem:

    I can import HTSeq
    and read in a file with the HTSeq.FastqReader.
    I can get the name of a read with read.name,
    but when I type read.qual, Python just automatically restarts and I have to start over.

    Does anyone know why this is and how I can solve this problem?

    Thank you
