
**GenoMax** · 12-02-2016, 08:07 AM

If you are happy with the current state of the transcriptome annotation (the known expressed parts of the genome), you can choose to align your data to just that part.

That is not incorrect, but by restricting to just the "known" expressed parts of the genome you run a small risk of some reads mis-aligning, since an aligner does its best to place every read and a given read may not have originally come from that region. If splice sites are provided as well, then the programs would not try to look for new ones. Both these modifications speed up the alignments to some extent.

**New2Bioinfo** · 12-03-2016, 01:36 AM

Originally posted by GenoMax:

Both these modifications speed up the alignments to some extent.

That's okay. But with HISAT2, I am extracting the splice site and exon information from the .gtf file and giving that information to the index builder (hisat2-build). So what I understand is that supplying this information at indexing time helps later, during alignment.
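For readers following along, the indexing step described here could look roughly like the sketch below. It only assembles the command line, it does not run anything. The `--ss` and `--exon` options are the hisat2-build options for passing splice sites and exons (the HISAT2 manual describes them as intended to be used together); the helper name `hisat2_build_command` and all file names are hypothetical placeholders, not something from this thread.

```python
import shlex


def hisat2_build_command(genome_fasta, index_prefix, splice_sites=None, exons=None):
    """Assemble an argv list for hisat2-build, with optional annotation hints.

    Hypothetical helper for illustration; it builds the command but does not run it.
    """
    cmd = ["hisat2-build"]
    if splice_sites:
        # File of known splice sites extracted from the annotation
        cmd += ["--ss", splice_sites]
    if exons:
        # File of known exons extracted from the annotation
        cmd += ["--exon", exons]
    # Positional arguments: reference FASTA, then the index base name
    cmd += [genome_fasta, index_prefix]
    return cmd


# Placeholder file names, assumed for this example only
cmd = hisat2_build_command("genome.fa", "genome_tran", "splice_sites.txt", "exons.txt")
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where HISAT2 is installed.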

If I know the splice sites, the reads will not align to those parts where splice sites lie in the middle. Is this correct?
I still don't get how exon info is helping in alignment.

A little more detailed answer would be really really helpful.

Thank you.

**wdecoster** · 12-03-2016, 10:34 AM

The most intuitive explanation might be that those "known" exons and splice sites are used as a suggestion for the read mapping, making mapping much quicker since the aligner "knows" where to look. Reads that don't behave according to the "known" annotation will still get correctly aligned and new splice sites will be discovered.

You are just "telling" the aligner a priori where the splice junctions most likely are (but not restricting the mapping to those junctions/exons).
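To make "known splice junctions" concrete, here is a minimal sketch of how junctions can be derived from the exon records of a GTF file: within each transcript, every pair of consecutive exons implies one junction. This is not HISAT2's own extraction script. The function name `splice_sites_from_gtf` is hypothetical, the coordinates are kept in the GTF's 1-based convention (the output format a given aligner expects may differ), and edge cases such as overlapping or nearly adjacent exons are ignored.

```python
from collections import defaultdict


def splice_sites_from_gtf(lines):
    """Return sorted (chrom, last_exon_base, next_exon_first_base, strand) tuples.

    Simplified illustration: coordinates stay 1-based as in the GTF.
    """
    exons_by_transcript = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        chrom, start, end, strand = fields[0], int(fields[3]), int(fields[4]), fields[6]
        # Pull transcript_id out of the attribute column
        transcript_id = None
        for attr in fields[8].split(";"):
            parts = attr.strip().split(None, 1)
            if len(parts) == 2 and parts[0] == "transcript_id":
                transcript_id = parts[1].strip('"')
        if transcript_id is None:
            continue
        exons_by_transcript[(chrom, strand, transcript_id)].append((start, end))

    junctions = set()
    for (chrom, strand, _tid), exons in exons_by_transcript.items():
        exons.sort()
        # Each pair of consecutive exons in a transcript implies one junction
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
            junctions.add((chrom, prev_end, next_start, strand))
    return sorted(junctions)


# Toy annotation: transcript T2 skips the middle exon of T1
example_gtf = [
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tdemo\texon\t300\t400\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tdemo\texon\t500\t600\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
    'chr1\tdemo\texon\t500\t600\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
]
for junction in splice_sites_from_gtf(example_gtf):
    print(*junction, sep="\t")
```

A list like this is the kind of a priori hint described above: it tells the aligner where junctions are likely to be, without by itself forbidding alignments elsewhere.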

**New2Bioinfo** · 12-05-2016, 01:59 AM

Okay. That makes sense.

Thank you very much.

**biocomputer** · 12-15-2016, 09:01 AM

Does including exons and splice sites make the alignment more accurate, faster, or both?

Indexing- Exons and splice sites

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News