Mapping to transcriptome with Bowtie2-beta5

Neuromancer

Member

Join Date: Aug 2011

Posts: 28

12-22-2011, 03:46 AM

Hey guys,
I wanted to map my brand-new paired-end RNA-seq data to the mouse transcriptome using the current beta (b5) of Bowtie2.
As I could not find a pre-built index for this, I built one myself, using bowtie2-build to index the Ensembl transcript sequences.
The three mouse FASTA files for this were downloaded from the Ensembl FTP site. I wanted to capture as much information as possible, so I included the cDNA-all, cDNA-abinitio and ncRNA FASTA files in the index.
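For reference, an index build like the one described could be scripted roughly as follows. This is only a sketch: the FASTA file names and the index basename are hypothetical placeholders, and it relies solely on bowtie2-build accepting a comma-separated list of FASTA files followed by an index basename. It assembles the command and leaves the actual run commented out.

```python
import subprocess


def build_index_command(fasta_files, index_base):
    """Return the bowtie2-build argument list for one or more FASTA files."""
    if not fasta_files:
        raise ValueError("need at least one FASTA file")
    # bowtie2-build takes all reference FASTA files as one comma-separated argument.
    return ["bowtie2-build", ",".join(fasta_files), index_base]


# Hypothetical file names; substitute the names of the downloaded Ensembl files.
fastas = ["mouse.cdna.all.fa", "mouse.cdna.abinitio.fa", "mouse.ncrna.fa"]
cmd = build_index_command(fastas, "mouse_transcriptome")
print(" ".join(cmd))
# subprocess.run(cmd, check=True)  # uncomment to actually build the index
```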
Then, I mapped the paired-end RNA-seq data to this index using the following command:

./bowtie2 -p 4 -t --local -x mouse_transcriptome_ensembl-NCBI37_ncRNA_cDNAall_abinitiopredictons -1 <matepair1.fastq> -2 <matepair2.fastq> -S output.sam

So far so good: it all worked well, with overall alignment rates of 80-90%.
Now, when I try to import the data into SeqMonk, it reads all the lines and then tells me that it "Couldn't extract valid name for <Ensembl-transcript-ID/Genscan-ID>" and leaves me with no reads at all. Is this because the chromosome information is missing, or not in the position SeqMonk expects?

From what I could find out, the Ensembl FASTA files also contain some "supercontigs" that do not have chromosome information but an NT-xxxx ID.
Even so, the remaining reads should still carry the correct annotation, right?

So what went wrong with my workflow here, and can I still rescue the SAM files I have already produced?
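One way to approach the chromosome question would be to look at the FASTA header lines themselves, since the aligner only carries the sequence ID into the SAM file. The sketch below is a rough illustration under a loud assumption: that each header has the form `>ID ... chromosome:ASSEMBLY:NAME:START:END:STRAND ...` (or `supercontig:` in place of `chromosome:`). That layout, the function name and the example header are assumptions for illustration and should be checked against the real files.

```python
def parse_ensembl_header(header):
    """Split a FASTA header into (sequence ID, location dict or None).

    Assumes a location field laid out as
    <coord_system>:<assembly>:<name>:<start>:<end>:<strand>.
    """
    fields = header.lstrip(">").split()
    seq_id = fields[0]
    for field in fields[1:]:
        parts = field.split(":")
        if len(parts) == 6 and parts[0] in ("chromosome", "supercontig"):
            return seq_id, {
                "coord_system": parts[0],
                "name": parts[2],
                "start": int(parts[3]),
                "end": int(parts[4]),
                "strand": int(parts[5]),
            }
    # No recognisable location field in this header.
    return seq_id, None


# Made-up header, for illustration only.
example = ">ENSMUST_EXAMPLE cdna:known chromosome:NCBIM37:3:100:200:-1 gene:ENSMUSG_EXAMPLE"
print(parse_ensembl_header(example))
```

A table built this way would map each transcript ID in the SAM file back to a chromosome or supercontig name, which is the information the import step appears to be missing.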

btw: the SAM file header looks like this:

@HD VN:1.0 SO:unsorted
@SQ SN:GENSCAN00000015589 LN:298
@SQ SN:GENSCAN00000001573 LN:74
@SQ SN:GENSCAN00000001572 LN:260
@SQ SN:GENSCAN00000026402 LN:489
...
@SQ SN:ENSMUST00000146092 LN:216
@SQ SN:ENSMUST00000120435 LN:630
@SQ SN:ENSMUST00000118023 LN:1647
...almost endlessly...
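Since the header is far too long to read by eye, a quick tally of the reference names can show what kinds of sequences the index contained. A minimal sketch, assuming a tab-delimited SAM file; the function name is made up for illustration and the sample lines are taken from the header above.

```python
import re
from collections import Counter


def count_reference_prefixes(sam_lines):
    """Count @SQ reference names by their leading alphabetic prefix."""
    counts = Counter()
    for line in sam_lines:
        if not line.startswith("@"):
            break  # header lines come first; stop at the first alignment
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                name = field[3:]
                match = re.match(r"[A-Za-z]+", name)
                counts[match.group(0) if match else name] += 1
    return counts


header = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:GENSCAN00000015589\tLN:298",
    "@SQ\tSN:GENSCAN00000001573\tLN:74",
    "@SQ\tSN:ENSMUST00000146092\tLN:216",
]
print(count_reference_prefixes(header))
```

For a real file, passing the open file handle works the same way, e.g. `with open("output.sam") as fh: print(count_reference_prefixes(fh))`.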

followed by the alignments, which look like this:

HWI-ST933:54:C01BFACXX:3:1101:10433:5230 99 ENSMUST00000082408 34 99M = 76 169 CGAAAATCTATTTGCCTCATTCATTACCCCAACAATAATAGGATTCCCAATCGTTGTAGCCATCATTATATTTCCTTCAATCCTATTCCCATCCTCAAA CCCFFFFFHHHHHJJJJJJJJJJJJJJJJJJJJJIJJJJJIJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJAHHHHHFFFFFFEEEDEEDDDDDD AS:i:198 XS:i:98 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:99 YS:i:198 YT:Z:CP
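To read the second column of a record like this one, the FLAG value can be broken into its bits as defined in the SAM specification. The sketch below is a small illustration; the label strings are my own shorthand, not official names.

```python
# Bit values follow the SAM specification; the labels are informal shorthand.
SAM_FLAGS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse_strand"),
    (0x20, "mate_reverse_strand"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
]


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS if flag & bit]


print(decode_flag(99))
# ['paired', 'proper_pair', 'mate_reverse_strand', 'first_in_pair']
```

So a FLAG of 99 describes the first read of a properly paired fragment whose mate aligned to the reverse strand, which matches the YT:Z:CP (concordant pair) tag at the end of the line.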
Tags: bowtie2, sam file, seqmonk, transcriptome
