visualisation of RNAseq (kallisto to IGV)

KamilSJaron

04-05-2016, 09:49 AM

Hello everyone,

Like others, I am quite excited about the pseudoalignment that kallisto produces in minutes, instead of a real alignment that takes hours to compute. Now it would be useful to visualise it in IGV.

We extracted the CDS of our bacterium from the .gdb file using Python scripts. The name of each sequence in the CDS file is its gene_id (which is the same as the transcript_id), exactly as we would expect.

I ran kallisto index on this CDS file and then produced a pseudobam file, following the kallisto manual (https://pachterlab.github.io/kallisto/manual.html):

kallisto quant -i cds.idx -o output -b 100 --single -l 100 -s 1 --pseudobam <all_RNAseq_reads.fq.gz> | samtools view -Sb - > pseudomap.bam

The .bam file was then sorted, indexed and loaded into IGV together with the .fasta and .gtf files, which gave the following error:

File does not contain any sequence names which match the current genome.
File: *****S5_genome_87, S5_genome_88, S5_genome_89, S5_genome_90, ...
Genome: S5_genome,

S5_genome_XX are the gene_ids of our genome and S5 is our genome. So I thought that IGV treats every transcript as a chromosome (based on a few related posts such as http://seqanswers.com/forums/archive...p/t-16407.html). I therefore created an alias file like this:

S5_genome_87 S5_genome
S5_genome_88 S5_genome
... ...

Now the file loads, but no reads are displayed at all. I guess I am missing something somewhere. IMHO the easiest way would be to edit the .bam file (or the .sam file before it is converted to .bam) so that every read refers to the single chromosome of the genome.
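The idea of rewriting the .sam records can be sketched in Python roughly as follows. This is only an illustration under assumptions that are not stated in the post: each gene is a single CDS feature in the .gtf, the gene_id attribute matches the sequence name used by kallisto, and only forward-strand genes are shifted correctly. The function names load_gene_coords and to_genome are made up for this example.

```python
import re

def load_gene_coords(gtf_lines):
    """Map gene_id -> (chromosome, start, end, strand) from GTF lines.

    Assumption: one CDS feature per gene (no introns), as in a bacterial genome.
    """
    coords = {}
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        match = re.search(r'gene_id "([^"]+)"', cols[8])
        if match:
            coords[match.group(1)] = (cols[0], int(cols[3]), int(cols[4]), cols[6])
    return coords

def to_genome(sam_line, coords):
    """Replace the transcript name by the chromosome and shift POS.

    Header lines and reads on unknown references are returned unchanged.
    Only correct for forward-strand genes.
    """
    if sam_line.startswith("@"):
        return sam_line
    fields = sam_line.rstrip("\n").split("\t")
    hit = coords.get(fields[2])
    if hit is None:
        return sam_line
    chrom, start, _end, _strand = hit
    fields[2] = chrom
    # Transcript position 1 corresponds to the gene start on the genome.
    fields[3] = str(start + int(fields[3]) - 1)
    return "\t".join(fields)
```

The @SQ header lines would also have to be replaced by a single line describing the genome; that step is left out of this sketch.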

If you are still reading, thank you for it. Any help appreciated.
Tags: bam, igv, kallisto, rnaseq, samtools
KamilSJaron

07-12-2016, 07:52 AM

I wrote a small Python script that uses a .gtf file to convert the .sam produced by kallisto into a .sam readable by IGV. It is not perfect (I was in a bit of a rush when writing it): reads on all transcripts from the reverse strand are displayed as if the transcript were in the forward direction (so the opposite of what they should be), but at the correct locus (i.e. if you just want to check coverage per transcript, it is good enough).

So, if you are interested:

https://github.com/KamilSJaron/Sequence-a-genome/blob/master/2016_spring/se2ex2/kallisto_sam_convertor.py

Usage:

python3 kallisto_sam_convertor.py <pseudoalignment.sam> <annotation.gtf> | samtools view -bS - | samtools sort - -o <output.bam>

The resulting .bam should be loadable into IGV (once it has been indexed with samtools index).

---edit---
I think that, to correct the script, one needs to flip the strand bit in the flag of reads mapping to transcripts from the reverse strand (forward reads become reverse reads and vice versa) and to recompute the position of the read (it should be mirrored around the middle of the transcript).
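A possible sketch of that correction for one record is shown here. It is not taken from the linked script; the function name flip_to_genome is hypothetical, and it assumes single-end reads, a mapped read with a real CIGAR string, and a transcript that is one contiguous reverse-strand feature ending at gene_end (1-based, inclusive). Besides the flag and position, the CIGAR, SEQ and QUAL columns also have to be reversed, because SAM always stores them relative to the forward strand of the reference. Optional tags are copied unchanged.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def flip_to_genome(fields, chrom, gene_end):
    """Convert one SAM record from reverse-strand transcript coordinates
    to genome coordinates.

    fields   -- list of SAM columns (at least the 11 mandatory ones)
    chrom    -- name of the genome sequence (assumption: a single chromosome)
    gene_end -- last genomic base of the transcript, 1-based
    """
    flag = int(fields[1])
    pos = int(fields[3])
    ops = CIGAR_RE.findall(fields[5])
    # Number of reference bases covered by the alignment.
    ref_len = sum(int(n) for n, op in ops if op in "MDN=X")

    out = list(fields)
    out[1] = str(flag ^ 0x10)                    # toggle the reverse-strand bit
    out[2] = chrom
    # The last aligned transcript base becomes the leftmost genomic base.
    out[3] = str(gene_end - (pos + ref_len - 1) + 1)
    out[5] = "".join(n + op for n, op in reversed(ops))
    out[9] = fields[9].translate(COMPLEMENT)[::-1]   # reverse complement
    out[10] = fields[10][::-1]                       # reversed qualities
    return out
```

For example, with a gene ending at position 1000, a read at transcript position 1 with CIGAR 10M ends up at genome position 991 and covers 991 to 1000.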

Last edited by KamilSJaron; 12-01-2016, 11:35 AM. Reason: correction of the specification of the problem, the script in post have.