**cascoamarillo** · 10-21-2011, 12:08 PM

Hi,

I'm not sure about the iCLIP data analysis, but bowtie is an alignment tool:

http://bowtie-bio.sourceforge.net/manual.shtml

Does it take .bed files? I don't think so. It takes your reads (fasta, fastq) and your reference (an index built from a fasta file) and generates an output alignment file. Then you can probably take this output and the .bed file into an annotation program or similar (I don't know too much about this part).
What you can DO with bowtie is take your tRNA, rRNA, snoRNA and snRNA fasta file and eliminate the reads that match those sequences (--un option, which writes the reads that fail to align to a separate file).
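
A minimal sketch of that filtering step, assuming bowtie 1 and bowtie-build are on the PATH. The function name, the index name ncrna_index and the file names are placeholders, not anything from this thread. The index is passed as the first positional argument, which is how older bowtie 1 releases take it; check `bowtie --help` for your version.

```shell
# Hypothetical helper: keep only the reads that do NOT align to the ncRNA set.
# Usage: drop_ncrna_reads <ncRNA.fasta> <reads.fastq> <kept_reads.fastq>
drop_ncrna_reads() {
    # Build a bowtie index from the tRNA/rRNA/snoRNA/snRNA sequences.
    bowtie-build "$1" ncrna_index > /dev/null &&
    # --un writes every read that fails to align to the file given in $3.
    # Alignments (the ncRNA hits) go to stdout and are saved next to it.
    # For reads in FASTA rather than FASTQ format, add -f.
    bowtie --un "$3" ncrna_index "$2" > "$3.ncrna_hits.txt"
}
```

It would be called as `drop_ncrna_reads ncRNA.fasta reads.fastq reads_without_ncRNA.fastq`.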

Hope it helps.

**StephaniePi83** · 10-23-2011, 11:13 PM

Dear cascoamarillo,
Thanks for your reply.
This is exactly what I want to do: eliminate reads that match tRNA, rRNA ... but is it possible with my .bed file? Or do I have to convert it to another format?

**arvid** · 10-24-2011, 12:19 AM

Having the reads in .bed format suggests that they are probably already aligned. Anyway, you could then go a couple of different routes:

1. If the genome is well annotated, get a GFF file, pull out the lines describing the tRNA/rRNA etc. locations, and filter the bed file against those (get bedtools and check out "intersectBed -v").
2. You can extract fasta for each bed feature (read, I suppose in your case) from the bed/fasta-combo with the fastaFromBed tool in bedtools. Then align the reads in fasta format to your ncRNAs with Bowtie or whatever aligner.

Good luck,
Samuel

**StephaniePi83** · 10-25-2011, 12:16 AM

Dear Arvid,
Thanks for your reply.
I have a question concerning the 2nd solution: do I have to create a fasta file that contains the sequence of each chromosome (I work on the fly D. melanogaster, so I need the sequences of the 4 chromosomes)?
I tried the 1st solution, but the fly .gff file is corrupted; I can't unzip it.

**arvid** · 10-25-2011, 12:50 AM

For the 2nd solution:
Yes, you need the FASTA file with the chromosome sequences that the bed is associated with (to which the reads were aligned). Then issue the following bedtools (http://code.google.com/p/bedtools/) command:

fastaFromBed -name -fi [your_fasta_file] -bed [your_bed_file] -fo output.fasta

That should give you the reads in "output.fasta".

For the first solution, grab the compressed gff from FlyBase (ftp://ftp.flybase.net/releases/FB201...l-r5.41.gff.gz).

Then grep the file for rRNA, tRNA, snoRNA etc. E.g.:
gzip -cd dmel-all-r5.41.gff.gz | grep rRNA > dmel-rRNA-r5.41.gff
gzip -cd dmel-all-r5.41.gff.gz | grep tRNA > dmel-tRNA-r5.41.gff
and so on.
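
A plain grep keeps every line that mentions the string anywhere, including gene names or attributes that merely contain "rRNA". A stricter variant, sketched here with a hypothetical helper gff_by_type, compares only the feature type in column 3. It assumes the GFF uses type names such as rRNA, tRNA, snoRNA and snRNA in that column.

```shell
# Keep only GFF records whose feature type (column 3) matches exactly.
# Usage: gff_by_type <file.gff.gz> <type>
gff_by_type() {
    # Comment lines and any trailing FASTA section contain no tabs,
    # so their third field is empty and they are dropped automatically.
    gzip -cd "$1" | awk -F '\t' -v type="$2" '$3 == type'
}
```

For example: `gff_by_type dmel-all-r5.41.gff.gz rRNA > dmel-rRNA-r5.41.gff`.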

Then intersect your bed with these gffs:
intersectBed -wa -a [your_bed_file] -b dmel-rRNA-r5.41.gff > rRNA_reads.bed
intersectBed -wa -a [your_bed_file] -b dmel-tRNA-r5.41.gff > tRNA_reads.bed
etc.

Or if you need to filter out those ncRNA reads:
intersectBed -v -a [your_bed_file] -b dmel-rRNA-r5.41.gff | intersectBed -v -a stdin -b dmel-tRNA-r5.41.gff > no-rRNA-no-tRNA_reads.bed

Look through the bedtools web site for more examples...

Enjoy,
Samuel

**StephaniePi83** · 10-25-2011, 01:00 AM

Originally posted by arvid:

For the first solution, grab the compressed gff from FlyBase (ftp://ftp.flybase.net/releases/FB201...l-r5.41.gff.gz).

This is exactly the file I can't unzip ...

**arvid** · 10-25-2011, 01:04 AM

It is in gzip format, so you shouldn't "unzip" it. If you'd like to decompress it, run "gunzip dmel-all-r5.41.gff.gz". You don't need to do that for the commands I suggested, though, as they decompress the file on the fly.

If it is corrupted, download it again. Also make sure that your reads were aligned to the same version of the genome, in case the chromosomes might have changed (or check the readmes on FlyBase for such information).
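
One way to tell a damaged download from a wrong decompression tool is gzip's built-in integrity test. A small sketch (check_gz is a hypothetical helper name):

```shell
# gzip -t reads the whole archive and exits non-zero if it is
# truncated or otherwise damaged; nothing is written to disk.
check_gz() {
    if gzip -t "$1" 2>/dev/null; then
        echo "OK: $1"
    else
        echo "CORRUPT: $1 (download it again)"
        return 1
    fi
}
```

For example: `check_gz dmel-all-r5.41.gff.gz`.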

**StephaniePi83** · 10-25-2011, 06:14 AM

For the 2nd solution, do I need the same chromosome sequences that the bed is associated with? I think the data from FlyBase is not what was used, because "intersectBed" only removes 1 entry ...

**arvid** · 10-25-2011, 06:16 AM

Yes, you need the same chromosome sequences... Can't you find out what produced that bed file? Then you should be able to get the needed information, and possibly raw fastq files to do your own alignments...
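
When intersectBed finds almost no overlaps, one possible cause (only a guess here, since the origin of the bed file is unknown) is that the two files name their chromosomes differently, for example "chr2L" in the bed versus "2L" in the FlyBase GFF. A quick check, sketched with a hypothetical helper:

```shell
# List chromosome names that occur in the BED file but not in the GFF.
# Usage: chroms_missing_from_gff <bed_file> <uncompressed_gff_file>
chroms_missing_from_gff() {
    awk -F '\t' '
        # First file (the GFF): remember column 1 of every 9-column record.
        NR == FNR { if ($0 !~ /^#/ && NF >= 9) seen[$1] = 1; next }
        # Second file (the BED): print each unknown chromosome name once.
        $0 !~ /^(#|track|browser)/ && !($1 in seen) && !printed[$1]++ { print $1 }
    ' "$2" "$1"
}
```

Any name it prints has no counterpart in the annotation, so reads on that chromosome can never intersect. Empty output means the names match, and the difference is more likely in the genome version or the coordinates.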

**StephaniePi83** · 10-25-2011, 06:19 AM

Yes, I'd like to, but the person who performed the primary analysis doesn't answer me!! Thank you for your reply, it helps me.

read annotation with bowtie
