**Brian Bushnell** · 01-03-2017, 10:41 AM

Originally posted by mastercoder

I have 15 miRNA seq, around 8MB each.

Those are some huge microRNAs!

Seriously, though, can you clarify a bit? Do you mean you have 15 files, 8MB each, gzip-compressed fastqs of single-ended 50bp miRNA reads, for example - and if not, what exactly do you have? And when you say you tried X, Y, and Z, what were your command lines, what did they print to the screen, and what was the output? Also, what's your experiment?

**mastercoder** · 01-04-2017, 03:07 AM

Originally posted by Brian Bushnell

Those are some huge microRNAs!

Seriously, though, can you clarify a bit? Do you mean you have 15 files, 8MB each, gzip-compressed fastqs of single-ended 50bp miRNA reads, for example - and if not, what exactly do you have? And when you say you tried X, Y, and Z, what were your command lines, what did they print to the screen, and what was the output? Also, what's your experiment?

First, thanks for replying. I'll start with your last question.

I have 15 paired-end miRNA seqs with 29bp reads. First I ran Velvet on the trimmed data, then SSPACE. The scaffolds for each range from 2MB to 8MB, depending on the k-mer I used for the assembly. After that, I used UGENE to merge the scaffolds into a single sequence, and I did this for each of them. What I was told to do is apply MSA to these files, get a consensus, and do the annotation on that consensus seq.
So these files are not gzip-compressed. They are .fa files.
About X, Y, and Z: when I use smaller files, they give me an MSA output (.aln), but when I try them on my actual data, they give nothing. They just keep running, even though it has been more than a week. They have not produced any output, although the programs are using my cores.

I am sorry if this does not make sense, but I am a fresh graduate and could not find anybody to give me a lead.

**GenoMax** · 01-04-2017, 04:56 AM

@mastercoder: This is not making sense. miRNAs are inherently small. Why are you trying to assemble them?

What did you start this analysis with? What is the aim of the experiment?

**mastercoder** · 01-04-2017, 05:09 AM

Originally posted by GenoMax

@mastercoder: This is not making sense. miRNAs are inherently small. Why are you trying to assemble them?

What did you start this analysis with? What is the aim of the experiment?

The trimmed data for these are really huge, as you can see in the picture.

**GenoMax** · 01-04-2017, 05:23 AM

There is no doubt there are lots of reads.

But what experiment are they from? miRNA sequencing? Are you trying to identify how many miRNAs (known?) there are in the samples? What is the point of doing an MSA?

**mastercoder** · 01-04-2017, 06:08 AM

Originally posted by GenoMax

There is no doubt there are lots of reads.

But what experiment are they from? miRNA sequencing? Are you trying to identify how many miRNAs (known?) there are in the samples? What is the point of doing an MSA?

There is a treatment and a control group. Each has 15 sequences from rats. What I was told is to find known and novel miRNAs. So I thought I could assemble, get the scaffolds, merge them into a single sequence, and apply MSA so I can get a consensus sequence from both groups. Then I would do the annotation on the consensus sequence, instead of doing it one by one.

Is this all wrong?

**GenoMax** · 01-04-2017, 08:19 AM

Originally posted by mastercoder

Each has 15 sequences from rats.

This is not making sense. Did you mean to say that you are only interested in 15 genes/regions?

Originally posted by mastercoder

What I was told is to find known and novel miRNAs.

The first part can be done by aligning against miRBase data. No need to do any assembly (in fact, that may give you some odd results). For the novel discovery part you can look for software that can do that. Here is one example.

Originally posted by mastercoder

So I thought I could assemble, get the scaffolds, merge them into a single sequence, and apply MSA so I can get a consensus sequence from both groups. Then I would do the annotation on the consensus sequence, instead of doing it one by one.

This part is not making much sense. You need to ask whoever asked you to do this for further clarification.

**mastercoder** · 01-04-2017, 10:40 AM

@GenoMax
No, what I mean is that I have 15 miRNA sequences from 15 control rats, and another 15 miRNA sequences from 15 treated rats. That is why I was trying to get a consensus sequence from each group. So should I try to align these sequences against the miRBase data one by one?

**GenoMax** · 01-04-2017, 10:58 AM

Ah. So you have 15 sequence files (not literally 15 sequences) each for control and treatment. Is that correct?

If that is the case, then you can align each of them against miRBase (not sure if you only want the rat subset from there) to identify reads that align to known miRNAs. The ones that don't align to miRBase could then go into other software to look for novel ones.
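
As a rough illustration of that two-step split, here is a minimal Python sketch. It assumes a miRBase-style mature FASTA in which each header starts with a species-prefixed name (`rno-` for rat) and the sequences are written as RNA (U instead of T). The function names `load_mature_mirnas` and `split_reads` are made up for this example, and exact string matching is only a stand-in for a real short-read aligner, which would also tolerate mismatches and length variants.

```python
from collections import Counter


def load_mature_mirnas(fasta_lines, species_prefix="rno-"):
    """Parse mature-miRNA FASTA lines, keeping one species and writing U as T."""
    known = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumed header layout: ">rno-... accession description"
            parts = line[1:].split()
            candidate = parts[0] if parts else ""
            name = candidate if candidate.startswith(species_prefix) else None
        elif name is not None:
            # The reference is RNA (U); sequencing reads are reported as DNA (T).
            known[name] = known.get(name, "") + line.upper().replace("U", "T")
    return known


def split_reads(reads, known):
    """Count reads identical to a known miRNA; return the rest separately."""
    # If two names share one sequence, only the last name is kept here.
    seq_to_name = {seq: name for name, seq in known.items()}
    counts = Counter()
    unmatched = []
    for read in reads:
        name = seq_to_name.get(read.upper())
        if name is None:
            unmatched.append(read)  # candidates for novel-miRNA software
        else:
            counts[name] += 1
    return counts, unmatched


# Toy records, not real miRBase entries.
toy_reference = [
    ">rno-toy-1 TOY0001 made-up record",
    "UGGAAUGUAAAG",
    ">hsa-toy-2 TOY0002 made-up record",
    "ACGUACGUACGU",
]
known = load_mature_mirnas(toy_reference)
counts, unmatched = split_reads(["TGGAATGTAAAG", "TGGAATGTAAAG", "CCCCCCCC"], known)
print(dict(counts))
print(unmatched)
```

The per-sample counts from the first step are what would later be compared between control and treatment; the unmatched reads are the input for the novel-discovery step.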

**mastercoder** · 01-04-2017, 11:04 AM

Originally posted by GenoMax

Ah. So you have 15 sequence files (not literally 15 sequences) each for control and treatment. Is that correct?

If that is the case, then you can align each of them against miRBase (not sure if you only want the rat subset from there) to identify reads that align to known miRNAs. The ones that don't align to miRBase could then go into other software to look for novel ones.

GenoMax, I am really thankful to you. Sorry to make you struggle a bit. Two last questions, please bear with me. Should I align against miRBase with my trimmed data or with the assembled ones (scaffolds)? Lastly, is there any article, source, or other keywords that you can give me?

**GenoMax** · 01-04-2017, 11:09 AM

Originally posted by mastercoder

GenoMax, I am really thankful to you. Sorry to make you struggle a bit. Two last questions, please bear with me. Should I align against miRBase with my trimmed data or with the assembled ones (scaffolds)? Lastly, is there any article, source, or other keywords that you can give me?

Happy to help.

You should use the trimmed data (hopefully it was correctly trimmed; what program did you use for that?). If this was a pure miRNA prep, then the assembled data makes no sense, since most of your miRNAs should be shorter than the length of one read (how long were they?).
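
A quick way to sanity-check the trimming is to look at the read-length distribution. The sketch below is a minimal Python example that assumes plain four-line FASTQ records (no wrapped sequence lines); `read_length_histogram` is a name invented here. Mature miRNAs are roughly 22 nt long, so properly trimmed miRNA reads would be expected to pile up near that length, whereas reads that are all exactly the full read length may mean the adapter is still attached.

```python
from collections import Counter


def read_length_histogram(fastq_lines):
    """Count sequence lengths, assuming four lines per FASTQ record."""
    lengths = Counter()
    for index, line in enumerate(fastq_lines):
        if index % 4 == 1:  # second line of each record holds the sequence
            lengths[len(line.strip())] += 1
    return lengths


# Toy records for illustration only.
toy_fastq = [
    "@read1", "TGGAATGTAAAGAAGTGTGTAT", "+", "I" * 22,
    "@read2", "TGGAATGTAAAGAAGTGTGTAT", "+", "I" * 22,
    "@read3", "ACGTACGTACGTACGTACGTACGTACGTA", "+", "I" * 29,
]
for length, count in sorted(read_length_histogram(toy_fastq).items()):
    print(length, count)
```

For a real file, the same function can be fed an open file handle, for example `read_length_histogram(open(path))`, where `path` is whichever trimmed FASTQ file is being checked.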

A review like this may be of help.

**mastercoder** · 01-04-2017, 11:24 AM

Originally posted by GenoMax

Happy to help.

You should use the trimmed data (hopefully it was correctly trimmed; what program did you use for that?). If this was a pure miRNA prep, then the assembled data makes no sense, since most of your miRNAs should be shorter than the length of one read (how long were they?).

A review like this may be of help.

Trimmed data was provided by the company that did the sequencing. Below is the info.

And secondly, the trimmed data has 2 files for each sample; I think this is because they are paired-end. That is why I tried to assemble them.

**GenoMax** · 01-04-2017, 11:57 AM

The two files are most likely paired-end sequencing data (as described here).

You only have 29 bp reads (if that info is correct). Do you know what the fragment size was for this library?
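
If it helps to confirm that the two files per sample really are mates, one simple check is whether the read names line up record by record. The Python sketch below assumes four-line FASTQ records and read names that either match exactly up to the first space or differ only by a trailing `/1` or `/2`; both are common conventions but are assumptions here, and `read_id` and `mates_in_sync` are hypothetical helper names.

```python
def read_id(header):
    """Return the read name from a FASTQ header, without a /1 or /2 suffix."""
    parts = header.strip().split()
    token = parts[0] if parts else ""
    if token.startswith("@"):
        token = token[1:]
    if token.endswith(("/1", "/2")):
        token = token[:-2]
    return token


def mates_in_sync(r1_lines, r2_lines):
    """True if both files list the same read names in the same order."""
    # Every fourth line, starting at the first, is a header line.
    ids1 = [read_id(header) for header in r1_lines[0::4]]
    ids2 = [read_id(header) for header in r2_lines[0::4]]
    return ids1 == ids2


# Toy records for illustration only.
r1 = ["@frag1/1", "ACGT", "+", "IIII", "@frag2/1", "GGCC", "+", "IIII"]
r2 = ["@frag1/2", "TTAA", "+", "IIII", "@frag2/2", "CCGG", "+", "IIII"]
print(mates_in_sync(r1, r2))
```

Both arguments are lists of lines, so a real file would be read with something like `open(path).readlines()` first.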

**mastercoder** · 01-04-2017, 12:08 PM

Originally posted by GenoMax

The two files are most likely paired-end sequencing data (as described here).

You only have 29 bp reads (if that info is correct). Do you know what the fragment size was for this library?

Nope, that is not written in the report.

MSA on large scale of sequences?

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News