Left and right are not complements of each other. They match, and that is how they overlap. For de novo assembly, try a program such as SOAPdenovo (http://soap.genomics.org.cn/soapdenovo.html) or SPAdes (http://bioinf.spbau.ru/en/). Are your sequences from a single cell, or from a metagenome?
-
No. I have Illumina paired-end reads, and I want to do a de novo assembly without a reference genome.
How does the right portion of one read overlap with the left portion of another read? (They are not in the same sense.)
Do you have an example?
-
The assembly algorithm will depend on whether you have a reference genome or not. Is it metagenome sequencing?
-
How do I make an assembly from paired-end reads (two files)?
Could you give an example, please?
Thanks.
-
No. If you have a fragment of DNA: ATCGTTGAGCAGACT,
your R1: TAGCAA
and R2: GTCTGA
-
For example:
we have a fragment of DNA: ATCGTTGAGCAGACT
We sequence this fragment and, after paired-end sequencing, we have reads R1 and R2 of length 6. Thus:
R1 = ATCGTT
R2 = AGTCTG (reverse complement)
So after sequencing we have R1 ..... R2 (the middle part is unknown).
Those are the paired-end reads.
Or not?
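The example above can be reproduced with a short Python sketch. It assumes the common convention that both reads are written 5'->3', with R1 taken from the start of the fragment and R2 from the opposite strand at the other end. The helper names `reverse_complement` and `paired_end_reads` are made up for illustration, and the simulation is error-free.

```python
# Hypothetical helpers; a sketch of the read-pair convention assumed above.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def paired_end_reads(fragment: str, read_len: int) -> tuple[str, str]:
    """Simulate an error-free read pair from one fragment.

    R1 is the first read_len bases of the fragment; R2 is the reverse
    complement of the last read_len bases (assumed convention).
    """
    if not 0 < read_len <= len(fragment):
        raise ValueError("read length must be between 1 and the fragment length")
    r1 = fragment[:read_len]
    r2 = reverse_complement(fragment[-read_len:])
    return r1, r2


r1, r2 = paired_end_reads("ATCGTTGAGCAGACT", 6)
print(r1, r2)  # ATCGTT AGTCTG
```

With a read length of 6 the middle three bases (GAG) are covered by neither read, which is the unknown middle part mentioned in the post.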
-
Then you should consider that overlap is something like:
R1-ATTGCTGTG
-----------ACACTGAAAAGT-R2
-
I am speaking of an example.
Assume that there is an overlap.
I want to create an assembly tool, but first I need to know how to detect overlaps between paired-end reads (from two files) and how to do an assembly with paired ends.
-
I don't see why
"F1.fq:
S1: R1=ATCGTTGAG
S2: R1=TGAGCAGAC" would overlap. They only match over 4 bp (TGAG). Assemblers won't combine them.
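One way to check how much two reads share at their ends is an exact suffix-prefix comparison. The sketch below is only an illustration: the function name `suffix_prefix_overlap` and the `min_len` parameter are made up here, and real assemblers use more elaborate, error-tolerant methods and require far longer overlaps than reads this short can provide.

```python
def suffix_prefix_overlap(a: str, b: str, min_len: int = 1) -> int:
    """Length of the longest suffix of a that exactly equals a prefix of b.

    Returns 0 if no exact overlap of at least min_len bases exists.
    """
    # Try the longest possible overlap first and shrink until one matches.
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


# The two reads quoted above share only their ends (TGAG).
print(suffix_prefix_overlap("ATCGTTGAG", "TGAGCAGAC"))             # 4
# With a stricter (still illustrative) minimum, the pair is rejected.
print(suffix_prefix_overlap("ATCGTTGAG", "TGAGCAGAC", min_len=5))  # 0
```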
-
For example:
we have the sequence S1: ATCGTTGAGCAGACT and the sequence S2: TGAGCAGACTTAAGTAGTTTT.
Suppose the reads sequenced from S1 are: R1 = ATCGTTGAG
R2 = AGTCTGCTC (reverse complement from the right end)
and from the second sequence: R1 = TGAGCAGAC
R2 = AAAACTACT (reverse complement from the right end)
So we have the two paired-end files:
F1.fq:
S1: R1=ATCGTTGAG
S2: R1=TGAGCAGAC
F2.fq:
S1: R2=AGTCTGCTC
S2: R2=AAAACTACT
In the assembly there is an overlap between R1(S1) and R1(S2).
In an assembly, can we also have an overlap between R1 and R2 from two different sequences?
Last edited by mido1951; 10-22-2015, 01:55 PM.
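The last question can be explored with a small Python sketch that compares one read against another both as given and reverse-complemented. The helper names are hypothetical, matching is exact, and only the suffix of the first read against the prefix of the second is checked, so this is a simplification and not how any particular assembler works.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that exactly equals a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def best_overlap(a: str, b: str) -> tuple[int, str]:
    """Try b as given and reverse-complemented; report the longer overlap."""
    same = overlap(a, b)
    flipped = overlap(a, reverse_complement(b))
    if same >= flipped:
        return same, "same strand"
    return flipped, "reverse complement"


r1_s2 = "TGAGCAGAC"  # R1 of S2, from F1.fq in the example
r2_s1 = "AGTCTGCTC"  # R2 of S1, from F2.fq in the example
print(best_overlap(r1_s2, r2_s1))  # (8, 'reverse complement')
```

In this example R1(S2) and R2(S1) share 8 bases once R2(S1) is reverse-complemented, so reads from the two files can overlap when both orientations are considered.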
-
Originally posted by mido1951:
I have Illumina paired-end data.
I want to make an assembly of these data.
But the problem is that I do not understand the two files, F1.fq and F2.fq.
Are the reads in F1.fq and the reads in F2.fq complementary or not?
For the assembly, do I have to overlap the reads within F1.fq only, or do I also have to overlap F1.fq with F2.fq?
Thank you.
@mido1951: See https://en.wikipedia.org/wiki/Shotgun_sequencing for a simple explanation of "shotgun sequencing". In the past people used Sanger sequencing for this; it has now been replaced by NGS.
R1/R2 are merely sequences from the two ends of a fragment. They do not need to be complementary (in fact, in most cases they will not be). You do not need to worry about the R1/R2 reads individually; use them as a set for assembly.
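To make "use them as a set" concrete, here is a minimal Python sketch that walks two FASTQ files in lockstep so that each R1 stays with its R2. It assumes plain four-line FASTQ records and that both files list the fragments in the same order; the function names, read names, and quality strings are made up for illustration. In practice you would simply give both files to the assembler and let it do the pairing.

```python
import io
from itertools import zip_longest


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:  # end of file (a blank line also stops the sketch)
            return
        seq = handle.readline().rstrip()
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual


def read_pairs(handle1, handle2):
    """Yield (R1 record, R2 record) pairs, assuming identical read order in both files."""
    for rec1, rec2 in zip_longest(read_fastq(handle1), read_fastq(handle2)):
        if rec1 is None or rec2 is None:
            raise ValueError("the two files contain different numbers of reads")
        yield rec1, rec2


# In-memory stand-ins for F1.fq and F2.fq (made-up names and qualities).
f1 = io.StringIO("@S1/1\nATCGTTGAG\n+\nIIIIIIIII\n@S2/1\nTGAGCAGAC\n+\nIIIIIIIII\n")
f2 = io.StringIO("@S1/2\nAGTCTGCTC\n+\nIIIIIIIII\n@S2/2\nAAAACTACT\n+\nIIIIIIIII\n")
for (name1, seq1, _), (name2, seq2, _) in read_pairs(f1, f2):
    print(name1, seq1, name2, seq2)
```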
-
I have Illumina paired-end data.
I want to make an assembly of these data.
But the problem is that I do not understand the two files, F1.fq and F2.fq.
Are the reads in F1.fq and the reads in F2.fq complementary or not?
For the assembly, do I have to overlap the reads within F1.fq only, or do I also have to overlap F1.fq with F2.fq?
Thank you.
-
What is your data from? A metagenome, a single genome?
What sequencing platform did you use? How much computing power can you use?