Definitely yes. Is there any concern about that? Would you mind sharing? Anyway, I would like to try this approach: assemble the parental reads into scaffolds and use that assembly as the reference sequence against which I align the reads from the other two progeny. What do you think?
-
Hi there, when you say one of the samples is parental, does that mean you have two parents and two F1 samples, and you have sequenced one parent and both progeny?
Zam
-
Hi Zam & fcr,
Yup, we are not on the same team. Hehe. Papaya is diploid. I have 3 samples, and one of the samples is the parental line. I'm not sure of the depth of coverage yet, as I have not received any sequencing information from the company, but I will soon. The papaya samples were sequenced on the HiSeq platform.
-
Yes, and to explain that in more detail:
Rururura:
1. If you have one diploid sample, you can discover variants de novo using Cortex, and then use your contigs/scaffolds to assign them coordinates. This is what Fernando meant by "CoordinatesOnly", an option for Cortex's new wrapper script.
2. If you have several samples, then you can do two things:
a) You can also use the Cortex "population filter" to classify putative variants as repeat/error/polymorphism. This method is robust to reference assembly errors (it catches repeats that are missing or collapsed in the reference), and it will give you a high-quality set of variants.
b) You could use this method to look into the quality of the reference and annotate regions which you trust and do not trust.
Zam
-
Hi Zam,
Rururara is not working on the same project as me. If papaya is diploid, he could probably use the papaya scaffolds with the "Coordinates Only" option when calling with cortex_var (actually with an accompanying script called run_calls.pl). Right?
Cheers,
Fernando
-
Hi Rururara
Are you working on the same project as Fernando or a different one? If different, how many samples are you trying to discover SNPs in, what are their depths of coverage, and what technology were they sequenced with? Finally, sorry for my ignorance, but what is the ploidy of papaya?
regards
Zam
-
De novo SNP calling in absence of complete reference assembly
Hi all,
What if the reference genome is incomplete, as with papaya? The only papaya sequences available are scaffolds and contigs. Is it possible to use the papaya scaffolds as a reference to align my reads against? In my case, the objective is to discover SNPs.
-
Hi,
Yes, Zam got it right. I want to start calling SNPs now. The assembly is unfinished and polishing it is going to take time (~1000,000 scaffolds now). In response to Zam: the assembly is based on a single individual, and the estimated coverage is 60X.
The other 10 individuals have 20X coverage, and I want to use them for SNP calling and perhaps "pilot" genotype calling. I think it is worth making progress on this now, even if a second round of calling based on the finished assembly later helps to verify or reject candidate regions of interest.
lh3: Thanks for your comment about the reference bias when estimating the population statistics...I will keep that in mind.
Cheers,
Fernando
-
Just to clarify one thing (and agree with Heng): my understanding is that Fernando doesn't want to wait until his assembly is finished (I mean done/completed, not finished by manual finishers), and wants to get on with it and start calling now. That's what got me nervous about artefacts.
-
With 60X, you should be able to get an assembly decent enough for most analyses. This is true for human. Nonetheless, Zam is right that misassembly may cause artifacts. You have to live with it. If you are careful enough, you can greatly reduce the effect of that. Also beware that there will be reference bias when estimating population statistics (i.e. individuals closer to the reference will be mapped better).
-
Hi Fernando
>True, the distribution of coverage will include regions above 30x.
For example, one of the examples in our paper is SNP calling in 10 samples, each sequenced to 6x.
2. Actually, you could call on 10 individuals with much less than 256Gb of RAM. You need 256Gb to hold ALL of their genomes at the same time. But lots of the genome is either monomorphic, or doesn't consist of things Cortex can call. So you could do those 10 samples in ~80Gb of RAM (for comparison, I've just done 85 humans in 320Gb of RAM).
The trick is to call on the joint graph (1 colour, probably needs 80Gb RAM), then pull out just the variants and make a graph just of the variants. Then "multicolourise" that graph to make a 10-colour graph of the variants only, and genotype everyone in that.
This uses far less memory.
How much coverage do your 10 samples have? Is the 60x individual a different sample?
I'm not saying it is too risky with scaffolds, just that if you find something really exciting, you need to do some work making sure it's not an artefact. I've seen people have to work very hard to avoid problems with the chimp genome.
best
Zam
-
Hi Zam,
Thanks a lot.
Cortex:
1. True, the distribution of coverage will include regions above 30x.
2. What are the computational needs for 10 individuals with a 2.9 Gbp genome? You stated "10 humans on a 256Gb RAM server". How long does this take? Would it be possible to call SNPs with less RAM?
What to do:
This is a 60X coverage genome. I would assume that many of the scaffolds are bona fide and that many of the changes (from adding more libraries) are going to affect mainly the connections among scaffolds rather than disrupt them... but I might be wrong and shouldn't guess. The main interests are: 1. to develop a genome-wide set of markers, and 2. to make some population inferences by estimating Fst, Pi and Ne.
So do you think it is too risky to use scaffolds?
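For readers unfamiliar with the statistics named above, here is a minimal sketch of how Pi and Fst can be computed from allele counts at biallelic SNPs. It is an illustration only: the function names are hypothetical, the Fst shown is Nei's (Ht - Hs) / Ht form for two equally sized populations, and Ne is left out because it additionally needs an assumed mutation rate. Any estimate built on calls against unfinished scaffolds inherits the artefact risk discussed in this thread.

```python
def nucleotide_diversity(alt_counts, n_chromosomes, n_sites):
    """Per-site Pi from alternate-allele counts at biallelic SNPs.

    alt_counts    -- alt-allele count at each variable site
    n_chromosomes -- chromosomes sampled (2 x individuals for a diploid)
    n_sites       -- all callable sites, variable or not
    """
    n = n_chromosomes
    total = 0.0
    for k in alt_counts:
        p = k / n
        # Expected heterozygosity with the n/(n-1) small-sample correction
        total += 2 * p * (1 - p) * n / (n - 1)
    return total / n_sites


def fst_nei(p1, p2):
    """Nei's Fst = (Ht - Hs) / Ht at one biallelic site.

    p1, p2 -- alt-allele frequencies in two equally sized populations
    """
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)                      # pooled population
    return 0.0 if ht == 0 else (ht - hs) / ht


if __name__ == "__main__":
    # 10 diploid individuals = 20 chromosomes; toy counts, not real data
    print(nucleotide_diversity([10, 2, 5], n_chromosomes=20, n_sites=1000))
    print(fst_nei(0.9, 0.1))
```

Fixed differences between populations give an Fst of 1 and identical frequencies give 0, which is a quick sanity check before running this on real calls.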
Cheers,
Fernando
-
Re Cortex:
1. You have much more than 30x coverage if you have many samples at 20x
2. It's not as simple as "you need 30x" for Cortex. But you are absolutely right that an assembly approach will be less sensitive to SNPs.
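The arithmetic behind point 1 is simply that a joint analysis sees the summed depth of every sample. A tiny sketch, using the sample numbers quoted elsewhere in this thread (the function name is hypothetical):

```python
def pooled_coverage(depths):
    """Total depth available to a joint (all-samples) analysis."""
    return sum(depths)


# 10 individuals at 20x each, as described in this thread
print(pooled_coverage([20] * 10))  # 200
```

Pooled depth helps with discovering variants across the sample set; genotyping each individual still relies on that individual's own 20x.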
Re what to do
- It depends what you want to achieve. Do you want a small, conservative set of SNPs for building a genetic map, or a big, sensitive set for some other purpose?
If you have the time, then try both methods (mapping/assembly) and compare. If you are doing population genetic studies, then experience suggests that you will need to be very careful with SNP calls based on an assembly that is not high quality, as it is easy for assembly artefacts to look like interesting scientific finds in your SNPs.
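One simple way to do the comparison suggested above is to reduce each call set to a set of (scaffold, position, ref, alt) tuples and look at the overlap. The sketch below assumes both callers emit VCF-formatted text on the same scaffold coordinates; the function names and toy records are hypothetical, and indels would need normalising before a comparison like this is meaningful.

```python
def load_sites(vcf_text):
    """Return the set of (chrom, pos, ref, alt) found in VCF-formatted text."""
    sites = set()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        for allele in alt.split(","):  # one entry per alternate allele
            sites.add((chrom, int(pos), ref, allele))
    return sites


def compare_call_sets(mapping_sites, assembly_sites):
    """Count sites shared by, and private to, the two calling approaches."""
    return {
        "shared": len(mapping_sites & assembly_sites),
        "mapping_only": len(mapping_sites - assembly_sites),
        "assembly_only": len(assembly_sites - mapping_sites),
    }


if __name__ == "__main__":
    # Toy records for illustration only
    header = "#CHROM\tPOS\tID\tREF\tALT\n"
    mapping_vcf = header + "scaffold_1\t100\t.\tA\tG\nscaffold_1\t250\t.\tC\tT\n"
    assembly_vcf = header + "scaffold_1\t100\t.\tA\tG\n"
    print(compare_call_sets(load_sites(mapping_vcf), load_sites(assembly_vcf)))
```

Sites private to one method are the natural place to start looking for assembly artefacts.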
-
I think you should map your reads to the assembly and then do SNP calling. SAMtools should in principle work, but I have not tried.
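A minimal sketch of that map-then-call workflow, written as a Python helper that only builds the command lines. Two assumptions are not from this thread: BWA as the mapper, and bcftools (where the SAMtools variant caller now lives) for the calling step. All file names are placeholders.

```python
import shlex


def mapping_and_calling_commands(assembly, reads_1, reads_2, sample):
    """Build shell command lines: index, map and sort, index BAM, call.

    BWA is an assumed choice of mapper; the post above only names
    SAMtools. File names are placeholders supplied by the caller.
    """
    q = shlex.quote
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    return [
        # Index the draft assembly (scaffolds/contigs) for mapping
        f"bwa index {q(assembly)}",
        # Map paired-end reads and coordinate-sort the alignments
        f"bwa mem {q(assembly)} {q(reads_1)} {q(reads_2)}"
        f" | samtools sort -o {q(bam)} -",
        f"samtools index {q(bam)}",
        # Pile up against the assembly and call variant sites only
        f"bcftools mpileup -f {q(assembly)} {q(bam)}"
        f" | bcftools call -mv -Oz -o {q(vcf)}",
    ]


if __name__ == "__main__":
    for cmd in mapping_and_calling_commands(
        "scaffolds.fa", "sample1_R1.fq.gz", "sample1_R2.fq.gz", "sample1"
    ):
        print(cmd)
```

Printing the commands first makes it easy to inspect them before running each one, for example with `subprocess.run(cmd, shell=True, check=True)`.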