Originally posted by saberdsl: Hello nilshomer, can you tell me whether the masks work well with 35 bp SOLiD reads?
Let us know how it goes. -drd
-
Does anyone have recommended masks for mouse mm9 SOLiD 50 bp reads?
-
The supplementary Word doc is quite helpful in understanding BFAST.
To quote from it:
In response to WHAT:
Instead of indexing the location of k-mer words in the genome, we generalize this concept to indexing the start positions of k-letter substrings that are obtained from a mask, which is slid along the reference genome at one base shifts to generate the index data. This is similar to spaced seeds introduced previously in homology search programs. For example, the letter selection mask suggested by the bit-pattern 0011001010, directly applied to the sequence "AAGATTACAG", selects the letter key "GAAA".
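To make the quoted example concrete, here is a minimal Python sketch of applying a mask and sliding it along a reference. This is only an illustration of the idea, not BFAST's actual implementation; the function names `apply_mask` and `build_index` are hypothetical.

```python
from collections import defaultdict


def apply_mask(mask, seq):
    """Return the letters of seq at the positions where mask has a '1'."""
    if len(seq) < len(mask):
        raise ValueError("sequence is shorter than the mask")
    return "".join(base for bit, base in zip(mask, seq) if bit == "1")


def build_index(mask, reference):
    """Slide the mask along the reference one base at a time and
    record, for each masked key, the start positions where it occurs."""
    index = defaultdict(list)
    for start in range(len(reference) - len(mask) + 1):
        window = reference[start:start + len(mask)]
        index[apply_mask(mask, window)].append(start)
    return dict(index)


# The example from the supplementary doc: selects positions 3, 4, 7 and 9.
print(apply_mask("0011001010", "AAGATTACAG"))  # GAAA
```

A read is then looked up by applying the same mask to the read and fetching the start positions stored under the resulting key.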
In response to WHY:
It is a way of indexing the reference genome to speed up lookups.
If you are asking why we need more than one:
Greater accuracy is to be achieved by using multiple indexes based on different masks to define the index keys, but keeping the number of letters in the key, k, large for uniqueness. Avoid using shorter keys (reducing k) to obtain accuracy, which results in exponential growth in spurious candidate locations.
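The "exponential growth" point can be illustrated with a rough back-of-the-envelope calculation. The sketch below assumes a uniformly random genome (real genomes are repetitive, so this is only a lower-bound intuition) and approximates the number of indexed positions by the genome size; `expected_random_hits` is a hypothetical helper, not part of BFAST.

```python
def expected_random_hits(genome_size, k):
    """Expected number of chance occurrences of one k-letter key,
    assuming every base is independent and uniformly distributed
    (a simplifying assumption, not a property of real genomes)."""
    return genome_size / 4 ** k


# Each letter removed from the key multiplies the chance hits by 4.
for k in (18, 16, 14, 12, 10):
    hits = expected_random_hits(3_000_000_000, k)
    print(f"k={k:2d}  expected chance hits per key: {hits:.3g}")
```

Under this model, dropping two letters from the key gives 16 times as many spurious candidate locations per lookup, which is why the quote recommends adding more masks instead of shortening k.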
-
Thank you, Kevin!
I have one more question for the forum. Is there a way to decipher the 4th row of a fastq file? By 4th row, I mean the fastq version of the phred-like values found in a *.qual file. I would like to parse the 4th row but I don't understand what each ` or ! or ? means other than that it is some mysterious code for quality value digits. Thank you for your reply!
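For anyone with the same question, a minimal sketch of the usual decoding, assuming the standard Phred convention: each character in the 4th line encodes one quality score as its ASCII code minus an offset. The offset is 33 for Sanger-style FASTQ and 64 for some older Illumina pipelines; which one applies to a given file is an assumption you need to check. The function names here are hypothetical.

```python
def decode_quality(quality_line, offset=33):
    """Convert a FASTQ quality string into a list of integer scores.

    offset=33 is the Sanger convention; pass offset=64 for files
    written by older Illumina pipelines (check your data's source).
    """
    scores = [ord(ch) - offset for ch in quality_line]
    if any(q < 0 for q in scores):
        raise ValueError("negative score: the offset is probably wrong")
    return scores


def error_probability(q):
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


print(decode_quality("!?`"))              # [0, 30, 63]
print(decode_quality("`", offset=64))     # [32]
```

A `!` can only occur with offset 33 (its ASCII code is 33), so seeing one is a good hint about which convention the file uses.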
-
Originally posted by elinor: I have a question regarding the "masks" in the index. What are the masks and why do we need them? Thanks for the response!
-Harold
-
Nils - I've been reading through all your documentation and it's really great -- thanks! However, I'm struggling to figure out how to build indices for sacCer, genome size ~ 12Mb. It would seem like a key size of 16 would be about right. I've looked at btestindexes, which is what I think should be used to generate an appropriate index set. Your post says they should be generated with btestindexes using "recommended" settings but I can't figure out what those should be. Suggestions?
-
Originally posted by abattenhouse: Nils - These are 36 bp reads. I have both SOLiD and Illumina data. Should I use the "25 bp" SOLiD mask set from your SOM? Also, it would be nice to know how to use btestindexes so alternative index sets could be generated and compared. Thanks, Anna
-
Nils - I've just tried the 25 bp SOLiD mask set and I'm getting a lot of false positives, as determined by alignments showing up in a deleted gene. These reads don't show up in a BWA alignment of the same data. So I think I need a set of BFAST masks with a larger key size. I'm pretty sure I've looked everywhere for more info on btestindexes with no luck (although I've been reading so much stuff the last few days that my head is about to explode). Thanks, Anna