  • swbarnes2
    replied
    blastx takes, as input, nucleotide sequences, and outputs matches to protein sequence. It handles the "translate in all 6 frames" thing.



  • morning latte
    replied
    Dear Brian Bushnell,

    Sorry for the late response, and thank you for your help. It is always very helpful.



  • Brian Bushnell
    replied
    Originally posted by morning latte
    Thank you Brian Bushnell for your explanation.

    Could you explain a bit more what you meant by "BLAST can be quite useful on nucleotide searches when you are looking for all-six-frames amino acid identity rather than nucleotide identity"? That part is not very clear to me. Thanks a lot in advance!
    Unfortunately, I'm not a blast expert, but at least one of its many versions allows you to map nucleotide sequences to protein databases by translating the nucleotides into amino acids. This has to be done 6 times because there are 6 possible reading frames of the sequence. Protein alignment can be more sensitive than nucleotide alignment because amino acids are more conserved, since nucleotides can sometimes change but still code for the same amino acid. This is useful when looking for cross-species homologies.
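To make the six-reading-frames idea concrete, here is a minimal Python sketch of translating a nucleotide sequence in all six frames (three offsets on each strand). It is an illustration only, not how BLAST implements it: it assumes the standard genetic code, an alphabet of A/C/G/T only, and the function names are made up for this example. It also shows that a synonymous nucleotide change (GCC to GCT, both alanine) leaves the protein unchanged.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translations("ATGGCCTAA").items():
        print(label, protein)
    # Synonymous change: GCC -> GCT differs at the nucleotide level only.
    print(translate("ATGGCCTAA") == translate("ATGGCTTAA"))
```

The two sequences in the last line differ by one nucleotide but give the same amino acids, which is why a protein-level comparison can still find a match where the nucleotide-level comparison has drifted.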



  • morning latte
    replied
    Thank you LeightonP for suggesting very useful links. I really appreciate it.



  • morning latte
    replied
    Thank you Brian Bushnell for your explanation.

    Could you explain a bit more what you meant by "BLAST can be quite useful on nucleotide searches when you are looking for all-six-frames amino acid identity rather than nucleotide identity"? That part is not very clear to me. Thanks a lot in advance!



  • LeightonP
    replied
    Originally posted by morning latte
    There were many hits with an alignment length of 30 bp and 100% identity. To me, a 30 bp alignment is not long enough to count as 100% identity.
    If the aligned region contains only matches, then the report of 100% identity for the aligned region will be correct.
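The distinction here is between identity over the aligned region and how much of the query that region covers. A small Python sketch, with hypothetical function names and coordinates chosen to mirror the 30 bp hit on a 10 kbp contig from the question:

```python
def percent_identity(matches, alignment_length):
    """Percent of aligned columns that are identical."""
    return 100.0 * matches / alignment_length


def query_coverage(qstart, qend, query_length):
    """Percent of the query spanned by the alignment (1-based, inclusive)."""
    return 100.0 * (abs(qend - qstart) + 1) / query_length


if __name__ == "__main__":
    # Hypothetical hit: 30 matching bases at query positions 501-530
    # of a 10,000 bp contig.
    print(percent_identity(30, 30))
    print(query_coverage(501, 530, 10_000))
```

A hit like this is 100% identical over the aligned region while covering well under 1% of the query, so both statements can be true at once.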

    Originally posted by morning latte
    So my question is: should I stop relying only on e-value and set a cutoff for alignment length? Or is such a short alignment length common?
    Whether you should rely on E-value depends on your biological question. It is not always the most appropriate measure.

    Further to Brian's comment, BLAST is a tool for querying databases (or other sequences/collections of sequences) that uses local sequence alignment. The E-value is a measure that reflects the expected number of returned matches of similar quality from the same database that would occur by chance alone (see, e.g. http://www.ncbi.nlm.nih.gov/blast/Bl...YPE=FAQ#expect). Importantly, the returned E-value for a match varies with database size, even if the alignment between the two sequences does not change. This may or may not be important for your biological question (or depending on how you 'ask' the same biological question).
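The dependence on database size can be seen from the Karlin-Altschul relation E = m * n * 2^(-S'), where m is the query length, n is the total database length and S' is the bit score. The Python sketch below is a simplification: it ignores the effective-length corrections that BLAST applies, and the bit score and sizes are hypothetical values picked for illustration.

```python
def expected_hits(bit_score, query_len, db_len):
    """E = m * n * 2**(-S'), without BLAST's effective-length corrections."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical numbers: the same alignment (same bit score) searched
# against a small and a large database.
bit_score = 50.0
query_len = 10_000
small_db = 10**6
large_db = 10**9

e_small = expected_hits(bit_score, query_len, small_db)
e_large = expected_hits(bit_score, query_len, large_db)

if __name__ == "__main__":
    print(f"E-value, small database: {e_small:.3g}")
    print(f"E-value, large database: {e_large:.3g}")
    print(f"ratio: {e_large / e_small:.0f}")
```

Under this simplified model the E-value scales linearly with database length, so a database 1000 times larger gives an E-value 1000 times larger for the identical alignment.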

    You might like to have a closer look at this description of BLAST statistics: http://www.ncbi.nlm.nih.gov/BLAST/tu...ltschul-1.html, or invest in Ian Korf's excellent book on BLAST, even though it could do with an update to cover BLAST+ these days.
    Last edited by LeightonP; 04-16-2014, 11:45 PM.



  • Brian Bushnell
    replied
    Well... BLAST is a local aligner. I put little faith in percent identity, or in statistical models that assume random genome composition and can give you a 10^-50 probability of error on each of 100 different organisms. A global aligner will help avoid spurious hits. Certainly an alignment length cutoff would be useful; many of my fellow researchers ignore any BLAST hit under 200 bp. I'm not saying I personally recommend that, but they have more experience with BLAST than I do.

    That said, BLAST can be quite useful on nucleotide searches when you are looking for all-six-frames amino acid identity rather than nucleotide identity.
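An alignment length cutoff like the one mentioned above can be applied after the search. The Python sketch below assumes BLAST+ tabular output (`-outfmt 6`) with its default twelve columns. The function name, the thresholds and the two example rows are made up for illustration, and the 200 bp default simply echoes the figure quoted in this post rather than a recommendation.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
OUTFMT6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(handle, min_length=200, max_evalue=1e-5):
    """Keep hits that pass both an alignment-length and an E-value cutoff."""
    reader = csv.DictReader(handle, fieldnames=OUTFMT6_COLUMNS, delimiter="\t")
    return [
        row for row in reader
        if int(row["length"]) >= min_length
        and float(row["evalue"]) <= max_evalue
    ]


if __name__ == "__main__":
    # Made-up rows: a short perfect hit and a longer, less identical hit.
    example = io.StringIO(
        "contig1\tvirusA\t100.00\t30\t0\t0\t501\t530\t1200\t1229\t2e-08\t56.5\n"
        "contig1\tvirusB\t92.50\t400\t30\t0\t1000\t1399\t50\t449\t1e-150\t540\n"
    )
    for hit in filter_hits(example):
        print(hit["qseqid"], hit["sseqid"], hit["length"], hit["pident"])
```

With these cutoffs the 30 bp hit is dropped even though its identity is 100% and its E-value passes, while the 400 bp hit is kept.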
    Last edited by Brian Bushnell; 04-16-2014, 07:02 PM.



  • morning latte
    started a topic BLAST alignment length

    BLAST alignment length

    Dear experts,

    I am working on metagenomic datasets generated on an Illumina HiSeq. I normally use BLAST (either BLASTn or BLASTx) to annotate assembled contigs with an e-value cutoff. I have relied only on e-value to select hits and haven't looked carefully at the other parameters. Today I extracted some of the hits with 100% identity. When I looked at the alignment length between the query sequence (over 10 kbp) and the reference sequence (viral genomes), it was much shorter than I expected. There were many hits with an alignment length of 30 bp and 100% identity. To me, a 30 bp alignment is not long enough to count as 100% identity. So my question is: should I stop relying only on e-value and set a cutoff for alignment length? Or is such a short alignment length common? Thank you in advance for your help.
