  • BFAST and miRNA precursor reference

    Dear users,
    I'm using BFAST to align miRNA reads from an ABI SOLiD run to a precursor miRNA reference.

    There is one thing I can't understand.

    When I use the miRNA precursor reference as is, for example:
    >hsa-mir-548d-1 MI0003668 Homo sapiens miR-548d-1 stem-loop
    AAACAAGUUAUAUUAGGUUGGUGCAAAAGUAAUUGUGGUUUUUGCCUGUAAAAGUAAUGG
    CAAAAACCACAGUUUCUUUUGCACCAGACUAAUAAAG
    >hsa-mir-661 MI0003669 Homo sapiens miR-661 stem-loop
    GGAGAGGCUGUGCUGUGGGGCAGGCGCAGGCCUGAGCCCUGGUUUCGGGCUGCCUGGGUC
    UCUGGCCUGCGCGUGACUUUGGGGUGGCU
    ...
    ...

    (extracted from miRBase v13)
    the program returns an error at the localalign step, saying that the read and reference don't match.

    If I use the same reference but add 35 Ns at the beginning and at the end of each miRNA sequence, for example:

    >hsa-mir-1277
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNACCTCCCAAATATATATATATATGTACGTATGTGTATATAAATGTATACGTAGATATATATGTATTTTTGGTGGGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    >hsa-mir-1278
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNATTTGCTCATAGATGATATGCATAGTACTCCCAGAACTCATTAAGTTGGTAGTACTGTGCATATCATCTATGAGCGAATAGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    ...
    ...

    the program runs and I can complete my alignment.
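
    The padding workaround described above can be sketched roughly as follows. This is only an illustration: the helper names `read_fasta` and `pad_records` are hypothetical, the record is a let-7a-2 precursor sequence split over two lines, and the script reproduces the workaround without explaining why the unpadded reference fails.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs, joining wrapped sequence lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def pad_records(records, pad=35):
    """Yield (header, sequence) with `pad` Ns added to both ends."""
    flank = "N" * pad
    for header, seq in records:
        yield header, flank + seq + flank


# Illustrative input: one precursor record wrapped over two lines.
example = [
    ">hsa-let-7a-2",
    "AGGTTGAGGTAGTAGGTTGTATAGTTTAGAATTACAT",
    "CAAGGGAGATAACTGTACAGCCTCCTAGCTTTCCT",
]

padded = list(pad_records(read_fasta(example)))
for header, seq in padded:
    print(header)
    print(seq)
```

    The padded sequence is written on a single line per record, as in the padded example above.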

    So I don't know how to explain this behaviour!

    I then compared the number of counts found by BFAST (about 75000) with the number found by the RNA_pipeline of Corona Lite (ABI) (about 22000).
    How can I explain such a large difference?
    Thank you very much for the help!

    Maria Elena

  • #2
    What version are you using? What post processing options are you using?


    • #3
      Thanks Dr. Homer,
      I'm using bfast-0.6.0d

      The options and parameters I use, after the fasta2brg and index steps, are:

      > bfast match -f hsa_human.fa -r file.fastq -A 1 > file.bmf

      > bfast localalign -f hsa_human.fa -m file.bmf -A 1 > file.baf

      > bfast postprocess -f hsa_human.fasta -a 3 -O 3 -i file.baf -A 1 > output.sam

      With the FASTA file without the N padding, the program crashes at the localalign step.


      • #4
        Only valid DNA bases are allowed as well as N (so only ACGTN). It looks like you have Us in the reference for the miRNA. Convert those to Ts in your reference.
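
        A minimal sketch of that conversion, assuming a plain FASTA file with `>` header lines. The function names `rna_fasta_to_dna` and `invalid_bases` are hypothetical, and the record is one of the miRBase entries quoted earlier in the thread. The check against ACGTN is case-insensitive here, which is an assumption of this sketch rather than documented BFAST behaviour.

```python
import io

VALID_BASES = set("ACGTN")
U_TO_T = str.maketrans("Uu", "Tt")


def rna_fasta_to_dna(lines):
    """Yield FASTA lines with U converted to T in sequence lines only."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield line  # leave header lines untouched
        else:
            yield line.translate(U_TO_T)


def invalid_bases(lines):
    """Return the set of sequence characters that are not A, C, G, T or N."""
    bad = set()
    for line in lines:
        if not line.startswith(">"):
            bad |= set(line.strip().upper()) - VALID_BASES
    return bad


example = """>hsa-mir-661 MI0003669 Homo sapiens miR-661 stem-loop
GGAGAGGCUGUGCUGUGGGGCAGGCGCAGGCCUGAGCCCUGGUUUCGGGCUGCCUGGGUC
UCUGGCCUGCGCGUGACUUUGGGGUGGCU
"""

converted = list(rna_fasta_to_dna(io.StringIO(example)))
print("\n".join(converted))
print("invalid bases before:", invalid_bases(io.StringIO(example)))
print("invalid bases after:", invalid_bases(converted))
```

        Running `invalid_bases` on a reference before indexing it is a quick way to confirm whether any character outside ACGTN is present.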


        • #5
          Dr. Homer,
          there are no Us in my reference; it contains only DNA bases. For example:

          >hsa-let-7a-2
          AGGTTGAGGTAGTAGGTTGTATAGTTTAGAATTACATCAAGGGAGATAACTGTACAGCCTCCTAGCTTTCCT
          >hsa-let-7a-3
          GGGTGAGGTAGTAGGTTGTATAGTTTGGGGCTCTGCCCTGCTATGGGATAACTATACAATCTACTGTCTTTCCT

          So, I don't think that this is the problem!


          • #6
            Give me your reference and a set of reads and I will test it out myself. Thanks!

            Nils


            • #7
              Has anyone tried aligning to the mature sequences instead of the entire precursor using BFAST? I've recently jumped on the BFAST bandwagon for genomic data and am trying to find out whether miRNA analysis is feasible. I have two approaches in mind: 1. align to only the mature sequences to identify the known sequences; 2. align to the entire genomic reference and look within 100 bp upstream and downstream of the aligned position for a second hit on the reverse complement, to identify a potential loop structure.
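
              A rough sketch of the second approach, under simplifying assumptions: the reference is an in-memory string, the hit is given as a 0-based start and length, and the partner arm is required to be an exact reverse complement. Real precursors usually pair imperfectly, so the exact-match search is a stand-in for a proper alignment. The function names and the toy sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpin_partner(reference, hit_start, hit_len, flank=100):
    """Return 0-based start positions, within `flank` bp of the hit,
    where the exact reverse complement of the aligned segment occurs."""
    segment = reference[hit_start:hit_start + hit_len]
    target = reverse_complement(segment)
    win_start = max(0, hit_start - flank)
    win_end = min(len(reference), hit_start + hit_len + flank)
    window = reference[win_start:win_end]

    positions = []
    pos = window.find(target)
    while pos != -1:
        abs_pos = win_start + pos
        # Ignore candidates that overlap the original hit itself.
        if abs_pos + hit_len <= hit_start or abs_pos >= hit_start + hit_len:
            positions.append(abs_pos)
        pos = window.find(target, pos + 1)
    return positions


# Toy reference: one arm, a short loop, then the reverse-complement arm.
toy_reference = "TTTT" + "ACGGTCA" + "GGGAAA" + "TGACCGT" + "CCCC"
partners = find_hairpin_partner(toy_reference, hit_start=4, hit_len=7)
print(partners)
```

              With alignments to a real genome, the hit coordinates would come from the aligner output, and the exact-match step would need to be replaced by something tolerant of mismatches and bulges.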
