  • Mapping DNA reads to transcripts

    Dear SEQanswers Community,
    Can anyone recommend a program for mapping DNA reads to transcripts? I have ~500 transcripts as a reference. I have DNA reads (Illumina 100 bp paired-end) from an exon-targeted sequence capture enrichment of genomic DNA that I want to map to these transcripts. The reads should contain the target exons, but will also likely have intron sequence on their ends. I think I need a program that will keep and map a read if, for example, 50 bp of it matches (i.e., the exon), and then clip off the rest of the read where it runs past an exon-exon boundary of my transcript.
    Any suggestions?
    Thanks!

  • #2
    BBMap with the right reference (how are your transcript sequences formatted?)

    • #3
      The transcripts are in a fasta file.

      • #4
        Are the exons in separate fasta sequences?

        • #5
          I don't have a genome, only a de novo transcriptome, so I only have exons predicted from distantly related species, but yes, I can put them in a separate fasta file.

          • #6
            They don't need to be in a separate fasta file. A single multi-fasta file is fine for creating the BBMap index. I was asking because you are interested in clipping reads that map at the ends of the exons.

            • #7
              Are you suggesting I map to the predicted exons instead of the known transcripts?

              • #8
                Sorry, perhaps I am misunderstanding something. Can we do a recap?

                You have ~500 transcripts. Are they known or predicted? What format are they in (one sequence per transcript, or several independent exons per transcript in a multi-fasta)? And you have a set of PE reads that you want to map to this set.

                • #9
                  Sorry if I have been confusing. The 500 transcripts are pulled out from a de novo assembled transcriptome. Each of the 500 transcripts is represented by one sequence. I currently have all of the transcripts in a single fasta file.

                  I have also predicted the exons for each transcript by mapping them to a genome from a related species. I have put all of these exon sequences into a separate fasta file so I could try that as a reference too.

                  Does that make sense?

                  • #10
                    I suggest you map to the exons in this case, and use the "local" flag so that reads hanging off the ends will not be penalized. But it really depends on what you are trying to accomplish, and on the quality of the exon set compared to the transcriptome. Do you want coverage information? What do you want to learn from the mapping?
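
A minimal sketch of the suggestion in #10, assuming BBMap is installed and `bbmap.sh` is on the PATH. The helper `bbmap_local_command` and all file names are hypothetical. It only assembles the command line (it does not run the mapper), using the `ref=`, `in=`, `in2=`, `out=` and `local=t` parameters of `bbmap.sh`.

```python
from typing import List, Optional


def bbmap_local_command(ref_fasta: str, reads1: str, out_sam: str,
                        reads2: Optional[str] = None) -> List[str]:
    """Build (but do not run) a bbmap.sh command line using local alignment."""
    cmd = [
        "bbmap.sh",
        f"ref={ref_fasta}",  # multi-fasta of exons (hypothetical file name)
        f"in={reads1}",      # first (or only) read file
        f"out={out_sam}",    # alignments written as SAM
        "local=t",           # soft-clip read ends that hang off the reference
    ]
    if reads2 is not None:
        # Second mate file for paired-end input, placed right after in=
        cmd.insert(3, f"in2={reads2}")
    return cmd


cmd = bbmap_local_command("exons.fa", "reads_R1.fq.gz", "mapped.sam",
                          reads2="reads_R2.fq.gz")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the file names point at real data.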

                    • #11
                      Ultimately I want to call SNPs, but I would also like to get coverage information to understand the effectiveness of my probes for particular genes. I see the transcripts as being more reliable, as the exons are predicted based on other species.
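
For the coverage question in #11, per-target depth can be estimated from the SAM `POS` and `CIGAR` fields. The following is a rough sketch, not a replacement for a dedicated coverage tool. The function names and the toy alignments are made up for illustration, and reference lengths are assumed to be positive. Soft-clipped bases (`S`), such as intron overhang clipped by a local alignment, are not counted as coverage.

```python
import re
from typing import Dict, Iterable, List, Tuple

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def add_alignment(depth: List[int], pos: int, cigar: str) -> None:
    """Add one alignment to a per-base depth array (pos is 1-based, as in SAM)."""
    ref_i = pos - 1
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":
            # Aligned bases: count them and advance along the reference
            for i in range(max(ref_i, 0), min(ref_i + n, len(depth))):
                depth[i] += 1
            ref_i += n
        elif op in "DN":
            # Deletions/skips advance along the reference without adding depth
            ref_i += n
        # I, S, H and P do not consume reference bases


def coverage_summary(alignments: Iterable[Tuple[str, int, str]],
                     ref_lengths: Dict[str, int]) -> Dict[str, Tuple[float, float]]:
    """Return {reference: (mean depth, fraction of bases covered)}."""
    depth = {name: [0] * length for name, length in ref_lengths.items()}
    for ref, pos, cigar in alignments:
        add_alignment(depth[ref], pos, cigar)
    return {
        name: (sum(d) / len(d), sum(1 for x in d if x > 0) / len(d))
        for name, d in depth.items()
    }


# Toy data: two reads overlapping a 10 bp target, none on the second target
toy_alignments = [("exon_1", 1, "6M4S"), ("exon_1", 5, "3S6M")]
toy_lengths = {"exon_1": 10, "exon_2": 8}
summary = coverage_summary(toy_alignments, toy_lengths)
print(summary)
```

Targets with a mean depth near zero would be candidates for probes that did not capture well, or for references that are too divergent for reads to map.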

                      • #12
                        So the current dataset is plain DNA (not RNAseq)?

                        Go ahead and align to the transcriptome and see what fraction of reads align. If you know the average intron size, use BBMap's maxindel/intronlen settings.

                        Ultimately, you may have to do a comparative analysis of mapping to just the exons versus mapping to the whole transcriptome.
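
To "see what fraction of reads align" as suggested in #12, the SAM flags are enough. This is a small sketch with hypothetical records. It counts each read once through its primary record (skipping secondary `0x100` and supplementary `0x800` records) and treats flag bit `0x4` as unmapped, following the SAM specification.

```python
from typing import Iterable

UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def fraction_mapped(sam_lines: Iterable[str]) -> float:
    """Return the fraction of primary SAM records that are mapped."""
    total = 0
    mapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # count each read once, via its primary record
        total += 1
        if not flag & UNMAPPED:
            mapped += 1
    return mapped / total if total else 0.0


# Hypothetical records: r1 and r3 map, r2 does not, r3 also has a secondary hit
example = [
    "@SQ\tSN:transcript_1\tLN:1200",
    "r1\t0\ttranscript_1\t10\t40\t60M40S\t*\t0\t0\t*\t*",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r3\t16\ttranscript_1\t300\t40\t100M\t*\t0\t0\t*\t*",
    "r3\t256\ttranscript_1\t900\t0\t100M\t*\t0\t0\t*\t*",
]
print(f"{fraction_mapped(example):.2f}")
```

Running the same function on the exon-based and transcript-based SAM files gives one simple number for the comparison mentioned above.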

                        • #13
                          Yes, the current dataset is DNA (100 bp reads from genomic DNA that was targeted with probes developed from transcripts).

                          If I am mapping to the relatively short exons, not the transcripts, would it make sense to map the data as single reads rather than paired-end? For many of the pairs, one of the reads may not land on the same exon but may land completely on the adjoining intron.

                          • #14
                            Originally posted by tsecogen
                            If I am mapping to the relatively short exons, not the transcripts, would it make sense to map the data as single reads not paired end since for many of the pairs one of the reads may not land on the same exon but may completely land on the adjoining intron?
                            I think in this situation, where you expect the majority of alignments to be unpaired, mapping the reads as single-ended makes sense.
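
One way to map the mates as single-ended, sketched under the assumption that the two mate files are mapped in separate runs. The helper name and output file names are hypothetical. `interleaved=f` is the `bbmap.sh` setting that forces single-ended handling of the input.

```python
from typing import List


def single_end_commands(ref_fasta: str, mate_files: List[str]) -> List[List[str]]:
    """Build one single-ended bbmap.sh command per mate file (not executed)."""
    commands = []
    for i, reads in enumerate(mate_files, start=1):
        commands.append([
            "bbmap.sh",
            f"ref={ref_fasta}",
            f"in={reads}",                 # only in=, no in2=, so no pairing
            f"out=mapped_single_{i}.sam",  # hypothetical output name
            "interleaved=f",               # force single-ended mapping
            "local=t",                     # clip intron overhang at exon ends
        ])
    return commands


for command in single_end_commands("exons.fa", ["reads_R1.fq.gz", "reads_R2.fq.gz"]):
    print(" ".join(command))
```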
