  • tsecogen
    Member
    • Mar 2013
    • 10

    Mapping DNA reads to transcripts

    Dear SEQanswers Community,
    Can anyone recommend a program for mapping DNA reads to transcripts? I have ~500 transcripts as a reference. I have DNA reads (Illumina 100 bp paired-end) from an exon-targeted sequence capture enrichment of genomic DNA that I want to map to these transcripts. The reads should contain the target exons, but they likely also have intron sequence on their ends. I think I need a program that will keep and map a read if, for example, 50 bp of it matches (i.e., the exon), and then clip off the rest of the read where it reaches an exon-exon boundary of my transcript.
    Any suggestions?
    Thanks!
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    BBMap with the right reference (how are your transcript sequences formatted?)

    • tsecogen
      Member
      • Mar 2013
      • 10

      #3
      The transcripts are in a fasta file.

      • GenoMax
        Senior Member
        • Feb 2008
        • 7142

        #4
        Are the exons in separate fasta sequences?

        • tsecogen
          Member
          • Mar 2013
          • 10

          #5
          I don't have a genome, only a de novo transcriptome, so I only have exons predicted from distantly related species. But yes, I can put them in a separate fasta file.

          • GenoMax
            Senior Member
            • Feb 2008
            • 7142

            #6
            They don't need to be in a separate fasta file. A single multi-fasta file is fine for creating the BBMap index. I was asking because you are interested in clipping reads that map at the ends of the exons.

            • tsecogen
              Member
              • Mar 2013
              • 10

              #7
              Are you suggesting I map to the predicted exons instead of the known transcripts?

              • GenoMax
                Senior Member
                • Feb 2008
                • 7142

                #8
                Sorry, perhaps I am misunderstanding something. Can we do a recap?

                You have ~500 transcripts. Are they known or predicted? What format are they in (one sequence per transcript, or several independent exons per transcript in a multi-fasta)? And you have a set of PE reads that you want to map to this set.

                • tsecogen
                  Member
                  • Mar 2013
                  • 10

                  #9
                  Sorry if I have been confusing. The 500 transcripts are pulled out from a de novo assembled transcriptome. Each of the 500 transcripts is represented by one sequence. I currently have all of the transcripts in a single fasta file.

                  I have also predicted the exons for each transcript by mapping the transcripts to a genome from a related species. I have put all of these exon sequences into a separate fasta file so I could try that as a reference too.

                  Does that make sense?

                  • Brian Bushnell
                    Super Moderator
                    • Jan 2014
                    • 2709

                    #10
                    In this case, I suggest you map to the exons and use the "local" flag so that reads hanging off the ends will not be penalized. But it really depends on what you are trying to accomplish, and on how the quality of the predicted exons compares with that of the transcriptome. Do you want coverage information? What do you want to learn from the mapping?
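
To illustrate why local mode matters for reads that run off the end of an exon, here is a toy sketch in Python. It is not BBMap's algorithm: it compares a read to a reference segment base by base with no gaps, using made-up sequences and an assumed +1/-1 scoring scheme, and reports the end-to-end score next to the best-scoring contiguous stretch (the part a local aligner would keep, with the rest clipped).

```python
def ungapped_scores(read, ref_segment, match=1, mismatch=-1):
    """Return (end_to_end_score, best_local_score, (start, end)).

    Toy ungapped comparison of two equal-length strings. The local score
    is the maximum-scoring contiguous run of positions; (start, end) is
    the half-open span of the read that would be kept.
    """
    per_base = [match if a == b else mismatch for a, b in zip(read, ref_segment)]
    end_to_end = sum(per_base)

    best, best_span = 0, (0, 0)
    cur, cur_start = 0, 0
    for i, score in enumerate(per_base):
        if cur <= 0:
            # Restart the running segment at this position.
            cur, cur_start = score, i
        else:
            cur += score
        if cur > best:
            best, best_span = cur, (cur_start, i + 1)
    return end_to_end, best, best_span


# Hypothetical example: 10 bp of exon followed by 10 bp of intron,
# compared against a segment where the exon is followed by other sequence.
read = "ACGTACGTAC" + "GTAAGTTTTT"
ref_segment = "ACGTACGTAC" + "CCGGCCGGCC"

end_to_end, local, span = ungapped_scores(read, ref_segment)
print(end_to_end, local, span)
```

In this made-up case the mismatching intron half cancels the exon half in the end-to-end score, while the local score keeps only the exon portion of the read.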

                    • tsecogen
                      Member
                      • Mar 2013
                      • 10

                      #11
                      Ultimately I want to call SNPs, but I would also like to get coverage information to understand the effectiveness of my probes for particular genes. I see the transcripts as being more reliable, as the exons are predicted based on other species.

                      • GenoMax
                        Senior Member
                        • Feb 2008
                        • 7142

                        #12
                        So the current dataset is plain DNA (not RNAseq)?

                        Go ahead and align to the transcriptome and see what fraction of reads align. If you know the average intron size, use BBMap's maxindel/intronlen settings.

                        Ultimately, you may have to do some comparative mapping analysis with just exons and the whole transcriptome.
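
One way to check what fraction of reads align is to count mapped primary records in the SAM output. The sketch below assumes standard SAM FLAG bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary); the function name and the toy records are invented for illustration and are not output from any real run.

```python
def mapped_fraction(sam_lines):
    """Return the fraction of primary SAM records that are mapped."""
    total = mapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        total += 1
        if not flag & 0x4:  # 0x4 set means the read is unmapped
            mapped += 1
    return mapped / total if total else 0.0


# Made-up records: two mapped reads, two unmapped reads, one secondary hit.
toy_sam = [
    "@HD\tVN:1.6",
    "r1\t0\ttx1\t10\t40\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "r3\t16\ttx2\t5\t40\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "r3\t256\ttx3\t7\t0\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGT\tIIIIIIII",
]

print(mapped_fraction(toy_sam))
```

For a real file, an open file handle can be passed in place of the list, and running the same function on the exon-based and transcript-based mappings gives a quick first comparison.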

                        • tsecogen
                          Member
                          • Mar 2013
                          • 10

                          #13
                          Yes, the current dataset is DNA (100 bp reads from genomic DNA captured with probes developed from transcripts).

                          If I am mapping to the relatively short exons, not the transcripts, would it make sense to map the data as single reads rather than paired-end? For many of the pairs, one of the reads may not land on the same exon and may instead fall entirely within the adjoining intron.

                          • Brian Bushnell
                            Super Moderator
                            • Jan 2014
                            • 2709

                            #14
                            Originally posted by tsecogen:
                            "If I am mapping to the relatively short exons, not the transcripts, would it make sense to map the data as single reads rather than paired-end? For many of the pairs, one of the reads may not land on the same exon and may instead fall entirely within the adjoining intron."

                            I think in this situation, where you expect the majority of alignments to be unpaired, mapping the reads as single-ended makes sense.
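
A minimal sketch of how the single-ended runs could be set up, assuming the two mates are in separate files: build one bbmap.sh command per mate file, each with only in= (no in2=) and with local mode on. The file names and the helper function are hypothetical; the commands are only assembled and printed here, not executed.

```python
import shlex


def single_end_commands(ref_fasta, mate_files):
    """Build one bbmap.sh argument list per mate file (no in2=)."""
    commands = []
    for fastq in mate_files:
        # Derive an output name from the input file name, e.g. reads_R1.
        stem = fastq.split("/")[-1].split(".")[0]
        commands.append([
            "bbmap.sh",
            f"ref={ref_fasta}",            # multi-fasta of exon sequences
            f"in={fastq}",                 # one mate file per run
            f"out={stem}_vs_exons.sam",
            "local=t",                     # local mode, as suggested earlier in the thread
        ])
    return commands


# Hypothetical file names.
cmds = single_end_commands("exons.fasta", ["reads_R1.fastq.gz", "reads_R2.fastq.gz"])
for cmd in cmds:
    print(shlex.join(cmd))
```

Each printed line can be checked by eye before running it in a shell or passing the list to subprocess.run.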
