  • Single-end SOLiD data for TopHat

    Hi,

    I have some single-end SOLiD data, with reads about 35 bp long, that I ran through TopHat.
    I have some questions about TopHat. I used TopHat to produce accepted_hits.bam, but UCSC reports "Error Can't read file: accepted_hits.bam".
    Cufflinks uses SAM, but TopHat produces BAM rather than SAM.
    What is the maximum amount of read data it can handle?
    I also get this error: "Error: segment-based junction search failed with err = -11"

    I look forward to your replies.

    Thank you

  • #2
    BAM to SAM is an easy one

    Code:
    samtools view accepted_hits.bam > accepted_hits.sam
    I haven't seen that error message, but plenty of others! You might check through this recent thread (definitely make sure you have 1.1.1 -- which I need to go install)
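
    One thing to watch with the command above: by default `samtools view` writes only the alignment records, and some downstream tools may expect the `@` header lines as well. A minimal sketch that keeps the header with `-h` is below; the function name `bam_to_sam` is just a hypothetical wrapper, and it assumes `samtools` is on your PATH.

    ```shell
    # bam_to_sam: hypothetical wrapper (not part of samtools).
    # Usage: bam_to_sam input.bam output.sam
    bam_to_sam() {
        bam_in=$1
        sam_out=$2
        # Fail early with a clear message if the BAM file cannot be read.
        [ -r "$bam_in" ] || { echo "cannot read $bam_in" >&2; return 1; }
        # -h asks samtools view to include the header lines in the output.
        samtools view -h "$bam_in" > "$sam_out"
    }
    ```

    For example, `bam_to_sam accepted_hits.bam accepted_hits.sam`.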


    • #3
      Originally posted by wlnjseu
      Hi
      Cufflinks use SAM, however, TopHat produce BAM and not SAM.
      Cufflinks accepts BAM, so there must be some other problem. Do make sure that you're using the very latest version, though!


      • #4
        Single-end SOLiD data for TopHat

        I am new to RNA-Seq. I used TopHat 1.1.1 to produce a BAM file, but UCSC would not accept it.
        I converted the BAM to SAM.
        SAM:
        979_695_572_F3 0 chr1 564895 1 35M * 0 0 AGAACCCATCCCTGAGAATCCAAAATTCTCCGTGC qqq!!qqqqqqqqqqqqqqqqqqqqqqqqqqqqqq NM:i:1 NH:i:3 CC:Z:chr2 CP:i:132143154
        986_1661_1934_F3 0 chr1 564911 3 35M * 0 0 AACTCAAAATTCTCCGTGCCACCTATCACACCCCA qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq NM:i:2 NH:i:2 CC:Z:chrM CP:i:4362
        977_724_1577_F3 0 chr1 564914 3 35M * 0 0 CCAAAATTCTCCGTGCCACCTATCACACCCCATCC qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq NM:i:0 NH:i:2 CC:Z:chrM CP:i:4365
        966_1378_922_F3 0 chr1 564915 3 35M * 0 0 CAACATTCTCCGTGCCACCTATCACACCCCATCCT qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq NM:i:1 NH:i:2 CC:Z:chrM CP:i:4366

        If the mapping position of the query is not available, RNAME and CIGAR are set as “*”, and POS and MAPQ as 0.
        I think this may be what causes the error in UCSC. Any suggestions?
        I look forward to your replies.
        Thank you
        Last edited by wlnjseu; 10-14-2010, 07:49 PM. Reason: add content



          • #6
            This isn't a paid helpline; you really should be patient.

            Try using "head" to take slices of your SAM file & see if they upload; alternatively use your favorite language to split the SAM file into N equal chunks (you might use samtools to filter out the unmapped reads first). That will help you figure out where an example of actually troublesome data is (unless it really is just the size of your file) & really get to the bottom of it.
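
            A rough sketch of the "split into N chunks" idea in plain shell and awk follows. The function name `split_sam` and the output naming scheme are made up for illustration, and it assumes all header lines (those starting with `@`) come before the alignment records. Each chunk gets a copy of the header so it can be uploaded on its own.

            ```shell
            # split_sam: hypothetical helper.
            # Usage: split_sam input.sam N prefix
            # Writes prefix.1.sam, prefix.2.sam, ... each holding the full header
            # plus a contiguous slice of the alignment records.
            split_sam() {
                sam_in=$1
                n_chunks=$2
                prefix=$3
                # Count alignment records (lines that are not header lines).
                total=$(grep -vc '^@' "$sam_in")
                # Records per chunk, rounded up so N chunks cover every record.
                per=$(( (total + n_chunks - 1) / n_chunks ))
                [ "$per" -gt 0 ] || per=1
                awk -v per="$per" -v prefix="$prefix" '
                    /^@/ { header = header $0 "\n"; next }   # collect header lines
                    {
                        out = prefix "." (int(count / per) + 1) ".sam"
                        if (out != prev) {
                            # Moving on to a new chunk: close the old file and
                            # start the new one with a copy of the header.
                            if (prev != "") close(prev)
                            printf "%s", header > out
                            prev = out
                        }
                        print > out
                        count++
                    }
                ' "$sam_in"
            }
            ```

            For example, `split_sam accepted_hits.sam 10 chunk` would write up to ten files named chunk.1.sam, chunk.2.sam and so on. Uploading them one at a time shows which slice, if any, triggers the error.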


            • #7
              I haven't used the UCSC Genome Browser myself, but as far as I've heard it's preferable to upload wiggle files there, since SAM files are huge.
              As for your problem with the "*" records: those are the reads that did not align anywhere. You may want to check whether the UCSC browser needs only the aligned ones. In that case you can simply remove all of the non-aligned reads from your SAM file using something like this in bash:
              Code:
              gawk '$3!="*"' File_with_non-aligneds.sam > File_with_only_aligned.sam
              Hope this helps.
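
              The SAM format also marks unmapped reads with bit 0x4 in the FLAG column (column 2), so a variant of the filter can test that bit instead of the reference name. This is only a sketch: the function name `drop_unmapped` is hypothetical, and it assumes a real tab-separated SAM file (the records pasted earlier in the thread show spaces only because of how the forum displays them).

              ```shell
              # drop_unmapped: hypothetical helper; reads SAM on stdin, writes SAM on stdout.
              drop_unmapped() {
                  awk -F '\t' '
                      /^@/ { print; next }             # pass header lines through unchanged
                      int($2 / 4) % 2 == 0 { print }   # keep records whose 0x4 (unmapped) bit is clear
                  '
              }
              ```

              For example, `drop_unmapped < File_with_non-aligneds.sam > File_with_only_aligned.sam`. The arithmetic test is used rather than a bitwise function so that it does not depend on gawk-specific extensions.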
