
  • lobSTR dealing with the paired-end BAM file

    Hello:

    I am using lobSTR to process a paired-end BAM file. According to the documentation, I should run the following command on the BAM file before it has been sorted by read name.

    '''
    lobSTR \
    --index-prefix hg19_v3.0.2/lobstr_v3.0.2_hg19_ref/lobSTR_ \
    -f my_sample.bam --bampair \
    --rg-sample my_sample.sorted.bam --rg-lib my_sample \
    --out my_sample_output
    '''

    My question is: which BAM file is correct in the third and fourth lines of the command above, the one before sorting or the one after?
    The documentation indicates the unsorted BAM file, but I think the third line should use the sorted BAM rather than the raw one.
    Also, I think the fourth line should take a tag, not the sorted file.

    Does anyone have other ideas, or can anyone clear up my confusion? Thanks!

  • #2
    The --rg-sample and --rg-lib parameters should be tags with read-group information, not the bam files.

    See the lobSTR FAQs for more info.



    I think if you start with a bam file rather than fastq files then it should be sorted and indexed. Note that the command line options description for --bampair says that the file should be sorted by name order, 'samtools sort -n'.
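
    For what it's worth, a minimal sketch of that name-sort step followed by the lobSTR call might look like the block below. The file names and tag values are placeholders, the `run` helper is only an illustrative dry-run wrapper, and the `-o FILE` form of `samtools sort` assumes samtools 1.3 or later. The lobSTR options are the ones from the command in the first post.

    ```shell
    # Sketch only: file names and read-group tags are placeholders.
    set -eu

    raw_bam="my_sample.bam"
    namesorted_bam="my_sample.namesorted.bam"

    # Print each command by default; set RUN=1 to execute it for real.
    run() {
      if [ "${RUN:-0}" = "1" ]; then
        "$@"
      else
        printf '%s\n' "$*"
      fi
    }

    # Sort by read name so both mates of a pair sit next to each other.
    # The -o FILE form assumes samtools 1.3 or later.
    run samtools sort -n -o "$namesorted_bam" "$raw_bam"

    # Pass the name-sorted BAM to -f; --rg-sample and --rg-lib are plain tags.
    run lobSTR \
      --index-prefix hg19_v3.0.2/lobstr_v3.0.2_hg19_ref/lobSTR_ \
      -f "$namesorted_bam" --bampair \
      --rg-sample my_sample --rg-lib my_sample \
      --out my_sample_output
    ```

    Run as is, this only prints the two commands so they can be checked first; with RUN=1 it executes them.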



    • #3
      Originally posted by mastal
      The --rg-sample and --rg-lib parameters should be tags with read-group information, not the bam files.

      See the lobSTR FAQs for more info.



      I think if you start with a bam file rather than fastq files then it should be sorted and indexed. Note that the command line options description for --bampair says that the file should be sorted by name order, 'samtools sort -n'.

      I understand what you mean: the --rg-sample and --rg-lib parameters should not be BAM files. But I am confused by the third line, "-f my_sample.bam --bampair". I think the -f parameter should take the sorted BAM, not the raw one. The documentation shows the raw BAM rather than the sorted one, which confuses me.

      The documentation is:



      • #4
        In the end, it shouldn't matter, because anything that is expecting an unsorted bam file will work perfectly well if you give it a sorted bam file (but of course the reverse is not true, and if a program is expecting a sorted bam file it will throw an error if you give it an unsorted bam file).

        I agree with you that the lobSTR documentation is confusing and not consistent. I had looked at the 'usage' page, which seems to give different examples than the genotype-calling page you gave the link to. Also in the usage page's definition of parameters, for --bampair it says that you have to give a bam file sorted by name (samtools sort -n), which it doesn't mention anywhere else where it shows examples of sorting the bam files with samtools sort.



        • #5
          I think maybe I understand a bit better now.

          I think if you are running lobSTR with a paired-end bam file, you need to give it a bam file that has been name-sorted (samtools sort -n) for the -f parameter, because it needs the two reads of a pair to be next to each other in the file.

          Later steps, like running allelotype, may need bam files sorted by coordinate, as shown in some of the examples where they sort the bam files that are output from running lobSTR.
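
          As a rough sketch, coordinate-sorting and indexing the lobSTR output before a later step could look like this. The input name my_sample_output.aligned.bam is an assumption based on the --out prefix used earlier in the thread, so check what lobSTR actually wrote. The `run` helper is an illustrative dry-run wrapper, and the `-o FILE` form assumes samtools 1.3 or later.

          ```shell
          # Sketch only: both file names are assumptions, adjust to your run.
          set -eu

          lobstr_bam="my_sample_output.aligned.bam"   # assumed lobSTR output name
          sorted_bam="my_sample_output.sorted.bam"

          # Print each command by default; set RUN=1 to execute it for real.
          run() {
            if [ "${RUN:-0}" = "1" ]; then
              "$@"
            else
              printf '%s\n' "$*"
            fi
          }

          # Without -n, samtools sort orders reads by reference coordinate.
          run samtools sort -o "$sorted_bam" "$lobstr_bam"

          # Index the coordinate-sorted BAM (writes a .bai file next to it).
          run samtools index "$sorted_bam"
          ```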

          Hope this makes more sense.



          • #6
            Originally posted by mastal
            I think maybe I understand a bit better now.

            I think if you are running lobSTR with a paired-end bam file, you need to give it a bam file that has been name-sorted (samtools sort -n) for the -f parameter, because it needs the two reads of a pair to be next to each other in the file.

            Later steps, like running allelotype, may need bam files sorted by coordinate, as shown in some of the examples where they sort the bam files that are output from running lobSTR.

            Hope this makes more sense.


            Thank you for your reply. I agree with you, but something is still going wrong on my end.

            When I run lobSTR on my name-sorted paired-end BAM file, as you suggested, I get the following warning:

            '''
            WARNING: Could not find pair for BRISCOE:4:... Is the bam file sorted by read name?
            '''

            My whole screen fills with these warnings. Is that normal?

            Also, when I run allelotype on the BAM file that lobSTR generates, I get these warnings in my output:

            '''
            WARNING: Skipping locus chr1:123585375. Invalid period size (20)
            WARNING: Discarding duplicate of locus chr1: 123587826
            '''

            I don't know what they mean or what I should do.

            Thank you.
            Last edited by Alphabets; 03-21-2016, 05:03 AM.



            • #7
              Where do the bam files you are trying to use with lobSTR come from?

              Are they unaligned bams, or are they the result of alignment with another aligner?

              Have any reads been removed from the data set? For example, reads that didn't align, leaving some reads without a mate?
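
              One way to check for reads without a mate is to count the read names that do not occur exactly twice. This is only a sketch: `count_unpaired` is a hypothetical helper, not part of lobSTR or samtools, and the BAM file name in the usage comment is a placeholder.

              ```shell
              set -eu

              # Read SAM records on stdin and print how many read names do not
              # occur exactly twice (mate missing, or extra records present).
              count_unpaired() {
                awk -F '\t' '
                  !/^@/ { seen[$1]++ }   # skip header lines, tally each read name
                  END {
                    n = 0
                    for (name in seen) if (seen[name] != 2) n++
                    print n
                  }'
              }

              # Usage (assumes samtools is installed; -F 0x900 drops secondary and
              # supplementary alignments so each mate is counted once):
              #   samtools view -F 0x900 my_sample.namesorted.bam | count_unpaired
              ```

              A result of 0 would mean every read name appears exactly twice. A large number would fit the idea that a filtering step left many reads without their mates.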



              • #8
                Originally posted by mastal
                Where do the bam files you are trying to use with lobSTR come from?

                Are they unaligned bams, or are they the result of alignment with another aligner?

                Have any reads been removed from the data set? For example, reads that didn't align, leaving some reads without a mate?

                The BAM files contain HiSeq reads aligned to the reference genome; redundant reads were removed and base realignment was performed.

                Does that matter? And what should I do?

                Thank you!
