  • dnusol
    Senior Member
    • Jul 2009
    • 136

    samtools specific region

    I have downloaded all fasta files for human chromosomes and merged them with cat. In order to use the "region" option in samtools view command, do I have to specify the exact header of the fasta file?

    So, for example, if I want to analyse only a specific region of chromosome 12:

    $ samtools view gi|89161190|ref|NC_000012.10|NC_000012 Homo sapiens chromosome 12, reference assembly, complete sequence:1000000-2000000

    I guess it would be better to change the header to something like ">chr12".

    Any suggestions?

    Thanks
  • krobison
    Senior Member
    • Nov 2007
    • 734

    #2
    I would modify the header; if you don't, you'll always need to quote and type that whole mess. In any case, I don't believe you would need the description information, just the text up to the first space (but my samtools-indexed FASTA files lack descriptions, so I can't be sure).

    How did you convert FASTA to SAM? Or did you mean to say "samtools index"?

    • bioinfosm
      Senior Member
      • Jan 2008
      • 483

      #3
      Definitely. A few sed commands will remove the ugly string and leave >chr10 as the header. It helps later on as well, when using UCSC or IGV!
      --
      bioinfosm
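For anyone wanting a starting point, here is a rough sketch of the sed approach suggested above. It assumes NCBI-style headers like the one in the first post, where the description contains "chromosome 12," and so on. The function name `rename_fasta_headers` is made up for illustration, and headers that do not match the pattern (a mitochondrial record, for example) are passed through unchanged, so check the result before indexing.

```shell
# Hypothetical helper: rewrite long NCBI-style FASTA headers to ">chrN".
# Assumes each header contains "chromosome <name>," as in the example
# from the first post. Reads FASTA on stdin, writes FASTA on stdout.
rename_fasta_headers() {
    # Only lines starting with ">" can match; sequence lines are untouched.
    sed -E 's/^>.*chromosome ([0-9XY]+),.*/>chr\1/'
}
```

Something like `cat chr*.fa | rename_fasta_headers > genome.fa` would then do the merge and the renaming in one pass (the file names here are placeholders). Running `grep '>' genome.fa` afterwards is a quick way to confirm that every header came out as expected.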

      • dnusol
        Senior Member
        • Jul 2009
        • 136

        #4
        Thanks for your help

        I indexed the FASTA reference using bwa, but before doing that I changed the headers using sed. If, after aligning against the whole genome, I just want the alignments for a specific region, I have to use the samtools view command with the region option, right?
        Last edited by dnusol; 03-17-2010, 12:30 AM.

        • dnusol
          Senior Member
          • Jul 2009
          • 136

          #5
          Actually, just found out that the samtools view command does not work with the "region" option unless you feed an indexed BAM file, or so it seems:

          $ samtools view -uS /s_1/s_1.sam.gz chr6:136000000:146000000 | ./samtools sort - /s_1/s_1
          [samopen] SAM header is present: 25 sequences.
          [main_samview] random alignment retrieval only works for indexed BAM files.

          Any suggestions?

          • dnusol
            Senior Member
            • Jul 2009
            • 136

            #6
            I will answer myself.

            I first sorted and then indexed the BAM file. Then the region option seems to work.

            D.

            • krobison
              Senior Member
              • Nov 2007
              • 734

              #7
              You just need to index (with "samtools index") the bam file that "samtools sort" generated -- then you are off to the races
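The sort, index, then view sequence described in the last few posts could be wrapped up roughly as below. This is only a sketch: it assumes a samtools 1.x install (where `samtools sort` takes `-o` rather than the older output-prefix form used earlier in the thread) and a BAM input, and the function name `extract_region` and the `.sorted.bam` suffix are made up for illustration. Regions are written as `name:start-end`, with a hyphen between the two coordinates.

```shell
# Hypothetical helper: coordinate-sort a BAM, index it, then pull out one region.
# Assumes samtools 1.x syntax.
# usage: extract_region <in.bam> <region> <out.bam>
extract_region() {
    er_in=$1
    er_region=$2
    er_out=$3
    er_sorted="${er_in%.bam}.sorted.bam"

    # Region queries need a coordinate-sorted BAM with a .bai index next to it.
    samtools sort -o "$er_sorted" "$er_in" &&
    samtools index "$er_sorted" &&
    # -b writes BAM output; the region goes after the input file name.
    samtools view -b -o "$er_out" "$er_sorted" "$er_region"
}
```

A call would look like `extract_region s_1.bam chr6:136000000-146000000 s_1.chr6.bam`, where the file names are placeholders.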

              • bioinfosm
                Senior Member
                • Jan 2008
                • 483

                #8
                I think it's sort, and then index, but perhaps it works in the other order as well.
                --
                bioinfosm

                • genbio64
                  Member
                  • Dec 2009
                  • 42

                  #9
                  Does anyone know of a way to use Samtools to split off individual chromosomes for easier viewing? If I attempt to use samtools view <myfile.bam> chr1 I receive a bam file that I can no longer view because it does not associate with the indexed file it was derived from.

                  • nilshomer
                    Nils Homer
                    • Nov 2008
                    • 1283

                    #10
                    Originally posted by genbio64:
                    Does anyone know of a way to use Samtools to split off individual chromosomes for easier viewing? If I attempt to use samtools view <myfile.bam> chr1 I receive a bam file that I can no longer view because it does not associate with the indexed file it was derived from.
                    Could you post the error message when it complains?
                    Try:
                    Code:
                    rm myfile.bam.bai
                    samtools index myfile.bam
                    samtools view -b myfile.bam chr1 > myfile.chr1.bam
                    samtools view -b myfile.bam chr2 > myfile.chr2.bam
                    ...
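If there are many chromosomes, the repeated `samtools view` lines above could be generated in a loop instead. A minimal sketch, assuming `myfile.bam` is already coordinate-sorted and indexed: `samtools idxstats` prints one line per reference sequence, with the name in the first column and a final `*` line for unplaced reads. The function name `split_by_chromosome` is made up for illustration.

```shell
# Hypothetical helper: write one BAM per reference sequence.
# Assumes the input BAM is coordinate-sorted and indexed.
# usage: split_by_chromosome <in.bam>
split_by_chromosome() {
    sc_in=$1
    sc_prefix="${sc_in%.bam}"

    # Column 1 of idxstats is the reference name; "*" is the unplaced-reads line.
    samtools idxstats "$sc_in" | cut -f1 | while read -r sc_chrom; do
        [ "$sc_chrom" = "*" ] && continue
        samtools view -b -o "$sc_prefix.$sc_chrom.bam" "$sc_in" "$sc_chrom"
    done
}
```

Each output file still needs its own `samtools index` run before a viewer such as IGV will open it.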

                    • genbio64
                      Member
                      • Dec 2009
                      • 42

                      #11
                      @nilshomer
                      I'll try that.

                      thanks
