  • Extracting certain genes from a SAM/BAM file.

    Hello,

    I am working on an experiment in which I have narrowed the data of interest down to a fairly small number of genes, about 200. I want to use many of the existing analysis tools, Picard for instance, but they often require SAM/BAM files as input. Since I need to compare the Picard output with other data on these specific genes, I want to make a pseudo-SAM file that is just the concatenated records from the original SAM file for the right locations. Let's say I want to look at data from a gene that starts at position 100 and ends at position 1000, and the next gene I am interested in spans positions 5000-5500. Is there a relatively simple way to grab 100-1000, 5000-5500, etc. and make a SAM/BAM file from only those regions of interest, so I can use it (together with a corresponding pseudo-reference genome) for my analysis? Thanks very much for any advice!

  • #2
    Just make a BED file of the regions and run:
    Code:
    samtools view -b -L regions.bed alignments.bam > subset.bam
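
As a rough sketch of how regions.bed could be built for the two example genes in the question: BED coordinates are 0-based and half-open, so a gene at 1-based positions 100-1000 becomes start 99, end 1000. The chromosome name `chr1` and the file `genes.tsv` are placeholders invented for illustration; the names in the BED file must match the sequence names in the BAM header. The samtools step is guarded so the snippet does nothing further if samtools or alignments.bam is missing.

```shell
# Hypothetical gene table: chromosome, 1-based start, 1-based inclusive end.
# "chr1" is a placeholder; use the sequence names from your BAM header.
printf 'chr1\t100\t1000\nchr1\t5000\t5500\n' > genes.tsv

# BED is 0-based and half-open: subtract 1 from the start, keep the end as is.
awk 'BEGIN { OFS = "\t" } { print $1, $2 - 1, $3 }' genes.tsv > regions.bed

# Extract only reads overlapping the listed regions (same command as above).
# Runs only if samtools is installed and the input BAM exists.
if command -v samtools >/dev/null 2>&1 && [ -f alignments.bam ]; then
    samtools view -b -L regions.bed alignments.bam > subset.bam
fi
```

The subset BAM keeps the original header and coordinates, so it should be usable with the original reference rather than a separate pseudo-reference.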
