  • Identifying substitutions in a region of BAM file

    Hi group:

    I am interested in finding substitutions in a read downstream of a particular site.

    In detail:

    My target region - chr2: 1200-2000.
    CRISPR site: chr2:1220-1240

    First, find the reads that have no substitution in 1220-1240. For each such read (say read A), I want to count the substitutions downstream of the site, i.e. between 1240 and 2000 in read A, and tabulate which kinds of substitution occur there (for example, A>T is found in 1200 reads, G>A is found in 200 reads, etc.).

    Second, find the reads that do have a substitution in 1220-1240 and check whether those reads also carry any substitution downstream. If so, I want to know what type of substitution and how many. For example, if read B has a substitution in 1220-1240, I want to count the number of substitutions between 1240 and 2000 in read B.

    Case where there is a substitution in 1220-1240:

    1220--------------------1240-------------------------------------------------|
    |------------A------------|-------------------------------A/G------G/T-------|
    |--------G-------AT------|---------------------A/T-----------------G/T------|


    Case where there is no substitution in 1220-1240:
    1220--------------------1240-------------------------------------------------|
    |-------------------------|-------------------------------G/A------------------|
    |-------------------------|-------------T/G------------------------------------|
    |-------------------------|---------------------A/T----------------------------|



    What I could do:
    Using pysam, I could separate the reads that do and do not have a substitution in 1220-1240 into two files.

    How can I identify and count substitutions in the 1240-2000 region?
    Any ideas?


    thanks
    Adrian
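
One possible way to do both steps in a single pass, rather than splitting into two files first, is sketched below. It is only a sketch under several assumptions that are not stated in the post: the coordinates are 1-based, position 1240 is counted as part of the site so the downstream window is 1241-2000, a read is only classified if it spans the whole site, and the BAM is indexed and carries MD tags (pysam's `get_aligned_pairs(with_seq=True)` needs them to report reference bases, with mismatches in lowercase). The function names and the two table labels are made up for illustration.

```python
from collections import Counter

SITE = (1220, 1240)        # CRISPR site from the post, assumed 1-based inclusive
DOWNSTREAM = (1240, 2000)  # downstream window; 1240 itself is treated as site


def read_substitutions(pairs, query_seq):
    """Yield (pos_1based, ref_base, read_base) for each mismatch in one read.

    `pairs` has the shape produced by pysam's
    AlignedSegment.get_aligned_pairs(with_seq=True): tuples of
    (query_pos, ref_pos, ref_base), ref_base lowercase at mismatches.
    """
    for qpos, rpos, ref_base in pairs:
        if qpos is None or rpos is None or ref_base is None:
            continue  # insertion, deletion, or clipped base: not a substitution
        if ref_base.islower():
            # pysam positions are 0-based, so shift to 1-based
            yield rpos + 1, ref_base.upper(), query_seq[qpos].upper()


def classify_read(subs, site=SITE, downstream=DOWNSTREAM):
    """Return (has_site_substitution, Counter of downstream types like 'A>T')."""
    has_site_sub = False
    types = Counter()
    for pos, ref, alt in subs:
        if site[0] <= pos <= site[1]:
            has_site_sub = True
        elif downstream[0] < pos <= downstream[1]:
            types[f"{ref}>{alt}"] += 1
    return has_site_sub, types


def tabulate(reads):
    """Sum downstream substitution types over reads, split by site status.

    `reads` is an iterable of (aligned_pairs, query_sequence).
    """
    tables = {"site_unmodified": Counter(), "site_substituted": Counter()}
    for pairs, seq in reads:
        has_site_sub, types = classify_read(read_substitutions(pairs, seq))
        key = "site_substituted" if has_site_sub else "site_unmodified"
        tables[key].update(types)
    return tables


def tabulate_bam(bam_path, chrom="chr2"):
    """Glue for a real BAM file (needs pysam, a .bai index, and MD tags)."""
    import pysam  # imported here so the helpers above run without pysam

    with pysam.AlignmentFile(bam_path, "rb") as bam:
        reads = (
            (r.get_aligned_pairs(with_seq=True), r.query_sequence)
            for r in bam.fetch(chrom, SITE[0] - 1, DOWNSTREAM[1])
            if r.query_sequence is not None
            # keep only reads that span the whole site
            and r.reference_start <= SITE[0] - 1
            and r.reference_end >= SITE[1]
        )
        return tabulate(reads)
```

Calling `tabulate_bam("sample.bam")` (the file name is a placeholder) would return two counters, one per group of reads. The counters hold the total number of substitution events; to count reads carrying each type instead, `tables[key].update(set(types))` could be used in `tabulate`. Each mate of a read pair is treated as a separate read here.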
