  • cursecatcher
    replied
    Originally posted by Richard Finney:
    NC_000016 is a name used for "chr16".
    This is the "official" name used at NCBI: https://www.ncbi.nlm.nih.gov/nuccore/NC_000016.10/

    You have to convert the "NCBI name" to "chr" names (or vice versa).

    There are many ways to rename fields. You can always brute-force it with a simple custom program or script in your favorite programming language: bash, Python, Perl, C, etc.

    An easy way would be to reheader the BAM file. Please see the samtools documentation for this.
    It works, thank you so much!!
    I'm sorry the problem was so trivial, but I'm not very experienced with this stuff and the bedtools message wasn't very helpful.
    Again, thank you!

    Best regards
    Nicola


  • Richard Finney
    replied
    NC_000016 is a name used for "chr16".
    This is the "official" name used at NCBI: https://www.ncbi.nlm.nih.gov/nuccore/NC_000016.10/

    You have to convert the "NCBI name" to "chr" names (or vice versa).

    There are many ways to rename fields. You can always brute-force it with a simple custom program or script in your favorite programming language: bash, Python, Perl, C, etc.

    An easy way would be to reheader the BAM file. Please see the samtools documentation for this.
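
The renaming step described above could be sketched roughly as follows. This is only an illustration, not the method used in the thread: the function names `refseq_to_chr` and `rename_header` are made up for the example, and it assumes the header uses human RefSeq chromosome accessions of the form NC_0000NN.V (NC_000001 to NC_000022 for the autosomes, NC_000023 for X, NC_000024 for Y). Any other sequence name, such as a mitochondrial or unplaced contig, is left unchanged.

```python
import re


def refseq_to_chr(name):
    """Map a RefSeq accession such as NC_000016.9 to a "chr" name such as chr16.

    Assumption: human chromosome accessions only. Anything that does not
    match the NC_0000NN.V pattern is returned unchanged.
    """
    m = re.fullmatch(r"NC_0000(\d\d)\.\d+", name)
    if m:
        n = int(m.group(1))
        if 1 <= n <= 22:
            return f"chr{n}"
        if n == 23:
            return "chrX"
        if n == 24:
            return "chrY"
    return name


def rename_header(header_text):
    """Rewrite the SN: field of every @SQ line in a SAM header."""
    out = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            fields = line.split("\t")
            fields = [
                "SN:" + refseq_to_chr(f[3:]) if f.startswith("SN:") else f
                for f in fields
            ]
            line = "\t".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"
```

One possible workflow: dump the header with `samtools view -H sample2.bam > header.sam`, pass the contents of header.sam through `rename_header`, write the result to a new file, and apply it with `samtools reheader new_header.sam sample2.bam > sample2.chr.bam`. Only the names change, so the BAM and the BED file must still refer to the same genome assembly.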


  • cursecatcher
    replied
    Originally posted by Richard Finney:
    Check your chromosome names.
    Are they "chr" style in both bed and bam?
    Hi Richard, thanks for the reply.
    About your question, I think not.

    In the BED file I have records like this:
    Code:
    chr1    11123000        11123242        chr1:11123018-11123218
    chr1    16696418        16696674        chr1:16696447-16696647
    while in the SAM file (and consequently in the BAM file) I have records like this:

    Code:
    M03971:33:000000000-BN5NL:1:2114:12003:16132    99      NC_000016.9     24163387        255     151M    =       24163431        194     TGATCGGTGGTGATGGGTTAGGTAGAGTGTATTAGTTCGTTTTTATGTTGTTGATAAAGATATATTCGAGATTGTGTAATTTATGAAAAAGAGGTTTAATGGATTTGGGGAGGTTTTAATTATGGTGGAAGGTTAAAGTTATGTTTTATAT BCCCCCCBBCABGGGGGGFGGGGHHHHFGGHHHHHHHHGGHGGGHHHHHHHHHHHHHHHHHHHHHHHHHHHGHHHHHHHHHHHHHFHHHHGGHHHHHHHHHHHHHHHHGGFGGEHHGHHHHHHHHHHGHHHGHHHHHHHHHHHHHHHHHHH NM:i:1  ZS:Z:++
    P.S. The SAM file is the result of an alignment with BSMAP.

    Is this the problem? How can I resolve it?
    Nicola


  • Richard Finney
    replied
    Check your chromosome names.
    Are they "chr" style in both bed and bam?
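
A quick way to run this check is to collect the sequence names from both files and compare them. The sketch below is a hypothetical helper, not part of bedtools or samtools: the function names `bed_chroms` and `sam_chroms` are invented for the example, and it assumes a plain-text BED file and a tab-separated SAM file (for a BAM, the text can be produced with `samtools view -h`).

```python
def bed_chroms(bed_text):
    """Collect chromosome names (column 1) from BED-formatted text."""
    names = set()
    for line in bed_text.splitlines():
        # Skip blank lines, comments, and UCSC track/browser lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names


def sam_chroms(sam_text):
    """Collect reference names (column 3) from SAM alignment lines."""
    names = set()
    for line in sam_text.splitlines():
        # Header lines start with "@"; unmapped reads use "*" as the name.
        if not line or line.startswith("@"):
            continue
        fields = line.split("\t")
        if len(fields) > 2 and fields[2] != "*":
            names.add(fields[2])
    return names


def shared_chroms(bed_text, sam_text):
    """Names that appear in both files; an empty set means no overlap is possible."""
    return bed_chroms(bed_text) & sam_chroms(sam_text)
```

If `shared_chroms` comes back empty, the two files use different naming conventions and no read can intersect any target region until one of them is renamed.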


  • cursecatcher
    started a topic Unable to intersect BAM file with bedtools

    Unable to intersect BAM file with bedtools

    Hi everyone, I'm trying to use bedtools intersect to count the mapped reads that fall in the target regions (listed in a .bed file) of a targeted bisulfite sequencing experiment (EpiSeq Roche).

    I used the following command.

    Code:
    ./bedtools intersect -bed -abam sample2.bam -b ~/Data/MethylSeq/dataset/Agesmoke_dataset/AgeSmkSop_all_primary_targets.bed
    The program terminates with the following message and produces no output at all.

    Code:
    * WARNING: File sample2.bam has inconsistent naming convention for record:
    NC_000016.9  24163386  24163537  M03971:33:000000000-BN5NL:1:2114:12003:16132/1  255  +
    
    * WARNING: File sample2.bam has inconsistent naming convention for record:
    NC_000016.9  24163386  24163537  M03971:33:000000000-BN5NL:1:2114:12003:16132/1  255  +
    I tried modifying the original SAM file by removing the read that causes the problem (it was the first read in the SAM file), but the problem persists with the second read. I also tried the -nonamecheck option, with no effect.

    Can someone help us? Thank you.
    Nicola
