  • maasha
    replied
    This can be done with Biopieces (www.biopieces.org) like this:

    Code:
    read_fasta -i genome.fna | patscan_seq -p tcgtgacggta | write_bed -o out.bed -x

  • Richard Finney
    replied
    Here's a fast C solution ... output is 0 based (so add one).

    This takes about 2 minutes on my 2000 bogomips machine for hg19 ...

    ______ begin code _______
    // this program finds a short pattern and its reverse complement in *.fa fasta (genome) files
    // it assumes one sequence per file: in a multi-sequence file all sequences are concatenated
    // how to compile: gcc -Wall -O3 -o patmatch patmatch.c
    // how to use : for i in *.fa; do cat $i | ./patmatch; done

    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>

    // put your pattern in pat1 and the reverse complement in pat2 ...
    char pat1[] = "TCGTGACGGTA"; // use upper case - NB: this should really be a parameter
    char pat2[] = "TACCGTCACGA"; // reverse complement of pat1 (must have the same length)

    #define MAXCHROMSIZE 254235640 // maximum chromosome size - edit as necessary - you will need to raise this IF you're using a whole genome FASTA
    char chr[MAXCHROMSIZE + 50];   // global, so it starts out zero-filled

    int main(void)
    {
        long int i, n, len = 0;
        long int patlen = (long int)strlen(pat1);
        char header[5012] = "";
        char s[5012];

        while (fgets(s, sizeof(s), stdin)) // fgets instead of gets: the read is bounded
        {
            s[strcspn(s, "\r\n")] = 0; // strip the line ending that fgets keeps
            if (s[0] == '>') { strcpy(header, s); continue; }
            n = (long int)strlen(s);
            if (len + n > MAXCHROMSIZE) { fprintf(stderr, "sequence too long - increase MAXCHROMSIZE\n"); return 1; }
            for (i = 0; i < n; i++) chr[len + i] = toupper((unsigned char)s[i]); // append in upper case
            len += n;
        }
        // stop patlen bases before the end so a comparison never runs past the sequence
        for (i = 0; i + patlen <= len; i++)
        {
            if (memcmp(chr + i, pat1, patlen) == 0) printf("F %s at %ld\n", header, i);
            if (memcmp(chr + i, pat2, patlen) == 0) printf("R %s at %ld\n", header, i);
        }
        return 0;
    }

    _______ end code ______
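
The program above expects the reverse complement to be typed into pat2 by hand. As a minimal sketch, the reverse complement could be derived from the pattern instead. The function names complement_base and reverse_complement are illustrative and not part of the posted program, and mapping every non-ACGT character to N is an assumption made here for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Complement of one base (either case); anything else, e.g. N, maps to 'N'.
   Hypothetical helper, not part of the posted program. */
char complement_base(char base)
{
    switch (base) {
    case 'A': case 'a': return 'T';
    case 'C': case 'c': return 'G';
    case 'G': case 'g': return 'C';
    case 'T': case 't': return 'A';
    default:            return 'N';
    }
}

/* Writes the upper-case reverse complement of `pattern` into `out`.
   `out_size` must be at least strlen(pattern) + 1.
   Returns 0 on success, -1 if the output buffer is too small. */
int reverse_complement(const char *pattern, char *out, size_t out_size)
{
    assert(pattern != NULL && out != NULL);
    size_t len = strlen(pattern);
    if (out_size < len + 1)
        return -1;
    for (size_t i = 0; i < len; i++)
        out[i] = complement_base(pattern[len - 1 - i]); /* walk the pattern backwards */
    out[len] = '\0';
    return 0;
}
```

With the pattern from this thread, reverse_complement("TCGTGACGGTA", buf, sizeof buf) fills buf with TACCGTCACGA, which agrees with the hand-written pat2.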

    Example for hg19 fastas ...

    -bash-3.00$ for i in *.fa; do cat $i | ./patmatch; done
    R >chr10 at 111870831
    F >chr11 at 36061863
    R >chr11 at 77190239
    R >chr12 at 119747880
    R >chr14 at 81206117
    R >chr14 at 95419269
    R >chr16 at 11844841
    F >chr16 at 78553508
    R >chr17 at 45266108
    F >chr1 at 17428420
    F >chr1 at 52442586
    F >chr2 at 25065131
    R >chr2 at 53867779
    R >chr2 at 114616666
    F >chr3 at 55121176
    F >chr4 at 1412897
    R >chr4 at 136465390
    R >chr5 at 84661141
    R >chr5 at 103058499
    F >chr7 at 2875412
    R >chr9 at 75467056
    Last edited by Richard Finney; 02-05-2013, 11:28 AM.
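
Since the reported positions are 0-based, a small conversion helps when comparing hits with a genome browser or writing them out as BED. The sketch below uses hypothetical names (struct hit_coords, convert_hit) that are not from the posted program. It relies only on the usual conventions that browser coordinates are 1-based and inclusive, while BED intervals are 0-based with an exclusive end. The position printed for an R hit is also the leftmost forward-strand base of the match, so the same arithmetic applies to both strands.

```c
#include <assert.h>

/* One hit expressed in two common coordinate conventions (hypothetical type). */
struct hit_coords {
    long one_based_start; /* 1-based, inclusive */
    long one_based_end;   /* 1-based, inclusive */
    long bed_start;       /* BED chromStart: 0-based, inclusive */
    long bed_end;         /* BED chromEnd: 0-based, exclusive */
};

/* Converts the 0-based start printed by the program into both conventions. */
struct hit_coords convert_hit(long zero_based_start, long pattern_len)
{
    assert(zero_based_start >= 0 && pattern_len > 0);
    struct hit_coords c;
    c.one_based_start = zero_based_start + 1;
    c.one_based_end   = zero_based_start + pattern_len;
    c.bed_start       = zero_based_start;
    c.bed_end         = zero_based_start + pattern_len;
    return c;
}
```

For the hit "F >chr1 at 17428420" and the 11 bp pattern, this gives 17428421 to 17428431 in 1-based coordinates and the BED interval 17428420 to 17428431.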

  • rboettcher
    replied
    Hi HSV-1,

    You can use bowtie with option -c TCGTGACGGTA
    and specify that you want to output all alignments via option -a.
    However, in this case you also need to specify that only perfect matches are allowed via -v 0.

    See http://bowtie-bio.sourceforge.net/manual.shtml for more details.

    Regards
    Last edited by rboettcher; 02-05-2013, 07:48 AM.

  • How to count the occurrences of a certain 11 bp sequence

    Hi, all
    I want to count the occurrences of an 11 bp sequence (for example: tcgtgacggta) in the human and mouse genomes, and I also need to know the positions where this sequence is located. This 11 bp sequence is a potential motif. First I need to know how many copies of this sequence there are in the human and mouse genomes, and then where they are.

    Is there any tool for this?

    Thanks.
