  • Dagga
    replied
    Thanks!! I appreciate it!

  • TiborNagy
    replied
    Just for you :-)
    Code:
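    #!/usr/bin/perl
    use strict;
    use warnings;

    # Reads FASTA from the files named on the command line (or from STDIN).
    # For each record that contains a run of N's, prints the first N-free
    # stretch to STDOUT as "<header>.1" and the second to STDERR as
    # "<header>.2"; any further stretches are ignored.
    # Records without N's produce no output.

    my $id  = "";
    my $seq = "";

    while(<>){
       chomp;

       if(/^>/){
          # New header: flush the previous record, then start a new one.
          split_record($id, $seq) if $seq ne "";
          $seq = "";
          $id = $_;
       }
       else{
          # Sequence line: append it to the current record.
          $seq .= $_;
       }
    }

    # Flush the last record in the file.
    split_record($id, $seq);

    sub split_record {
       my ($id, $seq) = @_;
       if($seq =~ /([^N]+)N+([^N]+)/){
          print "$id.1\n$1\n";
          print STDERR "$id.2\n$2\n";
       }
    }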
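
The script above splits each record at its first run of N's only, and records without N's are not printed. For genomes with several gaps per contig, something along the following lines may help. It is a minimal Python sketch, not the method used in this thread, and it assumes FASTA input, that every run of N or n marks a gap, and that the pieces should be numbered .1, .2, ... after the original header line. The function name split_contigs is hypothetical.

```python
import re


def split_contigs(lines):
    """Yield (header, sequence) pairs, splitting each record at runs of N/n.

    Assumption: records without N's are passed through with their header
    unchanged; records that are split get ".1", ".2", ... appended.
    """

    def emit(header, seq):
        # Split on runs of N (either case); drop empty pieces that arise
        # from leading or trailing N's.
        pieces = [p for p in re.split(r"[Nn]+", seq) if p]
        if len(pieces) == 1:
            yield header, pieces[0]
        else:
            for i, piece in enumerate(pieces, start=1):
                yield f"{header}.{i}", piece

    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # New header: flush the previous record first.
            if header is not None:
                yield from emit(header, "".join(chunks))
            header, chunks = line, []
        else:
            chunks.append(line)
    # Flush the last record.
    if header is not None:
        yield from emit(header, "".join(chunks))


# Demo on the example contig from the original question.
example = [">contig1", "ATCGGATAANNNNNNNNNATCGCCGAT"]
for name, seq in split_contigs(example):
    print(name)
    print(seq)
```

To run it on a real assembly, pass an open file handle instead of the example list, for instance split_contigs(open("input.fasta")), where input.fasta is a placeholder file name. All pieces go to a single output stream here, rather than being divided between STDOUT and STDERR as in the Perl script.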

  • Dagga
    replied
    Excellent!

    Whilst this does help with some genomes that I am assembling right now, we have some older genomes that were sequenced by BGI and these contain N's that we still need to have removed...

  • mastal
    replied
    If you are doing your assemblies with velvet, setting '-scaffolding no' will stop velvet from joining contigs together with stretches of Ns.

  • Dagga
    replied
    TiborNagy: Sorry, the file will be in FASTA format after de novo assembly.

    Would you be able to alter the script to handle contig names, please?

    Thanks!

  • TiborNagy
    replied
    mastal: you are right!
    Dagga: This script does not handle the contig names, only the sequences, because you did not tell us what input format you have.
    Last edited by TiborNagy; 02-18-2014, 05:42 AM.

  • Dagga
    replied
    Thanks for that!!

    Will this rename the contigs?

    Will the contig that is split be called the same thing in contig1.txt and contig2.txt?

    Is it possible to rename the contigs when they are split? For example, if contig 84 is split into two contigs, can the two halves be renamed contig 84.1 and contig 84.2, respectively?

  • mastal
    replied
    Should that be:

    Code:
    print stderr $2

  • TiborNagy
    replied
    perl -ne 'if($_ =~ /([^N]+)N+([^N]+)/){print $1;print stderr $2}' input.seq >contig1.txt 2>contig2.txt

    This splits each sequence in the input file (input.seq) at its first run of N's, writing the part before the N's to contig1.txt and the part after them to contig2.txt.

  • Dagga
    started a topic Remove N's and split contigs

    Remove N's and split contigs

    Hi,

    I have some genomes that I will be uploading to NCBI soon. I have been told that all N's need to be removed and the contigs split at this position.

    I am new to the command-line interface, so I was hoping someone could recommend a program or a simple script that could do this for me. I would like to remove all N's and split each contig at the location of the N's, resulting in two new contigs. For example:

    Contig 1: ATCGGATAANNNNNNNNNATCGCCGAT

    Contig 1.1: ATCGGATAA

    Contig 1.2: ATCGCCGAT


    Thanks!
