-
Just for you :-)
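Code:
#!/usr/bin/perl
# Splits each FASTA record at its first run of Ns.
# The part before the Ns is printed to STDOUT as "<id>.1",
# the part after the Ns is printed to STDERR as "<id>.2".
# Records without Ns, and anything after a second run of Ns, are not printed.
$seq = "";
while(<>){
    chomp;
    if(/^>/){
        # New header line: write out the previous record first.
        if($seq ne ""){
            if($seq =~ /([^N]+)N+([^N]+)/){
                print "$id.1\n$1\n";
                print STDERR "$id.2\n$2\n";
            }
        }
        $seq = "";
        $id = $_;
    }
    else{
        # Sequence line: append it to the current record.
        $seq .= $_;
    }
}
# Write out the last record.
if($seq =~ /([^N]+)N+([^N]+)/){
    print "$id.1\n$1\n";
    print STDERR "$id.2\n$2\n";
}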
-
Excellent!
Whilst this does help with some genomes that I am assembling right now, we have some older genomes that were sequenced by BGI and these contain N's that we still need to have removed...
-
If you are doing your assemblies with velvet, setting '-scaffolding no' will stop velvet from joining contigs together with stretches of Ns.
-
TiborNagy: Sorry, the file will be in FASTA format after de novo assembly.
Would you be able to alter the script to handle contig names, please?
Thanks!
-
Thanks for that!!
Will this rename the contigs?
Will the contig that is split be called the same thing in contig1.txt and contig2.txt?
Is it possible to rename the contigs when they are split? For example, if contig 84 is split into two contigs, can the two halves be renamed contig 84.1 and contig 84.2, respectively?
-
perl -ne 'if($_ =~ /([^N]+)N+([^N]+)/){print "$1\n";print STDERR "$2\n"}' input.seq >contig1.txt 2>contig2.txt
For each line of the input file (input.seq), it will write the part before the N's to contig1.txt and the part after the N's to contig2.txt.
-
Remove N's and split contigs
Hi,
I have some genomes that I will be uploading to NCBI soon. I have been told that all N's need to be removed and the contigs split at those positions.
I am new to the command-line interface, so I was hoping someone could recommend a program and a simple script that could do this for me. I would like to remove all N's and split each contig at the location of the N's, resulting in two new contigs. For example:
Contig 1: ATCGGATAANNNNNNNNNATCGCCGAT
Contig 1.1: ATCGGATAA
Contig 1.2: ATCGCCGAT
Thanks!
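For anyone who prefers Python to the Perl snippets in this thread, the split asked for above could be sketched roughly as follows. This is only a sketch under a few assumptions that are not from the thread: every run of N (upper or lower case) is treated as a gap, contigs without any N are kept under their original name, and the pieces are named by appending .1, .2, ... to the whole header line. The function names `read_fasta` and `split_contig` are made up for illustration.

```python
import re
import sys


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_contig(header, seq):
    """Split one contig at every run of N/n and number the pieces .1, .2, ..."""
    if not re.search(r"[Nn]", seq):
        # Assumption: contigs without Ns are kept under their original name.
        return [(header, seq)]
    # Drop empty pieces caused by Ns at the very start or end of the contig.
    pieces = [p for p in re.split(r"[Nn]+", seq) if p]
    return [(f"{header}.{i}", piece) for i, piece in enumerate(pieces, start=1)]


def main(path):
    """Print every (possibly split) contig of the FASTA file at `path`."""
    with open(path) as handle:
        for header, seq in read_fasta(handle):
            for new_header, piece in split_contig(header, seq):
                print(f">{new_header}\n{piece}")


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Saved under a name of your choice (say split_contigs.py, a hypothetical name), it could be run as `python split_contigs.py input.fasta > split.fasta`. Unlike the Perl versions, all pieces go to a single output file, and a contig with several gaps gives more than two pieces. Each piece is written on one line, so re-wrap the sequences afterwards if your submission tool expects a fixed line width.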