-
Good point, gringer! However, because I have already FLASH-stitched the reads, I expect the sequence I'm looking for to be in the same orientation in all reads. Still, I will check whether I have any potholes. Also, I chose 14 nt because my samples are from a bacterium with a genome size of 4.2 million bp, so I expect any sequence of 12 nt or longer to be unique.
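A rough sanity check of that uniqueness expectation can be sketched as follows. It assumes a random genome with equal base frequencies, which real bacterial genomes only approximate (repeats and GC skew will change the numbers), and the function name `expected_chance_hits` is purely illustrative.

```python
def expected_chance_hits(k: int, genome_size: int, both_strands: bool = True) -> float:
    """Expected number of chance occurrences of one specific k-mer.

    Assumption: a random genome with equal A/C/G/T frequencies, so each
    position matches a given k-mer with probability 1 / 4**k.
    """
    positions = genome_size - k + 1      # possible start positions on one strand
    if both_strands:
        positions *= 2                   # the reverse strand can match as well
    return positions / 4 ** k


if __name__ == "__main__":
    for k in (10, 12, 14):
        print(k, expected_chance_hits(k, 4_200_000))
```

Under these assumptions a specific 12-mer is expected to turn up by chance about 0.5 times in a 4.2 million bp genome (counting both strands), while a 14-mer is expected about 0.03 times, so 14 nt looks like the safer cutoff of the two.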
-
Originally posted by PoorSeq:
bioawk -c fastx '/SEQUENCE/ {print "@"$name; print $seq; print "+"; print $qual }' input.fq > output.fq
Note that what you've got there won't work for reverse complement orientation, so you'll need to have both forward and reverse included. Also, picking any 14+nt substring of SEQUENCE (or its reverse-complement) will be a bit trickier to implement.
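One way to handle both points (either orientation, and any substring of at least 14 nt) is to index every k-mer of the target and of its reverse complement, then keep a read if any of its k-mers is in that index. The following is a minimal sketch, not a tested tool: it assumes plain four-line FASTQ records (no wrapped sequence lines), exact matches only, and all function names are made up for illustration.

```python
import io

# Complement table; N is kept as N.
_COMP = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMP)[::-1]


def kmers(seq, k):
    """Set of all substrings of length k (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_probes(target, k=14):
    """All k-mers of the target and of its reverse complement."""
    target = target.upper()
    return kmers(target, k) | kmers(revcomp(target), k)


def read_fastq(handle):
    """Yield (header, seq, plus, qual) tuples; assumes 4 lines per record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def filter_reads(handle, target, k=14):
    """Yield records whose sequence shares at least one k-mer with the probes."""
    probes = build_probes(target, k)
    for header, seq, plus, qual in read_fastq(handle):
        s = seq.upper()
        # Only the sequence line is searched, never the name or quality line.
        if any(s[i:i + k] in probes for i in range(len(s) - k + 1)):
            yield header, seq, plus, qual
```

To write a new file, loop over `filter_reads(open("input.fq"), "SEQUENCE")` (with the real target in place of SEQUENCE) and print the four fields of each record on separate lines. Sharing any k-mer with the target is the same as containing a substring of the target that is at least k nt long, so this covers the "part of that sequence" case for exact matches; mismatches would need an aligner instead.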
-
Thanks all for the answers, especially dariober for the code. I later also put together a bioawk one-liner that worked for me:
bioawk -c fastx '/SEQUENCE/ {print "@"$name; print $seq; print "+"; print $qual }' input.fq > output.fq
Yes, my aim is to collect all the reads that contain a sequence (or a subsequence of at least 14 nt) and write them to a new file.
-
Originally posted by PoorSeq:
I'm pretty new to bioinformatics and sorry if it's not worthy of asking here.
I have a FLASH stitched fastq file from my paired end data, from which I want to sort the reads containing a particular sequence or part of that sequence in any orientation, and make a new fastq file with them. Is there any easy tool/code to do that?
Code:
## Get reads containing substring AAA or its revcomp TTT.
## [^\t]* keeps the match inside the sequence field (2nd column after paste),
## so hits in the read name or quality string are ignored.
gunzip -c fastq.fq.gz \
| paste - - - - \
| grep -P '^@[^\t]*\t[^\t]*(AAA|TTT)[^\t]*\t\+' \
| tr '\t' '\n' \
| gzip > sub.fq.gz

## Example input fastq:
@seq1
ACTGAAACTG
+comment
IIIIIIIIII
@seq2
ACTGNNNCTGTTT
+comment
BBBBBBBBBBBBB
@seq3
CCCCCCCCCCCCC
+comment
BBBBBBBBBBTTT
@seq4
AAACCCCCCCCCC
+comment
BBBBBBBBBBTTT

## Output sub.fq.gz
@seq1
ACTGAAACTG
+comment
IIIIIIIIII
@seq2
ACTGNNNCTGTTT
+comment
BBBBBBBBBBBBB
@seq4
AAACCCCCCCCCC
+comment
BBBBBBBBBBTTT
-
If you are trying to get reads that align to a particular sequence, try bowtie2. Not sure if that's your purpose though?
-
I have a FLASH stitched fastq file from my paired end data, from which I want to sort the reads containing a particular sequence or part of that sequence
-
how to parse reads containing a particular sequence in any orientation
I'm pretty new to bioinformatics and sorry if it's not worthy of asking here.
I have a FLASH stitched fastq file from my paired end data, from which I want to sort the reads containing a particular sequence or part of that sequence in any orientation, and make a new fastq file with them. Is there any easy tool/code to do that?
Last edited by PoorSeq; 10-30-2013, 10:25 PM.