Concatenation of a large number of FASTA files

woa

Member

Join Date: Mar 2011

Posts: 13
#1

11-22-2023, 12:12 PM

Hello All,

I'm a beginner in metaproteomics. I have a very large collection of FASTA files (around 100K) that I want to join into a single FASTA file. Note that some of the files contain a very large number of sequences (~2 million, a whole taxonomic family of organisms).

The total size of the sequences is 140 GB.

I have access to a university high-performance computing cluster. I'm wondering whether a simple command like "cat *.fasta > Joined.faa" will work efficiently for this volume of data, or whether I need a better method.

Eventually I want to run CD-HIT on the concatenated sequence file.
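
One possible concern with the bare glob: with around 100K files, "cat *.fasta" may exceed the shell's argument-length limit and fail with "Argument list too long". A minimal sketch of a workaround is below, assuming GNU find, sort and xargs (typical on Linux clusters) and that all inputs sit directly in one directory. The function name concat_fasta and its two arguments are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: join every *.fasta file in one directory into a single file.
# Assumes GNU find/sort/xargs; file names are passed NUL-separated, so spaces are safe.
concat_fasta() {
    src_dir="$1"    # directory holding the input *.fasta files (assumed flat layout)
    out_file="$2"   # output path; must not itself match *.fasta inside src_dir

    # find streams the names instead of expanding them on one command line,
    # and xargs splits them into as many cat invocations as needed.
    # sort -z makes the concatenation order reproducible.
    find "$src_dir" -maxdepth 1 -type f -name '*.fasta' -print0 \
        | sort -z \
        | xargs -0 cat -- > "$out_file"
}

# Example call (paths are placeholders):
# concat_fasta /path/to/fasta_dir /path/to/Joined.faa
```

cat copies bytes as they are, so if any input file lacks a final newline, its last sequence line would run into the next file's ">" header. If that might apply, replacing "cat --" with "awk 1" in the sketch should add the missing newline, at some cost in speed. Comparing the output of grep -c '^>' on the joined file with the expected number of sequences is a cheap sanity check before running CD-HIT.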