  • archie.chauhan
    Junior Member
    • Nov 2011
    • 9

    FastX tool for removing duplicates

    Hi,
    I have gone through various SEQanswers posts on duplicate removal but could not find the answer I need. I am a molecular biologist new to bioinformatics, so I have a few questions.
    I have Illumina DNA 2x100 paired-end reads. FastQC analysis indicated a large number of duplicates, which seems to be correct. Since the dataset is very big, I wanted to remove the duplicates, so I used Galaxy: first FASTQ Groomer, then FASTX Collapse on the R1 and R2 reads separately. My plan was to first remove duplicates, then filter and trim my sequences, and finally assemble them with Velvet. As far as I know, Velvet requires the paired-end reads to be shuffled prior to assembly. I therefore have a few questions about my approach:
    1) The FASTX Collapse tool gives the sequences its own headers, so it seems that the paired-end information is lost. Am I right, or have only the headers changed while the information is still there? If so, where is it?
    2) I ran grooming and FASTX Collapse on the R1 and R2 reads separately. Should I instead first shuffle my reads with Velvet and then run FASTX Collapse on the shuffled sequences, or
    3) should I first join the paired-end data and then run the FASTX tool? In that case, how do I do the shuffling with Velvet?

    I would appreciate it if someone could answer these queries.

    Regards,
    Archana
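
To make questions 1 to 3 concrete, here is a rough Python sketch of what pair-aware duplicate removal could look like. It is not how FASTX Collapse works and it is not part of Galaxy or Velvet. It assumes that R1 and R2 list the reads in the same order, that each FASTQ record spans exactly four lines, and that a "duplicate" means a pair whose R1 and R2 sequences both match an earlier pair exactly. The function names (`read_fastq`, `dedup_pairs`, `write_interleaved`) are made up for illustration.

```python
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream.

    `handle` must be a file-like iterator (an open file or io.StringIO),
    so that each call to islice continues where the previous one stopped.
    """
    while True:
        chunk = list(islice(handle, 4))
        if len(chunk) < 4:
            break  # end of file (or a truncated final record)
        header, seq, _plus, qual = (line.rstrip("\n") for line in chunk)
        yield header, seq, qual


def dedup_pairs(r1_handle, r2_handle):
    """Yield (r1_record, r2_record) for the first occurrence of each pair.

    Assumption: a pair is a duplicate only if BOTH mate sequences are
    identical to those of an earlier pair. Original headers are kept.
    """
    seen = set()
    for rec1, rec2 in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        key = (rec1[1], rec2[1])  # the two sequences, ignoring qualities
        if key in seen:
            continue
        seen.add(key)
        yield rec1, rec2


def write_interleaved(pairs, out_handle):
    """Write pairs as alternating R1/R2 records; return the number kept."""
    kept = 0
    for rec1, rec2 in pairs:
        for header, seq, qual in (rec1, rec2):
            out_handle.write(f"{header}\n{seq}\n+\n{qual}\n")
        kept += 1
    return kept
```

Because both mates are read in lockstep and written next to each other with their original headers, the pairing survives and the output is already interleaved, so no separate shuffling step would be needed afterwards. Note that the sketch keeps every distinct sequence pair in memory, which may be a problem for a very large dataset.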
