  • splitting big paired fastq files

    Hi there,
    I do my 'bioinformatic' work in CLC. I now have many (30) large files of paired-end reads (~10 GB per direction), and my computer stalls if I try to use them all in a de novo assembly. Hence, I am looking for a tool to split each file into, say, 4 parts.
    I am afraid I am not familiar with the Linux world, so I am looking for scripts (preferably R, or Perl) that would solve this.

    Thank you.
    jd

  • #2
    If you split your fastq, you aren't going to get a good assembly. You really want a computer with more memory, so it can handle the whole fastq.

    If you really need to split it, use unix built-in programs.

    Code:
    split -l 40000000 myfastq.fq
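    This should split it into separate files of 40,000,000 lines each, i.e. 10 million reads at four lines per read. For paired data, run it with the same line count on both the forward and the reverse file so that mates end up in matching chunks.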
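
For anyone who cannot use `split` (for example on Windows), the same idea can be sketched in Python. This is a minimal sketch rather than a tested tool: it assumes uncompressed, four-line FASTQ records with R1 and R2 listing mates in the same order, and the function name `split_paired_fastq`, the `out_prefix` argument and the output file naming are made up for illustration.

```python
from itertools import islice, zip_longest


def read_records(handle):
    """Yield FASTQ records as lists of four lines (assumes 4-line records)."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def split_paired_fastq(r1_path, r2_path, reads_per_chunk, out_prefix="chunk"):
    """Split two paired FASTQ files in step, so mates share a chunk number.

    Hypothetical helper: returns a list of (R1 chunk, R2 chunk) file names.
    """
    chunks = []
    out1 = out2 = None
    try:
        with open(r1_path) as r1, open(r2_path) as r2:
            # Walk both files one record at a time; memory use stays small.
            pairs = zip_longest(read_records(r1), read_records(r2))
            for count, (rec1, rec2) in enumerate(pairs):
                if rec1 is None or rec2 is None:
                    raise ValueError("R1 and R2 hold different numbers of reads")
                if count % reads_per_chunk == 0:
                    # Start a new pair of output files.
                    if out1 is not None:
                        out1.close()
                        out2.close()
                    part = count // reads_per_chunk + 1
                    names = (f"{out_prefix}_{part:03d}_R1.fastq",
                             f"{out_prefix}_{part:03d}_R2.fastq")
                    out1, out2 = open(names[0], "w"), open(names[1], "w")
                    chunks.append(names)
                out1.writelines(rec1)
                out2.writelines(rec2)
    finally:
        if out1 is not None:
            out1.close()
            out2.close()
    return chunks
```

Calling `split_paired_fastq("sample_R1.fastq", "sample_R2.fastq", 10_000_000, "sample")` (hypothetical file names) would write `sample_001_R1.fastq`, `sample_001_R2.fastq`, and so on, with 10 million read pairs per pair of files. Reading both files in lockstep is what keeps the pairing intact; splitting each file with a different chunk size would break it.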


    • #3
      Thank you for your prompt reply!
      There are 150-200 million reads in each of the paired fastq files, and I just expected that to be quite redundant.
