Concatenate several SRA reads to a single fastq file

  • Concatenate several SRA reads to a single fastq file

    Hi all,

    I want to re-analyse a dataset available in GEO:

    I've downloaded all the files; however, it seems that each replicate/experiment has several files corresponding to several runs. The question is: how do I concatenate these runs once they are converted into fastq? Would a simple cat command work?

    Also, is this a silly thing to do? (I'm a newbie) Ultimately my goal is to determine gene expression changes.

    Cheers.

  • #2
    Originally posted by krespim
    I've downloaded all the files, however it seems that each replicate/experiment has several file corresponding to several runs. The question is: how to concatenate these runs once they are converted into fastq? Would a simple cat command work?

    I'm sorry, why exactly do you want to cat all these files? A simple

    cat *.fastq > consolidated_fastq.fq

    will be fine, but what's your need for doing so? Every single run usually corresponds to a different sample, so why merge them all?
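For anyone who does end up merging runs of the same sample, here is a minimal sketch in POSIX shell. It assumes uncompressed FASTQ files with four lines per record; `concat_runs` and the SRR file names are made up for illustration. Listing the runs explicitly instead of using a glob keeps the order under your control, which matters for paired-end data, where the _1 and _2 files have to be concatenated in the same run order.

```shell
# Concatenate the FASTQ files of several runs into one file.
# Usage: concat_runs OUTPUT RUN1.fastq RUN2.fastq ...
# (hypothetical helper, not part of any toolkit)
concat_runs() {
    out=$1
    shift
    # '>' overwrites, so running this twice does not append a second copy.
    cat "$@" > "$out"
    # Sanity check: with 4 lines per record the total must be a multiple of 4.
    lines=$(($(wc -l < "$out")))
    if [ $((lines % 4)) -ne 0 ]; then
        echo "warning: $out has $lines lines, not a multiple of 4" >&2
        return 1
    fi
    echo "$out: $((lines / 4)) reads"
}

# Example with placeholder run names for one sample:
# concat_runs sample1.fq SRR0000001.fastq SRR0000002.fastq SRR0000003.fastq
```

Give the output a name that no input glob can match (for example a .fq output next to *.fastq inputs), so the output is never read as one of its own inputs.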

    Originally posted by krespim
    Also, is this a silly thing to do? (I'm a newbie) Ultimately my goal is to determine gene expression changes.

    Cheers.

    Process the files separately, then run comparative studies on the SAM/BAM files!

    • #3
      Originally posted by arkal
      will be fine, but what's your need for doing so? Every single run usually corresponds to a different sample, so why merge them all?

      Well, this is actually my main issue, as I don't know if each run is a different sample or the same sample run in multiple lanes. The sample GEO page lists 3 SRA files.

      The paper does not mention biological or technical replicates.

      • #4
        Originally posted by krespim
        Well, this is actually my main issue, as I don't know if each run is a different sample or the same sample run in multiple lanes. The sample GEO page lists 3 SRA files.

        The paper does not mention biological or technical replicates.

        It seems to be the same sample in different lanes/runs, so you can either merge the FASTQ files and align, or align the 3 separately and merge the SAM files. I recommend the latter, as it will take less time (provided you have the resources to align them in parallel)!
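The second option (align each run separately, then merge) could be sketched as below. This assumes a POSIX shell and samtools on the PATH. `align_one` is a hypothetical placeholder, not a real tool, and has to be filled in with the command line of whichever aligner you use. Each run is aligned as a background job, `wait` collects the jobs, and the per-run BAM files are merged with `samtools merge`.

```shell
# Hypothetical placeholder: replace the body with your aligner's command.
# It must read the FASTQ in $1 and write a coordinate-sorted BAM to $2.
align_one() {
    echo "align_one is a placeholder; edit it for your aligner ($1 -> $2)" >&2
    return 1
}

# Thin wrapper around: samtools merge OUT.bam IN1.bam IN2.bam ...
merge_bams() {
    out=$1
    shift
    samtools merge "$out" "$@"
}

# Usage: align_and_merge MERGED.bam RUN1.fastq RUN2.fastq ...
align_and_merge() {
    merged=$1
    shift
    pids=""
    # Start one background alignment per run.
    for fq in "$@"; do
        align_one "$fq" "${fq%.fastq}.bam" &
        pids="$pids $!"
    done
    # Wait for every job and stop if any of them failed.
    for pid in $pids; do
        wait "$pid" || { echo "alignment job $pid failed" >&2; return 1; }
    done
    # Turn the argument list of FASTQ names into the matching BAM names:
    # each pass appends one BAM name and drops one FASTQ name.
    for fq in "$@"; do
        set -- "$@" "${fq%.fastq}.bam"
        shift
    done
    merge_bams "$merged" "$@"
}

# Example with placeholder run names:
# align_and_merge sample1.bam SRR0000001.fastq SRR0000002.fastq SRR0000003.fastq
```

Running the jobs in the background only saves time if the machine has enough cores and memory for all of them at once; otherwise running them one after another gives the same merged result.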

        • #5
          Originally posted by arkal
          It seems to be the same sample in different lanes/runs, so you can either merge the FASTQ files and align, or align the 3 separately and merge the SAM files. I recommend the latter, as it will take less time (provided you have the resources to align them in parallel)!

          I have recently been granted access to a server with a very decent number of processors, so that should be easy.

          Thanks a lot arkal!

          • #6
            No worries, all the best!
