
  • Problem working with Illumina paired-end sequence data

    I'm new to SEQanswers. I have Illumina paired-end sequence data. After separately removing low-quality sequences, duplicate sequences, and sequences containing human DNA from each file, the forward and reverse files contain different numbers of sequences. This prevents me from doing further analysis. Next, I want to use seq2amos.pl to convert the paired-end sequence data to an .afg file, and then use AMOScmp-shortReads to assemble the short reads.

    Does anybody know of software, or have a script, that can fix this?

    Any help is much appreciated. Thank you.

  • #2
    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc



    • #3
      Here is the script that I used successfully to remove the unpaired reads from paired-end reads. Hope this helps.
      Attached Files



      • #4
        Dear upendra_35,

        Thanks for your help. I downloaded your script and changed .fq to .fa, because I had already used FastX to convert the fastq files to fasta files. I want the output to be .1.fasta and .2.fasta files. When I run "$ perl PE_match.pl --pe1 BVCN4.1.fa --pe2 BVCN4.2.fa", I get:
        "Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337.
        Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337.
        Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337."

        Could you help me figure it out? I have no experience with Perl.



        • #5
          Originally posted by yangfangisok
          Dear upendra_35,

          Thanks for your help. I downloaded your script and changed .fq to .fa, because I already used FastX to convert fastq file to fasta file. I want to output .1.fasta and .2.fasta file. When I input "$ perl PE_match.pl --pe1 BVCN4.1.fa --pe2 BVCN4.2.fa", I am told that
          "Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337.
          Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337.
          Use of uninitialized value in print at PE_match.pl line 56, <IN2> line 16723337."

          Could you help me figure it out? I have no experience about perl language.

          I forgot to mention that this script is intended to work with Illumina versions earlier than 1.8, and only with fq files. You are better off running it on your fq files rather than the converted fa files.
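
For context on the version remark: read names from Illumina pipelines before 1.8 typically end in /1 or /2, while 1.8 and later put the mate number in a separate field after a space. Any tool that matches mates by name has to reduce both styles to a common key. A minimal sketch in Python, assuming those two header styles (this is not the attached Perl script, and `pair_key` is a hypothetical helper name):

```python
import re


def pair_key(header: str) -> str:
    """Return the part of a read header that both mates share."""
    # Drop the leading '@' (FASTQ) or '>' (FASTA) marker, then keep only
    # the first whitespace-separated field (the read name).
    fields = header.lstrip("@>").split()
    name = fields[0] if fields else ""
    # Older headers mark the mate with a trailing /1 or /2; remove it.
    return re.sub(r"/[12]$", "", name)


old_style = "@HWUSI-EAS100R:6:73:941:1973#0/1"
new_style = "@EAS139:136:FC706VJ:2:2104:15343:197393 1:Y:18:ATCACG"
print(pair_key(old_style))
print(pair_key(new_style))
```

Two reads belong to the same pair when their keys are equal, whichever header style the files use.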



          • #6
            My original fq files are already paired.



            • #7
              Originally posted by yangfangisok
              My original fq files are already paired.

              OK, let me get this right. Your original fq files are paired; you then passed them separately through quality control and found that, after QC, the two fq files contain different numbers of reads. Right? What I meant earlier was to run my script on the paired-end fq files after QC. You will end up with paired-end fq files that contain the same number of reads, labelled _matched_s_1.fq and _matched_s_2.fq. If you want to keep the unpaired reads separately, let me know and I can give you another script. Hope this helps.



              • #8
                Thanks for your reply. Let me clarify. Starting from the fq files, I first removed low-quality sequences and wrote the output as fa files. Then I removed duplicate sequences and sequences containing human DNA from the fa files. I ended up with two fa files containing different numbers of sequences. I want to remove the unpaired sequences from both files and output two fa files with the same number of sequences.
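
The task described here, keeping only the reads whose mate survives in both FASTA files, can be sketched in Python as follows. This is a minimal sketch, not the attached PE_match.pl script. It assumes mates share a read name apart from a trailing /1 or /2 (or a field after a space), and it holds the second file in memory, which may be a concern for very large files. The function names are hypothetical.

```python
import re


def read_fasta(path):
    """Yield (header, sequence) per record; sequences may span several lines."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line, []
            elif header is not None:
                chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def pair_key(header):
    """Read name without the '>' marker, comment field, or /1 and /2 suffix."""
    fields = header[1:].split()
    name = fields[0] if fields else ""
    return re.sub(r"/[12]$", "", name)


def match_pairs(fa1, fa2, out1, out2):
    """Write reads present in both files, in file-1 order; return the pair count."""
    # Index the second file by pair key so each mate lookup is O(1).
    mates = {pair_key(h): (h, s) for h, s in read_fasta(fa2)}
    kept = 0
    with open(out1, "w") as o1, open(out2, "w") as o2:
        for h1, s1 in read_fasta(fa1):
            mate = mates.get(pair_key(h1))
            if mate is None:
                continue  # no surviving mate in file 2: drop this read
            o1.write(f"{h1}\n{s1}\n")
            o2.write(f"{mate[0]}\n{mate[1]}\n")
            kept += 1
    return kept
```

With the file names from this thread, a call might look like `match_pairs("BVCN4.1.fa", "BVCN4.2.fa", "BVCN4.1.fasta", "BVCN4.2.fasta")`; the two output names are only an illustration of the requested .1.fasta and .2.fasta files. Both outputs then contain the same reads in the same order.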
