
  • MG-RAST - header format problem?

    Hi everyone,

    After successfully using MG-RAST with assembled data, I am trying to use it with raw Illumina FASTQ reads. I originally had two FASTQ files (forward and reverse paired ends); I merged them into one file and uploaded it.

    I now receive an error message:
    Warning: The unique id count does not match the sequence count. You will not be able to use this file for submission.

    Basically the unique id count is half the number of sequences.
    My reads are ordered as forward and reverse with the following format:

    @HWUSI-EAS1700R:25:FC:6:1:12466:1106 1:Y:0:TTAGGC

    and

    @HWUSI-EAS1700R:25:FC:6:1:12466:1106 2:Y:0:TTAGGC

    My guess is that I may need to modify the header. Any suggestion?

    Thanks
    Max

    - Edit: I should be able to modify the header myself (I know a little Python), but I am not sure whether that is the problem or what the header should look like.
    Thanks again
    Max
    Last edited by mstagliamonte; 06-18-2013, 06:12 AM. Reason: Clarify my request
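
    The mismatch described above can be reproduced with a short sketch. It assumes that the ID is taken to be the header text up to the first whitespace, and that the file uses strict four-line FASTQ records; `read_ids` is a hypothetical helper, and the sequence and quality strings are placeholders, not real data.

```python
def read_ids(fastq_lines):
    """Yield each record's ID: the header text up to the first whitespace, without '@'."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0:  # assumes strict 4-line FASTQ records
            yield line[1:].split()[0]

# Two mates of the same pair, using the headers from the post.
# Sequence and quality lines are made-up placeholders.
records = [
    "@HWUSI-EAS1700R:25:FC:6:1:12466:1106 1:Y:0:TTAGGC", "ACGT", "+", "IIII",
    "@HWUSI-EAS1700R:25:FC:6:1:12466:1106 2:Y:0:TTAGGC", "TGCA", "+", "IIII",
]
ids = list(read_ids(records))
print(len(ids), len(set(ids)))  # 2 sequences, but only 1 unique ID
```

    Under that assumption, both mates collapse to the same ID, which would explain a unique ID count of exactly half the sequence count.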

  • #2
    Only the first part of the header (everything before the space) is being used to identify the read.
    Just replace the space with a "_" or another character.


    Instead of:
    @HWUSI-EAS1700R:25:FC:6:1:12466:1106 1:Y:0:TTAGGC

    have

    @HWUSI-EAS1700R:25:FC:6:1:12466:1106_1:Y:0:TTAGGC

    Something like:

    Code:
    sed 'e/\ /\_/g' seqfile > seqfile_ed
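
    Since the original poster mentions Python, the same header change can be sketched there as well. This is only an illustration, assuming strict four-line FASTQ records; `join_header` and `fix_fastq` are hypothetical names, not part of any library, and only the header line of each record is touched.

```python
def join_header(line):
    """Replace the first space in a FASTQ header with '_' so the read number becomes part of the ID."""
    return line.replace(" ", "_", 1)

def fix_fastq(lines):
    """Yield lines unchanged, except header lines (every 4th line, starting with the first)."""
    for i, line in enumerate(lines):
        yield join_header(line) if i % 4 == 0 else line

# Header taken from the post above.
print(join_header("@HWUSI-EAS1700R:25:FC:6:1:12466:1106 1:Y:0:TTAGGC"))
# @HWUSI-EAS1700R:25:FC:6:1:12466:1106_1:Y:0:TTAGGC
```

    To apply it to a file, something like `dst.writelines(fix_fastq(src))` with `src` and `dst` opened on the input and output files should work, since iterating over a file yields its lines with their newlines intact.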



    • #3
      Thanks,

      I'll try immediately and let you know.

      Regards,
      Max



      • #4
        Sorry, that should have been:

        Code:
        sed 's/ /_/' seqfile > seqfile_ed



        • #5
          Hahaha,

          I noticed.

          I was not able to fix it, so I've just started running my Python script.

          Let's see how it goes.



          • #6
            Hi, Ciaran,

            Many thanks for your advice; it worked.

            Have a nice day
            Max
