  • ximo
    Junior Member
    • Feb 2012
    • 7

    newbler2.6 454 and illumina seq, help

    Hi all,

    I am trying to assemble a transcriptome from 454 and Illumina sequences using newbler 2.6. runAssembly finishes without error and loads the sequences, but no Illumina sequences are aligned, only the 454 ones. It seems that the program does not use the Illumina file in the assembly: in the tests it takes only 6 seconds to finish the analysis. We have tested different parameters and options without success. Could you help me?

    Thanks,

    Ximo

    /runAssembly -cdna -mi 95 -ml 20 illumina_.sfastaq 454_.sfastaq

    454NewblerMetrics.txt: readAlignmentResults
    {
    file
    {
    path = "/illumina_.sfastq";

    numAlignedReads = 0, 0.00%;
    numAlignedBases = 0, 0.00%;
    inferredReadError = 0.00%, 0;

    I tested whether runAssembly can read my Illumina fastq sequences with this run:

    /runAssembly -cdna -mi 95 -ml 50% illumina_test_file.fastaq

    output:

    >Created assembly project directory newbler_test
    >1 read file successfully added.
    > test_100000_ill (Fastq dataset, with standard scores)
    >Assembly computation starting at: Tue Mar 27 12:35:19 2012 (v2.6 (20110517_1502))
    >Indexing/Screening test_100000_ill (with quality scores)...
    > -> 100000 reads, 3668500 bases.
    >Building contigs/isotigs...
    > -> 0 large contigs, 0 all contigs
    > -> 0 isogroups, 0 isotigs
    >Computing signals...
    > -> 0 of 0...
    >Checkpointing...
    >Generating output...
    > -> 0 of 0...
    >Assembly computation succeeded at: Tue Mar 27 12:35:23 2012

    runAssembly can read my sequences (test without the -cdna option):

    /runAssembly -mi 95 -ml 50% -urt illumina_test_file.fastaq

    runAssembly -o newbler_test test_file.100000_ill [12:41:44]

    Output:
    >Created assembly project directory newbler_test
    >1 read file successfully added.
    >test_100000_ill (Fastq dataset, with standard scores)
    >Assembly computation starting at: Tue Mar 27 13:02:27 2012 (v2.6 (20110517_1502))
    >Indexing test_100000_ill (with quality scores)...
    > -> 100000 reads, 3668500 bases.
    > Warning: Suspected 5' primer AAGCAGTGGTATCAACGCAGAGTAC, 15773 exact matches found.
    > Warning: Suspected 5' primer AAGCAGTGGTATCAACGCAGAGTACTTTTTTTTTTTTT, 8227 exact matches found.
    > Warning: Suspected 3' primer GTACTCTGCGTTGATACCACTGCTT, 2397 exact matches found.
    > Warning: Suspected 3' primer AAAAAAAAAAAAAGTACTCTGCGTTGATACCACTGCTT, 8227 exact matches found.
    >Building contigs/scaffolds...
    > -> 0 large contigs, 0 all contigs
    >Computing signals...
    -> 0 of 0...
    >Checkpointing...
    >Generating output...
    > -> 0 of 0...
    >Assembly computation succeeded at: Tue Mar 27 13:02:29 2012
  • colindaven
    Senior Member
    • Oct 2008
    • 417

    #2
    It may not be too helpful, but we used illumina and 454 together in a genome project.

    We used gsAssembler to build 454 contigs, then Velvet to assemble Illumina.

    Afterwards we included mapped fragments <2000bp of the Illumina assembly as fake reads in Newbler.

    It didn't improve results too much however, so we ended up using SSPACE with PE Illumina reads and 454 contigs.
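The post does not say how the fragments were produced. As a minimal sketch of the "fake reads" idea, assuming you simply cut each short-read contig into overlapping pieces below 2000 bp, something like the following could work. The function name `fragment_contig`, the 200-base overlap, and the fragment naming are all hypothetical choices for illustration, not anything taken from the project described above.

```python
def fragment_contig(name, seq, max_len=1999, overlap=200):
    """Split one contig into overlapping pieces no longer than max_len.

    Returns a list of (fragment_name, fragment_sequence) tuples.
    max_len and overlap are illustrative defaults, not newbler settings.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be >= 0 and smaller than max_len")
    if len(seq) <= max_len:
        return [(f"{name}_frag1", seq)]
    step = max_len - overlap
    fragments = []
    start = 0
    while True:
        piece = seq[start:start + max_len]
        fragments.append((f"{name}_frag{len(fragments) + 1}", piece))
        if start + max_len >= len(seq):
            break  # this piece reached the end of the contig
        start += step
    return fragments


# Toy example: a 5000-base contig made of a repeated motif.
frags = fragment_contig("contig1", "ACGTA" * 1000)
print([(n, len(s)) for n, s in frags])
```

The overlap keeps neighbouring fragments joinable by an overlap-based assembler; a suitable overlap for real data would need to be tested.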


    • flxlex
      Moderator
      • Nov 2008
      • 412

      #3
      Looks like something is wrong with your fastq file. Could you post the first few lines (say 12 or 16)?


      • ximo
        Junior Member
        • Feb 2012
        • 7

        #4
        Originally posted by flxlex
        Looks like something is wrong with your fastq file. Could you post the first few lines (say 12 or 16)?

        I have used this file with mira and bwa without problems, but here are the first lines.

        Thanks

        @CUES000161
        AGAGAATCACCTGCTCAGTACAAAAATAATGACGCCCA
        +
        ######################################
        @CUES000162
        AAGCAGTGGCATCAACGCAGAGTACGC
        +
        GG5>3C;AC<DD=DDFFFAD@?79<><
        @CUES000163
        AGATTGTTGCCTGGATTATGATATGATACAATACAAAT
        +
        HHGHHHHGFH?HHHHHH0HHHHADHCHHHHEHGHHH=H
        @CUES000164
        TCTTGTTGTTCGAGTCAATAGGAGCTGTACTCTGTACT
        +
        FEFEFFFEFFEE:<FEE:EEFFFBFFEFF>G:F@=CCE
        @CUES000165
        GATATGTTTGTAGGAATTTTCTTGAACTTTTTACCAAT
        +
        GGGGGGCCCG3FCDD55544GGBBGBGGGGGGGGGGFE
        @CUES000166
        CTTTGCTTCTTCAGTTCAAATTGGAATTTGAGCTCGGA
        +
        C>@AC3CCCCA>.@<[email protected]
        @CUES000167
        ATTGGATATTTTTGTTAAATTATGTTTGTTCCAAAGAT
        +
        HHGHHGGHHHHHEEEEEHHHHHHHHHHHHHHHHHHHGA
        @CUES000168
        TATACTTATGTACAAGACGCTGTTATTGATATTAAATC
        +
        GHHCHHHHHHHHHGGHHFHHGHHE8EDFFFBHHGF1DA
        @CUES000169
        AGAATGTGAACCCACACACACAGCCATTTGGATCACTT
        +
        AEGDGGGGGDGGFEGEGEECG2GCGCCGGGGFGGCGCG
        @CUES000170
        CGCAGTGGTATCAACGCAGAGTACTTTTTTTTTTTTTT
        +
        ?>4?CC8;<F3)A@3DBBD5A459<FF??FBBBBBBBB
        @CUES000171
        TTGGAGGAAAGTTCAGCCATCCCAATAATGAAAGAGAT
        +
        ?FFFDFFDGGDG?FGAGGADDGG=AC5?CC2DDD=AFF
        @CUES000172
        GATGAACATTTTAAAATCTTAATTCCTCCAATTTGGAT
        +
        CCCCCAGAGGGGGGCCGFGGGGGGGGGGGGGGGGGGGG
        @CUES000173
        GGTATGGGTGAGTTTGGTGATCGTTACTTCGGAACTGA
        +
        HHGHHHHHEHEHHHHHHDHBFGGFG@FGGFHHHEHHHE
        @CUES000174
        TTCCAAAGGGGTCGCCTTTTCAATCTCCACCATTCATG
        +
        GGGDDC;CCCGCGEGG?EGCEBBEEGGB7GFBEFG?0D
        @CUES000175
        ATCCAACTGCTGTGGAAGGCCGTCTCCTTTCAGTCAGC
        +
        ==<<;1@9@>=E@EEHHACHHHHHHHHHHDH?HHEHHH
        @CUES000176
        GAGAAGGGTTATCAGATCATGATTCCTTTCTTTGATTG
        +
        BGHGHHGCDHHDFHFGHHHEHAHHHCHHHBHHCCFHHH
        @CUES000177
        TATATTCTTCGGGCAGCCGCCATTAAAGCTTTGGGATC
        +
        FFF?FAF?FFDDGAGDFG?DA=G=5C/=ACGG?.GAA=
        @CUES000178
        AAGCAGTGGTATCAACGCAGAGTACTTTTTTTTTTTTT
        +
        HGHHHHHHHGEHHHHHHHHHHHHHH8=DAD=;<6<>C=
        @CUES000179
        TGAATTTCTATCTACAAACATGAACAATACCAATCTCT
        +
        DDADAD5@@AFFFFF>>;?>GD55A>>>?;:A/AD;A?
        @CUES000180
        AGCAGCCTCCACGTATGAACTCATCGTCACGTTAGATT
        +
        HGGEDHHHHHEHHHHHH>HHCECHHHHHDDHFEBF3<A


        • flxlex
          Moderator
          • Nov 2008
          • 412

          #5
          Sorry, nothing is wrong with your file of course. However, newbler will not recognize it. It expects this header style:

          Read 1:
          Code:
          @EAS139_FC706VJ:2:2104:15343:197393#0/1
          GGGTGATGGCCGCTGCCGATGGCGTCAAATCCCACC
          +
          IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9IC
          Read 2 (in a separate file):
          Code:
          @EAS139_FC706VJ:2:2104:15343:197393#0/2
          CGATGGTCGTTTCGGAAGATGACGTGAATTGCCTGG
          +
          IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9IC
          The /1 and /2 at the end tell newbler the pairing info.

          One solution would be to adjust the header. Alternatively, you could convert your files to fasta + qual files, and include the pairing information in the header, as I explain in my blog post here.
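Adjusting the header can be scripted. The sketch below appends `/1` or `/2` to every FASTQ header line, assuming strict four-line records. The function name `add_pair_suffix` is hypothetical, and the thread does not confirm whether the suffix alone is enough for newbler or whether the rest of the Illumina-style name also matters.

```python
from io import StringIO


def add_pair_suffix(lines, mate):
    """Yield FASTQ lines with '/<mate>' appended to each header line.

    Assumes strict 4-line records; mate must be 1 or 2.
    """
    if mate not in (1, 2):
        raise ValueError("mate must be 1 or 2")
    suffix = f"/{mate}"
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 == 0:  # first line of each record is the header
            if not line.startswith("@"):
                raise ValueError(f"line {i + 1} is not a FASTQ header: {line!r}")
            name = line.split()[0]  # keep only the read name, drop any description
            if not name.endswith(suffix):
                name += suffix
            line = name
        yield line + "\n"


# One record based on the header style shown above, without the mate suffix.
record = StringIO(
    "@EAS139_FC706VJ:2:2104:15343:197393#0\n"
    "GGGTGATGGCCGCTGCCGATGGCGTCAAATCCCACC\n"
    "+\n"
    "IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9IC\n"
)
fixed = "".join(add_pair_suffix(record, 1))
print(fixed, end="")
```

Run it once per file, with `mate=1` for the first file and `mate=2` for the second, so that both mates of a pair share the same name apart from the suffix.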


          • ximo
            Junior Member
            • Feb 2012
            • 7

            #6
            flxlex,

            Thanks for your help and for your useful post, but my reads are not paired-end. Do you know if Newbler works with non paired-end Illumina data?

            Thanks

            Ximo


            • flxlex
              Moderator
              • Nov 2008
              • 412

              #7
              Not if they are that short. Newbler's minimum read length is 50 bases, which I now see is why your 36 base reads did not assemble. You could try setting the -minlen parameter to your read length. But don't try to assemble only the Illumina reads with newbler, it is not built for such short reads...
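A quick way to check whether a read file falls under that 50-base threshold is to tabulate read lengths before running the assembler. The sketch below assumes a strict four-line-per-record FASTQ file; the function names and the constant `NEWBLER_DEFAULT_MIN` are made up for illustration.

```python
from collections import Counter
from io import StringIO

NEWBLER_DEFAULT_MIN = 50  # default minimum read length mentioned above


def read_lengths(handle):
    """Yield the sequence length of each record in a 4-line-per-record FASTQ stream."""
    for i, line in enumerate(handle):
        if i % 4 == 1:  # second line of each record is the sequence
            yield len(line.strip())


def summarize(handle, min_len=NEWBLER_DEFAULT_MIN):
    """Return (length histogram, number of reads shorter than min_len)."""
    hist = Counter(read_lengths(handle))
    too_short = sum(n for length, n in hist.items() if length < min_len)
    return hist, too_short


# Two records copied from the sample posted earlier in the thread.
sample = StringIO(
    "@CUES000161\n"
    "AGAGAATCACCTGCTCAGTACAAAAATAATGACGCCCA\n"
    "+\n"
    "######################################\n"
    "@CUES000162\n"
    "AAGCAGTGGCATCAACGCAGAGTACGC\n"
    "+\n"
    "GG5>3C;AC<DD=DDFFFAD@?79<><\n"
)
hist, too_short = summarize(sample)
print(sorted(hist.items()), too_short)
```

For a real file, pass an open file handle instead of the `StringIO` sample, for example `with open(path) as fh: summarize(fh)`.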


              • wustudybreak
                Junior Member
                • Jan 2012
                • 1

                #8
                454 newbler runMapping alignment

                Hello,

                Does anyone know whether 454 runMapping performs local or global alignment?

                Any information on how its alignment algorithm works would be helpful.

                Thanks


                • ximo
                  Junior Member
                  • Feb 2012
                  • 7

                  #9
                  Originally posted by flxlex
                  Not if they are that short. Newbler's minimum read length is 50 bases, which I now see is why your 36 base reads did not assemble. You could try setting the minlen parameter to your read length. But don't try to assemble the Illumina reads only using newbler, it is not built for such short reads...

                  I have tested this parameter, but I get the same result. When I use the 454 and Illumina seqs together, the assembly runs, but in 454ReadStatus.txt the Illumina seqs are all labeled as TooShort.

                  runAssembly -ml 50% -mi 95 -minlen 15 -o newbler_test test_100000_ill test_100000_454


                  Any suggestion?
                  Thanks


                  • flxlex
                    Moderator
                    • Nov 2008
                    • 412

                    #10
                    Oops... I had forgotten that reads between minlen and 50 bases are only used when there is at least one read dataset that newbler recognizes as paired end (i.e. mate pair, long insert library). In your case, I don't think you can use newbler for your short reads. Perhaps you can assemble the Illumina reads into contigs using something like velvet, and use those contigs as reads for a contigs+454 reads assembly?


                    • ximo
                      Junior Member
                      • Feb 2012
                      • 7

                      #11
                      Ok. Thanks a lot

                      Ximo


                      • flxlex
                        Moderator
                        • Nov 2008
                        • 412

                        #12
                        I just saw that newbler 2.7, which just came out, has a new flag: -short "Force use of reads shorter than 50 bp in projects that don’t include any paired end data. Reads shorter than 50bp are automatically used if any paired-end data is used in the project. The lower limit is 20 bp (or minlen if -minlen is used)."

                        So, I advise you to try to get your hands on this version (through the Roche website)!
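To make the quoted rule easier to follow, here is one possible reading of it as a small decision function. This is only a model of the text quoted above, not newbler's actual logic; the function name `read_is_used` and its parameters are hypothetical, and it does not cover what happens when -minlen is set above 50.

```python
def read_is_used(read_len, has_paired_end_data, short_flag=False, minlen=None):
    """Model the short-read rule quoted above (a reading of the text, not newbler code)."""
    default_min = 50  # default minimum read length mentioned in the thread
    short_floor = 20  # lower limit quoted for newbler 2.7
    if read_len >= default_min:
        return True
    # Shorter reads are only considered with paired-end data or the -short flag.
    if not (has_paired_end_data or short_flag):
        return False
    floor = minlen if minlen is not None else short_floor
    return read_len >= floor


# 36-base single-end reads, as in this thread:
print(read_is_used(36, has_paired_end_data=False))                   # without -short
print(read_is_used(36, has_paired_end_data=False, short_flag=True))  # with -short
```

Under this reading, the 36-base single-end reads from this thread are dropped without -short, which matches the TooShort labels reported in 454ReadStatus.txt, and kept once the flag is given.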
