  • KaiYe
    Senior Member
    • Jun 2009
    • 133

    #16
    Originally posted by Fabrice ODEFREY View Post
    Hi KaiYe,

    I'm working with SOLiD data and would like to use Pindel, but couldn't find anything about it. Is Pindel only for Illumina data?
    Thanks in advance for your reply.
    Fabrice
    hi Fabrice,

    I don't have a procedure for SOLiD data, but I would be glad to explore this together with you.

    First you need to convert the data from color space to sequence space.

    Second, convert the sequences to the correct strand. Pindel assumes the data are paired-end as with Illumina, so that the two reads of a pair face each other rather than lying on the same strand.

    You may then try my sam2pindel.cpp to extract reads and run Pindel.

    Please visit https://trac.nbic.nl/pindel and register as a Pindel user.

    Kai
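For the first step above (color space to sequence space), here is a minimal Python sketch. It assumes csfasta-style reads in which the first character is the primer base and the remaining characters are colors 0-3, and it uses the standard di-base code (with A=0, C=1, G=2, T=3, a color is the XOR of two adjacent bases). The function name `decode_colorspace` is hypothetical and not part of Pindel. Note that naive decoding like this carries a single color error into every downstream base, so in practice it is usually safer to align in color space and let the aligner report base-space sequences.

```python
# Hypothetical helper, not part of Pindel: naive SOLiD color-space decoding.
BASES = "ACGT"  # A=0, C=1, G=2, T=3, so color = previous_base XOR next_base


def decode_colorspace(read):
    """Decode a csfasta-style read such as 'T0123' into base space.

    The leading primer base is used only as the starting point and is
    not included in the returned sequence.
    """
    primer, colors = read[0].upper(), read[1:]
    if primer not in BASES:
        raise ValueError("read must start with a primer base (A, C, G or T)")
    prev = BASES.index(primer)
    decoded = []
    for pos, color in enumerate(colors):
        if color not in "0123":
            # A missing color ('.') makes everything after it undecodable.
            decoded.append("N" * (len(colors) - pos))
            break
        prev ^= int(color)
        decoded.append(BASES[prev])
    return "".join(decoded)
```

With this encoding, `decode_colorspace("T0123")` walks T -> T -> G -> A -> T and returns the four decoded bases.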

    Comment

    • Fabrice ODEFREY
      Member
      • May 2010
      • 21

      #17
      thanks a lot Kai for your quick reply.
      for the second step is there a tool to do that?
      thanks again!
      Fabrice

      Comment

      • KaiYe
        Senior Member
        • Jun 2009
        • 133

        #18
        The second step is rather straightforward, but it requires knowing the strand orientation of your SOLiD reads. Some SOLiD data already satisfy the second requirement without any modification, but other data need an additional conversion step.

        You may need to write a script to do that.

        Kai
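A minimal sketch of such a strand-conversion script, in Python. It only shows the core operation (reverse-complementing a read and reversing its quality string so the two stay aligned). Which mate needs flipping depends on your SOLiD library, which is an assumption you have to check against your own data; the function names are hypothetical.

```python
# Hypothetical helpers: flip a read to the opposite strand.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a base-space sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def flip_read(seq, qual):
    """Flip one read: reverse-complement the bases, reverse the qualities."""
    return reverse_complement(seq), qual[::-1]
```

You would apply `flip_read` to one mate of each pair (whichever one ends up on the same strand as its partner) before extracting reads for Pindel.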

        Comment

        • Fabrice ODEFREY
          Member
          • May 2010
          • 21

          #19
          alright, that's what I thought, thanks!
          Fabrice

          Comment

          • chariko
            Member
            • Jun 2010
            • 56

            #20
            Originally posted by KaiYe View Post
            I will send you my source code via email.
            Hi KaiYe,

            I am having the same problem as jtjli (http://seqanswers.com/forums/showthr...0820#post30820). I did as follows:

            1) Downloaded all files from http://www.ebi.ac.uk/~kye/pindel/v_0.2.0/. I aligned with BWA, processed with samtools and filtered by MAPQ (<30).
            2) Ran bam2pindel.pl on one paired-end sample (aligned using BWA). My BAM file is sorted and duplicates are removed, but it does not have the header expected by your program, so I used -om to force the script to run. One file per chromosome was generated, e.g. myprefix.1.txt (chr1).
            3) Downloaded your source code from SourceForge (with svn) and compiled Pindel from scratch. It seems to work.
            4) Ran the following command:
            /home/Pindel_source_v0.2.2/pindel -f /home/hg19.fa -i /s_4_QC_sort_pind.bam_chr1.txt -o ./s4 -c chr1 empty

            But whichever chromosome I try, I always get "There are no reads for this chromosome":

            BreakDancer events: 0
            Processing chromosome: chr10
            Skipping chromosome: chr10
            ...

            Processing chromosome: chr1
            Chromosome Size: 249250621
            26926 10000
            Looking at chromosome chr1 bases 0 to 10000000.
            BinBorder 0 10000000
            There are no reads for this bin.
            Looking at chromosome chr1 bases 10000000 to 20000000.
            BinBorder 10000000 20000000
            There are no reads for this bin.
            ....
            Loading genome sequences and reads: 0 seconds.
            Mining, Sorting and output results: 0 seconds.

            What am I doing wrong? How did you solve jtjli's problem?
            Last edited by chariko; 05-09-2011, 11:36 PM. Reason: Incomplete

            Comment

            • KaiYe
              Senior Member
              • Jun 2009
              • 133

              #21
              hi,

              You should use -p for extracted reads; -i is for a configuration file.

              Pindel accepts two types of input:
              1. Reads extracted with sam2pindel or bam2pindel, passed with -p
              2. A configuration file listing BAM files, passed with -i
              The format of the configuration file is:
              /path/to/bam_1/BAM_1 400 sample_1
              /path/to/bam_2/BAM_2 400 sample_2
              ...
              /path/to/bam_n/BAM_n 400 sample_n
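As a rough illustration of the two input modes, here is a small Python sketch that formats a configuration file in the layout shown above and builds a command line with either -p or -i. The flags -f, -p, -i, -o and -c are the ones used in this thread; the function names, and my reading of the second and third columns as insert size and sample label, are assumptions.

```python
# Hypothetical helpers for preparing a Pindel run; not part of Pindel itself.

def config_text(entries):
    """Format (bam_path, insert_size, sample_label) tuples, one per line."""
    return "".join(f"{bam} {size} {label}\n" for bam, size, label in entries)


def pindel_command(pindel, reference, output_prefix, chromosome,
                   reads_file=None, config_file=None):
    """Build an argument list using -p (extracted reads) or -i (BAM config)."""
    if (reads_file is None) == (config_file is None):
        raise ValueError("give exactly one of reads_file or config_file")
    if reads_file is not None:
        input_args = ["-p", reads_file]   # output of sam2pindel / bam2pindel
    else:
        input_args = ["-i", config_file]  # configuration file listing BAMs
    return [pindel, "-f", reference, *input_args,
            "-o", output_prefix, "-c", chromosome]
```

For the per-chromosome text file in the question, this gives a command with `-p /s_4_QC_sort_pind.bam_chr1.txt` instead of `-i`. The returned list can be passed to `subprocess.run`, and the string from `config_text` can be written to a file and given to `-i`.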

              You may also use -c chrN:start-end to restrict Pindel to a small region of a chromosome, which lets you parallelize the computation.
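To split a chromosome into chunks for parallel runs, something like the following Python sketch could generate the chrN:start-end strings. The 1-based, inclusive coordinates are an assumption on my part (check how your Pindel version interprets the range), and `chromosome_regions` is a hypothetical name.

```python
# Hypothetical helper: split a chromosome into chrN:start-end region strings.

def chromosome_regions(chrom, length, step):
    """Return consecutive, non-overlapping regions covering 1..length."""
    if step <= 0:
        raise ValueError("step must be positive")
    return [
        f"{chrom}:{start}-{min(start + step - 1, length)}"
        for start in range(1, length + 1, step)
    ]
```

For example, `chromosome_regions("chr1", 249250621, 10_000_000)` yields 25 regions, each of which can be passed to a separate Pindel job via -c.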

              Kai

              Comment

              • icg
                Junior Member
                • Jan 2011
                • 5

                #22
                sam2pindel

                Originally posted by KaiYe View Post
                Would you please inform me your email address? I have cpp code to extract reads from sam files for Pindel.

                Thanks.

                Hi KaiYe,

                I'm trying to convert my BAM file (Illumina single-end reads, aligned using Novoalign) to the Pindel format using sam2pindel, but the output file is empty.

                I used the following command:
                ./sam2pindel novo.sam Output4Pindel.txt 300 test 0

                What am I doing wrong?


                Thanks in advance for your reply,

                Inbar

                Comment

                • KaiYe
                  Senior Member
                  • Jun 2009
                  • 133

                  #23
                  hi Inbar,

                  sam2pindel requires the mate information stored in each record. I guess Novoalign doesn't report that.

                  Can you provide a few lines of SAM records?

                  Kai
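A quick way to check whether a SAM file carries mate information at all is to look at the paired flag (bit 0x1 of the FLAG column) and at the RNEXT/PNEXT columns (columns 7 and 8 in the SAM format). The Python sketch below counts both; the function name is hypothetical, and it splits on any whitespace so it also works on records pasted with spaces instead of tabs.

```python
# Hypothetical helper: summarize mate information in SAM records.

def mate_info_summary(sam_lines):
    """Count records, records flagged as paired, and records with a mate position."""
    total = paired = with_mate_position = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split()
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        rnext, pnext = fields[6], int(fields[7])
        total += 1
        if flag & 0x1:
            paired += 1
        if rnext != "*" and pnext > 0:
            with_mate_position += 1
    return {"total": total, "paired": paired,
            "with_mate_position": with_mate_position}
```

If `paired` and `with_mate_position` are both zero, the file describes single-end alignments, and there is no mate information for sam2pindel to use.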

                  Comment

                  • chariko
                    Member
                    • Jun 2010
                    • 56

                    #24
                    Originally posted by KaiYe View Post
                    hi,

                    You should use -p for extracted reads; -i is for a configuration file.

                    Pindel accepts two types of input:
                    1. Reads extracted with sam2pindel or bam2pindel, passed with -p
                    2. A configuration file listing BAM files, passed with -i
                    The format of the configuration file is:
                    /path/to/bam_1/BAM_1 400 sample_1
                    /path/to/bam_2/BAM_2 400 sample_2
                    ...
                    /path/to/bam_n/BAM_n 400 sample_n

                    You may also use -c chrN:start-end to restrict Pindel to a small region of a chromosome, which lets you parallelize the computation.

                    Kai
                    I finally managed to get it to work. After following your instructions, I also had to change my input files (those generated by bam2pindel): when I compared them with the demo data, mine had only one "@" in each line, whereas the demo data had two. I don't know why that happened, because I obtained those input files with bam2pindel, but anyway it works now.

                    Thanks a lot

                    Comment

                    • icg
                      Junior Member
                      • Jan 2011
                      • 5

                      #25
                      Hi Kai,

                      Thank you for the quick reply!

                      Here are the first 20 lines of my SAM file.

                      Many thanks,
                      Inbar

                      @HD VN:1.0 SO:unsorted
                      @PG ID:novoalign VN:V2.07.05 CL:novoalign -d NC_007530.fna.nix -f output4.fastq -r ALL -o SAM
                      @SQ SN:gi|50196905|ref|NC_007530.2| AS:NC_007530.fna.nix LN:5227419
                      @SQ SN:gi|47566322|ref|NC_007322.2| AS:NC_007530.fna.nix LN:181677
                      @SQ SN:gi|50163691|ref|NC_007323.3| AS:NC_007530.fna.nix LN:94830
                      4:1:1169:930:Y 4 * 0 0 * * 0 0 NAAACAGTGAAGTATATAACGTACATGTCNAANNNNNNNNNNNNNNGNNNNNNNNNANNNNNNNNNNNNNNNNN #+,)-23444@@8@@@@@@@C@@@C################################################# PG:Z:novoalign ZS:Z:NM
                      4:1:1205:937:Y 0 gi|50196905|ref|NC_007530.2| 3730834 150 1S73M * 0 0 NAAAGAAGAATTACATCGCCATCTGTAGAATGAGCATAAGCTTTCACTACCGCTTCATCTAAAGTATCGACACT #(()'3..22@@@@@@@@7@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ PG:Z:novoalign AS:i:10 UQ:i:10 NM:i:0 MD:Z:73
                      4:1:1231:930:Y 16 gi|50196905|ref|NC_007530.2| 899640 150 73M1S * 0 0 GAAAAGACCGAATTATCAGAATGTGTCGAATCTTCTTTTGAGAAAGTTCTTGATAACGAATGGTTTTGTATAGN 22CC@@@@@@C@@C@@@@C@C@CCC@@@@@@222@@@@@CC@@@C@C@@C@@CCC@@@@@CC2257777+000# PG:Z:novoalign AS:i:6 UQ:i:6 NM:i:0 MD:Z:73
                      4:1:1259:938:Y 4 * 0 0 * * 0 0 NAGCAAGGCAATGTAAAAGGCGAAAGACAAACAGCGGAAAGAGAAATTGAAATACAAAATAAATTAAGAAATAC ########################################################################## PG:Z:novoalign ZS:Z:QC
                      4:1:1290:941:Y 0 gi|47566322|ref|NC_007322.2| 169212 150 1S73M * 0 0 NGAAAATGCTCTTCAACTATTTGTATAGTCTATGTCACTCTTTTTTGGACTTTCCATATTGGGAGGGATGATTA #(*()00322@@@@C@C@@C@C@C@C@@@C@@@C@@@@@@@@@2222@@@C@C@@@@@C@@@@:@222:<@@@@ PG:Z:novoalign AS:i:10 UQ:i:10 NM:i:0 MD:Z:73
                      4:1:1307:933:Y 0 gi|50196905|ref|NC_007530.2| 4528377 150 1S73M * 0 0 NACGTAGTGGAATAGTTGAAAATTTAGATGAAGCTGATCCAGAAATTATTTTCTACACAAAAAAGCTCAGAGCA #(.((*,*))77755/0/00@@@@@@@@@@@@@:@@@@@@57055@@@@@22222@@@@@@2222@@@@@@7@@ PG:Z:novoalign AS:i:13 UQ:i:13 NM:i:0 MD:Z:73
                      4:1:1349:932:Y 0 gi|50196905|ref|NC_007530.2| 5190228 150 1S73M * 0 0 NGAACTATTTGAAAGATTATCTACGACTATAATTTTATAATTATTATTTAATAATTCTACACATGTATGACTAC #)*,.3103.@@@7@3<<<:@@@@@@@@@@@@@@@@@22@@@@@@@@@22@@@@@@@@@@<<<:::::::@@@@ PG:Z:novoalign AS:i:8 UQ:i:8 NM:i:0 MD:Z:73
                      4:1:1427:930:Y 4 * 0 0 * * 0 0 NATGTATTTGAATTATAACGTGATTCAATTTGGTTCTGGCGCAAGGAACCCAAGGGAGTTATAACTAACTCCCT ########################################################################## PG:Z:novoalign ZS:Z:QC
                      4:1:1455:932:Y 16 gi|50196905|ref|NC_007530.2| 4436146 150 73M1S * 0 0 CAAGACCTCCGGAATATGCTAATACAACTTTTTTCTTCTCCATTTTGCATCCCCCTAAAGAATAAATATTCATN @C@@@@C@CC@CC@@C@C@@@@@@@@@22222@@@@@@@@CC@@@CC@@@CCC@@22CC22C@C55566*(,,# PG:Z:novoalign AS:i:8 UQ:i:8 NM:i:0 MD:Z:73
                      4:1:1503:932:Y 0 gi|50196905|ref|NC_007530.2| 5174187 150 1S73M * 0 0 NAGAAGGAGAAACTTCAAATACAGTGAAACACCGCGATGGCCGTGTTTATGCGGAAGTAAGTGCAAAACTAACA #(*)&)*)*+<77<:58777:::::<:<<:8888885888:<:<:<<3<<:::1:@@@@@@@@@@@@@@@::<< PG:Z:novoalign AS:i:15 UQ:i:15 NM:i:0 MD:Z:73
                      4:1:1513:948:Y 16 gi|50196905|ref|NC_007530.2| 2854792 150 73M1S * 0 0 TGTAGAAAGTGAAAGTAAAAAAGATTCCAAAGACGCTCGTCCTTTTTCTCTATGAAATTCTTCTGCAAAATAAN C@C@C@CC@@@@@@@2222@C@@CC@@@C@C@@C@C@C@@@222C@@@C@@CCC@CCC@@@@@C71115-///# PG:Z:novoalign AS:i:6 UQ:i:6 NM:i:0 MD:Z:73
                      4:1:1536:944:Y 16 gi|50163691|ref|NC_007323.3| 77515 150 73M1S * 0 0 TTGCTTCAAGAAGGCGAAGAACAAATTTCTCTTTTCGATAATGTCACGCAACGAGAACAAGAAGTAAAGCTTAN @CC@@@@C@@CCC@@@C@22C@@@@@@@22@@C@@C@CC@@@@@@@@@@C@C@CC@@@@C@@C@58454,0*,# PG:Z:novoalign AS:i:7 UQ:i:7 NM:i:0 MD:Z:73
                      4:1:1696:942:Y 4 * 0 0 * * 0 0 NCTTATCTGCAATTGAAGGAATTAAAGTAGACAAACATTCAACTGGTGGTGTTGGTGATACAACAACATTAGTA ########################################################################## PG:Z:novoalign ZS:Z:QC
                      4:1:1724:952:Y 0 gi|50196905|ref|NC_007530.2| 641128 150 1S73M * 0 0 NAGATCTATTTTCGATAAAAATAACGAATGAAATTCCTACAATTGTGATGGACCAGAGAACGCCGACAAATGTA #+++-32223C22CC@@CC222222@@@@0:::::CC@@@CC@C@@@C@@CC@@CCC@CC@C@C@C@@@@@@@@ PG:Z:novoalign AS:i:6 UQ:i:6 NM:i:0 MD:Z:73
                      4:1:1766:932:Y 4 * 0 0 * * 0 0 NCATTAAGAAGTTTCATCATGTCCGCTGTAAACTGTTGTTCTAGTTCGTTACTTAAGACGCTTCCCTTTGAAAG ########################################################################## PG:Z:novoalign ZS:Z:QC

                      Comment

                      • KaiYe
                        Senior Member
                        • Jun 2009
                        • 133

                        #26
                        Originally posted by chariko View Post
                        I finally managed to get it to work. After following your instructions, I also had to change my input files (those generated by bam2pindel): when I compared them with the demo data, mine had only one "@" in each line, whereas the demo data had two. I don't know why that happened, because I obtained those input files with bam2pindel, but anyway it works now.

                        Thanks a lot
                        one @ is enough.

                        Comment

                        • DexterDuncan
                          Junior Member
                          • Apr 2011
                          • 8

                          #27
                          pindel_filter

                          Hi Kai,

                          Thanks again to you and Eric_Wubbo for the new pindel and pindel2vcf. Is it still a good idea to filter the bam2pindel.pl result files before using the new pindel?

                          Thanks,

                          Dex

                          Comment

                          • KaiYe
                            Senior Member
                            • Jun 2009
                            • 133

                            #28
                            If you use BAM files as input, you certainly don't have to do any filtering. If bam2pindel.pl is used first to extract reads, you can use its output directly as input.

                            So you don't need to use filtering.

                            Kai

                            Comment

                            • DexterDuncan
                              Junior Member
                              • Apr 2011
                              • 8

                              #29
                              new pindel output format

                              Hi Kai,

                              Would you briefly explain the new pindel output for version 0.2.3 below?

                              Thanks,

                              Dex

                              ####################################################################################################
                              0 D 2 NT 0 "" ChrID 20 BP 74310 74313 BP_range 74310 74316 Supports 19 18 + 9 8 - 10 10 S1 110 SUM_MS 1016 1 NumSupSamples 1 1 blood 9 8 10 10

                              Comment

                              • DexterDuncan
                                Junior Member
                                • Apr 2011
                                • 8

                                #30
                                I will be more explicit.

                                Hi Kai,

                                Here is one SV from the new version of pindel output. Could you explain how we get the number of normal reads versus the SV reads? Also, the header for each SV is different in the new version, and for some reason its meaning is not coming to me. I must admit, I need to become more educated about SVs.

                                ####################################################################################################
                                4 D 1 NT 0 "" ChrID 20 BP 34005 34007 BP_range 34005 34023 Supports 11 11 + 2 2 - 9 9 S1 30 SUM_MS 660 2 NumSupSamples 2 2 COLO-829 2 2 5 5 COLO-829-BL 0 0 4 4
                                CAACCAGATATGCCTCCTTACAAGAGATTCTTAAGGGAGCTCTAAACCTACAATCAAAAGAACAACACCTGCTACaAAAAAAAAAAAAAAAACATACTTATGCACATAAAGACACTATAAAGCAACTACACTATCAAGTCTACATAATAA
                                CTTAAGGGAGCTCTAAACCTACAATCAAAAGAACAACACCTGCTAC AAAAAAAAAAAAAAAACATACTTATGCAC - 34175 60 COLO-829 @@EAS188_62:6:20:111:1106/2
                                CAAAAAAACAACACCTGCTAC AAAAAAAAAAAAAAAACATACTTATGCACATAAAGACACTATAAAGCAACTACA + 33660 60 COLO-829 @@EAS188_62:3:40:104:1946/1
                                CTAAACCTACAATCAAAAGAACAACACCTGCTAC AAAAAAAAAAAAAAAACATACTTTTGCACATAAAGACACTA - 34184 60 COLO-829 @@EAS139_60:7:37:896:889/2
                                AACAACACCTGCTAC AAAAAAAAAAAAAAAACATACTTATGCAAAAAAAAACACTATAAAGCAACTACACTATCA - 34396 60 COLO-829 @@EAS139_60:5:24:381:681/1
                                AAACCTACAATCAAAAGAACAACACCTGCTAC AAAAAAAAAAAAAAAACAAACTTATGCACATAAAGACACTATA - 34196 60 COLO-829 @@EAS131_8:8:43:784:1438/2
                                TAAACCTACAATCAAAAGAACAACACCTGCTAC AAAAAAAAAAAAAAAACATACTTATGCACATAAAGACACTAT - 34388 60 COLO-829 @@EAS131_6:8:39:243:1719/1
                                CTCTAAACCTACAATCAAAAGAACAACACCTGCTAC AAAAAAAAAAAAAAAACATACTTATGCACATAAAGACAC + 33667 60 COLO-829 @@EAS25_5:1:80:1493:28/1
                                TCAAAAGAACAACACCTGCTAC AAAAAAAAAAAAAAAACATACTTATGCACATAAAGACACTATAAAGCAACTAC - 34198 60 COLO-829-BL @@USI-EAS39_8289_FC30GCV_PE:5:18:1550:123/1
                                CTTAAGGGAGCTCTAAACCTACAATCAAAAGAACAACACCTGCTAC AAAAAAAAAAAAAAAACATACTTATGCAC - 34175 60 COLO-829-BL @@HWI-EAS300_8282_FC30BVC_PE:1:15:777:1187/1
                                TAAGGGAGCTCTAAACCTACAATCAAAAGAACAACACCTGCTAC AAAAAAAAAAAAAAAACATACTTATGCACAT - 34164 60 COLO-829-BL @@HWI-EAS255_8291_FC30GRN_PE:2:73:533:1356/2
                                GAGCTCTAAACCTACAATCAAAAGAACAACACCTGCTAC AAAAAAAAAAAAAAAACATACTTATGCACATAAAGA - 34192 60 COLO-829-BL @@HWI-EAS138_4_FC30GP8:4:54:1227:1320/2


                                Thanks for all of your assistance,

                                Dex
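While waiting for an authoritative explanation, a header line like the one above can at least be pulled apart by its labelled tokens. The Python sketch below is only that: it assumes each label (ChrID, BP, BP_range, Supports, +, -) is followed by its values, and that after NumSupSamples and its two numbers each sample is written as a name followed by four integers. The dictionary keys are my own names, not official Pindel terminology. In this example the four per-sample numbers look like + strand counts followed by - strand counts, since their sums across samples match the totals after + and -, and the strands of the reads listed under each sample agree as well.

```python
# Hypothetical parser for a Pindel 0.2.3-style header line (field meanings assumed).

def parse_pindel_header(line):
    tok = line.split()

    def ints_after(label, count, start=0):
        i = tok.index(label, start)
        return tuple(int(x) for x in tok[i + 1:i + 1 + count])

    s = tok.index("Supports")
    n = tok.index("NumSupSamples")
    rest = tok[n + 3:]  # skip the label and its two numbers
    samples = {}
    for i in range(0, len(rest) - 4, 5):
        # sample name followed by four integers
        samples[rest[i]] = tuple(int(x) for x in rest[i + 1:i + 5])
    return {
        "index": int(tok[0]),
        "type": tok[1],
        "size": int(tok[2]),
        "chrom": tok[tok.index("ChrID") + 1],
        "bp": ints_after("BP", 2),
        "bp_range": ints_after("BP_range", 2),
        "supports": ints_after("Supports", 2),
        "plus": ints_after("+", 2, s),    # searched only after "Supports"
        "minus": ints_after("-", 2, s),
        "samples": samples,
    }
```

For the COLO-829 event above, `parse_pindel_header(header)["samples"]` gives one entry per sample, which makes it easy to compare support between the tumour and the matched normal.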

                                Comment
