  • ctsa
    Junior Member
    • Jan 2011
    • 6

    Strelka: Somatic small-variant calling workflow for matched tumor-normal samples

    Hello All,

    Strelka is a new workflow for calling SNVs and small indels from sequencing data for matched tumor-normal samples. It is designed to detect somatic variants at the lower frequencies typically encountered in tumors due to sample impurity or subclonal variation. The workflow is also efficient enough for the whole-genome sequencing case, requiring ~1 core-hour per 2x of combined tumor-normal coverage.

    More information/source code available here:



    We appreciate any feedback on how these methods can be improved.

    Best Regards,

    -Chris Saunders
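
As a rough illustration of the runtime figure quoted in the announcement, the sketch below turns "~1 core-hour per 2x combined tumor-normal coverage" into an estimate. The function name and the 30x/60x coverages are hypothetical, and the linear scaling is an assumption read off that one sentence, not a measured benchmark.

```python
def estimate_core_hours(normal_coverage: float, tumor_coverage: float,
                        core_hours_per_2x: float = 1.0) -> float:
    """Rough estimate assuming cost scales linearly with combined coverage."""
    combined_coverage = normal_coverage + tumor_coverage
    return core_hours_per_2x * combined_coverage / 2.0


# Hypothetical example: 30x normal plus 60x tumor is 90x combined coverage.
print(estimate_core_hours(30, 60))  # 45.0
```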
  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #2
    Looks like the license is fairly restrictive. Any chance of moving this to an open source license?


    • Jane M
      Senior Member
      • Aug 2011
      • 239

      #3
      Hello,

      I would be interested in this tool since it addresses my problem, but as nilshomer said, there seem to be restrictions with the license, and I am not able to download the sources...

      Jane


      • ctsa
        Junior Member
        • Jan 2011
        • 6

        #4
        Hi Nils and Jane --

        Thanks for highlighting this issue. I will take a look today to see what our options are with respect to the source license.

        -Chris


        • ctsa
          Junior Member
          • Jan 2011
          • 6

          #5
          I've gotten additional feedback about the source download link on some web browsers. The source download URL is:

          ftp://[email protected]

          Note that no password is required. In Firefox it looks like a password prompt comes up anyway -- you can leave the password field blank and just hit "Ok" to enter the ftp site.


          • nilshomer
            Nils Homer
            • Nov 2008
            • 1283

            #6
            Originally posted by ctsa
            I've gotten additional feedback about the source download link on some web browsers. The source download URL is:

            ftp://[email protected]

            Note that no password is required. In Firefox it looks like a password prompt comes up anyway -- you can leave the password field blank and just hit "Ok" to enter the ftp site.

            I am able to download the source code fine, but it is the license to which I do not agree.


            • ctsa
              Junior Member
              • Jan 2011
              • 6

              #7
              Hi Nils --

              Sorry if there's a misunderstanding; the additional FTP advice was in response to a separate conversation. As I replied above, I'm working on the license issue.

              -Chris


              • nilshomer
                Nils Homer
                • Nov 2008
                • 1283

                #8
                Sorry, thanks for looking into it!


                • pravee1216
                  Member
                  • Aug 2010
                  • 35

                  #9
                  platform independent?

                  Good news!!

                  Couple of questions:

                  a) Does Strelka support alignment files from Roche 454/FLX sequencing reads, or is it designed mainly for Illumina data?

                  b) How does it handle calls in homopolymer regions, especially for 454/MiSeq/Ion Torrent platform data? Has this been tested?

                  Thanks in advance

                  Raj


                  • genomicist
                    Member
                    • Jan 2011
                    • 12

                    #10
                    Why are there two "format" fields in the output?

                    I wonder why there are two "format" fields in the output (the last two columns of the output file) of this type: DP:FDP:SDP:SUBDP:AU:CU:GU:TU. I have been looking for an explanation, but all in vain.
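
For context, the VCF format has a single FORMAT column (the ninth) that names the per-sample keys, and each column after it holds one sample's values in the same order. In a tumor-normal run the last two columns would then be the two samples rather than two format fields. A minimal sketch of pairing keys with values is below; the sample names NORMAL and TUMOR, their order, and the record itself are assumptions made up for illustration, so check the `#CHROM` header line of your own file.

```python
def parse_sample_columns(vcf_line: str, sample_names: list[str]) -> dict[str, dict[str, str]]:
    """Map each sample name to a {FORMAT key: value} dict for one VCF record."""
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")  # column 9 is FORMAT
    samples = {}
    for name, column in zip(sample_names, fields[9:]):
        samples[name] = dict(zip(format_keys, column.split(":")))
    return samples


# Made-up record, for illustration only.
line = ("chr1\t12345\t.\tA\tG\t.\tPASS\t.\t"
        "DP:FDP:SDP:SUBDP:AU:CU:GU:TU\t"
        "30:0:0:0:30,30:0,0:0,0:0,0\t"
        "42:1:0:0:30,31:0,0:11,11:0,0")

samples = parse_sample_columns(line, ["NORMAL", "TUMOR"])
print(samples["TUMOR"]["GU"])  # 11,11
```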


                    • genomicist
                      Member
                      • Jan 2011
                      • 12

                      #11
                      I also wonder which optional "extraStrelkaArguments" can be specified in the configuration file. Is there a list?

                      Specifically, is it possible to filter the calls on variant allele frequency? My "passed" SNV list contains lots of calls that are supported by only 1 or 2 reads with the alternative base, alongside a couple of hundred reads with the reference base. These are presumably sequencing errors.
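
One possible way to filter on allele frequency after the fact is sketched below. It assumes that the first comma-separated value in each of AU/CU/GU/TU is the tier-1 read count for that base, so the reference count comes from the key REF + "U" and the alternate count from ALT + "U"; verify this against the field descriptions in your VCF header. The function names and the thresholds (`min_vaf`, `min_alt_reads`) are hypothetical, not Strelka options.

```python
def tier1_count(sample: dict[str, str], key: str) -> int:
    """First comma-separated value of a count field (assumed to be tier 1)."""
    return int(sample[key].split(",")[0])


def snv_allele_frequency(ref: str, alt: str, sample: dict[str, str]) -> float:
    """Alt / (alt + ref) using the per-base count fields AU, CU, GU, TU."""
    ref_count = tier1_count(sample, ref + "U")
    alt_count = tier1_count(sample, alt + "U")
    total = ref_count + alt_count
    return alt_count / total if total else 0.0


def passes_vaf_filter(ref: str, alt: str, tumor: dict[str, str],
                      min_vaf: float = 0.05, min_alt_reads: int = 3) -> bool:
    """Hypothetical post-filter: require enough alt reads and a minimum VAF."""
    alt_count = tier1_count(tumor, alt + "U")
    return alt_count >= min_alt_reads and snv_allele_frequency(ref, alt, tumor) >= min_vaf


# Made-up tumor sample resembling the case described: 2 alt reads vs. 200 ref reads.
tumor = {"AU": "200,201", "CU": "0,0", "GU": "2,2", "TU": "0,0"}
print(passes_vaf_filter("A", "G", tumor))  # False
```

The thresholds are a judgment call that depends on depth and expected tumor purity, so they are left as parameters.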


                      • lethalfang
                        Member
                        • Aug 2011
                        • 95

                        #12
                        I'm trying to use Strelka on some sequencing data we got from a SOLiD 5500, with the BAM file aligned by LifeScope 2.5.
                        The LifeScope-produced BAM file seems to be incompatible with Strelka.
                        Does anyone know of a way to convert the BAM into something that Strelka accepts?

                        Thanks in advance.

                        Well, it seems I was just missing an index .bam.bai file, which I created using samtools index aln.bam.
                        It is running now. Let's see how it goes.
                        Last edited by lethalfang; 09-26-2012, 01:13 PM. Reason: Problem may be solved.
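
The missing-index case can be checked before launching a run. The sketch below looks for a `.bam.bai` file next to the BAM and, if it is absent, calls `samtools index` as in the post above. The helper names are hypothetical, and it assumes `samtools` is on the PATH and that the index is expected at `<bam>.bai`.

```python
import os
import shutil
import subprocess


def bam_index_path(bam_path: str) -> str:
    """Index location written by `samtools index aln.bam` (aln.bam.bai)."""
    return bam_path + ".bai"


def ensure_bam_index(bam_path: str) -> str:
    """Create the BAM index with samtools if it does not exist yet."""
    index_path = bam_index_path(bam_path)
    if not os.path.exists(index_path):
        if shutil.which("samtools") is None:
            raise RuntimeError("samtools not found on PATH")
        # Equivalent to running: samtools index aln.bam
        subprocess.run(["samtools", "index", bam_path], check=True)
    return index_path


# Usage (not executed here): ensure_bam_index("aln.bam")
```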


                        • malachig
                          Senior Member
                          • Aug 2010
                          • 117

                          #13
                          It looks like nilshomer's initial question was never addressed. We have the same concern with using this software:

                          "Looks like the license is fairly restrictive. Any chance of moving this to an open source license?"

                          If this could be moved to open source, that would make it easier to deploy in pipelines/platforms that are themselves open source projects...


                          • ctsa
                            Junior Member
                            • Jan 2011
                            • 6

                            #14
                            Looks like I'm not getting emails for this thread. I'll try to briefly cover the existing questions but encourage you to re-post any current issue to the strelka mailing list here:



                            - License:

                            Strelka has recently been moved to the Illumina Open Source Software License (v1). Details are on github here:




                            - Incompatible BAMs:

                            All known BAM restrictions are described in the FAQ here:




                            - Format Fields:

                            All format fields are described in the VCF header, as well as on the website here:


                            • Jane M
                              Senior Member
                              • Aug 2011
                              • 239

                              #15
                              Hello,

                              I have been using Strelka 1.0.14 on WGS data for a few days.
                              Strelka ran without problems on my samples.
                              To further filter my list of variants, I need information about the coverage of the reference and "variant" alleles in both normal and tumor samples.

                              As I read on the Strelka discussion list, the number of reads supporting the "indel" (= variant) allele is given by the TIR column (preferably its first field).
                              My problem is finding the number of reads supporting the reference allele.
                              Maybe I missed something, but I tried DP, TAR, DP-TOR, TAR+TOR,... without success.
                              In some cases, DP seems OK. In other cases, TAR+TOR looks OK.
                              This information is crucial for my analysis and I am stuck on this detail...

                              Could you please tell me how to compute precisely the number of reads supporting the reference allele?

                              Thank you in advance,
                              Jane

                              ps: I just posted this question on the strelka mailing list, but it could be more visible here
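
One reading of the indel fields that may help here: if TAR counts reads strongly supporting the allele alternate to the indel (typically the reference) and TOR counts reads that support neither allele clearly, then the first value of TAR is the reference-supporting count and the first value of TIR is the indel-supporting count, with TOR left out of both. That would also explain why DP and TAR+TOR each match only sometimes. The sketch below works under that assumption; the field meanings should be confirmed in the VCF header, and the function names and sample values are made up.

```python
def indel_tier1_counts(sample: dict[str, str]) -> tuple[int, int]:
    """Return (reference-supporting, indel-supporting) tier-1 read counts.

    Assumes the first comma-separated value of TAR and TIR is the tier-1 count.
    """
    ref_count = int(sample["TAR"].split(",")[0])
    indel_count = int(sample["TIR"].split(",")[0])
    return ref_count, indel_count


def indel_allele_frequency(sample: dict[str, str]) -> float:
    """Indel / (indel + reference); reads counted in TOR are excluded."""
    ref_count, indel_count = indel_tier1_counts(sample)
    total = ref_count + indel_count
    return indel_count / total if total else 0.0


# Made-up tumor sample, for illustration only.
tumor = {"DP": "50", "TAR": "35,36", "TIR": "12,13", "TOR": "3,3"}
print(indel_tier1_counts(tumor))  # (35, 12)
```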
