  • Help getting fastq_filter.py working on command line?

    I want to use Galaxy's fastq_filter tool on the command line.

    I already know which inputs fastq_filter.py requires, but I'm not sure how to generate two of them.

    After reading the Python and XML files, you learn that it expects to be run with a line something like this:
    Code:
    fastq_filter.py $input_file $fastq_filter_file $output_file $output_file.files_path '${input_file.extension[len( 'fastq' ):]}'
    • $input_file
    • $fastq_filter_file: I don't know how to make this
    • $output_file
    • $output_file.files_path: I don't know what this is or how to avoid it
    • ${input_file.extension[len( 'fastq' ):]}: Seems to be a type check on the input file type? Not going to worry about this for now
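The last argument can be tried out in plain Python. This is only a minimal sketch, assuming the input's extension is a Galaxy datatype name such as `fastqsanger` or `fastqillumina` (those example names and the helper `quality_variant` are assumptions for illustration, not taken from the tool's files). Slicing off the leading `fastq` leaves the quality-score variant rather than performing a type check:

```python
def quality_variant(extension):
    """Drop the leading 'fastq' from an extension, as the template expression does."""
    return extension[len('fastq'):]


# Hypothetical extensions, used only to show what the slice returns.
for ext in ('fastqsanger', 'fastqillumina', 'fastq'):
    print(repr(ext), '->', repr(quality_variant(ext)))
```

If that assumption holds, a plain `fastq` extension would produce an empty string for this argument.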


    The fastq_filter.py is interesting. It contains something like
    Code:
    def fastq_read_pass_filter( fastq_read ):
         def mean( score_list ):
             return float( sum( score_list ) ) / float( len( score_list ) )
         if len( fastq_read ) < $min_size:
             return False
         if $max_size > 0 and len( fastq_read ) > $max_size:
             return False
         num_deviates = $max_num_deviants
         qual_scores = fastq_read.get_decimal_quality_scores()
         for qual_score in qual_scores:
             if qual_score < $min_quality or ( $max_quality > 0 and qual_score > $max_quality ):
                 if num_deviates == 0:
                     return False
                 else:
                     num_deviates -= 1
     #if not $paired_end:
         qual_scores_split = [ qual_scores ]
     #else:
         qual_scores_split = [ qual_scores[ 0:int( len( qual_scores ) / 2 ) ], qual_scores[ int( len( qual_scores ) / 2 ): ] ]
     #end if
     #for $fastq_filter in $fastq_filters:
         for split_scores in qual_scores_split:
             left_column_offset = $fastq_filter[ 'offset_type' ][ 'left_column_offset' ]
             right_column_offset = $fastq_filter[ 'offset_type' ][ 'right_column_offset' ]
     #if $fastq_filter[ 'offset_type' ]['base_offset_type'] == 'offsets_percent':
             left_column_offset = int( round( float( left_column_offset ) / 100.0 * float( len( split_scores ) ) ) )
             right_column_offset = int( round( float( right_column_offset ) / 100.0 * float( len( split_scores ) ) ) )
     #end if
             if right_column_offset > 0:
                 split_scores = split_scores[ left_column_offset:-right_column_offset]
             else:
                 split_scores = split_scores[ left_column_offset:]
             if split_scores: ##if a read doesn't have enough columns, it passes by default
                 if not ( ${fastq_filter[ 'score_operation' ]}( split_scores ) $fastq_filter[ 'score_comparison' ] $fastq_filter[ 'score' ]  ):
                     return False
     #end for
         return True
    Is that Python? Is this how the XML turns user input into a filter script? Someone suggested I use the Galaxy API for this, but that might be just as much work to set up as getting this script to run. I'm not opposed to it, but I want to take the easy way out, because I think this is the last Galaxy tool I have to run in my analysis before I move on to other things.

    Any help and assistance would be appreciated.
    Last edited by hlyates; 03-27-2015, 06:12 AM. Reason: Added tags

  • #2
    The development repository is here:
    galaxyproject/tools-devteam (contains a set of Galaxy tools mostly written by the Galaxy Team)


    Correction: The code you quoted is from the <configfile> XML snippet; it is written in Cheetah, a Python-like templating language.
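To make that concrete, here is a rough sketch of what the quoted template might look like once the placeholders have been filled in. Every value is hypothetical (min_size 4, max_size 0, min_quality 20, max_quality 0, max_num_deviants 1, single-end, no entries in $fastq_filters), and `FakeRead` is only a stand-in for whatever read object the real script passes in. The only things taken from the quoted template are `len()` and `get_decimal_quality_scores()`.

```python
class FakeRead:
    """Hypothetical stand-in for a FASTQ read; not Galaxy's own class."""

    def __init__(self, sequence, scores):
        self.sequence = sequence
        self.scores = scores

    def __len__(self):
        return len(self.sequence)

    def get_decimal_quality_scores(self):
        return list(self.scores)


# Hand-rendered version of the template with the assumed values substituted.
def fastq_read_pass_filter(fastq_read):
    if len(fastq_read) < 4:                      # $min_size
        return False
    if 0 > 0 and len(fastq_read) > 0:            # $max_size (0 disables the check)
        return False
    num_deviates = 1                             # $max_num_deviants
    qual_scores = fastq_read.get_decimal_quality_scores()
    for qual_score in qual_scores:
        # $min_quality is 20; $max_quality is 0, which disables the upper bound
        if qual_score < 20 or (0 > 0 and qual_score > 0):
            if num_deviates == 0:
                return False
            else:
                num_deviates -= 1
    # '#if not $paired_end' branch; unused here because no filters follow
    qual_scores_split = [qual_scores]
    # '#for $fastq_filter in $fastq_filters' emits nothing for an empty list
    return True


reads = [
    FakeRead('ACGTAC', [30, 30, 30, 30, 30, 30]),
    FakeRead('ACGTAC', [30, 10, 5, 30, 30, 30]),
    FakeRead('ACG', [30, 30, 30]),
]
for read in reads:
    print(read.sequence, read.scores, fastq_read_pass_filter(read))
```

So the `$...` placeholders and `#if`/`#for` lines are resolved first, and what remains is an ordinary Python function that can be called once per read.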
    Last edited by maubp; 03-29-2015, 09:20 AM. Reason: correction
