  • arkilis
    replied
    Originally posted by manvendra7:
    Dear folks,
    I am new to this, an early-stage researcher.

    I am using TopHat2 to map reads, and I believe I am meeting all the requirements. My command is:

    /usr/local/bin/tophat2 -p 8 -G ~/path/to/Homo_sapiens.GRCh37.72.gtf \
        -o ~/path/to/Human_mapping_iPS_s7_rep1 \
        --splice-mismatches 1 --max-multihits 30 --microexon-search --fusion-search \
        ~/path/to/bowtie2_index/hg19 \
        ~/path/to/myfile.fastq

    I am submitting it on a Grid Engine cluster with qsub -l h_vmem=50G [above_script],
    and it fails with this error:
    "TopHat requires all reads be either FASTQ or FASTA. Mixing formats is not supported"

    I am a bit frustrated because my FASTQ file looks fine to me. Here is the first record:

    @SOLEXA-GA05_00009_SRi_AD_MS_BN_VW:7:1:2364:933#ATGAGCA
    NGGCCTTCCCACATTCTTTACACTCATAGGTTTTCTCACCAGTGTGAGTTCTCTTGTGCACAATAAGGTAAGAGCC
    +SOLEXA-GA05_00009_SRi_AD_MS_BN_VW:7:1:2364:933#ATGAGCA
    !454478347;09977778<655476;69;8588380745<75;57495945158::=677976:7674:64763-

    Please help!

    As far as I know, there are different versions of the FASTQ format, so you had better check which one the software requires.

    FASTQ has three main Illumina variants: Illumina 1.3+, 1.5+ and 1.8+.

  • manvendra7
    replied
    Thanks, guys.
    My problem is solved: there was a problem with my FASTQ file.

  • GenoMax
    replied
    Looking at the sequence identifiers, I wonder if this is old data from a GAII machine. If so, it is likely in the older Illumina (1.3) FASTQ format, and you may need to add the relevant TopHat options to take that into account.

    From the TopHat manual:

    --solexa-quals Use the Solexa scale for quality values in FASTQ files.
    --solexa1.3-quals As of the Illumina GA pipeline version 1.3, quality scores are encoded in Phred-scaled base-64. Use this option for FASTQ files from pipeline 1.3 or later.
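
To check which quality encoding a file most likely uses before choosing between these options, one rough approach is to look at the lowest ASCII character on the quality lines. The sketch below is a heuristic only and is not part of TopHat: the function name `guess_quality_encoding` and its return labels are made up for illustration. It relies on the usual conventions that Phred+33 qualities start at '!' (ASCII 33), Solexa+64 at ';' (ASCII 59), and Phred+64 at '@' (ASCII 64).

```python
from typing import Iterable


def guess_quality_encoding(quality_lines: Iterable[str]) -> str:
    """Guess the FASTQ quality offset from the lowest quality character.

    Hypothetical helper for illustration; the labels are not TopHat flags.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        raise ValueError("no quality characters supplied")
    lowest = min(codes)
    if lowest < 59:
        # Characters below ';' cannot occur with a +64 offset.
        return "phred+33"
    if lowest < 64:
        # ';' to '?' fit Solexa+64, but could also be high-quality Phred+33.
        return "likely solexa+64"
    # '@' and above fit Phred+64, but could also be high-quality Phred+33.
    return "likely phred+64"


# Quality line copied from the record in the original post.
posted_quality = (
    "!454478347;09977778<655476;69;8588380745<75;57495945158::=677976:7674:64763-"
)
print(guess_quality_encoding([posted_quality]))
```

The quality line from the original post starts with '!', so this heuristic reports phred+33 for that record. A label prefixed with "likely" means the data could still be high-quality Phred+33, so it is worth sampling many records rather than one.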

  • KatsenPlatz
    replied
    My guess is that there is a problem somewhere in the FASTQ file, even though the first 4 lines look good. A few checks you can do:

    1. There are 4 lines for each read, i.e. the total line count of the FASTQ file is 4x the number of reads.
    2. The first line of every record starts with @ and the third line starts with +.
    3. The quality string has the same length as the read sequence in every record.
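
The three checks above can be sketched as a small Python script. This is a minimal illustration that assumes an uncompressed, strictly four-line-per-record FASTQ file; the function name `check_fastq` and the `max_problems` cutoff are hypothetical choices, not part of any existing tool.

```python
from typing import Iterable, List


def check_fastq(lines: Iterable[str], max_problems: int = 20) -> List[str]:
    """Return a list of problems found in four-line FASTQ records."""
    problems: List[str] = []
    count = 0
    seq_len = 0
    for count, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        record = (count - 1) // 4 + 1   # 1-based record number
        position = (count - 1) % 4      # 0=header, 1=sequence, 2=plus, 3=quality
        if position == 0 and not line.startswith("@"):
            problems.append(f"record {record} (line {count}): header does not start with '@'")
        elif position == 1:
            seq_len = len(line)
        elif position == 2 and not line.startswith("+"):
            problems.append(f"record {record} (line {count}): third line does not start with '+'")
        elif position == 3 and len(line) != seq_len:
            problems.append(
                f"record {record} (line {count}): quality length {len(line)} "
                f"differs from sequence length {seq_len}"
            )
        if len(problems) >= max_problems:
            # Stop early: one misplaced line usually causes a cascade of errors.
            return problems
    if count % 4 != 0:
        problems.append(f"{count} lines in total, which is not a multiple of 4")
    return problems


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            for problem in check_fastq(handle):
                print(problem)
```

Running the script with the FASTQ path as its only argument prints one line per problem and nothing if all three checks pass. The first reported problem is usually the one to inspect, since a single stray or missing line shifts every record after it.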

  • Where is the real problem: my TopHat2 command or my FASTQ files?

    Dear folks,
    I am new to this, an early-stage researcher.

    I am using TopHat2 to map reads, and I believe I am meeting all the requirements. My command is:

    /usr/local/bin/tophat2 -p 8 -G ~/path/to/Homo_sapiens.GRCh37.72.gtf \
        -o ~/path/to/Human_mapping_iPS_s7_rep1 \
        --splice-mismatches 1 --max-multihits 30 --microexon-search --fusion-search \
        ~/path/to/bowtie2_index/hg19 \
        ~/path/to/myfile.fastq

    I am submitting it on a Grid Engine cluster with qsub -l h_vmem=50G [above_script],
    and it fails with this error:
    "TopHat requires all reads be either FASTQ or FASTA. Mixing formats is not supported"

    I am a bit frustrated because my FASTQ file looks fine to me. Here is the first record:

    @SOLEXA-GA05_00009_SRi_AD_MS_BN_VW:7:1:2364:933#ATGAGCA
    NGGCCTTCCCACATTCTTTACACTCATAGGTTTTCTCACCAGTGTGAGTTCTCTTGTGCACAATAAGGTAAGAGCC
    +SOLEXA-GA05_00009_SRi_AD_MS_BN_VW:7:1:2364:933#ATGAGCA
    !454478347;09977778<655476;69;8588380745<75;57495945158::=677976:7674:64763-

    Please help!
