
  • sklages
    replied
    Remove the backslashes \ from your command. Why did you put them there?

    And please do not hijack (old) threads .. just open a new one ..
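
For anyone wondering why the backslashes matter: in a POSIX shell, a backslash only continues a command when it is the very last character on the line. In the middle of a line, a backslash followed by a space escapes that space, so the space is glued onto the next word instead of separating arguments. That is most likely why Picard rejects the option name, although how Picard parses the argument internally is an assumption here. A minimal sketch, using a hypothetical helper `show_args` (not part of Picard) and made-up file names:

```shell
# show_args is a hypothetical helper: it prints each argument it receives
# on its own line, wrapped in brackets, so stray whitespace becomes visible.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Backslash + space in the middle of a line: the space becomes part of
# the next argument.
show_args O=out.bam \ RGPL=illumina
# prints:
# [O=out.bam]
# [ RGPL=illumina]

# Without the backslash the option name is clean.
show_args O=out.bam RGPL=illumina
# prints:
# [O=out.bam]
# [RGPL=illumina]
```

The multi-line commands further down the thread work because each backslash there sits at the end of a line.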


  • zillur
    replied
    Hi,
    I was trying to run Picard with the following commands, but it always reports an unrecognized option.
    MacBook-Pro:Strain_Naushon ZILLURRAHMAN$ java -Xmx16g -jar /Users/ZILLURRAHMAN/Desktop/PhD/GATK/picard-tools-1.119/AddOrReplaceReadGroups.jar I=Naushon.sorted.bam O=Naushon.picard.bam \ PL=illumina ID=$RANDOM SM=mysample
    ERROR: Unrecognized option: PL

    java -Xmx16g -jar /Users/ZILLURRAHMAN/Desktop/PhD/GATK/picard-tools-1.119/AddOrReplaceReadGroups.jar I=Naushon.sorted.bam O=Naushon.picard.bam \ RGPL=illumina RGID=$RANDOM RGSM=mysample
    ERROR: Unrecognized option: RGPL


    MacBook-Pro:Strain_Naushon ZILLURRAHMAN$ java -Xmx16g -jar /Users/ZILLURRAHMAN/Desktop/PhD/GATK/picard-tools-1.119/AddOrReplaceReadGroups.jar I=Naushon.sorted.bam O=Naushon.picard.bam \ CREATE_INDEX=true \ RGPL=illumina RGID=$RANDOM RGSM=mysample
    ERROR: Unrecognized option: CREATE_INDEX


    Can anyone help me?


  • Jane M
    replied
    Well, I see something similar when using Picard with VALIDATION_STRINGENCY set to LENIENT. It's not really an error in my case, since it works afterwards (maybe thanks to LENIENT):

    Ignoring SAM validation error: ERROR: Record 104416918, Read name HWI-ST584_0081:4:2202:1737:149838#AGTCAA, MAPQ should be 0 for unmapped read.
    Ignoring SAM validation error: ERROR: Record 104416919, Read name HWI-ST584_0081:4:2205:16036:86626#AGTCAA, MAPQ should be 0 for unmapped read.
    Ignoring SAM validation error: ERROR: Record 104416920, Read name HWI-ST584_0081:4:2207:1759:176116#AGTCAA, MAPQ should be 0 for unmapped read.
    INFO 2012-09-05 15:13:24 FixMateInformation Sorting by queryname complete.
    INFO 2012-09-05 15:13:24 FixMateInformation Output will be sorted by coordinate
    INFO 2012-09-05 15:13:24 FixMateInformation Traversing query name sorted records and fixing up mate pair information.
    INFO 2012-09-05 15:13:34 FixMateInformation Processed 1,000,000 records. Elapsed time: 00:00:10s. Time for last 1,000,000: 10s. Last read position: chr14:74,489,555
    Last edited by Jane M; 09-05-2012, 06:23 AM.


  • dpryan
    replied
    @aforntacc: Try setting the validation stringency to "Lenient" (VALIDATION_STRINGENCY=LENIENT); GATK can be rather picky.


  • aforntacc
    replied
    Hello all, I need some help, please.
    I am trying to add read groups to my BAM files and I got the exception below.
    Can someone please explain it and tell me what I should do?
    Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Record 52142073, Read name HWI-ST132_0461:3:1201:1275:7049#GTCCTA, MAPQ should be 0 for unmapped read.
    at net.sf.samtools.SAMUtils.processValidationErrors(SAMUtils.java:448)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:506)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:487)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:446)
    at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:641)
    at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:619)
    at net.sf.picard.sam.AddOrReplaceReadGroups.doWork(AddOrReplaceReadGroups.java:98)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:119)
    at net.sf.picard.sam.AddOrReplaceReadGroups.main(AddOrReplaceReadGroups.java:66)


  • Jane M
    replied
    OK. I finally managed to run GATK to completion.
    I guess my input files are correct now! I have some trouble with the output, but that will be a new topic!

    Thanks everybody for your help!


  • gringer
    replied
    Originally posted by Jane M
    I cannot use the same FASTA file that was used for the mapping, because the mapping was done by an external service provider, and they didn't give us this file... So, I will rename chrM to chrMt in my FASTA files.

    Okay, but take a little bit of care in interpreting the results. There may be slight differences in sequences / lengths between different FASTA files.


  • Jane M
    replied
    Thanks for pointing it out! I hadn't noticed that the names are different.
    I cannot use the same FASTA file that was used for the mapping, because the mapping was done by an external service provider, and they didn't give us this file... So, I will rename chrM to chrMt in my FASTA files.
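
A rename like this can be sketched with sed. The helper name `rename_contig` and the output file name in the usage comment are hypothetical. The sketch only touches header lines whose name is exactly the old contig name, and it assumes the name contains no regular-expression metacharacters.

```shell
# rename_contig is a hypothetical helper: it reads FASTA on stdin and
# writes FASTA on stdout, renaming one contig in the header lines only.
rename_contig() {
    old=$1
    new=$2
    # Match a header whose name is exactly $old, followed either by
    # whitespace (a description) or by the end of the line.
    sed -E "s/^>${old}([[:space:]]|\$)/>${new}\\1/"
}

# Typical use (output file name is an assumption):
#   rename_contig chrM chrMt < hg19.fasta > hg19.chrMt.fasta
```

The renamed FASTA needs a fresh index afterwards (for example with `samtools faidx`). The rename only changes the label: if the reference used for mapping had a different mitochondrial sequence, that difference remains.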


  • gringer
    replied
    Picard is reporting 'chrMt', while your fasta index file suggests 'chrM'. These are different, but they shouldn't be. You should be using exactly the same FASTA file that was used for the mapping.
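
A mismatch like this can be spotted before starting a long GATK run by comparing the @SQ names in the BAM header against the first column of the FASTA index. This is a rough sketch: `missing_contigs` is a hypothetical helper, the .fai path in the usage comment is assumed from the command line posted in this thread, and the .fai file is assumed to be tab-separated.

```shell
# missing_contigs is a hypothetical helper: it reads a SAM header on stdin
# and prints every @SQ sequence name that is absent from the given .fai file.
missing_contigs() {
    fai=$1
    # Keep @SQ lines, split them into one field per line, extract the SN:
    # value, then report names not found in column 1 of the index.
    grep '^@SQ' |
        tr '\t' '\n' |
        sed -n 's/^SN://p' |
        while read -r name; do
            cut -f1 "$fai" | grep -qxF -- "$name" || printf '%s\n' "$name"
        done
}

# Typical use (requires samtools; the .fai path is an assumption):
#   samtools view -H picard_s_garma-fibros_converted_sorted.bam | missing_contigs ~/fasta/hg19.fasta.fai
```

An empty result means every contig named in the BAM header also exists in the reference index. It does not prove that the sequences or lengths are identical.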


  • Jane M
    replied
    Thanks for the information, Carlos. If you find clear documentation on setting these parameters, I am interested too!


    I ran picard on my two datasets and my BAM files seem to be correctly formatted now.
    I reran GATK, but I still get an error. It seems that it doesn't accept the mitochondrial chromosome. That's a pity, because I was finally at the end of the analysis after one week of trials.

    /opt/jdk1.7.0_02/bin/java -Xmx10g -jar GenomeAnalysisTK.jar -R ~/fasta/hg19.fasta -T SomaticIndelDetector --minCoverage 10 -o ~/../../../data/patient1/garma_indels.vcf -verbose indels.txt -I:normal ~/../../../data/patient1/picard_s_garma-fibros_converted_sorted.bam -I:tumor ~/../../../data/patient1/picard_s_garma-296_converted_sorted.bam

    INFO 09:28:04,178 HelpFormatter - ---------------------------------------------------------------------------------
    INFO 09:28:04,180 HelpFormatter - The Genome Analysis Toolkit (GATK) v1.4-15-gcd43f01, Compiled 2012/01/12 16:14:10
    INFO 09:28:04,180 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 09:28:04,180 HelpFormatter - Please view our documentation at http://www.broadinstitute.org/gsa/wiki
    INFO 09:28:04,181 HelpFormatter - For support, please view our support site at http://getsatisfaction.com/gsa
    INFO 09:28:04,181 HelpFormatter - Program Args: -R /home/merlevede/fasta/hg19.fasta -T SomaticIndelDetector --minCoverage 10 -o /home/merlevede/../../../data/patient1/garma_indels.vcf -verbose indels.txt -I:normal /home/merlevede/../../../data/patient1/picard_s_garma-fibros_converted_sorted.bam -I:tumor /home/merlevede/../../../data/patient1/picard_s_garma-296_converted_sorted.bam
    INFO 09:28:04,182 HelpFormatter - Date/Time: 2012/01/20 09:28:04
    INFO 09:28:04,182 HelpFormatter - ---------------------------------------------------------------------------------
    INFO 09:28:04,182 HelpFormatter - ---------------------------------------------------------------------------------
    INFO 09:28:04,195 GenomeAnalysisEngine - Strictness is SILENT
    INFO 09:28:04,237 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO 09:28:04,261 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
    INFO 09:28:04,343 SomaticIndelDetectorWalker - No gene annotations available
    INFO 09:28:09,169 TraversalEngine - [INITIALIZATION COMPLETE; TRAVERSAL STARTING]
    INFO 09:28:09,169 TraversalEngine - Location processed.reads runtime per.1M.reads completed total.runtime remaining
    INFO 09:28:34,684 TraversalEngine - chr1:28598356 2.83e+06 30.0 s 10.6 s 0.9% 54.1 m 53.6 m
    INFO 09:29:04,690 TraversalEngine - chr1:60378092 6.04e+06 60.0 s 9.9 s 2.0% 51.3 m 50.3 m
    INFO 09:29:34,692 TraversalEngine - chr1:104116800 9.18e+06 90.0 s 9.8 s 3.4% 44.6 m 43.1 m
    INFO 09:30:04,697 TraversalEngine - chr1:155165725 1.29e+07 2.0 m 9.3 s 5.0% 39.9 m 37.9 m
    INFO 09:30:34,708 TraversalEngine - chr1:186157226 1.64e+07 2.5 m 9.1 s 6.0% 41.6 m 39.1 m
    INFO 09:31:04,714 TraversalEngine - chr1:230810753 1.99e+07 3.0 m 9.0 s 7.5% 40.2 m 37.2 m
    INFO 09:31:34,716 TraversalEngine - chr2:15519846 2.22e+07 3.5 m 9.5 s 8.6% 40.9 m 37.4 m
    INFO 09:32:04,720 TraversalEngine - chr2:61729349 2.57e+07 4.0 m 9.4 s 10.0% 39.8 m 35.8 m
    INFO 09:32:34,728 TraversalEngine - chr2:113404649 2.91e+07 4.5 m 9.3 s 11.7% 38.4 m 33.9 m
    INFO 09:33:04,729 TraversalEngine - chr2:170063504 3.24e+07 5.0 m 9.3 s 13.5% 36.9 m 31.9 m
    INFO 09:33:34,734 TraversalEngine - chr2:204820624 3.59e+07 5.5 m 9.2 s 14.7% 37.5 m 32.0 m
    INFO 09:34:04,740 TraversalEngine - chr3:3111817 3.89e+07 6.0 m 9.3 s 16.0% 37.5 m 31.5 m
    INFO 09:34:34,743 TraversalEngine - chr3:48501244 4.22e+07 6.5 m 9.2 s 17.5% 37.2 m 30.7 m
    INFO 09:35:04,744 TraversalEngine - chr3:108117422 4.54e+07 7.0 m 9.2 s 19.4% 36.1 m 29.1 m
    INFO 09:35:34,748 TraversalEngine - chr3:148577626 4.88e+07 7.5 m 9.2 s 20.7% 36.2 m 28.7 m
    INFO 09:36:04,754 TraversalEngine - chr3:197748351 5.21e+07 8.0 m 9.2 s 22.3% 35.9 m 27.9 m
    INFO 09:36:34,757 TraversalEngine - chr4:56325162 5.50e+07 8.5 m 9.3 s 24.1% 35.2 m 26.7 m
    INFO 09:37:04,764 TraversalEngine - chr4:106861604 5.84e+07 9.0 m 9.3 s 25.8% 34.9 m 25.9 m
    INFO 09:37:34,773 TraversalEngine - chr4:169215011 6.16e+07 9.5 m 9.3 s 27.8% 34.2 m 24.7 m
    INFO 09:38:04,775 TraversalEngine - chr5:37665146 6.44e+07 10.0 m 9.3 s 29.7% 33.7 m 23.7 m
    INFO 09:38:34,784 TraversalEngine - chr5:94859344 6.77e+07 10.5 m 9.3 s 31.5% 33.3 m 22.8 m
    INFO 09:39:04,794 TraversalEngine - chr5:141334425 7.11e+07 11.0 m 9.3 s 33.0% 33.3 m 22.3 m
    INFO 09:39:34,806 TraversalEngine - chr6:5109692 7.40e+07 11.5 m 9.3 s 34.5% 33.4 m 21.8 m
    INFO 09:40:04,817 TraversalEngine - chr6:41773298 7.73e+07 12.0 m 9.3 s 35.7% 33.6 m 21.6 m
    INFO 09:40:34,825 TraversalEngine - chr6:91364545 8.07e+07 12.5 m 9.3 s 37.3% 33.5 m 21.0 m
    INFO 09:41:04,826 TraversalEngine - chr6:147249706 8.40e+07 13.0 m 9.3 s 39.1% 33.3 m 20.3 m
    INFO 09:41:34,831 TraversalEngine - chr7:23650833 8.69e+07 13.5 m 9.3 s 40.6% 33.2 m 19.7 m
    INFO 09:42:04,836 TraversalEngine - chr7:87046856 9.00e+07 14.0 m 9.3 s 42.7% 32.8 m 18.8 m
    INFO 09:42:34,838 TraversalEngine - chr7:128220270 9.34e+07 14.5 m 9.3 s 44.0% 33.0 m 18.5 m
    INFO 09:43:04,839 TraversalEngine - chr8:16032718 9.64e+07 15.0 m 9.3 s 45.5% 33.0 m 18.0 m
    INFO 09:43:34,848 TraversalEngine - chr8:70978649 9.97e+07 15.5 m 9.3 s 47.3% 32.8 m 17.3 m
    INFO 09:44:04,849 TraversalEngine - chr8:133141853 1.03e+08 16.0 m 9.3 s 49.3% 32.5 m 16.5 m
    INFO 09:44:34,861 TraversalEngine - chr9:36607567 1.06e+08 16.5 m 9.4 s 50.9% 32.4 m 15.9 m
    INFO 09:45:04,863 TraversalEngine - chr9:111848126 1.09e+08 17.0 m 9.3 s 53.3% 31.9 m 14.9 m
    INFO 09:45:34,870 TraversalEngine - chr10:5788378 1.12e+08 17.5 m 9.4 s 54.5% 32.1 m 14.6 m
    INFO 09:46:04,881 TraversalEngine - chr10:63662048 1.15e+08 18.0 m 9.4 s 56.3% 32.0 m 14.0 m
    INFO 09:46:34,884 TraversalEngine - chr10:102740355 1.19e+08 18.5 m 9.3 s 57.6% 32.1 m 13.6 m
    INFO 09:47:04,885 TraversalEngine - chr11:4967349 1.22e+08 19.0 m 9.4 s 58.8% 32.3 m 13.3 m
    INFO 09:47:34,886 TraversalEngine - chr11:47843747 1.25e+08 19.5 m 9.4 s 60.2% 32.4 m 12.9 m
    INFO 09:48:04,894 TraversalEngine - chr11:83180407 1.28e+08 20.0 m 9.3 s 61.3% 32.6 m 12.6 m
    INFO 09:48:34,902 TraversalEngine - chr11:125325718 1.32e+08 20.5 m 9.3 s 62.7% 32.7 m 12.2 m
    INFO 09:49:04,904 TraversalEngine - chr12:25232248 1.35e+08 21.0 m 9.4 s 63.8% 32.9 m 11.9 m
    INFO 09:49:34,911 TraversalEngine - chr12:56646266 1.38e+08 21.5 m 9.4 s 64.9% 33.2 m 11.7 m
    INFO 09:50:04,920 TraversalEngine - chr12:102038451 1.41e+08 22.0 m 9.4 s 66.3% 33.2 m 11.2 m
    INFO 09:50:34,922 TraversalEngine - chr13:21429694 1.44e+08 22.5 m 9.4 s 68.0% 33.1 m 10.6 m
    INFO 09:51:04,924 TraversalEngine - chr13:77736127 1.47e+08 23.0 m 9.4 s 69.9% 32.9 m 9.9 m
    INFO 09:51:34,930 TraversalEngine - chr14:34395037 1.50e+08 23.5 m 9.4 s 72.2% 32.6 m 9.1 m
    INFO 09:52:04,936 TraversalEngine - chr14:76644221 1.54e+08 24.0 m 9.4 s 73.5% 32.6 m 8.6 m
    INFO 09:52:34,947 TraversalEngine - chr15:32450575 1.57e+08 24.5 m 9.4 s 75.6% 32.4 m 7.9 m
    INFO 09:53:04,951 TraversalEngine - chr15:62991093 1.60e+08 25.0 m 9.4 s 76.6% 32.7 m 7.7 m
    INFO 09:53:35,534 TraversalEngine - chr16:81725 1.64e+08 25.5 m 9.4 s 77.8% 32.8 m 7.3 m
    INFO 09:54:05,545 TraversalEngine - chr16:48204121 1.67e+08 26.0 m 9.3 s 79.4% 32.8 m 6.7 m
    INFO 09:54:35,749 TraversalEngine - chr17:69409 1.70e+08 26.5 m 9.4 s 80.8% 32.8 m 6.3 m
    INFO 09:55:05,756 TraversalEngine - chr17:28791767 1.74e+08 27.0 m 9.3 s 81.7% 33.1 m 6.1 m
    INFO 09:55:35,767 TraversalEngine - chr17:56399591 1.77e+08 27.5 m 9.3 s 82.6% 33.3 m 5.8 m
    INFO 09:56:05,781 TraversalEngine - chr18:8370781 1.80e+08 28.0 m 9.3 s 83.7% 33.5 m 5.5 m
    INFO 09:56:35,800 TraversalEngine - chr18:70149611 1.83e+08 28.5 m 9.3 s 85.7% 33.3 m 4.8 m
    INFO 09:57:05,803 TraversalEngine - chr19:31039619 1.87e+08 29.0 m 9.3 s 86.9% 33.4 m 4.4 m
    INFO 09:57:35,805 TraversalEngine - chr19:55119393 1.90e+08 29.5 m 9.3 s 87.7% 33.7 m 4.1 m
    INFO 09:58:05,806 TraversalEngine - chr20:34599131 1.93e+08 30.0 m 9.3 s 88.9% 33.8 m 3.7 m
    INFO 09:58:35,812 TraversalEngine - chr21:34721693 1.97e+08 30.5 m 9.3 s 91.0% 33.5 m 3.0 m
    INFO 09:59:05,817 TraversalEngine - chr22:38336813 2.00e+08 31.0 m 9.3 s 92.6% 33.5 m 2.5 m
    INFO 09:59:35,822 TraversalEngine - chrX:64028096 2.03e+08 31.5 m 9.3 s 95.1% 33.1 m 96.7 s
    INFO 10:00:05,834 TraversalEngine - chrY:5605887 2.06e+08 32.0 m 9.3 s 98.3% 32.6 m 34.0 s
    INFO 10:00:10,787 GATKRunReport - Uploaded run statistics report to AWS S3
    ##### ERROR ------------------------------------------------------------------------------------------
    ##### ERROR A USER ERROR has occurred (version 1.4-15-gcd43f01):
    ##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
    ##### ERROR Please do not post this error to the GATK forum
    ##### ERROR
    ##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
    ##### ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
    ##### ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
    ##### ERROR
    ##### ERROR MESSAGE: Badly formed genome loc: Parameters to GenomeLocParser are incorrect:Unknown contig chrMt
    ##### ERROR
    I don't know if I should remove this chromosome from my fasta file or my bam files. I guess that it's present in both...

    Here is my fasta.fai file:
    ...
    chr21 48129895 2837230949 50 51
    chr22 51304566 2886323449 50 51
    chrX 155270560 2938654113 50 51
    chrY 59373566 3097030091 50 51
    chrM 16571 3157591135 50 51
    Do you know what should be done about this mitochondrial chromosome?


  • Carlos Borroto
    replied
    Originally posted by Jane M
    Code:
    [merlevede@U1009-PCJane patient1]$ samtools view -H picard_s_garma-fibros_converted_sorted.bam | grep ^@RG
    @RG	ID:garma-fibros	PL:illumina	PU:tata	LB:toto	SM:garma
    I think you need to be a little more careful than just putting gibberish in some of these fields. I'm in the process of getting familiar with GATK, and there are some parts I don't quite understand yet, like the use of these fields.

    These two links are my main sources of information:
    Best Practice Variant Detection with the GATK v3 [broadinstitute.org]
    Exome sequencing analysis manual [seqanswers.com]

    From what I can understand, several of the tools in the GATK pipeline use these fields in order to do their magic. Again, I'm far from sure what the dos and don'ts are here, but my rule of thumb is to set these three to the same value, the sample name (e.g. control01, treatment01, etc.):
    RGLB=String Read Group Library Required.
    RGPU=String Read Group platform unit (eg. run barcode) Required.
    RGSM=String Read Group sample name Required.

    And this one to my platform (e.g. illumina):
    RGPL=String Read Group platform (e.g. illumina, solid) Required.

    This might not be completely correct, but this way at least I'm not confusing the tools by telling them that two samples are from the same sequencer lane or library preparation when they actually are not. That is what you might be doing if you reuse the same 'gibberish' for more than one BAM file.
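
The rule of thumb above can be sketched as a dry run that only prints one Picard command per sample instead of executing it. `picard_rg_cmd` is a hypothetical helper, the input and output file names are made up, and setting RGID to the sample name as well is an extra assumption that the rule of thumb does not cover. The option names are the ones used elsewhere in this thread.

```shell
# picard_rg_cmd is a hypothetical helper: it prints (does not run) an
# AddOrReplaceReadGroups command with per-sample read group values.
picard_rg_cmd() {
    sample=$1
    platform=${2:-illumina}   # RGPL, defaults to illumina
    # RGLB, RGPU and RGSM all get the sample name (the rule of thumb);
    # RGID = sample name is an additional assumption.
    echo java -jar ./AddOrReplaceReadGroups.jar \
        "I=${sample}.bam" "O=${sample}.rg.bam" \
        "RGID=${sample}" "RGLB=${sample}" "RGPU=${sample}" "RGSM=${sample}" \
        "RGPL=${platform}"
}

# One distinct set of read group values per BAM file.
for s in control01 treatment01; do
    picard_rg_cmd "$s"
done
```

Printing the commands first makes it easy to check that no two BAM files end up with the same library or platform unit by accident.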

    I would actually love to find clear documentation on how to set the correct values, mainly on what information I need to get from the sequencing center to set them. Is there any documentation I could take a look at?

    Regards,
    Carlos


  • Jane M
    replied
    Thank you dpryan,

    I think that it's finally working. I ran Picard and got a non-empty file:
    java -jar ./AddOrReplaceReadGroups.jar I=~/../../../data/patient1/s_garma-fibros_converted_sorted.bam O=~/../../../data/patient1/picard_s_garma-fibros_converted_sorted.bam \
    > SORT_ORDER=coordinate CREATE_INDEX=true \
    > RGPL=illumina RGID=garma-fibros RGSM=garma RGLB=toto RGPU=tata VALIDATION_STRINGENCY=LENIENT
    [Wed Jan 18 16:42:12 CET 2012] net.sf.picard.sam.AddOrReplaceReadGroups INPUT=/home/merlevede/../../../data/patient1/s_garma-fibros_converted_sorted.bam OUTPUT=/home/merlevede/../../../data/patient1/picard_s_garma-fibros_converted_sorted.bam SORT_ORDER=coordinate RGID=garma-fibros RGLB=toto RGPL=illumina RGPU=tata RGSM=garma VALIDATION_STRINGENCY=LENIENT CREATE_INDEX=true VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_MD5_FILE=false
    [Wed Jan 18 16:42:12 CET 2012] Executing as merlevede@U1009-PCJane on Linux 3.1.6-1.fc16.x86_64 amd64; OpenJDK 64-Bit Server VM 1.6.0_22-b22; Picard version: 1.60(1086)
    INFO 2012-01-18 16:42:12 AddOrReplaceReadGroups Created read group ID=garma-fibros PL=illumina LB=toto SM=garma

    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:1106:7158:91967, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:1205:8058:144770, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:1206:20528:185225, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:1203:4551:95140, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:2205:4442:123188, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:1204:17464:45698, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:2208:16832:136911, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:1107:17717:4065, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:2104:7277:38433, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:1105:16587:151639, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:2108:18278:149545, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:1104:12598:60315, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:1105:1489:49925, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:2205:16036:86626, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:1204:9783:76003, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:1103:8346:98868, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:1105:8354:181686, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:1108:9333:173330, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:2106:5867:60527, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:2202:1737:149838, CIGAR M operator maps off end of reference
    Ignoring SAM validation error: ERROR: Read name HWI-ST584_81:4:1203:14352:92986, CIGAR M operator maps off end of reference
    [Wed Jan 18 17:23:48 CET 2012] net.sf.picard.sam.AddOrReplaceReadGroups done. Elapsed time: 41,60 minutes.
    Runtime.totalMemory()=733675520
    [merlevede@U1009-PCJane picard-tools-1.60]$
    and zee's command returns something this time:
    [merlevede@U1009-PCJane patient1]$ samtools view -H picard_s_garma-fibros_converted_sorted.bam | grep ^@RG
    @RG ID:garma-fibros PL:illumina PU:tata LB:toto SM:garma
    To be sure that my BAM files are correct now, I'm running Picard on my tumor BAM file as well.
    Then, I will rerun GATK on my two files to do the SomaticIndelDetector analysis.
    I will let you know whether it's definitely OK. I hope so!


  • dpryan
    replied
    From the Picard FAQ:
    Q: A Picard program complains that CIGAR M operator maps off the end of reference. I want this record to be treated as valid despite the fact that the alignment end is greater than the length of the reference sequence.

    A: Picard validation errors may be turned into warnings by passing the command line argument VALIDATION_STRINGENCY=LENIENT. Picard validation messages may be suppressed completely with VALIDATION_STRINGENCY=SILENT. Another option is to use CleanSam to soft-clip these reads so they don't map off the end of the reference.
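
With LENIENT the errors become "Ignoring SAM validation error" lines in Picard's log, so it can be worth counting them by type to see how many reads are affected. This is a rough sketch: `summarize_validation_warnings` is a hypothetical helper, the log file name in the usage comment is made up, and the message is assumed to be the text after the last ", " on each line, which matches the log lines posted in this thread.

```shell
# summarize_validation_warnings is a hypothetical helper: it reads a Picard
# log on stdin and prints each ignored validation message with its count,
# most frequent first.
summarize_validation_warnings() {
    grep 'Ignoring SAM validation error' |
        sed 's/.*, //' |
        sort |
        uniq -c |
        sort -rn
}

# Typical use (picard.log is an assumed file holding Picard's stderr):
#   java -jar ./AddOrReplaceReadGroups.jar ... VALIDATION_STRINGENCY=LENIENT 2> picard.log
#   summarize_validation_warnings < picard.log
```

A handful of warnings of one type is usually easy to reason about. A very large count may be a hint that the BAM file and the reference do not match.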


  • Jane M
    replied
    Originally posted by dpryan
    You're just telling Picard to create a missing information section at the start of the file and add a small label to the associated reads; it won't do anything else. The UnifiedGenotyper error message you posted at the start of the thread indicated that it doesn't even use the RGLB field, so you can probably enter anything you want in there. You could probably enter gibberish and not have it matter for this purpose.

    OK, then I filled the options with "gibberish". It ran for a while and gave:
    java -jar ./AddOrReplaceReadGroups.jar I=~/../../../data/patient1/s_garma-fibros_converted_sorted.bam O=~/../../../data/patient1/picard_s_garma-fibros_converted_sorted.bam \
    > SORT_ORDER=coordinate CREATE_INDEX=true \
    > RGPL=illumina RGID=garma-fibros RGSM=garma RGLB=toto RGPU=tata
    [Wed Jan 18 13:53:51 CET 2012] net.sf.picard.sam.AddOrReplaceReadGroups INPUT=/home/merlevede/../../../data/patient1/s_garma-fibros_converted_sorted.bam OUTPUT=/home/merlevede/../../../data/patient1/picard_s_garma-fibros_converted_sorted.bam SORT_ORDER=coordinate RGID=garma-fibros RGLB=toto RGPL=illumina RGPU=tata RGSM=garma CREATE_INDEX=true VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_MD5_FILE=false
    [Wed Jan 18 13:53:51 CET 2012] Executing as merlevede@U1009-PCJane on Linux 3.1.6-1.fc16.x86_64 amd64; OpenJDK 64-Bit Server VM 1.6.0_22-b22; Picard version: 1.60(1086)
    INFO 2012-01-18 13:53:51 AddOrReplaceReadGroups Created read group ID=garma-fibros PL=illumina LB=toto SM=garma


    [Wed Jan 18 14:08:08 CET 2012] net.sf.picard.sam.AddOrReplaceReadGroups done. Elapsed time: 14,29 minutes.
    Runtime.totalMemory()=2009399296
    Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Read name HWI-ST584_81:4:1106:7158:91967, CIGAR M operator maps off end of reference
    at net.sf.samtools.SAMUtils.processValidationErrors(SAMUtils.java:448)
    at net.sf.samtools.BAMRecord.getCigar(BAMRecord.java:247)
    at net.sf.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:136)
    at net.sf.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:37)
    at net.sf.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:210)
    at net.sf.samtools.util.SortingCollection.add(SortingCollection.java:150)
    at net.sf.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:170)
    at net.sf.picard.sam.AddOrReplaceReadGroups.doWork(AddOrReplaceReadGroups.java:93)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:119)
    at net.sf.picard.sam.AddOrReplaceReadGroups.main(AddOrReplaceReadGroups.java:61)
    [merlevede@U1009-PCJane picard-tools-1.60]$
    I also needed to add the RGPU option.
    The BAM file that I got is, of course, empty.


  • dpryan
    replied
    You're just telling Picard to create a missing information section at the start of the file and add a small label to the associated reads; it won't do anything else. The UnifiedGenotyper error message you posted at the start of the thread indicated that it doesn't even use the RGLB field, so you can probably enter anything you want in there. You could probably enter gibberish and not have it matter for this purpose.
