I have a FASTA file I created from the bovine gene information, and I ran a uniq -d command on the names to make sure no name was duplicated. But when I use it as a reference, align reads to it, and then try to run those reads through Picard, Picard tells me I have duplicate SAM sequences.
Does anyone know of a simple solution to this or have a way to identify those troubling sequences that appear to have the same name, even though the uniq command won't identify them?
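One way to hunt for the offending names, offered as a rough sketch rather than a confirmed diagnosis: it assumes the collisions come either from headers that share the same first whitespace-delimited token (aligners typically keep only that token as the sequence name in the SAM header) or from duplicates that are not on adjacent lines, which `uniq -d` misses on unsorted input. The function name `duplicate_sequence_names` is made up for illustration.

```python
from collections import Counter


def duplicate_sequence_names(fasta_lines):
    """Return {name: count} for names seen more than once.

    Assumption: the name is the first whitespace-delimited token of
    each '>' header line, which is how aligners typically name the
    reference sequences they write to the SAM header.
    """
    counts = Counter()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            # An empty header (just '>') is counted under the empty name.
            name = fields[0] if fields else ""
            counts[name] += 1
    # Counting over the whole file catches duplicates that are far apart.
    return {name: n for name, n in counts.items() if n > 1}
```

It could be called on an open file handle, for example `duplicate_sequence_names(open("reference.fa"))`, where `reference.fa` is a placeholder file name. Headers such as `>geneA transcript 1` and `>geneA transcript 2` would both be reported under `geneA`, even though the full header lines differ.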