More Unique ELAND Alignments Than Reads?

  • More Unique ELAND Alignments Than Reads?

    Can anyone explain to me why I am finding more unique alignments in my eland_results file than I have reads in my sequence file?

    wc -l s7_sequence.txt
    23232776 lines / 4 = 5808194 reads
    grep -c " U[0,1,2] " s7_eland_result.txt
    6371019

    It doesn't make sense to me that the uniquely aligned sequence count should be greater than the total sequences read...

    Thanks for your advice!

  • #2
    Originally posted by seq7
    Can anyone explain to me why I am finding more unique alignments in my eland_results file than I have reads in my sequence file?

    wc -l s7_sequence.txt
    23232776 lines / 4 = 5808194 reads
    grep -c " U[0,1,2] " s7_eland_result.txt
    6371019

    It doesn't make sense to me that the uniquely aligned sequence count should be greater than the total sequences read...

    Thanks for your advice!

    I suspect your grep is overcounting. First of all, you shouldn't put commas inside square brackets: the brackets already list the alternative characters, so a comma there is matched as a literal comma. And what does

    Code:
    wc -l s7_eland_result.txt
    output?
    Anyway, the correct grep for you is:

    Code:
    grep -cw 'U[012]' s7_eland_result.txt
    In the end, s7_sequence.txt contains the filtered and uniquely mapped reads, doesn't it? (I may be wrong; it's been a long time since I last used Illumina output.)
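
To illustrate the bracket-expression point, here is a minimal sketch on a made-up three-line sample (not real ELAND output, which has more fields). Inside `[...]` a comma is just another literal character, so `U[0,1,2]` also matches `U,`. Quoting the pattern keeps the shell from treating it as a filename glob.

```shell
# Hypothetical sample lines, invented for illustration only.
sample=$(printf '%s\n' \
  '>read1 ACGT U0 1 0 0' \
  '>read2 ACGT U, 0 0 0' \
  '>read3 ACGT R2 0 0 5')

# The comma inside the brackets is matched literally, so "U," is counted too.
with_comma=$(printf '%s\n' "$sample" | grep -c ' U[0,1,2] ')

# Without the comma only U0, U1 and U2 match; -w requires whole-word matches.
without_comma=$(printf '%s\n' "$sample" | grep -cw 'U[012]')

echo "with comma: $with_comma, without comma: $without_comma"
```

On a file with no `U,` field the two patterns would be expected to give the same count, so the comma alone need not change the result.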

    d


    • #3
      wc -l s7_eland_result.txt
      9196920

      grep -cw U[012] s7_eland_result.txt
      6371019

      sequence.txt is filtered output.
      eland_result.txt is unfiltered ELAND alignment output.

      Thanks for the reply!
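
Since sequence.txt is filtered and eland_result.txt is not, one way to compare like with like is to count only the unique alignments whose read also appears in the sequence file. The sketch below is an assumption-laden illustration, not a tested recipe for this data set. It assumes four-line FASTQ records with headers of the form `@name`, ELAND lines of the form `>name sequence U0 ...` separated by whitespace, and read names that are identical in both files. The function names are hypothetical.

```shell
# Count reads in a FASTQ file, assuming exactly four lines per record.
count_fastq_reads() {
  awk 'END { printf "%d\n", NR / 4 }' "$1"
}

# Count U0/U1/U2 alignments whose read name also occurs in the FASTQ file.
# Assumed layouts: FASTQ header "@name"; ELAND line ">name seq U0 ...".
count_unique_filtered() {
  fastq=$1
  eland=$2
  awk '
    # First file (FASTQ): remember the name on every record header line.
    NR == FNR { if (FNR % 4 == 1) keep[substr($1, 2)] = 1; next }
    # Second file (ELAND): count unique matches for remembered names only.
    $3 ~ /^U[012]$/ && (substr($1, 2) in keep) { n++ }
    END { print n + 0 }
  ' "$fastq" "$eland"
}
```

Usage would look like `count_unique_filtered s7_sequence.txt s7_eland_result.txt`. If the read names differ between the two files (for example a trailing `/1` in one of them), the name extraction would need adjusting.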
