Hi SEQanswers community,
I am using RNAplfold to estimate the probability that binding sites are single-stranded, but I am stuck interpreting the output _lunp files and can't find documentation for the format. As I understand it, running with the -u flag set to the length of the binding motif should report the accessibility (probability of being single-stranded) of every k-mer of that length. I ran it with -u 5 to get 5-mer accessibility, which produces 5 columns for each position in the sequence. How should I interpret the numbers in these columns?
My guess is that the column number indicates where the position sits within the k-mer, so each column represents a different k-mer containing that nucleotide. That would mean column 1 at position 1 describes the same k-mer as column 2 at position 2, and so on. To get the accessibility of a particular k-mer, I would then need to average the values along the diagonal. However, I'm not sure this interpretation is right: sometimes the values at position 1, column 1 and position 2, column 2 differ a lot, whereas I would expect them to be similar if my assumption held. Can anyone help me interpret these files?
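To make my guess concrete, here is roughly what I mean by averaging along the diagonal. This is only a sketch of my assumption, not of the documented format. It assumes the _lunp file is a whitespace-separated table with the position in the first field, that lines starting with # are headers, and that an entry of NA means "no value". The names parse_lunp and diagonal_mean are my own, not part of RNAplfold.

```python
from typing import Dict, List, Optional


def parse_lunp(text: str) -> Dict[int, List[Optional[float]]]:
    """Read a _lunp-style table into {position: [col1, col2, ...]}.

    Assumed layout (my guess): '#' lines are headers, the first field
    of each data row is the position, and 'NA' marks a missing value.
    """
    table: Dict[int, List[Optional[float]]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        position = int(fields[0])
        table[position] = [None if f == "NA" else float(f) for f in fields[1:]]
    return table


def diagonal_mean(
    table: Dict[int, List[Optional[float]]], start: int, k: int
) -> Optional[float]:
    """Average column 1 at `start`, column 2 at `start + 1`, ... up to column k.

    This encodes my hypothesis that these cells describe the same k-mer.
    Missing rows or NA cells are skipped; returns None if nothing is left.
    """
    values = []
    for offset in range(k):
        row = table.get(start + offset)
        if row is None or offset >= len(row) or row[offset] is None:
            continue
        values.append(row[offset])
    return sum(values) / len(values) if values else None
```

With a real file I would call something like diagonal_mean(parse_lunp(open(path).read()), start, 5), where path is whatever _lunp file RNAplfold wrote. If the diagonal reading is wrong, the values being averaged would not be expected to agree, which might be exactly what I am seeing.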