Hi,
My boss wants me to look at publicly available Solexa data, and I'm not sure how to process it so that it can be visualized in a genome browser. I would like to start with a graphical representation of the coverage to identify regions with many reads. If the data looks suitable, I'd then like to run MACS to identify overrepresented regions in the genome (human hg18).
What I did so far:
1. I ran Bowtie and received a .sam file as output.
The data looks like:
Code:
@HD VN:1.0 SO:unsorted
@SQ SN:chr1 LN:247249719
@SQ SN:chr2 LN:242951149
@SQ SN:chr3 LN:199501827
@SQ SN:chr4 LN:191273063
@SQ SN:chr5 LN:180857866
@SQ SN:chr6 LN:170899992
@SQ SN:chr7 LN:158821424
@SQ SN:chr8 LN:146274826
@SQ SN:chr9 LN:140273252
and later:
Code:
SRR027956.6443341 SL-XAU_2_FC30E0LAAXX:1:100:1461:151 length=76 4 * 00 * * 0 0 CCATGATCAAGTGGGCTTCATCCCTGGGATGCAAGGCTGGTTCAACATACGAAAATCAAAGATCGGAAGAGCGGTT <<3<<<<<-<6<;06<<<<<<<<<<<<<<<44334,,,---,,,-+,,,,,,,,,,+,,,+,,*+(,,,,++*+&+ XM:i:0
This looks like correct SAM format to me.
2. I downloaded the ConvertToBed utility from Vancouver short read archive and invoke it like this:
Code:
java -jar conversion_util/ConvertToBed.jar -aligner sam -input Hi-C_HindIII_GM_1_1.sam -qualityfilter 10 -output /output
but get an error saying:
Code:
Error: Coundn't create log file : /output/AlignReadsToBed.log
But that log file is there, so it has been created.
What might be wrong?
I'm on Ubuntu 9.1. I have no clue how to get this script running.
In parallel I tried the samToBed.py I found on Sourceforge, with no luck there either:
Code:
python samToBed.py -s Hi-C_HindIII_GM_1_1.sam
Traceback (most recent call last):
  File "samToBed.py", line 138, in <module>
    sys.exit(main())
  File "samToBed.py", line 135, in main
    processSAM(samFile, aType)
  File "samToBed.py", line 47, in processSAM
    makeBED(samLine, alignType)
  File "samToBed.py", line 53, in makeBED
    samFlag = int(samFields[1])
ValueError: invalid literal for int() with base 10: 'VN:1.0'
Might it be related to an improperly formatted SAM file that I got from Bowtie?
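The value in the traceback, 'VN:1.0', is the second field of the @HD header line, so the script seems to be parsing header lines as if they were reads. Below is a minimal sketch of a header-aware SAM-to-BED conversion, assuming tab-delimited SAM input and six-column BED output. The function name sam_to_bed is hypothetical and is not taken from samToBed.py or ConvertToBed.

```python
import re

# CIGAR operations that consume reference bases (per the SAM specification).
REF_CONSUMING = set("MDN=X")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def sam_to_bed(lines):
    """Yield BED6 lines for the mapped reads in an iterable of SAM lines.

    Hypothetical helper for illustration only.
    """
    for line in lines:
        if line.startswith("@"):       # header line (@HD, @SQ, ...), not a read
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:           # not a complete alignment record
            continue
        name, flag, chrom, pos, mapq, cigar = fields[:6]
        flag = int(flag)
        if flag & 4 or cigar == "*":   # 0x4 = read is unmapped
            continue
        start = int(pos) - 1           # SAM is 1-based, BED is 0-based
        ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar)
                      if op in REF_CONSUMING)
        strand = "-" if flag & 16 else "+"   # 0x10 = reverse strand
        yield "\t".join([chrom, str(start), str(start + ref_len),
                         name, mapq, strand])


# Made-up example records: one header, one unmapped read, one mapped read.
example = [
    "@HD\tVN:1.0\tSO:unsorted\n",
    "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
    "read2\t16\tchr1\t100\t255\t36M\t*\t0\t0\tACGT\tIIII\n",
]
for bed_line in sam_to_bed(example):
    print(bed_line)
```

Only the mapped read produces a BED line. A read with flag 4, like the one shown in the post, is unmapped and would be skipped.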
Suggestions are really appreciated!!
Maxim