Hello everyone,
I am relatively new to NGS and bioinformatics and am working on CRISPR screens in cell lines.
I would like to create a reference genome containing approximately 120,000 different sequences. I have a FASTA file and a txt file containing all sgRNA sequences along with the gene each one targets, like this:
"Gene" "id" "Sequence"
"A1BG" "HGLibA_00001" "GTCGCTGAGCTCCGATTCGA"
"A1BG" "HGLibA_00002" "ACCTGTAGTTGCCGGCGTGC"
"A1BG" "HGLibA_00003" "CGTCAGCGTCACATTGGCCA"
"A1CF" "HGLibA_00004" "CGCGCACTGGTCCAGCGCAC"
"A1CF" "HGLibA_00005" "CCAAGCTATATCCTGTGCGC"
"A1CF" "HGLibA_00006" "AAGTTGCTTGATTGCATTCT"
"A2M" "HGLibA_00007" "CGCTTCTTAAATTCTTGGGT"
...
The goal is to map my reads against this reference, allowing for one mismatch, and get an alignment file out of it (BAM?).
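For the table-to-reference step, a minimal sketch in Python might look like the following. It assumes the txt file is whitespace-separated with every field wrapped in double quotes, exactly as in the excerpt above; the function name `table_to_fasta` and the header layout (`>id gene`) are hypothetical choices for illustration, not something a particular aligner requires.

```python
def table_to_fasta(lines):
    """Turn quoted 'Gene id Sequence' rows into FASTA text (one record per sgRNA)."""
    records = []
    for line in lines:
        # Split on any whitespace (spaces or tabs) and drop the surrounding quotes.
        fields = [field.strip('"') for field in line.split()]
        # Skip the header row, blank lines, and anything that is not a 3-column row.
        if len(fields) != 3 or fields == ["Gene", "id", "Sequence"]:
            continue
        gene, guide_id, sequence = fields
        # The sgRNA id comes first so it becomes the reference name; the gene follows as a description.
        records.append(f">{guide_id} {gene}\n{sequence.upper()}\n")
    return "".join(records)


if __name__ == "__main__":
    example = [
        '"Gene" "id" "Sequence"',
        '"A1BG" "HGLibA_00001" "GTCGCTGAGCTCCGATTCGA"',
        '"A1BG" "HGLibA_00002" "ACCTGTAGTTGCCGGCGTGC"',
    ]
    print(table_to_fasta(example), end="")
```

To run this on a whole file, the same function can be given an open file handle instead of a list, with the returned text written to a `.fa` file (file names are up to you). The resulting FASTA could then be indexed with a short-read aligner. As one possibility, bowtie 1 builds an index with `bowtie-build` and has a `-v 1` option that allows at most one mismatch per read, though whether that fits this screen is an assumption to check against the aligner's manual.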
Thanks a lot,
Matt