
  • sklages
    replied
    Script in terms of "perl script"? I never do this automatically ..

    You need to know your 5' vector/adaptor sequences, re sites if applicable and the 3' vector/adaptor/whatever sequences ... and then create a multi fasta file as mentioned before.

    Code:
                                      ----f2------------------------->
                          ----f1------------------------->
    |======================[]=====================|
                                      <-------------------------r1----
                         <-------------------------r2----
    I am afraid I am missing something?

    cheers,
    Sven



  • dan
    replied
    It's unclear to me how, given an arbitrary vector sequence, one generates the associated .splice file.

    Given the position of the splice site, I guess it's straightforward.

    Could you demo some simple script for doing this?



  • sklages
    replied
    keep in mind that you should use a non-proportional font (fixed) so that it makes sense.

    btw, it's not really clear to me what is unclear to you ... ;-)

    Sven
    Last edited by sklages; 09-28-2009, 01:52 AM. Reason: .. rethinking ..



  • dan
    replied
    Since I at least have something working for this question, I thought I'd update the thread. No clear answers exactly, but I got something that seemed to work (hopefully useful for someone) ...

    Some of what I eventually worked out on this topic is described here:





    And here is some info from an email exchange with Sven Klages (user 'sven').

    > What is the "sequence of the vector splice site"?

    The flanking bases of the cloning site, e.g. pUC19/SmaI:
    Figure
    ======



    ----f2------------------------->
    ----f1------------------------->
    |========================= GGG/CCC =========================|
    <-------------------------r1----
    <-------------------------r2----


    f1 = for.begin
    f2 = for.end
    r1 = rev.begin
    r2 = rev.end

    OVERLAPS f1/f2 and/or r1/r2 ~ 50bp

    So your splice site file could look like this (sequences
    shortened, [...]):

    >pUC19.for.begin
    attcgccattcaggctgcgcaactgttgggaagggcgatcggtgcgggcctcttcgctat
    [...]
    >pUC19.for.end
    tttcccagtcacgacgttgtaaaacgacggccagtgaattcgagctcggtaCCCGGGgat
    [...]
    >pUC19.rev.begin
    gggcagtgagcgcaacgcaattaatgtgagttagctcactcattaggcaccccaggcttt
    [...]
    >pUC19.rev.end
    aggaaacagctatgaccatgattacgccaagcttgcatgcctgcaggtcgactctagagg
    [...]

    "man lucy" will tell you more (after compiling).



    But I still didn't understand! Sven continued...

    roughly, you take the 5' flanking sequence,
    CAGTCCAGTTACGCTGGAGTCTGAGGCTCGTCCTGAATGATATCAAGCTTGAATTCGTT

    and the 3' flanking sequence,
    GACGAATTCTCTAGATATCGCTCAATACTGACCATTTAAATCATACCTGACCTCCATAGCAGAAAG

    and join it to form

    >pSMART-HCAmp.for.begin
    CAGTCCAGTTACGCTGGAGTCTGAGGCTCGTCCTGAATGATATCAAGCTTGAATTCGTT
    GACGAATTCTCTAGATATCGCTCAATACTGACCATTTAAATCATACCTGACCTCCATAGCAGAAAG
    >pSMART-HCAmp.for.end
    CAGTCCAGTTACGCTGGAGTCTGAGGCTCGTCCTGAATGATATCAAGCTTGAATTCGTT
    GACGAATTCTCTAGATATCGCTCAATACTGACCATTTAAATCATACCTGACCTCCATAGCAGAAAG

    Which is pretty much the same for 'begin' and 'end' ..
    This is not what is proposed, but it should work.

    You should "reverse complement" if you need reverse clipping
    as well.

    >pSMART-HCAmp.rev.begin
    [sequence]
    >pSMART-HCAmp.rev.end
    [sequence]

    lucy is pretty "tolerant" ...

    Just use 'lucy' with the flag '-debug FILENAME' to see if clipping
    was successful.


    If you're expecting any adaptors they should be included in
    the sequence as they are read by sequencing,

    Vector-Adaptor-(INSERT)-Adaptor-Vector
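
    A minimal Python sketch of the shortcut Sven describes above: join the 5' and 3' flanking sequences, reuse the joined sequence for 'begin' and 'end', and reverse complement it for the 'rev' records. The function names (reverse_complement, wrap, splice_fasta) and the 60-column line wrapping are illustrative assumptions and not part of lucy itself.

```python
# Complement table for DNA, covering N and lower-case bases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def wrap(seq: str, width: int = 60) -> str:
    """Break a sequence into fixed-width FASTA lines."""
    return "\n".join(seq[i:i + width] for i in range(0, len(seq), width))


def splice_fasta(vector_name: str, flank5: str, flank3: str) -> str:
    """Build four splice-file records from the two flanking sequences.

    Follows the shortcut from the thread: the same joined sequence is used
    for 'begin' and 'end', and its reverse complement for the 'rev' records.
    """
    forward = flank5 + flank3
    reverse = reverse_complement(forward)
    records = [
        ("for.begin", forward),
        ("for.end", forward),
        ("rev.begin", reverse),
        ("rev.end", reverse),
    ]
    body = "\n".join(f">{vector_name}.{tag}\n{wrap(seq)}" for tag, seq in records)
    return body + "\n"


if __name__ == "__main__":
    # Flanking sequences quoted in the thread for pSMART-HCAmp.
    flank5 = "CAGTCCAGTTACGCTGGAGTCTGAGGCTCGTCCTGAATGATATCAAGCTTGAATTCGTT"
    flank3 = "GACGAATTCTCTAGATATCGCTCAATACTGACCATTTAAATCATACCTGACCTCCATAGCAGAAAG"
    print(splice_fasta("pSMART-HCAmp", flank5, flank3), end="")
```

    If adaptors are expected, they would go into flank5 and flank3 in the order they are read (Vector-Adaptor on the 5' side, Adaptor-Vector on the 3' side). Running lucy with '-debug FILENAME', as suggested above, is the way to check whether the resulting file actually clips.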



    So I said...

    Thanks Sven, it's all clear now. Just to make sure I understand though,
    the GenBank sequence for this pSMART vector (pSMART-HCKan, AF532107.1)
    just 'happens' to start with:

    GACGAATTCTCTAGATATCGCTCAATACTGACCATTTAAATCATACCTGACCTCCATAGCAGAAAGTCAA


    and just 'happens' to end with:

    TGAGGCTCGTCCTGAATGATATCAAGCTTGAATTCGTT


    but actually, I need some detailed knowledge of where on the vector
    sequence the sequence 'insert site' (or splice site) is before I can
    create what you did above?



    And Sven said...

    Yes, you should know about the insert location.
    But that's easy, isn't it?

    If you have the whole sequence you should design the splice file as
    mentioned.


    ----f2------------------------->
    ----f1------------------------->
    |========================= INSERT =========================|

    <-------------------------r1----
    <-------------------------r2----


    f1 = for.begin
    f2 = for.end
    r1 = rev.begin
    r2 = rev.end

    OVERLAPS f1/f2 and/or r1/r2 ~ 50bp, individual length of f1,f2,r1,r2 ~150bp.
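
    Given the whole vector sequence and a known insert position, the layout in the figure can be sketched in Python roughly as follows. The vector is treated as circular, which covers the case where the insert site sits at the very start or end of the linear record, as with the pSMART GenBank entry discussed above. The exact window placement (f2 ending at the site, f1 shifted upstream, r1/r2 mirrored on the other strand), the 0-based 'site' coordinate and all function names are assumptions for illustration; "man lucy" remains the reference.

```python
# Complement table for DNA, covering N and lower-case bases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def circular_window(vector: str, start: int, length: int) -> str:
    """Return `length` bases starting at `start`, wrapping around the circle."""
    n = len(vector)
    return "".join(vector[(start + i) % n] for i in range(length))


def splice_windows(vector: str, site: int,
                   length: int = 150, overlap: int = 50) -> dict:
    """Cut the four flanking windows around an insert site.

    site: 0-based index of the first vector base downstream of the insert
          (assumed convention, not taken from the lucy documentation).
    """
    step = length - overlap  # shift between the two windows on each side
    return {
        # Upstream flank, read on the forward strand.
        "for.begin": circular_window(vector, site - length - step, length),
        "for.end": circular_window(vector, site - length, length),
        # Downstream flank, reverse complemented for reads from the other end.
        "rev.begin": reverse_complement(
            circular_window(vector, site + step, length)),
        "rev.end": reverse_complement(
            circular_window(vector, site, length)),
    }
```

    With the defaults this yields four sequences of ~150 bp where f1/f2 and r1/r2 each share ~50 bp, matching the numbers above. The windows can then be written out as FASTA records named VECTOR.for.begin, VECTOR.for.end, VECTOR.rev.begin and VECTOR.rev.end.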



  • dan
    started a topic Celera Assembler (WGS) - splice site file?

    Celera Assembler (WGS) - splice site file?

    Hi,

    I want to use the Celera Assembler (WGS) in my assembly pipeline in order to compare the results to Phred / Phrap. I read that to vector / quality trim my reads, I should use Lucy, but on this point I am confused.

    What is the "sequence of the vector splice site"?


    I am reading this: http://www.cbcb.umd.edu/research/CeleraAssembler.shtml

    "Each vector file [one per vector] must be accompanied by a splice site file containing the sequence within the vector that is adjacent to the splice sites used in the project. In case your project uses an adapter it should be included in the splice file. ... The vector file must contain a single FASTA-formatted sequence representing the entire sequencing vector. The splice file contains 4 FASTA records corresponding to approximately 200 bp flanking either side of the splice site, presented in both the forward and reverse-complemented orientation."


    Unfortunately I don't understand what this means. Specifically, what is the splice site file, and how do I identify the splice sites? Will this typically refer to the sequencing vector or the cloning vector (BAC)?

    The project uses the pSMART-HCKan (AF532107) sequencing vector from the Lucigen CLONESMART Blunt Cloning Kit ... does that mean anything to anyone?

    Should I just use the 200 bp either side of the primer sites?


    Sorry for the potentially very dumb question!

    Dan.
