
  • kmcarr
    replied
    Since the sequencing format was paired end with a dedicated index read added, you will need to adjust the --use-bases-mask accordingly. Assuming the dedicated index read is not providing you with any useful information (your index is part of read 1), it is safe to ignore it. Note that I also mentioned ignoring the "T" which sits between your index and your actual read. Given the design of your libraries and the run format used by the sequencing center, try this:

    Code:
    --use-bases-mask "I7NY*,N*,Y*"
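
To see what this mask does cycle by cycle, here is a minimal Python sketch that expands a mask over the (101,8,101) run format mentioned in this thread. It is a simplified model of the mask syntax (the letters Y, N and I, each followed by an optional count or *), not bcl2fastq's own parser, and `expand_mask` is a hypothetical helper name.

```python
import re

def expand_mask(mask, read_lengths):
    """Expand a bases mask into one letter per cycle for each read.

    Simplified model (an assumption, not bcl2fastq's parser):
    Y = keep base, N = ignore base, I = index base; a number repeats the
    letter, '*' fills the rest of the read, no suffix means one cycle.
    """
    parts = mask.split(",")
    if len(parts) != len(read_lengths):
        raise ValueError("mask needs one comma-separated part per read")
    expanded = []
    for part, length in zip(parts, read_lengths):
        if not re.fullmatch(r"(?:[YNIyni](?:\d+|\*)?)+", part):
            raise ValueError(f"unrecognised mask part: {part!r}")
        cycles = ""
        for letter, count in re.findall(r"([YNIyni])(\d+|\*)?", part):
            letter = letter.upper()
            if count == "*":
                # '*' covers every cycle left in this read
                cycles += letter * (length - len(cycles))
            else:
                cycles += letter * int(count or 1)
        if len(cycles) != length:
            raise ValueError(f"{part!r} does not cover {length} cycles")
        expanded.append(cycles)
    return expanded

result = expand_mask("I7NY*,N*,Y*", [101, 8, 101])
for read_number, cycles in enumerate(result, start=1):
    print(read_number, cycles.count("I"), cycles.count("N"), cycles.count("Y"))
```

Under this model, read 1 contributes 7 index cycles, 1 ignored cycle (the "T") and 93 sequence cycles, the dedicated 8-cycle index read is ignored entirely, and the last read is kept in full.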


  • Samarpana
    replied
    Hi kmcarr,

    I tried demultiplexing the way you suggested. I am now getting 3 files for each sample, as the command didn't work with just I7Y*, Y*.

    Code:
    /illumina/pipeline/bin/bcl2fastq --runfolder-dir /illumina/runs/DAta/Data_220716_Exome/160723_SN963_0297_BC66FUACXX --output-dir /illumina/runs/DAta/Data_220716_Exome/160723_SN963_0297_BC66FUACXX/HELP_gt --sample-sheet Sample_Sheet_Samarpana_230716_HELP-GT.csv --use-bases-mask I7Y*,Y*,Y*
    I just wanted to know what should be done with the 8-base sequence that lies after read 1, which is generally used for indexes in Illumina paired-end libraries (101,8,101). Should I discard/mask it or consider it part of read 1?

    I finally got a response from our collaborators and they mentioned using Picard for demultiplexing. Do you also recommend switching to Picard for this? I have previously only used Picard for SortSam and MarkDuplicates.



  • thermophile
    replied
    Originally posted by kmcarr View Post
    That does not really apply in this case since the index is part of sequencing read 1, not a dedicated read.
    Sorry, my bad for not looking at the sequence closely enough.


  • Samarpana
    replied
    Yes, you have guessed correctly. After ligating these adapters, there is a PCR step that adds sequences corresponding to the P5 and P7 oligo sequences to the DNA fragment, to allow binding to the flow cell and bridge amplification.

    Thanks for guiding me through the demultiplexing procedure. I will try what you have suggested and let you know how it works out.



  • kmcarr
    replied
    Originally posted by thermophile View Post
    Index 1 (the one on the P7 end) needs to be reverse complemented. I2 (the one on the P5 end) does not.
    That does not really apply in this case since the index is part of sequencing read 1, not a dedicated read.

    In this case enter the barcode as it appears (do not reverse complement). That is the orientation it will appear in the sequencing read so that is the orientation which should be in the sample sheet.

    If you are using Illumina bcl2fastq to demultiplex your data, the correct --use-bases-mask configuration would be "I7NY*" for single-end reads and "I7NY*,Y*" for paired-end reads. This tells bcl2fastq to use the first 7 bases of read 1 as the index, skip over the "T" following the index and output the remainder of read 1 as the sequence read.

    I find myself wanting to ask how you are adding the P5 & P7 oligos? As shown these adapters will not produce full Illumina library fragments as they only contain the sequence priming site, not the P5 & P7 oligos necessary for flow cell binding and bridge PCR. Is there a PCR step after ligation to add the additional sequences? That's the usual strategy.
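
As an illustration of what "I7NY*" means for a single read 1 sequence, here is a minimal Python sketch: the first 7 bases are taken as the barcode, the following "T" is skipped and the rest is the sequence read. The two barcodes are the ones from the adapters in this thread. The sample names and function names are hypothetical, and matching is exact only, whereas real demultiplexers typically tolerate some mismatches.

```python
# Barcodes as they appear at the start of read 1 (not reverse complemented).
# The sample names are placeholders for illustration only.
BARCODES = {
    "GTCATGA": "sample_a1_a2",
    "ACATCTC": "sample_a3_a4",
}

def split_inline_barcode(read, index_len=7, skip=1):
    """Split read 1 into (barcode, insert), dropping `skip` bases in between."""
    barcode = read[:index_len]
    insert = read[index_len + skip:]
    return barcode, insert

def assign_sample(read, barcodes=BARCODES):
    """Return (sample name, insert); unknown barcodes go to 'undetermined'."""
    barcode, insert = split_inline_barcode(read)
    return barcodes.get(barcode, "undetermined"), insert

# A made-up read: barcode + T + genomic sequence
example_read = "GTCATGA" + "T" + "ACGTTGCAAC"
print(assign_sample(example_read))
```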


  • thermophile
    replied
    Index 1 (the one on the P7 end) needs to be reverse complemented. I2 (the one on the P5 end) does not.


  • Samarpana
    replied
    Thanks so much for explaining. So, when I enter the index in the sample sheet for demultiplexing, should I use the complementary sequence of the 7 bases?



  • kmcarr
    replied
    The indexes are the 7 bases immediately upstream of the T overhang. The read 1 sequencing primer anneals upstream of that, so your barcodes will be the first 7 bases of read #1 (followed by a T in every read) instead of coming from the usual dedicated index reads. I have redrawn the primer pairs in the annealed state, with the bottom strand in its proper 3'->5' orientation. The barcode is marked with carets (^) and the position of the R1 sequencing primer is shown.

    Code:
    5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCTGTCATGA*T
    3'-TGTGAGAAAGGGATGTGCTGCGAGAAGGCTAGACAGTACT-p
                                        ^^^^^^^ barcode

    5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCTACATCTC*T
    3'-TGTGAGAAAGGGATGTGCTGCGAGAAGGCTAGATGTAGAG-p
                                        ^^^^^^^ barcode

    5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCT-->
    Read 1 primer
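
The annealing shown above can be checked with a short Python sketch: the reverse complement of each bottom oligo (a2, a4) should equal the top oligo (a1, a3) minus its T overhang, and the barcode is whatever lies between the read 1 primer site and that T. The sequences are copied from the original post with spaces and the * removed. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

# Adapter oligos from the original post, written 5'->3'
a1 = "ACACTCTTTCCCTACACGACGCTCTTCCGATCTGTCATGAT"
a2 = "TCATGACAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT"
a3 = "ACACTCTTTCCCTACACGACGCTCTTCCGATCTACATCTCT"
a4 = "GAGATGTAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT"

R1_PRIMER = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"

def inline_barcode(top, primer=R1_PRIMER, overhang="T"):
    """Extract the bases between the read 1 primer site and the 3' overhang."""
    if not (top.startswith(primer) and top.endswith(overhang)):
        raise ValueError("oligo does not match the expected layout")
    return top[len(primer):len(top) - len(overhang)]

for top, bottom in ((a1, a2), (a3, a4)):
    # The bottom strand pairs with everything except the 3' T overhang
    print(reverse_complement(bottom) == top[:-1], inline_barcode(top))
```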


  • Samarpana
    started a topic Index Sequence in adapters for multiplexing

    Index Sequence in adapters for multiplexing

    I am trying to replicate a protocol established in the labs of our collaborator. The problem is they have provided me with the adapter sequences for barcoding multiple samples but did not provide any further information. Nor are they responding to emails.

    I have always used Illumina-generated indexes (Nextera), so I can't figure out what the index sequences in these adapters are. Any help will be appreciated.

    The sequences are:
    a1: ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA TCT GTC ATG A*T
    a2: /5Phos/TCA TGA CAG ATC GGA AGA GCG TCG TGT AGG GAA AGA GTG T

    a3: ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA TCT ACA TCT C*T
    a4: /5Phos/GAG ATG TAG ATC GGA AGA GCG TCG TGT AGG GAA AGA GTG T

    * = phosphorothioate bond

    Which are the index sequences to be used for demultiplexing the samples, once the run is over?
