Effects of library quality and indexes on demultiplexing NovaSeq X Series data
Last updated
© 2023 Illumina, Inc. All rights reserved. All trademarks are the property of Illumina, Inc. or their respective owners. Trademark information: illumina.com/company/legal.html. Privacy policy: illumina.com/company/legal/privacy.html
Background
Multiple factors affect the number of undetermined reads on the NovaSeq X Series, including:
Library quality
Loading concentration
Index sequence
Index color balance
Demultiplexing settings
Library Quality and Loading Concentration
Adapter dimer contamination alone can result in higher-than-expected undetermined reads during the demultiplexing process. This effect can further be exacerbated by nonideal loading concentrations.
During testing, Illumina found that when Illumina DNA PCR-Free libraries were loaded at their ideal concentration of 130 pM, only 10% of reads were reported as undetermined. However, when the loading concentration was increased to 650 pM, undetermined reads rose to 19%. With a 1x adapter dimer spike-in, the percentage of undetermined reads increased to 13% when optimally loaded, and to 79% when the run was overloaded. Because shorter fragments cluster preferentially during sequencing, a higher loading concentration allows short fragments, such as adapter dimers, to outcompete the target library more strongly. With a 2x adapter dimer spike-in, the percentage of undetermined reads increased further at both loading concentrations (130 pM and 650 pM). The elevated number of undetermined reads can be mitigated by performing an additional bead cleanup to remove adapter dimers prior to sequencing.
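To make the size of the overloading effect easier to compare, the percentages reported above can be tabulated and differenced. This is only an arithmetic sketch: the values are the ones quoted in the paragraph, while the dictionary layout and function names are illustrative choices, not part of any Illumina tool.

```python
# Undetermined-read percentages quoted in the paragraph above.
# Keys: (adapter dimer spike-in, loading concentration in pM).
reported = {
    ("none", 130): 10.0,
    ("none", 650): 19.0,
    ("1x", 130): 13.0,
    ("1x", 650): 79.0,
}


def overload_increase(spike_in: str) -> float:
    """Percentage-point rise in undetermined reads from 130 pM to 650 pM."""
    return reported[(spike_in, 650)] - reported[(spike_in, 130)]


def assigned_fraction(spike_in: str, loading_pm: int) -> float:
    """Fraction of reads assigned to a sample (1 minus the undetermined fraction)."""
    return 1.0 - reported[(spike_in, loading_pm)] / 100.0


for spike in ("none", "1x"):
    print(
        f"spike-in={spike}: "
        f"+{overload_increase(spike):.0f} points when overloaded, "
        f"{assigned_fraction(spike, 650):.0%} of reads assigned at 650 pM"
    )
```

Without a spike-in, overloading costs 9 percentage points. With the 1x spike-in it costs 66, which is why the two factors are best considered together.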
Index Design
The NovaSeq X Series uses XLEAP-SBS chemistry, which is a fundamentally new Sequencing by Synthesis (SBS) chemistry. With NovaSeq X Series XLEAP-SBS chemistry, C nucleotides are the dual-color base, rather than A nucleotides as with other two-channel systems. This change affects the rules for color balancing, and some indexes may also perform differently (1 bp mismatch rate, % CV, etc.) with XLEAP-SBS chemistry. Indexes therefore need to be rescreened, even if they have been used successfully on other Illumina platforms, including the NovaSeq 6000 or other two-channel systems.
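A per-cycle color balance check for a pool of indexes could be sketched as follows. Only the fact that C is detected in both channels comes from the text above. The channel names and the assignments for A, T, and G are assumptions made for illustration, and the rule used here (flag any cycle where a channel is dark across the whole pool) is a deliberately strict simplification. The Index Adapters Pooling Guide remains the authoritative source for the actual rules.

```python
from typing import Dict, List, Set

# ASSUMPTION: only "C emits in both channels" is taken from the article.
# The A, T, and G entries and the channel names are illustrative placeholders.
CHANNELS: Dict[str, Set[str]] = {
    "C": {"blue", "green"},
    "A": {"blue"},
    "T": {"green"},
    "G": set(),
}


def unbalanced_cycles(
    indexes: List[str], channels: Dict[str, Set[str]] = CHANNELS
) -> List[int]:
    """Return 1-based cycles where at least one channel has no signal in the pool."""
    if not indexes:
        return []
    length = len(indexes[0])
    if any(len(ix) != length for ix in indexes):
        raise ValueError("all indexes must have the same length")

    all_channels = set().union(*channels.values())
    flagged = []
    for cycle in range(length):
        lit: Set[str] = set()
        for ix in indexes:
            lit |= channels[ix[cycle].upper()]
        if lit != all_channels:
            flagged.append(cycle + 1)
    return flagged


# Hypothetical two-index pool, not taken from any Illumina kit.
print(unbalanced_cycles(["ACGT", "TGCA"]))
```

Passing the channel map as a parameter keeps the check reusable if the assumed assignments turn out to differ for a given instrument.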
Illumina has validated two index sets with XLEAP-SBS chemistry: the Illumina DNA/RNA UD Indexes (v3) and the IDT for Illumina TruSeq DNA or RNA UD Indexes. Low-plex pooling guidance for these indexes with XLEAP-SBS chemistry can be found in the Index Adapters Pooling Guide. Note: One index (UDI035 from well C5) of the IDT for Illumina TruSeq UD Indexes has a higher-than-optimal mismatch rate (~20% MM) on the NovaSeq X Series and is not recommended for use on this platform.
When screening custom or third-party indexes for the NovaSeq X Series, Illumina recommends:
Identify library preparation methods planned to be run on NovaSeq X Series.
Synthesize multiple oligo pairs containing the custom index sequences (HPLC or PAGE purification recommended). Make sure a sufficient number of indexes are synthesized for screening (at least two times the number needed is a reasonable starting point), as some index pairs may not meet performance criteria.
Prepare libraries with the library prep kit of choice using the synthesized index pairs.
Identify index pairs with a low library yield and omit them from further consideration.
Sequence libraries generated with the remaining index sets on NovaSeq X Series at high plexity (eg, 96-plex). Illumina recommends a separate run for every platform due to the distinct sequencing chemistries. Sequences screened on NextSeq 2000 with standard SBS chemistry or NovaSeq 6000 must be rescreened on the NovaSeq X Series.
Discard index pairs with elevated 1 bp mismatch values.
When pooling different indexes in the same run/lane, follow the color balancing guidelines to ensure base diversity in every run.
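The filtering step of the screening procedure above (discarding index pairs with elevated 1 bp mismatch values) can be sketched in a few lines. The data class, the function names, the read counts, and the 5% cutoff are all hypothetical. The article does not state a pass/fail threshold; it only notes that ~20% is higher than optimal.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexPairStats:
    name: str
    perfect_reads: int       # reads whose index pair matched exactly
    one_mismatch_reads: int  # reads assigned with a 1 bp mismatch


def mismatch_rate(stats: IndexPairStats) -> float:
    """Share of assigned reads that needed a 1 bp mismatch to be assigned."""
    total = stats.perfect_reads + stats.one_mismatch_reads
    if total == 0:
        raise ValueError(f"{stats.name} has no assigned reads")
    return stats.one_mismatch_reads / total


def passing_pairs(pool: List[IndexPairStats], max_rate: float = 0.05) -> List[str]:
    """Names of index pairs with reads and a mismatch rate at or below max_rate.

    ASSUMPTION: max_rate is an illustrative cutoff, not an Illumina criterion.
    """
    return [
        s.name
        for s in pool
        if s.perfect_reads + s.one_mismatch_reads > 0
        and mismatch_rate(s) <= max_rate
    ]


# Made-up counts for three hypothetical custom index pairs.
pool = [
    IndexPairStats("pair_A", perfect_reads=980_000, one_mismatch_reads=20_000),
    IndexPairStats("pair_B", perfect_reads=800_000, one_mismatch_reads=200_000),
    IndexPairStats("pair_C", perfect_reads=0, one_mismatch_reads=0),
]
print(passing_pairs(pool))
```

Pairs with no assigned reads are excluded rather than treated as passing, since a missing library is not evidence of a good index.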
Demultiplexing
Demultiplexing settings can impact the number of reads recovered from a sequencing run and the percentage of undetermined reads. BCL Convert allows users to specify the number of mismatches tolerated when demultiplexing sequencing runs, with the default set to 1 bp mismatch.
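In a v2 sample sheet, these tolerances are set per index read in the BCL Convert settings section. A minimal fragment, shown here with the default value of 1 for both index reads and with the rest of the sample sheet omitted, might look like this:

```
[BCLConvert_Settings]
BarcodeMismatchesIndex1,1
BarcodeMismatchesIndex2,1
```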
Illumina recommends 1 bp mismatch as a matter of routine for NovaSeq X Series runs. Illumina indexes are designed with a Hamming distance of 4; therefore, mismatch settings of up to 2 bp can be used to maximize data recovery. 0 bp mismatch (Perfect Read) metrics are reported in standard demultiplexing reports for users who want to track this metric.
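For custom index sets, it can help to check the minimum pairwise Hamming distance before choosing a mismatch setting. The sketch below computes that distance and assigns an observed index only when exactly one expected index falls within the tolerance. It is a simplified illustration, not BCL Convert's implementation, and the index sequences are invented for the example.

```python
from itertools import combinations
from typing import List, Optional


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(indexes: List[str]) -> int:
    """Smallest Hamming distance between any two indexes in the set."""
    return min(hamming(a, b) for a, b in combinations(indexes, 2))


def assign(read_index: str, indexes: List[str], max_mismatches: int = 1) -> Optional[str]:
    """Return the single index within tolerance, or None if zero or several match."""
    hits = [ix for ix in indexes if hamming(read_index, ix) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


# Invented 8 bp indexes used only to exercise the functions.
expected = ["AAAAAAAA", "AAAACCCC", "CCCCAAAA"]
print(min_pairwise_distance(expected))
print(assign("AAAAAAAC", expected, max_mismatches=1))
```

Observed indexes that fall within the tolerance of more than one expected index are left unassigned and would be counted as undetermined.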
For more information, see the article “Why is allowing mismatches when demultiplexing desirable?”
For any feedback or questions regarding this article (Illumina Knowledge Article #9246), contact Illumina Technical Support at techsupport@illumina.com.