Library Loading Concentration and Optimization Guide for NovaSeq X/X Plus Instruments
Background
The NovaSeq X/X Plus instruments leverage numerous improvements to achieve higher data throughput than previous platforms. While the fundamental principles of the technology remain the same, library loading concentrations should be reoptimized on NovaSeq X/X Plus instruments when transitioning an assay developed on another Illumina platform. The following guide discusses considerations when optimizing the loading concentration when running libraries on the NovaSeq X/X Plus instruments.
Recommended Loading Concentrations for Illumina Libraries
Illumina has tested several Illumina library prep kits, including TruSeq DNA PCR-Free, Illumina DNA PCR-Free, Illumina DNA Prep with Enrichment, Illumina Stranded Total RNA Prep Ligation with Ribo-Zero Plus, and PhiX. For the most up-to-date loading concentration recommendations for Illumina libraries, refer to the Denature and Dilute Protocol Generator. However, users should validate the recommended loading concentrations themselves, as results can vary depending on user workflow optimization.
For Illumina libraries that have not been tested, third-party libraries, or custom libraries, the following titration experiments must be performed to determine the optimal loading concentration. Note that libraries previously optimized for 10B flow cells on v1.1 of the NovaSeq X Control Software should be reoptimized on v1.2, because the clustering recipe was updated, resulting in subtle improvements in clustering.
Illumina Recommendations for Optimizing Loading Concentrations
Library Quantification
Illumina recommends using an accurate, reproducible quantification method. Illumina uses size-normalized quantitative PCR (qPCR) for library quantification because it gives accurate and consistent results. Alternative methods to qPCR may give suboptimal results. For more information, refer to the Best practices for library quantification Knowledge Article.
Titration Experiments
Illumina recommends performing titration runs over a wide concentration range for unoptimized libraries to identify the optimal loading concentration. It is critical to test a range that includes both underloaded and overloaded concentrations so that the ideal performance range can be identified.
Starting Concentrations
When transitioning a workflow from the NovaSeq 6000, Illumina recommends centering titration experiments at ~30% of the optimized loading concentration for the NovaSeq 6000 standard onboard workflow and using a wide range of loading concentrations for the initial titration (use increments of 50 pM). Smaller increments can then be used to fine-tune the optimal loading concentration. 25B Flow Cells often show optimal loading concentrations similar to those of 10B and 1.5B Flow Cells, though some libraries require 10-15% higher loading concentrations due to higher nanowell density.
Evaluating Titration Data
Illumina recommends using primary metrics such as %Occupied and %Pass Filter (%PF) to narrow the range of potentially optimal loading concentrations. From there, use secondary metrics such as duplicate rates, average insert size, and coverage to evaluate these candidates and identify the optimal loading concentration. Relying solely on %PF vs %Occupied plots might lead to selecting a concentration that gives lower coverage, increased short insert clustering, or higher duplicate rates.
Optimization Workflow
Step 1: Design a Titration experiment across a wide range of concentrations.
When transitioning from the NovaSeq 6000 to the NovaSeq X Series 10B flow cell, Illumina recommends centering titrations at ~30% of the optimal NovaSeq 6000 loading concentration.
Example: TruSeq DNA PCR-Free libraries were loaded at 350 pM on a NovaSeq 6000 S4 Flow Cell, and a titration run was performed with loading concentrations of 40-160 pM on a NovaSeq X 10B Flow Cell.
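The arithmetic behind Step 1 can be sketched as follows. This is a minimal illustration, not an Illumina tool: the function name `titration_series` and its parameters are hypothetical, the 30% factor and 50 pM increment come from the recommendations in this guide, and the number of points on each side of the center is an assumption to adjust for your own experiment.

```python
def titration_series(ns6000_pm, fraction=0.30, step_pm=50.0, points_each_side=1):
    """Return loading concentrations (pM) centered at fraction * ns6000_pm.

    ns6000_pm        -- optimized NovaSeq 6000 loading concentration (pM)
    fraction         -- centering factor (~30% per this guide)
    step_pm          -- increment between titration points (pM)
    points_each_side -- hypothetical knob: points below and above the center
    """
    if ns6000_pm <= 0 or step_pm <= 0 or points_each_side < 0:
        raise ValueError("concentration and step must be positive")
    center = ns6000_pm * fraction
    series = [center + i * step_pm
              for i in range(-points_each_side, points_each_side + 1)]
    # Drop non-physical (zero or negative) concentrations and round for pipetting.
    return [round(c, 1) for c in series if c > 0]


if __name__ == "__main__":
    # Initial wide titration for a library loaded at 350 pM on NovaSeq 6000.
    print(titration_series(350))
    # Finer follow-up titration around the same center (20 pM steps).
    print(titration_series(350, step_pm=20.0, points_each_side=3))
```

For the 350 pM example, the center works out to 105 pM, which is consistent with the 40-160 pM range used in the example titration.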
Step 2: Plot %PF vs %Occupied in Sequencing Analysis Viewer (SAV) to determine a narrow range of optimal loading concentration candidates.
Example: For TruSeq DNA PCR-Free libraries, 40 pM generates points on a positive slope, indicating underloading (Figure 1A). 80, 90, and 100 pM generate a cloud of points, indicating optimal loading (Figure 1B). 160 pM generates points on a negative slope, indicating an overloaded run (Figure 1C).
Figure 1: Plots of %PF vs %Occupied highlighting data patterns associated with various loading concentrations that result in under- and overloading conditions, as well as optimal loading concentrations.
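The visual patterns described for Figure 1 can be approximated numerically. The sketch below is only a rough screening aid under stated assumptions: it takes per-tile %Occupied and %PF values that you have exported yourself (the export step is not shown), and it uses the Pearson correlation with a hypothetical cutoff of 0.7 to distinguish a clear positive slope, a clear negative slope, and an uncorrelated cloud. The function names and the cutoff are illustrative and are not SAV features or Illumina thresholds; the plot itself should still be inspected.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length (>= 3 points)")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # no spread in one axis: treat as uncorrelated
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def classify_loading(pct_occupied, pct_pf, r_cutoff=0.7):
    """Label a run from per-tile metrics; r_cutoff is a hypothetical threshold."""
    r = pearson_r(pct_occupied, pct_pf)
    if r >= r_cutoff:
        return "underloaded"        # points on a positive slope
    if r <= -r_cutoff:
        return "overloaded"         # points on a negative slope
    return "candidate"              # cloud of points, no clear slope


if __name__ == "__main__":
    # Synthetic per-tile values for illustration only (not real run data).
    print(classify_loading([60, 65, 70, 75, 80], [50, 54, 59, 63, 68]))
    print(classify_loading([88, 90, 89, 91, 90, 88], [80, 79, 81, 80, 82, 81]))
    print(classify_loading([95, 96, 97, 98, 99], [80, 77, 75, 72, 70]))
```

Concentrations labeled "candidate" are the ones to carry forward into the secondary-metric checks in the following steps.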
Step 3: Narrow down optimal loading concentrations by analyzing %Duplicates.
Select the titrated loading concentrations whose duplicate rate is at or below the target value (for example, ≤ 15%).
Example: TruSeq DNA PCR-Free libraries loaded at 80 - 120 pM have duplicates that are below 15% (Figure 2, orange triangles). The lowest duplicate rates are seen with 90, 100, and 120 pM.
Figure 2: Multi-variate plot showing relationship between loading concentration and %Duplicates.
Step 4: Analyze average coverage of titrated loading concentrations.
Calculate the average coverage for the selected titrated loading concentrations to check whether the desired coverage is achieved.
Note: Coverage is application dependent and influenced by number of libraries loaded in the same lane.
Example: For TruSeq DNA PCR-Free libraries, 90, 100, and 120 pM achieved > 30x coverage (Figure 3, orange bars).
Figure 3: Multi-variate plot showing relationship between loading concentration and genome coverage.
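As a quick sanity check for Step 4, average coverage can be estimated with the standard relationship coverage = (number of reads × read length) / genome size. The sketch below is a simplified estimate, not the method used to produce Figure 3: it ignores unaligned reads, duplicates, and uneven pooling, and the function names and input values are hypothetical. Coverage reported by your secondary analysis pipeline should take precedence.

```python
def mean_coverage(n_reads, read_length_bp, genome_size_bp):
    """Estimate average coverage as total sequenced bases / genome size.

    n_reads counts individual reads (for paired-end data, 2 per read pair).
    """
    if n_reads < 0 or read_length_bp <= 0 or genome_size_bp <= 0:
        raise ValueError("inputs must be positive")
    return n_reads * read_length_bp / genome_size_bp


def per_library_coverage(lane_reads, n_libraries, read_length_bp, genome_size_bp):
    """Estimate per-library coverage, assuming libraries are pooled equally."""
    if n_libraries <= 0:
        raise ValueError("n_libraries must be positive")
    return mean_coverage(lane_reads / n_libraries, read_length_bp, genome_size_bp)


if __name__ == "__main__":
    # Hypothetical inputs: 150 bp reads against a 3.1 Gb genome.
    print(mean_coverage(620_000_000, 150, 3_100_000_000))
    # The same number of lane reads split evenly across 2 libraries.
    print(per_library_coverage(620_000_000, 2, 150, 3_100_000_000))
```

The second call illustrates the note above: for a fixed lane output, pooling more libraries in the lane lowers the coverage each library receives.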
Step 5: Analyze loading concentration relationship to mean insert size.
Determine which loading concentration(s) yield the desired average insert size. Shorter insert sizes can result in slight reductions in variant calling performance for single nucleotide variants (SNVs) and insertions/deletions (indels).
Example: TruSeq DNA PCR-Free libraries at 90, 100, and 120 pM loading have similar mean insert sizes (~421 - 427 bp), with higher concentrations showing reduced mean insert size (Figure 4, orange diamonds).
Figure 4: Multi-variate plot showing relationship between loading concentration and mean insert length, with higher loading concentrations typically favoring smaller mean insert lengths.
Step 6: Aggregate data to determine optimal loading concentration.
Combine the results of Steps 1-5 to find the loading concentration(s) that best satisfy all criteria.
Example: 90 pM and 100 pM are both optimal loading concentrations for TruSeq DNA PCR-Free libraries on 10B flow cells, as both concentrations display optimal %PF vs %Occupied plots, yield < 15% duplicates, achieve > 30x coverage, and have the desired insert sizes.
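The aggregation in Step 6 amounts to intersecting the sets of concentrations that pass each criterion. The sketch below illustrates this using only the concentrations named in the examples in this guide; the dictionary keys and the `shortlist` helper are hypothetical names, and other titration points that may have been tested are not represented.

```python
def shortlist(passing_by_criterion):
    """Return concentrations (pM) that pass every criterion, in ascending order."""
    sets = list(passing_by_criterion.values())
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Concentrations (pM) named as passing in the examples for Steps 2-5.
passing = {
    "pf_vs_occupied_cloud": {80, 90, 100},       # Step 2
    "duplicates_below_15pct": {80, 90, 100, 120},  # Step 3
    "coverage_above_30x": {90, 100, 120},        # Step 4
    "desired_insert_size": {90, 100, 120},       # Step 5
}

if __name__ == "__main__":
    print(shortlist(passing))
```

With these inputs, the intersection is 90 pM and 100 pM, matching the conclusion in the example above. If no concentration passes every criterion, the result is empty, which suggests running a finer titration or revisiting the target values.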
For more in-depth discussion on loading optimization for the NovaSeq X Series, see Maximizing performance on the NovaSeq X Series.
For any feedback or questions regarding this article (Illumina Knowledge Article #8911), contact Illumina Technical Support at techsupport@illumina.com.