Analysis launch FAQ for the Illumina 5-Base WGS and Enrichment kits

Which app is used for analysis of Illumina 5-Base DNA Prep and Illumina 5-Base DNA Prep with Enrichment libraries? Where is analysis supported (BaseSpace Sequence Hub (BSSH), Illumina Connected Analytics (ICA), local server)?

Analysis is done in the BaseSpace apps listed below. With the DRAGEN 4.4.6 release, there is a dropdown menu labeled "Methylation for Illumina 5-Base DNA Prep", and checking the "Enable 5-Base Methylation-Aware Algorithms" box activates the proper settings for analysis of 5-Base libraries. The same option names are used in the ICA apps.

  • DRAGEN Germline (gDNA or cfDNA)

  • DRAGEN Somatic (gDNA or cfDNA)

  • DRAGEN Enrichment (Enrichment libraries)

Data can be analyzed on BSSH, ICA, or a local server. On-board secondary analysis is not supported, but FASTQs can be generated on-board through BCL Convert and then used as input for the pipeline on BSSH, ICA, or a local server.

Should DRAGEN Germline, Somatic, and Enrichment be used for non-human samples for 5-Base analysis?

  • Methylation calling for non-human samples is supported within DRAGEN Germline, Somatic, and Enrichment for 5-Base analysis. Variant calling for non-human samples is not formally validated in these pipelines but can be done experimentally. Some species may have higher CpH methylation; to support such samples, it is recommended to add --enable-cpg-methylated-mapping=false in the "Additional Arguments" section of the pipeline. This setting is not recommended for samples where the majority of methylation occurs in a CpG context.

Will customers be able to launch analyses with a BaseSpace Basic account, or will they need a Professional BaseSpace account? Will the app launch require iCredits?

  • The DRAGEN apps are paid apps that require iCredits to run.

Do customers need a specific 5-Base license to run 5-base analysis on an on-prem DRAGEN server? Is there a cost for the license?

  • A 5-Base license is required for on-prem analysis; this license is provided with no cost/charge.

  • If the on-prem server is connected to the internet, the 5-Base license will automatically be “pushed” to the server. For offline servers, please contact Illumina’s Customer Care team to get the license installed.

Are there any plans for local analysis?

  • Yes, local analysis with a server is available in DRAGEN v4.4+. Illumina recommends using the most current DRAGEN version (currently v4.4.6) for local analysis.

  • There are no plans for on-instrument analysis at this time; analysis must be run on a standalone DRAGEN server.

  • Consult the analysis information site for details on how to launch.

  • The Illumina DRAGEN pipelines are designed for the 5-Base library prep kits.

The suggested command line in the reference manual does not mention anything about read trimming or UMIs. Should users enable any options related to these?

  • Using the sample sheet template and DRAGEN pipelines as instructed in the user guide should cover the vast majority of a user’s trimming needs. The sample sheet contains trimming settings that are applied in BCL Convert to remove adapter bases. The 5-Base DRAGEN pipelines have soft clipping enabled by default to remove remaining artifacts.

  • For gDNA samples, an additional, custom number of bases can be trimmed in the sample sheet by altering the OverrideCycles setting (eg, “N5” to ignore/trim 5 bases).

  • The sample sheet template also contains settings that ensure proper UMI handling in BCL Convert. For additional details on how to enable UMI collapsing, see the appropriate DRAGEN recipes for the use case. UMI collapsing is only recommended for Enrichment datasets to ensure sufficient data yield.

Can customers analyze the data with DRAGEN Methylation? Any caveats to results interpretation?

  • The DRAGEN Methylation app is not supported or recommended for use.

  • While DRAGEN Methylation can be run on 5-Base data, it is primarily intended for bisulfite-converted DNA (where unmethylated C is converted to T), not 5-Base libraries (where methylated C is converted to T). Running DRAGEN Methylation on 5-Base data will produce lower alignment rates and no variant calls. For Illumina 5-Base data, use DRAGEN Germline, DRAGEN Somatic, or DRAGEN Enrichment after checking the box to “Enable 5-Base Methylation Mode”.
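The difference between the two chemistries can be illustrated with a toy model. The sketch below is not part of any DRAGEN pipeline; the function name, the example template, and the methylated position are all hypothetical, and only the conversion rules stated above are modeled.

```python
def convert(seq, methylated, chemistry):
    """Return the read-out of a toy top-strand template after conversion.

    chemistry="bisulfite": unmethylated C reads as T, methylated C stays C.
    chemistry="5base":     methylated C reads as T, unmethylated C stays C.
    `methylated` is a set of 0-based positions of methylated cytosines.
    """
    if chemistry not in ("bisulfite", "5base"):
        raise ValueError("chemistry must be 'bisulfite' or '5base'")
    out = []
    for i, base in enumerate(seq):
        if base == "C":
            is_methylated = i in methylated
            converts = is_methylated if chemistry == "5base" else not is_methylated
            out.append("T" if converts else "C")
        else:
            out.append(base)
    return "".join(out)


template = "ACGTCCGA"
methylated = {1}  # hypothetical: only the first C (in a CpG) is methylated

print(convert(template, methylated, "bisulfite"))  # ACGTTTGA
print(convert(template, methylated, "5base"))      # ATGTCCGA
```

In this toy example the same C>T signal means "unmethylated" in one data type and "methylated" in the other, which is why a pipeline built for bisulfite-converted reads would be a poor fit for 5-Base reads.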

Can "Oncoanalyser" be used for analysis?

  • Oncoanalyser is intended for DNA or RNA data only, not DNA methylation FASTQ/BAM. Illumina recommends customers use the Illumina DRAGEN pipeline designed for the 5-Base library prep kit.

Why aren't CNV Exome analyses supported? Are PONs for WGS supported?

  • The only formally supported features are the ones listed in the user guide. Neither of these features is therefore supported at this time.

What is the support for structural variant calling?

  • SV Calling should be turned on when CNV calling is being performed to improve CNV results, but the SV calling is not independently verified.

Is a regular DRAGEN reference genome hash table used for analysis or is a special 5-Base version required? Can this 5-Base reference be made in DRAGEN Reference Builder?

  • A new DRAGEN reference genome hash table is required. It must be created with DRAGEN Reference Builder v4.4.4+ with "Include Methylation Data in Reference" checked. For human, this reference has been added to the DRAGEN cloud apps and download site, so no custom build should be needed.

  • Image of option from DRAGEN Reference Builder v4.4.4:

What settings are required to create a 5-Base compatible hash table for Map/Align?

  • The analysis is compatible with all references that are labeled as “methyl_cg” (which is everything in the DRAGEN v11 Reference folder on ICA).

Regarding reference hash table selection for 5-Base, would there be any reason to pick the graph reference over the linear reference for the hg38 v11 hash table, ie, hg38-alt_masked.cnv.graph.hla.methyl_cg.rna-11-r5.0-1 vs hg38-alt_masked.cnv.hla.methyl_cg.methylated_combined.rna-11-r5.0-1? There is an additional subdirectory "methyl_converted/" along with "methyl_cg/" in the linear reference, but it might not be relevant to 5-Base. Is that correct? This question applies to both germline (with small variant calling) and somatic (tumor-only) pipelines.

  • Use the graph genome reference for germline analyses.

  • Use the linear genome reference for somatic analyses. This choice matters for somatic variant calling and should not impact methylation calling.

  • The methyl_cg subdirectory is used for 5-Base data to account for conversion in CpG-dense regions. The methyl_converted subdirectory is specific to 3-base methods (BiS/EMS) and is used by the DRAGEN Methylation pipeline.
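The reference guidance above can be summarized as a small lookup. This is only an illustrative sketch: the function name and the pipeline labels are hypothetical, and the reference names are the two quoted in the question, not a complete list of available references.

```python
# Reference names as quoted in the question above.
GRAPH_REF = "hg38-alt_masked.cnv.graph.hla.methyl_cg.rna-11-r5.0-1"
LINEAR_REF = "hg38-alt_masked.cnv.hla.methyl_cg.methylated_combined.rna-11-r5.0-1"


def pick_reference(pipeline):
    """Return the hg38 v11 reference suggested for a 5-Base pipeline.

    Hypothetical labels: "germline" maps to the graph reference and
    "somatic" maps to the linear reference, following the answer above.
    """
    refs = {"germline": GRAPH_REF, "somatic": LINEAR_REF}
    key = pipeline.lower()
    if key not in refs:
        raise ValueError(f"no guidance in this sketch for pipeline: {pipeline!r}")
    ref = refs[key]
    # 5-Base analysis expects a reference labeled "methyl_cg".
    if "methyl_cg" not in ref:
        raise ValueError(f"reference is not labeled methyl_cg: {ref}")
    return ref


print(pick_reference("germline"))
print(pick_reference("somatic"))
```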

For somatic variant calling, does Illumina have a recommended file for systematic noise?

  • The systematic noise file Illumina recommends using (WGS_hg38_v2.0.0_systematic_noise.snv.bed.gz) is part of the standard DRAGEN Resource files available for DRAGEN v4.4 on Illumina's DRAGEN Secondary Analysis support site.

Is demo data available in BaseSpace now, or will it be in the near future?

  • In BaseSpace, demo data is available under the "Demo Data" tab.

  • In ICA, demo data is available in the DRAGEN v4.4 bundle under "/Illumina DRAGEN 5-Base Methylation Germline Demo Data".

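How long does 5-Base analysis take?

  • On a local DRAGEN server, germline analysis takes less than 1 hour. Somatic WGS 100X tumor-only analysis takes approximately 3 hours, and somatic WGS 100X/50X tumor/normal (T/N) analysis takes approximately 4 hours.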

Is Sample-specific NTD Error Bias Estimation enabled for somatic calling in 5-Base analysis? Are there any changes in small variant calling in Somatic mode?

  • Yes, it is enabled by default. For 5-Base, it has been adapted so that C>T and G>A error estimates are also accurate.

While estimating a kmer uniqueness map for CNV calling, is the methylation status of a base considered? Are the intervals generated based on the post-conversion (5mC => T) kmers?

  • No, methylation status isn’t considered.

For any feedback or questions regarding this article (Illumina Knowledge Article #9950), contact Illumina Technical Support.
