DRAGEN Metagenomics Pipeline in BaseSpace: General Information

The Illumina DRAGEN Metagenomics app performs taxonomic classification of reads using the Kraken2 algorithm and a corresponding taxonomic database. The app provides interactive visualizations and raw classification output for per-sample and aggregate analyses.

Classification is performed using the DRAGEN Metagenomics pipeline, which is also available on local DRAGEN server hardware.

Version 3.5.12 of the DRAGEN Metagenomics app includes the following features:

  • DRAGEN Map/Align

  • Built-in de-hosting (human host read removal)

  • Built-in Kraken2 databases

    • MiniKraken2 (March 2020)

    • Extended Kraken2 (March 2020)

    • Kraken2 PlusPFP-8 (May 17 2021)

    • Kraken2 Viral (May 17 2021)

    • This app also supports custom Kraken2-compatible database input as a compressed tar archive. Select 'Custom' from the menu and then use the Custom Reference Database control.

  • Organism detection report

    • CSV and HTML in-browser formats

  • Adjustable detection thresholds

  • Integrated QC metrics

  • Single sample and aggregate reports

Workflow

  1. Select input sample(s).

  2. Choose a Reference Database. To use a custom Kraken2 database, first upload the database to BaseSpace as a .tar.gz file.

  3. Choose a reference genome for de-hosting.

  4. Choose whether to enable de-hosting (on by default).

  5. Configure the Organism Detection Report (on by default).

    • Choose "From text file" to provide a semicolon-delimited list of organism identifiers via the input form.

    • Choose "From file input" to provide a list of organism identifiers via a text file in BaseSpace. The text file must have a .txt extension and contain one organism identifier per line.

  6. Configure the low and high detection thresholds.

  7. Optionally set a Maximum number of BioSamples per node. If not defined, the app decides how to distribute samples across nodes.

  8. Optionally enable sorting (Advanced Settings). This can increase runtime by up to 6x.

  9. Optionally, if de-hosting is enabled, modify the Alignment Minimum Score. The value must be greater than 23 and less than or equal to the sample read length. The default value of 50 is recommended unless the sample reads are shorter than that.

  10. Configure Remove Output Alignment (Advanced Settings). By default, all BAM-related files from de-hosting are deleted. Uncheck this option to preserve the host BAM files.
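Some of the workflow inputs above follow simple rules that can be checked before an analysis is launched. The following Python sketch is not part of the app, and its function names are hypothetical. It only encodes the constraints stated in the steps: a semicolon-delimited identifier list, a .txt file with one identifier per line, and an Alignment Minimum Score greater than 23 and no larger than the read length.

```python
from pathlib import Path

DEFAULT_MIN_SCORE = 50  # default Alignment Minimum Score stated in the workflow


def parse_identifier_list(text: str) -> list[str]:
    """Split a semicolon-delimited identifier string, dropping empty entries."""
    return [item.strip() for item in text.split(";") if item.strip()]


def write_identifier_file(identifiers: list[str], path: str) -> Path:
    """Write one organism identifier per line to a .txt file."""
    out = Path(path)
    if out.suffix != ".txt":
        raise ValueError("organism identifier file must have a .txt extension")
    out.write_text("\n".join(identifiers) + "\n", encoding="utf-8")
    return out


def is_valid_min_score(score: int, read_length: int) -> bool:
    """Return True when 23 < score <= read_length, as the workflow requires."""
    return 23 < score <= read_length
```

For example, is_valid_min_score(DEFAULT_MIN_SCORE, 36) returns False, which flags that the default of 50 would need to be lowered for 36-cycle reads. The identifier values themselves are whatever the selected database expects; this sketch does not validate them.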

Input Files

  • FASTQ

  • Custom Kraken2 Reference (.tar.gz)

  • Organism Identifiers File (.txt)
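Before uploading a custom reference, it may help to confirm that the archive contains a complete Kraken2 database. The sketch below is a hypothetical helper, not an Illumina tool. It assumes the three core files of a standard Kraken2 build (hash.k2d, opts.k2d, taxo.k2d) and ignores where they sit inside the archive, because this article does not state the directory layout the app expects.

```python
import tarfile
from pathlib import PurePosixPath

# Core files produced by a standard Kraken2 database build.
REQUIRED_K2_FILES = {"hash.k2d", "opts.k2d", "taxo.k2d"}


def missing_kraken2_files(archive_path: str) -> set[str]:
    """Return the required Kraken2 files that are absent from a .tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        present = {
            PurePosixPath(member.name).name
            for member in archive.getmembers()
            if member.isfile()
        }
    return REQUIRED_K2_FILES - present
```

An empty result means all three files were found; any names returned point to an incomplete or differently structured database.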

Key Output Files

  • .microbe-classification-report.tsv

  • .microbe-classification.tsv

  • .microbe-classification_metrics.csv

  • .organism-detection-report.csv
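The classification report can be post-processed outside the app. The Python sketch below assumes that .microbe-classification-report.tsv follows the standard six-column Kraken2 report layout (percentage, clade reads, direct reads, rank code, NCBI taxonomy ID, indented name). This article does not confirm that layout, so inspect a real output file before relying on it. The class and function names are hypothetical.

```python
import csv
from typing import NamedTuple


class ReportRow(NamedTuple):
    percent: float
    clade_reads: int
    direct_reads: int
    rank: str
    taxid: int
    name: str


def read_report(path: str) -> list[ReportRow]:
    """Parse a Kraken2-style report; lines that do not fit the layout are skipped."""
    rows = []
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for fields in reader:
            if len(fields) != 6:
                continue
            try:
                rows.append(ReportRow(
                    float(fields[0]), int(fields[1]), int(fields[2]),
                    fields[3].strip(), int(fields[4]), fields[5].strip(),
                ))
            except ValueError:
                continue  # non-numeric line, such as a header
    return rows


def top_species(rows: list[ReportRow], n: int = 10) -> list[ReportRow]:
    """Return the n species-level rows (rank code "S") with the most clade reads."""
    species = [row for row in rows if row.rank == "S"]
    return sorted(species, key=lambda row: row.clade_reads, reverse=True)[:n]
```

Calling top_species(read_report(path)) on a report gives a quick ranked view that can be compared against the organism detection report.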

Known Limitations

  • Before running the DRAGEN Metagenomics app, be aware of the following limitation:

    • A mixture of single-end and paired-end input data is not supported.
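A mixed input set can be caught before launch by inspecting the FASTQ file names. The sketch below is a hypothetical pre-check, not app behavior. It assumes the standard Illumina FASTQ naming convention (SampleName_S1_L001_R1_001.fastq.gz) and reads the limitation as applying to the set of samples submitted to one analysis.

```python
import re
from collections import defaultdict

# Illumina FASTQ naming: <Sample>_S<n>[_L<lane>]_<R1|R2>_001.fastq[.gz]
_FASTQ_RE = re.compile(
    r"^(?P<sample>.+)_S\d+(?:_L\d{3})?_(?P<read>R[12])_001\.fastq(?:\.gz)?$"
)


def layout_by_sample(filenames: list[str]) -> dict[str, str]:
    """Map each sample name to 'single' or 'paired' based on its R1/R2 files."""
    reads = defaultdict(set)
    for name in filenames:
        match = _FASTQ_RE.match(name)
        if match:  # names that do not follow the convention are ignored
            reads[match["sample"]].add(match["read"])
    return {
        sample: "paired" if "R2" in found else "single"
        for sample, found in reads.items()
    }


def has_mixed_layouts(filenames: list[str]) -> bool:
    """Return True when the set contains both single-end and paired-end samples."""
    return len(set(layout_by_sample(filenames).values())) > 1
```

If has_mixed_layouts returns True, splitting the samples into separate single-end and paired-end analyses avoids the limitation.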

Example Data Set

To view sample data sets and run outputs, see the public data project.

For feedback or questions regarding this article (Illumina Knowledge Article #3688), contact Illumina Technical Support at techsupport@illumina.com.

© 2023 Illumina, Inc. All rights reserved. All trademarks are the property of Illumina, Inc. or their respective owners. Trademark information: illumina.com/company/legal.html. Privacy policy: illumina.com/company/legal/privacy.html