DRAGEN Metagenomics Pipeline in BaseSpace general information
The Illumina DRAGEN Metagenomics app performs taxonomic classification of reads using the Kraken2 algorithm and a corresponding taxonomic database. The app provides interactive visualizations and raw classification output for per-sample and aggregate analyses.
Classification is performed using the DRAGEN Metagenomics pipeline, which is also available on local DRAGEN server hardware.
Version 3.5.12 of the DRAGEN Metagenomics app includes the following features:
DRAGEN Map/Align
Built-in de-hosting (human host read removal)
Built-in Kraken2 databases:
MiniKraken2 (March 2020)
Extended Kraken2 (March 2020)
Kraken2 PlusPFP-8 (May 17 2021)
Kraken2 Viral (May 17 2021)
The app also supports a custom Kraken2-compatible database supplied as a compressed tar archive (.tar.gz). Select 'Custom' from the Reference Database menu, then choose the archive with the Custom Reference Database control.
Organism detection report
CSV and HTML in-browser formats
Adjustable detection thresholds
Integrated QC metrics
Single sample and aggregate reports
Workflow
Select input sample(s).
Choose a Reference Database. To use a custom Kraken2 database, first upload it to BaseSpace as a .tar.gz file.
Choose a reference genome for de-hosting.
Choose whether to enable de-hosting (on by default).
Configure the Organism Detection Report (on by default).
Choose "From text file" to provide a semicolon-delimited list of organism identifiers via the input form.
Choose "From file input" to provide a list of organism identifiers via a text file in BaseSpace. The text file must have a .txt extension and contain one organism identifier per line.
Configure the low and high detection thresholds.
Optionally set a Maximum number of BioSamples per node. If not defined, the app will decide how to distribute samples to nodes.
Optionally enable sorting (Advanced Settings). Sorting can increase runtime by up to 6x.
Optionally, if de-hosting is enabled, modify the Alignment Minimum Score. The value must be greater than 23 and less than or equal to the sample read length. The default value of 50 is recommended unless the reads are shorter than 50 bases.
Configure Remove Output Alignment (Advanced Settings). By default, all BAM-related files produced by de-hosting are deleted. Uncheck this option to preserve the host BAM files.
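The two input rules in the workflow above (the Alignment Minimum Score range and the two organism identifier formats) can be checked before launching the app. The following is a minimal Python sketch, not part of the app. The function names are hypothetical, and the numeric identifiers in the usage note are placeholders, because this article does not specify what form an organism identifier takes.

```python
DEFAULT_ALIGNMENT_MIN_SCORE = 50  # default value stated in the workflow


def check_alignment_min_score(score: int, read_length: int) -> int:
    """Return score if 23 < score <= read_length, otherwise raise ValueError."""
    if not (23 < score <= read_length):
        raise ValueError(
            f"Alignment Minimum Score {score} must be greater than 23 "
            f"and no greater than the read length ({read_length})."
        )
    return score


def split_identifiers(text: str) -> list[str]:
    """Split a semicolon-delimited identifier string, dropping empty entries."""
    return [item.strip() for item in text.split(";") if item.strip()]


def identifiers_to_file_text(identifiers: list[str]) -> str:
    """Format identifiers for the .txt input: one identifier per line."""
    return "".join(f"{identifier}\n" for identifier in identifiers)
```

For example, check_alignment_min_score(50, 36) raises an error because the default of 50 exceeds a 36-base read length, while a value such as 30 passes. split_identifiers("562; 1280") yields the entries that identifiers_to_file_text would write on separate lines.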
Input Files
FASTQ
Custom Kraken2 Reference (.tar.gz)
Organism Identifiers File (.txt)
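A custom Kraken2 reference must be a .tar.gz archive before it is uploaded to BaseSpace. The sketch below shows one way to build that archive with the Python standard library. A built Kraken2 database consists of the files hash.k2d, opts.k2d, and taxo.k2d. Placing them at the root of the archive is an assumption, because this article does not state the layout the app expects, and the function name is hypothetical.

```python
import tarfile
from pathlib import Path

# Files that make up a built Kraken2 database.
KRAKEN2_DB_FILES = ("hash.k2d", "opts.k2d", "taxo.k2d")


def pack_kraken2_database(db_dir: str, archive_path: str) -> list[str]:
    """Pack the Kraken2 database files in db_dir into a gzip-compressed tar.

    Returns the member names stored in the archive.
    """
    db = Path(db_dir)
    missing = [name for name in KRAKEN2_DB_FILES if not (db / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing Kraken2 database files: {missing}")
    if not str(archive_path).endswith(".tar.gz"):
        raise ValueError("Archive name must end in .tar.gz")

    with tarfile.open(archive_path, "w:gz") as archive:
        for name in KRAKEN2_DB_FILES:
            # Assumption: files are stored at the archive root, not in a subfolder.
            archive.add(str(db / name), arcname=name)

    # Re-open the archive to confirm what was written.
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()
```

Calling pack_kraken2_database("my_db", "my_db.tar.gz") on a directory that holds the three files produces an archive ready for upload. The directory and archive names here are placeholders.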
Key Output Files
.microbe-classification-report.tsv
.microbe-classification.tsv
.microbe-classification_metrics.csv
.organism-detection-report.csv
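The classification report can be post-processed outside the app. The sketch below assumes that .microbe-classification-report.tsv follows the standard Kraken2 report layout: six tab-separated columns holding the clade percentage, clade read count, directly assigned read count, rank code, NCBI taxonomy ID, and indented name. This article does not confirm that layout, so verify it against a real output file first. The function names are hypothetical and the example lines are invented for illustration, not app output.

```python
from typing import Iterable, NamedTuple


class ReportRow(NamedTuple):
    percent: float      # percentage of reads in the clade rooted at this taxon
    clade_reads: int    # reads assigned to the clade
    direct_reads: int   # reads assigned directly to this taxon
    rank: str           # rank code, e.g. "S" for species
    taxid: int          # NCBI taxonomy ID
    name: str           # scientific name with indentation removed


def parse_report_lines(lines: Iterable[str]) -> list[ReportRow]:
    """Parse Kraken2-style report lines, skipping lines that do not fit."""
    rows = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 6:
            continue  # blank or unexpected line
        pct, clade, direct, rank, taxid, name = fields
        try:
            rows.append(ReportRow(float(pct), int(clade), int(direct),
                                  rank.strip(), int(taxid), name.strip()))
        except ValueError:
            continue  # non-numeric fields, e.g. a header line
    return rows


def species_at_or_above(rows: list[ReportRow], min_percent: float) -> list[ReportRow]:
    """Keep species-level rows whose clade percentage is at least min_percent."""
    return [r for r in rows if r.rank == "S" and r.percent >= min_percent]


# Invented example lines (illustrative values only).
EXAMPLE_REPORT = (
    " 10.00\t100\t100\tU\t0\tunclassified\n"
    " 90.00\t900\t0\tR\t1\troot\n"
    " 60.00\t600\t600\tS\t562\t      Escherichia coli\n"
    "  0.50\t5\t5\tS\t1280\t      Staphylococcus aureus\n"
)
```

With the invented lines above, species_at_or_above(parse_report_lines(EXAMPLE_REPORT.splitlines()), 1.0) keeps only the Escherichia coli row. The 1.0 cutoff is an arbitrary illustration and is unrelated to the app's own detection thresholds.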
Known Limitations
Before running the DRAGEN Metagenomics app, be aware of the following limitation:
A mixture of single-end and paired-end input data is not supported.
Example Data Set
To view sample data sets and run outputs, see the public data project.
For feedback or questions regarding this article (Illumina Knowledge Article #3688), contact Illumina Technical Support at techsupport@illumina.com.