# DRAGEN Metagenomics Pipeline in BaseSpace: General Information

[**The Illumina DRAGEN Metagenomics app**](https://basespace.illumina.com/apps/14890876/DRAGEN-Metagenomics-Pipeline) performs taxonomic classification of reads using the Kraken2 algorithm and a corresponding taxonomic database. The app provides interactive visualizations and raw classification output for per-sample and aggregate analyses.

Classification is performed using the DRAGEN Metagenomics pipeline, which is also available on local DRAGEN server hardware.

Version 3.5.12 of the DRAGEN Metagenomics app includes the following features:

* DRAGEN Map/Align
* Built-in de-hosting (human host read removal)
* Built-in Kraken2 databases
  * MiniKraken2 (March 2020)
  * Extended Kraken2 (March 2020)
  * Kraken2 PlusPFP-8 (May 17 2021)
  * Kraken2 Viral (May 17 2021)
  * This app also supports custom Kraken2-compatible database input as a compressed tar archive. Select 'Custom' from the menu and then use the Custom Reference Database control.
* Organism detection report
  * CSV and HTML in-browser formats
* Adjustable detection thresholds
* Integrated QC metrics
* Single sample and aggregate reports
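
The custom database option above takes a compressed tar archive. The following is a minimal sketch of building one with Python's standard `tarfile` module. It assumes the database directory contains the three core files that a built Kraken2 database normally has (`hash.k2d`, `opts.k2d`, `taxo.k2d`) and that they belong at the root of the archive. The archive layout the app expects is not stated in this article, so treat that layout and the helper name `package_kraken2_db` as assumptions.

```python
import tarfile
from pathlib import Path

# Core files of a built Kraken2 database (standard Kraken2 naming).
REQUIRED_K2_FILES = ("hash.k2d", "opts.k2d", "taxo.k2d")


def package_kraken2_db(db_dir, archive_path):
    """Write the core Kraken2 files from db_dir into a gzip-compressed tar archive.

    Assumption: files are stored at the archive root. Check the app's input
    form if a different layout is required.
    """
    db_dir = Path(db_dir)
    missing = [name for name in REQUIRED_K2_FILES if not (db_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing Kraken2 files in {db_dir}: {missing}")

    archive_path = Path(archive_path)
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in REQUIRED_K2_FILES:
            # arcname drops the local directory prefix from the stored path.
            tar.add(db_dir / name, arcname=name)
    return archive_path
```

For example, `package_kraken2_db("my_kraken2_db", "my_kraken2_db.tar.gz")` would produce a file that can then be uploaded to BaseSpace (both names here are placeholders).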

#### **Workflow**

1. Select input sample(s).
2. Choose a Reference Database. To use a custom Kraken2 database, first upload the database into BaseSpace as a .tar.gz file.
3. Choose a reference genome for de-hosting.
4. Choose whether to enable de-hosting (on by default).
5. Configure the Organism Detection Report (on by default).
   * Choose "From text file" to provide a semicolon-delimited list of organism identifiers via the input form.
   * Choose "From file input" to provide a list of organism identifiers via a text file in BaseSpace. The text file must have a .txt extension and contain one organism identifier per line.
6. Configure the low and high detection thresholds.
7. Optionally set a Maximum number of BioSamples per node. If not defined, the app will decide how to distribute samples to nodes.
8. Optionally enable sorting (Advanced Settings). Sorting can increase runtime by up to 6X.
9. Optionally, if de-hosting is enabled, modify the Alignment Minimum Score. The value must be greater than 23 and less than or equal to the sample read length. The default value of 50 is recommended unless the reads are shorter than 50 bases.
10. Configure Remove Output Alignment (Advanced Settings). By default, all BAM-related files from de-hosting are deleted. Uncheck this option to preserve the host BAM files.
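
The Alignment Minimum Score rule in step 9 can be checked before launching an analysis. The sketch below encodes only the two bounds given above (greater than 23, and no greater than the read length). The function and constant names are hypothetical and not part of the app.

```python
# Bounds taken from the workflow description; names are illustrative only.
MIN_SCORE_EXCLUSIVE = 23
DEFAULT_MIN_SCORE = 50


def check_alignment_min_score(score, read_length):
    """Return score if it satisfies 23 < score <= read_length, else raise ValueError."""
    if score <= MIN_SCORE_EXCLUSIVE:
        raise ValueError(f"score must be greater than {MIN_SCORE_EXCLUSIVE}, got {score}")
    if score > read_length:
        raise ValueError(
            f"score {score} exceeds the read length {read_length}; "
            "lower the score for short reads"
        )
    return score
```

With 150-base reads, `check_alignment_min_score(DEFAULT_MIN_SCORE, 150)` passes. With 36-base reads the default of 50 is rejected, which is the case where the default needs to be lowered.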

#### **Input Files**

* FASTQ
* Custom Kraken2 Reference (.tar.gz)
* Organism Identifiers File (.txt)
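
The Organism Detection Report accepts identifiers either as a semicolon-delimited string in the input form or as a .txt file with one identifier per line. A minimal sketch of producing both forms from one list follows. The helper names are hypothetical, and the identifiers are treated as opaque strings because this article does not say whether the app expects organism names or taxonomy IDs.

```python
from pathlib import Path


def _clean(identifiers):
    """Strip whitespace, drop empty entries, and reject embedded delimiters."""
    cleaned = [item.strip() for item in identifiers if item.strip()]
    for item in cleaned:
        if ";" in item or "\n" in item:
            raise ValueError(f"identifier contains a delimiter: {item!r}")
    return cleaned


def to_form_input(identifiers):
    """Join identifiers into a semicolon-delimited string for the input form."""
    return ";".join(_clean(identifiers))


def write_identifier_file(identifiers, path):
    """Write one identifier per line to a .txt file and return its path."""
    path = Path(path)
    if path.suffix != ".txt":
        raise ValueError("the identifiers file must have a .txt extension")
    path.write_text("\n".join(_clean(identifiers)) + "\n")
    return path
```

Keeping a single source list and generating both forms from it avoids the two inputs drifting apart between runs.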

#### **Key Output Files**

* .microbe-classification-report.tsv
* .microbe-classification.tsv
* .microbe-classification\_metrics.csv
* .organism-detection-report.csv
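
After downloading results, the key outputs can be collected by file name. The sketch below assumes each output file name ends with one of the suffixes listed above (for example, a sample name followed by the suffix) and that results sit somewhere under a local directory. The function name and directory layout are assumptions, and no assumption is made about the contents of the files.

```python
from pathlib import Path

# Suffixes from the Key Output Files list above.
KEY_OUTPUT_SUFFIXES = (
    ".microbe-classification-report.tsv",
    ".microbe-classification.tsv",
    ".microbe-classification_metrics.csv",
    ".organism-detection-report.csv",
)


def find_key_outputs(results_dir):
    """Map each key output suffix to the matching files found under results_dir."""
    found = {suffix: [] for suffix in KEY_OUTPUT_SUFFIXES}
    for path in sorted(Path(results_dir).rglob("*")):
        if not path.is_file():
            continue
        for suffix in KEY_OUTPUT_SUFFIXES:
            if path.name.endswith(suffix):
                found[suffix].append(path)
                break
    return found
```

A suffix with an empty list in the returned dictionary indicates that the corresponding output was not found under the directory.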

#### **Known Limitations**

* Before running the DRAGEN Metagenomics app, be aware of the following limitation:
  * A mixture of single-end and paired-end input data is not supported.
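
A mixed selection can be caught before launch by inspecting FASTQ file names. This sketch assumes the files follow the usual Illumina naming pattern (for example, `SampleA_S1_L001_R1_001.fastq.gz`) and treats a sample as paired-end when both R1 and R2 files are present. The function names are hypothetical, and the app itself may determine layout differently.

```python
import re
from collections import defaultdict

# Illumina-style FASTQ name, e.g. SampleA_S1_L001_R1_001.fastq.gz (lane is optional).
FASTQ_RE = re.compile(
    r"^(?P<sample>.+)_S\d+(?:_L\d{3})?_R(?P<read>[12])_001\.fastq(?:\.gz)?$"
)


def layout_by_sample(filenames):
    """Return {sample: "paired" or "single"} based on which read files exist."""
    reads = defaultdict(set)
    for name in filenames:
        match = FASTQ_RE.match(name)
        if match is None:
            raise ValueError(f"unrecognized FASTQ file name: {name}")
        reads[match["sample"]].add(match["read"])
    # Paired only when both R1 and R2 are present; anything else counts as single.
    return {
        sample: "paired" if found == {"1", "2"} else "single"
        for sample, found in reads.items()
    }


def has_mixed_layouts(filenames):
    """True when the file set contains both single-end and paired-end samples."""
    return len(set(layout_by_sample(filenames).values())) > 1
```

If `has_mixed_layouts` returns `True`, splitting the samples into separate single-end and paired-end analyses avoids the limitation.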

#### **Example Data Set**

To view sample data sets and run outputs, see the [public data project](https://basespace.illumina.com/s/uNeacHkp5Wq4).


|                                                                                                                                                                                                                                                                                                                                                                 |
| :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| *For any feedback or questions regarding this article (Illumina Knowledge Article #3688), contact Illumina Technical Support* [*techsupport@illumina.com*](mailto:techsupport@illumina.com?subject=Question%2FFeedback%20Regarding%20Illumina%20Knowledge%20Article%20#000003688%20-%20Software%20\&body=Dear%20Illumina%20Technical%20Support,%0D%0A%0D%0A)*.* |


