# How to run the DRAGEN MSI Pipeline

Microsatellites are genomic regions of short DNA motifs that are repeated 5-50 times and are associated with high mutation rates. Microsatellite Instability (MSI) results from deficiencies in the DNA mismatch repair pathway and can be used as a critical biomarker to predict immunotherapy responses in multiple tumor types.

DRAGEN MSI runs in one of three modes, selected with the --msi-command option:

1. Collect Evidence Mode (collect-evidence)
2. Tumor Normal Mode (tumor-normal)
3. Tumor Only Mode (tumor-only)

A **microsatellite sites** file that lists the microsatellite sites of interest in a **given reference genome** is required to run this pipeline. The recommended tool for generating this file is **msisensor-pro**; information about this tool can be found [here](https://github.com/xjtu-omics/msisensor-pro/wiki).

The [scan](https://github.com/xjtu-omics/msisensor-pro/wiki/Key-Commands#scan-from-msisensor) command can be used to generate the microsatellites sites file.

    msisensor-pro scan -d hg38.fa -o hg38_microsatellites_pro.tsv -p 1

    options:
      -d [string]  reference genome sequences file, *.fasta format
      -o [string]  output homopolymers and microsatellites file
      -l [int]     minimal homopolymer size, default=5
      -c [int]     context length, default=5
      -m [int]     maximal homopolymer size, default=50
      -s [int]     maximal length of microsate, default=5
      -r [int]     minimal repeat times of microsate, default=3
      -p [int]     output homopolymer only, 0: no; 1: yes, default=0
      -h           help

The **'-p 1'** flag is **required** so that the command outputs only **homopolymers**. The MSI pipeline has only been **benchmarked** for **homopolymers** and **cannot** work with **repeat regions larger than 100 bp**. Adding this flag should result in a compatible MSI sites file.
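The two constraints above (homopolymers only, no repeat region over 100 bp) can be checked before the sites file is handed to DRAGEN. The following is a minimal sketch, not part of DRAGEN or msisensor-pro. It assumes the file is tab-separated with one header line, and that columns 3 and 5 hold the repeat unit length and the repeat count; confirm this against the header of your own file. The function name check_msi_sites is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a microsatellite sites file before passing it to DRAGEN.
# ASSUMPTION: tab-separated, one header line, column 3 = repeat unit length,
# column 5 = repeat times. Verify against the header of your own file.

check_msi_sites() {
    local sites_file="$1"
    awk -F'\t' '
        NR == 1 { next }                  # skip the header line
        { total++ }
        $3 != 1 { non_homopolymer++ }     # repeat unit longer than 1 bp
        $3 * $5 > 100 { too_long++ }      # repeat region longer than 100 bp
        END {
            printf "sites=%d non_homopolymer=%d over_100bp=%d\n", total, non_homopolymer, too_long
            exit ((non_homopolymer > 0 || too_long > 0) ? 1 : 0)
        }
    ' "$sites_file"
}
```

Calling check_msi_sites on the scan output prints a one-line summary and returns a non-zero status if any site violates either constraint, so it can gate a pipeline script.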

For **WGS and WES** samples, the **tumor-normal mode** is **recommended**. Here is an example command for a tumor-normal analysis.

    dragen \
      --msi-command tumor-normal \
      --msi-coverage-threshold 60 \
      --msi-microsatellites-file msi_file \
      --output-directory={output_directory} \
      --output-file-prefix={prefix} \
      --enable-map-align=true \
      --RGID=read_group_ID \
      --RGSM=read_group_sample \
      --ref-dir={reference_directory} \
      --enable-map-align-output=true \
      --enable-sort=true \
      --enable-duplicate-marking=true \
      --tumor-fastq1 {tumor_fq1} \
      --tumor-fastq2 {tumor_fq2} \
      --fastq-file1 {fq1} \
      --fastq-file2 {fq2}

The **tumor-only** mode requires a **panel of normals**, which can be generated by running the MSI pipeline in collect-evidence mode on each normal sample. The panel must contain **at least 20 normal samples**; this is a hard-coded requirement for running the tumor-only mode. Here is an example collect-evidence command for one normal sample.

    dragen -f \
      --ref-dir={reference_directory} \
      --fastq-file1 {fq1} \
      --fastq-file2 {fq2} \
      --output-directory={output_directory} \
      --enable-map-align=true \
      --RGID=read_group_ID \
      --RGSM=read_group_sample \
      --output-file-prefix={prefix} \
      --enable-map-align-output=true \
      --enable-sort=true \
      --enable-duplicate-marking=true \
      --msi-command collect-evidence \
      --msi-coverage-threshold 60 \
      --msi-microsatellites-file msi_file
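Because the panel needs at least 20 normals, the collect-evidence command is usually repeated once per normal sample. The sketch below wraps the command above in a loop over a sample sheet. Everything outside the DRAGEN flags themselves is an assumption made for illustration: the sample sheet layout (sample, fq1, fq2, tab-separated, no header), the environment variables REF_DIR, MSI_FILE, OUT_ROOT and DRAGEN, the function names, and the choice to reuse the sample name for RGID, RGSM, and the output prefix.

```shell
#!/usr/bin/env bash
# Sketch: run collect-evidence once per normal sample listed in a sample sheet.
# ASSUMPTIONS (hypothetical, not from DRAGEN): REF_DIR, MSI_FILE and OUT_ROOT are
# set by the caller; DRAGEN optionally overrides the executable (handy for dry runs);
# the sample name is reused as RGID, RGSM and output prefix.

collect_evidence() {
    local sample="$1" fq1="$2" fq2="$3"
    "${DRAGEN:-dragen}" -f \
        --ref-dir="$REF_DIR" \
        --fastq-file1 "$fq1" \
        --fastq-file2 "$fq2" \
        --output-directory="$OUT_ROOT/$sample" \
        --output-file-prefix="$sample" \
        --enable-map-align=true \
        --RGID="$sample" \
        --RGSM="$sample" \
        --enable-map-align-output=true \
        --enable-sort=true \
        --enable-duplicate-marking=true \
        --msi-command collect-evidence \
        --msi-coverage-threshold 60 \
        --msi-microsatellites-file "$MSI_FILE"
}

# Sample sheet (hypothetical layout): sample<TAB>fq1<TAB>fq2, one normal per line.
collect_all_normals() {
    local sheet="$1" sample fq1 fq2
    while IFS="$(printf '\t')" read -r sample fq1 fq2; do
        [ -n "$sample" ] || continue          # skip blank lines
        mkdir -p "$OUT_ROOT/$sample"          # one output directory per sample
        # </dev/null keeps the command from consuming the rest of the sheet
        collect_evidence "$sample" "$fq1" "$fq2" </dev/null || return 1
    done < "$sheet"
}
```

Setting DRAGEN=echo before calling collect_all_normals prints each command instead of running it, which is a cheap way to review the generated arguments first.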

Once collect-evidence has run for every normal sample, the resulting MSI **.dist files** can be moved to a separate folder (normal\_reference\_directory). That folder is then passed to the MSI pipeline in tumor-only mode through the --msi-ref-normal-dir option.

    dragen \
      --msi-command tumor-only \
      --msi-coverage-threshold 60 \
      --msi-microsatellites-file msi_file \
      --msi-ref-normal-dir normal_reference_directory \
      --output-directory={output_directory} \
      --output-file-prefix={prefix} \
      --enable-map-align=true \
      --RGID=read_group_ID \
      --RGSM=read_group_sample \
      --ref-dir={reference_directory} \
      --enable-map-align-output=true \
      --enable-sort=true \
      --enable-duplicate-marking=true \
      --tumor-fastq1 {tumor_fq1} \
      --tumor-fastq2 {tumor_fq2}
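Before launching the tumor-only command above, it can help to confirm that the normal reference folder really holds enough .dist files, since the run requires at least 20 normals. The following is a minimal sketch under stated assumptions: the collect-evidence outputs are assumed to end in .dist and to have unique file names per sample, and the function name gather_normal_dists is hypothetical. It copies rather than moves the files so the original output directories stay intact.

```shell
#!/usr/bin/env bash
# Sketch: gather collect-evidence .dist files into one panel-of-normals folder
# and check that there are enough of them.
# ASSUMPTIONS: outputs end in ".dist" and have unique names per sample (files
# with the same name overwrite each other); the panel folder must NOT be inside
# the directory being searched.

gather_normal_dists() {
    local out_root="$1" normal_dir="$2" min_normals="${3:-20}" count
    mkdir -p "$normal_dir"
    # Copy every .dist file found under the collect-evidence output tree.
    find "$out_root" -type f -name '*.dist' -exec cp {} "$normal_dir"/ \;
    count=$(find "$normal_dir" -maxdepth 1 -type f -name '*.dist' | wc -l | tr -d ' ')
    echo "normal .dist files: $count (minimum $min_normals)"
    [ "$count" -ge "$min_normals" ]   # non-zero status if the panel is too small
}
```

A call such as gather_normal_dists with the collect-evidence output root and normal\_reference\_directory as arguments returns a non-zero status when fewer than 20 files were collected, so the tumor-only run can be skipped until more normals are available.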

More information about the MSI pipeline can be found in the [user guide](https://support-docs.illumina.com/SW/DRAGEN_v40/Content/SW/DRAGEN/Biomarkers_MSI.htm).


|                                                                                                                                                                                                                                                                                                                                                                 |
| :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| *For any feedback or questions regarding this article (Illumina Knowledge Article #7508), contact Illumina Technical Support* [*techsupport@illumina.com*](mailto:techsupport@illumina.com?subject=Question%2FFeedback%20Regarding%20Illumina%20Knowledge%20Article%20#000007508%20-%20Software%20\&body=Dear%20Illumina%20Technical%20Support,%0D%0A%0D%0A)*.* |
