# How to parse the Counts sparse matrix file output by the DRAGEN scRNA and scATAC pipelines

The DRAGEN Single Cell pipelines generate a **count matrix** of unique **UMIs/genes (scRNA) and peaks (scATAC) per cell** and output it in [**Matrix Market format**](https://math.nist.gov/MatrixMarket/formats.html) (matrix.mtx.gz), a format typically used for storing **sparse matrices**. To explore the output matrix in a human-readable form, load it into a "dense" dataframe in Python or another programming language. When possible, however, a "sparse" representation of the matrix is preferable, because "dense" matrices use **significantly more memory and disk space**. Several tools work efficiently with "sparse" representations of single-cell matrices (e.g., scanpy in Python).
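
For orientation, the Matrix Market coordinate layout can be read by hand: a `%%MatrixMarket` banner, optional `%` comment lines, one size line (`rows columns nonzeros`), then one `row column value` triplet per nonzero entry, with 1-based indices. The following is a minimal sketch using only the Python standard library. The function name `read_mtx_gz` and the dictionary-based storage are illustrative assumptions, not part of DRAGEN or scanpy, and the sketch assumes a `general` coordinate matrix with integer or real values.

```python
import gzip


def read_mtx_gz(path):
    """Read a gzipped Matrix Market coordinate file into a sparse dict.

    Returns ((n_rows, n_cols), entries), where entries maps 0-based
    (row, col) tuples to values. Only nonzero entries are stored.
    """
    with gzip.open(path, "rt") as handle:
        banner = handle.readline()
        if not banner.startswith("%%MatrixMarket"):
            raise ValueError("missing %%MatrixMarket banner")
        if "coordinate" not in banner.lower():
            raise ValueError("only the sparse 'coordinate' layout is handled")

        # Skip comment and blank lines until the size line is found.
        line = handle.readline()
        while line and (line.startswith("%") or not line.strip()):
            line = handle.readline()
        n_rows, n_cols, n_nonzero = (int(x) for x in line.split())

        entries = {}
        for line in handle:
            if not line.strip():
                continue
            row, col, value = line.split()
            # Matrix Market indices are 1-based; convert to 0-based.
            entries[(int(row) - 1, int(col) - 1)] = float(value)

    if len(entries) != n_nonzero:
        raise ValueError("entry count does not match the size line")
    return (n_rows, n_cols), entries
```

For real data sets, a dedicated reader such as the scanpy call used later in this article is the more practical choice; this sketch is only meant to show what the file contains.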

The **row names** for this matrix are stored in the **barcodes.tsv.gz** file, while the **column names** are stored in a **genes.tsv.gz (scRNA)** or a **peaks.tsv.gz (scATAC)** file.
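
Before attaching labels, it can help to confirm that the label files agree with the matrix dimensions. The sketch below uses only the Python standard library; `count_lines_gz`, `mtx_shape_gz`, and `labels_match` are hypothetical helper names. It does not assume a particular on-disk orientation (cells as rows or cells as columns), so it accepts either ordering, and it assumes one label per line with no header row.

```python
import gzip


def count_lines_gz(path):
    """Count non-empty lines in a gzipped text file (one label per line)."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for line in handle if line.strip())


def mtx_shape_gz(path):
    """Return (n_rows, n_cols) from the size line of a gzipped .mtx file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Skip the banner, comments, and blank lines.
            if line.startswith("%") or not line.strip():
                continue
            n_rows, n_cols, _ = line.split()
            return int(n_rows), int(n_cols)
    raise ValueError("no size line found")


def labels_match(matrix_path, barcodes_path, features_path):
    """True if the label counts equal the matrix dimensions in either order."""
    shape = mtx_shape_gz(matrix_path)
    counts = (count_lines_gz(barcodes_path), count_lines_gz(features_path))
    return shape == counts or shape == counts[::-1]
```

A `False` result suggests that the label files and the matrix come from different runs or that one of the files is truncated.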

The matrix can be converted into a "dense" representation with two Python modules: `scanpy` and `pandas`. This has been tested with **Python 3.10.0, scanpy 1.9.3, pandas 1.5.3**.

First, install the required libraries:

```
pip install -U scanpy pandas
```

Within Python, the matrix can be loaded in "dense" representation using the following commands:

```
# import libraries
import pandas as pd
import scanpy as sc

# define path to input files
matrix_path = "path/to/matrix.mtx.gz"
genes_path = "path/to/genes.tsv.gz"  # path/to/peaks.tsv.gz for scATAC data
barcodes_path = "path/to/barcodes.tsv.gz"

# load matrix through scanpy
adata = sc.read_mtx(matrix_path).T
adata.var_names = pd.read_csv(genes_path, sep="\t", header=None)[1]
adata.obs_names = pd.read_csv(barcodes_path, sep="\t", header=None)[0]

# convert scanpy internal format (AnnData) to dense pandas DataFrame
df = pd.DataFrame(adata.X.todense(), index=adata.obs_names, columns=adata.var_names)

# save it as CSV file
df.to_csv("output_matrix.csv")
```

The matrix can be saved in different output formats (e.g., CSV), although this might not be recommended because of the large disk usage.
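
To get a feel for the cost of a dense export, the following back-of-the-envelope sketch compares rough storage estimates. The function name `estimate_bytes`, the 4-byte value and index sizes, and the example dimensions are all assumptions chosen for illustration, not measurements from DRAGEN output. The CSV figure is only a lower bound (one digit plus one separator per entry).

```python
def estimate_bytes(n_rows, n_cols, n_nonzero, value_bytes=4, index_bytes=4):
    """Rough storage estimates for one matrix in three representations."""
    n_entries = n_rows * n_cols
    return {
        # Dense array: every entry is stored, zeros included.
        "dense_array": n_entries * value_bytes,
        # CSR: one value and one column index per nonzero, plus row pointers.
        "sparse_csr": n_nonzero * (value_bytes + index_bytes)
        + (n_rows + 1) * index_bytes,
        # Dense CSV lower bound: at least one digit and one separator per entry.
        "dense_csv_min": n_entries * 2,
    }


# Hypothetical example: 10,000 cells x 30,000 genes with 2% nonzero entries.
sizes = estimate_bytes(10_000, 30_000, 6_000_000)
for name, size in sizes.items():
    print(f"{name}: {size / 1e9:.2f} GB")
```

With these hypothetical numbers, the dense array comes to about 1.2 GB and the dense CSV to at least 0.6 GB, against roughly 0.05 GB for the sparse layout.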

|                                                                                                                                                                                                                                                                                                                                                                 |
| :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| *For any feedback or questions regarding this article (Illumina Knowledge Article #7911), contact Illumina Technical Support* [*techsupport@illumina.com*](mailto:techsupport@illumina.com?subject=Question%2FFeedback%20Regarding%20Illumina%20Knowledge%20Article%20#000007911%20-%20Software%20\&body=Dear%20Illumina%20Technical%20Support,%0D%0A%0D%0A)*.* |

