Assessing the Suitability of (Zero-Inflated) Negative Binomial Distributions for scRNA-seq Gene Expression Data
07.05.2025
Single-cell RNA sequencing (scRNA-seq) data is highly sparse, with many zero values that pose challenges for statistical modeling. Two candidate distributions for such count data are the negative binomial (NB) and the zero-inflated negative binomial (ZINB) distribution. In this thesis, we want to investigate which of the two distributions better captures the gene expression patterns in scRNA-seq data. By comparing both distributions on real datasets, we aim to determine whether explicitly accounting for excess zeros improves model fit and biological interpretability.
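
To illustrate what such a comparison could look like for a single gene, the following is a minimal sketch, not the method of the thesis. It assumes maximum-likelihood fits of both distributions and the Akaike information criterion (AIC) as the comparison measure; neither choice is prescribed above. The counts are simulated placeholders rather than real scRNA-seq data, and the function names (`nb_loglik`, `zinb_loglik`, `fit`) as well as the parameterization by mean `mu`, dispersion `theta`, and zero-inflation probability `pi` are assumptions made for this example.

```python
import numpy as np
from scipy import optimize, stats


def nb_loglik(params, counts):
    """NB log-likelihood, parameterized by log-mean and log-dispersion."""
    log_mu, log_theta = params
    mu, theta = np.exp(log_mu), np.exp(log_theta)
    p = theta / (theta + mu)  # scipy's nbinom(n, p) then has mean mu
    return stats.nbinom.logpmf(counts, theta, p).sum()


def zinb_loglik(params, counts):
    """ZINB log-likelihood: with probability pi a structural zero, else NB."""
    log_mu, log_theta, logit_pi = params
    mu, theta = np.exp(log_mu), np.exp(log_theta)
    p = theta / (theta + mu)
    log_pi = -np.logaddexp(0.0, -logit_pi)    # log(pi), numerically stable
    log_1m_pi = -np.logaddexp(0.0, logit_pi)  # log(1 - pi)
    nb_logpmf = stats.nbinom.logpmf(counts, theta, p)
    ll_zero = np.logaddexp(log_pi, log_1m_pi + nb_logpmf)  # zeros from either part
    ll_pos = log_1m_pi + nb_logpmf                         # positives only from NB
    return np.where(counts == 0, ll_zero, ll_pos).sum()


def fit(loglik, x0, counts):
    """Maximize the log-likelihood and report it together with the AIC."""
    res = optimize.minimize(lambda p: -loglik(p, counts), x0, method="Nelder-Mead")
    return {"loglik": -res.fun, "aic": 2 * len(x0) + 2 * res.fun}


# Simulated counts for one hypothetical gene (placeholder values, not real data):
# NB counts with mean 5 and dispersion 2, with about 30% of cells forced to zero.
rng = np.random.default_rng(0)
mu_true, theta_true, pi_true = 5.0, 2.0, 0.3
counts = rng.negative_binomial(theta_true, theta_true / (theta_true + mu_true), size=2000)
counts[rng.random(counts.size) < pi_true] = 0

start_mu = np.log(counts.mean() + 1e-8)
nb_fit = fit(nb_loglik, [start_mu, 0.0], counts)
zinb_fit = fit(zinb_loglik, [start_mu, 0.0, 0.0], counts)

print("NB:  ", nb_fit)
print("ZINB:", zinb_fit)
```

A lower AIC indicates a better trade-off between fit and number of parameters. Because the NB is a special case of the ZINB (zero-inflation probability approaching 0), the ZINB can only match or exceed the NB log-likelihood, so the interesting question is whether the gain justifies the extra parameter. On real data this comparison would be repeated per gene, and possibly per cell type or after accounting for sequencing depth.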