On Multi-phase Metagenomics Reads Binning
Cinzia Pizzi
2025
Abstract
Metagenomics is the study of heterogeneous microbial samples extracted directly from their natural environment, e.g., from soil, water, or the human body. The detection and quantification of the species that populate microbial communities have been the subject of many recent studies based on classification and clustering, motivated by the fact that they are the first step in more complex pipelines (e.g., functional analysis, de novo assembly, or comparison of metagenomes). In this paper we explore the idea of improving the overall quality of metagenomic binning at the read level by proposing a framework that sequentially combines two complementary read binning approaches: one based on determining species abundances and another that relies on read overlaps to cluster reads together. Our preliminary results show that combining the two tools can improve clustering quality in realistic conditions where the number of species is not known beforehand.