
Prior-driven cluster allocation in Bayesian mixture models / Paganin, Sally. - (2018 Nov 30).

Prior-driven cluster allocation in Bayesian mixture models

Paganin, Sally
2018

Abstract

There is a rich literature on Bayesian approaches to clustering that start from a prior probability distribution on partitions. Most approaches assume exchangeability, leading to simple representations of such priors in terms of an Exchangeable Partition Probability Function (EPPF). Gibbs-type priors encompass a broad class of such cases, including Dirichlet and Pitman-Yor processes. Although there have been some proposals to relax the exchangeability assumption, allowing covariate dependence and partial exchangeability, limited consideration has been given to how to include concrete prior knowledge on the partition. Our motivation is drawn from an epidemiological application in which we wish to cluster birth defects into groups and have prior knowledge of an initial clustering provided by experts. The underlying assumption is that birth defects in the same group may have similar coefficients in logistic regression analyses relating different exposures to the risk of developing the defect. As a general approach for including such prior knowledge, we propose a Centered Partition (CP) process that modifies a base EPPF to favor partitions in a convenient distance neighborhood of the initial clustering. This thesis focuses on characterizing this new class, along with its properties and general algorithms for posterior computation. We illustrate the methodology through simulation examples and an application to the motivating epidemiological study of birth defects.
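As a rough illustration of the idea of modifying a base EPPF to favor partitions near an initial clustering, the sketch below enumerates all partitions of a small set and reweights a base prior by an exponential distance penalty. The abstract does not state the functional form, so everything here is an assumption made for illustration: the penalty `exp(-psi * d(c, c0))`, the Dirichlet process EPPF as the base prior, the Variation of Information as the distance, and all function and parameter names (`dp_eppf`, `centered_partition_prior`, `alpha`, `psi`).

```python
import math
from collections import Counter


def set_partitions(n):
    """Yield every partition of items 0..n-1 as a tuple of cluster labels,
    with labels numbered in order of first appearance."""
    def grow(labels, n_clusters):
        if len(labels) == n:
            yield tuple(labels)
            return
        # The next item joins an existing cluster or opens a new one.
        for k in range(n_clusters + 1):
            yield from grow(labels + [k], max(n_clusters, k + 1))
    yield from grow([], 0)


def dp_eppf(labels, alpha):
    """Dirichlet process EPPF: alpha^K * prod_j (n_j - 1)! / (alpha)_n,
    where (alpha)_n is the rising factorial."""
    n = len(labels)
    sizes = Counter(labels).values()
    rising = math.prod(alpha + i for i in range(n))
    return alpha ** len(sizes) * math.prod(math.factorial(s - 1) for s in sizes) / rising


def entropy(counts, n):
    """Shannon entropy (base 2) of the block proportions counts / n."""
    return -sum(m / n * math.log2(m / n) for m in counts)


def variation_of_information(a, b):
    """Variation of Information between two partitions of the same items."""
    n = len(a)
    h_a = entropy(Counter(a).values(), n)
    h_b = entropy(Counter(b).values(), n)
    h_ab = entropy(Counter(zip(a, b)).values(), n)  # entropy of the common refinement
    return max(0.0, 2 * h_ab - h_a - h_b)  # clamp tiny negative rounding errors


def centered_partition_prior(n, c0, alpha, psi):
    """Assumed form: p(c | c0, psi) proportional to p0(c) * exp(-psi * d(c, c0)),
    normalized by brute-force enumeration (feasible only for small n)."""
    weights = {
        c: dp_eppf(c, alpha) * math.exp(-psi * variation_of_information(c, c0))
        for c in set_partitions(n)
    }
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}
```

Under these assumptions, `psi = 0` recovers the base prior, while larger values of `psi` concentrate mass on the initial clustering `c0` (for example `c0 = (0, 0, 1, 1)` for four items split into two pairs). Exhaustive enumeration is used only to keep the example self-contained; it does not scale beyond a handful of items.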
30-nov-2018
Bayesian clustering, Bayesian nonparametrics, centered process, Dirichlet Process, exchangeable partition probability function, mixture model, product partition model.
Files in this item:
File: paganin_sally_thesis.pdf
Access: open access
Type: Doctoral thesis
License: Free access
Size: 922.14 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3426831