Learning undirected graphical models from multiple datasets with the generalized non-rejection rate

Roverato, Alberto; Castelo, R.

Learning graphical models from multiple datasets is an appealing approach to inferring transcriptional regulatory interactions from microarray data in molecular biology. The problem has been addressed both with model-based statistical methods and with unsupervised machine learning methods; in the latter, it is common practice to pool datasets produced under different experimental conditions. In this paper, we introduce a quantity called the generalized non-rejection rate, which extends the non-rejection rate introduced by Castelo and Roverato (2006) so as to explicitly take into account the different graphical models, each representing a distinct experimental condition, that underlie a dataset produced in multiple experimental batches. We show that the generalized non-rejection rate allows one to learn the edges common to all of the graphical models, making it especially suited to identifying robust transcriptional interactions shared by all the considered experiments. The generalized non-rejection rate is then applied to both synthetic and real data and is shown to perform competitively with other widely used methods.