Multiple graph-kernel learning
Fabio Aiolli; Michele Donini; Nicolò Navarin; Alessandro Sperduti
2015
Abstract
Kernels for structures, including graphs, generally suffer from the diagonally dominant Gram matrix issue: the number of sub-structures, or features, shared between two instances is very small compared to the number shared by an instance with itself. A parametric rule is typically used to reduce the weights of the larger (more complex) sub-structures. The particular rule adopted is in fact a strong external bias that may heavily affect the resulting predictive performance. Thus, in principle, the applied rule should be validated together with the other hyper-parameters of the kernel. Nevertheless, for the majority of graph kernels proposed in the literature, the parameters of the weighting rule are fixed a priori. The contribution of this paper is two-fold. Firstly, we propose a Multiple Kernel Learning (MKL) approach to learn different weights for different groups of features, where the groups are formed by complexity. Secondly, we define a notion of kernel complexity, namely Kernel Spectral Complexity, and we show how this complexity relates to the well-known Empirical Rademacher Complexity for a natural class of functions that includes SVMs. The proposed approach is applied to a recently defined graph kernel and evaluated on several real-world datasets. The obtained results show that our approach outperforms the original kernel on all the considered tasks.
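The core idea sketched in the abstract, combining one base kernel per feature-complexity level with learned weights, can be illustrated with a toy Python example. This is a minimal hypothetical sketch, not the paper's actual algorithm: the toy feature blocks stand in for sub-structure groups, and kernel-target alignment is used as a simple stand-in for the MKL optimization the paper proposes.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy feature blocks standing in for sub-structure groups of increasing complexity.
X_blocks = [rng.standard_normal((40, d)) for d in (3, 5, 8)]
y = np.sign(X_blocks[0][:, 0] + 0.1 * rng.standard_normal(40))

# One linear Gram matrix per complexity level, trace-normalized so that
# no single group dominates merely because of its scale.
grams = []
for Xb in X_blocks:
    K = Xb @ Xb.T
    grams.append(K / np.trace(K))

# Kernel-target alignment as a simple illustrative weighting heuristic
# (the paper learns these weights via an MKL formulation instead).
yy = np.outer(y, y)
align = np.array([np.sum(K * yy) / np.linalg.norm(K) for K in grams])
weights = np.clip(align, 0.0, None)
weights /= weights.sum()

# Weighted combination of the per-complexity kernels, then a standard SVM
# on the resulting precomputed Gram matrix.
K_comb = sum(w * K for w, K in zip(weights, grams))
clf = SVC(kernel="precomputed").fit(K_comb, y)
print("learned weights:", np.round(weights, 3))
print("train accuracy:", clf.score(K_comb, y))
```

The point of the sketch is only the structure of the approach: each complexity group contributes its own Gram matrix, and the combination weights are learned from data rather than fixed a priori by a decay rule.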
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.