CoDa-thesis at UdG (Spain)


Mixture Models Contributions from Compositional Analysis

On the 24th of October 2018, Marc Comas-Cufí (CoDa-research group, University of Girona) defended his thesis, supervised by Dr Glòria Mateu-Figueras and Dr Josep-Antoni Martín-Fernández. The jury members were Dr Josep Daunis, Dr Jan Graffelman, and Dr Daniel Oberski.

This thesis is a compendium of three original works produced between 2014 and 2018. The papers share a common thread: each is a contribution of compositional data analysis to the study of models based on mixtures of probability distributions. In brief, compositional data analysis is a methodology for studying samples of strictly positive measurements from a relative point of view. Mixtures of distributions are a specific type of probability distribution, defined as a convex combination of other distributions.
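The two definitions above can be illustrated with a minimal sketch. It is not taken from the thesis: the function names (`closure`, `normal_pdf`, `mixture_pdf`) and the choice of univariate normal components are assumptions made only for illustration. The closure operation shows the "relative point of view" (rescaling the parts does not change the composition), and the mixture density is written as a convex combination of component densities.

```python
import math


def closure(parts):
    """Rescale strictly positive parts so that they sum to 1.

    Only the ratios between parts matter, so multiplying every part by the
    same positive constant yields the same composition.
    """
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be strictly positive")
    total = sum(parts)
    return [p / total for p in parts]


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution (illustrative component)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, components):
    """Mixture density: a convex combination of the component densities.

    `weights` must be non-negative and sum to 1; `components` is a list of
    hypothetical (mean, sd) pairs, one per component.
    """
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * normal_pdf(x, m, s) for w, (m, s) in zip(weights, components))


# Mixing weights are themselves a composition, so closure() can build them.
weights = closure([3.0, 1.0])
density_at_zero = mixture_pdf(0.0, weights, [(0.0, 1.0), (2.0, 0.5)])
```

Because the mixing weights lie in the simplex, they can be treated as compositional data in their own right, which is one way to see the link between the two topics.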

The first work analyses the available options for defining mixtures of probability distributions on the sample space of compositional data (the simplex), taking its specific algebraic structure into account. The second work presents a model that integrates all the proposals found in the literature that base the construction of the hierarchy on the vectors of posterior probabilities. Alongside this integrating model, it introduces new methods for creating hierarchies that use measures for vectors of probabilities that are coherent from a compositional point of view. The third and last work derives several properties of the logratio-normal-multinomial probability distribution, presents a new method for estimating its parameters, and demonstrates its improved capacity for modelling count data compared with the Dirichlet-multinomial distribution, one of the most popular in this context.
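To give a concrete picture of the kind of count model discussed in the third work, the sketch below draws counts from a multinomial whose probability vector is itself random: a normal vector in logratio coordinates mapped back to the simplex. This is only an illustration under stated assumptions, not the thesis's estimation method. The additive logratio (alr) coordinates, the independent normal coordinates (a diagonal covariance) and the names `alr_inverse` and `sample_counts` are all hypothetical choices; the thesis may use other coordinates and a full covariance matrix.

```python
import math
import random


def alr_inverse(h):
    """Map D-1 real logratio coordinates to a composition with D parts.

    The last part is the reference part of the additive logratio transform.
    The maximum is subtracted before exponentiating for numerical stability.
    """
    shifted = list(h) + [0.0]
    m = max(shifted)
    expd = [math.exp(v - m) for v in shifted]
    total = sum(expd)
    return [e / total for e in expd]


def sample_counts(n_trials, mu, sd, rng):
    """Draw one count vector from the compound model.

    Assumption: the logratio coordinates are independent normals with means
    `mu` and standard deviations `sd`.
    """
    h = [rng.gauss(m, s) for m, s in zip(mu, sd)]
    p = alr_inverse(h)  # random probability vector in the simplex
    draws = rng.choices(range(len(p)), weights=p, k=n_trials)
    counts = [0] * len(p)
    for category in draws:
        counts[category] += 1
    return counts


rng = random.Random(0)  # fixed seed so the sketch is reproducible
counts = sample_counts(100, mu=[0.5, -0.5], sd=[1.0, 1.0], rng=rng)
```

Each call produces a vector of three counts that sum to the number of trials. The extra variability in the probability vector is what lets such compound models accommodate counts that are more dispersed than a plain multinomial allows.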

Additional information