We are pleased to announce an upcoming talk by Dr. Ionas Erb from the Centre for Genomic Regulation (CRG), who will be presenting on April 23, 2025, at 15:00.
Abstract:
Contingency tables are data structures for discrete variables: each row stands for a value the row variable can take, and each column for a value the column variable can take (with higher-dimensional tables for more than two variables). The table entries are event counts and are used to estimate the parameters of a cross-classified multinomial distribution, which makes it possible to quantify the interaction between the variables. The associated theory of log-linear models was already established in the 1970s. In this context, marginal distributions are the distributions of subsets of variables obtained from coordinate projections. In this talk, I will scrutinize an alternative approach to the analysis of contingency tables that uses the logratio approach of compositional data analysis (CoDA). This latter approach leads to the so-called geometric marginals. These have an appealing geometric representation in terms of projections in Euclidean space, allowing for a Pythagorean theorem for probability distributions. The problem with this alternative form of marginalization is that the resulting distributions no longer have a clear probabilistic meaning. To obtain analogous geometric constructions for the classical (arithmetic) marginals, a generalization of Euclidean geometry known as Information Geometry must be applied. This approach is favored because it is based on the Fisher-Rao metric, the only metric on the simplex that is invariant under reparameterization and under sufficient statistics. A Pythagorean theorem for the Kullback-Leibler divergence of a distribution from its marginals makes use of the so-called information projections. These can be used to quantify the difference between the mutual information that a distribution has with respect to its arithmetic marginals and with respect to its geometric marginals.
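
For readers who would like a concrete picture of the two kinds of marginals before the talk, the following is a minimal numerical sketch and not material from the talk itself. It assumes a hypothetical 2x3 table of counts, and it assumes that geometric marginals are obtained as row-wise and column-wise geometric means rescaled to sum to one; all variable and function names are illustrative. It checks the Pythagorean relation for the Kullback-Leibler divergence with the arithmetic marginals and the Euclidean one in centred-logratio coordinates with the geometric marginals.

```python
import numpy as np


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats (strictly positive entries assumed)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))


def clr(p):
    """Centred logratio transform of a strictly positive table."""
    logp = np.log(p)
    return logp - logp.mean()


# Hypothetical 2x3 table of event counts (illustrative numbers only).
counts = np.array([[30.0, 10.0, 20.0],
                   [10.0, 25.0, 5.0]])
p = counts / counts.sum()  # estimated joint distribution

# Arithmetic marginals: sums over the other variable (coordinate projections).
row_arith = p.sum(axis=1)
col_arith = p.sum(axis=0)
indep_arith = np.outer(row_arith, col_arith)

# Mutual information is the KL divergence from the product of arithmetic marginals.
mutual_information = kl(p, indep_arith)

# Pythagorean relation for KL: for any product distribution q,
# KL(p || q) = KL(p || indep_arith) + KL(indep_arith || q).
q = np.outer([0.5, 0.5], [0.2, 0.3, 0.5])  # arbitrary product distribution
kl_lhs = kl(p, q)
kl_rhs = kl(p, indep_arith) + kl(indep_arith, q)

# Geometric marginals (assumed definition): geometric means over the other
# variable, rescaled to sum to one.
row_geo = np.exp(np.log(p).mean(axis=1))
row_geo /= row_geo.sum()
col_geo = np.exp(np.log(p).mean(axis=0))
col_geo /= col_geo.sum()
indep_geo = np.outer(row_geo, col_geo)

# Euclidean Pythagorean relation in clr coordinates:
# squared norm of the table = independent part + interaction part.
total_sq = float(np.sum(clr(p) ** 2))
indep_sq = float(np.sum(clr(indep_geo) ** 2))
inter_sq = float(np.sum((clr(p) - clr(indep_geo)) ** 2))

# Divergence of p from the product of its geometric marginals.
kl_to_geo = kl(p, indep_geo)

print("mutual information (arithmetic marginals):", mutual_information)
print("KL to product of geometric marginals:", kl_to_geo)
print("KL Pythagoras holds:", np.isclose(kl_lhs, kl_rhs))
print("clr Pythagoras holds:", np.isclose(total_sq, indep_sq + inter_sq))
```

In this sketch the divergence from the product of geometric marginals cannot be smaller than the mutual information, because the product of arithmetic marginals is the closest product distribution to the table in the Kullback-Leibler sense; the gap equals the divergence between the two product distributions.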