Research Seminar Data Science Foundations

January 9, 2025, Thomas Martinetz, Universität zu Lübeck

Do highly over-parameterized neural networks generalize since bad solutions are rare?

Thomas Martinetz, Universität zu Lübeck

Abstract

Traditional machine learning wisdom tells us that one needs more training data points than there are parameters in a neural network to be able to learn a given task. Learning theory based on the VC-dimension or Rademacher complexity provides an extended and deeper framework for this "wisdom". Modern deep neural networks have millions of weights, so one should need extremely large training data sets. That's the common narrative. But is it really true? In practice, these large neural networks are often trained with much less data than one would expect to be necessary. We show in experiments that even a few hundred data points can be sufficient for millions of weights, and we provide a mathematical framework to understand this surprising phenomenon, challenging the traditional view.
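
The claim can be sanity-checked with a toy experiment. The sketch below is purely illustrative and not the speaker's setup: it trains a scikit-learn MLP with roughly half a million weights on only 300 synthetic data points and reports train and test accuracy; the dataset and network parameters are arbitrary assumptions.

```python
# Illustrative sketch (not the speaker's experiments): an MLP whose weight
# count far exceeds the number of training samples can still generalize
# reasonably on a simple synthetic task.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           random_state=0)
# Only 300 training points, but the network below has ~0.5 million weights.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=300, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(512, 512, 512), max_iter=2000,
                    random_state=0)
clf.fit(X_train, y_train)

n_weights = sum(w.size for w in clf.coefs_) + sum(b.size for b in clf.intercepts_)
print(f"weights: {n_weights}, "
      f"train acc: {clf.score(X_train, y_train):.2f}, "
      f"test acc: {clf.score(X_test, y_test):.2f}")
```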

October 29, 2024, Quanyang Chen, University of Sydney

Super-efficiency in complex collective systems

Quanyang Chen, University of Sydney

Abstract

Self-organising collective systems, such as flocks of birds, shoals of fish and the brain, often operate near the "critical regime" between order and disorder. To understand why this occurs, we study four intrinsic utilities: predictive information, empowerment, variational free energy (active inference) and thermodynamic efficiency. These measures evaluate the usefulness of behaviour independently of external rewards. In this talk, I will briefly introduce each measure and compare them on the same example (the Ising model). We use numerical simulations to identify the optimal parameter values under each intrinsic utility framework, and I will discuss what the results reveal about the utility of operating in the critical regime.
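
As a purely illustrative aside (the intrinsic-utility computations discussed in the talk are not reproduced here), the sketch below simulates a small 2D Ising model with Metropolis updates and shows the order/disorder transition around the critical temperature; lattice size, sweep count and temperatures are arbitrary assumptions.

```python
# Illustrative sketch only: a small 2D Ising model with Metropolis updates.
# Starting from an ordered state, magnetisation survives below the critical
# temperature (Tc ~ 2.269 for the infinite square lattice) and is destroyed
# above it.
import numpy as np

rng = np.random.default_rng(0)
L = 16  # lattice side length (arbitrary choice)

def sweep(spins, beta):
    """One Metropolis sweep at inverse temperature beta."""
    for _ in range(spins.size):
        i, j = rng.integers(L, size=2)
        nb = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j] +
              spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        dE = 2 * spins[i, j] * nb
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1

for T in (1.5, 2.27, 3.5):  # below, near, and above criticality
    spins = np.ones((L, L))
    for _ in range(400):
        sweep(spins, 1.0 / T)
    print(f"T = {T:4.2f}  |magnetisation| ~ {abs(spins.mean()):.2f}")
```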

October 23, 2024, Swarnadeep Bhar, IRIT, Toulouse Institute for Research in Computer Science

Language Models: A Look into Their History, Scope and Pitfalls

Swarnadeep Bhar, IRIT, Toulouse Institute for Research in Computer Science

Abstract

With the release of ChatGPT we have seen a significant surge of interest in language models. While the central theory behind these models has been around for quite some time, large language models have shown that scaling to large amounts of data and aligning the models with human responses can unlock jumps in performance previously unseen. They perform impressively on a range of benchmarks that were previously thought to be “impossible” for machine learning techniques to solve, but a range of new problems arises: “hallucinated” responses and the tendency of these models to give an affirmative response to any instruction limit their blind deployment in critical scenarios. In this talk, we’ll explore the core principles behind these models, their impressive capabilities, and the challenges they pose, such as hallucinations and overly affirmative responses. We'll also discuss key considerations to keep in mind when deploying these models for personal or critical use cases.

September 19, 2024, Hippolyte Charvin, University of Hertfordshire

Towards Symmetry-Based Structure Extraction using Generalized Information Bottlenecks

Hippolyte Charvin, University of Hertfordshire

Abstract

Extraction of structure, in particular of group symmetries, is increasingly crucial to understanding and building intelligent models. In parallel, some information-theoretic models of complexity-constrained learning have been argued to induce invariance extraction. Here, we formalise and extend the study of group symmetries through the information lens, by identifying a certain duality between probabilistic symmetries and information parsimony. Namely, we characterise group symmetries through the full information preservation case of Information Bottleneck-like compressions. More precisely, we require the compression to be optimal under the constraint of preserving the divergence from a given exponential family, yielding a novel generalisation of the Information Bottleneck framework. Through appropriate choices of exponential families, we characterise (in the discrete and full support case) channel invariance, channel equivariance and distribution invariance under permutation. Allowing non-zero distortion then leads to principled definitions of "soft symmetries" as the exact symmetries of a compressed representation of data. In simple synthetic experiments, we demonstrate that our method successively recovers, at increasingly compressed "resolutions", nested but increasingly perturbed equivariances, where new equivariances emerge at bifurcation points of the distortion parameter. Our framework provides the area of probabilistic symmetry discovery with a theoretical clarification of its link with information parsimony, and with a basis on which to potentially build new computational tools.
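
For readers unfamiliar with the framework being generalised, the classical Information Bottleneck objective (Tishby, Pereira and Bialek) is recalled below; the exponential-family divergence constraint introduced in the talk generalises this and is not reproduced here.

```latex
% Classical Information Bottleneck: compress X into a representation T while
% preserving information about Y, subject to the Markov condition Y <-> X <-> T.
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta\, I(T;Y)
```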

January 24, 2024, Janek Gödeke, University of Bremen

Operator Approximation by Neural Networks

Janek Gödeke, University of Bremen

Abstract:

Learning an operator between function spaces with neural networks is desirable, for example, in the field of partial differential equations (PDEs), for instance to learn parameter-to-state maps that map the parameter function of a PDE to the corresponding solution. During the last five years, several Deep Learning concepts have arisen, such as Deep Operator Networks, (Fourier) Neural Operators, and operator approximation based on Principal Component Analysis.

These approaches, particularly the architectures of the involved neural networks, are inspired by universal approximation theorems, which state that many operators can be approximated arbitrarily well by sufficiently large networks of these types. Although universal approximation theorems cannot fully explain the success of neural networks, their investigation has led to powerful Deep Learning approaches. Furthermore, they reveal that certain network architectures may add a beneficial inductive bias to operator learning tasks.

In my talk I will give an overview of the state of the art of such operator approximation theorems. I will discuss general concepts and questions that have not yet been answered.

November 02, 2023, Dieter Büchler, Max Planck Institute for Intelligent Systems

The Role of the Robotic Body in Learning Agile & Accurate Control

Dieter Büchler, Max Planck Institute for Intelligent Systems

Abstract: 

Despite decades of robotics research, current robots still struggle to acquire general and flexible dynamic skills at a human level. Tasks such as table tennis represent this set of dynamic problems that appear easy for humans to learn but pose a steep challenge for anthropomorphic robots. In this talk, I will argue that the robotic body plays a crucial role in the generation of such skills. In particular, muscular actuation enables (i) robust long-term training, such as is required for reinforcement learning, and (ii) fail-safe execution of explosive motions that allows robots to safely explore dynamic regimes. Stay tuned for table tennis playing, ball smashing, and precisely controlled soft muscular robots.

October 25, 2023, Mikhail Prokopenko, University of Sydney

Pattern formation and critical regimes during social and epidemic dynamics

Mikhail Prokopenko, University of Sydney

Abstract:

We will discuss pattern formation and critical regimes during spatial contagions of four different types: epidemics, opinion polarisation, social myths, and social unrest. The presented model combines the Maximum Entropy principle with Lotka-Volterra dynamics, and the results are analysed using methods of percolation theory. The identified critical regimes separate distinct phases, implying that small changes in individual risk perception could lead to abrupt changes in the spatial morphology of the epidemic and social phenomena.
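
For reference, the classical two-species Lotka-Volterra equations are shown below; the model presented in the talk couples such dynamics with the Maximum Entropy principle and spatial structure, which this basic form does not capture.

```latex
% Classical predator-prey Lotka--Volterra dynamics (prey x, predators y):
\frac{\mathrm{d}x}{\mathrm{d}t} = \alpha x - \beta x y,
\qquad
\frac{\mathrm{d}y}{\mathrm{d}t} = \delta x y - \gamma y
```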

Presentation

May 10, 2023, Giulia Bertagnolli, University of Trento

Random walks on networks (toward information geometry)

Giulia Bertagnolli, University of Trento

Location: Blohmstraße 15 (HIP One), 5th Floor, Room 5.002

Time: 16:15

Abstract

Complex physical and social systems find a handy representation in terms of graphs, which, in this context, are called complex networks. Entities in these systems naturally “communicate”, or exchange “information”: for example, people interacting via email or sharing links, liking posts, and following each other on social platforms exchange information as part of their social life. Neurons, connected by synapses and fibre bundles, exchange neuro-physiological signals, enabling cognition. In fish schools, aggregations of fish that come together in an interactive, social way, the (possibly passive) communication between fish allows them to act as a super-system. All complex systems show some emergent behaviour that cannot be ascribed to the actions and behaviour of their individual components. This emergent behaviour is a function of both the interaction patterns, i.e. the links in the graph, and the communication strategy, which can be modelled as a dynamical process on the network. In this talk, we will first see how Markovian random walks on networks model diffusion dynamics in a complex system and why this approach is useful in network science. Then, we will see an example of a non-Markovian random walk, which mimics the run-and-tumble motion of bacteria. Eventually, it should become clear how this led me here, trying to learn information geometry.
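
To make the first part concrete, here is a minimal, purely illustrative sketch (not the speaker's code) of a Markovian random walk on a small undirected example graph: the walker jumps to a uniformly random neighbour, and its distribution converges to the degree-proportional stationary distribution.

```python
# Markovian random walk on a small undirected graph: the transition matrix is
# the degree-normalised adjacency matrix, and the stationary distribution is
# proportional to node degrees. The graph below is an arbitrary example.
import numpy as np

A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0],
              [1, 1, 0, 1, 1],
              [0, 1, 1, 0, 1],
              [0, 0, 1, 1, 0]], dtype=float)

degrees = A.sum(axis=1)
P = A / degrees[:, None]          # row-stochastic transition matrix

# Iterate the walk from a localised start until it (approximately) converges.
p = np.zeros(len(A))
p[0] = 1.0
for _ in range(200):
    p = p @ P

print("empirical stationary distribution:", np.round(p, 3))
print("degree-proportional prediction:   ", np.round(degrees / degrees.sum(), 3))
```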

April 19, 2023, Jacob J. W. Bakermans, University of Oxford

Compositional planning by making memories of the future

Jacob J. W. Bakermans (University of Oxford)

Location: Blohmstraße 15 (HIP One), 5th Floor, Room 5.002

Time: 15:00

Abstract:

The hippocampus is critical for memory, imagination, and constructive reasoning. However, recent models have suggested that its neuronal responses can be well explained by state spaces that model the transitions between experiences. How do we reconcile these two views? I’ll show that if state spaces are constructed compositionally from existing primitives, hippocampal responses can be interpreted as compositional memories, binding these primitives together. Critically, this enables agents to behave optimally in novel environments with no new learning, inferring behaviour directly from the composition. This provides natural interpretations of generalisation and latent learning. Hippocampal replay can build and consolidate these compositional memories, but importantly, due to their compositional nature, it can construct states it has never experienced – effectively building memories of the future. This enables new predictions of optimal replays for novel environments, or after structural changes.

March 17, 2023, Minh Ha Quang, RIKEN Center for Advanced Intelligence Project (AIP)

An information geometric and optimal transport framework for Gaussian processes

Minh Ha Quang, RIKEN Center for Advanced Intelligence Project (AIP)

Location: Blohmstraße 15 (HIP One), 5th Floor, Room 5.002

Time: 15:00

Abstract:

Information geometry (IG) and Optimal transport (OT) have been attracting much research attention in various fields, in particular machine learning and statistics. In this talk, we present results on the generalization of IG and OT distances for finite-dimensional Gaussian measures to the setting of infinite-dimensional Gaussian measures and Gaussian processes. Our focus is on the Entropic Regularization of the 2-Wasserstein distance and the generalization of the Fisher-Rao distance and related quantities. In both settings, regularization leads to many desirable theoretical properties, including in particular dimension-independent convergence and sample complexity. The mathematical formulation involves the interplay of IG and OT with Gaussian processes and the methodology of reproducing kernel Hilbert spaces (RKHS). All of the presented formulations admit closed form expressions that can be efficiently computed and applied practically. The theoretical formulations will be illustrated with numerical experiments on Gaussian processes.
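
To give a flavour of the closed-form expressions mentioned, the sketch below evaluates the classical (unregularised) 2-Wasserstein distance between two finite-dimensional Gaussians; the infinite-dimensional, entropically regularised and Fisher-Rao quantities treated in the talk are not reproduced, and all inputs are arbitrary examples.

```python
# Closed-form 2-Wasserstein distance between Gaussians N(m1, C1) and N(m2, C2):
#   W2^2 = |m1 - m2|^2 + tr(C1 + C2 - 2 (C2^{1/2} C1 C2^{1/2})^{1/2}).
import numpy as np
from scipy.linalg import sqrtm

def wasserstein2_gaussian(m1, C1, m2, C2):
    """Squared 2-Wasserstein distance between two Gaussian measures."""
    cross = sqrtm(sqrtm(C2) @ C1 @ sqrtm(C2)).real
    return float(np.sum((m1 - m2) ** 2) + np.trace(C1 + C2 - 2 * cross))

m1, m2 = np.zeros(2), np.array([1.0, 0.0])
C1 = np.eye(2)
C2 = np.array([[2.0, 0.3], [0.3, 1.0]])
print("W2^2 =", wasserstein2_gaussian(m1, C1, m2, C2))
```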

February 01, 2023, Christian Gumbsch, Max Planck Institute for Intelligent Systems and University of Tübingen

Events – Learning Latent Codes for Hierarchical Prediction and Generalization

Christian Gumbsch, Max Planck Institute for Intelligent Systems and University of Tübingen

Location: Blohmstraße 15 (HIP One), 5th Floor, Room 5.002

Time: 15:00

October 19, 2022, Nihat Ay, Hamburg University of Technology

Die Klugheit der Dinge (The Cleverness of Things)

Nihat Ay, Hamburg University of Technology

Location: Blohmstraße 15 (HIP One), 5th Floor, Room 5.002

Time: 15:00

More info: Stud.IP

November 17, 2021, Johannes Rauh, MPI for Mathematics in the Sciences, Leipzig

Uncertainty and Stochasticity of Optimal Policies

Johannes Rauh, MPI for Mathematics in the Sciences, Leipzig and Federal Institute for Quality and Transparency in Healthcare, Berlin

Location: Blohmstraße 15 (HIP One), 5th Floor, Room 5.002

Time: 9:30

Abstract

We are interested in optimal action selection mechanisms, called policies, that maximize an expected long-term reward. Our main models are POMDPs (Partially Observable Markov Decision Processes). While the optimal policy can be stochastic in the general case, we find conditions under which the optimal policy is deterministic, at least for some observations, or under which its stochasticity can be bounded. This talk presents joint work with Guido Montúfar and Nihat Ay.
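
One common way to formalise the "expected long-term reward" is the discounted objective below, written for a memoryless stochastic policy acting on observations; the specific conditions for determinism or bounded stochasticity derived in the talk are not reproduced here.

```latex
% Discounted long-term reward for a memoryless stochastic policy pi(a | o)
% acting on observations o_t of the hidden state s_t:
\pi^{*} \in \arg\max_{\pi}\;
\mathbb{E}_{\pi}\!\Big[\sum_{t=0}^{\infty} \gamma^{t}\, R(s_{t}, a_{t})\Big],
\qquad a_{t} \sim \pi(\,\cdot \mid o_{t}\,), \quad 0 \le \gamma < 1
```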

June 14, 2021, Nihat Ay, Hamburg University of Technology

Information Geometry for Deep Learning (Seminar within the Machine Learning in Engineering initiative MLE@TUHH)

Nihat Ay, Hamburg University of Technology

The following seminars were co-organised and took place at the Max Planck Institute for Mathematics in the Sciences.

March 19, 2021, Stefanie Jegelka, Machine Learning Group at MIT

Mathematics of Data Seminar : Representation and Learning in Graph Neural Networks

Stefanie Jegelka, Machine Learning Group at MIT, USA

19.03.2021, 16:00 Uhr

The seminar is cancelled.

August 04, 2020, Marco Mondelli, IST Austria

Mathematics of Data Seminar : Understanding Gradient Descent for Over-parameterized Deep Neural Networks

Marco Mondelli, IST Austria

04.08.2020, 11:00 Uhr

July 20, 2020, Franca Hoffmann, California Institute of Technology

Mathematics of Data Seminar : Kalman-Wasserstein Gradient Flows

Franca Hoffmann, California Institute of Technology

20.07.2020, 17:00 Uhr

February 04, 2020, Xerxes Arsiwalla, Pompeu Fabra University Barcelona

Special Seminar : Extending Integrated Information Theories for Cognitive Systems

Xerxes Arsiwalla, Pompeu Fabra University Barcelona, Spain

04.02.2020, 11:00 Uhr

10 December, 2019, Mikhail Belkin, The Ohio State University

Chalk Talk – Mathematics of Data Seminar : What’s next for machine learning? Some thoughts toward a unified theory of supervised inference.

Mikhail Belkin, The Ohio State University, USA

10.12.2019, 16:45 Uhr

14 November, 2019, Kathlén Kohn, KTH Royal Institute of Technology

Mathematics of Data Seminar : The geometry of neural networks

Kathlén Kohn, KTH Royal Institute of Technology, Stockholm

14.11.2019, 11:00 Uhr

23 October, 2019, Věra Kůrková, Institute of Computer Science, Czech Academy of Sciences

Mathematics of Data Seminar : Lower Bounds on Complexity of Shallow Networks

Věra Kůrková, Institute of Computer Science, Czech Academy of Sciences, Czech Republic

23.10.2019, 11:00 Uhr

18 September, 2019, Vladimir Temlyakov, University of South Carolina

Mathematics of Data Seminar : Supervised learning and sampling error of integral norms in function classes

Vladimir Temlyakov, University of South Carolina

18.09.2019, 11:00 Uhr

16 July, 2019, Lamiae Azizi, The University of Sydney

Mathematics of Data Seminar : A Mathematical trip into the Data Science realm

Lamiae Azizi, The University of Sydney

16.07.2019, 11:00 Uhr

This seminar is cancelled.

28 May, 2019, Nicolas Garcia Trillos, University of Wisconsin-Madison

Mathematics of Data Seminar : The use of geometry to learn from data, and the learning of geometry from data.

Nicolas Garcia Trillos, Department of Statistics, University of Wisconsin-Madison, USA

28.05.2019, 11:15 Uhr

10 April, 2019, Gabriel Peyré, CNRS and Ecole Normale Supérieure, Paris

Mathematics of Data Seminar : Computational Optimal Transport for Data Sciences

Gabriel Peyré, CNRS and Ecole Normale Supérieure, Paris, France

10.04.2019, 11:00 Uhr

07 March, 2019, Stefania Petra, Universität Heidelberg

Mathematics of Data Seminar : Compressed Sensing – From Theory To Practice

Stefania Petra, Universität Heidelberg

07.03.2019, 11:00 Uhr

14 February, 2019, Felix Krahmer, Technische Universität München

Mathematics of Data Seminar : Blind deconvolution with randomness – convex geometry and algorithmic approaches

Felix Krahmer, Technische Universität München

14.02.2019, 11:00 Uhr

28 January, 2019, Nils Bertschinger, Frankfurt Institute for Advanced Studies (FIAS)

Mathematics of Data Seminar : A geometric structure underlying stock correlations

Nils Bertschinger, Frankfurt Institute for Advanced Studies (FIAS), Germany

28.01.2019, 11:00 Uhr

08 November, 2018, Benjamin Fehrmann, University of Oxford

Mathematics of Data Seminar : Convergence rates for mean field stochastic gradient descent algorithms

Benjamin Fehrmann, University of Oxford

08.11.2018, 11:00 Uhr

27 September, 2018, Max von Renesse, Universität Leipzig

Mathematics of Data Seminar : Topics in Deterministic and Stochastic Dynamical Systems on Wasserstein Space

Max von Renesse, Universität Leipzig

27.09.2018, 11:00 Uhr

14 August, 2018, Afonso Bandeira, Courant Institute of Mathematical Sciences, New York

Mathematics of Data Seminar : Statistical estimation under group actions: The Sample Complexity of Multi-Reference Alignment

Afonso Bandeira, Courant Institute of Mathematical Sciences, New York

14.08.2018, 16:30 Uhr

11 July, 2018, Harald Oberhauser, University of Oxford

Mathematics of Data Seminar : Learning laws of stochastic processes

Harald Oberhauser, University of Oxford

11.07.2018, 15:30 Uhr

18 June, 2018, Anna Seigal, University of California, Berkeley

Mathematics of Data Seminar : Structured Tensors and the Geometry of Data

Anna Seigal, University of California, Berkeley

18.06.2018, 15:30 Uhr

14 May, 2018, Keyan Ghazi-Zahedi, MPI MIS, Leipzig

Seminar on Theory of Embodied Intelligence : Quantifying Morphological Computation

Keyan Ghazi-Zahedi, MPI MIS, Leipzig

14.05.2018, 14:00 Uhr

02 May, 2018, Steffen Lauritzen, University of Copenhagen

Mathematics of Data Seminar : Max-linear Bayesian networks

Steffen Lauritzen, University of Copenhagen, Denmark

02.05.2018, 11:00 Uhr

24 April, 2018, Benjamin Recht, University of California, Berkeley

Mathematics of Data Seminar : The statistical foundations of learning to control

Benjamin Recht, University of California, Berkeley

24.04.2018, 15:30 Uhr

22 March, 2018, Dimitri Marinelli, Romanian Institute of Science and Technology (RIST)

Information Geometry Seminar : Quantum Information Geometry and Boltzmann Machines

Dimitri Marinelli, Romanian Institute of Science and Technology (RIST), Romania

22.03.2018, 14:00 Uhr

08 January, 2018, Nihat Ay, MPI MIS, Leipzig

LikBez Seminar : Causal Inference II

Nihat Ay, MPI MIS, Leipzig

08.01.2018, 14:00 Uhr

04 December, 2017, Wolfgang Löhr, TU Chemnitz

Special Seminar : Continuum limits of tree-valued Markov chains and algebraic measure trees

Wolfgang Löhr, TU Chemnitz

04.12.2017, 11:00 Uhr

27 November, 2017, Fabio Bonsignorio, Scuola Superiore Sant’Anna, Pisa

Seminar on Theory of Embodied Intelligence : Modeling of Networked Embodied Cognitive Processes

Fabio Bonsignorio, Scuola Superiore Sant’Anna, Pisa, Italy

27.11.2017, 14:00 Uhr

10 November, 2017, Jun Zhang, University of Michigan-Ann Arbor

Information Geometry Seminar : Statistical Manifold and Entropy-Based Inference

Jun Zhang, University of Michigan-Ann Arbor, USA

10.11.2017, 11:45 Uhr

16 October, 2017, Luigi Malagò, Romanian Institute of Science and Technology (RIST)

Information Geometry Seminar : From Natural Gradient to Riemannian Hessian: Second-order Optimization over Statistical Manifolds

Luigi Malagò, Romanian Institute of Science and Technology (RIST), Romania

16.10.2017, 14:00 Uhr

10 May, 2017, Domenico Felice, University of Camerino

Information Geometry Seminar : Hamilton-Jacobi approach to Potential Functions in Information Geometry

Domenico Felice, University of Camerino, Italy

10.05.2017, 14:00 Uhr

25 November, 2016, František Matúš, Czech Academy of Sciences

Special Seminar : Polyquantoids and quantoids: quantum counterparts of polymatroids and matroids

František Matúš, Czech Academy of Sciences, Prague, Czech Republic

25.11.2016, 15:30 Uhr

28 June, 2016, Daniel Polani, University of Hertfordshire

Seminar on Theory of Embodied Intelligence : On Information and the Drivers of Cognition

Daniel Polani, University of Hertfordshire, United Kingdom

28.06.2016, 15:30 Uhr

10 May, 2016, Roy Fox, Hebrew University

Seminar on Theory of Embodied Intelligence : Minimum-Information Planning in Partially-Observable Decision Problems

Roy Fox, School of Computer Science and Engineering, Hebrew University, Israel

10.05.2016, 11:00 Uhr

30 March, 2016, Daniel Häufle, Stuttgart Research Center for Simulation Technology

Seminar on Theory of Embodied Intelligence : Musculo-Skeletal Models of Human Movement: Tools to Quantify Embodiment

Daniel Häufle, Stuttgart Research Center for Simulation Technology, University of Stuttgart, Germany

30.03.2016, 11:00 Uhr

21 January, 2016, Daniel Häufle, Stuttgart Research Center for Simulation Technology

Seminar on Theory of Embodied Intelligence : Musculo-Skeletal Models of Human Movement: Tools to Quantify Embodiment

Daniel Häufle, Stuttgart Research Center for Simulation Technology, University of Stuttgart, Germany

21.01.2016, 11:00 Uhr

This talk is canceled!

09 November, 2015, Tomas Veloz, University of British Columbia

Arbeitsgemeinschaft NEURONALE NETZE UND KOGNITIVE SYSTEME (Working Group on Neural Networks and Cognitive Systems) : Toward a Quantum Theory of Cognition: History, Development and Perspectives

Tomas Veloz, University of British Columbia, Canada

09.11.2015, 14:00 Uhr

08 April, 2015, Sajad Saeedinaeeni, Universität Leipzig

Special Seminar : On asymptotic optimality of ML-type detectors in quantum hypothesis testing

Sajad Saeedinaeeni, Universität Leipzig

08.04.2015, 14:00 Uhr

17 March, 2015, Peter Gmeiner, Friedrich-Alexander-Universität Erlangen-Nürnberg

Special Seminar : Information-Theoretic Cheeger Inequalities

Peter Gmeiner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany

17.03.2015, 14:00 Uhr

10 March, 2015, Benjamin Friedrich, Max-Planck-Institut für Physik komplexer Systeme, Dresden

Seminar on Theory of Embodied Intelligence : Intelligent motility control of biological swimmers

Benjamin Friedrich, Max-Planck-Institut für Physik komplexer Systeme, Dresden

10.03.2015, 11:00 Uhr

18 February, 2015, Ryszard Kostecki, Perimeter Institute for Theoretical Physics, Waterloo

Special Seminar : Quantum information geometry as a foundation for quantum theory beyond quantum mechanics

Ryszard Kostecki, Perimeter Institute for Theoretical Physics, Waterloo, Canada

18.02.2015, 14:00 Uhr

19 January, 2015, Oliver Brock, Technische Universität Berlin

Seminar on Theory of Embodied Intelligence : Towards an Alchemy of Intelligence

Oliver Brock, Technische Universität Berlin, Robotics and Biology Laboratory

19.01.2015, 11:00 Uhr

15 January, 2015, František Matúš, Academy of Sciences of the Czech Republic

Special Seminar : Algebraic Problems Related to Entropy Regions

František Matúš, Academy of Sciences of the Czech Republic, Institute of Information Theory and Automation

15.01.2015, 10:30 Uhr