To obtain a transparent and clear idea of what to expect when writing a thesis with us, please consider the thesis process detailed here.
There are two ways to determine a potential topic for a thesis with us:
Own topic and personal initiative: Students may propose their own topic and submit a detailed description of their proposed project and the methods they intend to use, emphasizing the close connection to the interests and research fields of the institute's scientific members. These interests are often related to our teaching activities and include machine learning, neural networks, reinforcement learning, control theory, information theory, robotics, statistical learning theory, information geometry, and algebraic statistics.
Topic offered by us: Students may choose one of the thesis topics offered on our website (see the topic descriptions below) and apply for it.
A natural and convenient way to learn about our research and teaching interests is to attend our lectures or seminars. However, you can generally also apply if you have not participated in our teaching activities. It is also worth looking at our team's publications to learn about the research interests of individual team members.
In both cases, you can apply by sending the following two documents to our office:
A transcript of the grades you have received so far. If you apply for a Master’s thesis, please also send us your Bachelor’s grades.
A filled-out version of this template, to provide us with a detailed description of your thesis ideas. If you select a topic offered on our website, please also fill out all sections of the template as completely as possible, so that we can see whether you have understood the topic correctly.
We will only accept applications in German or English (English preferred) that strictly follow the template (max. 2 pages) and include a transcript of your grades.
Based on your application documents, we will decide whether to consider your application further. It is unlikely that we will consider your application if we get the impression that you lack commitment. However, if you have prepared your template with care, there is a good chance that we will invite you for a personal interview and presentation (~10 min). Based on your presentation and the interview, we will make a final decision on whether to accept you as a candidate. Our decision will be based on how we perceive your commitment and eagerness, on your previous knowledge in the field of your chosen topic, and on how close your topic is to the research focus of our team. If you prepare your presentation well, and if the topic is close to the active research of at least one of our team members, there is a very good chance that we will accept your application.
We highly encourage you to write your thesis in LaTeX. Please use this template. We recommend that you copy the template directly into Overleaf and write your thesis there. This has the advantage that you do not need to install any software and that your supervisor can directly see the progress of your thesis and give feedback. However, you are generally free to use any text editor you like, as long as you use the same formatting as our template.
For your presentation(s) and the defense, we have a PowerPoint template here.
Supervisor: Dr. Manfred Eppe
Topic Description:
Reinforcement learning (RL) is often studied in virtual environments. This makes sense because physical hardware is expensive and often not available. However, to harness the full capabilities of RL, it must be applied to the real world. We therefore invite students to apply our virtual robotics lab Scilab-RL to real-world reinforcement learning applications. This can be done in collaboration with an industry partner, as long as the student uses our virtual lab Scilab-RL.
The student should have a clear idea of the real-world application. Most importantly, this includes knowledge of the sensory data the application receives and the actions it can take.
The focus of the work can also be on bridging the sim2real gap: in this case, the student will develop or integrate a simulated version of the application. If time permits, the student can then transfer the result to the real-world application, but it is also acceptable to keep the focus on the simulated environment only.
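As a rough illustration of what knowing the sensory data and actions amounts to in practice, here is a minimal sketch of a Gymnasium-style environment wrapper around a hypothetical hardware interface. All device methods (read_sensors, send_command, compute_reward, task_done) and the space dimensions are placeholders for illustration, not part of Scilab-RL.

```python
# Hypothetical sketch: wrapping a real-world application as a Gymnasium-style
# environment so that standard RL algorithms can interact with it.
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class RealWorldEnv(gym.Env):
    """Minimal environment skeleton for a real-world application (placeholder)."""

    def __init__(self, device):
        self.device = device  # placeholder interface to the real hardware
        # Sensory data: e.g. 6 floating-point sensor readings per step.
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(6,), dtype=np.float32)
        # Actions: e.g. 2 continuous motor commands in [-1, 1].
        self.action_space = spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        obs = np.asarray(self.device.read_sensors(), dtype=np.float32)
        return obs, {}

    def step(self, action):
        self.device.send_command(action)               # apply the action to the hardware
        obs = np.asarray(self.device.read_sensors(), dtype=np.float32)
        reward = float(self.device.compute_reward())   # task-specific reward, placeholder
        terminated = bool(self.device.task_done())
        return obs, reward, terminated, False, {}
```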
Requirements:
Extensive experience in Python programming
Experience with at least one Python-based neural-network library (TensorFlow or PyTorch)
Access to a decent laptop or computer. A gaming PC is great, but a recent notebook with at least an i5 processor or comparable CPU and 8 GB of memory is sufficient. A GPU is not required. The operating system must be Windows 11 (not 10!), macOS, or Ubuntu Linux.
Nice-to-have:
Experience in Reinforcement Learning
Experience with simulated or real robots
Experience with Git
Participation in our seminar “Introduction to RL” is encouraged
Literature:
Reinforcement Learning: An Introduction by Sutton and Barto: https://www.andrew.cmu.edu/course/10-703/textbook/BartoSutton.pdf
A first intuitive overview of SAC: https://spinningup.openai.com/en/latest/algorithms/sac.html
A first intuitive overview of PPO: https://spinningup.openai.com/en/latest/algorithms/ppo.html
Supervisor: Dr. Manfred Eppe
Topic Description:
Our virtual robotics lab Scilab-RL currently supports only single-agent training. We invite students to experiment with training multiple agents in the same environment. To this end, the student is expected to derive a multi-agent version of an existing RL algorithm such as PPO or SAC. The approach is then to be evaluated in a multi-agent environment, similar to the multi-agent hide-and-seek application by OpenAI: https://openai.com/research/emergent-tool-use
We have already implemented a simple multi-agent environment with two differential-drive robots that navigate through a maze. You can start your work with this and extend it if needed.
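One possible starting point, sketched below under the assumption of a parallel multi-agent interface that returns dictionaries keyed by agent id (PettingZoo-style), is independent learners: each agent keeps its own policy and is updated only from its own observations, actions, and rewards. This is merely a baseline scheme; the actual multi-agent extension of PPO or SAC is the subject of the thesis, and Scilab-RL's real API may differ.

```python
# Sketch of "independent learners" as a first multi-agent baseline (assumptions only).
def train_independent_agents(env, agents, episodes=100):
    """agents: dict mapping agent_id -> object with act() and update() methods."""
    for _ in range(episodes):
        observations, _ = env.reset()                  # {agent_id: observation}
        done = {aid: False for aid in agents}
        while not all(done.values()):
            # Each agent chooses its action from its own observation only.
            actions = {aid: agents[aid].act(observations[aid])
                       for aid in agents if not done[aid]}
            next_obs, rewards, terminations, truncations, _ = env.step(actions)
            for aid, agent in agents.items():
                if done[aid]:
                    continue
                # Each agent is trained only on its own transition.
                agent.update(observations[aid], actions[aid],
                             rewards[aid], next_obs[aid])
                done[aid] = terminations[aid] or truncations[aid]
            observations = next_obs
```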
Requirements:
Extensive experience in Python programming
Experience with at least one Python-based neural-network library (TensorFlow or PyTorch)
Access to a decent laptop or computer. A gaming PC is great, but a recent notebook with at least an i5 processor or comparable CPU and 8 GB of memory is sufficient. A GPU is not required. The operating system must be Windows 11 (not 10!), macOS, or Ubuntu Linux.
Nice-to-have:
Experience in Reinforcement Learning
Experience with simulated or real robots
Experience with Git
Participation in our seminar “Introduction to RL” is encouraged
Literature:
Reinforcement Learning: An Introduction by Sutton and Barto: https://www.andrew.cmu.edu/course/10-703/textbook/BartoSutton.pdf
A first intuitive overview of SAC: https://spinningup.openai.com/en/latest/algorithms/sac.html
A first intuitive overview of PPO: https://spinningup.openai.com/en/latest/algorithms/ppo.html
Supervisor: Carlotta Langer
Topic Description:
In our simple experimental setup, a simulated agent moves inside a racetrack while the information flows between the agent and its environment are evaluated with information-theoretic measures. At the moment, these agents learn to avoid touching the walls of the racetrack using an information-geometric algorithm. The implementation of the framework can be found in this GitHub repository.
In this project, the student would use the existing framework and additionally implement reinforcement learning algorithms. The impact of the different algorithms on the agents should then be assessed by analyzing the changes in behavior and in various information-theoretic measures.
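As a small, self-contained illustration of the kind of analysis involved (not code from the existing framework), the following sketch estimates the mutual information between discretized sensor readings and actions from a logged interaction. The variable names, the discretization into integer bins, and the toy interaction log are purely illustrative assumptions.

```python
# Sketch: empirical mutual information I(S; A) between discretized sensors and actions.
import numpy as np


def mutual_information(sensor_bins, action_bins):
    """Empirical mutual information (in bits) of two discrete integer sequences."""
    joint, _, _ = np.histogram2d(sensor_bins, action_bins,
                                 bins=(sensor_bins.max() + 1, action_bins.max() + 1))
    p_sa = joint / joint.sum()                    # joint distribution
    p_s = p_sa.sum(axis=1, keepdims=True)         # marginal over sensors
    p_a = p_sa.sum(axis=0, keepdims=True)         # marginal over actions
    nonzero = p_sa > 0
    return float(np.sum(p_sa[nonzero] * np.log2(p_sa[nonzero] / (p_s @ p_a)[nonzero])))


# Example: a random interaction log with 8 sensor bins and 4 discrete actions,
# where the action is partially determined by the sensor reading.
rng = np.random.default_rng(0)
sensors = rng.integers(0, 8, size=10_000)
actions = (sensors // 2) % 4
print(mutual_information(sensors, actions))
```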
Requirements:
Nice-to-have:
Literature:
Supervisor: Adwait Datar
Topic Description:
Embodied intelligence, defined in (Roy et al., 2021) as the purposeful exchange of energy and information with a physical environment, is an active area of research. The physical environment here may be either the real world with real physical constraints or a simulated environment with artificially enforced constraints. A natural question that arises is how to choose a controller architecture that is well suited to the physical constraints of the agent. The choice of architecture depends on the span of possible movements as well as on the class of desired behaviours expected from the agent. (Ay, 2015) proposes geometric design principles to address some of these questions in a very general setting.
In this project, we follow up on these ideas and apply them to concrete examples of mobile robots with neural-network architectures. An example of a desired behaviour for mobile robots is the ability to move from one arbitrary position in space to another. We start by looking at agents with simple linear physical constraints and progressively increase the complexity of the constraints (for example, by introducing non-holonomic constraints). For each class of physical constraints, we study a variety of control architectures and investigate if and how knowledge of the physical constraints can be incorporated into the design of a suitable controller architecture. A concrete project revolving around these ideas will be formalized in discussion with the student, based on the format of the thesis (project work / Bachelor's thesis / Master's thesis) and the student's interests.
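To make the setting concrete, the sketch below implements a standard unicycle model as an example of a non-holonomic constraint (the robot cannot move sideways), together with a tiny feedforward network that maps a goal-relative state to bounded velocity commands. The architecture, state encoding, and random weights are illustrative assumptions only; in the project, the controller parameters would be learned or designed rather than drawn at random.

```python
# Sketch: non-holonomic unicycle dynamics driven by a small neural-network controller.
import numpy as np


def unicycle_step(state, control, dt=0.05):
    """state = (x, y, theta); control = (v, omega); forward/turn motion only."""
    x, y, theta = state
    v, omega = control
    return np.array([x + dt * v * np.cos(theta),
                     y + dt * v * np.sin(theta),
                     theta + dt * omega])


def controller(state, goal, weights):
    """One-hidden-layer network mapping the goal error to (v, omega)."""
    w1, b1, w2, b2 = weights
    features = np.array([goal[0] - state[0], goal[1] - state[1],
                         np.sin(state[2]), np.cos(state[2])])
    hidden = np.tanh(w1 @ features + b1)
    return np.tanh(w2 @ hidden + b2)      # bounded velocity commands in (-1, 1)


# Roll out the closed loop with (untrained) random controller weights.
rng = np.random.default_rng(0)
weights = (rng.normal(size=(16, 4)), np.zeros(16), rng.normal(size=(2, 16)), np.zeros(2))
state, goal = np.zeros(3), np.array([1.0, 1.0])
for _ in range(100):
    state = unicycle_step(state, controller(state, goal, weights))
```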
Requirements:
Nice-to-have:
Literature:
Supervisor: Adwait Datar
Topic Description:
It is well known that learning dynamics in recurrent neural networks (RNNs) suffer from a variety of problems such as vanishing and exploding gradients. In order to make progress towards understanding some of these problems, some recent works (Li et al., 2020, Hardt et al., 2018) study linear RNNs which are essentially linear time-invariant (LTI) systems.
System identification (Ljung, 1998; Qin, 2006) of LTI systems is a well-developed field with powerful tools at our disposal. In particular, subspace identification tools allow us to identify state-space models from data without relying on iterative techniques such as gradient descent.
This project takes these ideas up by investigating and comparing the performance of subspace identification methods with that of iterative methods such as gradient descent on benchmark examples. The key aspects of this investigation include:
- Computational complexity: How does computational cost depend on the system size?
- Sample complexity: How is learning improved as data-size increases?
- Speed of convergence (learning) with gradient descent learning
- Effect of different parameterizations: What effect does parameterization have on gradient descent learning?
A concrete project revolving around these ideas will be formalized in discussion with the student, based on the format of the thesis (project work / Bachelor's thesis / Master's thesis) and the student's interests.
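For intuition about the non-iterative flavour of such methods, the following numpy sketch performs a Ho-Kalman-style realization of a small LTI system from its impulse response (Markov parameters) via an SVD of the associated Hankel matrix. Full subspace identification from general input-output data involves additional steps, so this is only an illustrative simplification with made-up system matrices.

```python
# Sketch: Ho-Kalman-style realization of an LTI system from its Markov parameters.
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 10  # true state dimension, Hankel block size

# Ground-truth SISO system: x_{t+1} = A x_t + B u_t,  y_t = C x_t.
A = np.diag([0.9, 0.5, -0.3])
B = rng.normal(size=(n, 1))
C = rng.normal(size=(1, n))

# Markov parameters (impulse response) h_k = C A^{k-1} B for k = 1..2m.
markov = [C @ np.linalg.matrix_power(A, k - 1) @ B for k in range(1, 2 * m + 1)]

# Hankel matrix of Markov parameters and its SVD.
H = np.block([[markov[i + j] for j in range(m)] for i in range(m)])
U, s, Vt = np.linalg.svd(H)
r = int(np.sum(s > 1e-8))                       # estimated state dimension
O = U[:, :r] * np.sqrt(s[:r])                   # observability factor
Q = np.sqrt(s[:r])[:, None] * Vt[:r]            # controllability factor
C_hat, B_hat = O[:1, :], Q[:, :1]
A_hat = np.linalg.pinv(O[:-1, :]) @ O[1:, :]    # shift-invariance of O

# The identified (A_hat, B_hat, C_hat) reproduces the impulse response up to similarity.
h_hat = [C_hat @ np.linalg.matrix_power(A_hat, k - 1) @ B_hat for k in range(1, 2 * m + 1)]
print(np.max(np.abs(np.array(markov) - np.array(h_hat))))
```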
Requirements:
Nice-to-have:
Literature:
Supervisor: Jan Benad
Topic Description:
In reinforcement learning, an agent gathers experience by interacting with its environment. This data is stored in a so-called replay buffer and reused later for training the agent. Usually, experience transitions are sampled uniformly from the replay buffer, regardless of their significance. Prior work (Schaul et al., 2016), however, showed that some form of prioritization can be beneficial.
We consider environments that change over time, so experience gained early on quickly becomes outdated. Prioritized sampling therefore seems advisable. For exactly this scenario, the student is expected to develop a sampling mechanism that takes changes in the environment into account.
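As one possible, purely hypothetical starting point rather than the expected solution, the sketch below implements a replay buffer whose sampling probabilities decay exponentially with the age of a transition, so that data collected before the environment changed is replayed less often.

```python
# Sketch: a replay buffer that biases sampling towards recent transitions.
import numpy as np


class RecencyWeightedBuffer:
    def __init__(self, capacity, decay=0.999):
        self.capacity, self.decay = capacity, decay
        self.transitions, self.insert_step = [], []
        self.step = 0

    def add(self, transition):
        """Store a transition; drop the oldest one when the buffer is full."""
        self.step += 1
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)
            self.insert_step.pop(0)
        self.transitions.append(transition)
        self.insert_step.append(self.step)

    def sample(self, batch_size, rng=None):
        """Sample a batch; older transitions get exponentially smaller probability."""
        if rng is None:
            rng = np.random.default_rng()
        age = self.step - np.asarray(self.insert_step)
        probs = self.decay ** age
        probs /= probs.sum()
        idx = rng.choice(len(self.transitions), size=batch_size, p=probs)
        return [self.transitions[i] for i in idx]
```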
Requirements:
Nice-to-have:
Literature:
Supervisor: Frank Röder
Topic Description:
Intelligent agents require predictive capabilities to forecast future states of the world and plan ahead before acting. Current methods in reinforcement learning (RL) utilize learned models of the world to solve a multitude of challenging problems [1-8]. However, it remains unclear how to model the world accurately without capturing unnecessary details, while retaining enough information for effective learning through imagination and advance planning.
This thesis offers students the opportunity to work on projects related to model-based RL methods that approximate a model of the world to improve overall agent learning. The focus can be on applying a specific model-based method to a new problem domain or reimplementing a prior idea that could be enhanced by refining the mathematical formulation of the objective.
We provide a selection of reference materials related to decision-time planning [1, 3] and background planning [2, 4-7]. Further ideas include exploring the mathematics involved in latent dynamic models [2, 3, 5, 7] or comparing the architectures discussed in the provided literature.
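For orientation, the following PyTorch sketch shows the simplest ingredient shared by many of these methods: a one-step forward model trained by regression on logged transitions. The dimensions and the toy transition data are placeholders; latent world models in the cited literature add encoders, stochastic latents, and multi-step objectives on top of this basic idea.

```python
# Sketch: training a one-step forward model f(s_t, a_t) -> s_{t+1} by regression.
import torch
import torch.nn as nn

state_dim, action_dim = 4, 2
forward_model = nn.Sequential(
    nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, state_dim),
)
optimizer = torch.optim.Adam(forward_model.parameters(), lr=1e-3)

# Placeholder transitions; in practice these come from the agent's replay buffer.
states = torch.randn(1024, state_dim)
actions = torch.randn(1024, action_dim)
next_states = states + 0.1 * actions.sum(dim=1, keepdim=True)  # toy dynamics

for _ in range(200):
    pred = forward_model(torch.cat([states, actions], dim=1))
    loss = nn.functional.mse_loss(pred, next_states)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```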
Requirements:
Nice-to-have:
Literature:
Student: Leon Sierau
Supervisor: Dr. Manfred Eppe
Thesis type: B.Sc. thesis
Date: November 2024
Abstract:
This thesis investigates the ability of forward models, implemented as multi-layer perceptrons (MLPs), to learn deterministic transition dynamics in various simulated reinforcement learning (RL) environments. Through supervised learning, different MLP architectures (deterministic MLPs, probabilistic MLPs, and ensembles of deterministic MLPs) are evaluated in environments with dynamics of increasing complexity. The results indicate that, while simple MLPs can accurately represent the transition dynamics in low- to medium-complexity environments, their performance decreases significantly in more challenging environments, where even extensive training does not result in an accurate representation of the transition dynamics.
A key finding is that MLPs seem capable of learning basic state transitions but struggle to represent contextual features such as obstacles, which are critical for complex tasks. The experiments further demonstrate that there is no one-size-fits-all solution for training forward models; the choice of architecture and training method depends heavily on the specific environment and task.
Recursively applying one-step models leads to significant compounding errors, even in simpler environments, rendering this approach less effective for long-horizon planning. However, ensembles of models show potential in stabilizing compounding errors and could play an essential role in controlling them.
Our results provide some support for an association between the predictive disagreement of a model ensemble and the ensemble's prediction error. Yet, this evidence remains inconclusive and on its own does not justify using increasing ensemble disagreement as an indication of worsening predictions.
Student: Moustafa Alsayd Ahmad
Supervisor: Prof. Dr. Nihat Ay
Thesis type: B.Sc. thesis
Date: November 2024
Abstract:
This thesis investigates the modeling capacity of Restricted Boltzmann Machines (RBMs) under different configurations, in particular with respect to the number of hidden neurons and label neurons as well as the influence of the Hamming distance. The aim of the thesis is to examine to what extent an RBM is able to learn the underlying data distribution and to generate high-quality data, even when the number of hidden neurons is strongly reduced.
For the experiments, a self-created 4x4-pixel dataset as well as the MNIST dataset were used. Models with varying numbers of hidden neurons were trained in order to assess the influence of reducing the model capacity. The results show that a larger number of hidden neurons improves the quality of the generated data, but that models with a reduced number of hidden neurons can also deliver stable results of acceptable quality.
In addition, the influence of the number of label neurons and of the Hamming distance on the generated data was examined. Models with more label neurons and a larger Hamming distance achieved higher-quality results, as they allowed a better differentiation between the digits.
The thesis shows that RBMs are a promising method for modeling and generating data, where the correct choice of parameters such as the number of hidden and label neurons as well as the Hamming distance is of decisive importance for generating high-quality data.
Student: Fin Michael Armbrecht
Supervisor: Frank Röder
Thesis type: B.Sc. thesis
Date: February 2024
Abstract:
The field of machine learning has received a lot of attention recently, with revolutionary breakthroughs achieved, for example, in autonomous driving. But it is not only autonomous vehicles that work with machine learning and can be improved by it; many industrial processes and production systems can benefit from it as well. Researchers are therefore constantly looking for new ways to improve and accelerate the learning processes of artificial intelligence and to make them more skillful and efficient by imitating the way humans learn. The ability to remember past events is one of the keys to this learning process. Reinforcement learning exploits this by interacting with the environment to achieve valuable learning results and, through the addition of Hindsight Experience Replay, by remembering past events. The approach of this thesis is to evaluate the results achieved so far and to improve the learning efficiency of artificial intelligence with the help of a newly developed prioritization process, which selects important events more often for replay. Our results show that the newly developed prioritization process has a significant impact on the learning efficiency of artificial intelligence and outperforms other existing state-of-the-art approaches.
Student: Luis Scheuch
Supervisor: Prof. Dr. Nihat Ay
Thesis type: B.Sc. thesis
Date: October 2023
Abstract:
This thesis examines the information content as well as the context size of English and Aviation English using common measures from information theory, such as entropy, mutual information, and conditional entropy. It then interprets the impact on the prediction network (PN) of the current state-of-the-art speech-to-text (STT) deep learning architecture RNN-Transducer (RNN-T). The corpora used are the Corpus of Contemporary American English (COCA), the Cornell Movie-Dialogs (CMD) corpus, and an internal air traffic control (ATC) corpus, which are presented and critically analyzed for their relevance and statistical significance. The results are used to estimate the context size of Aviation English and whether Aviation English is easier to predict than common English. It is analyzed to what extent English and Aviation English are comparable and what this might imply for the transferability of research from other domains to ATC. Following that, I suggest a modification to the RNN-T PN architecture proposed by Albesano, Andrés-Ferrer, Ferri, et al., which is optimized for natural language, in order to improve its performance in the ATC domain. Additionally, I investigate the structure of ATC speech in comparison with the structure of English by modifying the given ATC corpus and analyzing the effect of introduced placeholders.
The main results are that Aviation English is generally easier to predict than common English, that the internal ATC corpus has a most meaningful context size of 9 words, and that results from other domains cannot be directly transferred to ATC.
Student: Shreya Purkayastha
Supervisor: Prof. Dr. Nihat Ay
Thesis type: M.Sc. thesis
Date: July 2023
Abstract:
With the increasing prevalence of short texts such as tweets and search queries on the internet, there is a growing interest in analyzing these texts to extract valuable insights. However, analyzing short texts comes with its own set of challenges, including sparseness, non-standardization, and noise. This research study focuses on exploring topic modelling techniques for short texts using two algorithms: Latent Dirichlet Allocation (LDA) and Neural Topic Model (NTM). The study assesses the performance of these algorithms using four different topic coherence measures: C_V, C_UCI, C_UMass, and C_NPMI.
In order to compare the effectiveness of the algorithms, it is crucial to select an appropriate topic coherence measure. The study reveals that the coherence metric C_V is not reliable for evaluating topic modelling on short texts, while C_UCI, C_UMass, and C_NPMI are considered reliable measures. Therefore, it is recommended to use any of these three reliable coherence metrics, or a combination of them, when evaluating topic modelling on short texts. Based on the findings, the study concludes that LDA is a more suitable algorithm for topic modelling on short texts than NTM.
Student: Stella Wit
Supervisor: Prof. Dr. Nihat Ay
Thesis type: B.Sc. thesis
Date: May 2023
Student: Nico Bartocha
Supervisor: Prof. Dr. Nihat Ay
Thesis type: B.Sc. thesis
Date: March 2023
Abstract:
This thesis examines the possibilities of applying information-theoretic concepts to neural networks in order to make data-processing procedures more efficient. Discrete neural networks have the property that, via different construction methods, they can solve every classification problem on arbitrary data sets. At the same time, the typical set is a concept from information theory that makes it possible to reduce the size of possible data sets while retaining a very high probability of occurrence. It is obtained with the help of the entropy of stationary ergodic processes. These insights are used to construct one- and two-layer neural networks that can represent every classification of the typical set. Using the properties of the typical set, it is then shown that this procedure can approximate classifications of the entire data set. The different construction methods are compared by contrasting their ability to classify general data sets as well as their numbers of hidden neurons. Finally, it is shown that, thanks to the reduced number of hidden neurons, theoretical bounds on the size of the neural network required for universal approximation can be undercut considerably for processes with low entropy.
Student: Luis Pohl
Supervisor: Dr. Manfred Eppe
Thesis type: B.Sc. thesis
Date:
Abstract:
The aim of this thesis is to find an appropriate method and a corresponding norm to capture the distance (dissimilarity) between one recently solved evaluation episode and n previously solved evaluation episodes in the field of goal-conditioned reinforcement learning (GCRL). Using this distance metric, the generalizability of the agent is inferred. The insights gained through the distance metric can be used to make predictions about the agent's ability to solve similar future problems. Furthermore, by dividing the distance metric by the number of steps the agent took to solve the problem, the transfer-learning velocity is computed, quantifying the difficulty of the generalization. Based on the performance metric and background data, it is possible to determine whether the current configuration underperforms or overperforms in transfer learning, making it possible to track whether adjustments improve or worsen the generalizability. To address this, three methods to capture the distance were devised: Mean, Nearest, and Eppe. They rest on two distinct perspectives: whether it is more important that the agent has solved a problem that is closest to the new task, or whether the distance should be gauged relative to the mean of all n previously solved points (the latter covering two of the methods). Additionally, three different norms were employed (Euclidean, Manhattan, cosine similarity) to quantify the distance between the problems. After investigation, the conclusion is that the most appropriate combination of method and norm is the Nearest method with the Euclidean norm. This indicates that leveraging n previously solved similar problems and measuring their distance using the Euclidean norm can enable faster problem-solving.