Abstract
Modern advances in Natural Language Processing, driven primarily by Large Language Models, open the door not only to a plethora of AI-driven applications, but also to the ability to produce such applications at or near the industry standard without industrial-grade computational resources. In particular, GPU memory requirements have dropped drastically as a result of recent advances. However, it is not always clear what these improvements imply for smaller organizations planning to work with Large Language Models. We review the extent to which consumer-grade hardware is capable of training language models of varying parameter counts and analyze the performance of the resulting models. Furthermore, we share our experience regarding training methodology and evaluation measures through an intuitive practical use case.
Abstract
Understanding how electromagnetic (EM) fields interact with biological tissues is crucial for various applications in healthcare and technology, such as medical imaging and wireless communication systems. Traditionally, experimental and computational methods have been used to study human exposure limits, but these approaches have limitations. More recently, machine learning (ML) methods have emerged as a promising avenue to address the computational challenges. In this talk, we present how artificial neural networks (ANNs) and Gaussian process regression (GPR) can be applied to predict specific absorption rates (SAR) in human head models under uncertainties in the tissues' electrical properties. The optimization of the networks and the uncertainty estimation of the predictions are also carried out.
Abstract
Following the successful application of machine learning methods to predict brake squeal as a classification task, this contribution addresses the transfer of those methods to particle emission data in order to correctly predict brake particle emissions as a regression task. First results proving the transferability of those methods will be presented.
Deep learning prediction models are generated for particle emission data sets acquired from pin-on-disk experiments. Given the brake system loading sequences, the neural architectures predict the amount of PM10 and PM2.5 particle emissions in #/cm³, measured by an Engine Exhaust Particle Sizer (EEPS) spectrometer and an Optical Particle Sizer (OPS) device. Across the different experiments, 27 different input measurement dimensions are available, while the particle output is measured in #/cm³ in 48 different particle-size bins from 0.3 nm to 10 µm. Our study analyses the overall performance of optimal prediction methods, as obtained through hyperparameter studies, and compares whether qualitative differences in the prediction tasks can be read from the respective neural prediction models. Prediction tasks can differ along the time dimension, i.e. by the length of the input and output sequences, as well as along the size-resolution dimension of the output, i.e. whether 48 or only 2 particle-size bins are predicted. Applying a greedy backward elimination method, the results are used to identify the key physical parameters, i.e. measurement channels, that are required for accurate predictions and that contribute substantially to understanding the system's particle emissions.
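To make the elimination step concrete, the following is a minimal sketch of a greedy backward elimination loop; the model class, scoring setup and channel handling are illustrative placeholders and not taken from the study itself.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score

def greedy_backward_elimination(X, y, channel_names, min_channels=5):
    """Iteratively drop the measurement channel whose removal hurts
    cross-validated performance the least (illustrative sketch)."""
    selected = list(range(X.shape[1]))
    history = []
    while len(selected) > min_channels:
        scores = {}
        for ch in selected:
            remaining = [c for c in selected if c != ch]
            model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500)
            # Negative MSE: higher is better.
            scores[ch] = cross_val_score(
                model, X[:, remaining], y, cv=3,
                scoring="neg_mean_squared_error").mean()
        cheapest = max(scores, key=scores.get)   # channel that costs least to remove
        selected.remove(cheapest)
        history.append((channel_names[cheapest], scores[cheapest]))
    return selected, history
```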
The final objective of this endeavor is to include data-driven NVH and particle emission prediction modules into a next-generation brake control strategy for electric vehicles for emission-reduced and energy-efficient braking. The systematic study across different system integration levels is therefore a fundamental building block for integrating machine learning-based intelligence into future brake systems.
Abstract
The talk presents a new technique for unsupervised learning of repeatedly occurring process states from a suite of time series derived from preprocessed sensor data recorded from a fixed process. As a first application we consider the process of moving a good along a path in an industrial environment. The goal is to identify individual sections of the path while they are being traversed. The technique determines thresholds in the time series that lead to the same succession of increasing and decreasing crossings for all paths of the training data. The trained model is a so-called "threshold tree". It consists of thresholds for the different time series that split a path into the sections to be recognized. The execution of threshold trees has a low CPU and memory footprint, allowing their use on microcontrollers, e.g. in embedded systems. Due to their intuitive comprehensibility, threshold trees belong to the category of explainable AI.
Abstract
Most problems from classical machine learning can be cast as an optimization problem. I will present GENO (GENeric Optimization), a framework that lets the user specify a constrained or unconstrained optimization problem in an easy-to-read modeling language. GENO then generates a solver that can solve this class of optimization problems. The generated solver is usually as fast as hand-written, problem-specific, and well-engineered solvers. Often the solvers generated by GENO are faster by a large margin compared to recently developed solvers that are tailored to a specific problem class. I will dig into some of the algorithmic details, e.g., computing derivatives of matrix and tensor expressions, the optimization methods used in GENO, and their implementation in Python.
Abstract
In this talk, I explore the role of ethics in the development of AI and advanced machine learning. I argue that ethics is deeply integrated into powerful AI systems so that one cannot easily remove it without serious impairment of other aspects of the system’s intelligence and problem-solving capacities. On this basis, I develop a novel and more radical framework for ethics by design.
Abstract
Clustering techniques are crucial for uncovering hidden structures and patterns within datasets. In this talk, I present a competition-based partitioning algorithm designed to detect latent functional characteristics and effectively group data points. This algorithm forms the foundation of a modular modeling framework, which assigns specialized expert models to each identified partition. I will benchmark the performance of this innovative approach against traditional single-network models.
As this research progresses through the review process, my focus shifts to exploring the capabilities of Large Language Models (LLMs) in generalizing solutions for complex engineering problems. I will share recent insights and outline future research objectives, including an examination of the potential synergy between LLMs and our partitioning methodology. This integration could open new avenues for enhanced data analysis and model performance.
Abstract
In human-AI collaboration, the human and the AI agent often take on different responsibilities: one of the two is more in charge of guiding the decision, while the other is in charge of making the actual choices. Following this idea, we consider the scenario of an asymmetric collaboration of two agents aiming to take actions together within an environment modeled by a Markov Decision Process (MDP).
The first agent acts as a supervisor that provides the other agent with suggested actions. Meanwhile, the second agent acts as an executor that is in charge of choosing the action to execute in the environment. We model the executor's choice as dependent on a single parameter ϕ ∈ [0, 1], which represents the probability of the agent executing the supervisor's action rather than following its own policy. Under the assumption that the supervisor has access to an optimal policy, we investigate theoretical implications of this model and apply it in practice to a toy problem.
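Stated as a formula, with notation chosen here for illustration (π* the supervisor's optimal policy, π_E the executor's own policy), the effective behavior policy is the mixture

```latex
\pi_{\phi}(a \mid s) \;=\; \phi\,\pi^{*}(a \mid s) \;+\; (1-\phi)\,\pi_{E}(a \mid s),
\qquad \phi \in [0, 1],
```

so that ϕ = 1 recovers pure execution of the supervisor's suggestions and ϕ = 0 ignores them entirely.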
This talk presents some preliminary results on this topic and discusses ideas for future research.
Abstract
The monitoring of vital signs and increasing patient comfort are cornerstones of modern neonatal intensive care. Commonly used monitoring methods are based on skin contact, which can cause irritation and discomfort in preterm neonates. Therefore, non-contact approaches are the subject of current research aiming to resolve this dichotomy. Robust neonatal face detection is essential for the reliable detection of heart rate, respiratory rate and body temperature. While solutions for adult face detection are established, the unique neonatal proportions require a tailored approach. Additionally, sufficient open-source data of neonates on the NICU is lacking. We set out to train neural networks with the thermal-RGB-fusion data of neonates. We propose a novel indirect fusion approach including the sensor fusion of a thermal and an RGB camera based on a 3D time-of-flight (ToF) camera. Unlike other approaches, this method is tailored for the close distances encountered in neonatal incubators. Two neural networks were used with the fusion data and compared to RGB and thermal networks. For the class "head" we reached average precision values of 0.9958 (RetinaNet) and 0.9455 (YOLOv3) for the fusion data. Compared with the literature, similar precision was achieved, but we are the first to train a neural network with fusion data of neonates. The advantage of this approach lies in calculating the detection area directly from the fusion image for both the RGB and thermal modalities. This increases data efficiency by 66%. Our results will facilitate the future development of non-contact monitoring to further improve the standard of care for preterm neonates.
Abstract
Many real-world projects aim at finding optimal solutions to a specific problem within a given search space. The optimization task can be hard in itself, but often the problem function is not even known. In such cases, it is necessary to experimentally test possible solutions for their appropriateness. In many domains, such as materials science, these tests are expensive and time-consuming. Machine learning is therefore a technique to bridge this gap and give hints on the performance of a proposed solution. In this talk, I will delve into the problem of surrogate functions, how they can be learned, and how their prediction quality can be used to steer the optimization process. I will demonstrate this approach using EvoAl, a DSL-based optimization framework.
Abstract
This research is concerned with building machine learning (ML) models to predict dynamic ditching loads on aircraft fuselages. The employed learning procedure is structured into two parts, the reconstruction of the spatial loads using a convolutional autoencoder (CAE) and the transient evolution of these loads in a subsequent part. Both parts are learned simultaneously (jointly) in a global network. To predict the transient load evolution, the CAE is combined with either different long short-term memory (LSTM) networks or a Koopman-operator based method. To this end, both approaches advance the solution in time based on information from the two previous and the present time step. The training data is compiled by applying an extension of the momentum method of von Kármán and Wagner to simulate the loads on a generic DLR-D150 fuselage model at various approach conditions. Results indicate that both baseline methods, i.e., the LSTM and the Koopman-based approach, are able to perform accurate ditching load predictions. Predictive differences occur when looking at the different options for capturing the temporal evolution of loads and will be outlined in greater detail.
Abstract
Data is a valuable tool for decision-makers, helping them make informed decisions. We can find multivariate time series in several contexts, such as finances, smart cities, and health. This type of data can bring additional challenges. This presentation will discuss the key concepts and techniques involved in working with multivariate time series data. Specifically, we will focus on the steps of data processing, imputation, and forecasting.
Abstract
A selection of works connecting AI, light-matter interactions, and dynamical systems theory will be presented, as well as related problems where AI could help us in the future. Light-matter interactions are considered in photonic crystals and metamaterials, "real" crystals irradiated by lasers, and artificial "crystals of light". Can we repeatedly drop a laser from the top of the Bremen tower? (And why?) Can we design a particle accelerator on the tip of a pen? Can we make interstellar travel possible, at least to nearby stars? These are some of the main questions I hope to consider. If time allows, I will share my recent experience of working with Bosch Research and studying at the DESY Startup School (where we designed a startup that shall be not #LikeABosch, but even better!). Optional questions are: Can AI predict a failure of a coffee machine, a particle accelerator, or the International Space Station? What about predicting a catastrophic earthquake, or the collapse of society?
Abstract
In analogy to the use of normalizing flows to augment the expressivity of base probability distributions, I propose to augment the expressivity of bases of Hilbert spaces via composition with normalizing flows. I show that the resulting sequences are also bases of the Hilbert space under necessary and sufficient conditions on the flow. This lays a foundation for a theory of spectral learning, a nonlinear extension of spectral methods for solving differential equations. As an application I solve the vibrational molecular Schrödinger equation. The proposed numerical scheme results in several orders of magnitude higher accuracy than the use of standard spectral methods.
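As a hedged illustration of the underlying change-of-variables idea (notation chosen here, not necessarily the construction used in the talk): if {φ_n} is an orthonormal basis of L²(ℝ^d) and T is a smooth invertible flow, then the transformed functions

```latex
\psi_n(x) \;=\; \varphi_n\!\big(T(x)\big)\,\big|\det \nabla T(x)\big|^{1/2}
```

remain orthonormal, since ∫ ψ_m(x) ψ_n(x) dx = ∫ φ_m(y) φ_n(y) dy after substituting y = T(x).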
Abstract
Due to the high number of rivet holes per aircraft produced, automated monitoring of the drilling process promises a significant reduction in manual inspection. Advances in the sensor technology of new machine tools are greatly expanding the available data basis. Thus, self-learning methods can be applied to holistic process monitoring.
In this presentation, the authors present approaches to anomaly detection and quality control in the drilling process. Supervised, semi-supervised and unsupervised methods were used for anomaly detection and compared with classical quality control chart methods. In addition to engineered feature extraction, a new method was used to extract features using a CNN. For predicting the quality of the parts, different classification and regression methods were compared, yielding different results in terms of prediction quality.
Abstract
Fatigue is the main cause of structural failure in large engineering structures. Welds, whose geometry leads to high local stresses, are especially vulnerable. Traditional fatigue assessment methods, which factor in material properties, load levels, and idealized weld geometries, can be inaccurate. To address this, data-driven approaches using machine learning (ML) algorithms and 3D laser scanners for capturing weld geometry have been successful in predicting fatigue life for butt-welded joints; however, it remains uncertain whether these methods are adaptable to different welding techniques and to welds with imperfections. This presentation addresses the generalizability of machine learning approaches for fatigue strength assessment of welded joints by assessing data that differs from the training dataset in various ways. The new data contains results for a different welding procedure and for welded joints with imperfections and weld defects. By comparing prediction accuracies between the original data and the new data, the study aims to determine the adaptability of the data-driven approach to new, divergent data. The focus is on assessing how anomalous weld geometries impact prediction accuracy, ultimately establishing the limitations of applying this method to varying data. To this end, explainable artificial intelligence is applied.
Abstract
Parallel-in-time algorithms provide an additional layer of concurrency for the numerical integration of models based on time-dependent differential equations. Methods like Parareal, which parallelize across multiple time steps, rely on a computationally cheap and coarse integrator to propagate information forward in time, while a parallelizable expensive fine propagator provides accuracy. Typically, the coarse method is a numerical integrator using lower resolution, reduced order or a simplified model. Our paper proposes to use a physics-informed neural network (PINN) instead. We demonstrate for the Black-Scholes equation, a partial differential equation from computational finance, that Parareal with a PINN coarse propagator provides better speedup than a numerical coarse propagator. Training and evaluating a neural network are both tasks whose computing patterns are well suited for GPUs. By contrast, mesh-based algorithms with their low computational intensity struggle to perform well. We show that moving the coarse propagator PINN to a GPU while running the numerical fine propagator on the CPU further improves Parareal’s single-node performance. This suggests that integrating machine learning techniques into parallel-in-time integration methods and exploiting their differences in computing patterns might offer a way to better utilize heterogeneous architectures.
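For reference, the Parareal iteration sketched above can be written in standard notation (G the coarse propagator, F the fine propagator, U at time step n and iteration k; symbols chosen here for illustration):

```latex
U_{n+1}^{k+1} \;=\; \mathcal{G}\big(U_{n}^{k+1}\big) \;+\; \mathcal{F}\big(U_{n}^{k}\big) \;-\; \mathcal{G}\big(U_{n}^{k}\big)
```

The fine propagations F(U_n^k) over all time slices can run in parallel, while the cheap coarse sweep G is serial; the work above replaces G by a PINN.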
Abstract
This presentation addresses the challenge of sample inefficiency in robotic reinforcement learning with sparse rewards and natural language goal representations. We introduce a mechanism for hindsight instruction replay, leveraging expert feedback, and a seq2seq model for generating linguistic hindsight instructions. Remarkably, our findings demonstrate that self-supervised language generation, where the agent autonomously generates linguistic instructions, significantly enhances learning performance. These results underscore the promising potential of hindsight instruction grounding in reinforcement learning for robotics.
Abstract
In the realm of particle physics, large amounts of data are produced in particle collision experiments such as those at the CERN Large Hadron Collider (LHC) to explore the subatomic structure of matter. Simulations of the particle collisions are needed to analyse the data recorded at the LHC. These simulations rely on Monte Carlo techniques to handle the high dimensionality of the data. Fast simulation methods (FastSim) have been developed to cope with the significant increase in data that will be produced in the coming years, providing simulated data 10 times faster than the conventional simulation methods (FullSim) at the cost of reduced accuracy. The currently achieved accuracy of FastSim prevents it from replacing FullSim.
We propose a machine learning approach to refine high-level observables reconstructed from FastSim with a regression network inspired by the ResNet approach. We combine the mean squared error (MSE) loss and the maximum mean discrepancy (MMD) loss. The MSE (MMD) compares pairs (ensembles) of data samples. We examine the strengths and weaknesses of each individual loss function and combine them as a Lagrangian optimization problem.
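One hedged way to write this combination as a Lagrangian, with all symbols (refiner f_θ, FastSim inputs x_i, FullSim targets y_i, multiplier λ, tolerance ε) chosen here for illustration rather than taken from the paper:

```latex
\min_{\theta}\;\frac{1}{N}\sum_{i=1}^{N}\big\|f_{\theta}(x_i)-y_i\big\|^{2}
\quad \text{s.t.} \quad \mathrm{MMD}^{2}\big(\{f_{\theta}(x_i)\},\{y_i\}\big) \le \varepsilon
\;\;\Longrightarrow\;\;
\mathcal{L}(\theta,\lambda) = \mathrm{MSE}(\theta) + \lambda\,\mathrm{MMD}^{2}(\theta)
```

Here the MSE term acts on paired samples, while the MMD term matches the two ensembles at the distribution level.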
Abstract
With the introduction of early pre-trained language models such as Google’s BERT and various early GPT models, we have seen an ever-increasing excitement and interest in foundation models. To leverage existing pre-trained foundation models and adapt them to specific tasks or domains, these models need to be fine-tuned using domain-specific data. However, fine-tuning can be quite resource-intensive and costly as millions of parameters will be modified as part of training.
PEFT (parameter-efficient fine-tuning) is a technique designed to fine-tune models while minimizing the need for extensive resources and cost. It achieves this efficiency by freezing some of the layers of the pre-trained model and only fine-tuning the last few layers that are specific to the downstream task. With the help of PEFT, we can achieve a balance between retaining valuable knowledge from the pre-trained model and adapting it effectively to the downstream task with fewer parameters.
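A minimal sketch of the freeze-then-fine-tune idea described above, using PyTorch and the Hugging Face transformers library; the model name, task and choice of trainable modules are illustrative assumptions, not a prescription:

```python
import torch
from transformers import AutoModelForSequenceClassification

# Hypothetical downstream task: binary classification on domain-specific text.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# Freeze the pre-trained encoder so its knowledge is retained ...
for param in model.bert.parameters():
    param.requires_grad = False

# ... and fine-tune only the small task-specific head.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-3)

print(f"trainable parameters: {sum(p.numel() for p in trainable):,}")
```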
Abstract
Solids water content is an important particle property in many process engineering applications. Its influence on the quality of pharmaceutical formulations makes an in-line measurement of the water content especially desirable in fluidization processes. However, currently available measurement techniques are difficult to calibrate and scarcely applicable in real fluidized beds. A promising strategy for in-line monitoring of the water content is thus soft sensing, a method that expresses the targeted quantity as a correlation of other, more reliable measurements. In this talk, we present the development of such a soft sensor using various black-box models. Our focus lies on strategies to reduce overfitting through feature engineering and hyperparameter tuning. The models are designed for processing real experimental data from a turbulent process, addressing challenges in data filtering, undersampling, outlier detection, and uncertainty propagation.
Abstract
When the Transformer architecture was first introduced, its target research domain was Natural Language Processing, where the famous BERT model pushed the boundaries in several downstream tasks. Motivated by these breakthroughs, other research fields like Computer Vision started to leverage Transformers with great success.
In state-of-the-art cross-modal text-image retrieval, where images are searched via textual queries, the acquired knowledge from both fields is combined to significantly improve the quality of retrieved images. Furthermore, employing Transformer models enables computing not only global text-image similarity but also fine-granular word-region alignments, which allows in-image search, useful for many real-world applications.
Abstract
When choosing an optimizer for your neural network, you will probably consider methods that are based only on first-order derivatives, like Stochastic Gradient Descent or Adam. But recent research suggests that second-order optimizers are also applicable to neural network training. This is made possible by implicit Hessian-vector products.
This talk tackles the problem of minimizing your cost function from a mathematical point of view. While talking about the use of second-order optimizers in neural network training, we will stumble across some interesting properties of neural networks and their cost functions.
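To make "implicit Hessian-vector products" concrete: the product Hv can be formed without ever materializing the Hessian via a double backward pass. A minimal PyTorch sketch, with a toy least-squares model standing in for a real network:

```python
import torch

# Toy model and loss; any differentiable network works the same way.
w = torch.randn(5, requires_grad=True)
x = torch.randn(20, 5)
y = torch.randn(20)
loss = ((x @ w - y) ** 2).mean()

v = torch.randn(5)  # direction for the Hessian-vector product

# First backward pass: gradient with the graph kept alive.
(grad,) = torch.autograd.grad(loss, w, create_graph=True)
# Second backward pass on <grad, v>: yields H @ v without forming H.
(hv,) = torch.autograd.grad(grad @ v, w)
print(hv)
```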
Abstract
Ray is a Python framework for developing distributed applications. While not limited to it, it has a strong focus on Machine Learning and supports it with a variety of included libraries. For example, Ray RLlib provides implementations of many common Reinforcement Learning algorithms, which reduces development time and allows the user to quickly compare different approaches to best solve the problem at hand. As Reinforcement Learning algorithms can be very sensitive to the choice of hyperparameters, Ray Tune can be used to tune them via optimization-based methods. While this typically requires a large number of simulation runs, Ray can speed up the process by running them in parallel on multiple processors or machines.
This talk gives an introduction to Ray, its libraries and how they can be used to seamlessly scale Reinforcement Learning applications from a laptop to the High Performance Cluster (HPC) of the TUHH.
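As a rough illustration of the workflow discussed in the talk, the sketch below combines RLlib and Tune; API details vary between Ray versions, so the configuration keys and values are illustrative rather than exact:

```python
import ray
from ray import tune

ray.init()  # on a cluster: ray.init(address="auto")

# Tune samples hyperparameters and runs RLlib's PPO trials in parallel.
tune.run(
    "PPO",
    config={
        "env": "CartPole-v1",
        "lr": tune.loguniform(1e-5, 1e-3),
        "train_batch_size": tune.choice([2000, 4000]),
    },
    num_samples=8,                      # 8 hyperparameter configurations
    stop={"episode_reward_mean": 450},  # stop a trial once it is "solved"
)
```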
Abstract
3D segmentation U-Nets are trained for pulmonary embolus (PE) detection on three different data sets. We investigate the impact of the training data set on the generalization capabilities and use dual-energy CT data augmentation to increase performance.
Abstract
An application to the detection of material defects using persistent homology is presented. We combine tangent bundles and curvature with persistent homology in the context of machine learning.
Abstract
Since one third of rivet holes during aircraft assembly are produced with semi-automatic drilling units, reliable and efficient methods for process state prediction using machine learning (ML) classification methods were developed in this work for this application. Process states were holistically varied in the experiments, gathering motor current and machine vibration data. These data were used as input to identify the optimal combination of five data feature preparation and nine ML methods for process state prediction. K-nearest-neighbour, decision tree and artificial neural network models provided reliable predictions of the process states: workpiece material, rotational speed, feed, peck-feed amplitude and lubrication state. Data preprocessing through sequential feature selection and principal component analysis proved favourable for these applications. The prediction of the workpiece clamping distance revealed frequent misclassifications and was thus not reliable.
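A minimal scikit-learn sketch of the kind of pipeline described above (feature preparation followed by a k-nearest-neighbour classifier); the number of components and neighbours are placeholders, not the values used in the experiments:

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# X: features extracted from motor current and vibration signals
# y: process state labels (e.g. workpiece material or lubrication state)
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=10)),          # placeholder dimensionality
    ("knn", KNeighborsClassifier(n_neighbors=5)),
])
# sklearn.feature_selection.SequentialFeatureSelector can be swapped in for the PCA step.
# scores = cross_val_score(pipeline, X, y, cv=5)
```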
Abstract
As a result of the energy transition, the operation of electrical grids is becoming an increasingly complex task, which creates new requirements for grid monitoring and grid control systems. In this workshop, various use cases from the field of electrical grid operation are considered for which the use of AI is possible. The individual applications are explained in their basic outlines, as are the benefits expected from using the various AI methods. The topics covered include, among others, the modeling and forecasting of loads in the grid, fault detection and diagnosis, state estimation, and stability assessment and control.
Abstract
The aim of this workshop is to give an insight into the classification of handwritten digits using machine learning. After a short introduction to the basics of simple neural networks for this application, to Jupyter notebooks as a working environment for Python, and to Keras for the development of the ANNs, the workflow for character recognition is worked through. Finally, the trained networks are applied to the detection and recognition of handwritten text in images. Basic knowledge of Python is an advantage for participating in this workshop, and a local Python installation of Anaconda3 is required. Installation instructions can be found here.
Abstract
In many engineering applications, mathematical models and simulation tools are available to determine relevant output quantities for given input quantities. In structural mechanics, for example, the given inputs are the geometry and material parameters of a structure as well as the acting loads. Typical outputs are the deformation of the structure and the maximum occurring stresses, from which component failure is derived. In reality, the input quantities are usually not known exactly but are subject to stochastic scatter. Accordingly, the output quantities also scatter, which can be determined with probabilistic methods (a.k.a. uncertainty quantification). A widely used probabilistic method is the Monte Carlo method. This very robust method is easy to implement but also very computationally expensive. The basic idea is to generate values of the input quantities according to their stochastic distribution and insert them into the simulation model. This is repeated until the distribution of the objective function is determined with sufficient accuracy (e.g. until the mean and standard deviation of the objective function converge). This means that the simulation model has to be evaluated very often (on the order of 10³ to 10⁶ times), which becomes a problem for complex, computationally expensive simulation models. This is where machine learning methods come into play to create efficient surrogate models (a.k.a. meta models). These surrogate models (e.g. neural networks) are first trained and then evaluated much faster in place of the actual simulation model. In the workshop, we show, using structural mechanics examples, how the scatter of an output quantity can be determined by Monte Carlo simulation using surrogate models.
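A minimal sketch of the surrogate-assisted Monte Carlo idea described above, with a cheap analytic function standing in for the expensive structural simulation and a small neural network as the surrogate; all numbers are illustrative:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def expensive_simulation(x):
    """Placeholder for the real FE model (geometry, material, loads -> stress)."""
    return np.sin(x[:, 0]) + 0.5 * x[:, 1] ** 2

# 1) A few expensive runs to train the surrogate.
X_train = rng.normal(size=(200, 2))
y_train = expensive_simulation(X_train)
surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000).fit(X_train, y_train)

# 2) Cheap Monte Carlo on the surrogate (10^5 samples instead of 10^5 FE runs).
X_mc = rng.normal(size=(100_000, 2))
y_mc = surrogate.predict(X_mc)
print("mean:", y_mc.mean(), "std:", y_mc.std())
```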
Abstract
Electromagnetic compatibility (EMC) deals with the suppression of unwanted electromagnetic interference between electronic devices, systems and components. Increasing requirements in the field of EMC, for example due to the advance of wireless communication at ever higher frequencies, demand a continuous development of engineering methods in order to make the right design decisions early and cost-effectively. In this workshop, various machine learning methods are presented that are currently being researched in the EMC application fields of signal integrity of wired channels as well as power integrity and electromagnetic interference of electronic components and systems. Our own research on artificial neural networks used for the analysis of printed circuit board structures shows which opportunities arise for EMC, and for hardware development in general, in the future.
Abstract
Support Vector Machines (SVMs) are a powerful and versatile machine learning method used in many applications. The basic idea of SVMs is to separate data belonging to two different groups with a hyperplane. With the help of transformations into higher-dimensional spaces and the use of kernel functions, this can also be done when the underlying data cannot be separated linearly. The workshop primarily aims to introduce SVMs and places its emphasis on the theoretical foundations.
Abstract
Persistent homology is a recent development in applied algebraic topology that has been used in various machine learning strategies. In this talk, we present a short introduction to this topic with several applications in signal processing and data analysis.
Abstract
The workshop shows how machine learning with artificial neural networks (ANNs) can be used in sensor modules with inexpensive, low-power microcontrollers. The use case is a sensor module with light sensors for recognizing simple hand gestures, based on an ATMega4809 microcontroller (6 kB RAM, 20 MHz). In addition to theoretical teaching units, the workshop contains many practical demonstrations on training and deploying ANNs in Python (with Keras / Tensorflow) and C (in the Arduino development environment).
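A minimal Keras sketch of the kind of tiny network the workshop targets; input size, number of classes and layer widths are illustrative choices for a light-sensor gesture task, not the workshop's exact configuration:

```python
from tensorflow import keras

# e.g. 4 light sensors sampled over 20 time steps, flattened to 80 inputs,
# classifying 4 simple hand gestures. Kept tiny to fit the microcontroller's RAM.
model = keras.Sequential([
    keras.layers.Input(shape=(80,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=30, validation_split=0.2)
```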
Abstract
In the Standard Platform League, certain types of annotations, such as semantic segmentations, depth maps and object localization, are difficult to obtain from real-world recordings. The use of synthetic data could circumvent this problem, as obtaining these annotations within a simulation is trivial. However, there is a catch: the reality gap makes algorithms trained on synthetic images perform much worse in actual applications. Researchers can painstakingly add more features to the simulation to close this gap. However, there are alternatives, such as the neural networks presented here. The CycleGAN and MUNIT architectures are able to perform a domain translation, maintaining semantic information but changing the style, without any labels or matchings. This could mean that a translation between simulated and real images is possible as long as we have images of both domains. For my bachelor thesis I experimented with using these two neural networks to perform this translation, and my insights are presented in this talk.
Abstract
Machine Learning and Deep Learning have brought disruptive innovations to many fields since 2012. Today, the application of these data-driven, and mostly black-box, models can be regarded as state-of-the-art in many scientific disciplines. However, the question of knowledge conservation arises: how to bring prior knowledge from generations of research and experience into the modeling process? This talk summarizes recent advances, lines of research and perspectives on "Physics-Informed Learning", which is an umbrella term for blending first principles into evidence-based and data-driven models. Particular focus is put on engineering vibrations and spatio-temporal dynamics, e.g. water waves.
Abstract
Sustainable engineering requires reliable and plannable material behaviour in critical working environments such as offshore. The extension of digital twins towards virtual-engineering-assisted circular economy therefore needs computational models that enable the calculation of maintenance intervals or even of the material condition at the end of its service life. The talk outlines how the combination of AI tools, data-based models and physics-based models facilitates predictive maintenance for metallic engineering materials exposed to severe in-service conditions. Aspects related to uncertainty, data availability and validation will be discussed.
Abstract
The biological functions of macromolecular systems, such as peptides and proteins, are largely defined by their spatial and electronic structures, and it is thus of great importance to have a high-resolution view of these structures. Dynamic structure investigation of biomolecules with advanced molecular dynamics simulations and machine learning approaches on the basis of free energy calculations can provide valuable opportunities for analysing the trajectories.
Abstract
In the first part of my presentation I will highlight the importance of the geometric perspective when dealing with learning systems. Information geometry offers a general framework for the identification of natural geometric structures for learning. The impact of this approach has been demonstrated in terms of the natural gradient method, one of the most prominent information-geometric methods within the field of machine learning. It was proposed by Amari in 1998 and uses the Fisher-Rao metric as a Riemannian metric for the definition of a gradient within optimisation tasks. Since then it has proved to be extremely efficient in the context of neural networks, reinforcement learning, and robotics. However, training deep neural networks with this method remains a difficult task. I will present recent results that allow us to greatly simplify the natural gradient for deep learning. I will conclude my talk with an outline of further applications and extensions of information geometry that are particularly important for mathematical data science.
Abstract
Affinity-based membrane separations offer high potential for improving the energy efficiency of classical thermal separation processes. This is particularly true for pressure-driven membrane processes like organic solvent nanofiltration (OSN). However, OSN is rarely considered during conceptual process design due to the absence of reliable models for quantitative predictions of the separation performance in different chemical systems. Commonly, a suitable membrane is first screened from a number of candidates in a series of initial experiments, and a performance model of the most promising membrane is then derived for the specific application, building on lab experiments and parameter regression for classical solution-diffusion or pore-flow models. Hence, the evaluation of OSN usually requires tedious experimental studies. In order to allow for an appropriate model-based assessment, membrane-specific predictive models can be developed by means of an optimization-based data-driven approach. For this purpose, a combination of genetic programming with deterministic global optimization for parameter regression and identifiability analysis is proposed, which automatically identifies suitable model structures and parametrizations. For flux and rejection models for OSN, different descriptors that account for physical and chemical properties of the solutes and the solvents are correlated with experimentally measured data for different solutes and solvents. The resulting models allow for a quantitative analysis of the main performance metrics of the specific membrane for a certain chemical system, as well as of the most interesting application ranges, on the basis of the decisive molecular descriptors that show a significant impact according to the derived model.
Abstract
Uncertainty quantification in engineering sciences takes into account the uncertainties that may exist and affect a certain physical system in an a priori unknown manner. If the input parameters of a system or model are subject to stochastic scatter, then also the output parameters (i.e. objective values) scatter randomly. The stochastic distribution of an objective value can be determined with uncertainty quantification methods such as Monte Carlo simulations. This requires a multitude of evaluations of the underlying model (up to 10³ to 10⁶). Hence, this approach is infeasible for computationally demanding models. For such applications, surrogate models, like artificial neural networks, Kriging and polynomial chaos expansions, can first be trained with a small number of model evaluations and then used as a proxy instead of the expensive model.
In the current talk, this procedure is demonstrated using the example of fiber composite structures. Here, the random objective functions are the strength and the stiffness of a component. Random input parameters are (amongst others) material properties, geometric deviations and manufacturing defects. Some random input parameters can only be modelled on a smaller scale than the whole component. Therefore, surrogate-boosted Monte Carlo simulations are performed on different scales and the results in each case are propagated to the next higher scale. Strength, however, can hardly be approximated with standard surrogate models. Here, hierarchical surrogate models are used, which link models of different fidelity, to still allow for an efficient and accurate prediction of the stochastic distribution of strength properties.
Abstract
The integer programming problem is one of the most fundamental problems in combinatorial optimization, where one seeks an optimal solution among a finite, but usually extremely large, set of discrete alternatives. Integer programs are ubiquitous in engineering and industrial applications, as they can model a large variety of highly complex tasks by means of discrete variables which are tied together through constraints. Powerful commercial solvers exist, which can generally solve large-scale instances with thousands of variables quite fast, but there are still several important integer programming models which cannot be solved at all. The theoretical design and analysis of integer programming algorithms usually fails to explain both the successes and the failures of industrial solvers, as worst-case run times of those algorithms are often super-exponential in the number of variables. We discuss novel approaches based on reinforcement learning methods to attack some of the most prominent integer programming models for which classical methods suffer from expensive time and memory usage, and we discuss their advantages and limitations.
Abstract
Intravascular ultrasound (IVUS) plays a major role in clinical practice when it comes to assessing vessel morphologies during percutaneous coronary interventions (PCIs) or for treatment planning. Usually, the physician estimates morphological features by marking important regions in multiple IVUS images. This is a rather time-consuming task, and the results depend heavily on the physician's experience. Automated detection and segmentation of meaningful image content can thus help streamline the clinical workflow. Data-driven methods like deep learning have gained huge importance in the field of medical image analysis in recent years. The usual scarcity of annotated image data in the medical field makes it important to tailor deep learning methods to specific tasks and imaging modalities. Possibilities are generating synthetic image data, considering specific image characteristics, and performing multi-task learning. This talk presents such approaches for improving deep learning performance on IVUS image analysis.
Abstract
Investigating the potential of machine learning tools and techniques in the engineering domain is a widespread endeavor, with a growing number of authors and a growing audience. This development in the machine learning community has been led by researchers from data science. However, this development has only been possible due to the large amounts of data available to these researchers, combined with an increased number of computational resources. For example, object recognition in images or speech would not have progressed as quickly without the images widely available on the internet. Generating and sharing knowledge is a broadly established concept in the engineering domain, in the form of conferences and other publications. For data, however, such sharing is not yet established. First data sources where data is shared are available, but larger projects are not observed. We therefore propose a new database to share data and the knowledge about this data. This enables researchers who do not have the possibility to create such data samples themselves to work with interesting and rich data and to apply new tools and techniques from the machine learning domain or elsewhere. Besides the data available in the database and the machine learning tools and techniques investigated on this data, a general overview of the database is given.
Abstract
In the field of Machine Learning, scientists often use programming for data preprocessing, running the learning algorithms, and obtaining key metrics. To increase transparency, more and more additional material (such as datasets, code, documentation etc.) is nowadays shared so that fellow researchers can replicate these experiments. Jupyter Notebooks are a very valuable medium in this context: they are capable of displaying documentation, code, and its output (such as visualizations, tables or logging messages) side by side. Recently, Jupyter Notebooks have also been used more often in university courses. Here, students benefit from the integration of code, its documentation, and the related exercise questions into a single interactive document. There are plenty of options for designing very appealing exercises for a course. Both in the scenario of transparent science and when using Jupyter Notebooks for teaching, the author's code is meant to be run on another machine and achieve the same results. During this talk, possible issues during replication and suitable fixes are highlighted. The open-source application JupyterHub can be part of that strategy. While the backend of the Integrated Development Environment runs on TUHH resources, the frontend is just a simple browser application. This relieves the students of the need for up-to-date equipment for replication. Especially in times of COVID-19 this allows students to program from home more easily.
Abstract
In many real-life applications, engineers are interested in accessible predictions of a complex system's behavior for which only sparse data is available. The data may arise either from measurements or from numerical simulations. Especially for numerical design tasks like optimization and uncertainty quantification, where a very large number of model evaluations is required, cheap predictors, often called meta- or surrogate models, are unavoidable. Since gathering data from such systems is typically very costly, this task requires machine learning techniques that are capable of operating on small data sets. For the problem described, Gaussian process regression has proven to be a powerful and flexible tool. New extensions of the technique to, e.g., data-fusion concepts can offer solutions to problems that are hard to tackle with other techniques.
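A minimal sketch of Gaussian process regression on sparse data using scikit-learn; the kernel choice and the toy function are illustrative assumptions, not part of the talk:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# A handful of expensive evaluations (measurements or simulations).
X = np.array([[0.1], [0.4], [0.6], [0.9]])
y = np.sin(2 * np.pi * X).ravel()

kernel = ConstantKernel(1.0) * RBF(length_scale=0.2)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

# Predictions come with an uncertainty estimate, useful for design tasks.
X_new = np.linspace(0, 1, 50).reshape(-1, 1)
mean, std = gpr.predict(X_new, return_std=True)
```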
Abstract
Most of the several hundred million aircraft rivet holes drilled per year are produced with semi-automatic, manually operated, pneumatically driven machines, since full automation is often unsuitable due to workspace restrictions. These machines are inflexible with regard to the adaptability of the drilling parameters, which is due, among other things, to the high quality requirements of the aviation industry. To produce reliable and safe riveted joints, drilling in several process steps, the use of minimum quantity lubrication, and subsequent manual deburring and cleaning of the holes are indispensable. Against this background, newly developed electrically driven semi-automatic Advanced Drilling Units (ADUs) open up new potentials, such as intelligent process layouts, online condition monitoring through the evaluation of integrated sensor data, and automatic adaptation of process parameters. For condition monitoring of the drilling process, the use of machine learning (ML) was examined to predict cutting forces and process conditions based on the currents of the electric motors of the ADUs. Applying ML to ADU data is advantageous because the high number of rivet holes to be produced yields large data sets. ML methods such as linear regression, artificial neural networks, decision trees and k-nearest-neighbour were evaluated regarding their applicability. Drilling process characteristics such as material and feed rate selection were compared with the model predictions. The presented results show that ML-based process monitoring has the potential to reliably identify process deviations and thus reduce manual rework, ensure comprehensive quality assurance, and increase the utilization of tool life.
Abstract
The degradation behaviour of magnesium (Mg) renders it one of the most versatile engineering materials available today as it can be employed in a large variety of applications ranging from automotive and aerospace components to battery applications where Mg is used as anode material. However, a prerequisite to unlock the full potential of Mg-based materials is gaining control over the corrosion rate. Consequently, bespoke additives must be identified for each application for optimal performance. Corrosion prevention is essential in transport applications to avoid material failure, whereas constant dissolution of the material is required to boost the efficiency of Mg-air primary batteries. Furthermore, the search for benign and efficient corrosion inhibitors has become critical due to the imminent ban of highly effective but toxic chromates. The vast number of small molecules with potentially useful dissolution modulating properties (inhibitors or accelerators) renders conventional experimental discovery methods too time- and resource-consuming. Consequently, computer-assisted selection prior to experimental investigations of the most promising candidates is of great benefit in the search for efficient corrosion modulating additives for Mg-based materials. One of the major challenges is the identification of sound molecular descriptors that correlate well with experimentally derived properties, as input parameters with low or no relevance to the modelled property will degrade the model. Towards this end, we utilized colour-coded correlation maps to facilitate an intuitive screening for reliable input features. We recently illustrated the potential of complementary quantum chemical density functional theory (DFT) calculations and machine learning methods for the prediction of corrosion modulating properties of small organic molecules for commercially pure Mg (CP Mg). Furthermore, we developed a workflow that facilitates screening of a large commercial database for an unbiased selection of untested additives for the control of the degradation rate of Mg.
Abstract
The increasing application of artificial neural networks (ANNs) in various domains of robotics demands highly optimized ANN architectures. Efficient architecture search requires horizontal and vertical scaling of ANN evaluation. In this talk, the HULKs present their approach to and application of a scalable genetic algorithm based on distributed task execution, used in the context of RoboCup soccer competitions.
Abstract
Harvesting energy from the environment, e.g. from ocean waves, is a key capability for the long-term operation of remote electronic systems where a standard energy supply is not available. Rotating pendulums can be used as energy converters when excited close to their eigenfrequency. However, to ensure robust operation of the harvester, the energy of the dynamic system has to be controlled. In this study, we deploy a lightweight reinforcement learning algorithm to drive the energy of an Acrobot pendulum towards a desired value. We analyze the algorithm in an extensive series of simulations. Moreover, we explore the real-world application of our energy-based reinforcement learning algorithm using a computationally constrained hardware setup based on low-cost components, such as the Raspberry Pi platform.
Abstract
Nowadays, electric cars are a focus area in automotive research. In this context, we consider data-based approaches as tools to improve and facilitate the car design process. Specifically, we address the challenge of vibration load prediction for electric cars using neural-network-based machine learning (ML), a data-based frequency response function approach, and a hybrid combined model. We extensively study the challenging case of vibration load prediction for car components, such as the traction battery of an electric car. We show, using experimental data from a 1:5 scale model car as well as data from a Fiat 500e car, that the proposed ML approach is able to outperform classical model estimation by means of ARX and ARMAX models. Moreover, we evaluate the performance of a hybrid-ML concept combining ML and ARMAX. Our promising results motivate further research in the field of vibration load prediction using machine learning based approaches in order to facilitate design processes.
Abstract
Due to the introduction of simplifications and idealizations during the modeling process of a real-world system, the created mathematical model will always behave slightly differently from the real-world system. This can become problematic, depending e.g. on the use case of the model or the size of the deviation itself. In such a case, more complex models might provide relief, even though they cannot guarantee satisfactory results. Furthermore, such modeling is not always possible, e.g. due to a lack of information about the real-world system. In this talk, an approach for solving this kind of problem is presented. By inserting neural networks into the previously created model, it is possible to reduce the deviation between the model and the real-world system without the need for any information beyond the measured data that is used to compare the model and the real-world system. The approach is presented by comparing different modeling approaches for a nonlinear single-mass oscillator.
Abstract
The massive demand for data to train neural networks poses major challenges for the industrial transfer of research approaches. Freely available datasets are often unable to cover the specific and individual requirements of companies. The synthetic generation of training data is a promising alternative. This talk examines the generation of training images for AI-based object identification in the intralogistics environment and shows which hurdles must be overcome for a successful implementation.
Abstract
Many areas of applications – ranging from corrosion engineering to catalysis on inorganic surfaces and from drug design to polymer composites in organic materials – are influenced by the atomic structure of the materials involved. Luckily, due to modern experimental and simulation methods, it is often possible to obtain a detailed atomistic understanding of the achieved material properties. However, as the sheer number of potentially useful agents and their huge space of possible configurations renders comprehensive analyses resource- and time-consuming, other measures to predict the performance of yet untested molecules are required. One potential approach is the investigation of quantitative structure-property relationships (QSPR) using the Smooth Overlap of Atomic Positions (SOAP) kernel - a descriptor for atomic environments, that provides a translationally and rotationally invariant representation and therefore allows to calculate molecular similarities. Plotting these similarities on a map and combining them with experimental and theoretical results can then be used to intuitively explore structure-property relationships and predict yet unknown material properties.
In our talk, we first explain the basics of the SOAP kernel and how it can be used to distinguish atomic structures and speed up the process of finding the most favorable configurations. Then we show how SOAP is used in real life applications, such as in the control of magnesium-electrolyte interface properties, to gain deeper insights into fundamental mechanisms on an atomistic level.
Abstract
Criticism of data-based models revolves around the problem of causation versus correlation and the lack of knowledge generation when using those models. Naturally, and particularly for large black-box models, this criticism is strongly connected to discussions under the umbrella of explainable or interpretable AI (XAI). After all, understanding why a model makes a prediction is key for, among others, trust, accountability, debugging and generalizability. A lack of understanding impedes improvement of models and input data as well as insight into the process being modeled.
To begin with, we give a general motivation and overview of the interpretability of data-based models. This includes why interpretability is important, possible perspectives on interpretability, and lastly interpretability-related methods and tools.
Secondly, we showcase machine learning and the SHAP (SHapley Additive exPlanations) interpretability toolbox to understand and predict the behavior of ice under compressive loads. Specifically, we are not interested in the best model but in which features drive model predictions, e.g. in a feature importance ranking. The identification of these features will be used as an addition to domain knowledge to create better material models for ice using a large experimental database.
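A minimal sketch of the SHAP-based feature ranking described above, using the shap package with a tree-based model; the data set and model are placeholders, not the ice-mechanics data:

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))                 # placeholder experimental features
y = X[:, 0] * 2.0 + X[:, 3] ** 2 + rng.normal(scale=0.1, size=500)

model = RandomForestRegressor(n_estimators=200).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Mean absolute SHAP value per feature gives a global importance ranking.
importance = np.abs(shap_values).mean(axis=0)
print(np.argsort(importance)[::-1])
```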
Abstract
Recent achievements in various fields are obtained through the use of artificial neural networks (NNs), e.g. in speech recognition or image processing. An NN solves problems by statistical learning with resource-intensive computations. To implement NNs for mobile devices, embedded or IoT systems, hardware acceleration is becoming increasingly important in order to meet energy, cost, or computing time requirements. In a hardware accelerator, the arithmetic operations of the NN are computed sequentially on a few processing units, so that a fault in the processing hardware can have a significant impact on the output of the NN. The reliability of an NN, and thus of the associated application, then no longer depends solely on statistical errors in the NN model; rather, it is determined by the interplay of the NN model and the hardware. The talk explains a technique for emulating NN inference on hardware resource descriptions. Subsequently, the injection of hardware faults and their effects are discussed using various examples.
Abstract
Increasing demands on modern electronic systems with respect to signal and power integrity on printed circuit boards require many simulations during an optimization process. The necessity of additional and more complex simulations calls for new and advanced simulation and optimization techniques. Using machine learning is one attempt to improve the efficiency of these optimization processes. The high-dimensional problem of power integrity analysis is especially challenging. The approach of improving the power delivery network of printed circuit boards with decoupling capacitors is analysed with artificial neural networks. The focus lies on the importance of preprocessing the input data and exploiting the available domain knowledge to increase the accuracy of the artificial neural network.
Abstract
Spectrum scarcity requires novel approaches for sharing frequency resources between different radio systems. Where coordination is not possible, intelligent approaches are needed, allowing a novel "secondary" system to access unused resources of a legacy (primary) system without requiring modifications of this primary system. Machine learning is a promising approach to recognize patterns of the primary system and adapt the channel access accordingly. In this contribution, we investigate the capability of feed-forward deep learning and Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) to detect communication patterns of the primary user.
To this end, we take the example of a new aeronautical system (LDACS) coexisting with three different systems. We first consider coexistence with Distance Measurement Equipment (DME), which provides deterministic interference to the secondary user, and then with two synthetic channel access patterns: a bursty channel access behavior realized by a 2-state Markov model, and a sequential channel access model.
It can be shown that the Markov property of a Gilbert-Elliot channel model limits the predictability; nonetheless, we show that the model characteristics can be fully learned, which could support the design of interference avoidance systems that make use of this knowledge. The determinism of DME allows an error-free prediction, and it is shown that the reliability of the sequential access model prediction depends on the model's parameter.
We highlight the limits of feed-forward deep neural networks and explain why LSTM RNNs are state-of-the-art models in this problem domain. We show that these models are capable of online learning, as well as of learning correlations over long periods of time.
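As a small illustration of the synthetic pattern generation mentioned above, the following sketch samples a channel-occupancy sequence from a 2-state (Gilbert-Elliot-like) Markov model; the transition probabilities are arbitrary example values:

```python
import numpy as np

def sample_two_state_channel(steps, p_busy_to_idle=0.2, p_idle_to_busy=0.1, seed=0):
    """Return a 0/1 sequence (idle/busy) drawn from a 2-state Markov chain."""
    rng = np.random.default_rng(seed)
    state, seq = 0, []
    for _ in range(steps):
        seq.append(state)
        if state == 1:
            state = 0 if rng.random() < p_busy_to_idle else 1
        else:
            state = 1 if rng.random() < p_idle_to_busy else 0
    return np.array(seq)

# Such sequences can serve as training data for an LSTM that predicts
# the primary user's next channel access.
pattern = sample_two_state_channel(10_000)
```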
Abstract
In an industrial environment, augmented reality applications enable the creation of feedback that is located three-dimensionally on the component. This feedback documents assembly problems and component defects in production with short texts and photos. However, the information quality of the feedback varies depending on the person creating it and the time available to enter the descriptions on the mobile device. Recommendation services based on machine learning offer users supporting suggestions for meaningful text modules of a problem description.
I will present a prototype of a suitable hybrid recommendation service, which combines image classification using deep learning with text processing using data mining and natural language processing.
A further focus is the integration of the prototype as an ML microservice into a cloud infrastructure, using Kubernetes and Python web services as an example, so that it can be used by external applications and at the same time be easily developed further.