Kern Roman, Al-Ubaidi Tarek, Sabol Vedran, Krebs Sarah, Khodachenko Maxim, Scherf Manuel
2020
Scientific progress in the area of machine learning, in particular advances in deep learning, has led to increased interest in eScience and related fields. While such methods achieve great results, an in-depth understanding of these new technologies and concepts is often still lacking, and domain knowledge and subject matter expertise play an important role. In space science there is a wide variety of application areas, in particular in the analysis of observational data. This chapter introduces a number of promising approaches to analyzing time series data by way of query by example, i.e., any signal can be provided to the system, which then responds with a ranked list of datasets containing similar signals. Building on this ability, the system can then be trained using annotations provided by expert users, with the goal of detecting similar features and hence providing a semiautomated analysis and classification. A prototype built to work on MESSENGER data, based on existing background implementations by the Know-Center in cooperation with the Space Research Institute in Graz, is presented. Further, several representations of time series data that have proven necessary for analysis tasks, as well as techniques for preprocessing, frequent pattern mining, outlier detection, and classification of segmented and unsegmented data, are discussed. Screenshots of the developed prototype, detailing various techniques for the presentation of signals, complete the discussion.
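The query-by-example idea described in this abstract can be illustrated with a minimal sketch. It is not the prototype's implementation: the abstract does not state which similarity measure or signal representation the system uses, so the z-normalized Euclidean distance over sliding windows chosen here is an assumption, and the function names (znorm, best_match_distance, rank_datasets) are hypothetical.

```python
import numpy as np


def znorm(x):
    """Shift to zero mean and scale to unit variance (constant signals are only shifted)."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    return (x - x.mean()) / std if std > 0 else x - x.mean()


def best_match_distance(query, series):
    """Smallest z-normalized Euclidean distance between the query and any
    window of the same length in the series (assumed similarity measure)."""
    q = znorm(query)
    m = len(q)
    series = np.asarray(series, dtype=float)
    if len(series) < m:
        return float("inf")  # series too short to contain the query
    best = float("inf")
    for start in range(len(series) - m + 1):
        window = znorm(series[start:start + m])
        best = min(best, float(np.linalg.norm(q - window)))
    return best


def rank_datasets(query, datasets):
    """Return (name, distance) pairs sorted from most to least similar."""
    scores = {name: best_match_distance(query, s) for name, s in datasets.items()}
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Synthetic example data, not MESSENGER observations.
    t = np.linspace(0.0, 2.0 * np.pi, 50)
    query = np.sin(t)
    datasets = {
        "with_sine": np.concatenate([np.zeros(30), 3.0 * np.sin(t) + 5.0, np.zeros(30)]),
        "ramp": np.linspace(0.0, 1.0, 110),
    }
    for name, dist in rank_datasets(query, datasets):
        print(name, round(dist, 4))
```

Z-normalizing each window makes the match insensitive to offset and amplitude, so a scaled and shifted copy of the query still ranks first. The brute-force scan is quadratic in the worst case and is meant only to show the ranking principle.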
Remonda Adrian, Krebs Sarah, Luzhnica Granit, Kern Roman, Veas Eduardo Enrique
2019
This paper explores the use of reinforcement learning (RL) models for autonomous racing. In contrast to passenger cars, where safety is the top priority, a racing car aims to minimize the lap time. We frame the problem as a reinforcement learning task with a multidimensional input consisting of the vehicle telemetry, and a continuous action space. To find out which RL methods better solve the problem and whether the obtained models generalize to driving on unknown tracks, we put 10 variants of deep deterministic policy gradient (DDPG) to race in two experiments: i) studying how RL methods learn to drive a racing car and ii) studying how the learning scenario influences the capability of the models to generalize. Our studies show that models trained with RL are not only able to drive faster than the baseline open-source handcrafted bots but also generalize to unknown tracks.
Lovric Mario, Krebs Sarah, Cemernek David, Kern Roman
2018
The use of big data technologies has a deep impact on today’s research (Tetko et al., 2016) and industry (Li et al., n.d.), but also on public health (Khoury and Ioannidis, 2014) and the economy (Einav and Levin, 2014). These technologies are particularly important for manufacturing sites, where complex processes are coupled with large amounts of data, for example in the chemical and steel industries. This data originates from sensors, processes, and quality testing. Typical applications of these technologies are related to predictive maintenance and optimisation of production processes. The media has made the term “big data” a hot buzzword without going too deep into the topic. We noted a lack of understanding among users of the technologies and techniques behind it, which makes the application of such technologies challenging. In practice the data is often unstructured (Gandomi and Haider, 2015) and a lot of resources are devoted to cleaning and preparation, but also to understanding causalities and relevance among features. The latter requires domain knowledge, making big data projects challenging not only from a technical perspective, but also from a communication perspective. Therefore, there is a need to rethink the big data concept among researchers and manufacturing experts, including topics like data quality, knowledge exchange and the technology required. The scope of this presentation is to present the main pitfalls in applying big data technologies amongst users from industry, explain scaling principles in big data projects, and demonstrate common challenges in an industrial big data project.