Anomaly detection is a common research topic in data science. Detecting anomalies that occur collectively in a sequence is useful for many applications such as intrusion or fault detection. In this thesis, I developed a parameter-free solution for detecting collective anomalies in sequential data based on stationarity and volatility estimation (STAVE). The STAVE algorithm extracts subsequences of a full sequence with a sliding window and clusters them according to a stationarity and volatility distance function. Collective anomalies are then detected by extracting the longest connected sequence within the smallest cluster. In a practical evaluation, STAVE achieved results comparable to commonly used parametric alternatives, while retaining low computational complexity and requiring no input other than the sequence to be investigated.
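
The three steps named above (sliding-window extraction, clustering by stationarity and volatility, and taking the longest connected run in the smallest cluster) can be illustrated with a minimal sketch. It is not the STAVE implementation: the feature definitions (difference between the half-window means as a stationarity proxy, population standard deviation as volatility), the deterministic 2-means clustering, and the explicit `window_size` argument are all assumptions made for this example, and the actual algorithm is described as requiring no such parameter.

```python
from statistics import mean, pstdev


def window_features(window):
    """Assumed features: mean shift between the two halves, and volatility."""
    half = len(window) // 2
    drift = abs(mean(window[:half]) - mean(window[half:]))
    return (drift, pstdev(window))


def two_means(points, iterations=20):
    """Deterministic 2-means stand-in for the clustering step (an assumption)."""
    centroids = [min(points), max(points)]  # lexicographic extremes as seeds
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min((0, 1), key=lambda c: sum((p - q) ** 2
                                          for p, q in zip(pt, centroids[c])))
            for pt in points
        ]
        for c in (0, 1):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return labels


def longest_run(indices):
    """Longest run of consecutive integers in a sorted list, as (first, last)."""
    best = (indices[0], indices[0])
    start = prev = indices[0]
    for i in indices[1:]:
        if i != prev + 1:
            start = i
        prev = i
        if prev - start > best[1] - best[0]:
            best = (start, prev)
    return best


def detect_collective_anomaly(sequence, window_size=8):
    """Return (first_index, last_index) of the suspected anomaly, or None."""
    if len(sequence) < window_size:
        return None
    windows = [sequence[i:i + window_size]
               for i in range(len(sequence) - window_size + 1)]
    labels = two_means([window_features(w) for w in windows])
    smallest = min((0, 1), key=labels.count)
    members = [i for i, lab in enumerate(labels) if lab == smallest]
    if not members:
        return None  # every window fell into one cluster
    first, last = longest_run(members)
    return first, last + window_size - 1
```

Because the reported span is built from whole windows, it covers the anomalous region plus a margin of up to one window length on each side.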

Systems that extract information from natural language texts usually need to consider language-dependent aspects like vocabulary and grammar. Compared to the development of individual systems for different languages, development of multilingual information extraction (IE) systems has the potential to reduce cost and effort. One path towards IE from different languages is to port an IE system from one language to another. PropsDE is an open IE (OIE) system that has been ported from the English system PropS to the German language. Only a few OIE methods are available for German. Our goal is to develop a neural network that mimics the rules of an existing rule-based OIE system. For that, we need to learn about OIE from German text. By analysing and comparing the rule-based systems PropS and PropsDE, we can observe a step towards multilinguality, and we learn about German OIE. We then present a deep-learning-based OIE system for German, which mimics the behaviour of PropsDE. The precision in directly imitating PropsDE is 28.1%. Our model produces many extractions that appear promising but are not fully correct.

Feature selection has become an important focus in machine learning. Especially in the area of text classification, using n-gram language models leads to high-dimensional datasets. In this thesis we propose a new method of dimensionality reduction. Starting with a small subset of features, an iterative forward selection method is performed to extend our feature space. The main idea is to interpret the results from a trained classifier in order to determine feature importance. Our experimental results over various classification algorithms show that with this approach it is possible to improve prediction performance over other state-of-the-art dimension reduction methods, while providing a more cost-effective feature space.
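
As a rough illustration of importance-guided forward selection, the following sketch grows a feature set from a small seed. It is not the method proposed in the thesis: the classifier (a linear scorer whose weights are differences of class means), the use of absolute weight as importance, the batch-wise candidate proposal, the acceptance rule based on training accuracy, and all function names are assumptions chosen to keep the example self-contained.

```python
def train(X, y, features):
    """Fit a centroid-difference linear scorer on the given feature indices."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    weights, bias = {}, 0.0
    for j in features:
        mu_pos = sum(r[j] for r in pos) / len(pos)
        mu_neg = sum(r[j] for r in neg) / len(neg)
        weights[j] = mu_pos - mu_neg
        # Place the decision boundary midway between the two class means.
        bias -= weights[j] * (mu_pos + mu_neg) / 2
    return weights, bias


def accuracy(X, y, weights, bias):
    """Fraction of rows whose score sign matches the label."""
    hits = 0
    for row, label in zip(X, y):
        score = bias + sum(w * row[j] for j, w in weights.items())
        hits += int((score > 0) == (label == 1))
    return hits / len(y)


def forward_select(X, y, seed, batch_size=2):
    """Extend `seed` with the most important candidate of each batch,
    keeping it only if accuracy improves (assumed acceptance rule)."""
    selected = list(seed)
    remaining = [j for j in range(len(X[0])) if j not in selected]
    best = accuracy(X, y, *train(X, y, selected))
    while remaining:
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        # Read feature importance off the trained model's weights.
        weights, _ = train(X, y, selected + batch)
        candidate = max(batch, key=lambda j: abs(weights[j]))
        score = accuracy(X, y, *train(X, y, selected + [candidate]))
        if score > best:
            selected.append(candidate)
            best = score
    return selected, best
```

In practice the acceptance test would use held-out data rather than training accuracy, and the stand-in scorer would be replaced by whichever classifier is being studied.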