Here you will find scientific publications written by Know-Center staff


Hojas Sebastian, Kröll Mark, Kern Roman

GerMeter - A Corpus for Measuring Text Reuse in the Austrian Journalistic Domain

Language Resources and Evaluation, Springer, 2018


Rexha Andi, Kröll Mark, Ziak Hermann, Kern Roman

Authorship Identification of Documents with High Content Similarity

Scientometrics, Wolfgang Glänzel, Springer Link, 2018

The goal of our work is inspired by the task of associating segments of text with their real authors. In this work, we focus on analyzing the way humans judge different writing styles. This analysis can help to better understand this process and thus to simulate or mimic such behavior accordingly. Unlike the majority of the work done in this field (e.g., authorship attribution, plagiarism detection), which uses content features, we focus only on the stylometric, i.e. content-agnostic, characteristics of authors. Therefore, we conducted two pilot studies to determine whether humans can identify authorship among documents with high content similarity. The first was a quantitative experiment involving crowd-sourcing, while the second was a qualitative one executed by the authors of this paper. Both studies confirmed that this task is quite challenging. To gain a better understanding of how humans tackle such a problem, we conducted an exploratory data analysis on the results of the studies. In the first experiment, we compared the decisions against content features and stylometric features. In the second, the evaluators described the process and the features on which their judgment was based. The findings of our detailed analysis could (i) help to improve algorithms such as automatic authorship attribution as well as plagiarism detection, (ii) assist forensic experts or linguists in creating profiles of writers, (iii) support intelligence applications in analyzing aggressive and threatening messages and (iv) support editorial conformity by adhering to, for instance, a journal-specific writing style.

Bassa Akim, Kröll Mark, Kern Roman

GerIE - An Open Information Extraction System for the German Language

Journal of Universal Computer Science, 2018

Open Information Extraction (OIE) is the task of extracting relations from text without the need of domain-specific training data. Currently, most of the research on OIE is devoted to the English language, but little or no research has been conducted on other languages including German. We tackled this problem and present GerIE, an OIE parser for the German language. Therefore we started by surveying the available literature on OIE with a focus on concepts which may also apply to the German language. Our system is built upon the output of a dependency parser, on which a number of hand-crafted rules are executed. For the evaluation we created two dedicated datasets, one derived from news articles and one based on texts from an encyclopedia. Our system achieves F-measures of up to 0.89 for sentences that have been correctly preprocessed.
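
The idea of running hand-crafted rules over dependency parser output can be illustrated with a minimal sketch. This is not GerIE's rule set: the `Token` structure, the label names `subj` and `obj`, and the single subject-verb-object rule are assumptions chosen for illustration, and the example sentence is annotated by hand instead of being parsed.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    idx: int     # 1-based position in the sentence
    text: str
    head: int    # index of the syntactic head, 0 for the root
    deprel: str  # dependency label (hypothetical label set)


def extract_svo(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Apply one toy rule: a head with both a subject and an object child
    yields a (subject, predicate, object) triple."""
    triples = []
    for head in tokens:
        children = [t for t in tokens if t.head == head.idx]
        subjects = [t for t in children if t.deprel == "subj"]
        objects = [t for t in children if t.deprel == "obj"]
        for s in subjects:
            for o in objects:
                triples.append((s.text, head.text, o.text))
    return triples


# Hand-annotated parse of "Maria liest ein Buch" (assumed labels).
sentence = [
    Token(1, "Maria", 2, "subj"),
    Token(2, "liest", 0, "root"),
    Token(3, "ein", 4, "det"),
    Token(4, "Buch", 2, "obj"),
]
print(extract_svo(sentence))
```

A real system would need many more rules (for example for passive voice, subordinate clauses and separable verbs) and would expand arguments to full phrases instead of single head words.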

Fukazawa Yusuke, Kröll Mark, Strohmaier M., Ota Jun

IR based Task-Model Learning: Automating the hierarchical structuring of tasks

Web Intelligence, IOS Press, 2016

Task-models concretize general requests to support users in real-world scenarios. In this paper, we present an IR based algorithm (IRTML) to automate the construction of hierarchically structured task-models. In contrast to other approaches, our algorithm is capable of assigning general tasks closer to the top and specific tasks closer to the bottom. Connections between tasks are established by extending Turney’s PMI-IR measure. To evaluate our algorithm, we manually created a ground truth in the health-care domain consisting of 14 domains. We compared the IRTML algorithm to three state-of-the-art algorithms to generate hierarchical structures, i.e. BiSection K-means, Formal Concept Analysis and Bottom-Up Clustering. Our results show that IRTML achieves a 25.9% taxonomic overlap with the ground truth, a 32.0% improvement over the compared algorithms.
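
As background for the PMI-IR measure mentioned in the abstract, the following sketch estimates pointwise mutual information between two phrases from document hit counts. It shows only the basic measure; the extension used by IRTML is not described here and is not reproduced. The function name, its parameters and the hit counts are hypothetical.

```python
import math


def pmi_from_hits(hits_a: int, hits_b: int, hits_ab: int, total_docs: int) -> float:
    """Pointwise mutual information, log2(p(a,b) / (p(a) * p(b))),
    with probabilities estimated as hit counts divided by collection size."""
    if min(hits_a, hits_b, hits_ab) <= 0:
        return float("-inf")  # no co-occurrence evidence at all
    p_a = hits_a / total_docs
    p_b = hits_b / total_docs
    p_ab = hits_ab / total_docs
    return math.log2(p_ab / (p_a * p_b))


# Hypothetical counts: two task phrases that co-occur more often than chance.
score = pmi_from_hits(hits_a=100, hits_b=200, hits_ab=50, total_docs=10_000)
print(score > 0)  # positive PMI indicates association
```

A score near zero means the two phrases co-occur about as often as independence would predict, while a clearly positive score suggests that the tasks are related and could be connected in a hierarchy.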

Granitzer Michael, Rath Andreas S., Kröll Mark, Ipsmiller D., Devaurs Didier, Weber Nicolas, Lindstaedt Stefanie, Seifert C.

Machine Learning based Work Task Classification

Journal of Digital Information Management, 2009

Increasing the productivity of a knowledge worker via intelligent applications requires the identification of a user's current work task, i.e. the current work context a user resides in. In this work we present and evaluate machine learning based work task detection methods. By viewing a work task as a sequence of digital interaction patterns of mouse clicks and key strokes, we present (i) a methodology for recording those user interactions and (ii) an in-depth analysis of supervised classification models for classifying work tasks in two different scenarios: a task-centric scenario and a user-centric scenario. We analyze different supervised classification models, feature types and feature selection methods on a laboratory as well as a real-world data set. Results show satisfactory accuracy and high user acceptance when using relatively simple types of features.

Kröll Mark, Rath Andreas S., Weber Nicolas, Lindstaedt Stefanie, Granitzer Michael

Task Instance Classification via Graph Kernels

Mining and Learning with Graphs (MLG 07), Florence, Italy, August 1-3, 2007


Burgsteiner H., Kröll Mark, Leopold A., Steinbauer G.

Movement Prediction From Real-World Images Using A Liquid State Machine

Journal of Applied Intelligence, Springer, 2007

The prediction of time series is an important task in finance, economy, object tracking, state estimation and robotics. Prediction is in general either based on a well-known mathematical description of the system behind the time series or learned from previously collected time series. In this work we introduce a novel approach to learn predictions of real world time series like object trajectories in robotics. In a sequence of experiments we evaluate whether a liquid state machine in combination with a supervised learning algorithm can be used to predict ball trajectories with input data coming from a video camera mounted on a robot participating in the RoboCup. The pre-processed video data is fed into a recurrent spiking neural network. Connections to some output neurons are trained by linear regression to predict the position of a ball in various time steps ahead. The main advantages of this approach are that due to the nonlinear projection of the input data to a high-dimensional space simple learning algorithms can be used, that the liquid state machine provides temporal memory capabilities and that this kind of computation appears biologically more plausible than conventional methods for prediction. Our results support the idea that learning with a liquid state machine is a generic powerful tool for prediction.
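
The general scheme described in the abstract, a fixed recurrent network whose states are read out by a linear regression trained to predict several steps ahead, can be sketched as follows. This is a simplified stand-in under stated assumptions: it uses a small rate-based tanh reservoir in place of the spiking network of a liquid state machine, a sine wave in place of ball positions from video, and ridge-regularized least squares for the readout. All names, sizes and parameters are hypothetical.

```python
import math
import random


def run_reservoir(inputs, n_units=20, seed=0):
    """Drive a fixed random recurrent tanh network; return its state per step."""
    rng = random.Random(seed)
    w_in = [rng.uniform(-1.0, 1.0) for _ in range(n_units)]
    scale = 1.0 / math.sqrt(n_units)  # small recurrent weights give fading memory
    w_rec = [[rng.uniform(-scale, scale) for _ in range(n_units)]
             for _ in range(n_units)]
    state = [0.0] * n_units
    states = []
    for u in inputs:
        state = [math.tanh(w_in[i] * u
                           + sum(w_rec[i][j] * state[j] for j in range(n_units)))
                 for i in range(n_units)]
        states.append(state)
    return states


def fit_readout(states, targets, ridge=1e-3):
    """Linear readout via normal equations (X^T X + ridge*I) w = X^T y."""
    feats = [s + [1.0] for s in states]  # append a bias feature
    d = len(feats[0])
    a = [[sum(f[i] * f[j] for f in feats) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(f[i] * y for f, y in zip(feats, targets)) for i in range(d)]
    for col in range(d):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, d), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, d):
            factor = a[r][col] / a[col][col]
            for c in range(col, d):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    w = [0.0] * d
    for i in reversed(range(d)):  # back substitution
        w[i] = (b[i] - sum(a[i][j] * w[j] for j in range(i + 1, d))) / a[i][i]
    return w


def predict(states, w):
    return [sum(wi * xi for wi, xi in zip(w, s + [1.0])) for s in states]


# Toy task: predict a sine wave `horizon` steps ahead from reservoir states.
horizon, washout = 5, 20
signal = [math.sin(0.2 * t) for t in range(300)]
states = run_reservoir(signal[:-horizon])[washout:]  # drop initial transient
targets = signal[horizon:][washout:]
weights = fit_readout(states, targets)
preds = predict(states, weights)
mse = sum((p - y) ** 2 for p, y in zip(preds, targets)) / len(targets)
mean_y = sum(targets) / len(targets)
var = sum((y - mean_y) ** 2 for y in targets) / len(targets)
print(f"training MSE {mse:.6f} vs. target variance {var:.6f}")
```

Only the readout weights are trained; the recurrent weights stay fixed. This is the property the abstract points to when it says that simple learning algorithms suffice once the input has been projected nonlinearly into a high-dimensional space.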