Breitfuß Gert, Fruhwirth Michael, Wolf-Brenner Christof, Riedl Angelika, Ginthör Robert, Pimas Oliver
2020
In the future, every successful company must have a clear idea of what data means to it. The necessary transformation to a data-driven company places high demands on companies and challenges management, organization and individual employees. To generate concrete added value from data, different disciplines, e.g. data scientists, domain experts and business people, need to collaborate. So far, few tools are available that facilitate the creativity and co-creation process among teams with different backgrounds. The goal of this paper is to design and develop a hands-on, easy-to-use card-based tool for generating data service ideas that supports the required interdisciplinary cooperation. Using a Design Science Research approach, we analysed 122 data service ideas and developed an innovation tool consisting of 38 cards. The first evaluation results show that the developed Data Service Cards are perceived as both helpful and easy to use.
Pimas Oliver, Rexha Andi, Kröll Mark, Kern Roman
2016
The PAN 2016 author profiling task is a supervised classification problem on cross-genre documents (tweets, blog and social media posts). Our system makes use of concreteness, sentiment and syntactic information present in the documents. We train a random forest model to identify gender and age of a document’s author. We report the evaluation results received by the shared task.
Pimas Oliver, Klampfl Stefan, Kohl Thomas, Kern Roman, Kröll Mark
2016
Patents and patent applications are important parts of a company’s intellectual property. Thus, companies put a lot of effort into designing and maintaining an internal structure for organizing their own patent portfolios, but also into keeping track of competitors’ patent portfolios. Yet, official classification schemas offered by patent offices (i) are often too coarse and (ii) are not mappable, for instance, to a company’s functions, applications, or divisions. In this work, we present a first step towards generating a tailored classification. To automate the generation process, we apply key term extraction and topic modelling algorithms to 2,131 publications of German patent applications. To infer categories, we apply topic modelling to the patent collection. We evaluate the mapping of the topics found via the Latent Dirichlet Allocation method to the classes present in the patent collection as assigned by the domain expert.
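The evaluation idea in this abstract, comparing discovered topics with expert-assigned classes, can be illustrated with a minimal sketch. The abstract does not say how the mapping is scored, so the majority-vote mapping and the accuracy measure below are assumptions, as are the function names and the toy data. The dominant topic per document is taken as given (for example, from a previously fitted topic model).

```python
from collections import Counter, defaultdict


def map_topics_to_classes(dominant_topics, expert_classes):
    """Map each topic to the expert class most frequent among its documents.

    Assumption: majority vote is used here purely for illustration.
    """
    counts = defaultdict(Counter)
    for topic, label in zip(dominant_topics, expert_classes):
        counts[topic][label] += 1
    return {topic: c.most_common(1)[0][0] for topic, c in counts.items()}


def mapping_accuracy(dominant_topics, expert_classes):
    """Fraction of documents whose topic maps to their expert class."""
    mapping = map_topics_to_classes(dominant_topics, expert_classes)
    hits = sum(
        mapping[topic] == label
        for topic, label in zip(dominant_topics, expert_classes)
    )
    return hits / len(expert_classes)


# Toy data, invented for illustration only (not from the paper).
topics = [0, 0, 0, 1, 1, 2]
classes = ["engine", "engine", "brake", "brake", "brake", "sensor"]

topic_to_class = map_topics_to_classes(topics, classes)
accuracy = mapping_accuracy(topics, classes)
```

On the toy data, topic 0 maps to "engine" although one of its documents is labelled "brake", so five of the six documents agree with the mapping.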
Pimas Oliver, Kröll Mark, Kern Roman
2015
Our system for the PAN 2015 authorship verification challenge is based upon a two-step pre-processing pipeline. In the first step we extract different features that observe stylometric properties, grammatical characteristics and pure statistical features. In the second step of our pre-processing we merge all those features into a single meta feature space. We train an SVM classifier on the generated meta features to verify the authorship of an unseen text document. We report the results from the final evaluation as well as on the training datasets.
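The two-step pre-processing can be sketched roughly as follows. This is not the authors' feature set: the concrete features, the function names, and the choice to represent a known/unknown document pair by absolute feature differences are all assumptions made for illustration. Grammatical features are omitted because they would require a part-of-speech tagger, and the SVM training step is left out.

```python
import re

WORD_RE = re.compile(r"[A-Za-z']+")


def stylometric_features(text):
    """Hypothetical stylometric group: average word and sentence length."""
    words = WORD_RE.findall(text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    avg_word_len = sum(len(w) for w in words) / len(words) if words else 0.0
    avg_sent_len = len(words) / len(sentences) if sentences else 0.0
    return [avg_word_len, avg_sent_len]


def statistical_features(text):
    """Hypothetical statistical group: word count and type-token ratio."""
    words = [w.lower() for w in WORD_RE.findall(text)]
    type_token_ratio = len(set(words)) / len(words) if words else 0.0
    return [float(len(words)), type_token_ratio]


# Step 1: each extractor yields one feature group per document.
EXTRACTORS = [stylometric_features, statistical_features]


def document_vector(text):
    """Concatenate all feature groups into one flat vector."""
    return [value for extract in EXTRACTORS for value in extract(text)]


def meta_features(known_text, unknown_text):
    """Step 2: merge both documents into a single meta feature vector.

    Assumption: the pair is encoded as absolute per-feature differences.
    """
    known = document_vector(known_text)
    unknown = document_vector(unknown_text)
    return [abs(a - b) for a, b in zip(known, unknown)]


sample = "One two. Three four five."
pair_vector = meta_features(sample, "A much longer unknown text follows here!")
```

Vectors like `pair_vector`, one per document pair, would then serve as the input rows for a classifier such as an SVM.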